Polynomial modular product verification and its implications
PPolynomial modular product verification and its implications
Pascal Giorgi Bruno Grenet Armelle Perret du CrayLIRMM, Univ. Montpellier, CNRSMontpellier, France { pascal.giorgi,bruno.grenet,armelle.perret-du-cray } @lirmm.frJanuary 7, 2021 Abstract
Polynomial multiplication is known to have quasi-linear complexity in both the dense and the sparsecases. Yet no truly linear algorithm has been given in any case for the problem, and it is not clear whether itis even possible. This leaves room for a better algorithm for the simpler problem of verifying a polynomialproduct. While finding deterministic methods seems out of reach, there exist probabilistic algorithms for theproblem that are optimal in number of algebraic operations.We study the generalization of the problem to the verification of a polynomial product modulo a sparsedivisor. We investigate its bit complexity for both dense and sparse multiplicands. In particular, we are ableto show the primacy of the verification over modular multiplication when the divisor has a constant sparsityand a second highest-degree monomial that is not too large. We use these results to obtain new bounds onthe bit complexity of the standard polynomial multiplication verification. In particular, we provide optimalalgorithms in the bit complexity model in the dense case by improving a result of Kaminski and develop thefirst quasi-optimal algorithm for verifying sparse polynomial product.
Polynomials are one of the most basic objects in computer algebra and the study of fast polynomial operationsremains a very challenging task. Polynomials can be represented using either the dense representation, thatstores all the coefficients in a vector, or the more compact sparse representation, that only stores nonzeromonomials. Depending on which representation is chosen the problems might have a very different flavorleading to two very separate lines of research.Polynomial multiplication is the most noticeable problem that attracted a lot of attention since manydecades, culminating nowadays with quasi-optimal algorithms [
2, 15 ] . Although such algorithms are reallyefficient in theory and in practice, there are not yet optimal and they often rely on complex approaches thatcan be error prone. Therefore, looking for rather simple procedure to verify the correctness of polynomialproducts is of great interest. From a theoretical perspective, the goal is then to provide asymptotically fasteralgorithms than those for multiplying polynomials, ultimately seeking for an optimal algorithm. In practicethe objective is barely to find simpler and faster procedures that reveal easier to trust.In this work, we intend to present the most recent advances in verifying polynomial products in both thedense and sparse case, to extend such results to either optimal algorithms or to more reliable solutions inpractice. Finally, we extend the problem to some specific modular multiplication of polynomials which seemsto not having been explored yet. Dense polynomial multiplication
We know from the early 60’s that dense polynomial arithmetic is sub-quadratic, and that it can even be quasi-linear when the so-called FFT applies [ ] . It has been more than twodecades later that Cantor and Kaltofen [ ] provide a quasi-linear algorithm without any assumption on thepolynomial algebra. They show that two dense polynomials of degree less than n over an algebra (cid:65) can bemultiplied with (cid:79) ( n log n log log n ) operations in (cid:65) . In regards to the bit complexity model, the operations inthe base ring (cid:65) cannot count (cid:79) ( ) anymore, and the previous algorithms may not lead to the best complexityestimates for specific domains such as (cid:65) = (cid:70) q or (cid:65) = (cid:90) . There, the use of Kronecker substitution togetherwith fast integer multiplication turns out to be the best alternative [
8, Section 8.4 ] . It has been showed byHarvey and van der Hoeven in [ ] that one can reach a bit complexity of (cid:79) ( n log q log ( n log q ) log ∗ ( n ) ) for poly-nomial multiplication over (cid:70) q [ X ] for any prime field (cid:70) q . We shall mention that very recently, such complexityhave been further improved to (cid:79) ( n log q log ( n log q )) bit operations [ ] under some mild hypothesis. For1 a r X i v : . [ c s . S C ] J a n olynomials with integer coefficients bounded by an integer C , the complexity falls down to multiplying twointegers of bit length (cid:79) ( n ( log n + log C )) which gives (cid:79) ( n ( log n + log C log n + log C log log C ) = ˜ (cid:79) ( n log C ) when we assume that n -bits integer multiplication complexity is I ( n ) = (cid:79) ( n log n ) [ ] . For clarity in thepresentation, we will often use M ( n ) as the number of operations in (cid:82) required to multiply two dense poly-nomials of size n , while M q ( n ) will denote the bit complexity for such multiplication over a prime field (cid:70) q . Sparse polynomial multiplication
In the sparse representation, a polynomial F = (cid:80) ni = f i X i ∈ (cid:82) [ X ] isexpressed as a list of pairs ( e i , f e i ) such that all the f e i are nonzero. We denote by F the sparsity of thepolynomial F which corresponds to its number of nonzero coefficients. Let F be a polynomial of degree n ,and log C be a bound on bit length of its coefficients. Then, the size of the sparse representation of F is (cid:79) ( F ( log n + log C )) bits. Contrary to the dense case, note that fast algorithms for sparse polynomials musthave a (poly-)logarithmic dependency on the degree, and that the size of the output does not exclusivelydepend on the size of the inputs. Indeed, the product of two polynomials F and G has at most F G nonzerocoefficients. But it may have as few as 2 nonzero coefficients, as shown by the following example. Example . Let F = X + X + G = X + X + H = X − X +
2. Then
F G = X + X + X + X + X + X + X + X + F H = X + (cid:79) ( ) [
3, 25 ] . The classical approach for computing the product of twopolynomials of sparsity T is to generate all the T possible monomials, and to sort them and merge thoseof equal degree to collect the monomials of the result. Using radix sort, this algorithm takes for instance (cid:79) ( T ( I ( log C ) + log n )) bit operations over (cid:90) and it exhibits a T factor in the space complexity, whatever thenumber of terms in the result. Many improvements have been proposed to reduce this space complexity, toextend the approach to multivariate polynomials, and to provide fast implementations in practice [
19, 22, 23 ] .Yet, none of these results reduces the T factor in the time complexity. In general, no complexity improvementis expected as the output polynomial may have as many as T nonzero coefficients. However, this numberof nonzero coefficients can be overestimated, giving the opportunity for output-sensitive algorithms. Suchalgorithms have first been proposed for special cases. Notably, when the output size is known to be small dueto sufficiently structured inputs [ ] , especially in the multivariate case [
17, 16 ] , or when the support of theoutput is known in advance [ ] .Output-sensitive multiplication algorithms try to take into account the two reasons that can decrease thesparsity of the product. The first one is exponent collisions, while the second one occurs when these collisionsimply some coefficient cancellations. The exponent collision is captured by the sumset of the exponents of F = (cid:80) Ti = f i X α i and G = (cid:80) Tj = g j X β j , that is { α i + β j : 1 ≤ i , j ≤ T } . Arnold and Roche call this set the structural support of the product F G and its size the structural sparsity [ ] . If H = F G , then the structuralsparsity S of the product F G satisfies 2 ≤ H ≤ S ≤ T . Observe that although H and S can be close, theirdifference can reach (cid:79) ( T ) as shown by the next example. Example . Let F = (cid:80) T − i = X i , G = (cid:80) T − i = ( X iT + − X iT ) and H = F G . We have F = T , G = T and thestructural sparsity of F G is T + H = X T − (cid:79) ( S log n ) operations in the RAM model with (cid:79) ( log ( C n )) word size [ ] , where log ( C ) bounds the bitsize of the coefficients. Arnold and Roche improve thiscomplexity to ˜ (cid:79) ( S log n + H log C ) bit operations for polynomials with both positive and negative integercoefficients [ ] . A recent algorithm of Nakos avoids the dependency on the structural sparsity for the caseof integer polynomials [ ] , using the same word RAM model as Cole and Hariharan. Unfortunately, the bitcomplexity of this algorithm, ˜ (cid:79) (( T log + H log n ) log ( C n ) + log n ) , is not quasi-linear. More recently, wepropose in [ ] the first quasi-optimal algorithm for sparse polynomial multiplication yielding a bit complexityof ˜ (cid:79) ( T (cid:48) ( log n + log C )) where T (cid:48) = max ( T , H ) . More precisely, taking k = T (cid:48) ( log n + log c ) which is the bitlength of the input and output, we are able to reach a bit complexity of (cid:79) ( k log k log T ( log T + log log k )) . Verification of polynomial products
Considering the non-optimality of polynomial multiplication in bothrepresentations, it is quite natural to ask whether it is rather a simple task or not to verify an instance of theproblem. More formally, given three polynomials F , G and H , can we assert that H is equal to the productof F and G in less operations than computing the product itself? Furthermore, we want such procedure to Here, and throughout the article, ˜ (cid:79) ( f ( n )) denotes (cid:79) ( f ( n ) log k ( f ( n ))) for some constant k >
2e as simple as possible and to not rely on polynomial multiplication if possible. Unfortunately, doing thiswith a deterministic procedure is not yet known, but using probabilistic algorithms lead to positive answers asshown by several papers [
6, 29, 32, 10 ] . Here and henceforth, polynomials are assumed to have coefficientsin an integral domain (cid:82) , rather than in a more general algebra.For dense polynomials this verification amounts to choosing a random element α in a finite subset of (cid:82) and to assert that H ( α ) − F ( α ) G ( α ) is zero. In that case, the complexity for the verification becomes (cid:79) ( n ) operations in (cid:82) , which is optimal. Of course, the probability of error is less than one as soon as (cid:82) has morethan n elements. If this not the case, for instance when (cid:82) = (cid:70) , it is not desirable to choose α in a sufficientlarge extension of (cid:82) to have (cid:79) ( n ) elements. The latter would require an extension of degree (cid:79) ( log n ) and itwould raise the complexity to (cid:79) ( n M ( log n )) . This is actually larger than the complexity M ( n ) of computingthe product. In [ ] , Gamin’s solved the latter problem by replacing the evaluation, that corresponds tocomputing within (cid:82) [ X ] / ( X − α ) , by doing a polynomial multiplication within (cid:82) [ X ] / ( X i − ) for a randominteger i < n . More precisely, by choosing i = (cid:79) ( n − e ) for some 0 < e < /
2, his verification algorithm runs in (cid:79) ( n ) operations in (cid:82) , whatever its size, with a probability of error bounded away from one. While the resultsounds optimal from a theoretical perspective, it might be mitigated for practical applications as it verifiespolynomial multiplication of degree n by doing multiplication of polynomials of degree (cid:79) ( n − e ) .All these results remain valid under the bit complexity model, but the obtained complexity might not beoptimal. For polynomials over (cid:70) q [ X ] , both approaches using products in (cid:70) q [ X ] / ( X − α ) or in (cid:70) q [ X ] / ( X i − ) lead to a bit complexity of (cid:79) ( n I ( log q )) = (cid:79) ( n log q log log q ) . While being non optimal, they remainhowever asymptotically faster than the computation of the initial products by a factor log n / log log q . Actually,Kaminski’s approach has a better bit complexity than the standard method and can even yield a linear bitcomplexity in favorable cases. For polynomials over (cid:90) [ X ] , the result is more surprising as it is possible toreach an optimal bit complexity of (cid:79) ( n log C ) for any input. This result should be attributed to Kaminski, ashe provided in [ ] all the necessary materials while not noticing the result explicitly. It seems surprising,but we haven’t found any references advertising such result. Thus, we propose to provide the description ofthose optimal verifications of polynomial products.For sparse polynomials, the verification of products remains less studied. It is misleading to think thatusing polynomial evaluation is satisfactory. Assuming that only T coefficients are nonzero, sparse polynomialevaluation is not quasi-linear in the input size (cid:79) ( T ( log n + log C )) . Indeed, computing α n requires (cid:79) ( log n ) operations in (cid:82) which implies a complexity of (cid:79) ( T log n ) operations in (cid:82) when applied to the T nonzeromonomials. Since one needs to use a subset of (cid:82) of size at least n to ensure a nonzero probability of success,this implies that the bit complexity is at least (cid:79) ( T log n ) . Using similar ideas as Kaminski’s [ ] , we proposedrecently in [ ] to verify sparse polynomial identities by doing the computation in (cid:82) [ X ] / ( X p − ) for a randomprime p . In particular, we prove that choosing p = (cid:79) ( T log n ) ensures that ( H − F G ) mod ( X p − ) = H − F G = (cid:79) ( T ( log n + log C )) bit operations.Another important measure for randomized verification algorithms is the probability of failure. All theknown verification algorithms are True-biased one-sided Monte Carlo algorithms. This means that they al-ways return True if H = F G and return False with probability at least 1 − ε otherwise. Given an algorithm witherror probability at most ε , we can attain any smaller probability of error τ by repeating (cid:79) ( log τ log ε ) rounds of thealgorithm. This shows that the complexity of the algorithm is actually dependent on the target error probabil-ity. In our results, we always explicitly indicate this dependency. We can distinguish several regimes of valuesfor the error probability: the constant regime ε = (cid:79) ( ) , the inverse polynomial regime ε = / n (cid:79) ( ) and theinverse exponential regime ε = / (cid:79) ( n ) , where n is the input degree. Given an algorithm with constant errorprobability, one can attain any smaller constant probability using a constant number of rounds. This keepsthe same asymptotic complexity. 
The same is true for two probabilities inside the inverse polynomial regime.To get to the inverse polynomial regime from the constant regime, the number of rounds must be (cid:79) ( log n ) ,slightly increasing the asymptotic complexity. The inverse exponential regime can then be attained using apolynomial number of rounds. In our context of linear and quasi-linear algorithms, the inverse exponentialregime is not attainable in general. The best known verification algorithms have linear bit complexity in theinverse polynomial regime. Contributions
As an extension of our prior work [ ] , we propose to study more generally the verification ofpolynomial multiplication in (cid:82) [ X ] / P where P is a monic sparse polynomial. In the dense case, this generalizesa work from one of the authors on the probabilistic verification on polynomial middle product [ ] . By reusingour modular product’s verification, we show that we can address the difficulty of Kaminski’s approach thatverifies polynomial products using products of roughly the same degree, more than (cid:112) n . In particular, we3how that we can avoid the dependency on polynomial multiplication in every cases. When dealing withfinite field arithmetic it is quite common to rely on irreducible polynomials that are sparse [ ] . Therefore,having the possibility to verify multiplication over finite fields in less operations than computing the productseems of great interest. In particular, we show that the verification of products in (cid:70) q s can be done in (cid:79) ( s P ) operations in (cid:70) q where P ∈ (cid:70) q [ X ] is the monic irreducible polynomial of degree s used to define (cid:70) q s . Clearlyfor irreducible polynomial with constant sparsity, as it is often the case over (cid:70) [ ] and more generally (cid:70) q [ ] , this offers an optimal verification procedure. Finally, for sparse polynomials, this work extendsour prior result for (cid:82) [ X ] / ( X p − ) in [ ] that was of great importance to achieve the first quasi-linear timealgorithm for sparse polynomial multiplication. We hope our new insight on this problem will leverage otherfast algorithms for sparse polynomial operations, especially for the division problem [ ] .All our techniques and results extend to the more general problem of verifying a polynomial identity of theform (cid:80) i F i G i mod P =
0, where the sum may have an arbitrary number of terms. It would be interesting to beable to extend these results to more general polynomial identities. As a very simple example, given F , F , F , H and P ∈ (cid:82) [ X ] for some integral domain (cid:82) , what is the complexity of the verification of H = F F F mod P ?Obviously if the inputs are dense polynomials, the computation of F F F mod P can be done in quasi-lineartime. But the question is to design an algorithm than runs faster than performing the computation. In thesparse case, the computation may increase the input size quite a lot and even a quasi-linear time algorithm islacking. More generally, the problem is to verify identities of the form (cid:80) i (cid:81) j (cid:80) k · · · (cid:81) (cid:96) F i , j , k ,..., (cid:96) mod P = Modular Polynomial Identity Testing (Modular PIT) problem. The standardPolynomial Identity Testing (PIT) problem takes as input an arithmetic circuit, or equivalently a straight-lineprogram, and consists in deciding whether the polynomial it represents is zero. In this extension, a polynomial P is also given as input and the question is whether the polynomial represented by the circuit is divisible by P . The standard PIT problem admits polynomial-time, and even quasi-linear-time, randomized algorithm. Avery important open question is whether it also admits a polynomial-time deterministic algorithm. For theModular PIT problem, the question is already to design efficient randomized algorithms. If the dense case,the challenge is to obtain faster algorithms than performing the product, ideally linear-time algorithms. Inthe sparse case, it is not even known how to solve the problem in randomized polynomial-time. Our resultsmay be seen as a first step towards this goal. Outline
We start our work in Section 2 by introducing all the technical materials that serve to demonstrateour main results. Then, Section 3 is devoted to the study of the evaluation of modular multiplication. In par-ticular, we provide algorithms and their thorough analysis for evaluating ( F G ) mod P on α without computing F G mod P . The results of that section serve to derive efficient algorithms in Section 4 for the verification ofmodular multiplication of polynomials. Finally, we present in Section 5 the more general results on the ver-ification of classical polynomial multiplication. In particular, we extend the work of Kaminski [ ] for thedense case with a thorough analysis of its bit complexity that enables to reach optimal verification. We alsogive a more detailed presentation of our first quasi-optimal algorithm for the sparse case that appears in [ ] . Let Q ∈ (cid:82) [ X ] be a degree- n polynomial. We denote its coefficient of degree i by q i . The sparsity of Q is itsnumber of nonzero monomials and is denoted by Q . The support of Q is the set supp ( Q ) = { i : q i (cid:54) = } . If Q is a polynomial over (cid:90) , we denote by (cid:107) Q (cid:107) ∞ its norm, defined as max ≤ i ≤ n | q i | . We denote by log ( · ) the base-2logarithm and by ln ( · ) the natural logarithm. We also use log b ( · ) to denote the base- b logarithm defined bylog b ( x ) = log x log b .We work in this paper with dense and sparse polynomials. A dense polynomial is represented as the vectorof its coefficients, which has size n + n polynomial. A sparse polynomial is represented by thelist of its nonzero monomials. We consider that we work, for sparse polynomials, with an abstract structure of sparse vector . In practice, this can be implemented by several data structures, depending on the operations thatneed to be performed. A standard choice in sparse polynomial arithmetic is the use of heaps [
19, 22, 23 ] . Toget better complexities, we might resorting to van Emde Boas Trees [
5, Chapter 20 ] as in Section 3.3.We alsouse sparse vectors in some algorithms to represent data which are not directly polynomials. The underlyingdata structure is the same as for sparse polynomials. 4 omplexity of dense polynomial multiplications We denote by M ( n ) the number of ring operationsneeded to compute a product of degree- n dense polynomials over an integral domain. We can take M ( n ) = (cid:79) ( n log n log log n ) [ ] . We denote by M q ( n ) the bit complexity of the multiplication of two degree- n densepolynomials over a finite field (cid:70) q . The best known bounds on M q ( n ) are (cid:79) ( n log q log ( n log q ) log ∗ ( n log q ) ) [ ] unconditionally and (cid:79) ( n log q log ( n log q )) assuming the existence of some Linnik constant [ ] . To simplifythe notation, we assume the existence of this Linnik constant. The cost of multiplying two elements in anextension field (cid:70) q d is the cost of degree- d polynomial multiplications, that is (cid:79) ( M q ( d )) . Let F , G ∈ (cid:90) [ X ] of degree n and norm C . Their product has norm at most nC . To compute F G , we can evaluate both F and G on some power of 2 larger than nC , multiply the resulting integers (that have size n log ( nC ) ), andread the coefficients of F G directly on the output integer. Let I ( m ) = (cid:79) ( m log m ) denote the bit complexityof multiplying two m -bit integers [ ] . Then the bit complexity of multiplying F and G is I ( n log ( nC )) = (cid:79) ( n log n + n log n log C + n log C log log C ) . Complexity of polynomial evaluation
The evaluation of a dense degree- n polynomial F ∈ (cid:82) [ X ] on apoint α ∈ (cid:82) requires (cid:79) ( n ) operations in (cid:82) using for instance Horner scheme. If α lies in an extensionring (cid:82) ext of (cid:82) , the evaluation requires (cid:79) ( n ) operations in (cid:82) ext . If F has coefficients in a finite field (cid:70) q , thistranslates directly to a linear number of operations in (cid:70) q . Now if F has coefficients in (cid:90) , one must take intoaccount the growth of the integers during the computation. Using a divide-and-conquer approach to usebalanced integer multiplications, the cost of the evaluation is (cid:79) ( I ( n log C ) log ( n log C )) bit operations where C = max ( | α | , (cid:107) F (cid:107) ∞ ) . We note that this cost is quasi-linear in the worst case output size while using Hornerscheme would have been quadratic.To evaluate a sparse polynomial F ∈ (cid:82) [ X ] on α ∈ (cid:82) , we compute the relevant powers of α and thenperform F multiplications and additions in (cid:82) . Computing each power independently yields (cid:79) ( F log n ) operations in (cid:82) . Using simultaneous exponentiation [ ] , the cost is reduced to (cid:79) ( log n + F log n / log log n ) operations in (cid:82) . Again, this directly translates to operations in (cid:70) q if (cid:82) = (cid:70) q . For polynomials with integercoefficients, the growth is much more severe than in the dense case. Indeed, α n has (cid:79) ( n log | α | ) bits. Thisimplies that the bit complexity is at least linear in n which is exponentially larger than the input size. Thecost is actually not better than with dense polynomials. Reducing a polynomial modulo P changes its norm and sparsity. We provide bounds on these growths. Theyrely on the gap between the degree of P and its second degree , that is the degree of its second highest-degreemonomial. Definition 1.
Let P = X n + (cid:80) ki = p i X i for k < n and p k (cid:54) =
0. The second degree of P is the integer k . The gapparameter γ of P is γ = n ( n − k ) .In particular, the second degree of P is ( − γ ) n . The parameter γ is between 0 and 1. If γ is close to0, the polynomial actually has no gap, while γ = aX n + b . We note that giventhis definition, γ is always upper bounded by n . Polynomials with a large gap are also known as sedimentary polynomials [ ] . A polynomial is said t -sedimentary if it is of the form X n + H where deg ( H ) = t . A t -sedimentary polynomial is a polynomial with gap parameter ( n − t ) / n and conversely a monic polynomialwith a gap parameter γ is ( − γ ) n -sedimentary.The norm of the product of two polynomials is classically related to their norms and degrees. This can beslightly refined using the sparsities instead of the degrees. Lemma 2.1.
Let F and G be two polynomials over (cid:90) . Then (cid:107)
F G (cid:107) ∞ ≤ min ( F , G ) (cid:107) F (cid:107) ∞ (cid:107) G (cid:107) ∞ .Proof. Let H = (cid:80) k h k X k be the product of F and G . Then h k = (cid:80) i + j = k f i g j . Let T = min ( F , G ) . Thenthe sum to define h k has size of most T . Since | f i | ≤ (cid:107) F (cid:107) ∞ for all i and | g j | ≤ (cid:107) G (cid:107) ∞ for all j by definition, | h k | ≤ T (cid:107) F (cid:107) ∞ (cid:107) G (cid:107) ∞ , whence the result.The modular reduction of polynomials has a bigger impact on the norm. It is actually related to severalparameters such as the gap parameter of the divisor and the difference of the degrees. The following exampleshows a large increase in the norm, as well as a densification of the result. Example . Let P = X + X + X − X + X + Q = X + X + X − X − X + X + X .Here the gap parameter of P is γ = , P =
6, Q = (cid:107) P (cid:107) ∞ = (cid:107) Q (cid:107) ∞ =
8. The polynomial Q mod P has degree 79, sparsity 53 and norm 11912. 5he following proposition bounds the growth on the different parameters of the polynomial after a mod-ular reduction. Proposition 2.2.
Let Q be a sparse polynomial of degree at most n − + k and P a monic polynomial of degreen with P ≥ . The polynomial Q mod P has at most Q ( P − ) (cid:100) k γ n (cid:101) monomials. If Q and P are defined over (cid:90) , (cid:107) Q mod P (cid:107) ∞ ≤ (cid:107) Q (cid:107) ∞ ( P (cid:107) P (cid:107) ∞ ) (cid:100) k γ n (cid:101) .Proof. We analyse the growth of the norm and the sparsity while performing the euclidean division.Instead of following the classical quadratic algorithm, we first reduce once all the monomials of Q withdegree at least n to obtain a new dividend. We repeat this process until the dividend has degree less than n . Letus define the sequence ( Q [ i ] ) i by Q [ ] = Q and Q [ i + ] = ( Q [ i ] mod X n ) + ( Q [ i ] quo X n )( X n − P ) . Then Q [ i ] mod P = Q mod P for all i . Since deg ( Q [ i ] quo X n ) = deg ( Q [ i ] ) − n and deg ( X n − P ) ≤ ( − γ ) n , deg ( Q [ i + ] ) ≤ max ( n −
1, deg ( Q [ i ] ) − γ n ) , whence deg ( Q [ i ] ) ≤ max ( n −
1, deg ( Q ) − i γ n ) .Also, Q [ i + ] is at most Q [ i ] ( P − ) , thus Q [ i ] ≤ Q ( P − ) i . Finally, (cid:107) Q [ i + ] (cid:107) ∞ ≤ (cid:107) Q [ i ] (cid:107) ∞ ( + min ( Q [ i ] , P − ) (cid:107) P (cid:107) ∞ ) .Therefore, (cid:107) Q [ i ] (cid:107) ∞ ≤ ( P (cid:107) P (cid:107) ∞ ) i (cid:107) Q (cid:107) ∞ .Since deg ( Q [ i ] ) ≤ n + k − − i γ n , deg ( Q [ i ] ) < n if i = (cid:100) k γ n (cid:101) . This implies that Q [ i ] = Q mod P . We collect in this section some useful results to produce random prime numbers and random irreduciblepolynomials over finite fields.
Proposition 2.3 ( [ ] ) . If λ ≥ , there are at least λ/ ln λ prime numbers in [ λ , 2 λ ] . Using this proposition together with Miller-Rabin probability test, we can produce integers that are primewith good probability [ ] . Proposition 2.4.
There exists an algorithm R ANDOM P RIME ( λ , ε ) that returns an integer q in [ λ , 2 λ ] , such thatq is prime with probability at least − ε . It requires (cid:79) ( log ( ε ) log ( λ ) I ( log λ ) log log λ ) bit operations. Proposition 2.5.
Let H ∈ (cid:82) [ X ] be a nonzero polynomial of degree at most n and sparsity at most T , < ε < and λ = max ( ε T ln n ) . With probability at least − ε , R ANDOM P RIME ( λ , ε ) returns a prime number p suchthat H mod X p − (cid:54) = .Proof. It is sufficient, for H mod X p − e of H that is not congruentto any other exponents e j modulo p . In other words, it is sufficient that p does not divide any of the T − δ j = e j − e .Noting that δ j ≤ n , the number of primes in [ λ , 2 λ ] that divide at least one δ j is at most ( T − ) ln n ln λ . Sincethere exists λ/ ln λ primes in this interval, the probability that a prime randomly chosen from it divides atleast one δ j is at most ε/
2. R
ANDOM P RIME ( λ , ε/ ) returns a prime in [ λ , 2 λ ] with probability at least 1 − ε/ Proposition 2.6.
Let H ∈ (cid:90) [ X ] be a nonzero polynomial, < ε < and λ ≥ max ( ε ln (cid:107) H (cid:107) ∞ ) . Then withprobability at least − ε , R ANDOM P RIME ( λ , ε ) returns a prime q such that H mod q (cid:54) = .Proof. Let h i be a nonzero coefficient of H , a random prime from [ λ , 2 λ ] divides h i with probability at most ln (cid:107) H (cid:107) ∞ /λ ≤ ε/
2. Since R
ANDOM P RIME ( λ , ε/ ) returns a prime in [ λ , 2 λ ] with probability at least 1 − ε/ Proposition 2.7 ( [
30, Chapter 19 ] ) . The number of irreducible monic polynomial of degree d over a field (cid:70) q isbetween q d d and q d d . Proposition 2.8 ( [
30, Chapter 20 ] ) . There exists an algorithm that, given a finite field (cid:70) q , an integer d and < ε < , computes a degree-d polynomial in (cid:70) q [ X ] that is irreducible with probability at least − ε . It requires (cid:79) ( log ( ε ) d M ( d )( log q + log log d )) operations in (cid:70) q or (cid:79) ( log ( ε ) d log q ) operations in (cid:70) q if using only naivepolynomial multiplications. Remark 2.9.
Shoup [ ] presents Las Vegas algorithms for Propositions 2.4 and 2.8. We consider Monte Carloversions of his algorithms. Also, he analyses the complexities with naive algorithms. Our complexity estimatesuse fast integer and polynomial arithmetic. Evaluation for polynomial multiplication in a quotient ring
As seen earlier, the verification of polynomial multiplication mainly relies on the evaluation of the polynomialidentity at a random point. In this section we present algorithms to efficiently compute the evaluation of amodular product ( F G ) mod P on a point α , without computing ( F G ) mod P . There, the modulus P is alwaysconsidered as a sparse polynomial, while F and G can be either dense or sparse.Section 3.1 describes our method in the simpler case where P is a binomial. We obtain linear-time evalu-ations, whether F and G are dense or sparse. Section 3.2 generalizes the method to the product of two densepolynomials modulo a sparse modulus, and Section 3.3 presents the case of a sparse modular product. Let us first present our method to evaluate a modular product
F G mod P where P = X n −
1. This specialcase illustrates our more general method. It also has its own interest since it is used as the main tool for theverification of a product of two polynomials in Section 5, either for dense or sparse representation.We first describe the algorithm for dense polynomials F and G . Theorem 3.1.
Let F and G be two polynomials in (cid:82) [ X ] of degrees less than n and α ∈ (cid:82) . The polynomial ( F G ) mod X n − can be evaluates on α using (cid:79) ( n ) operations in (cid:82) .Proof. Let H = F G and M = H mod X n −
1. We denote by f i (resp. g i , h i , m i ) the coefficient of degree i of thepolynomial F (resp. G , H , M ). Let also (cid:126) g = ( g , . . . , g n − ) T , (cid:126) h = ( h , . . . , h n − ) T and (cid:126) m = ( m , . . . , m n − ) T .It is a well-known fact that considering F as fixed, the multiplication by F is a linear map described by aToeplitz matrix. More precisely, we have (cid:126) h = T F (cid:126) g where T F = f f f ... ... f n − . . . . . . f f n − f ... ... f n − .Since M = H mod X n − m i = h i + h i + n − for 0 ≤ i < n − m n − = h n − . Therefore, (cid:126) m = C F (cid:126) g where C F is the circulant matrix C F = f f n − · · · f f f · · · f ... ... ... f n − f n − · · · f .On the other hand, evaluating M on α corresponds to the inner product (cid:126)α n (cid:126) m where (cid:126)α n = ( α , . . . , α n − ) .Therefore, our aim is to compute (cid:126)α n C F (cid:126) g . The standard way to perform this evaluation corresponds to firstcomputing (cid:126) m = C F (cid:126) g and then (cid:126)α n (cid:126) m . As noticed by Giorgi [ ] , the bracketing ( (cid:126)α n C F ) (cid:126) g yields a faster algorithmdue to the structure of the matrix C F .Let (cid:126) c = (cid:126)α n C F . Then c j + = (cid:80) n − (cid:96) = α (cid:96) f ( (cid:96) − j − ) mod n = f n − j − + α (cid:80) n − (cid:96) = α (cid:96) f ( (cid:96) − j ) mod n . Since for j > (cid:80) n − (cid:96) = α (cid:96) f ( (cid:96) − j ) mod n = c j − α n − f n − j − , we obtain the recurrence relation (cid:168) c j + = α c j − P ( α ) f n − j − for j ≥ c = F ( α ) (1)where P = X n − c = F ( α ) .It is immediate that exploiting such recurrence relation for computing the evaluation of ( F G ) mod P leadsto a complexity of (cid:79) ( n ) operations in (cid:82) . Indeed, once c and P ( α ) = α n − c j canbe computed sequentially at cost (cid:79) ( ) .For completeness, we provide the full description of this method in Algorithm 3.1.We can actually be more precise on the number of operations required by Algorithm 3.1. In particularwhen α does not lie into (cid:82) but in an extension (cid:82) ext of (cid:82) , we can distinguish between operations in (cid:82) and (cid:82) ext . In the next corollary, we call scalar multiplications those that are multiplications of an element of (cid:82) ext by an element of (cid:82) . The following analysis minimizes the number of non-scalar multiplications.7 lgorithm 1 E VALUATION M ODULO B INOMIAL
Input: F , G ∈ (cid:82) [ X ] with deg ( F ) , deg ( G ) < n , and α ∈ (cid:82) . Output: ( F G mod X n − )( α ) c ← F ( α ) P α ← α n − β ← c g for j = n − do c ← α c − P α f n − j β ← β + c g j return β Corollary 3.2.
Let F and G be two polynomials in (cid:82) [ X ] of degree less than n and α ∈ (cid:82) ext . The polynomial ( F G ) mod X n − can be evaluated on α using n − multiplications and n − additions in (cid:82) ext , and n − scalar multiplications.Proof. We can first compute α , α , . . . , α n using ( n − ) multiplications. Then, F ( α ) can be computed using ( n − ) scalar multiplications and additions, and P ( α ) = α n − c g of β requires one scalar multiplication. Then each iteration of the loop require one multiplication, twoscalar multiplications and two additions. Therefore, the complete evaluation require 3 n − n − n − F usingHorner’s scheme with ( n − ) multiplications, one scalar multiplication and ( n − ) additions. Then α n has to becomputed using at most 2 log n multiplications. This results in 3 n − n − + n multiplicationsand 2 n scalar multiplications. The total number of multiplications (scalar or not) is a bit less.We now turn to the analysis of the algorithm for F and G given in sparse representation. Theorem 3.3.
Let F and G be two sparsely represented polynomials in (cid:82) [ X ] of degrees less than n and α ∈ (cid:82) .The polynomial ( F G ) mod X n − can be evaluated on α using (cid:79) (( F + G ) log n ) operations in (cid:82) .Proof. We use the same notations as in the previous proof. If the support of G is supp ( G ) = { j , . . . , j G − } with j < · · · < j G − , the inner product (cid:126) c (cid:126) g is equal to (cid:80) G − k = c j k g j k . This means that only the G entries c j , . . . , c j G − of (cid:126) c need to be computed. Applying the recurrence relation (1) as many times as necessary, weobtain the new recurrence relation (cid:168) c j k + = α j k + − j k c j k − P ( α ) (cid:80) j k + (cid:96) = j k + α (cid:96) f n − (cid:96) for k ≥ c j = (( X j F ) mod X n − )( α ) . (2)The initial value c j can be computed using (cid:79) ( F log n ) operations in (cid:82) since it needs F exponentiationsof α with exponent bounded by n . Most values of f n − (cid:96) are actually equal to zero since F is sparse.A nonzero coefficient f t of F appears in the definition of c j k + if and only if n − j k + ≤ t < n − j k . Thus, each f t is used exactly once to compute all the c j k ’s. Since for each summand, one needs to compute α (cid:96) for some (cid:96) < n , the total cost for computing all the sums is (cid:79) ( F log n ) operations in (cid:82) . Similarly, the computationof α j k + − j k c j k for all k ∈ [
0, G − ] costs (cid:79) ( G log n ) operations in (cid:82) plus G − (cid:79) ( log n ) -bitintegers to get the exponents. As one operation in (cid:82) requires at least one bit operation, the integer additionsthat costs (cid:79) ( G log n ) bit operations are negligible. The last remaining step is the final inner product whichcosts (cid:79) ( G ) operations in (cid:82) , whence the result.As in the dense case, one can be more precise on the complexity if α liesin an extension (cid:82) ext . In contrary tothe dense case where there is more operations in (cid:82) than in (cid:82) ext , one can note that the number of operationsin (cid:82) is negligible in the sparse case. Corollary 3.4.
Let F and G be two sparsely represented polynomials in (cid:82) [ X ] of degrees less than n and α ∈ (cid:82) ext .The polynomial ( F G ) mod X n − can be evaluated on α using log n + (cid:79) (( F + G ) log n / log log n ) operationsin (cid:82) ext plus G − additions of (cid:79) ( log n ) -bit integers.Proof. We first notice that in the sparse case the operations on α dominate the complexity. These operationsare operations in (cid:82) ext . To improve the complexity estimates, we remark that in the sparse settings we need tocompute α t for several values of t . The computation of c j requires to know F values of α t , more precisely8hose with t = (cid:96) − j mod n for each nonzero coefficient f (cid:96) of F . To apply Equation (2), one needs to compute α t for t = j k + − j k , 1 ≤ k < G , and for t = (cid:96) where f n − (cid:96) (cid:54) =
0. The value α n is also needed to compute P ( α ) .Finally, the inner product requires to compute α t for each nonzero g t . Altogether, one needs α t for at most2 ( F + G ) values of t , each at most n . They can be computed independently using fast exponentiation, usingat most (cid:79) (( F + G ) log n ) multiplications, as it is done in Theorem 3.3. Actually, Yao [ ] shows that thesevalues of α t can be computed simultaneously using only log n + (cid:79) (( F + G ) log n / log log n ) multiplications.Once these α t have been computed, computing c j and the c j k ’s by means of Equation (2), as well as the innerproduct (cid:126) c (cid:126) g , only require (cid:79) ( F + G ) operations. In this section, we extend the previous algorithm to the evaluation of a polynomial
F G mod P where P is anymonic sparse polynomial. We first consider the case where F and G are given in dense representation. Thecase where they are given is sparse representation is postponed to the next section.The algorithm goes along the same lines as the evaluation modulo X n −
1. Let F [ i ] = ( X i F ) mod P . We canrewrite F G mod P = (cid:80) n − i = g i F [ i ] where g i is the coefficient of degree i in G . The evaluation of this equalityon a point α yields the formula ( F G mod P )( α ) = n − (cid:88) i = g i F [ i ] ( α ) . (3)To make use of this formula, we need to be able to efficiently evaluate each F [ i ] on α . Note that consecutive F [ i ] ’s are bound by the recurrence relation F [ i + ] = ( X F [ i ] ) mod P . Since deg ( F [ i ] ) = n − ( X F [ i ] ) mod P = X F [ i ] − f [ i ] n − P where f [ i ] n − is the coefficient of degree n − F [ i ] . Consequently we have the followingrecurrence relation (cid:168) F [ i + ] ( α ) = α F [ i ] ( α ) − f [ i ] n − P ( α ) for i ≥ F [ ] ( α ) = F ( α ) (4)The evaluations of each F [ i ] on α can thus be computed iteratively from F ( α ) , only knowing the coefficient f [ i ] n − of F [ i ] for 0 < i < n − P = X n − f [ i ] n − = f n − − i . In the general case, thecomputation is based on the recurrence relation F [ i + ] = X F [ i ] − f [ i ] n − P , which implies f [ i + ] k = f [ i ] k − − f [ i ] n − p k (5)for 0 < k ≤ n −
1. This allows to compute each f [ i ] n − , starting from the values of f [ ] k for all k . These valuesare given as input since F [ ] = F by definition. Note that since P is a sparse polynomial, Equation (5) actuallyreduces to an equality f [ i + ] k = f [ i ] k − in many cases. Algorithm 2 takes this into account and only performs therequired updates. Algorithm 2 L EADING C OEFFICIENTS
Input:
Two polynomials P and F in (cid:82) [ X ] , with deg ( F ) < deg ( P ) = n and P monic. Output:
The vector [ f [ ] n − , f [ ] n − , . . . , f [ n − ] n − ] , where f [ i ] n − is the coefficient of degree n − F [ i ] = ( X i F ) mod P . V ← [ f n − , f n − , . . . , f ] for i = n − do for k ∈ supp ( P ) such that i < k < n do V [ i + n − k ] ← V [ i + n − k ] − p k V [ i ] return V Lemma 3.5.
Algorithm 2 is correct. It uses (cid:79) ( n P ) operations in (cid:82) .Proof. The number of operations is clear: all operations are performed at Step 4 and it is called (cid:79) ( n P ) times.Note that the external for loop can be stopped as soon as there exists no k ∈ supp ( P ) such that i < k < n . Inother words, i never goes beyond deg ( X n − P ) − i of the external loop,the following property (cid:80) ( i ) holds: V [ j ] = f [ j ] n − for any j ≤ i + V [ j ] = f [ i + ] n − ( j − i ) for j > i + (cid:80) ( − ) holds since it reads V [ j ] = f [ ] n − j − for all j , and F = F [ ] by definition.Suppose that (cid:80) ( i − ) holds. In particular, V [ j ] = f [ j ] n − for j ≤ i . During iteration i , only V [ i + ] to V [ n − ] can be modified so these equalities remain after that iteration. For j > i , V [ j ] = f [ i ] n − ( j − i + ) beforethe iteration by hypothesis. After the iteration, it becomes V [ j ] = f [ i ] n − ( j − i + ) − p n − j + i V [ i ] = f [ i ] n − j + i − − p n − j + i f [ i ] n − .Equation (5) shows that V [ j ] = f [ i + ] n − j + i after Step 4, and (cid:80) ( i ) holds.To conclude, after the last iteration, V [ j ] = f [ j ] n − for all j ≤ n − F G mod P on a point α . In the following algorithm, weassume that α belongs to some extension ring (cid:82) ext of (cid:82) . Our analysis distinguishes between operations in (cid:82) and in (cid:82) ext . Algorithm 3 M ODULAR E VALUATION
Input: P , F , G ∈ (cid:82) [ X ] with deg ( F ) , deg ( G ) < deg ( P ) = n , P monic, and α ∈ (cid:82) ext . Output: ( F G mod P )( α ) V ← [ f [ ] n − , . . . , f [ n − ] n − ] using a call to L EADING C OEFFICIENTS ( P , F ) P α ← P ( α ) F α ← F ( α ) β ← F α g for i = n − do F α ← α F α − V [ i − ] P α β ← β + F α g i return β Theorem 3.6.
Algorithm 3 is correct. It uses (cid:79) ( n P ) operations in (cid:82) and (cid:79) ( n ) operations in (cid:82) ext .Proof. Step 6 relies on Equation (4) to compute F α = F [ i ] ( α ) . Step 7 uses this evaluation together with Equa-tion (3) to correctly compute ( F G mod P )( α ) . The first step requires (cid:79) ( n P ) operations in (cid:82) by Lemma 3.5.(It does not depend on α .) The other steps require (cid:79) ( n ) operations in (cid:82) ext .As before, we notice that the operations in (cid:82) ext are sometimes scalar multiplications, that is multiplicationsof an element of (cid:82) ext by an element of (cid:82) . We provide an analysis that minimizes the number of non-scalarmultiplications. Corollary 3.7.
Let P, F , G and α as in Algorithm 3. Then ( F G ) mod P can be evaluated on α using n − multiplications and ( n − + P ) additions in (cid:82) ext , ( n − + P ) scalar multiplications in (cid:82) ext , and ( n − )( P − ) multiplications and additions in (cid:82) .Proof. We first note that the number of operations performed by Algorithm 2 is at most ( n − )( P − ) multiplications and additions in (cid:82) . In Algorithm 3, we need to evaluate both P and F on α . To minimizethe number of non-scalar multiplications, we first compute α , . . . , α n using n − (cid:82) ext .We can then compute P α using P − P − F α using n − n − β and the for loop require n − n − n − n − (cid:82) ext , ( n − + P ) scalar multiplications in (cid:82) ext , and ( n − + P ) additions in (cid:82) ext .In a different context where the aim is specifically to compute the evaluation with no restriction to the useof polynomial arithmetic one can first compute the polynomial F G mod P and then evaluate it on α ∈ (cid:82) ext .Such method requires (cid:79) ( n log n log log n ) operations in (cid:82) for the polynomial multiplication F × G and divisionby P and (cid:79) ( n ) operations in (cid:82) ext for the evaluation. Thus we see that if P verifies P < log n log log n ourtechnique is more efficient. In this section, we adapt and analyse the previous algorithms for polynomials F and G given in sparse rep-resentation. Our results depend on the difference between the highest and the second highest exponents in P . Recall that the gap parameter γ is a measure of this difference, defined by 1 − γ = n max { k < n : p k (cid:54) = } .10n particular, the second highest exponent with nonzero coefficient in P is ( − γ ) n . Proposition 2.2 givesa relation between the gap parameter and the sparsity of ( F G ) mod P . The potential growth of the sparsityinduced by the reduction modulo P explains the dependency of our results on the gap parameter.As G is sparse, Equation (3) becomes ( F G mod P )( α ) = (cid:88) i ∈ supp ( G ) g i F [ i ] ( α ) , (6)with the same notations as in the previous section. The recurrence relation F [ i + ] = X F [ i ] − f [ i ] n − P still holds,hence F [ i + ] ( α ) = α F [ i ] ( α ) − f [ i ] n − P ( α ) too. The goal now is to efficiently compute F [ i ] ( α ) for all i ∈ supp ( G ) only, not for all indices i . When γ is not close to zero, there are actually few indices i such that f [ i ] n − (cid:54) =
0. Infact, the number of such indices depends on F , P and γ . Let I = { i , . . . , i t } denote this set of indices. Wewill prove in Lemma 3.8 that this set is of size (cid:79) ( F P (cid:100) /γ − (cid:101) ) . We decide to first provide some argumentsand an explicit algorithm to prove this claim.An important remark is that for any 0 ≤ j < n −
1, in particular those verifying j ∈ supp ( G ) , if we assume i to be the largest index in I not larger than j , then Equation (5) implies F [ j ] ( α ) = α j − i F [ i ] ( α ) . (7)Therefore, the recurrence relation given in (4) becomes (cid:168) F [ i k + ] ( α ) = α i k + − i k − ( α F [ i k ] ( α ) − f [ i k ] n − P ( α )) for k ≥ F [ i ] ( α ) = α i F [ ] ( α ) (8)To efficiently use Equations (7) and (8) to perform the evaluation, we need to provide a sparse variant ofAlgorithm 2. It computes a sparse representation of the vector V = [ f [ ] n − , . . . , f [ n − ] n − ] , that is the sparse vector { ( i , f [ i ] n − ) : f [ i ] n − (cid:54) = } .The idea of Algorithm 4 is to mimic Algorithm 2 in the sparse settings. For simplicity of the presentation,we first consider to store V as a sparse vector as it is sufficient to our needs for proving our claims on the sizeof the set I . We will show in Corollary 3.9 that we must require another structure to minimize the complexityattached to data management.The initial nonzero values in V are the nonzero coefficients of F = F [ ] , with V [ i ] = f n − − i if n − i − ∈ supp ( F ) . Let now consider the external loop in Algorithm 2. Iteration i does not require any operation if f [ i ] n − = f [ i + ] k = f [ i ] k − in that case. Therefore, we must loop over indices i suchthat f [ i ] n − is nonzero. For such an index i , the same updates as in Step 4 of Algorithm 2 are required. For k ∈ supp ( P ) , i < k < n , we must perform the update V [ i + n − k ] ← V [ i + n − k ] − p k f [ i ] n − . If V [ i + n − k ] is already nonzero, its value is already stored in V and must be updated. Otherwise, the new value − p k f [ i ] n − must be inserted in V with index i + n − k .It remains to be able to only loop over the indices i such that f [ i ] n − (cid:54) =
0. Let us assume that iteration i hasbeen performed since f [ i ] n − (cid:54) =
0. The proof of Lemma 3.5 shows that V [ i + ] then contains f [ i + ] n − . Thereforein the sparse setting, we know that iteration i + V [ i + ] (cid:54) =
0. Moregenerally, the next index to be considered is the index of the next nonzero entry of V after V [ i ] . Algorithm 4below uses such method for computing all the indices i such that f [ i ] n − is nonzero. Algorithm 4 S PARSE L EADING C OEFFICIENTS
Input: P , F ∈ (cid:82) [ X ] with deg ( F ) < deg ( P ) = n , and P monic Output:
The list { ( i , f [ i ] n − ) : 0 ≤ i < n − f [ i ] n − (cid:54) = } , sorted by increasing values of i . L ← empty list V ← { ( i , f n − − i ) : n − − i ∈ supp ( F ) } (sparse vector) while V is not empty do ( i , v ) ← extract the element of minimal index from V if v (cid:54) = then Add ( i , v ) to the list L for k ∈ supp ( P ) such that i < k < n do V [ i + n − k ] ← V [ i + n − k ] − p k v return L Lemma 3.8.
Algorithm 4 is correct. If the polynomial P has a gap parameter γ , the algorithm uses (cid:79) ( F P (cid:100) /γ − (cid:101) ) operations in (cid:82) and additions of (cid:79) ( log n ) -bits integers. In particular, there are at most F P (cid:100) /γ − (cid:101) indices i such that f [ i ] n − (cid:54) = . roof. As explained above, Algorithm 4 is an adaptation of Algorithm 2 to the sparse settings that only com-putes those f [ i ] n − that are nonzero, using Equations (7) and (8) in place of Equation (4). Instead of consideringall the F i [ n − ] one after the other it only considers those which are not zero. Let us call “iteration i ” theiteration in the while loop that extract a pair ( i , v ) from V . To prove the correctness, we prove by inductionthat at the end of iteration i , V = { ( j , f [ i + ] n − j + i ) : j > i , f [ i + ] n − j + i (cid:54) = } and L = { ( j , f [ j ] n − ) : j ≤ i , f [ j ] n − (cid:54) = } .Before the loop (“iteration − L is empty and V contains exactly the pairs ( j , f [ ] n − j − ) such that f [ ] n − j − (cid:54) = (cid:96) , and let ( i , v ) be the pair extracted at thenext iteration. We first prove that f [ i ] n − = v and f [ j ] n − = (cid:96) < j < i . By minimality of i and inductionhypothesis, f [ (cid:96) + ] n − j + (cid:96) = (cid:96) < j < i at the end of iteration (cid:96) . In particular, f [ (cid:96) + ] n − =
0. By Equation (5), f [ (cid:96) + ] n − j + (cid:96) + = f [ (cid:96) + ] n − j + (cid:96) − f [ (cid:96) + ] n − p n − j + (cid:96) + =
0. And an easy recurrence shows that f [ j ] n − = (cid:96) < j < i . Nowthis implies that f [ i ] n − = f [ i − ] n − = · · · = f [ (cid:96) + ] n − i + (cid:96) . Yet by induction hypothesis, at the end of iteration (cid:96) , V contains ( j , f [ (cid:96) + ] n − j + i ) if f [ (cid:96) + ] n − j + i (cid:54) =
0. Therefore, if f [ i ] n − is nonzero, f [ (cid:96) + ] n − i + (cid:96) is nonzero too and V contains the pair ( i , f [ (cid:96) + ] n − i + (cid:96) ) = ( i , f [ i ] n − ) . In other words, the value v extracted from V is indeed equal to f [ i ] n − and the propertyholds for L after iteration i .Now with the same argument, f [ i ] n − − j + i = f [ (cid:96) + ] n − j + (cid:96) for (cid:96) < j < i . Right before iteration i , V contains thenthe pairs ( j , f [ i ] n − − j + i ) for f [ i ] n − − j + i (cid:54) =
0. After iteration i , such pairs are replaced by ( j , f [ i ] n − − j + i − p n − j + i f [ i ] n − ) ,that is ( j , f [ i + ] n − j + i ) by Equation (5). And if f [ i ] n − − j + i = p n − j + i (cid:54) =
0, a new pair ( j , − p n − j + i f [ i ] n − ) = ( j , f [ i ] n − j + i ) is inserted into V . Therefore, the property holds for V too after iteration i .The second point is to count the number of operations. Since the while loop stops when V is empty, thenumber of operations in (cid:82) is at most twice the number of pairs that are inserted into V during the algorithmand the same number of additions in (cid:90) are performed as the index of each pair is computed by two additionsof number at most n . We will classify the pairs by generations . Initially, V contains F pairs which formgeneration 0. New pairs can be inserted into V when a pair ( i , v ) is extracted. If ( i , v ) is a pair of generation t , the new pairs inserted at iteration i belong to generation t +
1. At any iteration, at most P − V . Therefore, there are at most F ( P − ) pairs of generation 1, F ( P − ) pairs ofgeneration 2, and in general F ( P − ) t pairs of generation t . Now we need to bound the number ofgenerations. Note the pairs of generation 0 have an index i between 0 and n −
1. But at generation 1, thenew pairs have index ( i + n − k ) for some k ∈ supp ( P ) , k < n . There comes the gap into account: If P has gapparameter γ , the largest exponent less than n in supp ( P ) is ( − γ ) n by definition. Therefore, at generation 1,all pairs have an index at least i + n − ( − γ ) n ≥ γ n . At generation 2, all pairs have then an index at least 2 γ n .At generation t , all pairs have an index at least t γ n . Since indices are bounded by n −
1, there cannot be anypair of generation t if t γ n ≥ n . In other words, the largest possible generation is t = (cid:100) /γ − (cid:101) . Altogether,the total number of pairs inserted into V is at most (cid:100) /γ − (cid:101) (cid:88) t = F ( P − ) t = F ( P − ) (cid:100) /γ (cid:101) − P − P >
2, and is at most (cid:100) /γ (cid:101) F if P =
2. To simplify the exposition, we bound both of them by F P (cid:100) /γ − (cid:101) in the following. Note that of course, this number is also a bound on the number of pairsthat are extracted from V during the algorithm.This has two consequences. First, the number of extracted pairs is a bound on the size of the list at thenend of the algorithm. Therefore there are at most (cid:79) ( F P (cid:100) /γ − (cid:101) ) nonzero values f [ i ] n − . Second, this numberalso bounds the total number of executions of Step 8, that is the total number of operations. Corollary 3.9.
All operations of I NSERTION , R EMOVAL , M INIMUM and S EARCH of pairs ( i , ν ) in the data structureV within Algorithm 4 can be done with (cid:79) ( F P (cid:100) /γ − (cid:101) log log n ) bit operations.Proof. By definition the size of the sparse vector V is at most n . Therefore, using a data structure for V oftype van Emde Boas tree with a universe of size n , ensures that any requested operations can be done with (cid:79) ( log log n ) bit operations, see [
5, Chapter 20 ] . Remark 3.10.
As the bit-complexity of all the operations in (cid:90) of Algorithm 4 is (cid:79) ( F P (cid:100) /γ − (cid:101) log n ) the costdriven by the data structure of V is negligible. F G mod P on some point α , when F and G are given in sparse representation, relies on Equations (6), (8) and (7). More precisely, we first compute each F [ i ] ( α ) for indices i such that f [ i ] n − (cid:54) = F [ j ] ( α ) for j ∈ supp ( G ) by means of Equation (7). Finally, we deduce ( F G mod P )( α ) using Equation (6).In Algorithm 5, all these computations are intertwined. The idea is to loop over all indices j such thateither f [ j ] n − (cid:54) = j ∈ supp ( G ) . If f [ j ] n − (cid:54) =
0, we update the value F [ j ] ( α ) using Equation (8). If j ∈ supp ( G ) ,we accumulate partial evaluations of ( F G mod P )( α ) using Equations (6) and (7). Algorithm 5 S PARSE M ODULAR E VALUATION
Input: P , F and G ∈ (cid:82) [ X ] , with deg ( F ) , deg ( G ) < deg ( P ) = n , P monic and α ∈ (cid:82) ext . Output: ( F G mod P )( α ) V ← { ( i , f [ i ] n − ) : 1 ≤ i < n − f [ i ] n − (cid:54) = } , using a call to S PARSE L EADING C OEFFICIENTS ( P , F ) if f [ ] n − = then insert (
0, 0 ) in V P α ← P ( α ) F α ← F ( α ) β ← F α g (cid:46) β ← / ∈ supp ( G ) i ← for j ∈ supp ( V ) ∪ supp ( G ) \ { } , by increasing order do if j ∈ supp ( V ) then F α ← α j − i − ( α F α − V [ i ] P α ) (cid:46) Equation (8) i ← j if j ∈ supp ( G ) then β ← β + α j − i F α g j (cid:46) Equations (6) and (7) return β Theorem 3.11.
Theorem 3.11. Algorithm 5 is correct. It uses O(#F·#P^{⌈1/γ⌉−1}) operations in R and O((#F·#P^{⌈1/γ⌉−1} + #G) log n) operations in R_ext.
Proof. We prove that at the end of iteration j, F_α = F^{[j]}(α) if j ∈ supp(F) and β = ∑_i g_i F^{[i]}(α) where the sum ranges over indices i ∈ supp(G) ∩ {0, . . . , j}. The property is satisfied after iteration 0 (before entering the loop) since F_α = F^{[0]}(α) and β = F_α g_0 = g_0 F^{[0]}(α). Let us assume that the property holds before entering iteration j. Index i denotes the previous index that belongs to supp(F). Therefore, if j ∈ supp(F), Equation (8) ensures that F_α has the right value after iteration j since V[i] = f^{[i]}_{n−1}. And Equations (6) and (7) justify that β also has the right value if j ∈ supp(G).
The evaluations P(α) and F(α) require O(#P log n) and O(#F log n) operations in R_ext respectively. Steps 9 and 12 each require O(log n) operations in R_ext to compute powers of α and O(1) additions with integers of size O(log n) to compute the appropriate exponent. These steps are executed O(#V + #G) times. Since #V = O(#F·#P^{⌈1/γ⌉−1}), this gives a total of O((#F·#P^{⌈1/γ⌉−1} + #G) log n) operations in R_ext plus O((#F·#P^{⌈1/γ⌉−1} + #G) log n) bit operations for the integer additions. Since we can easily assume that one operation in R_ext costs more than one bit operation, the latter complexity is dominated by the computation part in R_ext. The cost of Step 1 is given by Lemma 3.8.
We can still use a van Emde Boas tree to iterate over the union of the supports of V and G at Step 7, with a total of O((#F·#P^{⌈1/γ⌉−1} + #G) log log n) bit operations, which is less than the number of bit operations required by the additions in Z and thus negligible.
Obviously, since polynomial multiplication over integral domains is commutative, the roles of F and G can be exchanged in Algorithm 5. In particular if #G < #F, this exchange decreases the complexity in Theorem 3.11. In other words, the statement remains valid if #F is replaced by min(#F, #G) and #G by max(#F, #G). The same remark applies to subsequent results.
Remark 3.12. As in Corollary 3.4, we can decrease the number of operations over R_ext by using simultaneous exponentiation on α. This results in O(log n + (#F·#P^{⌈1/γ⌉−1} + #G) log n / log log n) operations in R_ext.
If the gap parameter γ is close to 1/n, the polynomial FG mod P is in general a dense polynomial even if FG is sparse, and the dense modular evaluation will be more appropriate. On the contrary, FG mod P remains sparse if γ is close to 1, in particular if γ ≥ 1/2.
Remark 3.13. If γ ≥ 1/2, the evaluation requires O((#F·#P + #G) log n) operations in R_ext.
Remark 3.14. The factor #F·#P^{⌈1/γ⌉−1} + #G in the complexity may be larger than the actual sparsity of FG mod P. For instance, the sparsity can be 0 if P divides FG. Yet, it is smaller than the general bound #F·#G(#P − 1)^{⌈1/γ⌉} given by Proposition 2.2. Thus in general it is more efficient to use our method than to directly evaluate FG mod P if the polynomial is known.
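As a point of comparison for Remark 3.14, the following Python sketch computes FG mod P explicitly (sparse dictionaries, integer coefficients) and then evaluates it. It is the direct computation that our evaluation algorithm is designed to avoid, and it is convenient as a brute-force cross-check of a fast implementation; the function names are ours and this is not the paper's Algorithm 3.

    def sparse_mul(F, G):
        # Schoolbook product of two sparse polynomials {exponent: coefficient}.
        H = {}
        for i, f in F.items():
            for j, g in G.items():
                H[i + j] = H.get(i + j, 0) + f * g
        return {e: c for e, c in H.items() if c != 0}

    def reduce_mod(H, P, n):
        # P monic of degree n (dict containing the key n): eliminate the top
        # term repeatedly using X^n = -(P - X^n).
        H = dict(H)
        while H and max(H) >= n:
            e = max(H)
            c = H.pop(e)
            for k, p in P.items():
                if k == n:
                    continue
                H[e - n + k] = H.get(e - n + k, 0) - c * p
            H = {d: v for d, v in H.items() if v != 0}
        return H

    def modular_eval(F, G, P, n, alpha):
        # Brute-force (F*G mod P)(alpha): full modular product, then evaluation.
        return sum(c * alpha**e for e, c in reduce_mod(sparse_mul(F, G), P, n).items())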
This section is devoted to the verification of polynomial modular products. That is, given F, G, H and P such that deg(F), deg(G), deg(H) < deg(P) = n, we want to test whether H = FG mod P. The idea is classical: evaluate the identity at a random point. Contrary to the more straightforward verification of polynomial multiplication, we cannot do such an evaluation directly since we do not know the polynomial Q = (FG − (FG mod P))/P. As seen in the previous section, we provide new algorithms to do such an evaluation efficiently without reverting to the computation of Q. We recall that P is always taken monic. Note that this is a mild assumption since (FG) mod P = (FG) mod (λP) for any invertible constant λ.
In the following, all algorithms are analysed both when the polynomials F, G and H are dense and when they are sparse. On the other hand, P is always considered as a sparse polynomial. We recall that γ denotes the gap parameter of P, defined by (1 − γ)n = deg(P − X^n), and that it serves to control the densification of the modular reduction.
We first begin in Section 4.1 with an abstract case where the polynomials are defined over an integral domain. There, we analyse the algorithms by counting the number of ring operations. In Sections 4.2 and 4.3 we discuss some adaptations of the algorithm to the case of integers and small finite fields in order to provide a finer analysis in the bit complexity model.
Over R[X]. Algorithm 6 depicted below is straightforward from Theorems 3.6 and 3.11. We mainly provide its description to serve as a starting point for its adaptations in the next sections. The algorithm covers both the dense and the sparse case. The only difference is at Step 1.
Algorithm 6 MODULARVERIFICATION
Input: F, G, H and P ∈ R[X], P monic of degree n and F, G and H of degrees < n; 0 < ε < 1.
Output: True if H = FG mod P, False with probability at least 1 − ε otherwise.
1: if F, G and H are given in sparse representation and #H > #F·#G(#P − 1)^{⌈1/γ⌉} then return False   ▷ Proposition 2.2
2: α ← random element from a subset E of R of size ≥ (n − 1)/ε
3: β ← (FG mod P)(α)                                              ▷ using Theorem 3.6 or 3.11
4: return True if β = H(α), False otherwise

Theorem 4.1. If R has at least (n − 1)/ε elements, Algorithm 6 is correct. If F, G and H are dense, the algorithm uses O(#P·n) operations in R. If F, G and H are sparse, the algorithm uses O((#F·#P^{⌈1/γ⌉−1} + #G + #H) log n) operations in R.
Proof. Step 1 dismisses a trivial mistake if the polynomials are sparse. If H = (FG) mod P, then H(α) = (FG mod P)(α) for any α and the algorithm always returns True. Otherwise, let ∆ = H − (FG) mod P. Then ∆ has degree < n, hence has at most n − 1 roots, and the probability that α, randomly chosen in E, is a root of ∆ is at most (n − 1)/((n − 1)/ε) = ε. The algorithm returns True in that case with probability at most ε. The complexity is given by the cost of a single modular product evaluation, as stated in Theorem 3.6 for the dense case and in Theorem 3.11 for the sparse case.
If R is not large enough, the algorithm fails and it is customary to revert to an extension ring R_ext to perform the evaluation in a larger set. Using Theorems 3.6 and 3.11, we get the following extension when the polynomials are evaluated at a random point of R_ext rather than R.
Corollary 4.2. Let R be an integral domain with less than (n − 1)/ε elements, and R_ext an extension ring of R with at least (n − 1)/ε elements. Then Algorithm 6 can be adapted by choosing a random element from R_ext, with the same probability of success. It uses O(n·#P) operations in R and O(n) operations in R_ext if F, G and H are dense, and O((#F·#P^{⌈1/γ⌉−1} + #G + #H) log n) operations in R_ext if they are sparse.
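The probabilistic skeleton of Algorithm 6 over the integers can be sketched as follows, reusing the naive modular_eval above in place of the fast evaluation of Theorems 3.6 and 3.11; the sample-space size ⌈(n − 1)/ε⌉ mirrors the analysis of Theorem 4.1 and the function name is ours.

    import math
    import random

    # One-sided Monte Carlo verification of H = F*G mod P over Z (sketch of
    # Algorithm 6): "False" is always correct, "True" is wrong with probability
    # at most eps when H differs from F*G mod P.
    def modular_verification(F, G, H, P, n, eps=0.001):
        size = max(2, math.ceil((n - 1) / eps))    # evaluation set E = {0, ..., size-1}
        alpha = random.randrange(size)
        lhs = sum(c * alpha**e for e, c in H.items())
        return lhs == modular_eval(F, G, P, n, alpha)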
In the dense case, Algorithm 6 uses an optimal number of operations in R as soon as #P is constant. It is always faster than a modular product when #P·n < M(n), that is when #P < log(n) log log(n) for a general ring R. In the sparse case, Theorem 4.1 is not linear in the input size. Indeed, #P is raised to a potentially large power ⌈1/γ⌉ − 1 and, more importantly, there is a factor log n in the number of operations in R while the input has only (#F + #G + #H) elements of R. Nevertheless, the efficiency of verification has to be compared with the cost of computing FG mod P. We assume the latter to be done with a sparse multiplication followed by a sparse division by P. This hypothesis seems reasonable as no work has been done to optimize such an operation yet. Leaving aside the division, the number of operations in R for sparse multiplication could be either O(#F·#G) with the naive approach or Õ(#F + #G + #(FG)) using [ ]. Assuming #P to be constant, our verification has a complexity of O((#F + #G + #H) log n), which is always faster when #(FG) = o(#F·#G/log n). If we assume that #(FG) = O(#F·#G), our verification will be faster at least when n = O(#F + #G) and #P is constant. Depending on the cost of the division, our algorithm could be faster in more cases. Of course, these conditions are not very restrictive. We use Algorithm 6 in Section 5 to verify classical polynomial multiplication, where P will be a binomial of degree either logarithmic in the sparsity or polynomial in the input degree.
Yet the efficiency of Algorithm 6 depends heavily on the integral domain R. Indeed the complexity of polynomial multiplication in R[X] can be faster than O(n log n log log n) operations in R [13, 15]. Furthermore, if R is small, one operation in R_ext corresponds to a non-constant number of operations in R. In the following sections we consider polynomials over the integers or finite fields and we provide thorough analyses together with adapted versions when necessary.
Over Z[X]. If the polynomials are defined over Z, there is no difficulty with the size of E in Algorithm 6. However, we must prevent the growth of the integers during the evaluation. It is very classical to choose a random prime q and to map the whole computation into F_q. To do so, we must ensure two properties of the prime q. First, q must be large enough to use the algorithm, that is at least (n − 1)/ε. Second, if H ≠ (FG) mod P, we need this inequality to hold modulo q as well. For this second property, we define ∆ = H − (FG) mod P. To ensure that ∆ does not vanish modulo q, we need that at least one coefficient of ∆ is nonzero modulo q. We then need to bound its coefficients to assess the latter fact.
Proposition 4.3. The coefficients of ∆ are bounded by ‖H‖∞ + min(#F, #G)‖F‖∞‖G‖∞(#P‖P‖∞)^{⌈1/γ⌉}.
Proof. The coefficients of ∆ are bounded by ‖H‖∞ + ‖FG mod P‖∞. The bound follows from Proposition 2.2, with Q = FG and ‖Q‖∞ ≤ min(#F, #G)‖F‖∞‖G‖∞ by Lemma 2.1.
Using this bound and Proposition 2.6 we can determine an appropriate prime q to adapt Algorithm 6 to the integer case. This is done in Algorithm 7 (MODULARVERIFICATIONOVERZ) below.
Algorithm 7 MODULARVERIFICATIONOVERZ
Input: F, G, H and P ∈ Z[X], P monic of degree n and F, G and H of degrees < n; 0 < ε < 1.
Output: True if H = FG mod P, False with probability at least 1 − ε otherwise.
1: ∆∞ ← ‖H‖∞ + min(#F, #G)‖F‖∞‖G‖∞(#P‖P‖∞)^{⌈1/γ⌉}
2: q ← RANDOMPRIME(λ, ε/2) where λ = max(n/ε, ln(∆∞)/ε)
3: (F_q, G_q, H_q, P_q) ← (F mod q, G mod q, H mod q, P mod q)
4: return MODULARVERIFICATION(F_q, G_q, H_q, P_q, ε/2)
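A compact Python sketch of this reduction step is given below. It reuses sparse_mul and reduce_mod from the earlier sketch, replaces the paper's RANDOMPRIME routine by a naive prime sampler, and ignores the coefficient bound ∆∞ of Proposition 4.3 when sizing the prime, so it only illustrates the structure of Algorithm 7; all names are ours.

    import math
    import random

    def is_prime(m):                       # naive trial division, only for the sketch
        if m < 2:
            return False
        d = 2
        while d * d <= m:
            if m % d == 0:
                return False
            d += 1
        return True

    def naive_random_prime(lam):           # stand-in for the paper's RandomPrime(lambda, eps)
        while True:
            q = random.randrange(lam, 2 * lam)
            if is_prime(q):
                return q

    def verification_over_Z(F, G, H, P, n, eps=0.001):
        # Map everything to F_q for a random prime q, then compare evaluations at
        # a random point modulo q: intermediate integers stay of size about log q.
        q = naive_random_prime(max(3, math.ceil((n - 1) / eps)))
        red = lambda A: {e: c % q for e, c in A.items() if c % q}
        Fq, Gq, Hq, Pq = red(F), red(G), red(H), red(P)
        alpha = random.randrange(q)
        prod = reduce_mod(sparse_mul(Fq, Gq), Pq, n)   # naive stand-in for Theorem 3.6/3.11
        lhs = sum(c * pow(alpha, e, q) for e, c in Hq.items()) % q
        rhs = sum(c * pow(alpha, e, q) for e, c in prod.items()) % q
        return lhs == rhs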
Theorem 4.4. Algorithm 7 is correct. If C = max(‖P‖∞, ‖F‖∞, ‖G‖∞, ‖H‖∞) and T = max(#F, #G, #H), the algorithm requires O(log(1/ε) log(n log C/ε) log log(n log C/ε) I(log(n log C/ε))) bit operations to get a prime number, plus
• O((#P·n + n log C / log(n log C/ε)) I(log(n log C/ε))) bit operations if F, G and H are dense, or
• O((T·#P^{⌈1/γ⌉−1} log n + (T + #P) log C / log(n log C/ε)) I(log(n log C/ε))) bit operations if F, G and H are sparse.
Proof. To ensure that MODULARVERIFICATION works properly, we need q to be at least (n − 1)/ε. This is the case since λ ≥ n/ε. The algorithm always returns the correct answer when H = (FG) mod P. Otherwise, it may incorrectly return True in two cases: either H_q = (F_q G_q) mod P_q while the equality does not hold over Z, or MODULARVERIFICATION incorrectly returns True. Both situations occur with probability at most ε/2. Indeed, by Proposition 4.3, ∆∞ is always a bound on ‖∆‖∞ where ∆ = H − (FG) mod P. Therefore, Proposition 2.6 shows that with probability at least 1 − ε/2, the number q chosen at Step 2 is a prime number such that ∆ mod q ≠ 0. The error probability of Algorithm 7 is thus at most ε.
Let us now analyse the complexity of the algorithm. As a first step, we shall express q in terms of the input size. Since #P, #F, #G ≤ n, ∆∞ = O(n^{1+⌈1/γ⌉} C^{2+⌈1/γ⌉}). Thus, ln ∆∞ = O(⌈1/γ⌉(log n + log C)) = O(n(log n + log C)) since ⌈1/γ⌉ ≤ n. This implies q = O((n/ε)(log n + log C)) and log q = O(log(n log C/ε)).
By Proposition 2.4, Step 2 costs O(log(1/ε) log q log log q I(log q)) bit operations, that is
O(log(1/ε) log(n log C/ε) log log(n log C/ε) I(log(n log C/ε))).   (9)
Step 3 requires O(n (log C/log q) I(log q)) bit operations in the dense case, and O((T + #P)(log C/log q) I(log q)) bit operations in the sparse case. By Theorem 4.1, Step 4 requires O(#P·n I(log q)) bit operations in the dense case and O(T·#P^{⌈1/γ⌉−1} log n I(log q)) bit operations in the sparse case. Adding the complexities of all these steps leads to the claimed bit complexity.
Remark 4.5. In most cases, the cost of finding a prime number is negligible in comparison to the rest of the algorithm. In the dense case, it is negligible as long as ε = 1/n^{O(1)}. In the sparse case, it is negligible when the degree n is not too large compared to the other input parameters. More precisely, this is the case when n/ε = ((T + #P) log C)^{O(1)}.
When computing a multiplication followed by a modular reduction with polynomials in Z[X], the size of the coefficients can grow significantly, as shown by the bound on ‖FG mod P‖∞ given in the proof of Proposition 4.3. In contrast, our verification algorithm works with bounded integers of bit length O(log(n log C/ε)). This is logarithmic in the input size in dense representation and linear in the sparse one. Our verification therefore avoids paying the coefficient growth, contrary to the direct computation. Taking this growth into account to compare our verification algorithm with the computation seems hard. We only detail the cases where our algorithm is already faster even without considering the coefficient growth. We shall mention that for sparse polynomials, #(FG mod P) might be smaller than #H. In our analysis, we assume for simplicity that both have approximately the same size.
Remark 4.6. For ε = 1/n^{O(1)}, Algorithm 7 is faster than the polynomial modular product
(i) when the polynomials are dense and #P < min(log n/log log n, log C/log log log C);
(ii) when the polynomials are sparse and n/ε = ((T + #P) log C)^{O(1)}.
Proof. We assume ε = 1/n^{O(1)}. In particular, log(n/ε) = O(log n). To simplify the analysis, we place ourselves in the case of a negligible cost for finding the prime q, as described in Remark 4.5. In the sparse case, this implies that n/ε must be polynomial in (T + #P) log C. Therefore log n < min(T, #P) and I(log(n log C)) = I(log log C). This makes our verification faster than computing the modular product, since the latter requires at least #(FG mod P) = O(T·#P^{⌈1/γ⌉}) operations on integers of bit length log C.
For the dense case, we compare to the cost of multiplying two polynomials of degree n and coefficients bounded by C. As seen in the introduction, this reduces to integer multiplication with Kronecker substitution and it costs I(n(log n + log C)) = O(n(log² n + log n log C + log C log log C)) bit operations. By Theorem 4.4 our verification needs O(#P·n I(log n + log log C) + n log C log(log n + log log C)) bit operations. The second term is dominated by the complexity of multiplying the polynomials when ε = 1/n^{O(1)}. The first term is O(#P·n((log n + log log C) log log n + (log n + log log C) log log log C)). When #P < min(log n/log log n, log C/log log log C), this term is bounded by O(n(log² n + log n log C + log C log log C)), that is the complexity of computing the product.
Over F_q[X]. The situation over a finite field F_q is different since there is no growth to prevent. When q is large enough, Theorem 4.1 applies directly. Otherwise, one can revert to computing in a sufficiently large extension field of F_q, where Corollary 4.2 can be applied. We first give the precise complexity bounds for these two cases.
Corollary 4.7. Let F, G, H and P ∈ F_q[X] be as in Algorithm 6. One can test whether H = (FG) mod P using Algorithm 6, with O(log(1/ε) log_q(n/ε) M(log_q(n/ε))(log q + log log_q(n/ε))) operations in F_q to get an irreducible polynomial of degree O(log_q(n/ε)), plus
• O(n·#P + n M(log_q(n/ε))) operations in F_q if F, G and H are dense, or
• O(T·#P^{⌈1/γ⌉−1} log n M(log_q(n/ε))) operations in F_q if F, G and H are sparse with at most T nonzero monomials.
Proof. Let us assume that q < (n − 1)/ε, otherwise Theorem 4.1 applies straightforwardly. In that case, Corollary 4.2 requires choosing a random point in an extension F_{q^d} of F_q with at least (n − 1)/ε elements. More precisely, we use Proposition 2.8 to produce with probability 1 − ε/2 an irreducible polynomial of degree d over F_q, where d is the smallest integer such that q^d ≥ (n − 1)/ε. The algorithm may be incorrect if either the polynomial used to define F_{q^d} fails to be irreducible, or if Algorithm 6 fails. If we choose α at random in F_{q^d}, the error probability of Algorithm 6 is at most ε/2. This gives a total probability of error of at most ε.
By definition, the degree of the extension is d = O(log_q(n/ε)). The cost of generating the irreducible polynomial of degree d is O(log(1/ε) d M(d)(log q + log d)) by Proposition 2.8.
Using Corollary 4.2 and the fact that an operation in F_{q^d} costs M(d) operations in F_q, we obtain the claimed costs.
Remark 4.8. If ε = 1/n^{O(1)}, the cost of getting an irreducible polynomial is negligible in the dense case. Then the algorithm requires O(#P·n + n M(log_q n)) operations in F_q. If we add the degree constraint log n = o(T), the cost of getting an irreducible polynomial is also negligible in the sparse case and the algorithm requires O(T·#P^{⌈1/γ⌉−1} log n M(log_q n)) operations in F_q.
As #(FG mod P) is bounded by T·#P^{⌈1/γ⌉} and computing FG mod P requires at least one operation in F_q for each monomial, this remark directly gives a case where the verification is faster than the modular product in general. Moreover, naive algorithms can be used to perform products in the extension of F_q.
Remark 4.9. When ε = 1/n^{O(1)} and n = T^{O(1)}, Algorithm 6 in the sparse case is in general faster than the modular multiplication and requires Õ(T·#P^{⌈1/γ⌉−1}) bit operations if q < n/ε.
In the dense case, we can see from Remark 4.8 that the verification complexity might in fact be larger than the cost of computing the modular product FG mod P. Indeed, assuming M_q(n) = O(n log q(log n + log log q)) bit operations [ ], we have M_q(log_q(n/ε)) = O(log(n/ε)(log log(n/ε) + log log q)). While the cost of computing FG mod P uses M_q(n) bit operations, our verification requires O(#P·n log q log log q + n log(n/ε)(log log(n/ε) + log log q)) bit operations. When #P is not a constant, the latter is always larger. The following remark specifies when we can expect a positive result.
Remark 4.10. Assuming #P to be constant and ε = 1/n^{O(1)}, Algorithm 6 in the dense case with R = F_q is asymptotically faster than the modular multiplication when
(i) log(n/ε) < q < n/ε, since modular multiplication costs O(n log q log n) bit operations while verification, which does need an extension, is O(n log(n/ε) log log(n/ε));
(ii) q > n/ε and log log q = o(log n), since modular multiplication costs O(n log q log n) bit operations while verification, which does not use an extension, is O(n log q log log q).
If the field is very large, namely when log log q = Ω(log n), Algorithm 6 is asymptotically as fast as the modular multiplication: the dominant factor in both complexities is O(n log q log log q).
We shall mention that the verification cost in (i) of Remark 4.10 assumes the use of fast multiplication of polynomials in order to check fast multiplication of polynomials of degree O(n). Even though this dependency is not a problem in theory, it might not be satisfactory in practice. One solution would be to use a naive polynomial multiplication for the extension field arithmetic, but this further tightens the superiority of the verification.
Remark 4.11. Assuming that extension field arithmetic is done naively using quadratic polynomial multiplication, Algorithm 6 remains faster than modular multiplication only when q is at least a fixed positive power of n in the dense case.
We now propose a novel method that enables us to improve all the dense cases where an extension field is necessary, while not relying on any polynomial arithmetic. More precisely, we show that fast verification does exist when q < log n, which is of great interest for the field F_2. It is based on the evaluation of polynomials on matrices rather than scalars, combined with Freivalds' algorithm for verifying matrix multiplication [ ]. Indeed, choosing α from an extension field inherently requires polynomial multiplication.
Instead of picking a random point that is probably not a root of ∆ = H − (FG) mod P when ∆ ≠ 0, we pick a polynomial R ∈ F_q[X] of degree k < n that is probably not a divisor of ∆. To test whether R divides ∆, we evaluate ∆ on the companion matrix C_R of R, defined by
C_R = ( 0 0 · · · 0 −r_0
        1 0 · · · 0 −r_1
        0 1 · · · 0 −r_2
        .  .  . .  .  .
        0 0 · · · 1 −r_{k−1} )
where R = ∑_{i=0}^{k} r_i X^i with r_k = 1. This strategy relies on the fact that R is the minimal polynomial of its companion matrix. Therefore, R(C_R) = 0 and ∆(C_R) = 0 for every multiple ∆ of R. In other words, R divides ∆ if and only if ∆(C_R) = 0. We will show that taking R irreducible over F_q of degree k = O(log n) makes this approach faster than the one using an extension field when ε is constant. Furthermore, it extends the possibility to have fast verification for any field, whatever the size of the polynomials.
To check whether ∆(C_R) = 0, we need to evaluate H and (FG) mod P on C_R, and to verify that the evaluations match. Of course, one cannot directly evaluate those polynomials on C_R as it would cost O(nk^ω) operations in F_q in the dense case, where ω is the matrix multiplication exponent. Since k = O(log n), this would not give any improvement over Remark 4.10. Instead, we rely on the so-called Freivalds technique to verify matrix multiplication [ ]. The idea is that the matrix product C = A × B ∈ R^{k×k} can be verified by asserting that uC = (uA) × B for a random vector u ∈ {0, 1}^k, with a probability of error of 1/2. To assert that two polynomial evaluations on the matrix C_R match, it is sufficient to verify that their projections by the vector u are equal. Given a degree-n polynomial H ∈ F_q[X], one can compute uH(C_R) with O(nk) operations in F_q using Horner evaluation:
uH(C_R) = u ∑_{i=0}^{n} h_i C_R^i = ( ∑_{i=1}^{n} h_i u C_R^{i−1} ) C_R + h_0 u.   (10)
Since a matrix-vector product with C_R only costs O(k) operations in F_q, and the Horner procedure only uses n of those matrix-vector products, the cost is clear.
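A small Python sketch of this projected Horner evaluation follows; the dense coefficient list, the function names and the modular arithmetic layout are ours. It computes u·H(C_R) using only O(k)-cost vector operations, exploiting the companion-matrix structure.

    def vec_times_companion(v, r, q):
        # v: length-k row vector over F_q; r = [r_0, ..., r_{k-1}] with R = X^k + sum r_i X^i.
        # Row-vector times companion matrix: one shift plus one inner product, O(k) operations.
        k = len(v)
        last = (-sum(v[i] * r[i] for i in range(k))) % q
        return v[1:] + [last]

    def project_eval(h, r, u, q):
        # Compute u * H(C_R) over F_q by Horner's rule; h = [h_0, ..., h_n] dense coefficients.
        v = [(h[-1] * ui) % q for ui in u]                     # start with h_n * u
        for c in reversed(h[:-1]):                             # then h_{n-1}, ..., h_0
            v = vec_times_companion(v, r, q)
            v = [(vi + c * ui) % q for vi, ui in zip(v, u)]
        return v

For H on one side and FG mod P on the other, one would not expand the product: the modular evaluation algorithms of Section 3, run with α replaced by C_R as described in Remark 4.12 below, produce the matching projection.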
Remark 4.12. It is sufficient to replace the evaluations of F(α) and P(α) by uF(C_R) and uP(C_R) in Algorithm 3 (MODULAREVALUATION) to reach a complexity of O(n(#P + deg(R))) operations in F_q for computing u(FG mod P)(C_R) in the dense case. More informally, it is sufficient to say that any of the operations in R_ext now has the cost of one matrix-vector product with C_R.
Theorem 4.13. Let F, G, H and P ∈ F_q[X] be as in Algorithm 6. We can check whether H = FG mod P in O(#P·n + n log_q(n) log(1/ε)) operations in F_q, with a probability of error at most ε if H ≠ FG mod P.
Proof. Let 0 < ε < 1 be a fixed probability. The algorithm needs two steps. First it computes, with probability at least 1 − ε, an irreducible polynomial R of degree d = ⌈log_q(2n/ε)⌉ using Proposition 2.8. Second, it computes uH(C_R) and u(FG mod P)(C_R) for some random vector u ∈ {0, 1}^d. If both evaluations are distinct, the algorithm returns False. Otherwise, it repeats these two steps O(log(1/ε)) times until one of the repetitions fails. If this is the case it returns False, otherwise the algorithm returns True.
If H = (FG) mod P, the algorithm always returns True. Let us assume that H ≠ (FG) mod P, and let ∆ = H − (FG) mod P. For the algorithm to return True, each repetition must ensure that uH(C_R) = u(FG mod P)(C_R). This may happen if either R divides ∆, whence H(C_R) = (FG mod P)(C_R), or R does not divide ∆ but uH(C_R) = u(FG mod P)(C_R). Since there are at least q^d/(2d) irreducible polynomials of degree d in F_q[X] by Proposition 2.7, and at most n/d of them divide ∆, the probability that R divides ∆ is at most 2n/q^d ≤ ε provided R is irreducible. Taking into account the probability that R is not irreducible, the probability that R divides ∆ is at most 2ε. Then, using Freivalds' standard argument, if R does not divide ∆, the probability that u∆(C_R) = 0 is at most 1/2. Altogether, the probability that one iteration returns True is at most 1/2 + 2ε, which is bounded away from 1 for small enough ε, so that the probability that the O(log(1/ε)) independent iterations all return True is at most ε.
Let us now analyse the complexity of the algorithm. Since ε is a constant, the second step uses O(#P·n + n log_q n) operations in F_q using Remark 4.12. The first step is negligible, even if naive polynomial arithmetic is used. Note that in this complexity, #P·n is the cost of Algorithm 2 (LEADINGCOEFFICIENTS). Since it is deterministic and only depends on P and F, it can be called only once rather than at each iteration. We then get the complexity O(#P·n + n log_q(n) log(1/ε)).
Remark 4.14. Note that, compared to using evaluation at α in an extension field, no operation depends on ε. If ε is fixed, the new method replaces a factor M(log_q n) in the complexity by log_q n. Moreover our new approach requires only simple computations: additions of vectors, multiplications of a vector by a scalar and matrix-vector products with a companion matrix. Furthermore, when #P and ε are constants, the verification is always faster than the modular multiplication, whatever the size of q.
This new method still requires some polynomial arithmetic, even if only naive polynomial multiplication is used. This is because the algorithm called to provide the degree-d polynomial R relies on polynomial products and GCDs to ensure that R is probably irreducible. In order to remove the dependency on polynomial arithmetic, we can just choose a random monic degree-d polynomial R and compute the evaluation on C_R even if R is not irreducible. This requires taking several random polynomials R to reach the target probability ε.
Corollary 4.15. Let F, G, H and P ∈ F_q[X] be as in Algorithm 6. Without using any polynomial multiplication, we can check whether H = FG mod P in O(#P·n + n(log_q n)² log(1/ε)) operations in F_q, with a probability of error at most ε if H ≠ FG mod P.
Proof. In the proof of Theorem 4.13, we replace one evaluation on C_R, with R irreducible with probability at least 1 − ε, by a few evaluations on several C_R with R a random monic polynomial of degree d. As the probability that a random monic polynomial R of degree d is irreducible is at least 1/(2d) by Proposition 2.7, we need to generate O(d log(1/ε)) random polynomials R to reach a probability at least 1 − ε that at least one of them is irreducible. Thus the evaluation part of the algorithm is repeated O(d) = O(log_q n) times since ε is constant.
Even if this new approach does not improve the complexity over evaluation at α in an extension field using naive polynomial multiplication, it allows us to remove completely the dependency on any polynomial multiplication algorithm. While this result might not seem useful at first sight, it will be used in Section 5.1 to provide efficient verification for polynomial multiplication. Indeed, in that case we will need to use verification with P of degree smaller than the input degree.
This new method using a companion matrix also works in the sparse case. Indeed, the powers α^t in Algorithm 5 (SPARSEMODULAREVALUATION) are now replaced with C_R^t. However, only a few powers C_R^t with 1 < t < n are relevant and we cannot compute all of them as in the dense case. This implies that computing u(FG mod P)(C_R) instead of (FG mod P)(C_R) is useless in that case. Indeed, using fast exponentiation together with the structure of the powers of companion matrices [ ] already yields a complexity of O((#F·#P^{⌈1/γ⌉−1} + #G) log n log_q n) operations in F_q, and we cannot hope to lower this by some random vector projection. In that case, using the Freivalds technique is useless and we get a better probability of success. Choosing O(log(n/ε) log(1/ε)) polynomials R at random, at least one of them is irreducible and does not divide ∆ = H − FG mod P with probability 1 − ε. This leads to an algorithm which does not use any polynomial product in the sparse case too. Even though it is asymptotically not as fast as the verification in an extension field where naive polynomial arithmetic is used, it is still quasi-linear.
Corollary 4.16. Let P ∈ F_q[X] be monic of degree n and F, G, H ∈ F_q[X] of degrees less than n and sparsity at most T, and 0 < ε < 1. Using a direct evaluation on a companion matrix, we can check whether H = FG mod P, with a probability of error at most ε if H ≠ FG mod P, in O(T·#P^{⌈1/γ⌉−1} log n log_q(n/ε) log(1/ε)) operations in F_q, without performing any polynomial product.

In this section we study the simpler problem of verifying a classical polynomial multiplication. Given three polynomials F, G and H ∈ R[X] of respective degrees n, n and 2n, the classical idea to verify H = FG simply boils down to testing H(α) = F(α)G(α) for some random α in a large enough set S. As mentioned in the introduction, this strategy may or may not have an optimal bit complexity, depending on the context. Here we are concerned with two difficulties that arise in either the dense or the sparse case.
If the polynomials are dense, the verification through evaluation requires a number of operations in S that is linear in the degree n of the input polynomials. When R has more than 2n elements, taking S ⊂ R is sufficient to use evaluation. However, multiplication in R does not have a linear bit complexity and the best known results remain quasi-linear [2, 15, 14]. The evaluation therefore leads to a quasi-linear bit complexity of O(n I(log n)) = O(n log n log log n). When R is too small, for instance with a small finite field, S is classically taken as a field extension of R, large enough to make it unlikely that α is a root of H − FG. Therefore, each operation in S corresponds to an operation over R[X] with non-negligible degree, meaning that the number of operations in R is no longer linear in the input degree n. As mentioned in the introduction, Kaminski's approach [ ] circumvents the latter problem by replacing the evaluation with a computation in R[X]/(X^i − 1) for a random integer i in a prescribed range. There, his algorithm is able to verify dense polynomial products with a linear number of operations in R whatever the ring size. However, the same difficulty as for large rings may arise. Since operations in R do not have a linear bit complexity, unless say R = F_2, this is not always sufficient to reach an optimal bit complexity for the verification. In Section 5.1, we present Kaminski's approach [ ] and provide a thorough analysis in the bit complexity model. In particular, we show that it is possible to get optimal verification in the bit complexity model for any polynomials in Z[X] and for some polynomials in F_q[X], depending on the relation between q and n.
If the polynomials F, G and H are sparse with at most T nonzero coefficients, the evaluation requires a number of operations in S that is O(T log n). However, the input bit size is given by the size of the exponents plus the size of the coefficients, that is O(T log n) bits for constant-size coefficients. Since S has to be of size at least 2n, its elements have at least log n bits, and the bit complexity of evaluation would be of order T log² n, which is not even quasi-linear. In Section 5.2 we develop a novel method, already appearing in [ ], to verify sparse polynomial multiplication with a bit complexity that is quasi-linear in the input size.
In [ ], Kaminski describes an algorithm to verify a polynomial product H = FG ∈ R[X] using a linear number of operations in R, regardless of its size. His method chooses at random a polynomial P that probably does not divide ∆ = H − FG if ∆ ≠ 0. Then he verifies H = FG in R[X]/(P) using fast polynomial multiplication. Surprisingly, taking P of degree o(n) in his algorithm enables one to reach a linear number of operations in R. In the following we will often use a fixed constant δ > 0.
The first step is to randomly select a polynomial from a fixed set, such that it most probably does not divide ∆ = H − FG when ∆ ≠ 0.
A standard approach could be to consider irreducible polynomials. This would be the direct generalization of the evaluation method. However, Kaminski considers polynomials that are instead of the form X^i − 1, for some integers i > 0. These polynomials have two advantages: reduction modulo X^i − 1 requires only additions, and the number of them that can divide a given polynomial is controlled by cyclotomic polynomials, as expressed in the following proposition.
Proposition 5.1 ([ ]). For any integer set I ⊂ N, ∏_{i∈I} Φ_i divides lcm{X^i − 1 : i ∈ I}, where Φ_i is the i-th cyclotomic polynomial in R[X].
Kaminski also gives a lower bound on the degree of lcm{X^i − 1 : i ∈ I}, depending on I. In particular the proposition implies that a nonzero polynomial, divisible by k polynomials of the form X^i − 1, cannot have a too small degree. In the converse direction, a nonzero polynomial ∆ of degree at most 2n cannot have too many divisors of the form X^i − 1. This is the content of Kaminski's main theorem.
Theorem 5.2 ([ ]). Let ∆ be a nonzero polynomial in R[X] of degree ≤ 2n and 0 < e < 1/2. Let k = ⌈δn^e ln ln(n^{1−e})⌉. At most k − 1 polynomials in the set {X^i − 1 | n^{1−e} ≤ i < 2n^{1−e}} divide ∆.
Kaminski's approach is then to choose a random integer i ∈ [n^{1−e}, 2n^{1−e}), to reduce the input polynomials modulo X^i − 1 and to verify the product in R[X]/(X^i − 1). We provide in Algorithm 8 (KAMINSKIVERIFICATION) a more precise description of this approach.
Algorithm 8 KAMINSKIVERIFICATION
Input: F, G, H ∈ R[X] of degrees n, n and 2n; and 0 < e < 1/2.
Output: True if H = FG, False with probability at least 1 − (⌈δn^e ln ln(n^{1−e})⌉ − 1)/n^{1−e} otherwise.
1: i ← random integer in [n^{1−e}, 2n^{1−e})
2: F_i, G_i, H_i ← F mod X^i − 1, G mod X^i − 1, H mod X^i − 1
3: M ← F_i G_i                          ▷ using a fast multiplication algorithm
4: M_i ← M mod X^i − 1
5: return M_i = H_i

Theorem 5.3 ([ ]). Let F, G and H ∈ R[X] be of degrees at most n, n and 2n, let 0 < e < 1/2, and let k be the integer defined in Theorem 5.2. Algorithm 8 uses O(n) operations in R, and its failure probability is at most (k − 1)/n^{1−e} if H ≠ FG.
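The structure of Algorithm 8 is easy to render in Python: reducing modulo X^i − 1 amounts to folding exponents modulo i. The sketch below uses sparse dictionaries and a naive product, whereas Kaminski's linear operation count requires a dense representation and a subquadratic multiplication; only one round is shown and the function names are ours.

    import random

    def fold(F, i):
        # Reduce a polynomial modulo X^i - 1: exponents are simply taken modulo i.
        R = {}
        for e, c in F.items():
            R[e % i] = R.get(e % i, 0) + c
        return {e: c for e, c in R.items() if c != 0}

    def kaminski_verification(F, G, H, n, e=0.45):
        # One round of Algorithm 8: pick i in [n^(1-e), 2 n^(1-e)), then test
        # H = F*G modulo X^i - 1.  The product below is naive (quadratic).
        lo = max(2, int(n ** (1 - e)))
        i = random.randrange(lo, 2 * lo)
        Fi, Gi, Hi = fold(F, i), fold(G, i), fold(H, i)
        M = {}
        for a, fa in Fi.items():
            for b, gb in Gi.items():
                M[(a + b) % i] = M.get((a + b) % i, 0) + fa * gb
        return {k: v for k, v in M.items() if v != 0} == Hi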
Remark 5.4. To be more precise, Algorithm 8 requires O(n) additions in R at Step 2 to compute the first three reductions modulo X^i − 1, M(2n^{1−e}) operations in R to compute the product at Step 3, and O(n^{1−e}) additions in R to compute the last reduction.
One shall remark that the product in Step 3 must be computed with a subquadratic algorithm so that M(2n^{1−e}) = O(n), since e < 1/2. If the parameter e is taken close enough to 1/2, Karatsuba's algorithm suffices to reach a linear number of operations. The failure probability is O(log log n / n^{1−2e}), whence the need to have e < 1/2. We can bound this probability by O(1/n^{e'}) for any positive constant e' < 1 − 2e. In order to reach a probability ε of error, the algorithm should be repeated O(log_n(1/ε)) times. Note that this number of rounds is constant if ε is taken as 1/n^{O(1)}.
The drawback of such an approach is that it crucially relies on a somewhat fast multiplication algorithm, and that it performs multiplications of polynomials of degrees more than √n. This means that optimal verification of the product of two degree-n polynomials uses a product of polynomials of degree close to n. In some contexts, such as verifying an implementation, relying on the same problem is definitely problematic.
We note that all the steps starting from Step 3 aim to verify H_i = F_i G_i mod X^i − 1, so that they can be replaced by a modular product verification.
Corollary 5.5. Let R = Z or a finite field, and let F, G and H ∈ R[X] be of degrees n, n and 2n. We can check whether H = FG with a probability of failure at most ε if H ≠ FG. This requires O(n log_n(1/ε)) additions in R plus o(n log_n(1/ε)) operations in R, without reverting to any polynomial multiplication. In particular, the algorithm uses an optimal number of operations in R when ε = 1/n^{O(1)}.
Proof. We replace the last three steps of Algorithm 8 by a modular product verification, with a probability of failure at most 1/n. Over Z or large finite fields, the complexity of this part is given by the dense version of Theorem 4.1 with P = X^i − 1 and i = O(n^{1−e}). Over small finite fields, we rely instead on Corollary 4.15. In both cases, one can achieve a failure probability at most 1/n with at most O(log n) repetitions of the algorithm, for a total number of operations in R that remains o(n). The total probability of failure of the modified algorithm is then 1/n + O(1/n^{e'}) = O(1/n^{e'}) for some e' > 0. We can repeat this modified algorithm for O(log_n(1/ε)) rounds to get the announced failure probability and complexity.
In [ ], Kaminski only details the algebraic complexity of his polynomial product verification, and no further insights on the bit complexity are given. We now perform this analysis for polynomials over finite fields and over Z. We prove that, surprisingly, his algorithm remains linear in the number of bit operations in many cases. For polynomials over F_q, the algorithm fails to be linear only when q is doubly exponentially larger than the degree. For polynomials over Z, a similar condition applies. However, we are able to describe a variant of the algorithm that has linear bit complexity for polynomials with large coefficients. Hence we prove that polynomial product verification over Z has linear bit complexity in all cases. Our variant is based on integer product verification, for which Kaminski also gives a linear-time algorithm in [ ]. Of course all those algorithms are therefore optimal.
The next theorem provides the bit complexity analysis of Kaminski's algorithm over finite fields.
Theorem 5.6. Let F, G and H ∈ F_q[X] be of degrees n, n and 2n, and 0 < e < 1/2. Algorithm 8 requires O(n log q + n^{1−e} log q log log q) bit operations. When log log q = O(n^e), one can verify whether H = FG, with failure probability at most ε if H ≠ FG, using O(n log q log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}.
Proof. We apply the count of operations given in Remark 5.4. The additions give the term O(n log q). The bit complexity of the product of degree-O(n^{1−e}) polynomials over F_q is M_q(n^{1−e}) = O(n^{1−e} log q log(n log q)), which is O(n log q + n^{1−e} log q log log q). We obtain the claimed complexity. The second part directly follows from the observation that O(log_n(1/ε)) rounds of the algorithm yield a failure probability at most ε.
Note that the bound log log q = O(n^e) needed to get a linear number of bit operations in n log q is only valid when using the fastest known multiplication algorithm. If a slower algorithm is used, the bound becomes smaller. For instance, using Karatsuba's algorithm the product of degree-O(n^{1−e}) polynomials uses O(n^{(1−e) log 3} log q log log q) bit operations. For the algorithm to still have an optimal complexity, we need n^{(1−e) log 3} log log q = O(n). This implies e ≥ 1 − 1/log 3 ≈ 0.37 and log log q = O(n^{1−(1−e) log 3}). If we take e close to 1/2, say 0.45, the bound reads log log q = O(n^{0.13}) while it is log log q = O(n^{0.45}) using the fastest multiplication algorithm.
Further, as mentioned previously, using a fast multiplication algorithm for the verification of a polynomial product is problematic. We now analyse the bit complexity of our variant that does not use any polynomial product, that is, of Corollary 5.5. We show that the same complexity and the same bound on q can be obtained without any polynomial product.
Remark 5.7. Let F, G, H ∈ F_q[X] be of degrees n, n and 2n, and 0 < e < 1/2. Algorithm 8 can be implemented using a modular product verification and without any polynomial product. This variant has bit complexity O(n log q + n^{1−e} log q log log q). When log log q = O(n^e), one can verify whether H = FG with failure probability at most ε if H ≠ FG, using O(n log q log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}, and without reverting to any polynomial product.
Proof. The proof simply consists in using Corollary 4.15 in place of Remark 5.4 in the previous proof.
Now we consider F, G and H ∈ Z[X] with ‖F‖∞, ‖G‖∞, ‖H‖∞ ≤ C. We first analyse the bit complexity of Algorithm 8 and provide conditions for the algorithm to use a linear number of bit operations. Then we propose a variant that verifies H = FG with a linear number of bit operations for any integer polynomials.
Theorem 5.8.
Let F, G and H ∈ Z[X] be of degrees n, n and 2n, and norms at most C, and 0 < e < 1/2. Algorithm 8 requires O(n log C + n^{1−e} log C log log C) bit operations. When log log C = O(n^e), one can verify whether H = FG, with failure probability at most ε if H ≠ FG, using O(n log C log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}.
Proof. The first three reductions require O(n) additions in Z to compute F_i, G_i and H_i, whose norms are at most n^e C. A careful computation of these additions using a binary tree uses O(∑_{i=1}^{log n} (n/2^i) log(2^i C)) = O(n log C) bit operations. Then the polynomial product is performed with inputs of degree 2n^{1−e} and norm n^e C. As discussed in the introduction, it requires I(n^{1−e}(log(n^e C) + log n^{1−e})) bit operations, that is O(n^{1−e}(log n + log C)(log n + log log C)) = O(n log C + n^{1−e} log C log log C). Finally the last reduction is performed with degree 2n^{1−e} and norm at most n(n^e C)², in O(n log C) bit operations.
Repeating the algorithm O(log_n(1/ε)) times provides the second part of the theorem.
As for polynomials over finite fields, the final computations can be replaced by a modular product verification. Here this yields a slightly better complexity. This improvement translates into an exponentially weaker constraint on the norm C for the algorithm to be optimal.
Remark 5.9. Let F, G and H ∈ Z[X] be of degrees n, n and 2n and norms at most C, and 0 < e < 1/2. Algorithm 8 can be implemented using a modular product verification and without any polynomial product. This variant has bit complexity O(n log C + n^{1−e} log(C) log log log(C)). When log log log C = O(n^e), one can verify whether H = FG with failure probability at most ε if H ≠ FG, using O(n log C log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}, and without reverting to any polynomial product.
Proof. The proof is once again similar, using the dense part of Theorem 4.4 for the modular product verification. This verification is performed on polynomials of degrees O(n^{1−e}) and norm at most n^e C. Its bit complexity is then O(n^{1−e}(I(log(n log C)) + log(C) log log(n log C))), which is O(n log C + n^{1−e} log(C) log log log(C)). This proves the first part of the remark. The second part relies on repetitions of Algorithm 8.
As long as the coefficients are not insanely huge compared to the degree, the previous remark applies and the polynomial product verification is linear: the condition log log log C = O(n^e) allows C to be as large as a triple exponential in n^e. To deal with the extreme case of huge coefficients, we develop another approach that is valid as soon as log n = O(log C). This means that all cases are covered with an optimal bit complexity. We shall mention that both methods are applicable when C ranges from n^{O(1)} to 2^{O(n)}, which could be interesting when designing the most efficient implementation.
To treat the huge coefficient case, we rely on a result of Kaminski about the verification of the product of two integers. His technique is similar to the polynomial case: he reduces s-bit integers modulo 2^i − 1 for i between s^{1−e} and 2s^{1−e}, and then performs the product with the reduced integers.
Theorem 5.10 ([ ]). Let a, b, c be integers of at most s, s and 2s bits, 0 < e < 1/2 and k = ⌈δs^e ln ln(s^{1−e})⌉ where δ > 0 is a fixed constant. We can check whether ab = c in O(s) bit operations with a probability of error at most (k − 1)/s^{1−e} if ab ≠ c.
To verify a polynomial product H = FG over Z, we use the same idea as for computing the product: Kronecker substitution. If we evaluate each polynomial at β, some large enough power of two, the coefficients of FG can directly be read off the digits of the integer F(β)G(β). These evaluations at β require no operation. The polynomial product verification is thus reduced to an integer product verification H(β) = F(β)G(β).
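Under the simplifying assumption of nonnegative coefficients bounded by C, the Kronecker packing and the resulting reduction to a single integer-product check can be sketched as follows; check_integer_product stands for any integer-product verifier (for instance one based on Theorem 5.10) and is not provided here, and the function names are ours.

    def kronecker_pack(F, n, C):
        # Evaluate F at beta = 2^b, with b large enough that the coefficients of
        # F*G (all assumed nonnegative and bounded by (n+1)*C^2) occupy disjoint
        # blocks of b bits, so that the packing is injective.
        b = ((n + 1) * C * C).bit_length()
        return sum(c << (b * e) for e, c in F.items()), b

    def verify_by_kronecker(F, G, H, n, C, check_integer_product):
        # H = F*G over Z iff H(beta) = F(beta) * G(beta) for beta = 2^b as above.
        fb, _ = kronecker_pack(F, n, C)
        gb, _ = kronecker_pack(G, n, C)
        hb, _ = kronecker_pack(H, n, C)
        return check_integer_product(fb, gb, hb)

With check_integer_product = lambda a, b, c: a * b == c one recovers a deterministic but non-linear check; the point of Theorem 5.11 below is precisely to replace this product by Kaminski's linear-time verifier.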
Theorem 5.11. Let F, G, H ∈ Z[X] be of respective degrees n, n and 2n, and norm at most C. If log n = O(log C), we can check whether H = FG, with failure probability at most ε if H ≠ FG, using O(n log C log_{n log C}(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}.
Proof. As F and G have norm at most C and degree n, FG has norm at most (n + 1)C². Let β be the first power of 2 greater than 2(n + 1)C². Then H = FG if and only if H(β) = F(β)G(β). The integers F(β), G(β) and H(β) have bit length O(n log β) = O(n log(nC)) = O(n log C) since log n = O(log C). As β is a large enough power of 2, the evaluation at β does not require any operation. Therefore all the cost comes from the verification of F(β)G(β) = H(β). This is linear in the size of F(β), G(β) and H(β) by Theorem 5.10, hence linear in n log C. To get the appropriate probability bound, we use O(log_{n log C}(1/ε)) rounds of this algorithm. This is supported by the fact that the probability bound in Theorem 5.10 is 1/s^{O(1)}.

Given three sparse polynomials F, G and H in R[X], we want to assert that H = FG. As already mentioned, evaluating the polynomials at a random point α cannot yield a quasi-linear algorithm. Our approach is to take a random prime p and to verify the equality modulo X^p − 1. We first describe the algorithm over an abstract integral domain R. We then extend the description and the analysis of this algorithm for the specific cases R = Z and R = F_q.
Algorithm 9 SPARSEVERIFICATION
Input: H, F, G ∈ R[X]; 0 < ε < 1.
Output: True if H = FG, False with probability at least 1 − ε otherwise.
1: Define 0 < ε₁ < 1 and 0 < ε₂ < 1 such that ε₁ + (1 − ε₁)ε₂ ≤ ε
2: n ← deg(H)
3: if #H > #F·#G or n ≠ deg(F) + deg(G) then return False
4: λ ← (#F·#G + #H) ln(n)/ε₁
5: p ← RANDOMPRIME(λ, ε₁)
6: (F_p, G_p, H_p) ← (F mod X^p − 1, G mod X^p − 1, H mod X^p − 1)
7: return True if H_p = (F_p G_p) mod X^p − 1, False otherwise   ▷ using Theorem 4.1 with probability ε₂
Theorem 5.12. If R is an integral domain of size at least (#F·#G + #H) ln(n)/(ε₁ε₂), Algorithm 9 works as specified. Assuming that n = deg(H) and T = max(#F, #G, #H), it requires O(T log(T log(n)/ε)) operations in R and O(T log n log log(T log(n)/ε)) bit operations, plus O(log(1/ε) log(T log(n)/ε) log log(T log(n)/ε)) bit operations to obtain a prime p.
Proof. Step 3 dismisses two trivial mistakes and ensures that n is a bound on the degree of each polynomial. If H = FG, the algorithm always returns True. Otherwise, there are two sources of failure: either X^p − 1 divides H − FG, or it does not but the modular product verification fails. Since the polynomial H − FG has at most #H + #F·#G terms, the first failure occurs with probability at most ε₁ by Proposition 2.5. The second one occurs with probability at most ε₂. Altogether, the failure probability is at most ε₁ + (1 − ε₁)ε₂ ≤ ε.
To analyse the complexity, we consider ε₁, ε₂ ∼ ε (for example ε₁ = ε₂ = ε/2). Let us remark that p = O(T log(n)/ε). To get the prime p, Step 5 requires only O(log(1/ε) log p log log p) bit operations by Proposition 2.4. This gives the announced complexity once log p is replaced by O(log(T log(n)/ε)).
The operations in Step 6 are T divisions by p on integers bounded by n. Their cost is O(T (log n/log p) I(log p)) = O(T log n log log p) bit operations, that is O(T log n log log(T log(n)/ε)), plus T additions in R.
In Step 7, F_p, G_p and H_p have degree less than p = O(T log(n)/ε) and at most T monomials. They are still sparse and we can use the sparse version of Theorem 4.1 with P = X^p − 1. The verification of H_p = F_p G_p mod X^p − 1 then requires O(T log p) = O(T log(T log(n)/ε)) operations in R. The other steps have negligible cost.
To clarify the complexity, we will use the notation O_ε(f(n)) as a shortcut for O(f(n) log^k(1/ε)) for some k. Using this notation, the complexity of Algorithm 9 becomes O_ε(T log(T log n)) operations in R plus O_ε(T log n log log(T log n)) bit operations, as getting the prime p is logarithmic in T and log n.
The rest of the section is dedicated to the bit complexity analysis of this algorithm over integers or finite fields. Our goal is to have bit complexities that are as close as possible to linear. To ease the comparison with truly linear complexity, we express these bit complexities in terms of the total bit size s of the input. A degree-n polynomial with T monomials has bit size s = O(T(log n + log q)) if it has coefficients in F_q, and s = O(T(log n + log C)) if it has coefficients in Z of absolute value at most C.
We first note that reducing the input polynomials modulo X^p − 1 costs O_ε(T log n log log(T log n)) bit operations, which is O_ε(s log log s). We shall prove that in some cases, this step is actually the dominant term in the complexity.
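The exponent folding at the heart of Algorithm 9 is cheap to express in Python. The sketch below reuses sparse_mul and naive_random_prime from the earlier sketches, draws the prime naively, fixes ε₁ = ε₂ = ε/2, and replaces Step 7's modular verification by a direct product on the folded polynomials, so it only mirrors the overall structure; names and constants are ours.

    import math
    import random

    def fold_exponents(F, p):
        # Reduce F modulo X^p - 1: exponents are taken modulo p, coefficients added.
        R = {}
        for e, c in F.items():
            R[e % p] = R.get(e % p, 0) + c
        return {e: c for e, c in R.items() if c != 0}

    def sparse_verification(F, G, H, eps=0.001):
        # Assumes F, G, H are nonzero sparse dicts {exponent: coefficient}.
        n = max(H)
        if len(H) > len(F) * len(G) or n != max(F) + max(G):
            return False                                  # trivial mistakes (Step 3)
        lam = max(3, math.ceil(2 * (len(F) * len(G) + len(H)) * math.log(n + 2) / eps))
        p = naive_random_prime(lam)                       # stand-in for RandomPrime(lambda, eps/2)
        Fp, Gp, Hp = (fold_exponents(A, p) for A in (F, G, H))
        prod = fold_exponents(sparse_mul(Fp, Gp), p)      # direct product instead of Step 7's verification
        return prod == Hp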
We begin with the analysis over the integers.
Corollary 5.13. Let F, G and H ∈ Z[X] be of degree at most n, with norm at most C and sparsity at most T. Then Algorithm 9 has bit complexity O_ε(s log s log log s), where s = T(log n + log C) is the input size.
Proof. The modification only concerns Step 7, where we use Theorem 4.4 for the modular product verification with P = X^p − 1 and the polynomials F_p, G_p, H_p, which have sparsity at most T and norm at most TC. So this step costs O_ε(T log p I(log(p log C)) + T log(TC) log log(p log(TC))).
Since T ≤ n, T log p = O_ε(T log n) = O_ε(s). And log(p log C) = O_ε(log(T log n log C)) = O_ε(log(T log n) + log log C) = O_ε(log s). Thus the first term is O_ε(s log s log log s). Also, T log(TC) = O(T log(nC)) = O(s). As log(p log(TC)) = O_ε(log(T log n) + log log C) = O_ε(log s), the second term is O_ε(s log log s). Since Step 6 is unchanged and has bit complexity O_ε(s log log s), the result follows.
The complexity is actually better for very sparse polynomials.
Remark 5.14. If F, G, H ∈ Z[X] of bit size s have sparsity at most T = Θ(log^k n) for some k, Algorithm 9 has bit complexity O_ε(s log log s).
Proof. The input size is s = Θ(log^{k+1} n + log^k n log C). In this case, log p = O_ε(log log n). In the previous proof, there is one dominant term of order O_ε(s log s log log s), while the other terms are already of order O_ε(s log log s). It is sufficient to prove that with the new assumption, the dominant term is also O_ε(s log log s). The dominant term O_ε(s log s log log s) in the complexity comes from the term O_ε(T log p I(log(p log C))). Since log(p log C) = O_ε(log log n + log log C), this dominant term becomes O_ε(log^k n log log n (log log n + log log C) log(log log n + log log C)). Note that log log n and log log C are both O(log s), therefore this can be rewritten O_ε(log^k n log s log log s). Since log^k n = O(s^{k/(k+1)}), this yields O_ε(s log log s).
We now switch to polynomials over finite fields. There are more cases to consider, depending on the size of the field with respect to the degree and sparsity of the inputs. The first easy case is the case of large finite fields: if there are enough points for the evaluation, the generic algorithm keeps its guarantee of success while offering a quasi-linear bit complexity.
Corollary 5.15. Let F, G and H ∈ F_q[X] be of degree at most n and sparsity at most T, where q > (#F·#G + #H) ln(n)/(ε₁ε₂). Then Algorithm 9 has bit complexity O_ε(s log s), where s = T(log n + log q) is the input size.
Proof. It is still enough to analyse Step 7. Each ring operation in F_q costs O(log(q) log log(q)) bit operations, which implies that the bit complexity of Step 7 is O_ε(T log(T log n) log(q) log log(q)). Since both T log q and T log n are O(s) and log log q = O(log s), the result follows.
If the field is not large enough, we need to use some extension field. This slightly modifies the algorithm but actually yields a better complexity bound than for large finite fields. This is due to the fact that in that case, we choose an extension of the exact appropriate size. Note that the probability of success remains unchanged.
Corollary 5.16. Let F, G and H ∈ F_q[X] be of degree at most n and sparsity at most T, where q < (#F·#G + #H) ln(n)/(ε₁ε₂). Algorithm 9 has bit complexity O_ε(s log s log log s), where s = T(log n + log q) is the input size.
Proof. By Corollary 4.7, Step 7 requires O_ε(T log p M_q(log_q p) + (log_q p) M_q(log_q p)(log q + log log_q p)) operations in F_q. Since log p = O_ε(log(T log n)) = O_ε(log s) and log q = O_ε(log s) too, the second term is polylogarithmic in s. As log p = O_ε(log(T log n)), the first term is O_ε(T log(T log n) M_q(log_q(T log n))). Since log(T log n) = O(log n), T log(T log n) = O(s). Furthermore, log(T log n) = O(log s) and the first term simplifies to O_ε(s M_q(log_q s)). Now M_q(log_q s) = O(log s log log s). Altogether T log p M_q(log_q p) = O_ε(s log s log log s). The result follows.
Again, we note that for very sparse polynomials over some fields, the complexity is even better.
Remark 5.17. Let $F$, $G$ and $H \in \mathbb{F}_q[X]$ be of degree at most $n$ and sparsity at most $T$, where $q < \varepsilon^{-1} \sharp(FG + H) \ln n$. The bit complexity of Algorithm 9 is
(i) $O_\varepsilon(s \log s)$ if $\log_q(T \log n) = O(1)$;
(ii) $O_\varepsilon(s \log\log s)$ if $T = \Theta(\log^k n)$ for some constant $k$.

Proof. The most significant term in the complexity is $O_\varepsilon(T \log(T \log n)\, \mathsf{M}_q(\log_q(T \log n)))$. In the first case it becomes $O_\varepsilon(T \log(T \log n)\, \mathsf{M}_q(1)) = O_\varepsilon(T \log(T \log n) \log q \log\log q)$. As $\log q = O_\varepsilon(\log\log n)$, we get $T \log q \log\log q = O_\varepsilon(s)$ and the complexity becomes $O_\varepsilon(s \log s)$. In the second case, both $\log(T \log n)$ and $\mathsf{M}_q(\log_q(T \log n))$ are polylogarithmic in $s$, so the most significant term is $T$ times factors polylogarithmic in $s$. Since $T = O(s^{k/(k+1)})$, this term is only $O_\varepsilon(s)$; the global bit complexity is then dominated by Step 6 and is $O_\varepsilon(s \log\log s)$.

To conclude, the bit complexity of Algorithm 9 over the integers or over finite fields ranges from $O_\varepsilon(s \log\log s)$ in the most favorable cases to $O_\varepsilon(s \log^2 s)$ in the most complicated situations. We note that in the best cases, the complexity is actually dominated by the cost of the modular reduction of the exponents of the input polynomials.
Remark 5.18. Verification of a sparse product is always faster than computing the sparse product, over $\mathbb{Z}$ or over $\mathbb{F}_q$.

Proof. Let $s = T(\log n + \log\zeta)$ be the input size of the sparse polynomials $F$, $G$ and $H$: over $\mathbb{Z}$ we have $\log\zeta = \log C$, where $C$ bounds the norm of the coefficients, while $\log\zeta = \log q$ over $\mathbb{F}_q$. The best known result for computing the product $FG$ needs $O_\varepsilon(s \log s \log T(\log T + \log\log s))$ bit operations [9]. Taking the worst-case complexity of our verification yields a cost of $O_\varepsilon(s \log s)$, which means that we are always faster, by a factor $O(\log T(\log T + \log\log s))$. Of course, over some small finite fields the gain is even larger.

References
[1] A. Arnold and D. S. Roche. Output-sensitive algorithms for sumset and sparse polynomial multiplication. In ISSAC '15, pages 29–36. ACM, 2015.
[2] D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28:693–701, 1991.
[3] R. Cole and R. Hariharan. Verifying candidate matches in sparse and wildcard matching. In STOC, pages 592–601. ACM, 2002.
[4] J. W. Cooley and J. W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19:297–301, 1965.
[5] Th. H. Cormen, Ch. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 3rd edition, 2009.
[6] R. A. DeMillo and R. J. Lipton. A probabilistic remark on algebraic program testing. Information Processing Letters, 7(4):193–195, 1978.
[7] R. Freivalds. Fast probabilistic algorithms. In Mathematical Foundations of Computer Science, volume 74, pages 57–69. Springer Berlin Heidelberg, 1979.
[8] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, third edition, 2013.
[9] P. Giorgi, B. Grenet, and A. Perret du Cray. Essentially optimal sparse polynomial multiplication. In Proceedings of the 2020 International Symposium on Symbolic and Algebraic Computation, ISSAC, pages 202–209. ACM, 2020.
[10] P. Giorgi. A probabilistic algorithm for verifying polynomial middle product in linear time. Information Processing Letters, 139:30–34, 2018.
[11] S. W. Golomb. Shift Register Sequences. Aegean Park Press, 1982.
[12] D. Gries and G. Levin. Computing Fibonacci numbers (and similarly defined functions) in log time. Inf. Process. Lett., 11:68–69, 1980.
[13] D. Harvey and J. van der Hoeven. Faster polynomial multiplication over finite fields using cyclotomic coefficient rings. Journal of Complexity, 54:101404, 2019.
[14] D. Harvey and J. van der Hoeven. Integer multiplication in time O(n log n). To appear in Ann. of Math., March 2019.
[15] D. Harvey and J. van der Hoeven. Polynomial multiplication over finite fields in time O(n log n). Working paper or preprint, March 2019.
[16] J. van der Hoeven, R. Lebreton, and É. Schost. Structured FFT and TFT: Symmetric and Lattice Polynomials. In ISSAC '13, pages 355–362. ACM, 2013.
[17] J. van der Hoeven and G. Lecerf. On the complexity of multivariate blockwise polynomial multiplication. In ISSAC '12, pages 211–218. ACM, 2012.
[18] J. van der Hoeven and G. Lecerf. On the bit-complexity of sparse polynomial and series multiplication. J. Symb. Comput., 50:227–254, 2013.
[19] S. C. Johnson. Sparse polynomial arithmetic. ACM SIGSAM Bulletin, 8(3):63–71, 1974.
[20] M. Kaminski. A note on probabilistically verifying integer and polynomial products. J. ACM, 36(1):142–149, January 1989.
[21] F. Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC, pages 296–303. ACM, 2014.
[22] M. Monagan and R. Pearce. Parallel sparse polynomial multiplication using heaps. In ISSAC, page 263. ACM, 2009.
[23] M. Monagan and R. Pearce. Sparse polynomial division using a heap. J. Symb. Comput., 46(7), 2011.
[24] G. L. Mullen and D. Panario. Handbook of Finite Fields. Chapman & Hall/CRC, 1st edition, 2013.
[25] V. Nakos. Nearly optimal sparse polynomial multiplication. IEEE Transactions on Information Theory, 66(11):7231–7236, 2020.
[26] D. S. Roche. Chunky and equal-spaced polynomial multiplication. Journal of Symbolic Computation, 46(7):791–806, 2011. doi:10.1016/j.jsc.2010.08.013.
[27] D. S. Roche. What can (and can't) we do with sparse polynomials? In ISSAC, 2018.
[28] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6(1):64–94, March 1962. http://projecteuclid.org/euclid.ijm/
[29] J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27(4):701–717, October 1980.
[30] V. Shoup. A Computational Introduction to Number Theory and Algebra. Cambridge University Press, second edition, 2008.
[31] A. C. Yao. On the evaluation of powers. SIAM Journal on Computing, 5(1):100–103, 1976.
[32] R. Zippel. Probabilistic algorithms for sparse polynomials. In