Essentially Optimal Sparse Polynomial Multiplication
Pascal Giorgi, Bruno Grenet, Armelle Perret du Cray
LIRMM, Univ. Montpellier, CNRS, Montpellier, France
{pascal.giorgi,bruno.grenet,armelle.perret-du-cray}@lirmm.fr
June 8, 2020

Abstract
We present a probabilistic algorithm to compute the product of two univariate sparse polynomials over a field with a number of bit operations that is quasi-linear in the size of the input and the output. Our algorithm works for any field of characteristic zero or larger than the degree. We mainly rely on sparse interpolation and on a new algorithm for verifying a sparse product that also has a quasi-linear time complexity. Using Kronecker substitution techniques, we extend our result to the multivariate case.
Introduction

Polynomials are one of the most basic objects in computer algebra, and the study of fast polynomial operations remains a very challenging task. Polynomials can be represented using either the dense representation, which stores all the coefficients in a vector, or the more compact sparse representation, which only stores the nonzero monomials. In the dense representation, quasi-optimal algorithms have been known for decades. Yet, this is not the case for sparse polynomials.

In the sparse representation, a polynomial F = Σ_{i=0}^D f_i X^i ∈ R[X] is expressed as a list of pairs (e_i, f_{e_i}) such that all the f_{e_i} are nonzero. We denote by #F its sparsity, i.e. its number of nonzero coefficients. Let F be a polynomial of degree D, and B a bound on the size of its coefficients. Then, the size of the sparse representation of F is O(#F(B + log D)) bits. It is common to use B = 1 + max_i(⌊log(|f_{e_i}|)⌋) if R = ℤ and B = 1 + ⌊log q⌋ if R = 𝔽_q. The sparse representation naturally extends to polynomials in n variables: each exponent is replaced by a vector of exponents, which gives a total size of O(#F(B + n log D)).

Several problems on sparse polynomials have been investigated to design fast algorithms, including arithmetic operations, interpolation and factorization. We refer the interested readers to the excellent survey by Roche and the references therein [ ]. Contrary to the dense case, note that fast algorithms for sparse polynomials have a (poly-)logarithmic dependency on the degree. Unfortunately, as shown by several NP-hardness results, such fast algorithms might not even exist unless P = NP. This is for instance the case for GCD computations [ ].

In this paper, we are interested in the problem of sparse polynomial multiplication. In particular, we provide the first quasi-optimal algorithm, whose complexity is quasi-linear in both the input and the output sizes.
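As an aside on the multivariate representation just described, the Kronecker substitution mentioned in the abstract packs a multivariate sparse polynomial into a univariate one by mapping X_i ↦ X^{d^i}, where d exceeds every partial degree; the map is compatible with multiplication as long as the partial degrees of the product stay below d. A minimal sketch (function names are ours):

```python
def kronecker(F, d):
    """Pack a multivariate sparse polynomial {(e_0,...,e_{n-1}): coeff} with
    partial degrees < d into a univariate one via X_i -> X^(d^i)."""
    return {sum(e * d**i for i, e in enumerate(es)): c for es, c in F.items()}

def kronecker_inv(Fu, n, d):
    """Unpack: read off the base-d digits of each univariate exponent."""
    return {tuple((e // d**i) % d for i in range(n)): c for e, c in Fu.items()}

# F = 3*X0*X1 + 2*X0^2, with partial degrees < d = 5
F = {(1, 1): 3, (2, 0): 2}
Fu = kronecker(F, 5)                   # exponents 1 + 5 = 6 and 2
assert Fu == {6: 3, 2: 2}
assert kronecker_inv(Fu, 2, 5) == F
```

To multiply multivariate polynomials this way, one packs both factors with a d larger than the partial degrees of the product, multiplies the univariate images, and unpacks the result.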
The main difficulty, and the most interesting aspect of sparse polynomial multiplication, is the fact that the size of the output does not exclusively depend on the size of the inputs, contrary to the dense case. Indeed, the product of two polynomials F and G has at most #F#G nonzero coefficients, but it may have as few as 2 nonzero coefficients.

Example 1. Let F = X^6 + X^3 + 1, G = X^2 + X + 2 and H = X^3 − 1. Then

FG = X^8 + X^7 + 2X^6 + X^5 + X^4 + 2X^3 + X^2 + X + 2 and FH = X^9 − 1.

The product FG can be computed by generating the #F#G possible monomials, sorting them by increasing degree and merging those with the same degree. Using radix sort, this algorithm takes O(#F#G(M_R + log D)) bit operations, where M_R denotes the cost of one operation in R. A major drawback of this approach is its space complexity, which exhibits a #F#G factor even if the result has fewer than #F#G terms. Many improvements have been proposed to reduce this space complexity, to extend the approach to multivariate polynomials, and to provide fast implementations in practice [15, 16, 17]. Yet, none of these results reduces the #F#G factor in the time complexity.

In general, no complexity improvement is expected, as the output polynomial may have as many as #F#G nonzero coefficients. However, this number of nonzero coefficients can be overestimated, giving the opportunity for output-sensitive algorithms. Such algorithms have first been proposed for special cases, notably when the output size is known to be small due to sufficiently structured inputs [ ], especially in the multivariate case [11, 10], or when the support of the output is known in advance [ ]. It is possible to go one step further by studying the conditions for small outputs. A first reason is exponent collisions. Let F = Σ_{i=1}^{#F} f_i X^{α_i} and G = Σ_{j=1}^{#G} g_j X^{β_j}. A collision occurs when there exist distinct pairs of indices (i₁, j₁) and (i₂, j₂) such that α_{i₁} + β_{j₁} = α_{i₂} + β_{j₂}. Such collisions decrease the number of terms of the result. The second reason is coefficient cancellations: in the previous situation, the resulting coefficient is (f_{i₁} g_{j₁} + f_{i₂} g_{j₂}), which could vanish depending on the coefficient values.

Taking into account the exponent collisions amounts to computing the sumset of the exponents of F and G, that is {α_i + β_j : 1 ≤ i ≤ #F, 1 ≤ j ≤ #G}. Arnold and Roche call this set the structural support of the product FG and its size S the structural sparsity [ ]. If H = FG, then the structural sparsity S of the product FG satisfies 2 ≤ #H ≤ S ≤ #F#G. Observe that although #H and S can be close, their difference can be as large as quadratic in T, as shown by the next example.

Example 2. Let F = Σ_{i=0}^{T−1} X^i, G = Σ_{i=0}^{T−1}(X^{Ti+1} − X^{Ti}) and H = FG. We have #F = T, #G = 2T and the structural sparsity of FG is T^2 + 1, while #H = 2 since H = X^{T^2} − 1.

In the case where the support of H is exactly the sumset of the exponents of F and G, that is the structural support of H = FG, Cole and Hariharan describe a multiplication algorithm requiring Õ(S log D) operations in the RAM model with O(log(CD)) word size [ ], where log(C) bounds the bitsize of the coefficients. Arnold and Roche improve this complexity to Õ(S log D + #H log C) bit operations for polynomials with both positive and negative integer coefficients [ ]. Note that they also extend their result to finite fields and to the multivariate case. A recent algorithm of Nakos avoids the dependency on the structural sparsity for the case of integer polynomials [ ], using the same word RAM model as Cole and Hariharan.
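The quantities above are easy to make concrete. Below is a minimal sketch (names are ours) of the merge-based product, together with a check of Example 2: the structural sparsity is T^2 + 1 although the actual product has only 2 terms.

```python
def sparse_mul_naive(F, G):
    """Schoolbook sparse product of {exponent: coeff} maps: generate the
    #F * #G candidate monomials and merge equal degrees, dropping
    cancellations -- O(#F #G) time and space."""
    H = {}
    for ef, cf in F.items():
        for eg, cg in G.items():
            H[ef + eg] = H.get(ef + eg, 0) + cf * cg
    return {e: c for e, c in H.items() if c != 0}

def structural_sparsity(F, G):
    """Size of the sumset of the supports (the Arnold-Roche structural sparsity)."""
    return len({ef + eg for ef in F for eg in G})

# Example 2 with T = 5: F = sum_i X^i, G = sum_i (X^(Ti+1) - X^(Ti))
T = 5
F = {i: 1 for i in range(T)}
G = {}
for i in range(T):
    G[T * i + 1] = 1
    G[T * i] = -1
H = sparse_mul_naive(F, G)
assert H == {T * T: 1, 0: -1}                  # H = X^(T^2) - 1: only 2 terms...
assert structural_sparsity(F, G) == T * T + 1  # ...out of T^2 + 1 structural ones
```

Every one of the 2T^2 candidate monomials except the two extremes cancels, which is exactly the gap between #H and S that output-sensitive algorithms must cope with.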
Unfortunately, the bit complexity of this algorithm (Õ((T log D + #H log D) log(CD) + log D)) is not quasi-linear.

In the dense case, quasi-optimal multiplication algorithms rely on the well-known evaluation-interpolation scheme. In the sparse setting, this approach is not efficient. The fastest multiplication algorithms mentioned above [3, 18] mainly rely on a different method called sparse interpolation, which has received considerable attention; see e.g. the early results of Prony [ ] and Ben-Or and Tiwari [ ], or the recent results by Huang [ ]. Despite extensive analysis of this problem, no quasi-optimal algorithm exists yet. We remark that this is not the only difficulty: simply using a quasi-optimal sparse interpolation algorithm would not be enough to get a quasi-optimal sparse multiplication algorithm [ ].

Our main result is summarized in Theorem 1.1. We extend the complexity notations to O_ε and Õ_ε for hiding some polynomial factors in log(1/ε). Let F = Σ_{i=1}^T f_i X^{e_i}. We use ‖F‖_∞ = max_i |f_i| to denote its height, #F for its number of nonzero terms and supp(F) = {e_1, ..., e_T} for its support.

Theorem 1.1.
Given two sparse polynomials F and G over ℤ, Algorithm SparseProduct computes H = FG in Õ_ε(T(log D + log C)) bit operations with probability at least 1 − ε, where D = deg(H), C = max(‖F‖_∞, ‖G‖_∞, ‖H‖_∞) and T = max(#F, #G, #H). The algorithm extends naturally to finite fields with characteristic larger than D, with the same complexity, where C denotes the cardinality.

This result is based on two main ingredients. We adapt Huang's algorithm [ ] to interpolate FG in quasi-linear time; note that the original algorithm does not reach quasi-linear complexity. Sparse interpolation algorithms, including Huang's, require a bound on the sparsity of the result. We replace this bound by a guess on the sparsity and an a posteriori verification of the product, as in [ ]. However, using the classical polynomial evaluation approach for the verification does not yield a quasi-linear bit complexity (see Section 3). Therefore, we introduce a novel verification method that is essentially optimal.

Theorem 1.2.
Given three sparse polynomials F, G and H over 𝔽_q or ℤ, Algorithm VerifySP tests whether FG = H in Õ_ε(T(log D + B)) bit operations, where D = deg(H), B is a bound on the bitsize of the coefficients of F, G and H, and T = max(#F, #G, #H). The answer is always correct if FG = H, and the probability of error is at most ε otherwise.

Here, and throughout the article, Õ(f(n)) denotes O(f(n) log^k(f(n))) for some constant k > 0. (Despite their similar names, dense and sparse polynomial interpolation are actually two quite different problems.)

Finally, using Kronecker substitution, we show that our sparse polynomial multiplication algorithm extends to the multivariate case with a quasi-linear bit complexity Õ_ε(T(n log d + B)), where n is the number of variables and d the maximal partial degree in each variable. Nevertheless, over finite fields this approach requires an exponentially large characteristic. Using the randomized Kronecker substitution [ ] we derive a fast algorithm for finite fields of characteristic polynomial in the input size. Its bit complexity is Õ_ε(nT(log d + B)). Even though it is not quasi-optimal, it achieves the best known complexity for this case.

We denote by I(n) = O(n log n) the bit complexity of the multiplication of two integers of at most n bits [ ]. Similarly, we denote by M_q(D) = O(D log(q) log(D log q) log* D) the bit complexity of the multiplication of two dense polynomials of degree at most D over 𝔽_q where q is prime [ ]. The cost of multiplying two elements of 𝔽_{q^s} is O(M_q(s)). The cost of multiplying two dense polynomials over ℤ of heights at most C and degrees at most D is M_ℤ(D, C) = I(D(log C + log D)) [6, Chapter 8].

Since our algorithms use reductions modulo X^p − 1 for prime numbers p, we first review useful related results.

Theorem 2.1 (Rosser and Schoenfeld [ ]). If λ ≥ 21, there are at least λ/ln λ prime numbers in [λ, 2λ].

Proposition 2.2 ([24, Chapter 10]). There exists an algorithm RandomPrime(λ, ε) that returns an integer p in [λ, 2λ], such that p is prime with probability at least 1 − ε. Its bit complexity is Õ_ε(log³ λ).

We need two distinct properties on the reductions modulo X^p −
1. The first one is classical in sparse interpolation, to bound the probability of exponent collision in the residue (see [3, Lemma 3.3]).

Proposition 2.3. Let H be a polynomial of degree at most D and sparsity at most T, 0 < ε < 1 and λ = max(21, ε⁻¹T² ln D). Then with probability at least 1 − ε, RandomPrime(λ, ε/2) returns a prime number p such that H mod X^p − 1 has the same number of terms as H, that is, no collision of exponents occurs.

The second property allows us to bound the probability that a polynomial vanishes modulo X^p − 1.

Proposition 2.4. Let H be a nonzero polynomial of degree at most D and sparsity at most T, 0 < ε < 1 and λ = max(21, 2ε⁻¹T ln D). Then with probability at least 1 − ε, RandomPrime(λ, ε/2) returns a prime number p such that H mod X^p − 1 ≠ 0.

Proof. For H mod X^p − 1 to be nonzero, it is sufficient that there exists an exponent e₁ of H that is not congruent to any other exponent e_j modulo p. In other words, it is sufficient that p does not divide any of the T − 1 differences δ_j = e_j − e₁. Noting that δ_j ≤ D, the number of primes in [λ, 2λ] that divide at least one δ_j is at most (T − 1) ln D / ln λ. Since there exist λ/ln λ primes in this interval, the probability that a prime randomly chosen from it divides at least one δ_j is at most ε/2. RandomPrime(λ, ε/2) returns a prime in [λ, 2λ] with probability at least 1 − ε/2, whence the result.

The next two propositions are used to reduce integer coefficients modulo some prime number and to construct an extension field.
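The two reduction properties above are easy to visualize. A small sketch (names are ours) of the map H ↦ H mod X^p − 1 on the sparse representation, showing both an exponent collision (Proposition 2.3) and a complete vanishing (Proposition 2.4):

```python
def reduce_cyclic(H, p):
    """H mod X^p - 1 for H = {exponent: coeff}: fold each exponent e to
    e mod p and merge the terms that collide."""
    R = {}
    for e, c in H.items():
        R[e % p] = R.get(e % p, 0) + c
    return {e: c for e, c in R.items() if c != 0}

def has_collision(H, p):
    """True if two distinct exponents of H coincide modulo p."""
    return len({e % p for e in H}) < len(H)

H = {0: 3, 100: -1, 1000: 7}
assert has_collision(H, 5)      # 100 and 1000 both fold to 0 mod 5
assert not has_collision(H, 7)  # residues 0, 2, 6 stay distinct

assert reduce_cyclic({0: 1, 15: -1}, 5) == {}  # X^15 - 1 vanishes mod X^5 - 1
```

The last line is the bad event of Proposition 2.4: here p = 5 divides the exponent difference 15 − 0, and a random prime of size about ε⁻¹T ln D avoids all such differences with probability at least 1 − ε.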
Proposition 2.5. Let H ∈ ℤ[X] be a nonzero polynomial, 0 < ε < 1 and λ ≥ max(21, 2ε⁻¹ ln ‖H‖_∞). Then with probability at least 1 − ε, RandomPrime(λ, ε/2) returns a prime q such that H mod q ≠ 0.

Proof. Let h_i be a nonzero coefficient of H. A random prime from [λ, 2λ] divides h_i with probability at most ln ‖H‖_∞ / λ ≤ ε/2. Since RandomPrime(λ, ε/2) returns a prime in [λ, 2λ] with probability at least 1 − ε/2, the result follows.

Proposition 2.6 ([24, Chapter 20]). There exists an algorithm that, given a finite field 𝔽_q, an integer s and 0 < ε < 1, computes a degree-s polynomial in 𝔽_q[X] that is irreducible with probability at least 1 − ε. Its bit complexity is Õ_ε(s² log q).

Sparse polynomial product verification
Verifying a product FG = H of dense polynomials over an integral domain R simply comes down to testing F(α)G(α) = H(α) for some random point α ∈ R. This approach exhibits an optimal linear number of operations in R, but it is not deterministic. (No optimal deterministic algorithm exists yet.) When R = ℤ or 𝔽_q, a divide-and-conquer approach provides a quasi-linear complexity, namely Õ(DB) bit operations where B bounds the bitsize of the coefficients.

For sparse polynomials with T nonzero coefficients, evaluation is not quasi-linear since the input size is only O(T(log D + B)). Indeed, computing α^D requires O(log D) operations in R, which implies a bit complexity of Õ(log(D) log(q)) when R = 𝔽_q. Applying this computation to the T nonzero monomials gives a bit complexity of Õ(T log(D) log(q)). We mention that the latter approach can be improved to Õ((1 + T/log log(D)) log(D) log(q)) using Yao's result [ ] on simultaneous exponentiation. When R = ℤ, the best known approach to avoid expression swell is to pick a random prime p and to perform the evaluations modulo p. One needs to choose p > D in order to have a nonzero probability of success. Therefore, the bit complexity contains a T log D factor.

Our approach to obtain a quasi-linear complexity is to perform the evaluation modulo X^p − 1 for some random prime p. This requires to evaluate the polynomial [(FG) mod X^p − 1] on α without computing it.

Lemma 3.1. Let F and G be two sparse polynomials in R[X] with deg F, deg G ≤ p − 1 and α ∈ R. Then (FG) mod X^p − 1 can be evaluated on α using O((#F + #G) log p) operations in R.

Proof. Let H = (FG) mod X^p − 1. The computation of H corresponds to the linear map

  (h_0, h_1, ..., h_{p−1})^T = T_F (g_0, g_1, ..., g_{p−1})^T,

where T_F is the p × p circulant matrix with entries (T_F)_{i,j} = f_{(i−j) mod p} (its first row is (f_0, f_{p−1}, ..., f_1)), and f_i (resp. g_i, h_i) is the coefficient of degree i of F (resp. G, H). Computing H(α) corresponds to the inner product α⃗_p · h⃗ = (α⃗_p T_F) · g⃗, where α⃗_p = (1, α, ..., α^{p−1}). This evaluation can be computed in O(p) operations in R [ ]. Here we reuse similar techniques in the context of sparse polynomials.

To compute H(α), we first compute c⃗ = α⃗_p T_F, and then the inner product c⃗ · g⃗. If supp(G) = {j_1, ..., j_{#G}} with j_1 < ··· < j_{#G} < p, we only need the corresponding entries of c⃗, that is all the c_{j_k}'s for 1 ≤ k ≤ #G. Since c_j = Σ_{ℓ=0}^{p−1} α^ℓ f_{(ℓ−j) mod p}, we can write c_j = f_{p−j} + α Σ_{ℓ=0}^{p−2} α^ℓ f_{(ℓ−j+1) mod p}, that is c_j = α c_{j−1} + (1 − α^p) f_{p−j}. Applying this relation as many times as necessary, we obtain a relation to compute c_{j_{k+1}} from c_{j_k}:

  c_{j_{k+1}} = α^{j_{k+1}−j_k} c_{j_k} + (1 − α^p) Σ_{ℓ=j_k+1}^{j_{k+1}} α^{j_{k+1}−ℓ} f_{p−ℓ}.

Each nonzero coefficient f_t of F appears in the definition of c_{j_{k+1}} if and only if p − j_{k+1} ≤ t < p − j_k. Thus, each f_t is used exactly once to compute all the c_{j_k}'s. Since for each summand one needs to compute α^ℓ for some ℓ < p, the total cost for computing all the sums is O(#F log p) operations in R. Similarly, the computation of α^{j_{k+1}−j_k} c_{j_k} for all k costs O(#G log p). The last remaining step is the final inner product, which costs O(#G) operations in R, whence the result.

The complexity can be improved to O(log p + (#F + #G) log p / log log p) using again Yao's algorithm [ ] for simultaneous exponentiation.

Given three sparse polynomials F, G and H in R[X], we want to assert that H = FG. Our approach is to take a random prime p and to verify this assertion modulo X^p − 1 using algorithm Verify
SP, which works over any large enough integral domain R. We further extend the description and the analysis of this algorithm for the specific cases R = ℤ and R = 𝔽_q in the next sections.

Algorithm 1 VerifySP
Input: H, F, G ∈ R[X]; 0 < ε < 1.
Output: True if FG = H; False with probability ≥ 1 − ε otherwise.
1: Define c₁, c₂ > 0 such that c₁ + (1 − c₁)c₂ ≤ ε
2: D ← deg(H)
3: if #H > #F#G or D ≠ deg(F) + deg(G) then return False
4: λ ← max(21, 2c₁⁻¹(#F#G + #H) ln D)
5: p ← RandomPrime(λ, c₁/2)
6: (F_p, G_p, H_p) ← (F mod X^p − 1, G mod X^p − 1, H mod X^p − 1)
7: Define 𝔈 ⊂ R of size > c₂⁻¹ p and choose α ∈ 𝔈 at random
8: β ← [(F_p G_p) mod X^p − 1](α)    ▷ using Lemma 3.1
9: return β = H_p(α)

Theorem 3.2.
If R is an integral domain of size at least 4(c₁c₂)⁻¹ #F#G ln D, VerifySP works as specified and it requires O_ε(T log(T log D)) operations in R plus O_ε(T I(log D)) bit operations, where D = deg(H) and T = max(#F, #G, #H).

Proof. Step 3 dismisses two trivial mistakes and ensures that D is a bound on the degree of each polynomial. If FG = H, the algorithm returns True for any choice of p and α. Otherwise, there are two sources of failure. Either X^p − 1 divides FG − H, whence (F_p G_p mod X^p − 1)(α) = H_p(α) for any α; or α is a root of the nonzero polynomial (FG − H) mod X^p − 1. Since FG − H has at most #F#G + #H terms, the first failure occurs with probability at most c₁ by Proposition 2.4. And since (FG − H) mod X^p − 1 has degree less than p and 𝔈 has more than c₂⁻¹ p points, the second failure occurs with probability at most c₂. Altogether, the failure probability is at most c₁ + (1 − c₁)c₂ ≤ ε.

Let us remark that 1/c₁, 1/c₂ = O(1/ε) and p = O(ε⁻¹T² ln D). Step 5 requires only Õ(log³(ε⁻¹T log D)) bit operations by Proposition 2.2. The operations in Step 6 are T divisions by p on integers bounded by D, which cost O_ε(T I(log D)) bit operations, plus T additions in R. The evaluation of F_p G_p mod X^p − 1 at α in Step 8 requires O(T log(ε⁻¹T log D)) operations in R by Lemma 3.1. The evaluation of H_p at α costs O(T log(T log D)) operations in R. Other steps have negligible costs.

The first easy case is the case of large finite fields: if there are enough points for the evaluation, the generic algorithm has the same guarantee of success and a quasi-linear time complexity.
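Before specializing the analysis, here is a compact executable sketch of Algorithm 1 over a prime field ℤ/qℤ (names and the fixed choice of p are ours; the real algorithm draws p with RandomPrime). The evaluation of (F_p G_p mod X^p − 1)(α) follows the recurrence of Lemma 3.1, so each nonzero coefficient of F_p is used exactly once.

```python
import bisect, random

def eval_cyclic_product(F, G, p, alpha, q):
    """Evaluate (F*G mod X^p - 1) at alpha over Z/qZ without computing F*G
    (Lemma 3.1): with c_j = sum_t f_t * alpha^((t+j) % p), one has
    (FG mod X^p - 1)(alpha) = sum_j g_j * c_j, and c_j follows from c_{j'} via
    c_j = alpha^(j-j') c_{j'} + (1 - alpha^p) * sum_{j' < l <= j} alpha^(j-l) f_{p-l}."""
    alpha_p = pow(alpha, p, q)
    window = {(-t) % p: f % q for t, f in F.items()}  # f_{p-l}, indexed by l
    ells = sorted(window)
    c = sum(f * pow(alpha, t, q) for t, f in F.items()) % q  # c_0 = F(alpha)
    prev, result = 0, 0
    for j in sorted(G):
        c = c * pow(alpha, j - prev, q) % q
        for l in ells[bisect.bisect_right(ells, prev):bisect.bisect_right(ells, j)]:
            c = (c + (1 - alpha_p) * pow(alpha, j - l, q) * window[l]) % q
        prev = j
        result = (result + G[j] * c) % q
    return result

def verify_sp(F, G, H, p, q):
    """Sketch of Algorithm 1 over Z/qZ: reduce everything mod X^p - 1, pick
    a random alpha, and compare evaluations. It never rejects a correct
    product; a wrong one slips through with probability roughly p/q."""
    def red(P):
        R = {}
        for e, c in P.items():
            R[e % p] = (R.get(e % p, 0) + c) % q
        return {e: c for e, c in R.items() if c}
    Fp, Gp, Hp = red(F), red(G), red(H)
    alpha = random.randrange(1, q)
    lhs = eval_cyclic_product(Fp, Gp, p, alpha, q)
    rhs = sum(c * pow(alpha, e, q) for e, c in Hp.items()) % q
    return lhs == rhs

F, G = {0: 1, 2: 3}, {1: 2, 4: 5}
H = {1: 2, 3: 6, 4: 5, 6: 15}      # F*G, computed by hand
assert verify_sp(F, G, H, 5, 101)  # a correct product is always accepted
```

A wrong candidate such as H + 1 differs from F_p G_p mod X^p − 1 by a nonzero constant here, so this sketch rejects it for every choice of α.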
Corollary 3.3. Let F, G and H be three polynomials of degree at most D and sparsity at most T in 𝔽_q[X], where q > 4(c₁c₂)⁻¹ #F#G ln(D). Then Algorithm VerifySP has bit complexity O_ε(n log(n) log* n), where n = T(log D + log q) is the input size.

Proof. By definition of n, the cost of Step 6 is O_ε(n log n) bit operations. Each ring operation in 𝔽_q costs O(log(q) log log(q) log* q) bit operations, which implies that the bit complexity of Step 8 is O_ε(T log(T log D) log(q) log log(q) log* q). Since T log q and T log D are bounded by n and log log q ≤ log n, the result follows.

We shall note that even if q < 4(c₁c₂)⁻¹ #F#G ln(D), we can make our algorithm work by using an extension field, and this approach achieves the same complexity.

Theorem 3.4.
One can adapt algorithm VerifySP to work over finite fields 𝔽_q such that q < 4(c₁c₂)⁻¹ #F#G ln(D). The bit complexity is O_ε(n log(n) log log(n) log* n), where n = T(log D + log q) is the input size.

Proof. To have enough elements in the set 𝔈, we need to work over 𝔽_{q^s}, where q^s > c₂⁻¹ p ≥ q^{s−1}. An irreducible degree-s polynomial can be computed in Õ(s² log q) = Õ(log²(T log D)/log q) bit operations by Proposition 2.6. Since α is taken in 𝔽_{q^s}, the complexity becomes O_ε(T I(log D) + T log(T log D) M_q(s)) bit operations. Remarking that T ≤ D, we have T log(T log D) ≤ T log(D log D) = O(n). Since s log q = O(log(T log D)) = O(log n), we obtain M_q(s) = O(log(n) log log(n) log* n), which implies that the second term of the complexity is O(n log(n) log log(n) log* n). The first term is negligible since it is O(n log n).

In order to achieve the same probability of success, we fix an error probability c₃ for the construction of the irreducible polynomial, and choose c₁ and c₂ in VerifySP such that 1 − (1 − c₁)(1 − c₂)(1 − c₃) ≤ ε.

We note that for very sparse polynomials over some fields, the complexity is only dominated by the operations on the exponents.

Corollary 3.5. VerifySP has bit complexity O_ε(n log n) in the following cases:
(i) s = 1 and log q = O(log^{1−α} D) for some constant 0 < α < 1;
(ii) s > 1 and T = Θ(log^k D) for some constant k.

Proof. In both cases, the cost of reducing the exponents modulo p is O_ε(n log n) bit operations. In the first case, each multiplication in 𝔽_q costs O(log(q) log log(q) log* q) = O(log D) bit operations, as log log(q) log* q = O(log^α D). In the second case, n = O(log^{k+1} D) and s log q = O_ε(log(T log D)) = O_ε(log log D), which implies M_q(s) = O_ε(s log(q) log(s log q) log* s) = O_ε(log D). In both cases, the algorithm performs O_ε(T log(T log D)) = O_ε(T log n) operations in 𝔽_q (or in 𝔽_{q^s}). Therefore the bit complexity is O_ε(n log n).

The following generalization is used in our quasi-linear multiplication algorithm given in Section 4.

Corollary 3.6.
Let (F_i, G_i)_{0≤i<m} and H be sparse polynomials over 𝔽_q of degree at most D and sparsity at most T. We can verify whether Σ_{i=0}^{m−1} F_i G_i = H, with error probability at most ε when they are different, in O_ε(m(T I(log D) + T log(mT log D) M_q(s))) bit operations.

In order to keep a quasi-linear time complexity over the integers, we must work over a prime finite field 𝔽_q to avoid the computation of too large integers. Indeed, H_p(α) could have size p log(α) = O_ε(T log(D) log(α)), which is not quasi-linear in the input size.

Theorem 3.7.
One can adapt algorithm VerifySP to work over the integers. The bit complexity is O_ε(n log n log log n), where n = T(log D + log C) is the input size with C = max(‖F‖_∞, ‖G‖_∞, ‖H‖_∞).

Proof. Before Step 6, we choose a random prime number q = RandomPrime(μ, c₃) with μ = c₃⁻¹ max(p, ln(C²T + C)), and we perform all the remaining steps modulo q. Let us assume that the polynomial Δ = FG − H ∈ ℤ[X] is nonzero. Our algorithm only fails in the following three cases: p is such that Δ_p = Δ mod X^p − 1 = 0; q is such that Δ_p ≡ 0 mod q; α is a root of Δ_p in 𝔽_q.

Using Proposition 2.4, Δ_p is nonzero with probability at least 1 − c₁. Actually, with the same probability, the proof of the proposition shows that at least one coefficient of Δ is preserved in Δ_p. Since ‖Δ‖_∞ ≤ C²T + C, Proposition 2.5 ensures that Δ_p ≢ 0 mod q with probability at least 1 − c₃. Finally, q has been chosen so that 𝔽_q has at least c₂⁻¹ p elements, whence α is not a root of Δ_p mod q with probability at least 1 − c₂. Altogether, taking c₁, c₂, c₃ such that 1 − (1 − c₁)(1 − c₂)(1 − c₃) ≤ ε, our adaptation of VerifySP has an error probability at most ε.

The reductions of F, G and H modulo q add a term O(T I(log C)) to the complexity. Since operations in 𝔽_q have cost I(log q), the complexity becomes O(T I(log D) + T I(log C) + T log(T log D) I(log q)) bit operations. The first two terms are in O(n log n). Moreover, q = O_ε(log(C²T) + p) and p = O_ε(T² log D), thus log q = O_ε(log(log C + T log D)) = O_ε(log n). Since T ≤ D, T log(T log D) = O(n), and the third term in the complexity is O_ε(n log n log log n).

As over small finite fields, the complexity is actually better for very sparse polynomials.

Corollary 3.8. If T = Θ(log^k D) for some k, VerifySP has bit complexity O_ε(n log n).

Proof. If T = Θ(log^k D), then T log(T log D) = Õ(log^k D) = o(n), thus the last term of the complexity in the proof of Theorem 3.7 becomes negligible with respect to the first two terms.

For the same reason as for finite fields, we extend the verification algorithm to a sum of products.

Corollary 3.9.
Let (F_i, G_i)_{0≤i<m} and H be sparse polynomials of degree at most D, sparsity at most T, and height at most C. We can verify whether Σ_{i=0}^{m−1} F_i G_i = H, with probability of error at most ε when they are different, in O_ε(mT I(log D) + mT I(log C) + mT log(mT log D) I(log(m log C + mT log D))) bit operations.

We shall only use this algorithm with m = 2, through the call VerifySumSP(H, F₀, G₀, F₁, G₁, ε).

Sparse polynomial multiplication
Given two sparse polynomials F and G, our algorithm aims at computing the product H = FG through sparse polynomial interpolation. We avoid the difficulty of computing an a priori bound on the sparsity of H, needed for sparse interpolation, by using our verification algorithm of Section 3. Indeed, one can start with an arbitrarily small sparsity and double it until the interpolated polynomial matches the product according to VerifySP.

The remaining difficulty is to interpolate H in quasi-optimal time given a sparsity bound, which is not yet achieved in the general case. In our case, we first analyze the complexity of Huang's sparse interpolation algorithm [ ] when the input is a sum of sparse products. In order to obtain the desired complexity, we develop a novel approach that interleaves two levels of Huang's algorithm.

In [ ], Huang proposes an algorithm that interpolates a sparse polynomial H from its SLP representation, achieving the best known complexity for this problem, though it is not optimal. Its main idea is to use the dense polynomials H_p = H mod X^p − 1 and H'_p = H' mod X^p − 1, where H' is the derivative of H and p a small random prime. Indeed, if cX^e is a term of H that does not collide during the reduction modulo X^p − 1, then H_p contains the monomial cX^{e mod p} and H'_p contains ceX^{(e−1) mod p}, hence c and e can be recovered by a mere division. Of course, the choice of p is crucial for the method to work. It must be small enough to get a low complexity, but large enough for collisions to be sufficiently rare.

Lemma 4.1.
There exists an algorithm FindTerms that takes as inputs a prime p, two polynomials H_p = H mod X^p − 1 and H'_p = H' mod X^p − 1, and bounds D ≥ deg(H) and C ≥ ‖H‖_∞, and outputs an approximation H* of H that contains at least all the monomials of H that do not collide modulo X^p − 1. Its bit complexity is O(T I(log CD)), where T = #H.

Proof. It is a straightforward adaptation of [13, Algorithm 3.4 (UTerms)]. Here, taking C as input allows us to only recover coefficients that are at most C in absolute value, and therefore to perform divisions with integers of bitsize at most log(CD).

Corollary 4.2.
Let H be a sparse polynomial such that #H ≤ T, deg H ≤ D and ‖H‖_∞ ≤ C, and let 0 < ε < 1. If λ = max(21, ε⁻¹T² ln D) and p = RandomPrime(λ, ε/2), then with probability at least 1 − ε, FindTerms(p, H mod X^p − 1, H' mod X^p − 1, D, C) returns H.

Proof. With probability at least 1 − ε, no collision occurs in H mod X^p − 1, and consequently neither in H' mod X^p − 1, by Proposition 2.3. In this case FindTerms correctly computes H, according to Lemma 4.1.

Theorem 4.3.
There exists an algorithm InterpSumSP that takes as inputs m sparse polynomials (F_i, G_i)_{0≤i<m}, three bounds T ≥ #H, D > deg(H) and C ≥ ‖H‖_∞ where H = Σ_{i=0}^{m−1} F_i G_i, a constant 0 < μ < 1 and the list 𝒫 of the first N primes for N = max(1, ⌊(T − 1) log D⌋), and outputs H with probability at least 1 − μ. Its bit complexity is Õ_μ(mT log(D) log(CD)), where T, D and C are bounds on the sparsity, the degree and the height of H and of each F_i and G_i.

Proof. It is identical to the proof of [13, Algorithm 3.9 (UIPoly)], taking into account that H is not given as an SLP anymore but as Σ_{i=0}^{m−1} F_i G_i, where the polynomials F_i and G_i are given as sparse polynomials.

Remark 4.4. A finer analysis of algorithm InterpSumSP leads to a bit complexity O_μ(m log T · M_ℤ(T log(D) log(T log D), TCD)).

Remark 4.5. Even when InterpSumSP returns an incorrect polynomial, it has sparsity at most T, degree less than D and coefficients bounded by C.
Our idea is to compute different candidates for FG with a growing sparsity bound, and to verify each result with VerifySP. Unfortunately, a direct call to InterpSumSP with the correct sparsity T = max(#F, #G, #(FG)) yields a bit complexity Õ(T log(D) log(CD)) if the coefficients are bounded by C and the degree by D. We shall remark that this is not quasi-linear, since the input and output sizes are bounded by T log D + T log C.

To circumvent this difficulty, we first compute the reductions F_p = F mod X^p − 1, G_p = G mod X^p − 1, F'_p = F' mod X^p − 1 and G'_p = G' mod X^p − 1 for a random prime p as in Corollary 4.2. The polynomials H_p = FG mod X^p − 1 and H'_p = (FG)' mod X^p − 1 can then be computed with InterpSumSP and verified with VerifySP. Indeed, we first compute F_p G_p by interpolation and then reduce it modulo X^p − 1 to obtain H_p. Similarly, for H'_p we first interpolate F'_p G_p + F_p G'_p before its reduction. Finally, we can compute the polynomial FG from H_p and H'_p using FindTerms, according to Corollary 4.2. Our choice of p, which is polynomial in the input size, ensures that each call to InterpSumSP remains quasi-linear.
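This pipeline (reduce, compute the reduced products, recover the terms as in FindTerms) can be exercised end to end on small inputs. In the sketch below (names are ours) the two products are computed naively as a stand-in for InterpSumSP, which is exactly the step the real algorithm performs in quasi-linear time, and p is fixed rather than drawn by RandomPrime:

```python
def sparse_product_pipeline(F, G, p):
    """Recover H = F*G from Hp = H mod X^p - 1 and Hdp = H' mod X^p - 1,
    themselves obtained from the reductions of F, G, F', G' -- assuming p
    produces no exponent collision in F*G (likely for a random prime of
    size about T^2 ln D, by Corollary 4.2)."""
    def red(P):  # P mod X^p - 1
        R = {}
        for e, c in P.items():
            R[e % p] = R.get(e % p, 0) + c
        return {e: c for e, c in R.items() if c}
    def deriv(P):
        return {e - 1: e * c for e, c in P.items() if e}
    def mul(A, B):  # naive stand-in for the quasi-linear interpolation
        H = {}
        for ea, ca in A.items():
            for eb, cb in B.items():
                H[ea + eb] = H.get(ea + eb, 0) + ca * cb
        return H
    def add(A, B):
        S = dict(A)
        for e, c in B.items():
            S[e] = S.get(e, 0) + c
        return S
    Fp, Gp = red(F), red(G)
    Fdp, Gdp = red(deriv(F)), red(deriv(G))
    Hp = red(mul(Fp, Gp))                        # = (F*G)  mod X^p - 1
    Hdp = red(add(mul(Fdp, Gp), mul(Fp, Gdp)))   # = (F*G)' mod X^p - 1
    # FindTerms-style recovery: a term c*X^e of H contributes c to Hp at
    # e mod p and c*e to Hdp at (e-1) mod p, so e comes from one division.
    return {Hdp.get((r - 1) % p, 0) // c: c for r, c in Hp.items()}

F, G = {0: 1, 100: 2}, {3: 1, 50: -1}
# F*G = X^3 - X^50 + 2X^103 - 2X^150; the exponents stay distinct mod p = 11
assert sparse_product_pipeline(F, G, 11) == {3: 1, 50: -1, 103: 2, 150: -2}
```

With p = 5 the exponents 3 and 150 would collide and the recovery would be wrong, which is precisely the event that the verification step of Algorithm 2 catches.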
Algorithm 2 SparseProduct
Input: F, G ∈ ℤ[X]; 0 < μ₁ < μ₂ with μ₁ + μ₂ ≤ μ < 1.
Output: H ∈ ℤ[X] such that H = FG with probability at least 1 − μ.
1: t ← max(#F, #G); D ← deg(F) + deg(G); C ← t‖F‖_∞‖G‖_∞
2: λ ← max(21, μ₁⁻¹(#F#G)² ln D); μ* ← μ₂ − μ₁
3: p ← RandomPrime(λ, μ₁/2)
4: F_p ← F mod X^p − 1; G_p ← G mod X^p − 1
5: F'_p ← F' mod X^p − 1; G'_p ← G' mod X^p − 1
6: repeat
7:   N ← max(1, ⌊(t − 1) log(2p)⌋)
8:   𝒫 ← {the first 2N primes in increasing order}
9:   H₀ ← InterpSumSP([(F_p, G_p)], t, 2p, C, μ*, 𝒫)
10:  H₁ ← InterpSumSP([(F_p, G'_p), (F'_p, G_p)], t, 2p, CD, μ*, 𝒫)
11:  t ← 2t
12: until VerifySP(H₀, F_p, G_p, μ₂/2)    ▷ H₀ = F_p G_p
    and VerifySumSP(H₁, F_p, G'_p, F'_p, G_p, μ₂/2)    ▷ H₁ = F'_p G_p + F_p G'_p
13: H_p ← H₀ mod X^p − 1; H'_p ← H₁ mod X^p − 1
14: return FindTerms(p, H_p, H'_p, D, C)

Lemmas 4.6 and 4.7 respectively provide the correctness and the complexity bound of algorithm SparseProduct. Together, they consequently form a proof of Theorem 1.1, by taking ε = μ₁ + μ₂. Note that this approach translates mutatis mutandis to the multiplication of sparse polynomials over 𝔽_q where the characteristic of 𝔽_q is larger than D.

Lemma 4.6.
Let F and G be two sparse polynomials over ℤ. Then algorithm SparseProduct returns FG with probability at least 1 − μ.

Proof. Since FG has sparsity at most #F#G, Corollary 4.2 implies that if H_p = FG mod X^p − 1 and H'_p = (FG)' mod X^p − 1, the probability that FindTerms does not return FG is at most μ₁. The other reason for the result to be incorrect is that one of these equalities does not hold, which means that one of the two verifications fails. Since this happens with probability at most μ₂, SparseProduct returns FG with probability at least 1 − μ₁ − μ₂ ≥ 1 − μ.

Lemma 4.7.
Let $F$ and $G$ be two sparse polynomials over $\mathbb{Z}$, $T = \max(\#F, \#G, \#(FG))$, $D = \deg(FG)$, $C = \max(\|F\|_\infty, \|G\|_\infty, \|FG\|_\infty)$ and $\varepsilon = \mu_1 + \mu_2$. Then algorithm SparseProduct has bit complexity $\tilde O_\varepsilon(T(\log D + \log C))$ with probability at least $1 - \mu_2$. Writing $n = T(\log D + \log C)$, the bit complexity is $O_\varepsilon(n \log n \log T(\log T + \log\log n))$.

Proof. In order to obtain the given complexity, we first need to prove that, with high probability, InterpSumSP never computes polynomials with a sparsity larger than $4\#(FG)$. Let $T_p = \max(\#(F_pG_p), \#(F_pG'_p + F'_pG_p))$. If $t \le 2T_p$, the polynomials $H_1$ and $H_2$ satisfy $\#H_1, \#H_2 \le 2T_p$ by Remark 4.5. Unfortunately, $T_p$ could be as large as $T^2$, and $t$ might reach values larger than $T_p$. We now prove that: (i) with probability at least $1 - 2\mu^*$, the maximal value of $t$ during the algorithm is less than $2T_p$; (ii) with probability at least $1 - \mu_1$, $T_p \le \#(FG)$. Together, this will prove that $\#H_1, \#H_2 \le 4\#(FG)$ with probability at least $1 - 2\mu^* - \mu_1 \ge 1 - \mu_1 - \mu_2$.

(i) As soon as $t \ge T_p$, Steps 9 and 10 compute both $F_pG_p$ and $F_pG'_p + F'_pG_p$ with probability at least $1 - 2\mu^*$ by Theorem 4.3. Since VerifySP never fails when the product is correct, the algorithm ends when $T_p \le t < 2T_p$ with probability at least $1 - 2\mu^*$.

(ii) Let us define the polynomials $\hat F_p$ and $\hat G_p$ obtained from $F_p$ and $G_p$ by replacing each nonzero coefficient by $1$. The choice of $p$ in Step 3 ensures that, with probability at least $1 - \mu_1$, there is no collision in $(\hat F_p\hat G_p) \bmod X^p - 1$. In that case, there is also no collision in $F_pG_p \bmod X^p - 1$ nor in $(F_pG'_p + F'_pG_p) \bmod X^p - 1$, since $\mathrm{supp}(F_pG_p) \subset \mathrm{supp}(\hat F_p\hat G_p)$. Therefore, there are as many nonzero coefficients in $F_pG_p$ as in $F_pG_p \bmod X^p - 1$, which is equal to $FG \bmod X^p - 1$. Thus, with probability at least $1 - \mu_1$, we have $\#(F_pG_p) = \#(FG) \le T$ and similarly $\#(F'_pG_p + F_pG'_p) = \#((FG)') \le T$.

In the rest of the proof, we assume that the loop stops with $t \le 2T_p$ and that $T_p \le T$. In particular, the number of iterations of the loop is $O(\log T)$. Since $2p = O(\varepsilon^{-1}T^2\log D)$, Steps 9 and 10 have a bit complexity $\tilde O_\varepsilon(T\log(p)\log(pCD)) = \tilde O_\varepsilon(T\log(CD))$ by Theorem 4.3. Using Remark 4.5, VerifySP and VerifySumSP have polynomials of height at most $tCD$ as inputs. By Corollary 3.9, Step 12 has bit complexity $O_\varepsilon(T\log(T\log p)\,\mathsf{I}(\log CD)) = \tilde O_\varepsilon(T\log CD)$. The list $\mathcal{P}$ can be computed incrementally, adding new primes when necessary. At the end of the loop, $\mathcal{P}$ contains $O(T\log 2p)$ primes, which means that it is computed in $O_\varepsilon(T\log(p)\log(T\log p)\log\log(T\log p))$ bit operations [6, Chapter 18], that is $\tilde O_\varepsilon(T\log\log D)$ since $\log p = O(\log(T\log D))$.

The total cost for the $O(\log T)$ iterations of the loop is still $\tilde O_\varepsilon(T\log(CD))$. Step 14 runs in time $O_\varepsilon(T\,\mathsf{I}(\log CD))$ by Lemma 4.1, as the coefficients of $H'_p$ are bounded by $2TCD$ with $T \le D$, and $\#H_p \le \#H_1$, $\#H'_p \le \#H_2$. Since the other steps have negligible costs, this yields a complexity of $\tilde O_\varepsilon(T(\log C + \log D))$ with probability at least $1 - \mu_2$.

Using Remark 4.4, we can provide a more precise complexity for Steps 9 and 10, which is $O_\varepsilon(\log T \cdot \mathsf{M}_{\mathbb{Z}}(T\log(p)\log(T\log p), pDTC))$ bit operations. It is easy to observe that the $\log T$ repetitions of these steps provide the dominant term in the complexity. A careful simplification yields a bit complexity $O_\varepsilon(n\log n\log T(\log T + \log\log n))$ for SparseProduct, where $n = T(\log D + \log C)$ bounds both the input and output sizes.

Using the classical Kronecker substitution [6, Chapter 8], one can straightforwardly extend SparseProduct to multivariate polynomials. Let $F, G \in \mathbb{Z}[X_1, \dots, X_n]$ with $\|F\|_\infty, \|G\|_\infty \le C$ and $\deg_{X_i}(F) + \deg_{X_i}(G) < d$. Writing $F_u(X) = F(X, X^d, \dots, X^{d^{n-1}})$ and $G_u(X) = G(X, X^d, \dots, X^{d^{n-1}})$, one can easily retrieve $FG$ from the univariate product $F_uG_u$. It is easy to remark that the Kronecker substitution preserves the sparsity and the height, while it increases the degree to $\deg F_u, \deg G_u < d^n$. If $F$ and $G$ are sparse polynomials with at most $T$ nonzero terms, their sizes are at most $T(n\log d + \log C)$, which is exactly the size of $F_u$ and $G_u$. Since the Kronecker and inverse Kronecker substitutions cost $\tilde O(Tn\log d)$ bit operations, one can compute $F_uG_u$ using SparseProduct within the following bit complexity.
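On the sparse representation, the Kronecker substitution and its inverse are just radix-$d$ packing and unpacking of the exponent vectors. A small illustrative sketch under the stated assumption $\deg_{X_i}(F) + \deg_{X_i}(G) < d$ (the function names and dict representation are ours, and the naive product merely stands in for SparseProduct):

```python
def sparse_mul(f, g):
    """Naive sparse product; stands in for SparseProduct in this illustration."""
    res = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            res[ef + eg] = res.get(ef + eg, 0) + cf * cg
    return {e: c for e, c in res.items() if c != 0}

def kronecker(poly, d):
    """Pack {(e_1,...,e_n): c} into a univariate polynomial via X_i -> X^(d^(i-1))."""
    return {sum(e * d**i for i, e in enumerate(exps)): c
            for exps, c in poly.items()}

def inverse_kronecker(upoly, d, n):
    """Recover the exponent vectors by writing each exponent in base d."""
    res = {}
    for e, c in upoly.items():
        exps = []
        for _ in range(n):
            e, r = divmod(e, d)
            exps.append(r)
        res[tuple(exps)] = c
    return res
```

For instance, $(X_1 + X_2)(X_1 - X_2) = X_1^2 - X_2^2$ is recovered exactly through the univariate product, including the cancellation of the middle terms.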
Corollary 4.8. There exists an algorithm that takes as inputs $F, G \in \mathbb{Z}[X_1, \dots, X_n]$ and $0 < \varepsilon < 1$, and computes $FG$ with probability at least $1 - \varepsilon$, using $\tilde O_\varepsilon(T(n\log d + \log C))$ bit operations, where $T = \max(\#F, \#G, \#(FG))$, $d = \max_i(\deg_{X_i} FG)$ and $C = \max(\|F\|_\infty, \|G\|_\infty)$.

Over a finite field $\mathbb{F}_{q^s}$ for some prime $q$, the previous technique requires $q > d^n$, since SparseProduct requires the characteristic to be larger than the degree. The randomized Kronecker substitution method introduced by Arnold and Roche [2] makes it possible to apply SparseProduct over fields of smaller characteristic. The idea is to define the univariate polynomials $F_s(X) = F(X^{s_1}, \dots, X^{s_n})$ and $G_s(X) = G(X^{s_1}, \dots, X^{s_n})$ for some random vector $\vec s = (s_1, \dots, s_n)$, such that these polynomials have much smaller degrees than those obtained with the classical Kronecker substitution. As a result, we obtain an algorithm that works for a much smaller $q$, of order $\tilde O(nd\,\#F\,\#G)$.

Our approach is to first use some randomized Kronecker substitutions to estimate the sparsity of $FG$, by computing the sparsity of $H_s = F_sG_s$ for several distinct random vectors $\vec s$. With high probability, the maximal such sparsity is close to the sparsity of $FG$. Then, we use this information to provide a bound to some (multivariate) sparse interpolation algorithm. Note that our approach is inspired by [14], which slightly improves the randomized Kronecker substitution.

Lemma 4.9. Let $H \in \mathbb{F}_{q^s}[X_1, \dots, X_n]$ of sparsity $T$, and let $\vec s$ be a vector chosen uniformly at random in $S^n$, where $S \subset \mathbb{N}$ is finite. The expected sparsity of $H_s(X) = H(X^{s_1}, \dots, X^{s_n})$ is at least $T(1 - (T-1)/|S|)$.

Proof. If we fix two distinct exponent vectors $\vec e_u$ and $\vec e_v$ of $H$, they collide in $H_s$ if and only if $\vec e_u \cdot \vec s = \vec e_v \cdot \vec s$. Since $\vec e_u \ne \vec e_v$, they differ in at least one component, say $e_{u,j} \ne e_{v,j}$. The equality $\vec e_u \cdot \vec s = \vec e_v \cdot \vec s$ is then equivalent to
$$s_j = \sum_{j' \ne j} \frac{e_{v,j'} - e_{u,j'}}{e_{u,j} - e_{v,j}}\, s_{j'}.$$
Writing $Y$ for the right-hand side of this equation, we have
$$\Pr[\vec e_u \cdot \vec s = \vec e_v \cdot \vec s] = \Pr[s_j = Y] = \sum_y \Pr[s_j = Y \mid Y = y]\,\Pr[Y = y],$$
where the (finite) sum ranges over all possible values $y$ of $Y$. Since $s_j$ is chosen uniformly at random in $S$, $\Pr[s_j = Y \mid Y = y] = \Pr[s_j = y] \le 1/|S|$, and the probability that $\vec e_u$ and $\vec e_v$ collide is at most $1/|S|$. This implies that the expected number of vectors that collide is at most $T(T-1)/|S|$.

Corollary 4.10. Let $H$ be as in Lemma 4.9 and let $\vec v_1, \dots, \vec v_\ell \in S^n$ be some vectors chosen uniformly and independently at random. Then $\Pr[\max_i \#H_{v_i} \le T(1 - 2(T-1)/|S|)] \le 1/2^\ell$.

Proof. For each $\vec v_i$, the expected number of terms that collide in $H_{v_i}(X)$ is at most $T(T-1)/|S|$ by Lemma 4.9. Using Markov's inequality, we have $\Pr[\#H_{v_i} \le T - 2T(T-1)/|S|] \le 1/2$. Since the vectors $\vec v_i$ are independent, the result follows.

Algorithm 3 SparsityEstimate
Input: $F, G \in \mathbb{F}_{q^s}[X_1, \dots, X_n]$; $0 < \varepsilon < 1$; $\lambda > 1$.
Output: An integer $t$ such that $t \le \lambda\#(FG)$.
1: $N \leftarrow \lceil 2\,\#F\,\#G/(1 - 1/\lambda)\rceil$, $\ell \leftarrow \lceil\log_2(2/\varepsilon)\rceil$
2: $t' \leftarrow 0$, $\mu \leftarrow \varepsilon/(4\ell)$
3: repeat $\ell$ times
4:   $\vec s \leftarrow$ random element of $\{0, \dots, N-1\}^n$
5:   $F_s \leftarrow F(X^{s_1}, \dots, X^{s_n})$, $G_s \leftarrow G(X^{s_1}, \dots, X^{s_n})$
6:   $H_s \leftarrow$ SparseProduct$(F_s, G_s, \mu, \mu)$
7:   $t' \leftarrow \max(t', \#H_s)$
8: return $\lambda t'$.

Lemma 4.11. Algorithm SparsityEstimate is correct when $q \ge 2D\,\#F\,\#G/(1 - 1/\lambda)$, where $D = \max(\deg F, \deg G)$. With probability at least $1 - \varepsilon$, it returns an integer $t \ge \#(FG)$, using $\tilde O_\varepsilon(T(n\log d + s\log q))$ bit operations, where $T = \max(\#(FG), \#F, \#G)$ and $d = \max_i(\deg_{X_i} FG)$.

Proof. Since each polynomial $H_s$ has sparsity at most $\#(FG)$, SparsityEstimate returns an integer bounded by $\lambda\#(FG)$. SparseProduct can be used in Step 6, since $\deg H_s = \deg F_s + \deg G_s \le ND \le q$ by the definition of $N$. Assuming that SparseProduct returns no incorrect answer during the loop, Corollary 4.10 applied to the product $FG$ implies that $t' \ge \#(FG)(1 - 2(\#(FG) - 1)/N)$ with probability at least $1 - 2^{-\ell} \ge 1 - \varepsilon/2$, and since $\#F\,\#G \ge \#(FG)$, this gives $t' \ge \#(FG)/\lambda$. Taking into account the probability of failure of SparseProduct, the probability that $\lambda t' \ge \#(FG)$ is at least $1 - \varepsilon$.

The computation of $F_s$ and $G_s$ in Step 5 requires $O(Tn\,\mathsf{I}(\log\max(d, N)) + Ts\log q)$ bit operations. Since $\max(\#F_s, \#G_s, \#H_s) \le T$ and $\deg H_s = O(ndT^2)$ in Step 6, the bit complexity of each call to SparseProduct is $\tilde O_\mu(T(\log(nd) + s\log q))$ with probability at least $1 - \mu$, using Lemma 4.7. Therefore, SparsityEstimate requires $\tilde O_\varepsilon(T(n\log d + s\log q))$ bit operations with probability at least $1 - \varepsilon/4$. Together with the probability of failure, this concludes the proof.
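Stripped of the probability management, algorithm SparsityEstimate is short: pick $\ell$ random substitution vectors, multiply the univariate images, and return the largest observed sparsity inflated by $\lambda$. A toy version with a naive product standing in for SparseProduct (all names are ours; this deliberately ignores the precise choices of $N$, $\ell$ and $\mu$ analyzed above):

```python
import random

def sparse_mul(f, g):
    """Naive sparse product over the integers; a stand-in for SparseProduct."""
    res = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            res[ef + eg] = res.get(ef + eg, 0) + cf * cg
    return {e: c for e, c in res.items() if c != 0}

def substitute(poly, s):
    """Randomized Kronecker substitution X_i -> X^(s_i): the exponent vector e
    maps to the dot product e.s, merging colliding terms."""
    res = {}
    for exps, c in poly.items():
        e = sum(ei * si for ei, si in zip(exps, s))
        res[e] = res.get(e, 0) + c
    return {e: c for e, c in res.items() if c != 0}

def sparsity_estimate(f, g, n, N, ell, lam, rng):
    """Return lam * max over ell trials of #(f_s g_s), s random in {0..N-1}^n."""
    t = 0
    for _ in range(ell):
        s = [rng.randrange(N) for _ in range(n)]
        t = max(t, len(sparse_mul(substitute(f, s), substitute(g, s))))
    return lam * t
```

Each trial can only underestimate the sparsity (collisions merge terms, as quantified by Lemma 4.9), so the maximum over independent trials is a lower bound on $\#(FG)$ that is tight with high probability, which is exactly what Corollary 4.10 exploits.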
Theorem 4.12.
There exists an algorithm that takes as inputs two sparse polynomials $F$ and $G$ in $\mathbb{F}_{q^s}[X_1, \dots, X_n]$ and $0 < \varepsilon < 1$, and that returns the product $FG$ in $\tilde O_\varepsilon(nT(\log d + s\log q))$ bit operations with probability at least $1 - \varepsilon$, where $T = \max(\#F, \#G, \#(FG))$, $d = \max_i(\deg_{X_i} FG)$, $D = \deg FG$, and assuming that $q = \Omega(D\,\#F\,\#G + DT\log(D)\log(T\log D))$.

Proof. The algorithm first computes an estimate $t$ of the sparsity of $FG$ using SparsityEstimate$(F, G, \varepsilon/2, \lambda)$ for some constant $\lambda$. The second step interpolates $FG$ using Huang and Gao's algorithm [14, Algorithm 5 (MulPolySI)], which is parameterized by a univariate sparse interpolation algorithm. Originally, its inputs are a polynomial given as a blackbox, together with bounds on its degree and sparsity. In our case, the blackbox is replaced by $F$ and $G$, the sparsity bound is $t$, and the univariate interpolation algorithm is SparseProduct.

The algorithm MulPolySI requires $O_\varepsilon((n+1)\log t)$ interpolations of univariate polynomials of degree $\tilde O(tD)$ and sparsity at most $t$. Each interpolation with SparseProduct is done with $\mu_1, \mu_2$ such that $\mu_1 + \mu_2 = \varepsilon/(2(n+1)\log t)$, so that MulPolySI returns the correct answer in $\tilde O_\varepsilon(nT(\log d + s\log q))$ bit operations with probability at least $1 - \varepsilon/2$ [14, Theorem 6]. Altogether, our two-step algorithm returns the correct answer using $\tilde O_\varepsilon(nT(\log d + s\log q))$ bit operations with probability at least $1 - \varepsilon$. The value of $q$ is such that it bounds the degrees of the univariate polynomials returned by SparseProduct during the algorithm.

4.4 Small characteristic
We now consider the case of sparse polynomial multiplication over a field $\mathbb{F}_{q^s}$ whose characteristic is smaller than the degree of the product $FG$ (or, in the multivariate case, smaller than the degree of the product after randomized Kronecker substitution). We can no longer use Huang's interpolation algorithm, since it uses the derivative to encode the exponents into the coefficients, and thus it only keeps the value of the exponents modulo $q$. Our idea to circumvent this problem is similar to the one in [3]: we rather consider the polynomials over $\mathbb{Z}$ before calling our algorithm SparseProduct.

The following proposition is only given for the multivariate case, as it encompasses the univariate one. It matches exactly the complexity result given by Arnold and Roche [3].

Proposition 4.13. There exists an algorithm that takes as inputs two sparse polynomials $F$ and $G$ in $\mathbb{F}_{q^s}[X_1, \dots, X_n]$ and $0 < \varepsilon < 1$, and that returns the product $FG$ in $\tilde O_\varepsilon(S(n\log d + s\log q))$ bit operations with probability at least $1 - \varepsilon$, where $S$ is the structural sparsity of $FG$ and $d = \max_i(\deg_{X_i} FG)$.

Proof. If $s = 1$, the coefficients of $F$ and $G$ map easily to the integers in $\{0, \dots, q-1\}$. Therefore, the product $FG$ can be obtained by using an integer sparse polynomial multiplication, such as the one of Corollary 4.8, followed by some reductions modulo $q$. Unfortunately, mapping the multiplication over the integers implies that the cancellations that could have occurred in $\mathbb{F}_q$ do not hold anymore. Consequently, the support of the product in $\mathbb{Z}$ before the modular reduction is exactly the structural support of $FG$.

If $s > 1$, the coefficients of $F$ and $G$ are polynomials over $\mathbb{F}_q$ of degree $s - 1$. As previously, mapping $\mathbb{F}_q$ to the integers, $F$ and $G$ can be seen as $F_Y, G_Y \in \mathbb{Z}[Y][X_1, \dots, X_n]$, where the coefficients are polynomials in $\mathbb{Z}[Y]$ of degree at most $s - 1$ and height less than $q$. Writing $T = \max(\#F, \#G)$, the coefficients of $F_YG_Y$ are polynomials of degree at most $2s - 2$ and height at most $Tsq^2$. Therefore, the product $FG \in \mathbb{F}_{q^s}[X_1, \dots, X_n]$ can be computed by: (i) computing $F_B, G_B \in \mathbb{Z}[X_1, \dots, X_n]$ by evaluating the coefficients of $F_Y$ and $G_Y$ at $B = Tsq^2$ (Kronecker substitution); (ii) computing the product $H_B = F_BG_B$; (iii) writing the coefficients of $H_B$ in base $B$ to obtain $H_Y = F_YG_Y$ (Kronecker segmentation); (iv) and finally mapping back the coefficients of $H_Y$ from $\mathbb{Z}[Y]$ to $\mathbb{F}_{q^s}$.

Similarly to the case $s = 1$, $H_B$ and then $H_Y$ have at most $S$ nonzero coefficients. The Kronecker substitutions in (i) require $\tilde O(Ts\log q)$ bit operations, while the Kronecker segmentations in (iii) need $\tilde O(Ss\log q)$ bit operations. In (iv), we first compute $Ss$ reductions modulo $q$ on integers smaller than $B$, and then $S$ polynomial divisions in $\mathbb{F}_q[Y]$ by a polynomial of degree $O(s)$. Thus, this can be done in $\tilde O(Ss\log q)$ bit operations. Finally, the computation in (ii) is dominant: it requires $\tilde O_\varepsilon(S(n\log d + s\log q))$ bit operations with probability at least $1 - \varepsilon$, using Corollary 4.8.

References

[1] Andrew Arnold.
Sparse polynomial interpolation and testing. PhD thesis, University of Waterloo, 2016.

[2] Andrew Arnold and Daniel S. Roche. Multivariate sparse interpolation using randomized Kronecker substitutions. In ISSAC'14, pages 35–42. ACM, 2014.

[3] Andrew Arnold and Daniel S. Roche. Output-sensitive algorithms for sumset and sparse polynomial multiplication. In ISSAC'15, pages 29–36. ACM, 2015.

[4] Michael Ben-Or and Prasoon Tiwari. A Deterministic Algorithm for Sparse Multivariate Polynomial Interpolation. In STOC'88, pages 301–309. ACM, 1988.

[5] Richard Cole and Ramesh Hariharan. Verifying candidate matches in sparse and wildcard matching. In STOC'02, pages 592–601. ACM, 2002.

[6] Joachim von zur Gathen and Jürgen Gerhard. Modern Computer Algebra. Cambridge University Press, 3rd edition, 2013.

[7] Pascal Giorgi. A probabilistic algorithm for verifying polynomial middle product in linear time. Inform. Process. Lett., 139:30–34, 2018.

[8] David Harvey and Joris van der Hoeven. Faster polynomial multiplication over finite fields using cyclotomic coefficient rings. J. Complexity, 54, 2019.

[9] David Harvey and Joris van der Hoeven. Integer multiplication in time O(n log n), 2019. URL: https://hal.archives-ouvertes.fr/hal-02070778.

[10] Joris van der Hoeven, Romain Lebreton, and Éric Schost. Structured FFT and TFT: Symmetric and Lattice Polynomials. In ISSAC'13, pages 355–362. ACM, 2013.

[11] Joris van der Hoeven and Grégoire Lecerf. On the Complexity of Multivariate Blockwise Polynomial Multiplication. In ISSAC'12, pages 211–218. ACM, 2012.

[12] Joris van der Hoeven and Grégoire Lecerf. On the bit-complexity of sparse polynomial and series multiplication. J. Symb. Comput., 50:227–254, 2013.

[13] Qiao-Long Huang. Sparse polynomial interpolation over fields with large or zero characteristic. In ISSAC'19, pages 219–226. ACM, 2019.

[14] Qiao-Long Huang and Xiao-Shan Gao. Revisit Sparse Polynomial Interpolation Based on Randomized Kronecker Substitution. In CASC'19, pages 215–235. Springer, 2019.

[15] Stephen C. Johnson. Sparse polynomial arithmetic. ACM SIGSAM Bulletin, 8(3):63–71, 1974.

[16] Michael Monagan and Roman Pearce. Parallel sparse polynomial multiplication using heaps. In ISSAC'09, page 263. ACM, 2009.

[17] Michael Monagan and Roman Pearce. Sparse polynomial division using a heap. J. Symb. Comput., 46(7):807–822, 2011.

[18] Vasileios Nakos. Nearly optimal sparse polynomial multiplication. arXiv preprint, 2019.

[19] David A. Plaisted. New NP-hard and NP-complete polynomial and integer divisibility problems. Theor. Comput. Sci., 31(1):125–138, 1984.

[20] R. Prony. Essai expérimental et analytique sur les lois de la dilatabilité de fluides élastique et sur celles de la force expansive de la vapeur de l'eau et de la vapeur de l'alkool, à différentes températures. J. École Polytechnique, 1(Floréal et Prairial III):24–76, 1795. URL: https://gallica.bnf.fr/ark:/12148/bpt6k433661n/f32.item.

[21] Daniel S. Roche. Chunky and equal-spaced polynomial multiplication. J. Symb. Comput., 46(7):791–806, 2011.

[22] Daniel S. Roche. What can (and can't) we do with sparse polynomials? In ISSAC'18, pages 25–30. ACM, 2018.

[23] J. Barkley Rosser and Lowell Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6(1):64–94, 1962.

[24] Victor Shoup. A Computational Introduction to Number Theory and Algebra. Cambridge University Press, second edition, 2008.

[25] Andrew Chi-Chih Yao. On the Evaluation of Powers. SIAM J. Comput., 5(1):100–103, 1976. DOI: 10.1137/0205008.