Fast Hermite interpolation and evaluation over finite fields of characteristic two
NICHOLAS COXON
Abstract.
This paper presents new fast algorithms for Hermite interpolation and evaluation over finite fields of characteristic two. The algorithms reduce the Hermite problems to instances of the standard multipoint interpolation and evaluation problems, which are then solved by existing fast algorithms. The reductions are simple to implement and free of multiplications, allowing low overall multiplicative complexities to be obtained. The algorithms are suitable for use in encoding and decoding algorithms for multiplicity codes.

1. Introduction
Hermite interpolation is the problem of computing the coefficients of a polynomial given the values of its derivatives up to a given order at one or more evaluation points. The inverse problem, that of evaluating the derivatives of the polynomial when given its coefficients, is sometimes referred to as Hermite evaluation. Over fields of positive characteristic $p$, the $i$th formal derivative vanishes identically for $i \ge p$. Consequently, it is usual to consider Hermite interpolation and evaluation with respect to the Hasse derivative over fields of small positive characteristic.

For now, let $\mathbb{F}$ simply denote a field. Then, for $i \in \mathbb{N}$, the map $D^i \colon \mathbb{F}[x] \to \mathbb{F}[x]$ that sends $F \in \mathbb{F}[x]$ to the coefficient of $y^i$ in $F(x+y) \in \mathbb{F}[x][y]$ is called the $i$th Hasse derivative on $\mathbb{F}[x]$. For distinct evaluation points $\omega_0, \dots, \omega_{n-1} \in \mathbb{F}$ and positive integer multiplicities $\ell_0, \dots, \ell_{n-1}$, the Hermite interpolation problem over $\mathbb{F}$ asks that we compute the coefficients of a polynomial $F \in \mathbb{F}[x]$ of degree less than $\ell = \ell_0 + \dots + \ell_{n-1}$ when given $(D^iF)(\omega_j)$ for $i \in \{0, \dots, \ell_j - 1\}$ and $j \in \{0, \dots, n-1\}$. The corresponding instance of the Hermite evaluation problem asks that we use the coefficients of $F$ to compute the $\ell$ derivatives of the interpolation problem. Different versions of the problems specify different bases on which the polynomials are required to be represented. In this paper, the problems are considered with respect to the monomial basis $\{1, x, x^2, \dots\}$ of $\mathbb{F}[x]$ only.

The boundary case $\ell_0 = \dots = \ell_{n-1} = 1$ corresponds to standard multipoint interpolation and evaluation, allowing both problems to be solved with $O(\mathsf{M}(\ell) \log \ell)$ operations in $\mathbb{F}$ by the use of remainder trees and fast Chinese remainder algorithms [17, 30, 7, 9, 8, 3, 37] (see also [41, Chapter 10]).
Here, $\mathsf{M}(\ell)$ denotes the number of operations required to multiply two polynomials in $\mathbb{F}[x]$ of degree less than $\ell$, which may be taken to be in $O(\ell (\log \ell) \log\log \ell)$ [34, 33, 12]. The complexity of solving the standard interpolation and evaluation problems reduces to $O(\mathsf{M}(\ell))$ operations when the evaluation points form a geometric progression [10]. Similarly, fast Fourier transform (FFT) [15] based interpolation and evaluation offer complexities as low as $O(\ell \log \ell)$ operations on certain special sets of evaluation points.

INRIA and Laboratoire d'Informatique de l'École polytechnique, Palaiseau, France.
E-mail address: [email protected].
Date: July 3, 2018.
Key words and phrases. Hermite interpolation, Hermite evaluation, multiplicity codes.
This work was supported by Nokia in the framework of the common laboratory between Nokia Bell Labs and INRIA.

For the opposing boundary case of $n = 1$, the Hermite interpolation and evaluation problems both reduce to computing Taylor expansions. Indeed, it follows directly from the definition of Hasse derivatives that

$$F = \sum_{i \in \mathbb{N}} (D^iF)(\omega)\,(x - \omega)^i \quad \text{for } F \in \mathbb{F}[x] \text{ and } \omega \in \mathbb{F}. \tag{1.1}$$

Consequently, Hermite interpolation and evaluation at a single evaluation point can be performed in $O(\mathsf{M}(\ell) \log \ell)$ operations in general [7, 39, 40], $O(\mathsf{M}(\ell))$ operations if $(\ell - 1)!$ is invertible in the field, and $O(\ell \log \ell)$ operations if the field has characteristic equal to two [19].

The first quasi-linear time algorithms for solving the general Hermite problems were proposed by Chin [14]. Truncating the Taylor expansion (1.1) after $i$ terms gives the residue of degree less than $i$ of $F$ modulo $(x - \omega)^i$. Based on this observation, Chin's evaluation algorithm begins by using a remainder tree to compute the residues of the input polynomial modulo $(x - \omega_i)^{\ell_i}$ for $i \in \{0, \dots, n-1\}$. The Taylor expansion of each residue at its corresponding evaluation point is then computed to obtain the truncated Taylor expansion of the input polynomial. The interpolation problem can be solved by reversing these steps, with the residues combined by a fast Chinese remainder algorithm.
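To make the Taylor expansion (1.1) and its truncation concrete, the following worked example expands $F = x^3 + x$ at $\omega = 1$ over the field with two elements. The polynomial and the evaluation point are chosen here purely for illustration and do not come from the paper; the Hasse derivatives are read off as the coefficients of $y^i$ in $F(x+y) = x^3 + x + (x^2+1)y + xy^2 + y^3$.

```latex
\begin{align*}
D^0F &= x^3 + x, & (D^0F)(1) &= 0,\\
D^1F &= x^2 + 1, & (D^1F)(1) &= 0,\\
D^2F &= x,       & (D^2F)(1) &= 1,\\
D^3F &= 1,       & (D^3F)(1) &= 1,
\end{align*}
so that (1.1) reads
\[
  F = (x-1)^2 + (x-1)^3 = (x+1)^2\bigl(1 + (x+1)\bigr) = (x^2+1)\,x = x^3 + x .
\]
```

Truncating the expansion after two terms leaves $0$, which is the residue of $F$ modulo $(x-1)^2$, in agreement with the fact that $(x+1)^2$ divides $x^3 + x$ in characteristic two.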
It follows that the general Hermite interpolation and evaluation problems may be solved with $O(\mathsf{M}(\ell) \log \ell)$ operations [14, 31] (see also [6, 32]).

In this paper, we present new algorithms for Hermite interpolation and evaluation over finite fields of characteristic two. The algorithms require the set of evaluation points to equal the field itself, and their corresponding multiplicities to be balanced, with $|\ell_i - \ell_j| \le 1$ for $i \ne j$. While not solving the general interpolation and evaluation problems over these fields, the algorithms are suitable for use in multivariate Hermite interpolation and evaluation algorithms [16], encoding and decoding algorithms for multiplicity codes [22, 16] and the codes of Wu [43], and private information retrieval protocols based on these codes [42, 2].

Over a characteristic two finite field of order $q$, the restricted problems may be solved with $O(\mathsf{M}(\ell) \log q + \ell \log(\ell/q))$ operations by existing algorithms. The algorithms presented in this paper yield the same complexities, but benefit from their simplicity and the low number of multiplications they perform. When $\ell$ is a multiple of $q$, as occurs in some encoding and decoding contexts, the Hermite interpolation algorithm presented here performs $\ell/q$ standard interpolations over the $q$ evaluation points, followed by $O(\ell \log(\ell/q))$ additions. The Hermite evaluation algorithm performs $O(\ell \log(\ell/q))$ additions, followed by $\ell/q$ standard evaluations over the $q$ points. Using the generic bound of $O(\mathsf{M}(q) \log q)$ operations for solving the standard problems leads to the above bound on solving the Hermite problems in this special case.

We are not prohibited from using faster FFT-based interpolation and evaluation algorithms to solve the standard problems when they are supported by the field. Moreover, we have the option of using "additive FFT" algorithms [11, 29, 19, 5, 4, 26, 25, 24, 13], which are specific to characteristic two finite fields and allow evaluation and interpolation over the $q$ points of the field to be performed with $O(q \log q)$ or $O(q (\log q) \log\log q)$ additions, depending on the degree of the field, and $O(q \log q)$ multiplications. With these algorithms and $\ell$ a multiple of $q$, the Hermite interpolation and evaluation algorithms perform $O(\ell(\log q + \log(\ell/q)))$ or $O(\ell((\log q) \log\log q + \log(\ell/q)))$ additions, and only $O(\ell \log q)$ multiplications.

When $\ell$ is not a multiple of $q$, the Hermite interpolation and evaluation algorithms still perform $\lceil \ell/q \rceil - 1$ standard interpolations or evaluations over the $q$ evaluation points, but each must also solve one instance of a slightly generalised version of the corresponding standard problem. However, these more general problems reduce to the standard problems at the cost of $O(\mathsf{M}(q))$ operations for performing one division with remainder of polynomials of degree less than $q$ (see [41, Section 9.1]). Consequently, the algorithms retain their simplicity in this case.

The reduction from Hermite to standard problems is provided in Section 3, where we develop divide-and-conquer algorithms for solving the Hermite interpolation and evaluation problems when $\ell/q$ is a power of two. The problems for arbitrary $\ell$ can be reduced to this special case by zero padding. However, this approach almost doubles the size of the initial problem when $\ell/q$ is slightly larger than a power of two, leading to large jumps in complexity. Instead, in Sections 4 and 5 we address the problems for arbitrary $\ell$ by transferring across ideas from pruned and truncated FFT algorithms [28, 35, 36, 20, 21, 23], which are used to smooth similar unwanted jumps in the complexities of FFT-based evaluation and interpolation schemes. We are consequently able to solve the Hermite interpolation and evaluation problems with better complexity than obtained by zero padding.

2. Properties of Hasse derivatives
We begin by recalling some basic properties of Hasse derivatives.
Lemma 2.1. Let $F, G \in \mathbb{F}[x]$, $\alpha, \beta, \omega \in \mathbb{F}$ and $i \in \mathbb{N}$. Then
(1) $D^i(\alpha F + \beta G) = \alpha (D^iF) + \beta (D^iG)$,
(2) $(D^iF)(\omega)$ is equal to the coefficient of $x^i$ in $F(x + \omega)$,
(3) $(D^jF)(\omega) = 0$ for $j \in \{0, \dots, i-1\}$ if and only if $(x - \omega)^i$ divides $F$,
(4) $D^i x^k = \binom{k}{i} x^{k-i}$ for $k \in \mathbb{N}$, and
(5) $D^i \circ D^j = \binom{i+j}{i} D^{i+j}$ for $j \in \mathbb{N}$.

Properties (1) and (2) of Lemma 2.1 follow readily from the definition of Hasse derivatives provided in the introduction. Property (3) follows from Property (2). Property (4) follows from the definition of Hasse derivatives and the binomial theorem. Property (5) follows from Properties (1) and (4), and the binomial identity

$$\binom{k-j}{i}\binom{k}{j} = \binom{i+j}{i}\binom{k}{i+j} \quad \text{for } i, j, k \in \mathbb{N}.$$

For $\ell > 0$, let $\mathbb{F}[x]_\ell$ denote the space of polynomials in $\mathbb{F}[x]$ that have degree less than $\ell$. Then existence and uniqueness for the general Hermite interpolation problem is provided by the following lemma.

Lemma 2.2. Let $\omega_0, \dots, \omega_{n-1} \in \mathbb{F}$ be distinct, $\ell_0, \dots, \ell_{n-1}$ be positive integers, and $\ell = \ell_0 + \dots + \ell_{n-1}$. Then given elements $h_{i,j} \in \mathbb{F}$ for $i \in \{0, \dots, \ell_j - 1\}$ and $j \in \{0, \dots, n-1\}$, there exists a unique polynomial $F \in \mathbb{F}[x]_\ell$ such that $(D^iF)(\omega_j) = h_{i,j}$ for $i \in \{0, \dots, \ell_j - 1\}$ and $j \in \{0, \dots, n-1\}$.
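The properties of Lemma 2.1 are easy to check numerically on small examples. The sketch below is an illustration only: it works over a prime field GF(p) with integers modulo p (an assumption made for brevity, whereas the paper later specialises to fields of characteristic two), and the helper names `hasse_derivative`, `formal_derivative` and `evaluate` are hypothetical. It uses Property (4) to compute Hasse derivatives, then contrasts them with the ordinary formal derivative for $F = (x+1)^2$ over GF(2).

```python
from math import comb

def hasse_derivative(f, i, p):
    """Coefficients of D^i F over GF(p), using D^i x^k = binom(k, i) x^(k-i)."""
    return [comb(k, i) * f[k] % p for k in range(i, len(f))]

def formal_derivative(f, p):
    """Coefficients of the ordinary formal derivative of F over GF(p)."""
    return [k * f[k] % p for k in range(1, len(f))]

def evaluate(f, w, p):
    """Horner evaluation of F at w over GF(p)."""
    acc = 0
    for c in reversed(f):
        acc = (acc * w + c) % p
    return acc

# F = (x + 1)^2 = x^2 + 1 over GF(2), coefficients listed from low to high degree.
F = [1, 0, 1]

# The second formal derivative vanishes identically in characteristic two ...
second_formal = formal_derivative(formal_derivative(F, 2), 2)

# ... whereas the Hasse derivatives at omega = 1 still detect the exact
# multiplicity of the root, as in Property (3).
orders = [evaluate(hasse_derivative(F, i, 2), 1, 2) for i in range(3)]
print(second_formal, orders)
```

Here the Hasse derivatives of orders 0 and 1 vanish at $\omega = 1$ while the one of order 2 does not, matching the fact that $(x-1)^2$ divides $F$ but $(x-1)^3$ does not.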
Lemma 2.2 follows from Property (3) of Lemma 2.1, which implies that the kernel of the linear map from $\mathbb{F}[x]_\ell$ to $\mathbb{F}^\ell$ given by $F \mapsto ((D^iF)(\omega_j))_{0 \le i < \ell_j,\, 0 \le j < n}$ is trivial, since a nonzero polynomial of degree less than $\ell$ cannot be divisible by $\prod_{j=0}^{n-1} (x - \omega_j)^{\ell_j}$.

Hereafter, we assume that $\mathbb{F}$ is finite of characteristic two. Let $q$ denote the order of the field, and enumerate its elements as $\omega_0, \dots, \omega_{q-1}$. Define $i \operatorname{div} j = \lfloor i/j \rfloor$ and $i \bmod j = i - \lfloor i/j \rfloor j$ for $i, j \in \mathbb{Z}$ such that $j$ is nonzero. Then the Hermite interpolation problem we consider in the remainder of the paper can be stated as follows: given $(h_0, \dots, h_{\ell-1}) \in \mathbb{F}^\ell$, compute the vector $(f_0, \dots, f_{\ell-1}) \in \mathbb{F}^\ell$ such that $F = \sum_{i=0}^{\ell-1} f_i x^i$ satisfies $(D^{i \operatorname{div} q}F)(\omega_{i \bmod q}) = h_i$ for $i \in \{0, \dots, \ell-1\}$. The Hermite evaluation problem we consider is the inverse problem, asking that we compute the vector $((D^{i \operatorname{div} q}F)(\omega_{i \bmod q}))_{0 \le i < \ell}$ when given the coefficient vector of $F \in \mathbb{F}[x]_\ell$. We call $\ell$ the length of an instance of either problem, and observe that if $\ell \le q$, then the problems reduce to standard multipoint interpolation and evaluation with evaluation points $\omega_0, \dots, \omega_{\ell-1}$.

In this section, we introduce the main elements of our algorithms by temporarily limiting our attention to instances of length $2^nq$ for some $n \in \mathbb{N}$. The algorithms take on their simplest form in this case, with each applying a simple reduction from the length $2^nq$ problem to two problems of length $2^{n-1}q$. Proceeding recursively, both algorithms ultimately reduce to problems of length $q$, which are then solved by existing standard interpolation and evaluation algorithms. The reductions employed by the algorithms are provided by the following lemma.

Lemma 3.1. Let $n \in \mathbb{N}$ be nonzero, $F_0, F_1 \in \mathbb{F}[x]_{2^{n-1}q}$ and

$$F = F_1 (x^q - x)^{2^{n-1}} + F_0. \tag{3.1}$$

Then for $\omega \in \mathbb{F}$ and $i \in \{0, \dots, 2^n - 1\}$,

$$(D^iF)(\omega) = \begin{cases} (D^iF_0)(\omega) & \text{if } i < 2^{n-1}, \\ \bigl(D^{i - 2^{n-1}}(F_1 + D^{2^{n-1}}F_0)\bigr)(\omega) & \text{otherwise.} \end{cases}$$

Proof. Let $n \in \mathbb{N}$ be nonzero, $F_0, F_1 \in \mathbb{F}[x]_{2^{n-1}q}$ and define $F$ by (3.1).
Then, since $\omega^q = \omega$ and the field has characteristic two, we have $(x+\omega)^q - (x+\omega) = x^q - x$ and $(x^q - x)^{2^{n-1}} = x^{2^{n-1}q} + x^{2^{n-1}}$, so that

$$F(x + \omega) = F_1(x + \omega)\, x^{2^{n-1}q} + F_1(x + \omega)\, x^{2^{n-1}} + F_0(x + \omega)$$

for $\omega \in \mathbb{F}$. Consequently, as $2^{n-1}q \ge 2^n$, Property (2) of Lemma 2.1 implies that

$$(D^iF)(\omega) = \begin{cases} (D^iF_0)(\omega) & \text{if } i < 2^{n-1}, \\ (D^{i - 2^{n-1}}F_1)(\omega) + (D^iF_0)(\omega) & \text{otherwise,} \end{cases}$$

for $\omega \in \mathbb{F}$ and $i \in \{0, \dots, 2^n - 1\}$. Therefore, linearity of Hasse derivatives implies that the lemma will follow if we can show that $D^i = D^{i - 2^{n-1}} \circ D^{2^{n-1}}$ for $i \in \{2^{n-1}, \dots, 2^n - 1\}$. To this end, we use Lucas' lemma [27, p. 230] (see also [18]), which states that

$$\binom{u}{v} \equiv \binom{u \operatorname{div} 2^r}{v \operatorname{div} 2^r} \binom{u \bmod 2^r}{v \bmod 2^r} \pmod{2} \quad \text{for } u, v, r \in \mathbb{N}. \tag{3.2}$$

By combining Lucas' lemma with Property (5) of Lemma 2.1, we find that

$$D^{i - 2^{n-1}} \circ D^{2^{n-1}} = \binom{2^{n-1} + (i - 2^{n-1})}{2^{n-1}} D^i = \binom{1}{1}\binom{i - 2^{n-1}}{0} D^i = D^i$$

for $i \in \{2^{n-1}, \dots, 2^n - 1\}$. □

Given a vector $(h_0, \dots, h_{2^nq-1}) \in \mathbb{F}^{2^nq}$ that defines an instance of the Hermite interpolation problem, our algorithm recursively computes the corresponding polynomial $F \in \mathbb{F}[x]_{2^nq}$ as follows. If $n = 0$, then we are in the base case of the recursion, and $F$ is recovered by a standard interpolation algorithm. If $n \ge 1$, then the algorithm is recursively called on $(h_0, \dots, h_{2^{n-1}q-1})$ and $(h_{2^{n-1}q}, \dots, h_{2^nq-1})$. Lemmas 2.2 and 3.1 imply that the recursive calls return $F_0$ and $F_1 + D^{2^{n-1}}F_0$, where $F_0$ and $F_1$ are the unique polynomials in $\mathbb{F}[x]_{2^{n-1}q}$ that satisfy (3.1). Thus, the algorithm next recovers $F_1$ by computing $D^{2^{n-1}}F_0$ and adding it to $F_1 + D^{2^{n-1}}F_0$. Finally, $F$ is computed by expanding (3.1) as

$$F = F_1 x^{2^{n-1}q} + F_1 x^{2^{n-1}} + F_0. \tag{3.3}$$
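The recursion just described can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not taken from the paper: the field is GF(4), represented by the integers 0 to 3 modulo $x^2 + x + 1$; the enumeration is $\omega_i = i$; the base case uses a naive interpolation built from the indicator polynomials $1 + (x + \omega)^{q-1}$; and Hasse derivatives are computed term by term from the parity of binomial coefficients. All function names are hypothetical, and the code makes no attempt to reach the complexities discussed in the paper.

```python
import random

M = 2                    # the field is GF(2^M); M = 2 gives GF(4)
Q = 1 << M               # field order q
MODULUS = 0b111          # x^2 + x + 1, irreducible over GF(2)
OMEGA = list(range(Q))   # assumed enumeration omega_0, ..., omega_{q-1}

def gf_mul(a, b):
    """Multiply two field elements (bit vectors) modulo MODULUS."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MODULUS
    return r

def poly_add(f, g):
    """Add coefficient lists (low degree first); addition is XOR."""
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return [a ^ b for a, b in zip(f, g)]

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] ^= gf_mul(a, b)
    return out

def poly_eval(f, w):
    acc = 0
    for c in reversed(f):
        acc = gf_mul(acc, w) ^ c
    return acc

def hasse(f, i):
    """D^i F in characteristic two: binom(k, i) is odd iff (k & i) == i."""
    return [f[k] if (k & i) == i else 0 for k in range(i, len(f))]

def hermite_eval_naive(f, length):
    """Reference evaluation: (D^(i div q) F)(omega_(i mod q)) for i < length."""
    return [poly_eval(hasse(f, i // Q), OMEGA[i % Q]) for i in range(length)]

def interpolate_base(h):
    """Naive standard interpolation over all q points of the field."""
    out = [0] * Q
    for w, value in zip(OMEGA, h):
        ind = [1]
        for _ in range(Q - 1):
            ind = poly_mul(ind, [w, 1])       # (x + w)^(q-1)
        ind = poly_add(ind, [1])              # indicator polynomial of w
        out = poly_add(out, [gf_mul(value, c) for c in ind])
    return out

def hermite_interpolate(h):
    """Recover F of degree < len(h) = 2^n q from its Hermite values h."""
    if len(h) == Q:
        return interpolate_base(h)
    half = len(h) // 2                        # 2^(n-1) q
    s = half // Q                             # 2^(n-1)
    f0 = hermite_interpolate(h[:half])        # F_0
    g = hermite_interpolate(h[half:])         # F_1 + D^(2^(n-1)) F_0
    f1 = poly_add(g, hasse(f0, s))            # F_1
    # Expansion (3.3): F = F_1 x^(2^(n-1) q) + F_1 x^(2^(n-1)) + F_0.
    return poly_add(poly_add([0] * half + f1, [0] * s + f1), f0)

random.seed(1)
f = [random.randrange(Q) for _ in range(4 * Q)]   # length 2^2 q
recovered = hermite_interpolate(hermite_eval_naive(f, len(f)))
print(recovered == f)
```

The round trip relies on the uniqueness statement of Lemma 2.2: evaluating a random polynomial naively and then interpolating must return the original coefficients.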
Given a polynomial $F \in \mathbb{F}[x]_{2^nq}$, the evaluation algorithm uses a standard evaluation algorithm in its base case of $n = 0$, and if $n \ge 1$, then it simply reverses the steps of the interpolation algorithm by first computing $F_0$ and $F_1$, then $F_1 + D^{2^{n-1}}F_0$, and finally recursively evaluating $F_0$ and $F_1 + D^{2^{n-1}}F_0$. In both algorithms, the following lemma is used to compute derivatives.

Lemma 3.2. Let $n \in \mathbb{N}$ be nonzero and $F = \sum_{i=0}^{2^{n-1}q-1} f_i x^i \in \mathbb{F}[x]$. Then

$$D^{2^{n-1}}F = \sum_{i=0}^{q/2-1} x^{2^n i} \sum_{j=0}^{2^{n-1}-1} f_{2^{n-1}(2i+1)+j}\, x^j. \tag{3.4}$$

Proof. Let $n \in \mathbb{N}$ be nonzero and $F = \sum_{i=0}^{2^{n-1}q-1} f_i x^i \in \mathbb{F}[x]$. Then Property (4) of Lemma 2.1 and Lucas' lemma, in the form of (3.2), imply that

$$D^{2^{n-1}} x^{2^{n-1}(2i+b)+j} = \binom{2i+b}{1}\binom{j}{0} x^{2^{n-1}(2i+b-1)+j} = b\, x^{2^{n-1}(2i+b-1)+j}$$

for $b \in \{0, 1\}$, $i \in \mathbb{N}$ and $j \in \{0, \dots, 2^{n-1}-1\}$. Therefore, writing $F$ in the form

$$F = \sum_{b=0}^{1} \sum_{i=0}^{q/2-1} \sum_{j=0}^{2^{n-1}-1} f_{2^{n-1}(2i+b)+j}\, x^{2^{n-1}(2i+b)+j}$$

and applying $D^{2^{n-1}}$ to each of its terms yields (3.4). □

4. Evaluation algorithm

To solve the Hermite evaluation problem for arbitrary lengths we reduce to the special case of the preceding section by padding the input vector with zeros. Following the approach of pruned and truncated FFT algorithms, we lessen the penalty incurred by having to solve the larger problems by pruning those steps of the algorithm that are specific to the computation of unwanted entries in the output. Thus, we consider the following revised problem in this section: given the coefficients of a polynomial $F \in \mathbb{F}[x]_{2^nq}$ and $c \in \{1, \dots, 2^nq\}$, compute $(D^{i \operatorname{div} q}F)(\omega_{i \bmod q})$ for $i \in \{0, \dots, c-1\}$. The length $\ell$ Hermite evaluation problem is then captured by taking $n = \lceil \log_2 \lceil \ell/q \rceil \rceil$ and $c = \ell$.

The Hermite evaluation algorithm is described in Algorithm 2. The algorithm operates on a vector $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ that initially contains the coefficients of a polynomial $F \in \mathbb{F}[x]_{2^nq}$, and overwrites $a_i$ with $(D^{i \operatorname{div} q}F)(\omega_{i \bmod q})$ for $i$ less than the input value $c$. The remaining entries of the vector are either unchanged or set to intermediate values from the computation. If $c > 2^{n-1}q$, then the algorithm follows the steps described in the preceding section for the length $2^nq$ problem, with the exception that the recursive call used to evaluate $F_1 + D^{2^{n-1}}F_0$ only computes the $c - 2^{n-1}q$ values required for the output. If $c \le 2^{n-1}q$, then the output depends on $F_0$ only, so only $F_0$ is computed (by the function PrepareLeft) and recursively evaluated. Once again, the recursion terminates with $n = 0$, which is handled by an algorithm Evaluate that satisfies the specifications of Algorithm 1.

Algorithm 1 Evaluate$((a_0, \dots, a_{q-1}), c)$
Input: $(a_0, \dots, a_{q-1}) \in \mathbb{F}^q$ and $c \in \{1, \dots, q\}$ such that $\sum_{i=0}^{q-1} a_i x^i = F$ for some $F \in \mathbb{F}[x]_q$.
Output: $a_i = F(\omega_i)$ for $i \in \{0, \dots, c-1\}$.

Algorithm 1 may be realised with a complexity of $O(\mathsf{M}(q) + \mathsf{M}(c) \log c)$ operations in $\mathbb{F}$ by the use of remainder trees. For small values of $c$, one can apply Horner's rule for each of the $c$ evaluation points. Naive matrix-vector products are efficient for small $q$, while additive and (standard) multiplicative FFT algorithms become more efficient for large $q$. For multiplicative FFT algorithms to be used, the multiplicative group of the field must have smooth cardinality, and it is necessary to first reduce modulo $x^{q-1} - 1$. One may then take advantage of $c < q$ by using the truncated FFT algorithm of Larrieu [23].

Algorithm 2 HermiteEvaluate$((a_0, \dots, a_{2^nq-1}), c)$
Input: $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$ such that $n \in \mathbb{N}$ and $\sum_{i=0}^{2^nq-1} a_i x^i = F$ for some $F \in \mathbb{F}[x]_{2^nq}$.
Output: $a_i = (D^{i \operatorname{div} q}F)(\omega_{i \bmod q})$ for $i \in \{0, \dots, c-1\}$.
 1: If $n = 0$:
 2:     Evaluate$((a_0, \dots, a_{q-1}), c)$ /* Algorithm 1 */
 3: Else if $c > 2^{n-1}q$:
 4:     For $i = 2^{n-1}(q+1) - 1,\ 2^{n-1}(q+1) - 2,\ \dots,\ 2^{n-1}$:
 5:         $a_i \leftarrow a_i + a_{2^{n-1}(q-1)+i}$
 6:     For $i = q/2, \dots, q-1$:
 7:         For $j = 0, \dots, 2^{n-1}-1$:
 8:             $a_{2^ni+j} \leftarrow a_{2^ni+j} + a_{2^ni+j-(q-1)2^{n-1}}$
 9:     HermiteEvaluate$((a_0, \dots, a_{2^{n-1}q-1}), 2^{n-1}q)$
10:     HermiteEvaluate$((a_{2^{n-1}q}, \dots, a_{2^nq-1}), c - 2^{n-1}q)$
11: Else:
12:     PrepareLeft$(a, 0)$
13:     HermiteEvaluate$((a_0, \dots, a_{2^{n-1}q-1}), c)$
14: Function PrepareLeft$((a_0, \dots, a_{2^nq-1}), c)$:
15:     For $i = \max(c, 2^{n-1}), \dots, 2^{n-1}q - 1$:
16:         $a_i \leftarrow a_i + a_{2^{n-1}(q-1)+i}$
17:     For $i = \max(c, 2^{n-1}), \dots, 2^n - 1$:
18:         $a_i \leftarrow a_i + a_{2^n(q-1)+i}$

Proposition 4.1. Algorithm 2 is correct if Algorithm 1 is correctly implemented.

Proof. Under the assumption that Algorithm 1 has been correctly implemented, we use induction to show that for all $n \in \mathbb{N}$, Algorithm 2 produces the correct output when given inputs $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$. Therefore, suppose that Algorithm 1 has been correctly implemented. Then for inputs with $n = 0$, the algorithm trivially produces the correct output since Algorithm 1 is simply applied in this case. Let $n \ge 1$ and suppose that the algorithm produces the correct output for all inputs with smaller values of $n$. Suppose that $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$ are given to the algorithm as inputs, and let $F \in \mathbb{F}[x]_{2^nq}$ be the corresponding polynomial for which the input requirements are satisfied. Let $F_0, F_1 \in \mathbb{F}[x]_{2^{n-1}q}$ such that (3.1) and, equivalently, (3.3) hold.

Suppose that $c > 2^{n-1}q$. Then (3.3) implies that Lines 4 and 5 set $a_i$ equal to the coefficient of $x^i$ in $F_0$, and $a_{2^{n-1}q+i}$ equal to the coefficient of $x^i$ in $F_1$, for $i \in \{0, \dots, 2^{n-1}q-1\}$. Consequently, Lemma 3.2 implies that Lines 6 to 8 set $a_{2^{n-1}q+i}$ equal to the coefficient of $x^i$ in $F_1 + D^{2^{n-1}}F_0$ for $i \in \{0, \dots, 2^{n-1}q-1\}$. As these three lines do not modify $a_0, \dots, a_{2^{n-1}q-1}$, which contain the coefficients of $F_0$, the induction hypothesis and Lemma 3.1 imply that the recursive call of Line 9 sets

$$a_i = (D^{i \operatorname{div} q}F_0)(\omega_{i \bmod q}) = (D^{i \operatorname{div} q}F)(\omega_{i \bmod q}) \tag{4.1}$$

for $i \in \{0, \dots, 2^{n-1}q-1\}$. Similarly, the recursive call of Line 10 sets

$$\begin{aligned} a_{2^{n-1}q+i} &= \bigl(D^{i \operatorname{div} q}(F_1 + D^{2^{n-1}}F_0)\bigr)(\omega_{i \bmod q}) \\ &= \bigl(D^{2^{n-1} + (i \operatorname{div} q)}F\bigr)(\omega_{i \bmod q}) \\ &= \bigl(D^{(2^{n-1}q+i) \operatorname{div} q}F\bigr)(\omega_{(2^{n-1}q+i) \bmod q}) \end{aligned} \tag{4.2}$$

for $i \in \{0, \dots, c - 2^{n-1}q - 1\}$. The algorithm stops at this point, and thus produces the correct output.

Suppose now that $c \le 2^{n-1}q$. Then (3.3) implies that Line 12 sets $a_i$ equal to the coefficient of $x^i$ in $F_0$ for $i \in \{0, \dots, 2^{n-1}q-1\}$. Consequently, the induction hypothesis and Lemma 3.1 imply that (4.1) holds for $i \in \{0, \dots, c-1\}$ after the recursive call of Line 13 has been performed. The algorithm stops at this point, and thus produces the correct output. □

For $i, j \in \mathbb{Z}$ such that $j$ is nonzero, define $i \operatorname{mod}^* j = i - (\lceil i/j \rceil - 1) j$. The following proposition bounds the additive and multiplicative complexities of Algorithm 2 in terms of those of Algorithm 1.

Proposition 4.2. For $n \in \mathbb{N}$, define $A_n, M_n \colon \{1, \dots, 2^nq\} \to \mathbb{N}$ as follows: $A_n(c)$ and $M_n(c)$ are respectively the number of additions and multiplications in $\mathbb{F}$ performed by Algorithm 2 (for some implementation of Algorithm 1) when given inputs $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$. Then

$$A_n(c) \le A_0(q)(\lceil c/q \rceil - 1) + A_0(c \operatorname{mod}^* q) + \left(\tfrac{3}{4} \lceil \log_2 \lceil c/q \rceil \rceil - \tfrac{1}{4}\right)(\lceil c/q \rceil - 1)\, q + (2^n - 1)\, q$$

and

$$M_n(c) = M_0(q)(\lceil c/q \rceil - 1) + M_0(c \operatorname{mod}^* q)$$

for $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^nq\}$.

Proof. For nonzero $n \in \mathbb{N}$, define the indicator function $\delta_n \colon \{1, \dots, 2^nq\} \to \{0, 1\}$ by $\delta_n(c) = 1$ if and only if $c > 2^{n-1}q$. Then given inputs $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$ for some nonzero $n \in \mathbb{N}$, Lines 4 to 8 of Algorithm 2 perform $\delta_n(c)(3/4)2^nq$ additions, Line 9 performs $\delta_n(c) A_{n-1}(2^{n-1}q)$ additions and $\delta_n(c) M_{n-1}(2^{n-1}q)$ multiplications, and Line 10 performs $\delta_n(c) A_{n-1}(c - \delta_n(c) 2^{n-1}q)$ additions and $\delta_n(c) M_{n-1}(c - \delta_n(c) 2^{n-1}q)$ multiplications. Furthermore, Line 12 performs $(1 - \delta_n(c)) 2^{n-1}q$ additions, and Line 13 performs $(1 - \delta_n(c)) A_{n-1}(c - \delta_n(c) 2^{n-1}q)$ additions and $(1 - \delta_n(c)) M_{n-1}(c - \delta_n(c) 2^{n-1}q)$ multiplications. Summing these contributions, it follows that

$$A_n(c) = A_{n-1}\bigl(c - \delta_n(c) 2^{n-1}q\bigr) + 2^{n-1}q + \delta_n(c)\bigl(A_{n-1}(2^{n-1}q) + 2^{n-2}q\bigr) \tag{4.3}$$

and

$$M_n(c) = M_{n-1}\bigl(c - \delta_n(c) 2^{n-1}q\bigr) + \delta_n(c) M_{n-1}(2^{n-1}q) \tag{4.4}$$

for nonzero $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^nq\}$. In particular,

$$A_n(2^nq) = 2 A_{n-1}(2^{n-1}q) + \tfrac{3}{4} 2^nq \quad \text{and} \quad M_n(2^nq) = 2 M_{n-1}(2^{n-1}q) \tag{4.5}$$

for nonzero $n \in \mathbb{N}$. Thus,

$$A_n(2^nq) = 2^n\left(A_0(q) + \tfrac{3}{4} nq\right) \quad \text{and} \quad M_n(2^nq) = 2^n M_0(q) \tag{4.6}$$

for $n \in \mathbb{N}$. Substituting these equations into (4.3) and (4.4), it follows that

$$A_n(c) = A_{n-1}\bigl(c - \delta_n(c) 2^{n-1}q\bigr) + 2^{n-1}q + \delta_n(c) 2^{n-1}\left(A_0(q) + \tfrac{3}{4}(n-1)q + \tfrac{q}{2}\right) \tag{4.7}$$

and

$$M_n(c) = M_{n-1}\bigl(c - \delta_n(c) 2^{n-1}q\bigr) + \delta_n(c) 2^{n-1} M_0(q) \tag{4.8}$$

for nonzero $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^nq\}$.

For $n, c, \delta \in \mathbb{N}$ such that $n$ is nonzero, we have $\lceil (c - \delta 2^{n-1}q)/q \rceil = \lceil c/q \rceil - \delta 2^{n-1}$ and $(c - \delta 2^{n-1}q) \operatorname{mod}^* q = c \operatorname{mod}^* q$. Consequently, the formula for $M_n(c)$ stated in the proposition follows from (4.8) by induction on $n$. If $n \in \mathbb{N}$ is nonzero, $c \in \{1, \dots, 2^nq\}$ and

$$\lceil c/q \rceil - 1 = i_0 + i_1 \cdot 2 + \dots + i_{n-1} \cdot 2^{n-1}, \tag{4.9}$$

with $i_0, \dots, i_{n-1} \in \{0, 1\}$, then $i_{n-1} = \delta_n(c)$.
Therefore, it follows from (4.7) by induction on $n$, that

$$A_n(c) = A_0(c \operatorname{mod}^* q) + (2^n - 1)\, q + \left(A_0(q) + \tfrac{q}{2}\right)(\lceil c/q \rceil - 1) + \tfrac{3}{4} q \sum_{k=0}^{n-1} k\, i_k 2^k \tag{4.10}$$

for $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^nq\}$, where $i_0, \dots, i_{n-1} \in \{0, 1\}$ are the coefficients of the binary expansion (4.9). Here,

$$\sum_{k=0}^{n-1} k\, i_k 2^k \le \max(\lceil \log_2 \lceil c/q \rceil \rceil - 1, 0) \sum_{k=0}^{n-1} i_k 2^k = (\lceil \log_2 \lceil c/q \rceil \rceil - 1)(\lceil c/q \rceil - 1),$$

since $i_k = 0$ if $k \ge \lceil \log_2 \lceil c/q \rceil \rceil$. Combining this inequality with (4.10) yields the upper bound on $A_n(c)$ stated in the proposition. □

The functions $A_0$ and $M_0$ defined in Proposition 4.2 describe the additive and multiplicative complexities of the implementation of Algorithm 1. When $n = 0$ or $c > 2^{n-1}q$ for some nonzero $n$, as may be assumed when solving the Hermite evaluation problem, the third and fourth terms of the bound on $A_n(c)$ are in $O(c \log \lceil c/q \rceil)$. By taking $A_0$ and $M_0$ to be in $O(\mathsf{M}(q) + \mathsf{M}(c) \log c)$, and making the common assumption (used, for instance, in [41]) that $\mathsf{M}(\ell)/\ell$ is a nondecreasing function of $\ell$, it follows that the length $\ell$ Hermite evaluation problem can be solved with $O(\mathsf{M}(\ell) \log q + \ell \log \lceil \ell/q \rceil)$ operations in $\mathbb{F}$ by Algorithm 2. For this application, the number of additions performed by Algorithm 2 may be reduced by adapting the algorithm to take into account the zeros that initially occupy the $2^nq - \ell$ rightmost entries of the vector $(a_0, \dots, a_{2^nq-1})$.

5. Interpolation algorithm

To solve the Hermite interpolation problem for arbitrary lengths, we use an approach analogous to that employed by the inverse truncated FFT algorithm of Larrieu [23] by reducing to a length $2^nq$ problem under the assumption that the new entries of the output that result from extending the problem are provided as inputs. Thus, we consider the following problem in this section: given $c \in \{1, \dots, 2^nq\}$ and $(h_0, \dots, h_{c-1}, f_c, \dots, f_{2^nq-1}) \in \mathbb{F}^{2^nq}$, compute $f_0, \dots, f_{c-1} \in \mathbb{F}$ such that $F = \sum_{i=0}^{2^nq-1} f_i x^i$ satisfies $(D^{i \operatorname{div} q}F)(\omega_{i \bmod q}) = h_i$ for $i \in \{0, \dots, c-1\}$. Here, existence and uniqueness of $f_0, \dots, f_{c-1}$ follow readily from Lemma 2.2. The length $\ell$ Hermite interpolation problem is then captured as an instance of the new problem by taking $n = \lceil \log_2 \lceil \ell/q \rceil \rceil$, $c = \ell$ and $f_c = \dots = f_{2^nq-1} = 0$.

The Hermite interpolation algorithm is described in Algorithm 4. If $c > 2^{n-1}q$, then the algorithm closely follows the approach described in Section 3 by recursively computing the polynomials $F_0$ and $F_1 + D^{2^{n-1}}F_0$, before using Lemma 3.2 and the expansion (3.3) to compute the desired coefficients of $F$. The recursive call used to recover $F_1 + D^{2^{n-1}}F_0$ cannot be made without first computing the coefficient of $x^i$ in the polynomial for $i \ge c - 2^{n-1}q$. Consequently, after the algorithm has recovered $F_0$, the required coefficients are computed by the function PrepareRight, which steps through Lines 4 to 8 of Algorithm 2 while only modifying those entries $a_i$ with indices $i \ge c$. If $c \le 2^{n-1}q$, then the function PrepareLeft from Algorithm 2 is used to recover the coefficient of $x^i$ in $F_0$ for $i \ge c$ before the remaining coefficients of the polynomial are recursively computed. The function PrepareLeft, which is its own inverse for fixed $c$, is then used to compute the lower order coefficients of the output. The base case of the recursion is handled by an algorithm Interpolate that satisfies the specifications of Algorithm 3.

Algorithm 3 Interpolate$((a_0, \dots, a_{q-1}), c)$
Input: $(a_0, \dots, a_{q-1}) \in \mathbb{F}^q$ and $c \in \{1, \dots, q\}$ such that for some polynomial $F = \sum_{i=0}^{q-1} f_i x^i \in \mathbb{F}[x]_q$ the following conditions hold:
(1) $a_i = F(\omega_i)$ for $i \in \{0, \dots, c-1\}$, and
(2) $a_i = f_i$ for $i \in \{c, \dots, q-1\}$.
Output: $a_i = f_i$ for $i \in \{0, \dots, q-1\}$.

Algorithm 3 may be realised with a complexity of $O(\mathsf{M}(q) + \mathsf{M}(c) \log c)$ operations in $\mathbb{F}$ by the use of a fast Chinese remainder algorithm: $\sum_{i=0}^{c-1} f_i x^i$ is equal to the sum of the polynomial $C \in \mathbb{F}[x]_c$ that satisfies $C(\omega_i) = F(\omega_i)$ for $i \in \{0, \dots, c-1\}$, and the remainder of $\sum_{i=c}^{q-1} f_i x^i$ upon division by $\prod_{i=0}^{c-1} (x - \omega_i)$, with the product being computed as part of the Chinese remainder algorithm. If the enumeration of the field may be chosen freely and its multiplicative group has smooth cardinality, then a better complexity is obtained by using the inverse truncated FFT algorithm of Larrieu [23]. In doing so, one should set $\omega_{q-1} = 0$ so that only a single addition is required on top of the call to the FFT algorithm, since $F(\omega) = f_{q-1} + \sum_{i=0}^{q-2} f_i \omega^i$ for nonzero $\omega \in \mathbb{F}$.

Algorithm 4 HermiteInterpolate$((a_0, \dots, a_{2^nq-1}), c)$
Input: $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$ such that $n \in \mathbb{N}$ and for some polynomial $F = \sum_{i=0}^{2^nq-1} f_i x^i \in \mathbb{F}[x]_{2^nq}$ the following conditions hold:
(1) $a_i = (D^{i \operatorname{div} q}F)(\omega_{i \bmod q})$ for $i \in \{0, \dots, c-1\}$, and
(2) $a_i = f_i$ for $i \in \{c, \dots, 2^nq-1\}$.
Output: $a_i = f_i$ for $i \in \{0, \dots, 2^nq-1\}$.
 1: If $n = 0$:
 2:     Interpolate$((a_0, \dots, a_{q-1}), c)$ /* Algorithm 3 */
 3: Else if $c > 2^{n-1}q$:
 4:     HermiteInterpolate$((a_0, \dots, a_{2^{n-1}q-1}), 2^{n-1}q)$
 5:     PrepareRight$((a_0, \dots, a_{2^nq-1}), c)$
 6:     HermiteInterpolate$((a_{2^{n-1}q}, \dots, a_{2^nq-1}), c - 2^{n-1}q)$
 7:     For $i = q/2, \dots, q-1$:
 8:         For $j = 0, \dots, 2^{n-1}-1$:
 9:             $a_{2^ni+j} \leftarrow a_{2^ni+j} + a_{2^ni+j-(q-1)2^{n-1}}$
10:     For $i = 2^{n-1}, \dots, 2^{n-1}(q+1) - 1$:
11:         $a_i \leftarrow a_i + a_{2^{n-1}(q-1)+i}$
12: Else:
13:     PrepareLeft$(a, c)$ /* From Algorithm 2 */
14:     HermiteInterpolate$((a_0, \dots, a_{2^{n-1}q-1}), c)$
15:     PrepareLeft$(a, 0)$
16: Function PrepareRight$((a_0, \dots, a_{2^nq-1}), c)$:
17:     For $i = 2^{n-1}(q+1) - 1,\ 2^{n-1}(q+1) - 2,\ \dots,\ c$:
18:         $a_i \leftarrow a_i + a_{2^{n-1}(q-1)+i}$
19:     $t \leftarrow c \operatorname{div} 2^n$, $r \leftarrow \min(c \bmod 2^n, 2^{n-1})$
20:     For $j = 0, \dots, r-1$:
21:         For $i = t+1, \dots, q-1$:
22:             $a_{2^ni+j} \leftarrow a_{2^ni+j} + a_{2^ni+j-(q-1)2^{n-1}}$
23:     For $j = r, \dots, 2^{n-1}-1$:
24:         For $i = t, \dots, q-1$:
25:             $a_{2^ni+j} \leftarrow a_{2^ni+j} + a_{2^ni+j-(q-1)2^{n-1}}$

Proposition 5.1. Algorithm 4 is correct if Algorithm 3 is correctly implemented.

Proof. Under the assumption that Algorithm 3 has been correctly implemented, we use induction to show that for all $n \in \mathbb{N}$, Algorithm 4 produces the correct output when given inputs $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$. Therefore, suppose that Algorithm 3 has been correctly implemented. Then for inputs with $n = 0$, the algorithm trivially produces the correct output since Algorithm 3 is simply applied in this case. Let $n \ge 1$ and suppose that the algorithm produces the correct output for all inputs with smaller values of $n$. Suppose that $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$ are given to the algorithm as inputs, and let $F \in \mathbb{F}[x]_{2^nq}$ be the corresponding polynomial for which the input requirements are satisfied. Let $F_0, F_1 \in \mathbb{F}[x]_{2^{n-1}q}$ such that (3.1) and, equivalently, (3.3) hold.

Suppose that $c > 2^{n-1}q$. Then Lemma 3.1 implies that (4.1) initially holds for $i \in \{0, \dots, 2^{n-1}q-1\}$. Consequently, the induction hypothesis and Lemma 2.2 imply that the recursive call of Line 4 sets $a_i$ equal to the coefficient of $x^i$ in $F_0$ for $i \in \{0, \dots, 2^{n-1}q-1\}$. Thus, when the function PrepareRight is called in Line 5, (3.3) implies that Lines 17 and 18 of the function set $a_{2^{n-1}q+i}$ equal to the coefficient of $x^i$ in $F_1$ for $i \in \{c - 2^{n-1}q, \dots, 2^{n-1}q-1\}$. Then Lemma 3.2 implies that Lines 19 to 25 of the function set $a_{2^{n-1}q+i}$ equal to the coefficient of $x^i$ in $F_1 + D^{2^{n-1}}F_0$ for $i \in \{c - 2^{n-1}q, \dots, 2^{n-1}q-1\}$. The entries $a_{2^{n-1}q}, \dots, a_{c-1}$ are so far unchanged by the algorithm. Thus, Lemma 3.1 implies that (4.2) holds for $i \in \{0, \dots, c - 2^{n-1}q - 1\}$. The induction hypothesis and Lemma 2.2 therefore imply that the recursive call of Line 6 sets $a_{2^{n-1}q+i}$ equal to the coefficient of $x^i$ in $F_1 + D^{2^{n-1}}F_0$ for $i \in \{0, \dots, c - 2^{n-1}q - 1\}$. Hence, after the recursive call, the left half of the vector $(a_0, \dots, a_{2^nq-1})$ contains the coefficients of $F_0$, while its right half contains the coefficients of $F_1 + D^{2^{n-1}}F_0$. Consequently, Lemma 3.2 implies that Lines 7 to 9 set $a_{2^{n-1}q+i}$ equal to the coefficient of $x^i$ in $F_1$ for $i \in \{0, \dots, 2^{n-1}q-1\}$, then (3.3) implies that Lines 10 to 11 set $a_i$ equal to the coefficient of $x^i$ in $F$ for $i \in \{0, \dots, 2^nq-1\}$. The algorithm stops at this point, and thus produces the correct output.

Suppose now that $c \le 2^{n-1}q$. Then (3.3) implies that the call to PrepareLeft in Line 13 sets $a_i$ equal to the coefficient of $x^i$ in $F_0$ for $i \in \{c, \dots, 2^{n-1}q-1\}$. This call to PrepareLeft does not modify $a_0, \dots, a_{c-1}$. Thus, Lemma 3.1 implies that (4.1) holds for $i \in \{0, \dots, c-1\}$ when the recursive call of Line 14 is made. The induction hypothesis and Lemma 2.2 therefore imply that Line 14 sets $a_i$ equal to the coefficient of $x^i$ in $F_0$ for $i \in \{0, \dots, c-1\}$. Hence, after the recursive call, the left half of the vector $(a_0, \dots, a_{2^nq-1})$ contains the coefficients of $F_0$, while the entries in the right half still retain their initial values, with $a_i$ equal to the coefficient of $x^i$ in $F$ for $i \in \{2^{n-1}q, \dots, 2^nq-1\}$. Thus, (3.3) implies that the call to PrepareLeft in Line 15 sets $a_i$ equal to the coefficient of $x^i$ in $F$ for $i \in \{0, \dots, 2^{n-1}q-1\}$. The algorithm stops at this point, and thus produces the correct output. □

Proposition 5.2. For $n \in \mathbb{N}$, define $A_n, M_n \colon \{1, \dots, 2^nq\} \to \mathbb{N}$ as follows: $A_n(c)$ and $M_n(c)$ are respectively the number of additions and multiplications in $\mathbb{F}$ performed by Algorithm 4 (for some implementation of Algorithm 3) when given inputs $(a_0, \dots, a_{2^nq-1}) \in \mathbb{F}^{2^nq}$ and $c \in \{1, \dots, 2^nq\}$. Then

$$A_n(c) \le A_0(q)(\lceil c/q \rceil - 1) + A_0(c \operatorname{mod}^* q) + \left(\tfrac{7}{4} \lceil \log_2 \lceil c/q \rceil \rceil - n - \tfrac{3}{4}\right)(\lceil c/q \rceil - 1)\, q + (2^n - 1)(2q + 1)$$

and

$$M_n(c) = M_0(q)(\lceil c/q \rceil - 1) + M_0(c \operatorname{mod}^* q)$$

for $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^nq\}$.

Proof.
The proof of the bound on $A_n(c)$ follows along similar lines to that of Proposition 4.2, but with the addition of having to bound the number of additions performed in Lines 5 and 13 of the algorithm. As no multiplications are performed by either of these lines, they may be ignored when proving the formula for $M_n(c)$. In doing so, the proof follows along identical lines to that of Proposition 4.2, and is therefore omitted.

Suppose that Algorithm 4 has been given inputs $(a_0, \dots, a_{2^n q-1}) \in \mathbb{F}^{2^n q}$ and $c \in \{1, \dots, 2^n q\}$ for some nonzero $n \in \mathbb{N}$. Let $t = c \operatorname{div} 2^n$ and $r = \min(c \bmod 2^n, 2^{n-1})$, as defined in Line 19 of the function PrepareRight. If $c > 2^{n-1}q$, then the call to PrepareRight in Line 5 of Algorithm 4 performs
\[
\max\left(2^{n-1}(q+1) - c, 0\right) + \left(2^{n-1}(q - t) - r\right) < 2^{n-1} + \left(2^{n-1}q - c/2\right)
\]
additions. In particular, no additions are performed if $c = 2^n q$. Consequently, $A_n$ once again satisfies the recurrence (4.5) for nonzero $n \in \mathbb{N}$, and, as a result, also satisfies (4.6). If $c \le 2^{n-1}q$, then Line 13 of Algorithm 4 performs at most $(2^{n-1}q - c) + (2^n - 2^{n-1})$ additions. By summing the contributions of each line of Algorithm 4 in the manner of the proof of Proposition 4.2, with the two bounds used for the contributions of Lines 5 and 13, it follows that
\[
A_n(c) \le A_{n-1}\left(c - \delta_n(c) 2^{n-1} q\right) + 2^{n-1}(2q + 1) + \delta_n(c) 2^{n-1} \left( A_0(q) + \frac{3}{4}(n-1)q + \frac{q}{2} \right) - \left( 1 - \frac{\delta_n(c)}{2} \right) c
\]
for nonzero $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^n q\}$, where $\delta_n$ is the indicator function defined in the proof of Proposition 4.2. Therefore,
\[
A_n(c) \le A_0(q)(\lceil c/q \rceil - 1) + A_0(c \operatorname{mod}^{*} q) + (2^n - 1)(2q + 1) + \left( \frac{3}{4} \lceil \log_2 \lceil c/q \rceil \rceil - \frac{1}{4} \right) (\lceil c/q \rceil - 1) q - \sum_{k=0}^{n-1} \left( 1 - \frac{i_k}{2} \right) \left( c - q \sum_{j=k+1}^{n-1} 2^j i_j \right)
\]
for $n \in \mathbb{N}$ and $c \in \{1, \dots, 2^n q\}$, where $i_0, \dots, i_{n-1} \in \{0, 1\}$ are the coefficients of the binary expansion (4.9).
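The count of additions attributed to PrepareRight earlier in this proof can be checked mechanically. The following Python sketch replays only the loop structure of the function (no field arithmetic is performed). It assumes the loop bounds are read as $i = 2^{n-1}(q+1)-1, \dots, c$ for the first loop, with $t = c \operatorname{div} 2^n$ and $r = \min(c \bmod 2^n, 2^{n-1})$ for the nested loops. The names `prepare_right_additions`, `closed_form` and `check` are illustrative and do not come from the paper.

```python
def prepare_right_additions(n: int, q: int, c: int) -> int:
    """Count the additions made by the loops of PrepareRight, for n >= 1."""
    half = 2 ** (n - 1)
    count = 0
    # First loop: i = 2^(n-1)(q+1) - 1 down to c (empty when c is larger).
    for _ in range(half * (q + 1) - 1, c - 1, -1):
        count += 1
    t, r = c // 2 ** n, min(c % 2 ** n, half)
    # Columns j = 0, ..., r-1 use rows i = t+1, ..., q-1.
    for _ in range(r):
        count += len(range(t + 1, q))
    # Columns j = r, ..., 2^(n-1) - 1 use rows i = t, ..., q-1.
    for _ in range(r, half):
        count += len(range(t, q))
    return count


def closed_form(n: int, q: int, c: int) -> int:
    """Evaluate max(2^(n-1)(q+1) - c, 0) + (2^(n-1)(q - t) - r)."""
    half = 2 ** (n - 1)
    t, r = c // 2 ** n, min(c % 2 ** n, half)
    return max(half * (q + 1) - c, 0) + (half * (q - t) - r)


def check(n: int, q: int) -> bool:
    """Test the count and the strict bound for every 2^(n-1) q < c <= 2^n q."""
    half = 2 ** (n - 1)
    for c in range(half * q + 1, 2 ** n * q + 1):
        k = prepare_right_additions(n, q, c)
        # The bound k < 2^(n-1) + (2^(n-1) q - c/2) is doubled to stay in integers.
        if k != closed_form(n, q, c) or not 2 * k < 2 * half + 2 * half * q - c:
            return False
    return True
```

Calling `check(n, q)` for small parameters, for example `n` in 1 to 4 and `q` in 2, 4, 8, exercises every admissible `c` in the case $c > 2^{n-1}q$. With $c = 2^n q$ the counter returns zero, in line with the remark that no additions are performed in that case.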
The upper bound on $A_n(c)$ stated in the proposition is then obtained by observing that
\[
\sum_{k=0}^{n-1} \left( 1 - \frac{i_k}{2} \right) \left( c - q \sum_{j=k+1}^{n-1} 2^j i_j \right) \ge \sum_{k=\lceil \log_2 \lceil c/q \rceil \rceil}^{n-1} c + \sum_{k=\max(\lceil \log_2 \lceil c/q \rceil \rceil - 1,\, 0)}^{\lceil \log_2 \lceil c/q \rceil \rceil - 1} \frac{c}{2} \ge \left( n - \lceil \log_2 \lceil c/q \rceil \rceil + \frac{1}{2} \right) (\lceil c/q \rceil - 1) q
\]
for $n \in \mathbb{N}$, $c \in \{1, \dots, 2^n q\}$ and $i_0, \dots, i_{n-1} \in \{0, 1\}$ such that (4.9) holds. □

By taking $A_0$ and $M_0$ to be in $O(M(q) + M(c) \log c)$, it follows from Proposition 5.2 that the length $\ell$ Hermite interpolation problem can be solved with $O(M(\ell) \log q + \ell \log \lceil \ell/q \rceil)$ operations in $\mathbb{F}$ by Algorithm 4. The number of additions performed by the algorithm in this setting may once again be reduced by taking into account the zeros that initially occupy the $2^n q - \ell$ rightmost entries of the vector $(a_0, \dots, a_{2^n q-1})$. Moreover, as some of these entries are changed during the course of the algorithm, but ultimately are equal to zero again at its end, it is possible to save further additions by not performing those steps specific to restoring the entries to zero.

References

1. A. V. Aho, K. Steiglitz, and J. D. Ullman, Evaluating polynomials at fixed sets of points, SIAM J. Comput. (1975), no. 4, 533–539.
2. Daniel Augot, Françoise Levy-dit-Vehel, and Abdullatif Shikfa, A storage-efficient and robust private information retrieval scheme allowing few servers, Cryptology and network security, Lecture Notes in Comput. Sci., vol. 8813, Springer, Cham, 2014, pp. 222–239.
3. Daniel J. Bernstein, Scaled remainder trees, Available from https://cr.yp.to/arith/scaledmod-20040820.pdf, 2004.
4. Daniel J. Bernstein and Tung Chou, Faster binary-field multiplication and faster binary-field MACs, Selected areas in cryptography - SAC 2014, Lecture Notes in Comput. Sci., vol. 8781, Springer, Cham, 2014, pp. 92–111.
5. Daniel J.
Bernstein, Tung Chou, and Peter Schwabe, McBits: Fast constant-time code-based cryptography, Cryptographic Hardware and Embedded Systems - CHES 2013: 15th International Workshop, Santa Barbara, CA, USA, August 20-23, 2013. (Berlin, Heidelberg) (Guido Bertoni and Jean-Sébastien Coron, eds.), Springer Berlin Heidelberg, 2013, pp. 250–272.
6. Dario Bini and Victor Y. Pan, Polynomial and matrix computations, vol. 1: Fundamental algorithms, Progress in Theoretical Computer Science, Birkhäuser Boston, Inc., Boston, MA, 1994.
7. A. Borodin and R. Moenck, Fast modular transforms, J. Comput. System Sci. (1974), 366–386.
8. A. Bostan, G. Lecerf, B. Salvy, É. Schost, and B. Wiebelt, Complexity issues in bivariate polynomial factorization, ISSAC 2004, ACM, New York, 2004, pp. 42–49.
9. A. Bostan, G. Lecerf, and É. Schost, Tellegen's principle into practice, Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation (New York, NY, USA), ISSAC '03, ACM, 2003, pp. 37–44.
10. Alin Bostan and Éric Schost, Polynomial evaluation and interpolation on special sets of points, J. Complexity (2005), no. 4, 420–446.
11. David G. Cantor, On arithmetical algorithms over finite fields, J. Combin. Theory Ser. A (1989), no. 2, 285–300.
12. David G. Cantor and Erich Kaltofen, On fast multiplication of polynomials over arbitrary algebras, Acta Inform. (1991), no. 7, 693–701.
13. Ming-Shing Chen, Chen-Mou Cheng, Po-Chun Kuo, Wen-Ding Li, and Bo-Yin Yang, Faster multiplication for long binary polynomials, 2017, arXiv:1708.09746 [cs.SC].
14. Francis Y. Chin, A generalized asymptotic upper bound on fast polynomial evaluation and interpolation, SIAM J. Comput. (1976), no. 4, 682–690.
15. James W. Cooley and John W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comp. (1965), 297–301.
16. Nicholas Coxon, Fast systematic encoding of multiplicity codes, arXiv:1704.07083 [cs.IT], 2017.
17. Charles M. Fiduccia, Polynomial evaluation via the division algorithm the fast Fourier transform revisited, Proceedings of the Fourth Annual ACM Symposium on Theory of Computing (New York, NY, USA), STOC '72, ACM, 1972, pp. 88–93.
18. N. J. Fine, Binomial coefficients modulo a prime, Amer. Math. Monthly (1947), 589–592.
19. Shuhong Gao and Todd Mateer, Additive fast Fourier transforms over finite fields, IEEE Trans. Inform. Theory (2010), no. 12, 6265–6272.
20. David Harvey, A cache-friendly truncated FFT, Theoret. Comput. Sci. (2009), no. 27-29, 2649–2658.
21. David Harvey and Daniel S. Roche, An in-place truncated Fourier transform and applications to polynomial multiplication, ISSAC 2010 - Proceedings of the 2010 International Symposium on Symbolic and Algebraic Computation, ACM, New York, 2010, pp. 325–329.
22. Swastik Kopparty, Some remarks on multiplicity codes, Discrete geometry and algebraic combinatorics, Contemp. Math., vol. 625, Amer. Math. Soc., Providence, RI, 2014, pp. 155–176.
23. Robin Larrieu, The truncated Fourier transform for mixed radices, Proceedings of the 2017 ACM on International Symposium on Symbolic and Algebraic Computation (New York, NY, USA), ISSAC '17, ACM, 2017, pp. 261–268.
24. Sian-Jheng Lin, Tareq Y. Al-Naffouri, and Yunghsiang S. Han, FFT algorithm for binary extension finite fields and its application to Reed-Solomon codes, IEEE Trans. Inform. Theory (2016), no. 10, 5343–5358.
25. Sian-Jheng Lin, Tareq Y. Al-Naffouri, Yunghsiang S. Han, and Wei-Ho Chung, Novel polynomial basis with fast Fourier transform and its application to Reed-Solomon erasure codes, IEEE Trans. Inform. Theory (2016), no. 11, 6284–6299.
26. Sian-Jheng Lin, Wei-Ho Chung, and Yunghsiang S. Han, Novel polynomial basis and its application to Reed-Solomon erasure codes, 55th Annual IEEE Symposium on Foundations of Computer Science - FOCS 2014, IEEE Computer Soc., Los Alamitos, CA, 2014, pp. 316–325.
27. Edouard Lucas, Théorie des Fonctions Numériques Simplement Périodiques. [Continued], Amer. J. Math. (1878), no. 3, 197–240.
28. J. Markel, FFT pruning, IEEE Transactions on Audio and Electroacoustics (1971), no. 4, 305–311.
29. Todd Mateer, Fast Fourier Transform algorithms with applications, Ph.D. thesis, Clemson University, August 2008.
30. R. Moenck and A. Borodin, Fast modular transforms via division, Proceedings of the 13th Annual Symposium on Switching and Automata Theory (Swat 1972) (Washington, DC, USA), SWAT '72, IEEE Computer Society, 1972, pp. 90–96.
31. Vadim Olshevsky and Amin Shokrollahi, Matrix-vector product for confluent Cauchy-like matrices with application to confluent rational interpolation, Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, ACM, New York, 2000, pp. 573–581.
32. Victor Y. Pan, Structured matrices and polynomials: unified superfast algorithms, Birkhäuser Boston, Inc., Boston, MA; Springer-Verlag, New York, 2001.
33. A. Schönhage, Schnelle Multiplikation von Polynomen über Körpern der Charakteristik 2, Acta Informat. (1976/77), no. 4, 395–398. MR 0436663
34. A. Schönhage and V. Strassen, Schnelle Multiplikation grosser Zahlen, Computing (Arch. Elektron. Rechnen) (1971), 281–292.
35. H. V. Sorensen and C. S. Burrus, Efficient computation of the DFT with only a subset of input or output points, IEEE Transactions on Signal Processing (1993), no. 3, 1184–1200.
36. Joris van der Hoeven, The truncated Fourier transform and applications, ISSAC 2004, ACM, New York, 2004, pp. 290–296.
37. Joris van der Hoeven, Faster Chinese remaindering, Tech. report, HAL, 2016, http://hal.archives-ouvertes.fr/hal-01403810.
38. T. M. Vari, Some complexity results for a class of Toeplitz matrices, Tech. report, Dept. of Computer Sci. and Math., York Univ., Toronto, 1974.
39. Joachim von zur Gathen, Functional decomposition of polynomials: the tame case, J. Symbolic Comput. (1990), no. 3, 281–299.
40. Joachim von zur Gathen and Jürgen Gerhard, Fast algorithms for Taylor shifts and certain difference equations, Proceedings of the 1997 International Symposium on Symbolic and Algebraic Computation, ACM, New York, 1997, pp. 40–47.
41. Joachim von zur Gathen and Jürgen Gerhard, Modern computer algebra, third ed., Cambridge University Press, Cambridge, 2013.
42. David Woodruff and Sergey Yekhanin,