Fast transforms over finite fields of characteristic two
NICHOLAS COXON
Abstract.
An additive fast Fourier transform over a finite field of characteristic two efficiently evaluates polynomials at every element of an $\mathbb{F}_2$-linear subspace of the field. We view these transforms as performing a change of basis from the monomial basis to the associated Lagrange basis, and consider the problem of performing the various conversions between these two bases, the associated Newton basis, and the "novel" basis of Lin, Chung and Han (FOCS 2014). Existing algorithms are divided between two families, those designed for arbitrary subspaces and more efficient algorithms designed for specially constructed subspaces of fields with degree equal to a power of two. We generalise techniques from both families to provide new conversion algorithms that may be applied to arbitrary subspaces, but which benefit equally from the specially constructed subspaces. We then construct subspaces of fields with smooth degree for which our algorithms provide better performance than existing algorithms.

1. Introduction
Let $F$ be a finite field of characteristic two, and let $W = \{\omega_0, \ldots, \omega_{2^n-1}\}$ be an $n$-dimensional $\mathbb{F}_2$-linear subspace of $F$. Define polynomials
\[
L_i = \prod_{\substack{j=0 \\ j \neq i}}^{2^n-1} \frac{x - \omega_j}{\omega_i - \omega_j}, \quad
N_i = \prod_{j=0}^{i-1} \frac{x - \omega_j}{\omega_i - \omega_j}
\quad\text{and}\quad
X_i = \prod_{k=0}^{n-1} \prod_{j=0}^{2^k[i]_k - 1} \frac{x - \omega_j}{\omega_{2^k[i]_k} - \omega_j}
\]
for $i \in \{0, \ldots, 2^n - 1\}$, where $[\,\cdot\,]_k : \mathbb{N} \to \{0, 1\}$ for $k \in \mathbb{N}$ are the maps such that $i = \sum_{k \in \mathbb{N}} 2^k [i]_k$ for $i \in \mathbb{N}$. Let $F[x]_\ell$ denote the space of polynomials with coefficients in $F$ and degree less than $\ell$. Then $\{L_0, \ldots, L_{2^n-1}\}$ is the Lagrange basis of $F[x]_{2^n}$ associated with the enumeration of $W$. Similarly, $\{N_0, \ldots, N_{2^n-1}\}$ is the associated Newton basis, normalised so that $N_i(\omega_i) = 1$. The definition of the functions $[\,\cdot\,]_k$ implies that each of the polynomials $X_i$ has degree equal to $i$. Thus, $\{X_0, \ldots, X_{2^n-1}\}$ is also a basis of $F[x]_{2^n}$. This unusual basis was introduced by Lin, Chung and Han in 2014 [21]. Consequently, we refer to it as the Lin–Chung–Han basis, or simply the LCH basis. In this paper, we describe new fast algorithms for converting between the Lagrange, (normalised) Newton, Lin–Chung–Han and monomial bases for specially constructed subspaces.

Date: July 23, 2018.
2010 Mathematics Subject Classification.
Primary 68W30, 68W40, 12Y05. This work was supported by Nokia in the framework of the common laboratory between Nokia Bell Labs and INRIA.
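As a concrete illustration of the three families of basis polynomials just defined (this example is not part of the paper), the following sketch constructs $L_i$, $N_i$ and $X_i$ for a 2-dimensional subspace and checks their defining properties. The field GF(2^4) with modulus $x^4 + x + 1$ and the subspace basis are arbitrary choices of the example.

```python
# Arithmetic in F = GF(2^4), elements encoded as ints, modulus x^4 + x + 1.
M, MOD = 4, 0b10011

def mul(a, b):
    r = 0
    while b:                      # carry-less multiplication
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):   # reduce modulo MOD
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def inv(a):                       # a^(2^M - 2) = a^(-1) by square and multiply
    r, e = 1, 2**M - 2
    while e:
        if e & 1:
            r = mul(r, a)
        a, e = mul(a, a), e >> 1
    return r

# Polynomials over F as coefficient lists, low degree first.
def pmul(f, g):
    r = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            r[i + j] ^= mul(a, b)
    return r

def peval(f, x):                  # Horner evaluation
    y = 0
    for c in reversed(f):
        y = mul(y, x) ^ c
    return y

# Subspace basis beta = (beta_0, beta_1); omega_i = sum_k [i]_k beta_k.
beta = [1, 0b0010]
n = len(beta)
omega = [0] * 2**n
for i in range(2**n):
    for k in range(n):
        if (i >> k) & 1:
            omega[i] ^= beta[k]

def normalised_product(indices, point):
    # prod over j in indices of (x - omega_j)/(point - omega_j); minus is XOR.
    f = [1]
    for j in indices:
        c = inv(point ^ omega[j])
        f = pmul(f, [mul(omega[j], c), c])
    return f

L = [normalised_product([j for j in range(2**n) if j != i], omega[i])
     for i in range(2**n)]
N = [normalised_product(range(i), omega[i]) for i in range(2**n)]
X = []
for i in range(2**n):             # X_i = prod over set bits k of N_{2^k}
    f = [1]
    for k in range(n):
        if (i >> k) & 1:
            f = pmul(f, normalised_product(range(2**k), omega[2**k]))
    X.append(f)

# Defining properties: L_i(omega_j) = [i = j], N_i(omega_i) = 1 with
# N_i(omega_j) = 0 for j < i, and deg X_i = i.
assert all(peval(L[i], omega[j]) == (1 if i == j else 0)
           for i in range(2**n) for j in range(2**n))
assert all(peval(N[i], omega[i]) == 1 for i in range(2**n))
assert all(peval(N[i], omega[j]) == 0 for i in range(2**n) for j in range(i))
assert [len(p) - 1 for p in X] == list(range(2**n))
```

The last assertion reflects the triangularity noted above: the degrees of the $X_i$ are $0, 1, \ldots, 2^n - 1$, so they form a basis of $F[x]_{2^n}$.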
Converting to the Lagrange basis from one of the three remaining bases corresponds to evaluating a polynomial at each element in $W$. An algorithm that efficiently performs this evaluation for polynomials written on the monomial basis is referred to as an additive fast Fourier transform (FFT). The designation as "additive" reflects the fact that a fast Fourier transform traditionally evaluates polynomials at each element of a cyclic group. To avoid confusion, we refer to such algorithms as multiplicative FFTs hereafter. Additive FFTs have been investigated as an alternative to multiplicative FFTs for use in fast multiplication algorithms for binary polynomials [28, 6, 23, 8, 9, 18], and have also found applications in coding theory and cryptography [4, 3, 10, 1].

Additive FFTs first appeared in the 1980s with the algorithm of Wang and Zhu [29], which was subsequently rediscovered by Cantor [7]. Applied to characteristic two finite fields, the Wang–Zhu–Cantor algorithm is restricted to extensions of degree equal to a power of two. This restriction is removed by the algorithm of von zur Gathen and Gerhard [28], but at the cost of a higher algebraic complexity. For these and subsequent algorithms [14, 4, 3], the subspace $W$ is described by an ordered basis $\beta = (\beta_0, \ldots, \beta_{n-1})$, which also defines the enumeration of the space by
\[
\omega_i = \sum_{k=0}^{n-1} [i]_k \beta_k \quad\text{for } i \in \{0, \ldots, 2^n - 1\}. \tag{1.1}
\]
For an arbitrary choice of basis, the algorithm of von zur Gathen and Gerhard performs $O(\ell \log^2 \ell)$ additions and multiplications, where $\ell = |W| = 2^n$. The subsequent algorithm of Gao and Mateer [14] achieves the same additive complexity while performing only $O(\ell \log \ell)$ multiplications.

For extensions with degree equal to a power of two, faster algorithms are obtained through a special choice of the subspace and its basis. The defining property of these spaces is the existence of a Cantor basis, i.e., a basis $\beta = (\beta_0, \ldots, \beta_{n-1})$ such that
\[
\beta_0 = 1 \quad\text{and}\quad \beta_i = \beta_{i+1}^2 - \beta_{i+1} \quad\text{for } i \in \{0, \ldots, n-2\}. \tag{1.2}
\]
For subspaces represented by a Cantor basis, the Wang–Zhu–Cantor algorithm performs $O(\ell (\log \ell)^{\log_2 3})$ additions and $O(\ell \log \ell)$ multiplications. Gao and Mateer [14] also contribute to this special case by providing an algorithm that achieves the same multiplicative complexity while performing only $O(\ell \log \ell \log \log \ell)$ additions.

For the same subspace enumeration used in the additive FFTs, Lin, Chung and Han [21] provide algorithms for converting between the Lagrange and LCH bases that perform $O(\ell \log \ell)$ additions and multiplications. Lin, Chung and Han use their basis and conversion algorithms to provide fast encoding and decoding algorithms for Reed–Solomon codes. This application is further explored in the subsequent work of Lin, Al-Naffouri and Han [19] and Lin, Al-Naffouri, Han and Chung [20], while Ben-Sasson et al. [2] utilise the conversion algorithms within their zero-knowledge proof system. Lin, Al-Naffouri, Han and Chung [20] additionally consider the problem of converting between the LCH and monomial bases. For subspaces represented by an arbitrary choice of basis, they provide algorithms for converting between the two bases that perform $O(\ell \log^2 \ell)$ additions and $O(\ell \log \ell)$ multiplications. For subspaces represented by a Cantor basis, they provide algorithms that require only $O(\ell \log \ell \log \log \ell)$ additions and perform no multiplications. The techniques used in both cases, as well as for the algorithms of Lin,
Chung and Han, originate in the work of Gao and Mateer. This relationship becomes apparent when combining the algorithms to obtain additive FFTs, since one essentially obtains the algorithms of Gao and Mateer.

The techniques developed for additive FFTs have yet to be applied to conversions involving the Newton basis. In the realm of multiplicative FFTs, one has the algorithms of van der Hoeven and Schost [26], which convert between the monomial basis and the Newton basis associated with the radix-2 truncated Fourier transform points [25, 24]. Fast conversion between the two bases is a necessary requirement of multivariate evaluation and interpolation algorithms [26, 11] and their application to systematic encoding of Reed–Muller and multiplicity codes [11]. For applications in coding theory, characteristic two finite fields are particularly interesting due to their fast arithmetic. However, the algorithms of van der Hoeven and Schost are not suited to such fields, as they require the existence of roots of unity with order equal to a power of two. It is likely that this problem may be partially overcome by generalising their algorithm in a manner analogous to the generalisation of the radix-2 truncated Fourier transform [25, 24] to mixed radices by Larrieu [17]. Developing an algorithm based on the ideas of additive FFTs provides a second and more widely applicable solution to the problem.

In this paper, we describe new fast conversion algorithms between the Lin–Chung–Han basis and each of the Newton, Lagrange and monomial bases. These algorithms may in turn be combined to obtain fast conversion algorithms between any two of the four bases. We once again represent subspaces by ordered bases, and use (1.1) for their enumeration.

In Section 2, we show that if the defining basis $\beta = (\beta_0, \ldots, \beta_{n-1})$ has dimension greater than one and satisfies
\[
1, \frac{\beta_1}{\beta_0}, \ldots, \frac{\beta_{d-1}}{\beta_0} \in \mathbb{F}_{2^d} \tag{1.3}
\]
for some $d \in \{1, \ldots, n-1\}$, then each of the three conversion problems over the subspace generated by $\beta$ may be efficiently reduced to instances of the problem over the subspaces generated by $\alpha = (\beta_0, \ldots, \beta_{d-1})$ and some vector $\delta \in F^{n-d}$. One may always take $d$ equal to one, allowing the reductions to be applied regardless of the choice of $\beta$. Consequently, fast conversion algorithms are obtained by recursively solving the smaller problems admitted by the reduction, and directly solving the problems for the base case of $n = 1$.

Our basis conversion algorithms are described in Section 3. Over the subspace generated by an $n$-dimensional basis $\beta$, the algorithms take as input the first $\ell$ coefficients on the input basis of a polynomial in $F[x]_\ell \subseteq F[x]_{2^n}$. The algorithms then return the first $\ell$ coefficients of the polynomial on the desired output basis. For conversion between the Lagrange and LCH bases, the first $\ell$ Lagrange basis polynomials do not form a basis of $F[x]_\ell$ for $\ell < 2^n$. Consequently, we embed these cases in the larger case of $\ell = 2^n$, after which we disregard unnecessary parts of the resulting computations so as not to incur a large penalty in complexity. This approach results in what are known as "pruned" or "truncated" algorithms in the literature on fast Fourier transforms. While truncated additive FFTs have been previously investigated [28, 23, 6, 4, 3, 8, 9], our algorithms for converting between the Lagrange and LCH bases are obtained as analogues of Harvey's "cache-friendly" truncated multiplicative FFTs [15].
As a consequence of this approach, the algorithms in fact solve slightly more general problems than those just described, allowing us to in turn provide the slightly generalised additive FFTs required by the fast Hermite interpolation and evaluation algorithms of Coxon [12].

Table 1 provides bounds on the number of additions and multiplications performed by our algorithms for conversion in either direction between the LCH basis and each of the three remaining bases. These bounds omit the cost of a small precomputation requiring $O(n^2)$ field operations. The table also provides bounds on the number of field elements that are required to be stored in auxiliary space by the algorithms. The bound for conversion with the Newton basis is new. For conversion with the Lagrange basis, we only have the algorithm of Lin, Chung and Han [21] to compare with, and only for the case $\ell = 2^n$. Their algorithm performs fewer additions in this case, but only after a much larger precomputation, of unanalysed complexity, that stores $2^{n-1} \log_2 \ell = 2^{n-1} n$ field elements. Our algorithms perform the same number of additions as their algorithms in this case, while performing fewer multiplications.

Basis      Additions                                                      Multiplications                                        Auxiliary space
Newton     $\ell(\lceil \log_2 \ell \rceil - 1) + 1$                      $\lfloor \ell/2 \rfloor \lceil \log_2 \ell \rceil$      $O(n)$
Lagrange   $\lfloor \ell/2 \rfloor (3\lceil \log_2 \ell \rceil + 1)$      $\lfloor \ell/2 \rfloor (\lceil \log_2 \ell \rceil + 1)$   $2^n - \ell + O(n)$
Monomial   $\lfloor \ell/2 \rfloor \binom{\lceil \log_2 \ell \rceil + 1}{2}$   $\lfloor \ell/2 \rfloor \lceil \log_2 \ell \rceil + 1$   $O(n)$

Table 1. Algebraic and space complexities.

The cost of the reductions used in our algorithms decreases as the value of $d$ for which they are applied increases. For an arbitrary choice of the basis $\beta$, the condition (1.3) may only ever be satisfied by $d$ equal to one. This will also be the case if $\mathbb{F}_2$ is the only subfield of $F$ with degree less than $n$. The bounds in Table 1 describe the complexity of our algorithms for this case, and thus represent their worst-case complexities. When the field has degree equal to a power of two and $\beta$ is a Cantor basis, the condition (1.3) is satisfied by $d$ equal to any power of two less than $n$. Moreover, $\delta = (\beta_0, \ldots, \beta_{n-d-1})$ for all such values of $d$, so that the recursive cases are themselves represented by Cantor bases. Consequently, it is possible to always take $d$ to be the largest power of two less than $n$. With this strategy, the algorithms for converting between the LCH and Newton bases perform fewer additions than the worst-case bound of Table 1. Moreover, for conversions between the LCH and monomial bases, all multiplications in the algorithms become multiplications by one, allowing them to be eliminated altogether. In this case, the algorithms reduce to those of Lin et al. [20].

While the benefits of using a Cantor basis are clear, $F$ will only admit a Cantor basis of a given dimension if its degree is divisible by a sufficiently large power of two. In Section 4, we propose new basis constructions that provide benefits similar to those afforded by Cantor bases when the degree of the field contains a sufficiently large smooth factor, i.e., one that factors into a product of small primes.
Such a factor ensures the presence of a tower of subfields, which we use in Section 4.1 to construct bases that reduce the number of field operations performed by the algorithms of Section 3 by allowing their reductions to be applied more frequently with values of $d$ greater than one. For towers containing quadratic extensions, we additionally show in Section 4.2 how to leverage freedom in the construction in order to eliminate some multiplications in the algorithms for conversion between the monomial and LCH bases, echoing the reduction in multiplications obtained for Cantor bases. Finally, in Section 4.3, we show how to take advantage of quadratic extensions in a different manner by generalising the construction of Cantor bases.

2. Preliminaries
For $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$, define
\[
\omega_{\beta,i} = \sum_{k=0}^{n-1} [i]_k \beta_k \quad\text{for } i \in \{0, \ldots, 2^n - 1\}.
\]
If the entries of $\beta$ are linearly independent over $\mathbb{F}_2$, then let $L_{\beta,i}$, $N_{\beta,i}$ and $X_{\beta,i}$ respectively denote the $i$th Lagrange, Newton and LCH basis polynomials associated with the enumeration $\{\omega_{\beta,0}, \ldots, \omega_{\beta,2^n-1}\}$ of the subspace it generates. That is, define
\[
L_{\beta,i} = \prod_{\substack{j=0 \\ j \neq i}}^{2^n-1} \frac{x - \omega_{\beta,j}}{\omega_{\beta,i} - \omega_{\beta,j}}, \quad
N_{\beta,i} = \prod_{j=0}^{i-1} \frac{x - \omega_{\beta,j}}{\omega_{\beta,i} - \omega_{\beta,j}}
\quad\text{and}\quad
X_{\beta,i} = \prod_{k=0}^{n-1} \prod_{j=0}^{2^k[i]_k - 1} \frac{x - \omega_{\beta,j}}{\omega_{\beta,2^k[i]_k} - \omega_{\beta,j}}
\]
for $i \in \{0, \ldots, 2^n - 1\}$.

2.1. Factorisations of basis polynomials.
The following lemma provides factorisations of the basis polynomials associated with a vector $\beta \in F^n$. These factorisations in turn provide the reductions employed in our basis conversion algorithms.

Lemma 2.1.
Let $n \geq 2$ and $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ have entries that are linearly independent over $\mathbb{F}_2$. For some $d \in \{1, \ldots, n-1\}$ such that $\beta_i/\beta_0 \in \mathbb{F}_{2^d}$ for $i \in \{0, \ldots, d-1\}$, set $\alpha = (\beta_0, \ldots, \beta_{d-1})$, $\gamma = (\beta_d, \ldots, \beta_{n-1})$ and
\[
\delta = \left( \left(\frac{\beta_d}{\beta_0}\right)^{2^d} - \frac{\beta_d}{\beta_0}, \ \ldots, \ \left(\frac{\beta_{n-1}}{\beta_0}\right)^{2^d} - \frac{\beta_{n-1}}{\beta_0} \right).
\]
Then $\delta$ has entries that are linearly independent over $\mathbb{F}_2$, and
\begin{align*}
L_{\beta,2^d i+j} &= L_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) L_{\alpha,j}(x - \omega_{\gamma,i}), \\
N_{\beta,2^d i+j} &= N_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) N_{\alpha,j}(x - \omega_{\gamma,i}), \\
X_{\beta,2^d i+j} &= X_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) X_{\alpha,j}(x)
\end{align*}
for $i \in \{0, \ldots, 2^{n-d} - 1\}$ and $j \in \{0, \ldots, 2^d - 1\}$.
Proof.
Let $n \geq 2$ and $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ have entries that are linearly independent over $\mathbb{F}_2$. Define $\alpha$, $\gamma$ and $\delta$ as per the lemma for some $d \in \{1, \ldots, n-1\}$ such that $\beta_i/\beta_0 \in \mathbb{F}_{2^d}$ for $i \in \{0, \ldots, d-1\}$. Then
\[
\omega_{\beta,2^d i+j} = \sum_{k=0}^{d-1} [j]_k \beta_k + \sum_{k=0}^{n-d-1} [i]_k \beta_{d+k} = \omega_{\alpha,j} + \omega_{\gamma,i} \tag{2.1}
\]
for $i \in \{0, \ldots, 2^{n-d} - 1\}$ and $j \in \{0, \ldots, 2^d - 1\}$. The choice of $d$ implies that
\[
\left(\frac{\beta_k}{\beta_0}\right)^{2^d} - \frac{\beta_k}{\beta_0} = \prod_{\omega \in \mathbb{F}_{2^d}} \left( \frac{\beta_k}{\beta_0} - \omega \right) = 0
\]
for $k \in \{0, \ldots, d-1\}$. Thus,
\[
\left(\frac{\omega_{\beta,2^d i+j}}{\beta_0}\right)^{2^d} - \frac{\omega_{\beta,2^d i+j}}{\beta_0} = \sum_{k=0}^{n-d-1} [i]_k \left( \left(\frac{\beta_{d+k}}{\beta_0}\right)^{2^d} - \frac{\beta_{d+k}}{\beta_0} \right) = \omega_{\delta,i} \tag{2.2}
\]
for $i \in \{0, \ldots, 2^{n-d} - 1\}$ and $j \in \{0, \ldots, 2^d - 1\}$. It follows that
\[
\prod_{j=0}^{2^d-1} \left( x - \omega_{\beta,2^d i+j} \right) = \beta_0^{2^d} \left( \left(\frac{x}{\beta_0}\right)^{2^d} - \left(\frac{x}{\beta_0}\right) - \omega_{\delta,i} \right) \tag{2.3}
\]
for $i \in \{0, \ldots, 2^{n-d} - 1\}$, since the polynomials on either side of the equation are monic and split over $F$ with identical roots. As the entries of $\beta$ are linearly independent over $\mathbb{F}_2$, comparing roots shows that the polynomials on the left-hand side of (2.3) are distinct for different values of $i \in \{0, \ldots, 2^{n-d} - 1\}$. Consequently, comparing the polynomials on the right-hand side of the equation shows that $\omega_{\delta,i} \neq \omega_{\delta,j}$ for distinct $i, j \in \{0, \ldots, 2^{n-d} - 1\}$.
Thus, the entries of $\delta$ are linearly independent over $\mathbb{F}_2$.

Collecting terms and substituting in (2.1), (2.2) and (2.3) shows that
\begin{align*}
L_{\beta,2^d i+j}
&= \prod_{\substack{s=0 \\ s \neq i}}^{2^{n-d}-1} \prod_{t=0}^{2^d-1} \frac{x - \omega_{\beta,2^d s+t}}{\omega_{\beta,2^d i+j} - \omega_{\beta,2^d s+t}} \prod_{\substack{t=0 \\ t \neq j}}^{2^d-1} \frac{x - \omega_{\beta,2^d i+t}}{\omega_{\beta,2^d i+j} - \omega_{\beta,2^d i+t}} \\
&= \prod_{\substack{s=0 \\ s \neq i}}^{2^{n-d}-1} \frac{(x/\beta_0)^{2^d} - (x/\beta_0) - \omega_{\delta,s}}{\omega_{\delta,i} - \omega_{\delta,s}} \prod_{\substack{t=0 \\ t \neq j}}^{2^d-1} \frac{x - \omega_{\gamma,i} - \omega_{\alpha,t}}{\omega_{\gamma,i} + \omega_{\alpha,j} - \omega_{\gamma,i} - \omega_{\alpha,t}} \\
&= \prod_{\substack{s=0 \\ s \neq i}}^{2^{n-d}-1} \frac{(x/\beta_0)^{2^d} - (x/\beta_0) - \omega_{\delta,s}}{\omega_{\delta,i} - \omega_{\delta,s}} \prod_{\substack{t=0 \\ t \neq j}}^{2^d-1} \frac{x - \omega_{\gamma,i} - \omega_{\alpha,t}}{\omega_{\alpha,j} - \omega_{\alpha,t}} \\
&= L_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) L_{\alpha,j}(x - \omega_{\gamma,i})
\end{align*}
for $i \in \{0, \ldots, 2^{n-d} - 1\}$ and $j \in \{0, \ldots, 2^d - 1\}$.
Similarly,
\begin{align*}
N_{\beta,2^d i+j}
&= \prod_{s=0}^{i-1} \prod_{t=0}^{2^d-1} \frac{x - \omega_{\beta,2^d s+t}}{\omega_{\beta,2^d i+j} - \omega_{\beta,2^d s+t}} \prod_{t=0}^{j-1} \frac{x - \omega_{\beta,2^d i+t}}{\omega_{\beta,2^d i+j} - \omega_{\beta,2^d i+t}} \\
&= \left( \prod_{s=0}^{i-1} \frac{(x/\beta_0)^{2^d} - (x/\beta_0) - \omega_{\delta,s}}{\omega_{\delta,i} - \omega_{\delta,s}} \right) \left( \prod_{t=0}^{j-1} \frac{x - \omega_{\gamma,i} - \omega_{\alpha,t}}{\omega_{\alpha,j} - \omega_{\alpha,t}} \right) \\
&= N_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) N_{\alpha,j}(x - \omega_{\gamma,i})
\end{align*}
for $i \in \{0, \ldots, 2^{n-d} - 1\}$ and $j \in \{0, \ldots, 2^d - 1\}$. It follows immediately from the definition of the Newton and LCH bases that
\[
X_{\beta,i} = \prod_{k=0}^{n-1} N_{\beta,2^k}^{[i]_k} \quad\text{for } i \in \{0, \ldots, 2^n - 1\}.
\]
Hence,
\begin{align*}
X_{\beta,2^d i+j}
&= \left( \prod_{k=0}^{n-d-1} N_{\beta,2^{k+d}}^{[i]_k} \right) \left( \prod_{k=0}^{d-1} N_{\beta,2^k}^{[j]_k} \right) \\
&= \left( \prod_{k=0}^{n-d-1} N_{\delta,2^k}^{[i]_k}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) \right) \left( \prod_{k=0}^{d-1} N_{\alpha,2^k}^{[j]_k}(x - \omega_{\gamma,0}) \right) \\
&= X_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) X_{\alpha,j}(x)
\end{align*}
for $i \in \{0, \ldots, 2^{n-d} - 1\}$ and $j \in \{0, \ldots, 2^d - 1\}$, since $\omega_{\gamma,0} = 0$. $\square$

To illustrate how we can utilise Lemma 2.1, let us consider the problem of converting polynomials from the Lagrange basis to the LCH basis. The factorisation of the Lagrange basis polynomials provided by the lemma includes a shift of variables for one factor. Consequently, a recursive approach is facilitated by considering the more general problem of converting from a basis of shifted Lagrange polynomials to the LCH basis. An instance of this new problem is defined by a vector $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ with entries that are linearly independent over $\mathbb{F}_2$, a shift parameter $\lambda \in F$, and coefficients $f_0, \ldots, f_{2^n-1} \in F$. The goal is then to compute $h_0, \ldots, h_{2^n-1} \in F$ such that
\[
\sum_{i=0}^{2^n-1} h_i X_{\beta,i} = \sum_{i=0}^{2^n-1} f_i L_{\beta,i}(x - \lambda). \tag{2.4}
\]
If $n = 1$, then
\[
L_{\beta,0} = \frac{x}{\beta_0} + 1, \quad L_{\beta,1} = \frac{x}{\beta_0}, \quad X_{\beta,0} = 1 \quad\text{and}\quad X_{\beta,1} = \frac{x}{\beta_0}.
\]
Thus, one can simply compute $h_0 = f_0 - (\lambda/\beta_0)(f_0 + f_1)$ and $h_1 = f_0 + f_1$. If $n \geq 2$, then the following consequence of Lemma 2.1 decomposes the length-$2^n$ problem into $2^{n-d}$ problems of length $2^d$, and $2^d$ problems of length $2^{n-d}$, for $d \in \{1, \ldots, n-1\}$ such that $\beta_i/\beta_0 \in \mathbb{F}_{2^d}$ for $i \in \{0, \ldots, d-1\}$. After choosing such a value of $d$, for which one always has the possibility of taking $d = 1$, the smaller instances of the problem admitted by the decomposition can be solved recursively.

Lemma 2.2.
Suppose that $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ has $n \geq 2$ linearly independent entries over $\mathbb{F}_2$, and let $\alpha$, $\gamma$ and $\delta$ be defined as per Lemma 2.1 for some $d \in \{1, \ldots, n-1\}$ such that $\beta_i/\beta_0 \in \mathbb{F}_{2^d}$ for $i \in \{0, \ldots, d-1\}$. Suppose that $f_0, \ldots, f_{2^n-1}, \lambda, g_0, \ldots, g_{2^n-1}, h_0, \ldots, h_{2^n-1} \in F$ satisfy
\[
\sum_{j=0}^{2^d-1} g_{2^d i+j} X_{\alpha,j} = \sum_{j=0}^{2^d-1} f_{2^d i+j} L_{\alpha,j}(x - \lambda - \omega_{\gamma,i}) \quad\text{for } i \in \{0, \ldots, 2^{n-d} - 1\}, \tag{2.5}
\]
and
\[
\sum_{i=0}^{2^{n-d}-1} h_{2^d i+j} X_{\delta,i} = \sum_{i=0}^{2^{n-d}-1} g_{2^d i+j} L_{\delta,i}(x - \eta) \quad\text{for } j \in \{0, \ldots, 2^d - 1\}, \tag{2.6}
\]
where $\eta = (\lambda/\beta_0)^{2^d} - \lambda/\beta_0$. Then (2.4) holds.

Proof. Suppose that $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ has $n \geq 2$ linearly independent entries over $\mathbb{F}_2$, and let $\alpha$, $\gamma$ and $\delta$ be defined as per Lemma 2.1 for some $d \in \{1, \ldots, n-1\}$ such that $\beta_i/\beta_0 \in \mathbb{F}_{2^d}$ for $i \in \{0, \ldots, d-1\}$. Suppose that $f_0, \ldots, f_{2^n-1}, \lambda, g_0, \ldots, g_{2^n-1}, h_0, \ldots, h_{2^n-1} \in F$ satisfy (2.5) and (2.6). Then Lemma 2.1 implies that
\[
\sum_{i=0}^{2^n-1} f_i L_{\beta,i}(x - \lambda)
= \sum_{i=0}^{2^{n-d}-1} \left( \sum_{j=0}^{2^d-1} f_{2^d i+j} L_{\alpha,j}(x - \lambda - \omega_{\gamma,i}) \right) L_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0} - \eta\right),
\]
where $\eta = (\lambda/\beta_0)^{2^d} - \lambda/\beta_0$. By substituting in (2.5) and (2.6), it follows that
\begin{align*}
\sum_{i=0}^{2^n-1} f_i L_{\beta,i}(x - \lambda)
&= \sum_{i=0}^{2^{n-d}-1} \sum_{j=0}^{2^d-1} g_{2^d i+j} X_{\alpha,j}(x) \, L_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0} - \eta\right) \\
&= \sum_{j=0}^{2^d-1} \left( \sum_{i=0}^{2^{n-d}-1} g_{2^d i+j} L_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0} - \eta\right) \right) X_{\alpha,j}(x) \\
&= \sum_{j=0}^{2^d-1} \left( \sum_{i=0}^{2^{n-d}-1} h_{2^d i+j} X_{\delta,i}\!\left(\left(\frac{x}{\beta_0}\right)^{2^d} - \frac{x}{\beta_0}\right) \right) X_{\alpha,j}(x).
\end{align*}
Hence, Lemma 2.1 implies that (2.4) holds. $\square$
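The $n = 1$ base case described before Lemma 2.2 admits a direct numerical check. The following sketch (illustrative, not from the paper; the field GF(2^4) with modulus $x^4 + x + 1$ and the test values are arbitrary choices) verifies identity (2.4) for length-2 instances by evaluating both sides at every element of the field.

```python
# Arithmetic in GF(2^4), elements encoded as ints, modulus x^4 + x + 1.
M, MOD = 4, 0b10011

def mul(a, b):
    r = 0
    while b:                      # carry-less multiplication
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):   # reduce modulo MOD
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def inv(a):                       # a^(2^M - 2) = a^(-1)
    r, e = 1, 2**M - 2
    while e:
        if e & 1:
            r = mul(r, a)
        a, e = mul(a, a), e >> 1
    return r

def convert_base_case(f0, f1, lam, b0):
    """n = 1 case of (2.4): h0 = f0 - (lam/b0)(f0 + f1) and h1 = f0 + f1
    (subtraction is addition, i.e. XOR, in characteristic two)."""
    s = f0 ^ f1
    return f0 ^ mul(mul(lam, inv(b0)), s), s

# Check f0*L_0(x - lam) + f1*L_1(x - lam) = h0*X_0 + h1*X_1 pointwise,
# where L_0 = x/b0 + 1, L_1 = x/b0, X_0 = 1 and X_1 = x/b0.
for f0, f1, lam, b0 in [(3, 7, 5, 9), (1, 0, 12, 2), (8, 8, 0, 6)]:
    h0, h1 = convert_base_case(f0, f1, lam, b0)
    ib0 = inv(b0)
    for x in range(2**M):
        t = mul(x ^ lam, ib0)               # (x - lam)/b0
        lhs = mul(f0, t ^ 1) ^ mul(f1, t)
        rhs = h0 ^ mul(h1, mul(x, ib0))
        assert lhs == rhs
```

Since both sides of (2.4) have degree less than 2, agreement at all sixteen field points establishes equality of the polynomials.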
2.2. Reduction trees.
We use full binary trees to encode the values of $d$ for which the reductions provided by Lemma 2.1 are applied when converting between two of the bases.

Definition 2.3.
A full binary tree is a tree that contains a unique vertex of degree zero or two, while all other vertices have degree one or three. Given a full binary tree $(V, E)$ with unique vertex $r \in V$ of degree zero or two, we use the following nomenclature:
• the vertex $r$ is called the root of the tree,
• a vertex of degree at most one is called a leaf,
• a vertex of degree greater than one is called an internal vertex,
• if $v \in V \setminus \{r\}$ and $v = v_0, v_1, \ldots, v_n = r$ is a path, then $v_0$ is called a child of $v_1$, and a descendant of $v_i$ for $i \in \{1, \ldots, n\}$,
• the set of leaves that are descended from or equal to $v \in V$ is denoted $L_v$,
• the subtree of $(V, E)$ rooted on $v \in V$ is the full binary tree $(V', E')$ such that $V'$ consists of $v$ and all its descendants, and $E'$ consists of all edges $\{u, v\} \in E$ such that $u, v \in V'$.

Example 2.4.
The graph shown in Figure 1 is a full binary tree with root vertex $v_0$, internal vertices $v_0$, $v_1$, $v_2$ and $v_3$, and leaves $v_4$, $v_5$, $v_6$, $v_7$ and $v_8$. The vertex $v_1$ has children $v_3$ and $v_4$, while its descendants are $v_3$, $v_4$, $v_7$ and $v_8$. Consequently, the subtree rooted on $v_1$ consists of those vertices and edges contained in the dotted box. Finally, $L_{v_0} = \{v_4, v_5, v_6, v_7, v_8\}$, $L_{v_1} = \{v_4, v_7, v_8\}$, $L_{v_2} = \{v_5, v_6\}$, $L_{v_3} = \{v_7, v_8\}$ and $L_{v_i} = \{v_i\}$ for $i \in \{4, 5, 6, 7, 8\}$.

[Figure 1 shows a full binary tree on the vertices $v_0, \ldots, v_8$: the root $v_0$ has children $v_1$ and $v_2$, $v_1$ has children $v_3$ and $v_4$, $v_2$ has children $v_5$ and $v_6$, and $v_3$ has children $v_7$ and $v_8$; the subtree rooted on $v_1$ is enclosed in a dotted box.]

Figure 1. A full binary tree.

Each internal vertex of a full binary tree has exactly two children. As it is necessary for us to distinguish between the children of internal vertices in our algorithms, we assume that each full binary tree $(V, E)$ comes equipped with a partition $\{E_\alpha, E_\delta\}$ of $E$ such that if $v \in V$ has children $v_0$ and $v_1$, then $\{v, v_i\} \in E_\alpha$ if and only if $\{v, v_{1-i}\} \in E_\delta$. Then for each internal vertex $v \in V$, we denote by $v_\alpha$ the child of $v$ such that $\{v, v_\alpha\} \in E_\alpha$, and by $v_\delta$ the child of $v$ such that $\{v, v_\delta\} \in E_\delta$. The partition additionally induces a vertex labelling $d : V \to \mathbb{N}$ given by
\[
d(v) = \begin{cases} 0 & \text{if } v \text{ is a leaf vertex}, \\ |L_{v_\alpha}| & \text{if } v \text{ is an internal vertex}. \end{cases}
\]

Example 2.5.
We obtain one such partition $\{E_\alpha, E_\delta\}$ for the tree shown in Figure 1 by letting $E_\alpha$ contain the leftmost (as shown in the figure), and $E_\delta$ the rightmost, of the two edges that connect each internal vertex with its children. Then $d(v_0) = |L_{v_1}| = 3$, $d(v_1) = |L_{v_3}| = 2$, $d(v_2) = |L_{v_5}| = 1$, $d(v_3) = |L_{v_7}| = 1$ and $d(v_i) = 0$ for $i \in \{4, 5, 6, 7, 8\}$.

We use the vertex labelling to encode the values of $d$ for which the reductions provided by Lemma 2.1 are applied. A reduction tree is a full binary tree that fulfils this role for some subspace basis $\beta \in F^n$. To allow us to formally define this notion, we now introduce maps that send $\beta$ to each of the vectors $\alpha$ and $\delta$ defined in Lemma 2.1. For $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ with $n \geq 2$ entries that are linearly independent over $\mathbb{F}_2$, and $d \in \{1, \ldots, n-1\}$, define $\alpha(\beta, d) = (\beta_0, \ldots, \beta_{d-1})$ and
\[
\delta(\beta, d) = \left( \left(\frac{\beta_d}{\beta_0}\right)^{2^d} - \frac{\beta_d}{\beta_0}, \ \ldots, \ \left(\frac{\beta_{n-1}}{\beta_0}\right)^{2^d} - \frac{\beta_{n-1}}{\beta_0} \right).
\]
When $d$ has the additional property that $\beta_i/\beta_0 \in \mathbb{F}_{2^d}$ for $i \in \{0, \ldots, d-1\}$, Lemma 2.1 implies that the vector $\delta(\beta, d)$ has linearly independent entries over $\mathbb{F}_2$.

Definition 2.6.
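The labelling of Example 2.5 is easily reproduced in code. A minimal sketch (illustrative, not from the paper), assuming the vertex numbering of Figure 1 and taking each internal vertex's $\alpha$-child to be its first-listed (left) child:

```python
# children[v] = (v_alpha, v_delta) for internal vertices; leaves are absent.
children = {0: (1, 2), 1: (3, 4), 2: (5, 6), 3: (7, 8)}

def leaves(v):
    """L_v: the set of leaves descended from or equal to v."""
    if v not in children:
        return {v}
    a, d = children[v]
    return leaves(a) | leaves(d)

def label(v):
    """d(v) = 0 for a leaf, |L_{v_alpha}| for an internal vertex."""
    return len(leaves(children[v][0])) if v in children else 0

assert leaves(0) == {4, 5, 6, 7, 8}
assert leaves(1) == {4, 7, 8}
assert [label(v) for v in range(9)] == [3, 2, 1, 1, 0, 0, 0, 0, 0]
```

The final assertion reproduces exactly the values $d(v_0) = 3$, $d(v_1) = 2$, $d(v_2) = d(v_3) = 1$ and $d(v_i) = 0$ for the leaves.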
Let $\beta \in F^n$ have entries that are linearly independent over $\mathbb{F}_2$, and let $(V, E)$ be a full binary tree with root vertex $r \in V$. Then $(V, E)$ is a reduction tree for $\beta$ if it has $n$ leaves, and the following conditions hold if $n > 1$:
(1) $\beta_i/\beta_0 \in \mathbb{F}_{2^{d(r)}}$ for $i \in \{0, \ldots, d(r) - 1\}$,
(2) the subtree of $(V, E)$ rooted on $r_\alpha$ is a reduction tree for $\alpha(\beta, d(r))$, and
(3) the subtree of $(V, E)$ rooted on $r_\delta$ is a reduction tree for $\delta(\beta, d(r))$.

If $v$ is an internal vertex of a full binary tree, then
\[
|L_{v_\alpha}| = d(v) \quad\text{and}\quad |L_{v_\delta}| = |L_v| - |L_{v_\alpha}| = |L_v| - d(v),
\]
since $\{L_{v_\alpha}, L_{v_\delta}\}$ is a partition of $L_v$. Therefore, given a basis vector $\beta$ and one of its reduction trees $(V, E)$, there exist vectors $\beta_v = (\beta_{v,0}, \ldots, \beta_{v,|L_v|-1}) \in F^{|L_v|}$ for $v \in V$, each with linearly independent entries over $\mathbb{F}_2$, such that
\[
1, \frac{\beta_{v,1}}{\beta_{v,0}}, \ldots, \frac{\beta_{v,d(v)-1}}{\beta_{v,0}} \in \mathbb{F}_{2^{d(v)}}, \quad \alpha(\beta_v, d(v)) = \beta_{v_\alpha} \quad\text{and}\quad \delta(\beta_v, d(v)) = \beta_{v_\delta}
\]
for all internal $v \in V$, and $\beta_v = \beta$ if $v$ is the root of the tree. These vectors define instances of each of the basis conversion problems over the subspaces they generate. If $v$ is an internal vertex, then Lemma 2.1 allows instances of the conversion problems over the subspace generated by $\beta_v$ to be reduced to instances over the subspaces generated by $\beta_{v_\alpha}$ and $\beta_{v_\delta}$. If $v$ is a leaf, then $\beta_v$ is 1-dimensional and the corresponding conversion problems may be solved directly. Consequently, the existence of a reduction tree allows the basis conversion problems to be recursively solved with recursion depth equal to that of the tree, i.e., equal to the length of the longest path between its root vertex and one of its leaves.

As the first condition of Definition 2.6 is trivially satisfied if $d(r) = 1$, the existence of a reduction tree for an arbitrarily chosen subspace basis is guaranteed.

Proposition 2.7.
Let $\beta \in F^n$ have entries that are linearly independent over $\mathbb{F}_2$. Then every full binary tree with $n$ leaves and $\operatorname{Im}(d) \subseteq \{0, 1\}$ is a reduction tree for $\beta$.

It is straightforward to prove Proposition 2.7 by induction on $n$, or to obtain the proposition as a consequence of Theorem 4.3 in Section 4.1. Consequently, we omit its proof. The choice of reduction trees provided by the proposition captures the strategy used in recent algorithms [14, 21, 20] for an arbitrary choice of subspace basis. Accordingly, we use such trees as our baseline for comparison. The reduction strategy used by recent algorithms specific to Cantor bases [14, 20] is captured by reduction trees such that $d(v) = 2^{\lceil \log_2 |L_v| \rceil - 1}$ for all internal vertices $v$. We characterise the reduction trees of Cantor bases in the following proposition.

Proposition 2.8.
Suppose that $\beta \in F^n$ is a Cantor basis. Then a full binary tree is a reduction tree for $\beta$ if and only if it has $n$ leaves and $\operatorname{Im}(d) \subseteq \{0, 1, 2, 4, 8, \ldots\}$.
We use the following properties of Cantor bases to prove Proposition 2.8.
Lemma 2.9 (Properties of Cantor bases). Suppose that $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ is a Cantor basis. Then
(1) $\beta_0, \ldots, \beta_{n-1}$ are linearly independent over $\mathbb{F}_2$,
(2) $\beta_0, \ldots, \beta_{i-1} \in \mathbb{F}_{2^{2^k}}$ if $i \leq 2^k$ for some $k \in \mathbb{N}$, and
(3) $\beta_i^{2^{2^k}} - \beta_i = \beta_{i-2^k}$ if $i \geq 2^k$ for some $k \in \mathbb{N}$.

Lemma 2.9 is proved by Gao and Mateer [14, Appendix A], and is also obtained as a special case of Lemma 4.13 in Section 4.3.
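Properties (1)–(3) can be observed on a concrete Cantor basis. The sketch below is illustrative rather than taken from the paper: the field GF(2^4) with modulus $x^4 + x + 1$ and the backtracking search are choices of the example. It finds a dimension-4 basis satisfying (1.2) and checks the three properties for $k = 1$.

```python
# Arithmetic in GF(2^4), elements encoded as ints, modulus x^4 + x + 1.
M, MOD = 4, 0b10011

def mul(a, b):
    r = 0
    while b:                      # carry-less multiplication
        if b & 1:
            r ^= a
        a, b = a << 1, b >> 1
    for i in range(2 * M - 2, M - 1, -1):   # reduce modulo MOD
        if (r >> i) & 1:
            r ^= MOD << (i - M)
    return r

def power(a, e):                  # square-and-multiply exponentiation
    r = 1
    while e:
        if e & 1:
            r = mul(r, a)
        a, e = mul(a, a), e >> 1
    return r

def cantor_basis(n):
    """Backtracking search for beta_0, ..., beta_{n-1} satisfying (1.2):
    beta_0 = 1 and beta_i = beta_{i+1}^2 - beta_{i+1} (minus is XOR here)."""
    def extend(basis):
        if len(basis) == n:
            return basis
        for b in range(2**M):     # try each field element as the next entry
            if mul(b, b) ^ b == basis[-1]:
                out = extend(basis + [b])
                if out is not None:
                    return out
        return None
    return extend([1])

beta = cantor_basis(4)            # GF(2^4) admits a Cantor basis of dimension 4
assert beta is not None and beta[0] == 1
assert all(mul(beta[i + 1], beta[i + 1]) ^ beta[i + 1] == beta[i] for i in range(3))

# Property (1): the 16 subset sums are distinct, so the entries are
# linearly independent over F_2 (and here span the whole field).
sums = {0}
for b in beta:
    sums |= {s ^ b for s in sums}
assert len(sums) == 16

# Property (2) with k = 1: beta_0, beta_1 lie in F_4 = {x : x^4 = x}.
assert power(beta[0], 4) == beta[0] and power(beta[1], 4) == beta[1]

# Property (3) with k = 1: beta_i^(2^2) - beta_i = beta_{i-2} for i >= 2.
assert power(beta[2], 4) ^ beta[2] == beta[0]
assert power(beta[3], 4) ^ beta[3] == beta[1]
```

The search always terminates here because the Artin–Schreier equation $x^2 - x = c$ has at most two roots, and a chain of length 4 is guaranteed to exist in GF(2^4) since its degree is the power of two $2^2$.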
Proof of Proposition 2.8.
A full binary tree with one leaf has $\operatorname{Im}(d) = \{0\}$. Thus, the proposition holds for $n = 1$, since a full binary tree is a reduction tree for a 1-dimensional Cantor basis if and only if it has one leaf. Therefore, let $n \geq 2$ and suppose that the proposition holds for all dimensions less than $n$. Suppose that $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ is a Cantor basis, and let $(V, E)$ be a full binary tree with root vertex $r \in V$. Then $(V, E)$ is a reduction tree for $\beta$ if and only if it has $n$ leaves, $\beta_1, \ldots, \beta_{d(r)-1} \in \mathbb{F}_{2^{d(r)}}$ (recall that $\beta_0 = 1$ for a Cantor basis), the subtree rooted on $r_\alpha$ is a reduction tree for $\alpha(\beta, d(r))$, and the subtree rooted on $r_\delta$ is a reduction tree for $\delta(\beta, d(r))$. If the tree has $n$ leaves, then $1 \leq d(r) < n$ and properties (1) and (2) of Lemma 2.9 imply that $\beta_1, \ldots, \beta_{d(r)-1} \in \mathbb{F}_{2^{d(r)}}$ if and only if $d(r)$ is a power of two. In this case, property (3) of the lemma implies that $\alpha(\beta, d(r)) = (\beta_0, \ldots, \beta_{d(r)-1})$ and $\delta(\beta, d(r)) = (\beta_0, \ldots, \beta_{n-d(r)-1})$. Finally, if the tree has $n$ leaves, then the subtree rooted on $r_\alpha$ has $d(r)$ leaves, and the subtree rooted on $r_\delta$ has $|L_r| - d(r) = n - d(r)$ leaves. Therefore, the induction hypothesis implies that $(V, E)$ is a reduction tree for $\beta$ if and only if it has $n$ leaves, $d(r)$ is a power of two, and the subtrees rooted on $r_\alpha$ and $r_\delta$ both satisfy $\operatorname{Im}(d) \subseteq \{0, 1, 2, 4, 8, \ldots\}$. That is, if and only if the tree has $n$ leaves and $\operatorname{Im}(d) \subseteq \{0, 1, 2, 4, 8, \ldots\}$. Hence, the proposition follows by induction. $\square$

3. Conversion algorithms
In this section, we describe how to convert between the polynomial bases once $\beta$ and a suitable reduction tree have been chosen. Accordingly, we now fix a vector $\beta = (\beta_0, \ldots, \beta_{n-1}) \in F^n$ that has linearly independent entries over $\mathbb{F}_2$, and a reduction tree $(V, E)$ for $\beta$. The algorithms of this section are then presented for this arbitrary selection. Recall that we only provide algorithms for converting between the LCH basis and each of the Newton, Lagrange and monomial bases. The algorithms that we propose for each of these problems require the precomputation of a small number of constants associated with the basis and the reduction tree. Consequently, we begin the section by discussing these precomputations and their cost.

3.1. Precomputations.
For the remainder of the section, we use the shorthand notations $d_v = d(v)$ and $n_v = |L_v|$ for $v \in V$. Define $\beta_v = (\beta_{v,0}, \ldots, \beta_{v,n_v-1}) \in F^{n_v}$ for $v \in V$ recursively as follows: if $v$ is the root of the tree, then $\beta_v = \beta$; if $v$ is an internal vertex, then $\beta_{v_\alpha} = \alpha(\beta_v, d_v)$ and $\beta_{v_\delta} = \delta(\beta_v, d_v)$. Then Definition 2.6 and Lemma 2.1 imply that the vectors $\beta_v$ for $v \in V$ each have entries that are linearly independent over $\mathbb{F}_2$. Finally, let $\gamma_v = (\beta_{v,d_v}, \ldots, \beta_{v,n_v-1})$ for all internal $v \in V$.

Given $\beta_v$ for some internal $v \in V$, repeated squaring allows $\delta(\beta_v, d_v)$ to be computed with $(n_v - d_v)(d_v + 1)$ multiplications and $n_v - d_v$ additions. If $r \in V$ is the root vertex of the tree, then $V \setminus L_r$ is the set of internal vertices in $V$, and
\[
\sum_{v \in V \setminus L_r} d_v(n_v - d_v) = \sum_{v \in V \setminus L_r} |L_{v_\alpha}||L_{v_\delta}| = \binom{|L_r|}{2} = \binom{n}{2}. \tag{3.1}
\]
Thus, the vectors $\beta_v$ for $v \in V$ can be computed with $O(n^2)$ field operations. The algorithms we propose for converting between the monomial and LCH bases only require the precomputation and storage of $\beta_{v_\delta,0}$, or $1/\beta_{v_\delta,0}$, for all internal $v \in V$. As a full binary tree with $n$ leaves has $n - 1$ internal vertices, these algorithms require the storage of only $O(n)$ field elements, which can be computed with $O(n^2)$ field operations.

For conversions involving the Lagrange or Newton bases, we generalise the problems to include a shift parameter, as we did at the end of Section 2.1 for Lagrange to LCH conversion. As a result, we require additional machinery to allow computations related to the shift parameter to be handled in a time- and space-friendly manner. Define maps $\varphi_v : L_v \times F \to F$ for $v \in V$ recursively as follows: for $u \in L_v$ and $\lambda \in F$,
\[
\varphi_v(u, \lambda) = \begin{cases}
\lambda/\beta_{v,0} & \text{if } u = v, \\
\varphi_{v_\alpha}(u, \lambda) & \text{if } u \neq v \text{ and } u \in L_{v_\alpha}, \\
\varphi_{v_\delta}\!\left(u, (\lambda/\beta_{v,0})^{2^{d_v}} - \lambda/\beta_{v,0}\right) & \text{if } u \neq v \text{ and } u \in L_{v_\delta}.
\end{cases}
\]
For internal $v \in V$, define
\[
\sigma_{v,i} = \sum_{j=0}^{i} \beta_{v,d_v+j} \quad\text{for } i \in \{0, \ldots, n_v - d_v - 1\}.
\]
Then the algorithms we propose for conversions involving the Newton or Lagrange bases require the precomputation and storage of ϕ_v(u, σ_{v,0}), …, ϕ_v(u, σ_{v,n_v−d_v−1}) for all internal v ∈ V and u ∈ L_{v_α}. The identity (3.1) implies that this amounts to n(n − 1)/2 field elements, which can be computed with O(n²) field operations.

Remark 3.1. If β is a Cantor basis, then Proposition 2.8 and property (3) of Lemma 2.9 imply that β_v = (β_0, …, β_{n_v−1}) for v ∈ V. It follows that β_{v_δ,0} = β_0 = 1 for all internal v ∈ V. Thus, no precomputations are required for converting between the LCH and monomial bases. For v ∈ V and u ∈ L_v, the map ϕ_v(u, ·) : F → F is F_2-linear, while Proposition 2.8 and properties (2) and (3) of Lemma 2.9 imply that

ϕ_v(u, β_i) =
  β_i                     if u = v,
  ϕ_{v_α}(u, β_i)         if u ≠ v and u ∈ L_{v_α},
  ϕ_{v_δ}(u, β_{i−d_v})   if u ≠ v, u ∈ L_{v_δ} and i ≥ d_v,
  0                       if u ≠ v, u ∈ L_{v_δ} and i < d_v,

for i ∈ {0, …, n − 1}. Consequently, the precomputations for conversions involving the Newton or Lagrange bases require fewer operations.
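The last equality in (3.1) has a direct combinatorial reading: in a full binary tree, every unordered pair of leaves has a unique lowest common ancestor, so summing |L_{v_α}||L_{v_δ}| over the internal vertices counts each pair of leaves exactly once. A minimal Python sketch of this check (the tuple-based tree representation and function names are ours, not the paper's):

```python
import random

def random_full_binary_tree(n):
    # A full binary tree with n leaves, as nested pairs (left, right);
    # a single leaf is represented by None.
    if n == 1:
        return None
    k = random.randint(1, n - 1)  # number of leaves in the left subtree
    return (random_full_binary_tree(k), random_full_binary_tree(n - k))

def leaves(t):
    return 1 if t is None else leaves(t[0]) + leaves(t[1])

def pair_sum(t):
    # Sum of |L_{v_alpha}| * |L_{v_delta}| over the internal vertices of t.
    if t is None:
        return 0
    left, right = t
    return leaves(left) * leaves(right) + pair_sum(left) + pair_sum(right)

# The sum is C(n, 2) for every full binary tree with n leaves.
for n in range(1, 25):
    assert pair_sum(random_full_binary_tree(n)) == n * (n - 1) // 2
```

The assertion passes for every shape of reduction tree, which is why the precomputation cost in (3.1) does not depend on how the tree is chosen.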
3.2. Conversion between the Newton and Lin–Chung–Han bases.
The factorisations of the Newton basis polynomials provided by Lemma 2.1 include a shift of variables for one factor. Consequently, as we did for Lagrange basis to LCH basis conversion, we consider the more general problem of converting between a basis of shifted Newton polynomials and the LCH basis. Our conversion algorithms are then based on the following analogue of Lemma 2.2.
Lemma 3.2.
Let v ∈ V be an internal vertex and ℓ ∈ {1, …, 2^{n_v}}. Suppose that f_0, …, f_{ℓ−1}, λ, g_0, …, g_{ℓ−1}, h_0, …, h_{ℓ−1} ∈ F satisfy

(3.2)  Σ_{j=0}^{min(ℓ−2^{d_v} i, 2^{d_v})−1} g_{2^{d_v} i+j} X_{β_{v_α},j} = Σ_{j=0}^{min(ℓ−2^{d_v} i, 2^{d_v})−1} f_{2^{d_v} i+j} N_{β_{v_α},j}(x − λ − ω_{γ_v,i})

for i ∈ {0, …, ⌈ℓ/2^{d_v}⌉ − 1}, and

(3.3)  Σ_{i=0}^{⌈(ℓ−j)/2^{d_v}⌉−1} h_{2^{d_v} i+j} X_{β_{v_δ},i} = Σ_{i=0}^{⌈(ℓ−j)/2^{d_v}⌉−1} g_{2^{d_v} i+j} N_{β_{v_δ},i}(x − η)

for j ∈ {0, …, min(2^{d_v}, ℓ) − 1}, where η = (λ/β_{v,0})^{2^{d_v}} − λ/β_{v,0}. Then

(3.4)  Σ_{i=0}^{ℓ−1} h_i X_{β_v,i} = Σ_{i=0}^{ℓ−1} f_i N_{β_v,i}(x − λ).

The proof of Lemma 3.2 is omitted, since it follows along similar lines to that of Lemma 2.2. Using the lemma, we obtain Algorithms 1 and 2 for conversion between the Newton and LCH bases. The algorithms each operate on a vector (a_0, …, a_{ℓ−1}) of field elements that initially contains the coefficients of a polynomial on the input basis, and overwrite its entries with the coefficients of the polynomial on the output basis. The subvectors of this vector that appear in the algorithms would be represented in practice by auxiliary variables, e.g., by a pointer to their first element and a stride parameter. The map ∆ : N → N that appears in the algorithms is defined by i ↦ min{k ∈ N | [i]_k = 0}. By noting that ∆(0), ∆(1), …, ∆(2^k − 2) is the transition sequence of the k-bit binary reflected Gray code, it is possible to successively compute the terms of the sequence at the cost of a small constant number of operations per element by the algorithm of Bitner, Ehrlich and Reingold [5] (see also [16]).

Theorem 3.3.
Algorithms 1 and 2 are correct.

Proof.
We only prove correctness for Algorithm 1, since the proof for Algorithm 2 is almost identical. Suppose that the input vertex v is a leaf. Then ℓ ∈ {1, 2} since n_v = 1. Moreover, L_v = {v}, ϕ_v(v, λ) = λ/β_{v,0}, X_{β_v,0} = N_{β_v,0} = 1 and X_{β_v,1} = N_{β_v,1} = x/β_{v,0}. It follows that Algorithm 1 produces the correct output whenever the input vertex is a leaf. Therefore, as (V, E) is a full binary tree, it is sufficient to show that for internal v ∈ V, if the algorithm produces the correct output whenever v_α or v_δ is given as an input, then the algorithm produces the correct output whenever v is given as an input.

Let v ∈ V be an internal vertex and suppose that the algorithm produces the correct output whenever v_α or v_δ is given as an input. Let λ ∈ F, ℓ ∈ {1, …, 2^{n_v}} and f_0, …, f_{ℓ−1} ∈ F. Suppose that Algorithm 1 is called on v, (ϕ_v(u, λ))_{u∈L_v} and ℓ,

Algorithm 1
N2X(v, (ϕ_v(u, λ))_{u∈L_v}, ℓ, (a_0, …, a_{ℓ−1}))

Input: a vertex v ∈ V, the vector (ϕ_v(u, λ))_{u∈L_v} ∈ F^{n_v} for some λ ∈ F, ℓ ∈ {1, …, 2^{n_v}}, and a_i = f_i ∈ F for i ∈ {0, …, ℓ − 1}.
Output: a_i = h_i ∈ F for i ∈ {0, …, ℓ − 1} such that (3.4) holds.

1: if v is a leaf then
2:   if ℓ = 2 then a_0 ← a_0 + ϕ_v(v, λ) a_1
3:   return
4: ℓ_1 ← ⌈ℓ/2^{d_v}⌉ − 1, ℓ_0 ← ℓ − 2^{d_v} ℓ_1, ℓ' ← min(2^{d_v}, ℓ)
5: μ ← (ϕ_v(u, λ))_{u∈L_{v_α}}, ν ← (ϕ_v(u, λ))_{u∈L_{v_δ}}
6: for i = 0, …, ℓ_1 − 1 do
7:   N2X(v_α, μ, 2^{d_v}, (a_{2^{d_v} i}, a_{2^{d_v} i+1}, …, a_{2^{d_v}(i+1)−1}))
8:   μ ← μ + (ϕ_v(u, σ_{v,∆(i)}))_{u∈L_{v_α}}
9: N2X(v_α, μ, ℓ_0, (a_{2^{d_v} ℓ_1}, a_{2^{d_v} ℓ_1+1}, …, a_{ℓ−1}))
10: for j = 0, …, ℓ_0 − 1 do
11:   N2X(v_δ, ν, ℓ_1 + 1, (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v} ℓ_1+j}))
12: for j = ℓ_0, …, ℓ' − 1 do
13:   N2X(v_δ, ν, ℓ_1, (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v}(ℓ_1−1)+j}))

Algorithm 2
X2N(v, (ϕ_v(u, λ))_{u∈L_v}, ℓ, (a_0, …, a_{ℓ−1}))

Input: a vertex v ∈ V, the vector (ϕ_v(u, λ))_{u∈L_v} ∈ F^{n_v} for some λ ∈ F, ℓ ∈ {1, …, 2^{n_v}}, and a_i = h_i ∈ F for i ∈ {0, …, ℓ − 1}.
Output: a_i = f_i ∈ F for i ∈ {0, …, ℓ − 1} such that (3.4) holds.

1: if v is a leaf then
2:   if ℓ = 2 then a_0 ← a_0 + ϕ_v(v, λ) a_1
3:   return
4: ℓ_1 ← ⌈ℓ/2^{d_v}⌉ − 1, ℓ_0 ← ℓ − 2^{d_v} ℓ_1, ℓ' ← min(2^{d_v}, ℓ)
5: μ ← (ϕ_v(u, λ))_{u∈L_{v_α}}, ν ← (ϕ_v(u, λ))_{u∈L_{v_δ}}
6: for j = 0, …, ℓ_0 − 1 do
7:   X2N(v_δ, ν, ℓ_1 + 1, (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v} ℓ_1+j}))
8: for j = ℓ_0, …, ℓ' − 1 do
9:   X2N(v_δ, ν, ℓ_1, (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v}(ℓ_1−1)+j}))
10: for i = 0, …, ℓ_1 − 1 do
11:   X2N(v_α, μ, 2^{d_v}, (a_{2^{d_v} i}, a_{2^{d_v} i+1}, …, a_{2^{d_v}(i+1)−1}))
12:   μ ← μ + (ϕ_v(u, σ_{v,∆(i)}))_{u∈L_{v_α}}
13: X2N(v_α, μ, ℓ_0, (a_{2^{d_v} ℓ_1}, a_{2^{d_v} ℓ_1+1}, …, a_{ℓ−1}))

with a_i = f_i for i ∈ {0, …, ℓ − 1}. Then Lines 5 and 8 of the algorithm and the F_2-linearity of the maps ϕ_v(u, ·) : F → F, for u ∈ L_v, ensure that

μ = (ϕ_v(u, λ) + Σ_{j=0}^{i−1} ϕ_v(u, σ_{v,∆(j)}))_{u∈L_{v_α}} = (ϕ_v(u, λ + Σ_{j=0}^{i−1} σ_{v,∆(j)}))_{u∈L_{v_α}}

each time the recursive call of Line 7 is performed. As γ_v = (β_{v,d_v}, …, β_{v,n_v−1}), we have ω_{γ_v,0} = 0 and

ω_{γ_v,i} = ω_{γ_v,i−1} + Σ_{j=0}^{∆(i−1)} β_{v,d_v+j} = ω_{γ_v,i−1} + σ_{v,∆(i−1)} = Σ_{j=0}^{i−1} σ_{v,∆(j)}

for i ∈ {1, …, 2^{n_v−d_v} − 1}. Thus,

μ = (ϕ_v(u, λ + ω_{γ_v,i}))_{u∈L_{v_α}} = (ϕ_{v_α}(u, λ + ω_{γ_v,i}))_{u∈L_{v_α}}

each time the recursive call of Line 7 is performed. Similarly, μ = (ϕ_{v_α}(u, λ + ω_{γ_v,⌈ℓ/2^{d_v}⌉−1}))_{u∈L_{v_α}} when the recursive call of Line 9 is performed.
Therefore, the assumption that Algorithm 1 produces the correct output whenever v_α is given as an input implies that Lines 4–9 set a_i = g_i for i ∈ {0, …, ℓ − 1}, where g_0, …, g_{ℓ−1} are the unique elements in F such that (3.2) holds.

The recursive calls of Lines 11 and 13 are made with ν = (ϕ_v(u, λ))_{u∈L_{v_δ}} = (ϕ_{v_δ}(u, η))_{u∈L_{v_δ}}, where η = (λ/β_{v,0})^{2^{d_v}} − λ/β_{v,0}. Therefore, the assumption that Algorithm 1 produces the correct output whenever v_δ is given as an input implies that Lines 10–13 set a_i = h_i for i ∈ {0, …, ℓ − 1}, where h_0, …, h_{ℓ−1} are the unique elements in F such that (3.3) holds. Hence, Lemma 3.2 implies that (3.4) holds at the end of the algorithm. □

For conversions between the Newton and LCH bases, the shift parameter λ is equal to zero for the initial calls to Algorithms 1 and 2. In this case, the vector (ϕ_v(u, λ))_{u∈L_v} contains all zeros. If an application arises where it is necessary for the initial call to be made with λ ≠ 0, then the input vector can be computed with O(n_v²) field operations. Storing the input vector and the vectors μ and ν that appear in the algorithms requires storing O(n_v) field elements. Including the computation and storage of the precomputed elements ϕ_v(u, σ_{v,i}), it follows that Algorithms 1 and 2 require auxiliary storage for O(n²) field elements, while all precomputations can be performed with O(n²) field operations.

The number of multiplications performed by Algorithms 1 and 2 is independent of the choice of reduction tree. This is not true of the number of additions performed by the algorithms, due to the updates made to the vector μ in the recursive case. Line 8 of Algorithm 1 and Line 12 of Algorithm 2 each perform (⌈ℓ/2^{d_v}⌉ − 1) d_v additions over all iterations of their containing loops. Consequently, we should aim to avoid small values of d_v when choosing a reduction tree.
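The incremental updates to μ rest on the identity ω_{γ_v,i} = ω_{γ_v,i−1} + σ_{v,∆(i−1)} used in the proof above: incrementing i − 1 flips bits 0, …, ∆(i − 1) of its binary representation, and σ collects exactly that prefix of the basis entries. A small Python sketch, with integers standing in for elements of a binary field so that field addition is XOR (function names and the example basis are ours):

```python
def delta(i):
    # Delta(i) = min{ k : bit k of i is zero }, the lowest unset bit of i.
    k = 0
    while (i >> k) & 1:
        k += 1
    return k

def enumerate_subspace(gamma):
    # Enumerate omega_0, omega_1, ..., where omega_i is the sum of gamma[k]
    # over the set bits k of i. Consecutive elements differ by the prefix
    # sum sigma[delta(i)], since incrementing i flips bits 0, ..., delta(i).
    sigma, acc = [], 0
    for g in gamma:
        acc ^= g
        sigma.append(acc)  # sigma[k] = gamma[0] + ... + gamma[k]
    size = 1 << len(gamma)
    omegas, omega = [], 0
    for i in range(size):
        omegas.append(omega)
        if i + 1 < size:
            omega ^= sigma[delta(i)]
    return omegas

gamma = [0b0001, 0b0110, 0b1010]
assert enumerate_subspace(gamma) == [0, 1, 6, 7, 10, 11, 12, 13]
```

Each step performs one table lookup and one addition, which is why the shifts λ + ω_{γ_v,i} can be maintained with d_v additions per loop iteration rather than recomputed from scratch.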
Moreover, we expect the number of additions performed by the algorithms to be maximised when the subtree rooted on the initial input vertex satisfies Im(d) ⊆ {1, 2}.

Example 3.4. If β is a Cantor basis, then Proposition 2.8 implies that d_v ≤ 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V, and guarantees the existence of a reduction tree for β such that equality always holds. For n up to fifteen, we confirmed experimentally that this choice of reduction tree always minimises the number of additions performed by Algorithms 1 and 2 over all possible reduction trees, while those with Im(d) ⊆ {1, 2} were found to always maximise the number of additions performed by the algorithms. Figure 2 shows the maximum and minimum number of additions performed over all reduction trees for n = 15, as well as the number of multiplications performed for all trees.

Figure 2. Maximum and minimum number of operations performed by Algorithms 1 and 2 for Cantor bases of dimension 15. [Plot of the number of additions or multiplications against the polynomial length ℓ, with curves for the maximum additions, minimum additions, and multiplications.]

The difference between the maximum and minimum number of additions shown in Figure 2 is reflected in the bounds that we obtain on the complexities of the algorithms.

Theorem 3.5.
Algorithms 1 and 2 perform at most ⌊ℓ/2⌋⌈log₂ ℓ⌉ multiplications and ℓ(⌈log₂ ℓ⌉ − 1) + 1 additions in F. If d_v = 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V, then the algorithms perform at most (3ℓ − 2)⌈log₂ ℓ⌉/4 additions in F.

We split the proof of Theorem 3.5 into three lemmas, one for each of the three bounds. It is clear that Algorithms 1 and 2 perform the same number of multiplications when given identical inputs. Consequently, we only prove the bounds for Algorithm 1. The three bounds are equal to zero if ℓ = 1, and one if ℓ = 2. Thus, the bounds hold if the input vertex is a leaf. Therefore, for each of the three bounds it is sufficient to show that if v ∈ V is an internal vertex such that the bound holds whenever the input vertex is v_α or v_δ, then the bound holds whenever v is the input vertex.

Lemma 3.6.
Algorithms 1 and 2 perform at most ⌊ℓ/2⌋⌈log₂ ℓ⌉ multiplications in F.

Proof. Suppose that the input vertex v ∈ V to Algorithm 1 is an internal vertex such that the algorithm performs at most ⌊ℓ/2⌋⌈log₂ ℓ⌉ multiplications in F whenever v_α or v_δ is given as the input vertex. If ℓ_1 = 0, then ℓ_0 = ℓ' = ℓ and the algorithm performs at most

⌊ℓ/2⌋⌈log₂ ℓ⌉ + ℓ ⌊1/2⌋⌈log₂ 1⌉ = ⌊ℓ/2⌋⌈log₂ ℓ⌉

multiplications. Therefore, suppose that ℓ_1 ≠ 0. Then, as ℓ_0 ≤ 2^{d_v}, Lines 6–9 of the algorithm perform at most

(3.5)  2^{d_v−1} ℓ_1 d_v + ⌊ℓ_0/2⌋⌈log₂ ℓ_0⌉ ≤ (2^{d_v−1} ℓ_1 + ⌊ℓ_0/2⌋) d_v = ⌊ℓ/2⌋ d_v

multiplications. Let k ∈ N such that 2^{k−1} < ℓ_1 + 1 ≤ 2^k. Then ⌈log₂(ℓ_1 + 1)⌉ = k and 2^{d_v+k−1} < 2^{d_v}(ℓ_1 + ℓ_0/2^{d_v}) = ℓ ≤ 2^{d_v+k}, since k ≥ 1 and 0 < ℓ_0/2^{d_v} ≤ 1. It follows that ⌈log₂(ℓ_1 + 1)⌉ = ⌈log₂ ℓ⌉ − d_v. Consequently, Lines 10–13 of the algorithm perform at most

ℓ_0 ⌊(ℓ_1 + 1)/2⌋⌈log₂(ℓ_1 + 1)⌉ + (2^{d_v} − ℓ_0)⌊ℓ_1/2⌋⌈log₂ ℓ_1⌉ ≤ ⌊(ℓ_0(ℓ_1 + 1) + (2^{d_v} − ℓ_0)ℓ_1)/2⌋⌈log₂(ℓ_1 + 1)⌉ = ⌊ℓ/2⌋(⌈log₂ ℓ⌉ − d_v)

multiplications. By combining this bound with (3.5), it follows that Algorithm 1 performs at most ⌊ℓ/2⌋⌈log₂ ℓ⌉ multiplications if ℓ_1 ≠ 0. □

Lemma 3.7.
Algorithms 1 and 2 perform at most ℓ(⌈log₂ ℓ⌉ − 1) + 1 additions in F.

Proof. Suppose that the input vertex v ∈ V to Algorithm 1 is an internal vertex such that the algorithm performs at most ℓ(⌈log₂ ℓ⌉ − 1) + 1 additions in F whenever v_α or v_δ is given as the input vertex. If ℓ_1 = 0, then ℓ_0 = ℓ' = ℓ and the algorithm performs at most

ℓ(⌈log₂ ℓ⌉ − 1) + 1 + ℓ(1(⌈log₂ 1⌉ − 1) + 1) = ℓ(⌈log₂ ℓ⌉ − 1) + 1

additions. Therefore, suppose that ℓ_1 ≠ 0. Then Lines 6–9 of the algorithm perform at most

ℓ_1(2^{d_v}(d_v − 1) + 1 + d_v) + ℓ_0(⌈log₂ ℓ_0⌉ − 1) + 1 = ℓ(d_v − 1) + 1 + ℓ_1(d_v + 1) + ℓ_0(⌈log₂ ℓ_0⌉ − d_v)

additions, while Lines 10–13 perform at most

ℓ_0(ℓ_1 + 1)(⌈log₂(ℓ_1 + 1)⌉ − 1) + (2^{d_v} − ℓ_0)ℓ_1(⌈log₂ ℓ_1⌉ − 1) + 2^{d_v} ≤ ℓ(⌈log₂(ℓ_1 + 1)⌉ − 1) + 2^{d_v} = ℓ(⌈log₂ ℓ⌉ − d_v) − ℓ + 2^{d_v}

additions. As 1 ≤ ℓ_0 ≤ 2^{d_v}, it follows that Algorithm 1 performs at most

ℓ(d_v − 1) + 1 + ℓ_1(d_v + 1) + ℓ_0(⌈log₂ ℓ_0⌉ − d_v) + ℓ(⌈log₂ ℓ⌉ − d_v) − ℓ + 2^{d_v}
= ℓ(⌈log₂ ℓ⌉ − 1) + 1 − (2^{d_v} − d_v − 1)(ℓ_1 − 1) + d_v + 1 + ℓ_0⌈log₂(ℓ_0/2^{d_v+1})⌉
≤ ℓ(⌈log₂ ℓ⌉ − 1) + 1 + d_v + 1 − ℓ_0⌊log₂(2^{d_v+1}/ℓ_0)⌋
≤ ℓ(⌈log₂ ℓ⌉ − 1) + 1

additions if ℓ_1 ≠ 0, where the final inequality holds since ℓ_0⌊log₂(2^{d_v+1}/ℓ_0)⌋ ≥ d_v + 1 for 1 ≤ ℓ_0 ≤ 2^{d_v}. □

Lemma 3.8.
Suppose that d_v = 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V. Then Algorithms 1 and 2 perform at most (3ℓ − 2)⌈log₂ ℓ⌉/4 additions in F.

Proof. Suppose that d_v = 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V. Furthermore, suppose that the input vertex v ∈ V to Algorithm 1 is an internal vertex such that the algorithm performs at most (3ℓ − 2)⌈log₂ ℓ⌉/4 additions in F whenever v_α or v_δ is given as the input vertex. If ℓ_1 = 0, then ℓ_0 = ℓ' = ℓ and the algorithm performs at most

(3ℓ − 2)⌈log₂ ℓ⌉/4 + ℓ((3 − 2)⌈log₂ 1⌉/4) = (3ℓ − 2)⌈log₂ ℓ⌉/4

additions. Therefore, suppose that ℓ_1 ≠ 0. Then, as ℓ_0 ≤ 2^{d_v}, Lines 6–9 of the algorithm perform at most

ℓ_1((3 · 2^{d_v} − 2)d_v/4 + d_v) + (3ℓ_0 − 2)⌈log₂ ℓ_0⌉/4 ≤ (3ℓ − 2)d_v/4 + ℓ_1 d_v/2

additions, while Lines 10–13 perform at most

ℓ_0(3(ℓ_1 + 1) − 2)⌈log₂(ℓ_1 + 1)⌉/4 + (2^{d_v} − ℓ_0)(3ℓ_1 − 2)⌈log₂ ℓ_1⌉/4 ≤ (3ℓ − 2^{d_v+1})⌈log₂(ℓ_1 + 1)⌉/4 ≤ (3ℓ − 2)(⌈log₂ ℓ⌉ − d_v)/4 − (2^{d_v} − 1) log₂(ℓ_1 + 1)/2

additions. It follows that Algorithm 1 performs at most

(3ℓ − 2)⌈log₂ ℓ⌉/4 + (ℓ_1 d_v − (2^{d_v} − 1) log₂(ℓ_1 + 1))/2 ≤ (3ℓ − 2)⌈log₂ ℓ⌉/4

additions if ℓ_1 ≠ 0, since ℓ_1 + 1 ≤ 2^{n_v−d_v} ≤ 2^{d_v} and the function x/log₂(x + 1) is increasing for x ≥ 1. □

3.3. Conversion from the Lagrange basis to the Lin–Chung–Han basis.
Our algorithms for converting between the Lagrange and LCH bases are direct analogues of Harvey's cache-friendly truncated FFT and inverse truncated FFT algorithms [15]. We align our presentation of the algorithms and their proofs of correctness with their counterparts given by Harvey, and direct the reader to Harvey's paper for further motivation behind the algorithms. In this section, we focus on the problem of converting from the Lagrange basis to the LCH basis. The inverse problem of converting from the LCH basis to the Lagrange basis is considered separately in the next section.

We propose Algorithm 3 for converting from the Lagrange basis to the LCH basis. Like the algorithms of Section 3.2, Algorithm 3 operates on a vector of field elements whose initial entries are overwritten with the output. However, the length of the vector is determined by the input vertex rather than the parameter ℓ which bounds the polynomial length. Consequently, the vector may have coordinates that are never used to store input or output values, but which are still used for intermediate computations. If the parameter c is less than ℓ, then the algorithm differs substantially from the other basis conversion algorithms in this paper by requiring that the vector initially contain a combination of coefficients from a polynomial's representations on the input and output bases. If the parameter b is set equal to one, then the algorithm has the additional unique feature of being required to compute a coefficient from the input basis representation. These two parameters are part of the internal mechanics of the recursive approach used by the algorithm, and in most practical applications, such as the multiplication of binary polynomials, the initial call to the algorithm is made with c = ℓ and b = 0.
However, one may be required to initially call the algorithm with c < ℓ if it is used within the Hermite interpolation algorithm of Coxon [12].

The reduction used by Algorithm 3 is provided by the following generalisation of Lemma 2.2, which is rephrased to reflect the fact that the algorithm takes in a mixture of coefficients from representations on the Lagrange and LCH bases.

Lemma 3.9.
Let v ∈ V be an internal vertex and ℓ ∈ {1, …, 2^{n_v}}. Suppose that f_0, …, f_{2^{n_v}−1}, λ, h_0, …, h_{ℓ−1} ∈ F satisfy

(3.6)  Σ_{i=0}^{ℓ−1} h_i X_{β_v,i} = Σ_{i=0}^{2^{n_v}−1} f_i L_{β_v,i}(x − λ).

Then there exist unique elements g_0, …, g_{2^{n_v}−1} ∈ F such that

(3.7)  Σ_{j=0}^{min(2^{d_v}, ℓ)−1} g_{2^{d_v} i+j} X_{β_{v_α},j} = Σ_{j=0}^{2^{d_v}−1} f_{2^{d_v} i+j} L_{β_{v_α},j}(x − λ − ω_{γ_v,i})

for i ∈ {0, …, 2^{n_v−d_v} − 1}, and

(3.8)  Σ_{i=0}^{⌊(ℓ−1−j)/2^{d_v}⌋} h_{2^{d_v} i+j} X_{β_{v_δ},i} = Σ_{i=0}^{2^{n_v−d_v}−1} g_{2^{d_v} i+j} L_{β_{v_δ},i}(x − η)

for j ∈ {0, …, 2^{d_v} − 1}, where η = (λ/β_{v,0})^{2^{d_v}} − λ/β_{v,0}.

Proof. Let v ∈ V, ℓ ∈ {1, …, 2^{n_v}} and f_0, …, f_{2^{n_v}−1}, λ, h_0, …, h_{ℓ−1} ∈ F satisfy the conditions of the lemma. Then there exist unique elements g_0, …, g_{2^{n_v}−1} ∈ F such that (3.8) holds for j ∈ {0, …, 2^{d_v} − 1}. We show that they also satisfy (3.7) for i ∈ {0, …, 2^{n_v−d_v} − 1}. If j ∈ {0, …, 2^{d_v} − 1} satisfies j ≥ ℓ, then (3.8) implies that g_{2^{d_v} i+j} = 0 for i ∈ {0, …, 2^{n_v−d_v} − 1}. Therefore, if we let h_ℓ = · · · = h_{2^{n_v}−1} = 0, then the left-hand sides of (3.6), (3.7) and (3.8) remain unchanged if we replace ℓ by 2^{n_v} in the summation bounds. Consequently, (3.7) must hold for i ∈ {0, …, 2^{n_v−d_v} − 1}, since otherwise Lemma 2.2 allows us to contradict the uniqueness of the coefficients f_0, …, f_{2^{n_v}−1} in (3.6) by writing the polynomial on the left-hand side of (3.7) on the basis

{L_{β_{v_α},0}(x − λ − ω_{γ_v,i}), …, L_{β_{v_α},2^{d_v}−1}(x − λ − ω_{γ_v,i})}

for i ∈ {0, …, 2^{n_v−d_v} − 1}. □

Theorem 3.10.
Algorithm 3 is correct.

Proof.
Suppose that v ∈ V is a leaf. Then Table 2 displays the input and output requirements of Algorithm 3 on the vector (a_0, …, a_{2^{n_v}−1}) = (a_0, a_1) for each possible input that includes v. The table also shows the output of the algorithm as computed by Lines 1–7. The elements f_i and h_i that appear in a row of the table are the coefficients of (3.6) for the specified value of ℓ. Elements denoted by asterisks are unspecified by the algorithm. As v is a leaf, we have X_{β_v,0} = 1, X_{β_v,1} = x/β_{v,0}, L_{β_v,0} = x/β_{v,0} + 1 and L_{β_v,1} = x/β_{v,0}.

Algorithm 3
L2X(v, (ϕ_v(u, λ))_{u∈L_v}, c, ℓ, b, (a_0, …, a_{2^{n_v}−1}))

Input: a vertex v ∈ V, the vector (ϕ_v(u, λ))_{u∈L_v} ∈ F^{n_v} for some λ ∈ F, c, ℓ ∈ N such that c ≤ ℓ and 1 ≤ ℓ ≤ 2^{n_v}, b ∈ {0, 1} such that 1 ≤ b + c ≤ 2^{n_v}, a_i = f_i ∈ F for i ∈ {0, …, c − 1}, and a_i = h_i ∈ F for i ∈ {c, …, ℓ − 1}.
Output: a_i = h_i ∈ F for i ∈ {0, …, c − 1} such that (3.6) holds for some f_c, …, f_{2^{n_v}−1} ∈ F, and a_c = f_c if b = 1.

1: if v is a leaf then
2:   if c = 2 then a_1 ← a_0 + a_1, a_0 ← a_0 + ϕ_v(v, λ) a_1
3:   if c = 1, ℓ = 2 and b = 1 then w ← ϕ_v(v, λ) a_1, a_1 ← a_0 + a_1, a_0 ← a_0 + w
4:   if c = 1, ℓ = 2 and b = 0 then a_0 ← a_0 + ϕ_v(v, λ) a_1
5:   if c = 0 and ℓ = 2 then a_0 ← a_0 + ϕ_v(v, λ) a_1
6:   if c = 1, ℓ = 1 and b = 1 then a_1 ← a_0
7:   return
8: c_1 ← ⌊c/2^{d_v}⌋, c_0 ← c − 2^{d_v} c_1
9: ℓ_1 ← ⌊ℓ/2^{d_v}⌋, ℓ_0 ← ℓ − 2^{d_v} ℓ_1
10: ℓ' ← min(2^{d_v}, ℓ), b' ← min(b + c_0, 1)
11: s ← min(c_0, ℓ_0), t ← max(c_0, ℓ_0)
12: μ ← (ϕ_v(u, λ))_{u∈L_{v_α}}, ν ← (ϕ_v(u, λ))_{u∈L_{v_δ}}
13: for i = 0, …, c_1 + b' − 2 do
14:   L2X(v_α, μ, 2^{d_v}, 2^{d_v}, 0, (a_{2^{d_v} i}, a_{2^{d_v} i+1}, …, a_{2^{d_v}(i+1)−1}))
15:   μ ← μ + (ϕ_v(u, σ_{v,∆(i)}))_{u∈L_{v_α}}
16: if b' = 0 then
17:   L2X(v_α, μ, 2^{d_v}, 2^{d_v}, 0, (a_{2^{d_v}(c_1−1)}, a_{2^{d_v}(c_1−1)+1}, …, a_{2^{d_v} c_1−1}))
18: for j = c_0, …, t − 1 do
19:   L2X(v_δ, ν, c_1, ℓ_1 + 1, b', (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v}(2^{n_v−d_v}−1)+j}))
20: for j = t, …, ℓ' − 1 do
21:   L2X(v_δ, ν, c_1, ℓ_1, b', (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v}(2^{n_v−d_v}−1)+j}))
22: if b' = 1 then
23:   L2X(v_α, μ, c_0, ℓ', b, (a_{2^{d_v} c_1}, a_{2^{d_v} c_1+1}, …, a_{2^{d_v}(c_1+1)−1}))
24: for j = 0, …, s − 1 do
25:   L2X(v_δ, ν, c_1 + 1, ℓ_1 + 1, 0, (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v}(2^{n_v−d_v}−1)+j}))
26: for j = s, …, c_0 − 1 do
27:   L2X(v_δ, ν, c_1 + 1, ℓ_1, 0, (a_j, a_{2^{d_v}+j}, …, a_{2^{d_v}(2^{n_v−d_v}−1)+j}))

Moreover, ϕ_v(v, λ) = λ/β_{v,0} for λ ∈ F. Thus, the coefficients of (3.6) satisfy h_0 = f_0 + ϕ_v(v, λ)(f_0 + f_1) and h_1 = f_0 + f_1 if ℓ = 2, and h_0 = f_0 = f_1 if ℓ = 1. Using these equations, one can readily verify that the computed output agrees with the required output for all inputs. Consequently, Algorithm 3 produces the correct output whenever the input vertex is a leaf. Therefore, as (V, E) is a full binary tree, it is sufficient to show that for all internal v ∈ V, if the algorithm produces the correct output whenever v_α or v_δ is given as an input, then it produces the correct output whenever v is given as an input.

Let v ∈ V be an internal vertex and suppose that Algorithm 3 produces the correct output whenever v_α or v_δ is given as an input. Let λ ∈ F, c, ℓ ∈ N such that c ≤ ℓ and 1 ≤ ℓ ≤ 2^{n_v}, and b ∈ {0, 1} such that 1 ≤ b + c ≤ 2^{n_v}. Suppose that Algorithm 3 is called on v, (ϕ_v(u, λ))_{u∈L_v}, c, ℓ, b and (a_0, …, a_{2^{n_v}−1}), with a_i = f_i ∈ F for i ∈ {0, …, c − 1}, and a_i = h_i ∈ F for i ∈ {c, …, ℓ − 1}. Then
c  ℓ  b | Input a_0, a_1 | Required output a_0, a_1 | Computed output a_0, a_1
2  2  0 | f_0, f_1 | h_0, h_1 | f_0 + ϕ_v(v, λ)(f_0 + f_1), f_0 + f_1
1  2  1 | f_0, h_1 | h_0, f_1 | f_0 + ϕ_v(v, λ)h_1, f_0 + h_1
1  2  0 | f_0, h_1 | h_0, ∗  | f_0 + ϕ_v(v, λ)h_1, ∗
0  2  1 | h_0, h_1 | f_0, ∗  | h_0 + ϕ_v(v, λ)h_1, ∗
1  1  1 | f_0, ∗  | h_0, f_1 | f_0, f_0
1  1  0 | f_0, ∗  | h_0, ∗  | f_0, ∗
0  1  1 | h_0, ∗  | f_0, ∗  | h_0, ∗

Table 2.
Required and computed outputs of Algorithm 3 when v is a leaf.

there exist unique elements h_0, …, h_{c−1}, f_c, …, f_{2^{n_v}−1} ∈ F such that (3.6) holds. In turn, Lemma 3.9 implies that there exist unique elements g_0, …, g_{2^{n_v}−1} ∈ F such that (3.7) and (3.8) hold.

Repeating arguments from the proof of Theorem 3.3 shows that the vector μ is equal to (ϕ_{v_α}(u, λ + ω_{γ_v,i}))_{u∈L_{v_α}} each time the recursive call of Line 14 is made, equal to (ϕ_{v_α}(u, λ + ω_{γ_v,c_1−1}))_{u∈L_{v_α}} whenever the recursive call of Line 17 is performed, and equal to (ϕ_{v_α}(u, λ + ω_{γ_v,c_1}))_{u∈L_{v_α}} whenever the recursive call of Line 23 is performed. Similarly, the vector ν is equal to (ϕ_{v_δ}(u, η))_{u∈L_{v_δ}} for the recursive calls of Lines 19, 21, 25 and 27. It follows that if the recursive call of Line 14 is made with a_{2^{d_v} i+j} = f_{2^{d_v} i+j} for j ∈ {0, …, 2^{d_v} − 1}, then (3.7) and the assumption that the algorithm produces the correct output whenever v_α is given as an input imply that a_{2^{d_v} i+j} = g_{2^{d_v} i+j} for j ∈ {0, …, 2^{d_v} − 1} afterwards. Similarly, if the recursive call of Line 19 is made with a_{2^{d_v} i+j} = g_{2^{d_v} i+j} for i ∈ {0, …, c_1 − 1} and a_{2^{d_v} i+j} = h_{2^{d_v} i+j} for i ∈ {c_1, …, ℓ_1}, then (3.8) and the assumption that the algorithm produces the correct output whenever v_δ is given as an input imply that a_{2^{d_v} i+j} = h_{2^{d_v} i+j} for i ∈ {0, …, c_1 − 1}, and a_{2^{d_v} c_1+j} = g_{2^{d_v} c_1+j} if b' = 1, afterwards. Similar statements hold for the remaining recursive calls made by the algorithm.

The remainder of the proof is split into four cases. For each case, we provide an example in either Figure 3 or 4 of how the vector (a_0, …, a_{2^{n_v}−1}) evolves during the algorithm. In the figures, the vector is represented by the 2^{n_v−d_v} × 2^{d_v} matrix (a_{2^{d_v} i+j})_{0≤i<2^{n_v−d_v}, 0≤j<2^{d_v}}. Under this representation, the subvectors that are subjected to recursive calls by the algorithm correspond to row and column vectors of the matrix.
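This matrix representation is purely a matter of indexing: row i of the matrix is the contiguous block a_{2^{d_v} i}, …, a_{2^{d_v}(i+1)−1}, while column j is the strided subvector a_j, a_{2^{d_v}+j}, …. A sketch in Python of the pointer-plus-stride representation mentioned in Section 3.2, under which the recursive calls operate on rows and columns without copying (the class and helper names are ours):

```python
class StridedView:
    # A view (offset, stride, length) into a shared list, so that row and
    # column subvectors alias the same underlying storage.
    def __init__(self, data, offset, stride, length):
        self.data, self.offset, self.stride, self.length = data, offset, stride, length

    def __getitem__(self, k):
        return self.data[self.offset + k * self.stride]

    def __setitem__(self, k, value):
        self.data[self.offset + k * self.stride] = value

d = 3      # d_v: rows have length 2**d
rows = 4   # 2**(n_v - d_v): number of rows
a = list(range(rows * 2 ** d))

row = lambda i: StridedView(a, 2 ** d * i, 1, 2 ** d)   # a_{2^d i}, ..., a_{2^d (i+1) - 1}
col = lambda j: StridedView(a, j, 2 ** d, rows)         # a_j, a_{2^d + j}, ...

assert [row(1)[k] for k in range(2 ** d)] == list(range(8, 16))
assert [col(2)[k] for k in range(rows)] == [2, 10, 18, 26]
```

Writes through either view mutate the shared vector, which is what lets the in-place algorithms alternate between row and column passes.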
Asterisks in the figures represent unspecified entries, while entries surrounded by parentheses are computed only if b = 1, and are unspecified otherwise.

Case a:
Suppose that ℓ_1 = 0. Then Lines 8–11 of the algorithm set c_1 = 0, c_0 = s = c, ℓ_0 = ℓ' = t = ℓ and b' = 1. Thus, Lines 13–17 have no effect. Equation (3.8) implies that Lines 18–19 set a_j = g_j for j ∈ {c, …, ℓ − 1}. Lines 20–21 have no effect since ℓ' = t. Equation (3.7) implies that Lines 22–23 set a_i = g_i for i ∈ {0, …, c − 1}, and a_c = f_c if b = 1. Equation (3.8) implies that Lines 24–25 set a_i = h_i for i ∈ {0, …, c − 1}. Finally, Lines 26–27 have no effect since s = c_0.

Case b:
Suppose that ℓ_1 ≠ 0 and c_0 = 0. Then Lines 8–11 set c_1 = c/2^{d_v}, ℓ' = 2^{d_v}, b' = b, s = 0 and t = ℓ_0. Equation (3.7) implies that Lines 13–17 set

[Figure 3 omitted: matrices showing the contents of the vector in Cases a and b at five stages: (3a) initial contents; (3b) after Lines 13–17 make recursive calls on the highlighted rows; (3c) after Lines 18–21 make recursive calls on the highlighted columns; (3d) after Line 23 makes a recursive call on the highlighted row; (3e) after Lines 24–27 make recursive calls on the highlighted columns.]

Figure 3.
Evolution of the vector (a_0, …, a_{2^{n_v}−1}) during Algorithm 3 for n_v = 5, d_v = 3, c = 3, ℓ = 6 (Case a), and n_v = 5, d_v = 3, c = 16, ℓ = 21 (Case b).

a_i = g_i for i ∈ {0, …, c − 1}. Thus, (3.8) implies that Lines 18–21 set a_i = h_i for i ∈ {0, …, c − 1}, and a_{c+j} = g_{c+j} for j ∈ {0, …, 2^{d_v} − 1} if b = 1. Equation (3.7) implies that Lines 22–23 set a_c = f_c if b = 1, and have no effect otherwise. Lines 24–27 have no effect since s = c_0 = 0.

Case c:
Suppose that ℓ_1 ≠ 0 and 0 < c_0 ≤ ℓ_0. Then Lines 8–11 set ℓ' = 2^{d_v}, b' = 1, s = c_0 and t = ℓ_0. Equation (3.7) implies that Lines 13–17 set a_i = g_i for i ∈ {0, …, 2^{d_v} c_1 − 1}. Thus, (3.8) implies that Lines 18–21 set a_{2^{d_v} i+j} = h_{2^{d_v} i+j} and a_{2^{d_v} c_1+j} = g_{2^{d_v} c_1+j} for i ∈ {0, …, c_1 − 1} and j ∈ {c_0, …, 2^{d_v} − 1}. Equation (3.7) implies that Lines 22–23 set a_{2^{d_v} c_1+j} = g_{2^{d_v} c_1+j} for j ∈ {0, …, c_0 − 1}, and a_c = f_c if b = 1. Equation (3.8) implies that Lines 24–25 set a_{2^{d_v} i+j} = h_{2^{d_v} i+j} for i ∈ {0, …, c_1} and j ∈ {0, …, c_0 − 1}. Lines 26–27 have no effect since s = c_0.

Case d:
Suppose that ℓ_1 ≠ 0 and ℓ_0 < c_0. Then Lines 8–11 set ℓ' = 2^{d_v}, b' = 1, s = ℓ_0 and t = c_0. Equation (3.7) implies that Lines 13–17 set a_i = g_i for i ∈ {0, …, 2^{d_v} c_1 − 1}. Lines 18–19 have no effect since t = c_0. Equation (3.8) implies that Lines 20–21 set a_{2^{d_v} i+j} = h_{2^{d_v} i+j} and a_{2^{d_v} c_1+j} = g_{2^{d_v} c_1+j} for i ∈ {0, …, c_1 − 1} and j ∈ {c_0, …, 2^{d_v} − 1}. Equation (3.7) implies that Lines 22–23 set a_{2^{d_v} c_1+j} = g_{2^{d_v} c_1+j} for j ∈ {0, …, c_0 − 1}, and a_c = f_c if b = 1. Equation (3.8) implies that Lines 24–27 set a_{2^{d_v} i+j} = h_{2^{d_v} i+j} for i ∈ {0, …, c_1} and j ∈ {0, …, c_0 − 1}.

In each of the four cases, Algorithm 3 terminates with a_i = h_i for i ∈ {0, …, c − 1}, and a_c = f_c if b = 1, as required. Hence, for internal v ∈ V, if the algorithm produces the correct output whenever v_α or v_δ is given as an input, then it produces the correct output whenever v is given as an input. □

Algorithm 3 requires the same precomputations as the algorithms of Section 3.2. However, as the length of the vector on which the algorithm operates is no longer tied to the polynomial length ℓ, the auxiliary space requirements grow to 2^{n_v} − ℓ + O(n²) field elements, where the last term accounts for the storage of the vectors μ and ν, and the precomputed elements ϕ_v(u, σ_{v,i}). The update to the vector μ in Line 15 of the algorithm performs (c_1 + b' − 1) d_v = (⌊c/2^{d_v}⌋ + b' − 1) d_v additions over all iterations of its containing loop. Therefore, as for the algorithms of Section 3.2, small values of d_v should be avoided when choosing a reduction tree.

Example 3.11.
For β equal to a Cantor basis of dimension 15, and inputs ℓ ∈ {1, …, 2^{15}}, c = ℓ and b = 0, Figure 5 shows the maximum and minimum number of additions performed by Algorithm 3 over all possible reduction trees for the basis. The number of multiplications performed by the algorithm for each value of ℓ, which is independent of the choice of reduction tree, is also shown in the figure. As for Example 3.4, the number of additions performed for each value of ℓ is maximised by the tree with Im(d) ⊆ {1, 2}, and minimised by the tree such that d_v = 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V.

Theorem 3.12.
Algorithm 3 performs at most
$$\min\left(\frac{c + b - 1}{2}\left(\lceil\log_2(c + b)\rceil - 1\right) + \ell - 1,\; 2^{n_v - 1} n_v\right)$$
multiplications in $F$, and at most
$$\min\left(\frac{c + b - 1}{2}\left(3\lceil\log_2(c + b)\rceil - 1\right) + \ell - 1,\; 2^{n_v - 1}(3 n_v - 2) + 1\right)$$
additions in $F$.

We split the proof of Theorem 3.12 into four lemmas, one for each bound. It is readily verified that the bounds hold if the input vertex is a leaf. Therefore, as $(V, E)$ is a full binary tree, it is sufficient to show for each bound that if $v \in V$ is an internal vertex such that the bound holds whenever the input vertex is $v_\alpha$ or $v_\delta$, then the bound holds whenever $v$ is the input vertex.

Lemma 3.13.
Algorithm 3 performs at most $2^{n_v - 1} n_v$ multiplications in $F$.

[Figure 4 shows, for Cases c and d, the contents of the vector $(a_0, \ldots, a_{2^{n_v}-1})$ at five stages: (4a) the initial contents; (4b) after Lines 13–17 make recursive calls on the highlighted rows; (4c) after Lines 18–21 make recursive calls on the highlighted columns; (4d) after Line 23 makes a recursive call on the highlighted row; (4e) after Lines 24–27 make recursive calls on the highlighted columns.]

Figure 4.
Evolution of the vector $(a_0, \ldots, a_{2^{n_v}-1})$ during Algorithm 3 for $n_v = 5$, $d_v = 3$, $c = 18$, $\ell = 28$ (Case c), and $n_v = 5$, $d_v = 3$, $c = 22$, $\ell = 28$ (Case d).

Proof.
Suppose that Algorithm 3 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lines 13–17 of the algorithm perform at most $c_1 2^{d_v - 1} d_v \le \left(2^{n_v - d_v} - b'\right)2^{d_v - 1} d_v$ multiplications, since $c_1 \le 2^{n_v - d_v}$ with equality implying $c_0 = b = b' = 0$. If $\ell \ge 2^{d_v}$, then $\ell' = 2^{d_v} \ge t \ge c_0$. If $\ell < 2^{d_v}$, then $\ell = \ell_0 = \ell' \ge c_0 = c$. Thus, the inequalities $c_0 \le t \le \ell' \le 2^{d_v}$ hold in either case. It follows that Lines 18–21 perform at most $\left(2^{d_v} - c_0\right)2^{n_v - d_v - 1}(n_v - d_v)$ multiplications. Lines 22–23 perform at most $b' 2^{d_v - 1} d_v$ multiplications, while Lines 24–27 perform at most $c_0 2^{n_v - d_v - 1}(n_v - d_v)$. Summing these contributions, it follows that Algorithm 3 performs at most $2^{n_v - 1} n_v$ multiplications. $\square$

Figure 5. Maximum and minimum number of operations performed by Algorithm 3 (Algorithm 4) for Cantor bases of dimension 15, with parameters $c = \ell$ and $b = 0$ (respectively, $c = \ell$). [Plot omitted: number of additions or multiplications against the polynomial length $\ell$, with curves for the maximum additions, the minimum additions and the multiplications.]
Algorithm 3 performs at most
$$(3.9)\qquad \frac{c + b - 1}{2}\left(\lceil\log_2(c + b)\rceil - 1\right) + \ell - 1$$
multiplications in $F$.

Proof. Suppose that Algorithm 3 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lemma 3.13 implies that Lines 13–17 perform at most
$$(3.10)\qquad c_1 2^{d_v - 1} d_v = \frac{c - c_0}{2}\, d_v$$
multiplications. As $c_0 \le t \le \ell' \le 2^{d_v}$, Lines 18–21 perform at most
$$(3.11)\qquad \left(2^{d_v} - c_0\right)\frac{c_1 + b' - 1}{2}\left(\lceil\log_2(c_1 + b')\rceil - 1\right) + (\ell' - c_0)(\ell_1 - 1) + t - c_0$$
multiplications. Lines 22–23 perform at most
$$(3.12)\qquad b'\left(\frac{c_0 + b - 1}{2}\left(\lceil\log_2\max(c_0 + b, 2)\rceil - 1\right) + \ell' - 1\right)$$
multiplications, while Lines 24–27 perform at most
$$(3.13)\qquad c_0\left(\frac{c_1}{2}\left(\lceil\log_2(c_1 + 1)\rceil - 1\right) + \ell_1 - 1\right) + s$$
multiplications. We show that the sum of these contributions is bounded by (3.9).

Suppose that $b' = 1$. Then $1 \le c_1 + b' \le c + b$. Thus, the sum of (3.10)–(3.13) in this case is at most
$$(3.14)\qquad \frac{c - c_0}{2}\left(\lceil\log_2(c_1 + 1)\rceil + d_v - 1\right) + \frac{c_0 + b - 1}{2}\left(\lceil\log_2(c_0 + b)\rceil - 1\right) + \ell - 1,$$
since $\ell'(\ell_1 - 1) + s + t - c_0 = \ell'(\ell_1 - 1) + \ell_0 \le 2^{d_v}\ell_1 + \ell_0 - \ell' = \ell - \ell'$. If $c_1 = 0$, then $c = c_0$. If $c_1 \neq 0$ and $c_0 \neq 0$, then $\lceil\log_2(c_1 + 1)\rceil = \left\lceil\log_2\left\lceil c/2^{d_v}\right\rceil\right\rceil = \lceil\log_2 c\rceil - d_v \le \lceil\log_2(c + b)\rceil - d_v$. If $c_1 \neq 0$ and $c_0 = 0$, then $b = 1$ since $c_0 + b \ge 1$, and $\lceil\log_2(c_1 + 1)\rceil = \left\lceil\log_2\left(c/2^{d_v} + 1\right)\right\rceil = \lceil\log_2(c + 1)\rceil - d_v = \lceil\log_2(c + b)\rceil - d_v$. In all three of these cases, substituting into (3.14) yields (3.9).

Suppose now that $b' = 0$. Then $c_0 = 0$ and $b = 0$. As $c + b \ge 1$, it follows that $c_1 = c/2^{d_v} \neq 0$. Thus, $\ell' = 2^{d_v}$, since $\ell \ge c \ge 2^{d_v}$. Therefore, the sum of (3.10)–(3.13) in this case is at most
$$\frac{c}{2}\, d_v + \frac{c - 2^{d_v}}{2}\left(\lceil\log_2 c\rceil - d_v - 1\right) + \ell - 2^{d_v} \le \frac{c + b - 1}{2}\left(\lceil\log_2(c + b)\rceil - 1\right) + \ell - 1,$$
as required, since $2^{d_v} \ge d_v + 1$. $\square$

Lemma 3.15.
Algorithm 3 performs at most $2^{n_v - 1}(3 n_v - 2) + 1$ additions in $F$.

Proof. Suppose that Algorithm 3 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lines 13–17 of the algorithm perform at most
$$c_1\left(2^{d_v - 1}(3 d_v - 2) + 1 + d_v\right) - (1 - b')d_v \le \left(2^{n_v - d_v} - b'\right)\left(2^{d_v - 1}(3 d_v - 2) + 1\right) + \left(2^{n_v - d_v} - 1\right)d_v$$
additions, since $c_1 \le 2^{n_v - d_v}$ with equality implying that $c_0 = b = b' = 0$. As $c_0 \le t \le \ell' \le 2^{d_v}$, Lines 18–21 perform at most $\left(2^{d_v} - c_0\right)\left(2^{n_v - d_v - 1}(3(n_v - d_v) - 2) + 1\right)$ additions. Lines 22–23 perform at most $b'\left(2^{d_v - 1}(3 d_v - 2) + 1\right)$ additions, while Lines 24–27 perform at most $c_0\left(2^{n_v - d_v - 1}(3(n_v - d_v) - 2) + 1\right)$ additions. Summing these contributions, it follows that Algorithm 3 performs at most
$$2^{n_v - 1}(3 n_v - 2) + 1 - \left(2^{d_v} - d_v - 1\right)\left(2^{n_v - d_v} - 1\right) \le 2^{n_v - 1}(3 n_v - 2) + 1$$
additions. $\square$
Lemma 3.16.
Algorithm 3 performs at most
$$(3.15)\qquad \frac{c + b - 1}{2}\left(3\lceil\log_2(c + b)\rceil - 1\right) + \ell - 1$$
additions in $F$.

Proof. Suppose that Algorithm 3 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lemma 3.15 implies that Lines 13–17 perform at most
$$(3.16)\qquad c_1\left(2^{d_v - 1}(3 d_v - 2) + 1 + d_v\right) - (1 - b')d_v \le \frac{c - c_0}{2}\,3 d_v - (1 - b')d_v$$
additions. As $c_0 \le t \le \ell' \le 2^{d_v}$, Lines 18–21 perform at most
$$(3.17)\qquad \left(2^{d_v} - c_0\right)\frac{c_1 + b' - 1}{2}\left(3\lceil\log_2(c_1 + b')\rceil - 1\right) + (\ell' - c_0)(\ell_1 - 1) + t - c_0$$
additions. Lines 22–23 perform at most
$$(3.18)\qquad b'\left(\frac{c_0 + b - 1}{2}\left(3\lceil\log_2\max(c_0 + b, 2)\rceil - 1\right) + \ell' - 1\right)$$
additions, while Lines 24–27 perform at most
$$(3.19)\qquad c_0\left(\frac{c_1}{2}\left(3\lceil\log_2(c_1 + 1)\rceil - 1\right) + \ell_1 - 1\right) + s$$
additions. We show that the sum of these contributions is bounded by (3.15).

Suppose that $b' = 1$. Then the sum of (3.16)–(3.19) is at most
$$(3.20)\qquad \frac{c - c_0}{2}\left(3\lceil\log_2(c_1 + 1)\rceil + 3 d_v - 1\right) + \frac{c_0 + b - 1}{2}\left(3\lceil\log_2(c_0 + b)\rceil - 1\right) + \ell - 1.$$
If $c_1 = 0$, then $c = c_0$. If not, then $\lceil\log_2(c_1 + 1)\rceil \le \lceil\log_2(c + b)\rceil - d_v$. In either case, substituting into (3.20) yields (3.15).

Suppose now that $b' = 0$. Then $c_0 = b = 0$, $c_1 \neq 0$ and $\ell' = 2^{d_v}$. It follows that the sum of (3.16)–(3.19) in this case is at most
$$\frac{c}{2}\,3 d_v - d_v + \frac{c - 2^{d_v}}{2}\left(3\lceil\log_2 c\rceil - 3 d_v - 1\right) + \ell - 2^{d_v} \le \frac{c + b - 1}{2}\left(3\lceil\log_2(c + b)\rceil - 1\right) + \ell - 1,$$
as required, since $2^{d_v} \ge d_v + 1$. $\square$

Conversion from the Lin–Chung–Han basis to the Lagrange basis.
We propose Algorithm 4 for converting from the LCH basis to the Lagrange basis. The parameter $c$ plays a different role than in Algorithm 3: its function is to specify the number of Lagrange basis coefficients returned by the algorithm, rather than to specify a mixture of coefficients. For conversion from the LCH basis to the Lagrange basis, Algorithm 4 is initially called with $c = 2^{n_v}$. Smaller initial values of $c$ are relevant, for example, when using the algorithm within the Hermite evaluation algorithm of Coxon [12].

Theorem 3.17.
Algorithm 4 is correct.
Algorithm 4
X2L$(v, (\varphi_v(u, \lambda))_{u \in L_v}, c, \ell, (a_0, \ldots, a_{2^{n_v}-1}))$

Input: a vertex $v \in V$, the vector $(\varphi_v(u, \lambda))_{u \in L_v} \in F^{n_v}$ for some $\lambda \in F$, $c, \ell \in \{1, 2, \ldots, 2^{n_v}\}$, and $a_i = h_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$.
Output: $a_i = f_i \in F$ for $i \in \{0, \ldots, c - 1\}$ such that (3.6) holds for some $f_c, \ldots, f_{2^{n_v}-1} \in F$.

1: if $v$ is a leaf then
2:   if $c = 2$ and $\ell = 2$ then $a_0 \leftarrow a_0 + \varphi_v(v, \lambda)\,a_1$, $a_1 \leftarrow a_1 + a_0$
3:   if $c = 1$ and $\ell = 2$ then $a_0 \leftarrow a_0 + \varphi_v(v, \lambda)\,a_1$
4:   if $c = 2$ and $\ell = 1$ then $a_1 \leftarrow a_0$
5:   return
6: $c_1 \leftarrow \lceil c/2^{d_v}\rceil - 1$, $c_0 \leftarrow c - 2^{d_v} c_1$
7: $\ell_1 \leftarrow \lfloor\ell/2^{d_v}\rfloor$, $\ell_0 \leftarrow \ell - 2^{d_v}\ell_1$, $\ell' \leftarrow \min(2^{d_v}, \ell)$
8: $\mu \leftarrow (\varphi_v(u, \lambda))_{u \in L_{v_\alpha}}$, $\nu \leftarrow (\varphi_v(u, \lambda))_{u \in L_{v_\delta}}$
9: for $j = 0, \ldots, \ell_0 - 1$ do
10:   X2L$(v_\delta, \nu, c_1 + 1, \ell_1 + 1, (a_j, a_{2^{d_v}+j}, \ldots, a_{2^{d_v}(2^{n_v-d_v}-1)+j}))$
11: for $j = \ell_0, \ldots, \ell' - 1$ do
12:   X2L$(v_\delta, \nu, c_1 + 1, \ell_1, (a_j, a_{2^{d_v}+j}, \ldots, a_{2^{d_v}(2^{n_v-d_v}-1)+j}))$
13: for $i = 0, \ldots, c_1 - 1$ do
14:   X2L$(v_\alpha, \mu, 2^{d_v}, \ell', (a_{2^{d_v} i}, a_{2^{d_v} i+1}, \ldots, a_{2^{d_v}(i+1)-1}))$
15:   $\mu \leftarrow \mu + (\varphi_v(u, \sigma_{v, \Delta(i)}))_{u \in L_{v_\alpha}}$
16: X2L$(v_\alpha, \mu, c_0, \ell', (a_{2^{d_v} c_1}, a_{2^{d_v} c_1+1}, \ldots, a_{2^{d_v}(c_1+1)-1}))$

Proof.
Table 3 displays the input and output requirements of Algorithm 4 when the input vertex $v$ is a leaf, as well as showing the output of the algorithm as computed by Lines 1–5. The elements $f_i$ and $h_i$ that appear in a row of the table are the coefficients of (3.6) for the specified value of $\ell$. Elements denoted by asterisks are unspecified by the algorithm. As $v$ is a leaf, the coefficients of (3.6) satisfy $f_0 = h_0 + \varphi_v(v, \lambda)h_1$ and $f_1 = h_1 + (h_0 + \varphi_v(v, \lambda)h_1)$ if $\ell = 2$, and $f_0 = f_1 = h_0$ if $\ell = 1$. Using these equations, one can readily verify that the computed output agrees with the required output for all inputs. Consequently, Algorithm 4 produces the correct output whenever the input vertex is a leaf. Therefore, as $(V, E)$ is a full binary tree, it is sufficient to show that for all internal $v \in V$, if the algorithm produces the correct output whenever $v_\alpha$ or $v_\delta$ is given as an input, then it produces the correct output whenever $v$ is given as an input.

$c$ | $\ell$ | Input $(a_0, a_1)$ | Required output $(a_0, a_1)$ | Computed output $(a_0, a_1)$
2 | 2 | $h_0,\ h_1$ | $f_0,\ f_1$ | $h_0 + \varphi_v(v,\lambda)h_1,\ h_1 + (h_0 + \varphi_v(v,\lambda)h_1)$
1 | 2 | $h_0,\ h_1$ | $f_0,\ *$ | $h_0 + \varphi_v(v,\lambda)h_1,\ *$
2 | 1 | $h_0,\ *$ | $f_0,\ f_1$ | $h_0,\ h_0$
1 | 1 | $h_0,\ *$ | $f_0,\ *$ | $h_0,\ *$

Table 3.
Required and computed outputs of Algorithm 4 when $v$ is a leaf.

Let $v \in V$ be an internal vertex and suppose that Algorithm 4 produces the correct output whenever $v_\alpha$ or $v_\delta$ is given as an input. Suppose that the algorithm is called on $v$, the vector $(\varphi_v(u, \lambda))_{u \in L_v}$ for some $\lambda \in F$, integers $c, \ell \in \{1, 2, \ldots, 2^{n_v}\}$ and $(a_0, \ldots, a_{2^{n_v}-1})$, with $a_i = h_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$. Then there exist unique elements $f_0, \ldots, f_{2^{n_v}-1} \in F$ such that (3.6) holds. In turn, Lemma 3.9 implies that there exist unique elements $g_0, \ldots, g_{2^{n_v}-1} \in F$ such that (3.7) and (3.8) hold.

Once again repeating arguments from the proof of Theorem 3.3 shows that $\nu = (\varphi_{v_\delta}(u, \eta))_{u \in L_{v_\delta}}$ for the recursive calls of Lines 9–12, that $\mu = (\varphi_{v_\alpha}(u, \lambda + \omega_{\gamma_v, i}))_{u \in L_{v_\alpha}}$ each time the recursive call of Line 14 is performed, and finally that $\mu = (\varphi_{v_\alpha}(u, \lambda + \omega_{\gamma_v, c_1}))_{u \in L_{v_\alpha}}$ for the recursive call of Line 16. Thus, (3.8) and the assumption that the algorithm produces the correct output whenever $v_\delta$ is given as an input imply that Lines 9–12 set $a_{2^{d_v} i + j} = g_{2^{d_v} i + j}$ for $i \in \{0, \ldots, c_1\}$ and $j \in \{0, \ldots, \min(2^{d_v}, \ell) - 1\}$. Consequently, (3.7) and the assumption that the algorithm produces the correct output whenever $v_\alpha$ is given as an input imply that Lines 13–15 set $a_i = f_i$ for $i \in \{0, \ldots, 2^{d_v} c_1 - 1\}$, and that Line 16 sets $a_i = f_i$ for $i \in \{2^{d_v} c_1, \ldots, c - 1\}$. Therefore, the algorithm terminates with $a_i = f_i$ for $i \in \{0, \ldots, c - 1\}$, as required. Hence, for internal $v \in V$, if the algorithm produces the correct output whenever $v_\alpha$ or $v_\delta$ is given as an input, then it produces the correct output whenever $v$ is given as an input. $\square$

Algorithm 4 requires the same precomputations as the algorithms of Sections 3.2 and 3.3, while the algorithm requires auxiliary space for $2^{n_v} - \max(c, \ell) + O(n)$ field elements.
Small values of $d_v$ should once again be avoided when choosing a reduction tree for the algorithm, in order to help reduce the number of additions performed by the updates to the vector $\mu$ in Line 15.

Example 3.18.
For $\beta$ equal to a Cantor basis of dimension 15, and inputs $\ell \in \{1, \ldots, 2^{15}\}$ and $c = \ell$, Figure 5 also shows the maximum and minimum number of additions performed by Algorithm 4 over all possible reduction trees for the basis, as well as the number of multiplications performed by the algorithm for all such trees. Thus, Algorithm 4 performs the same number of operations as Algorithm 3 with $c = \ell$ and $b = 0$ for both extremes (see Example 3.11). As for Examples 3.4 and 3.11, the maximum and minimum number of additions performed for each $\ell$ are given respectively by the trees with $d_v = 1$ and $d_v = 2^{\lceil\log_2 n_v\rceil - 1}$ for all internal $v \in V$.

Theorem 3.19.
Algorithm 4 performs at most
$$\min\left(\frac{c - 1}{2}\left(\lceil\log_2 c\rceil - 1\right) + \ell - 1,\; 2^{n_v - 1} n_v\right)$$
multiplications in $F$, and at most
$$\min\left(\frac{c - 1}{2}\left(3\lceil\log_2 c\rceil - 1\right) + \ell - 1,\; 2^{n_v - 1}(3 n_v - 2) + 1\right)$$
additions in $F$.

We split the proof of Theorem 3.19 into four lemmas, one for each bound. It is readily verified that the bounds hold if the input vertex is a leaf. Therefore, as $(V, E)$ is a full binary tree, it is sufficient to show for each bound that if $v \in V$ is an internal vertex such that the bound holds whenever the input vertex is $v_\alpha$ or $v_\delta$, then the bound holds whenever $v$ is the input vertex.

Lemma 3.20.
Algorithm 4 performs at most $2^{n_v - 1} n_v$ multiplications in $F$.

Proof.
Suppose that Algorithm 4 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lines 9–12 of the algorithm perform at most $\ell'\,2^{n_v - d_v - 1}(n_v - d_v) \le 2^{n_v - 1}(n_v - d_v)$ multiplications, while Lines 13–16 perform at most $(c_1 + 1)2^{d_v - 1} d_v \le 2^{n_v - 1} d_v$ multiplications. Summing these contributions, it follows that Algorithm 4 performs at most $2^{n_v - 1} n_v$ multiplications. $\square$

Lemma 3.21.
Algorithm 4 performs at most $(c - 1)\left(\lceil\log_2 c\rceil - 1\right)/2 + \ell - 1$ multiplications in $F$.

Proof. Suppose that Algorithm 4 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lines 9–12 perform at most
$$2^{d_v}\,\frac{c_1}{2}\left(\lceil\log_2(c_1 + 1)\rceil - 1\right) + \ell'(\ell_1 - 1) + \ell_0 \le \frac{c - c_0}{2}\left(\lceil\log_2(c_1 + 1)\rceil - 1\right) + \ell - \ell'$$
multiplications. Lemma 3.20 implies that Lines 13–15 perform at most $c_1 2^{d_v - 1} d_v = \frac{c - c_0}{2}\,d_v$ multiplications, while Line 16 performs at most $\frac{c_0 - 1}{2}\left(\lceil\log_2 c_0\rceil - 1\right) + \ell' - 1$ multiplications. As $c_0 \le c$, it follows that Algorithm 4 performs at most
$$\frac{c - c_0}{2}\left(\lceil\log_2(c_1 + 1)\rceil + d_v - 1\right) + \frac{c_0 - 1}{2}\left(\lceil\log_2 c_0\rceil - 1\right) + \ell - 1 \le \frac{c - 1}{2}\left(\lceil\log_2 c\rceil - 1\right) + \ell - 1$$
multiplications, since $c = c_0$ if $c_1 = 0$, and $\lceil\log_2(c_1 + 1)\rceil = \lceil\log_2 c\rceil - d_v$ otherwise. $\square$

Lemma 3.22.
Algorithm 4 performs at most $2^{n_v - 1}(3 n_v - 2) + 1$ additions in $F$.

Proof. Suppose that Algorithm 4 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lines 9–12 of the algorithm perform at most
$$\ell'\left(2^{n_v - d_v - 1}(3(n_v - d_v) - 2) + 1\right) \le 2^{n_v - 1}\,3(n_v - d_v) - 2^{d_v}\left(2^{n_v - d_v} - 1\right)$$
additions. Lines 13–16 perform at most
$$(c_1 + 1)\left(2^{d_v - 1}(3 d_v - 2) + d_v + 1\right) - d_v \le 2^{n_v - 1}(3 d_v - 2) + 1 + (d_v + 1)\left(2^{n_v - d_v} - 1\right)$$
additions. It follows that Algorithm 4 performs at most
$$2^{n_v - 1}(3 n_v - 2) + 1 - \left(2^{d_v} - d_v - 1\right)\left(2^{n_v - d_v} - 1\right) \le 2^{n_v - 1}(3 n_v - 2) + 1$$
additions. $\square$
Lemma 3.23.
Algorithm 4 performs at most $(c - 1)\left(3\lceil\log_2 c\rceil - 1\right)/2 + \ell - 1$ additions in $F$.

Proof. Suppose that Algorithm 4 is called on an internal vertex $v \in V$ such that the bound of the lemma holds whenever the input vertex is $v_\alpha$ or $v_\delta$. Then Lines 9–12 perform at most
$$2^{d_v}\,\frac{c_1}{2}\left(3\lceil\log_2(c_1 + 1)\rceil - 1\right) + \ell'(\ell_1 - 1) + \ell_0 \le \frac{c - c_0}{2}\left(3\lceil\log_2(c_1 + 1)\rceil - 1\right) + \ell - \ell'$$
additions. Lemma 3.22 implies that Lines 13–15 perform at most
$$c_1\left(2^{d_v - 1}(3 d_v - 2) + 1 + d_v\right) = \frac{c - c_0}{2}\,3 d_v - c_1\left(2^{d_v} - d_v - 1\right) \le \frac{c - c_0}{2}\,3 d_v$$
additions. Line 16 performs at most $\frac{c_0 - 1}{2}\left(3\lceil\log_2 c_0\rceil - 1\right) + \ell' - 1$ additions. As $c_0 \le c$, it follows that Algorithm 4 performs at most
$$\frac{c - c_0}{2}\left(3\lceil\log_2(c_1 + 1)\rceil + 3 d_v - 1\right) + \frac{c_0 - 1}{2}\left(3\lceil\log_2 c_0\rceil - 1\right) + \ell - 1 \le \frac{c - 1}{2}\left(3\lceil\log_2 c\rceil - 1\right) + \ell - 1$$
additions, since $c = c_0$ if $c_1 = 0$, and $\lceil\log_2(c_1 + 1)\rceil = \lceil\log_2 c\rceil - d_v$ otherwise. $\square$

Interlude: generalised Taylor expansion.
The generalised Taylor expansion of a polynomial $F \in F[x]$ at a polynomial $T \in F[x]$ of degree $t \ge 1$, also called its $T$-adic expansion, is the series expansion
$$F = F_0 + F_1 T + F_2 T^2 + \cdots$$
such that $F_i \in F[x]_t$ for $i \in \mathbb{N}$. Gao and Mateer [14, Section II] provide a fast algorithm for computing the coefficients of the Taylor expansion when $T = x^t - x$ with $t \ge 2$. The algorithm is then utilised as part of their additive FFT algorithms. Our algorithm for converting from the monomial basis to the LCH basis similarly relies on their generalised Taylor expansion algorithm. Consequently, we make a brief aside to recall their algorithm.

The algorithm of Gao and Mateer can be viewed as a specialisation of the recursive algorithm of von zur Gathen [27] that takes advantage of easy division by $(x^t - x)^{2^k} = x^{2^k t} - x^{2^k}$ in characteristic two. We present a nonrecursive version of their algorithm modelled on the basis conversion algorithms of van der Hoeven and Schost [26, Section 2.2]. We also present the inverse algorithm, which recovers a polynomial from the coefficients of its Taylor expansion at $x^t - x$, as it is required by our algorithm for converting from the LCH basis to the monomial basis. Finally, we derive a bound on the complexity of both algorithms that is tighter than the one provided by Gao and Mateer.

Let $F \in F[x]_\ell$ and $t \ge 2$. For $k \in \mathbb{N}$, define $F_{k,0}, F_{k,1}, \ldots \in F[x]_{2^k t}$ by the equation
$$(3.21)\qquad F = \sum_{i \in \mathbb{N}} F_{k,i}\left(x^t - x\right)^{2^k i}.$$
Then $F_{0,0}, F_{0,1}, \ldots$ are the coefficients of the Taylor expansion at $x^t - x$, while $F_{k,0} = F$ for $k \ge \lceil\log_2\lceil\ell/t\rceil\rceil$. By grouping terms of indices $2i$ and $2i + 1$ in (3.21), it follows that
$$F = \sum_{i \in \mathbb{N}}\left(F_{k,2i} + F_{k,2i+1}\left(x^{2^k t} - x^{2^k}\right)\right)\left(x^t - x\right)^{2^{k+1} i}$$
for $k \in \mathbb{N}$. Thus, we obtain the recursive formula
$$F_{k+1,i} = F_{k,2i} + x^{2^k} F_{k,2i+1} + x^{2^k t} F_{k,2i+1}$$
for $k, i \in \mathbb{N}$. Given $F_{k,2i}$ and $F_{k,2i+1}$ on the monomial basis, the recursive formula allows $F_{k+1,i}$ to be readily computed on the monomial basis. The formula also allows this computation to be easily inverted. Therefore, given the Taylor coefficients $F_{0,0}, F_{0,1}, \ldots$ on the monomial basis, we can efficiently compute $F = F_{\lceil\log_2\lceil\ell/t\rceil\rceil, 0}$ on the monomial basis by means of the recursive formula, and vice versa. Using this observation, we obtain Algorithms 5 and 6.

Algorithm 5
TaylorExpansion$(t, \ell, (a_0, \ldots, a_{\ell-1}))$

Input: integers $t \ge 2$ and $\ell \ge 1$, and $a_i = f_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$.
Output: $a_i = c_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$ such that
$$(3.22)\qquad \sum_{i=0}^{\lceil\ell/t\rceil - 1}\ \sum_{j=0}^{\min(\ell - ti,\, t) - 1} c_{ti+j}\, x^j\left(x^t - x\right)^i = \sum_{i=0}^{\ell - 1} f_i\, x^i.$$

1: for $k = \lceil\log_2\lceil\ell/t\rceil\rceil - 1, \ldots, 0$ do
2:   $\ell_1 \leftarrow \lfloor\ell/(2^{k+1} t)\rfloor$, $\ell_0 \leftarrow \ell - 2^{k+1} t\,\ell_1$
3:   for $i = 0, \ldots, \ell_1 - 1$ do
4:     for $j = 2^k t - 1, \ldots, 0$ do
5:       $a_{2^k t(2i) + 2^k + j} \leftarrow a_{2^k t(2i) + 2^k + j} + a_{2^k t(2i+1) + j}$
6:   for $j = \ell_0 - 2^k t - 1, \ldots, 0$ do
7:     $a_{2^k t(2\ell_1) + 2^k + j} \leftarrow a_{2^k t(2\ell_1) + 2^k + j} + a_{2^k t(2\ell_1 + 1) + j}$

Algorithm 6
InverseTaylorExpansion$(t, \ell, (a_0, \ldots, a_{\ell-1}))$

Input: integers $t \ge 2$ and $\ell \ge 1$, and $a_i = c_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$.
Output: $a_i = f_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$ such that (3.22) holds.

1: for $k = 0, \ldots, \lceil\log_2\lceil\ell/t\rceil\rceil - 1$ do
2:   $\ell_1 \leftarrow \lfloor\ell/(2^{k+1} t)\rfloor$, $\ell_0 \leftarrow \ell - 2^{k+1} t\,\ell_1$
3:   for $i = 0, \ldots, \ell_1 - 1$ do
4:     for $j = 0, \ldots, 2^k t - 1$ do
5:       $a_{2^k t(2i) + 2^k + j} \leftarrow a_{2^k t(2i) + 2^k + j} + a_{2^k t(2i+1) + j}$
6:   for $j = 0, \ldots, \ell_0 - 2^k t - 1$ do
7:     $a_{2^k t(2\ell_1) + 2^k + j} \leftarrow a_{2^k t(2\ell_1) + 2^k + j} + a_{2^k t(2\ell_1 + 1) + j}$

Lemma 3.24.
Algorithms 5 and 6 perform at most $\lfloor\ell/2\rfloor\lceil\log_2\lceil\ell/t\rceil\rceil$ additions in $F$.

Proof. For each $k \in \{0, \ldots, \lceil\log_2\lceil\ell/t\rceil\rceil - 1\}$, Lines 2–7 of either algorithm perform at most
$$2^k t\,\ell_1 + \max\left(\ell_0 - 2^k t, 0\right) \le 2^k t\,\ell_1 + \max\left(\ell_0 - \lceil\ell_0/2\rceil, \lfloor\ell_0/2\rfloor\right) = 2^k t\,\ell_1 + \lfloor\ell_0/2\rfloor \le \lfloor\ell/2\rfloor$$
additions in $F$. $\square$

Conversion between the Lin–Chung–Han and monomial bases.
We use Lemma 2.1 to provide algorithms for converting between the monomial basis and the "twisted" LCH basis $\{X_{\beta_v,0}(\beta_{v,0}\,x), \ldots, X_{\beta_v,\ell-1}(\beta_{v,0}\,x)\}$ of $F[x]_\ell$, for $v \in V$ and $\ell \in \{1, \ldots, 2^{n_v}\}$. Conversions between the LCH and monomial bases then require at most an additional $\max(2\ell - 2, 0)$ multiplications for performing the substitution $x \mapsto x/\beta_{v,0}$ or $x \mapsto \beta_{v,0}\,x$. In particular, no additional multiplications are required if $\beta$ is a Cantor basis, since $\beta_{v,0} = 1$ for all $v \in V$ (see Remark 3.1). We base the conversion algorithms on the following analogue of Lemma 2.2.

Lemma 3.25.
Let $v \in V$ be an internal vertex and $\ell \in \{1, \ldots, 2^{n_v}\}$. Suppose that $h_0, \ldots, h_{\ell-1}, g_0, \ldots, g_{\ell-1}, c_0, \ldots, c_{\ell-1} \in F$ satisfy
$$(3.23)\qquad \sum_{j=0}^{\min(\ell - 2^{d_v} i,\, 2^{d_v}) - 1} g_{2^{d_v} i + j}\, x^j = \sum_{j=0}^{\min(\ell - 2^{d_v} i,\, 2^{d_v}) - 1} h_{2^{d_v} i + j}\, X_{\beta_{v_\alpha}, j}(\beta_{v_\alpha,0}\, x)$$
for $i \in \{0, \ldots, \lceil\ell/2^{d_v}\rceil - 1\}$, and
$$(3.24)\qquad \sum_{i=0}^{\lceil(\ell - j)/2^{d_v}\rceil - 1} c_{2^{d_v} i + j}\, x^i = \sum_{i=0}^{\lceil(\ell - j)/2^{d_v}\rceil - 1} g_{2^{d_v} i + j}\, X_{\beta_{v_\delta}, i}(\beta_{v_\delta,0}\, x)$$
for $j \in \{0, \ldots, \min(2^{d_v}, \ell) - 1\}$. Then
$$(3.25)\qquad \sum_{i=0}^{\lceil\ell/2^{d_v}\rceil - 1}\ \sum_{j=0}^{\min(\ell - 2^{d_v} i,\, 2^{d_v}) - 1} c_{2^{d_v} i + j}\, x^j \left(\frac{x^{2^{d_v}} - x}{\beta_{v_\delta,0}}\right)^i = \sum_{i=0}^{\ell - 1} h_i\, X_{\beta_v, i}(\beta_{v,0}\, x).$$

Proof.
Let $v \in V$ be an internal vertex and $\ell \in \{1, \ldots, 2^{n_v}\}$. Suppose that $h_0, \ldots, h_{\ell-1}, g_0, \ldots, g_{\ell-1}, c_0, \ldots, c_{\ell-1} \in F$ satisfy equations (3.23) and (3.24). Then $\beta_{v,i}/\beta_{v,0} \in F_{2^{d_v}}$ for $i \in \{0, \ldots, d_v - 1\}$, since $(V, E)$ is a reduction tree for $\beta$. Thus, Lemma 2.1 implies that
$$\sum_{i=0}^{\ell-1} h_i\, X_{\beta_v,i}(\beta_{v,0}\, x) = \sum_{i=0}^{\lceil\ell/2^{d_v}\rceil - 1}\left(\sum_{j=0}^{\min(\ell - 2^{d_v} i,\, 2^{d_v}) - 1} h_{2^{d_v} i + j}\, X_{\beta_{v_\alpha},j}(\beta_{v,0}\, x)\right) X_{\beta_{v_\delta},i}\left(x^{2^{d_v}} - x\right).$$
Substituting in $\beta_{v,0} = \beta_{v_\alpha,0}$, (3.23) and (3.24), it follows that
$$\sum_{i=0}^{\ell-1} h_i\, X_{\beta_v,i}(\beta_{v,0}\, x) = \sum_{i=0}^{\lceil\ell/2^{d_v}\rceil - 1}\ \sum_{j=0}^{\min(\ell - 2^{d_v} i,\, 2^{d_v}) - 1} g_{2^{d_v} i + j}\, x^j\, X_{\beta_{v_\delta},i}\left(x^{2^{d_v}} - x\right)$$
$$= \sum_{j=0}^{\min(2^{d_v}, \ell) - 1}\ \sum_{i=0}^{\lceil(\ell - j)/2^{d_v}\rceil - 1} g_{2^{d_v} i + j}\, X_{\beta_{v_\delta},i}\left(\beta_{v_\delta,0}\,\frac{x^{2^{d_v}} - x}{\beta_{v_\delta,0}}\right) x^j = \sum_{j=0}^{\min(2^{d_v}, \ell) - 1}\ \sum_{i=0}^{\lceil(\ell - j)/2^{d_v}\rceil - 1} c_{2^{d_v} i + j}\left(\frac{x^{2^{d_v}} - x}{\beta_{v_\delta,0}}\right)^i x^j.$$
Hence, (3.25) holds. $\square$
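The conversion algorithms below combine this factorisation with the generalised Taylor expansion of Section 3.5 at $T = x^{2^{d_v}} - x$, and its inverse. As a concrete illustration of Algorithms 5 and 6, the following sketch implements both routines over $\mathbb{F}_2$, with field addition realised as XOR; the code and its helper names are our own illustration, not part of the original presentation. For example, over $\mathbb{F}_2$ one has $1 + x + x^2 + x^3 = (1 + x) + x\,(x^2 - x)$, so the Taylor coefficients of $(1, 1, 1, 1)$ at $x^2 - x$ are $(1, 1, 0, 1)$.

```python
def taylor_expansion(t, a):
    # In-place generalised Taylor expansion at x^t - x (sketch of Algorithm 5).
    # a holds the coefficients f_0, ..., f_{l-1} of F over a field of
    # characteristic two (encoded here as ints, with addition realised as XOR);
    # on return it holds the coefficients c_i of the expansion (3.22).
    l = len(a)
    levels = ((l + t - 1) // t - 1).bit_length()  # ceil(log2(ceil(l/t)))
    for k in range(levels - 1, -1, -1):
        block, half = (1 << (k + 1)) * t, (1 << k) * t
        full, rem = l // block, l % block
        for i in range(full):                     # full blocks of length 2^(k+1) t
            base = block * i
            for j in range(half - 1, -1, -1):
                a[base + (1 << k) + j] ^= a[base + half + j]
        base = block * full                       # partial final block, if any
        for j in range(rem - half - 1, -1, -1):
            a[base + (1 << k) + j] ^= a[base + half + j]
    return a


def inverse_taylor_expansion(t, a):
    # Inverse transformation (sketch of Algorithm 6): the same XORs as
    # taylor_expansion, applied in the opposite order.
    l = len(a)
    levels = ((l + t - 1) // t - 1).bit_length()
    for k in range(levels):
        block, half = (1 << (k + 1)) * t, (1 << k) * t
        full, rem = l // block, l % block
        for i in range(full):
            base = block * i
            for j in range(half):
                a[base + (1 << k) + j] ^= a[base + half + j]
        base = block * full
        for j in range(max(rem - half, 0)):
            a[base + (1 << k) + j] ^= a[base + half + j]
    return a
```

Each level $k$ performs the in-place update $a_{2^k t(2i) + 2^k + j} \leftarrow a_{2^k t(2i) + 2^k + j} + a_{2^k t(2i+1) + j}$, so the addition count of Lemma 3.24 applies unchanged.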
Using Lemma 3.25, we obtain Algorithms 7 and 8 for converting between the monomial basis and the twisted basis $\{X_{\beta_v,0}(\beta_{v,0}\,x), \ldots, X_{\beta_v,\ell-1}(\beta_{v,0}\,x)\}$ of $F[x]_\ell$. Each algorithm makes what is now a familiar pattern of recursive calls, but with the addition of the computation of either a generalised Taylor expansion or the inverse transformation, for which the algorithms of Section 3.5 are used.

Theorem 3.26.
Algorithms 7 and 8 are correct.

Proof.
We prove correctness for Algorithm 7 by induction on $\ell$. The proof of correctness for Algorithm 8 is omitted since it is almost identical. For $v \in V$, we have $X_{\beta_v,0}(\beta_{v,0}\,x) = 1$ and $X_{\beta_v,1}(\beta_{v,0}\,x) = x$. Thus, Algorithm 7 produces the correct output for all inputs with $\ell \le 2$. In particular, it follows that the algorithm produces the correct output whenever the input vertex is a leaf. Therefore, it is sufficient to show that for internal $v \in V$, if the algorithm produces the correct

Algorithm 7
X2M$(v, \ell, (a_0, \ldots, a_{\ell-1}))$

Input: a vertex $v \in V$, $\ell \in \{1, \ldots, 2^{n_v}\}$, and $a_i = h_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$.
Output: $a_i = f_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$ such that
$$(3.26)\qquad \sum_{i=0}^{\ell - 1} f_i\, x^i = \sum_{i=0}^{\ell - 1} h_i\, X_{\beta_v, i}(\beta_{v,0}\, x).$$

1: if $\ell \le 2$ then return
2: $\ell_1 \leftarrow \lceil\ell/2^{d_v}\rceil - 1$, $\ell_0 \leftarrow \ell - 2^{d_v}\ell_1$, $\ell' \leftarrow \min(2^{d_v}, \ell)$
3: for $i = 0, \ldots, \ell_1 - 1$ do
4:   X2M$(v_\alpha, 2^{d_v}, (a_{2^{d_v} i}, a_{2^{d_v} i+1}, \ldots, a_{2^{d_v}(i+1)-1}))$
5: X2M$(v_\alpha, \ell_0, (a_{2^{d_v}\ell_1}, a_{2^{d_v}\ell_1+1}, \ldots, a_{\ell-1}))$
6: for $j = 0, \ldots, \ell_0 - 1$ do
7:   X2M$(v_\delta, \ell_1 + 1, (a_j, a_{2^{d_v}+j}, \ldots, a_{2^{d_v}\ell_1+j}))$
8: for $j = \ell_0, \ldots, \ell' - 1$ do
9:   X2M$(v_\delta, \ell_1, (a_j, a_{2^{d_v}+j}, \ldots, a_{2^{d_v}(\ell_1-1)+j}))$
10: if $\ell_1 \neq 0$ and $1/\beta_{v_\delta,0} \neq 1$ then
11:   $w \leftarrow 1/\beta_{v_\delta,0}$
12:   for $i = 1, \ldots, \ell_1 - 1$ do
13:     for $j = 0, \ldots, 2^{d_v} - 1$ do
14:       $a_{2^{d_v} i + j} \leftarrow w\, a_{2^{d_v} i + j}$
15:     $w \leftarrow w/\beta_{v_\delta,0}$
16:   for $j = 0, \ldots, \ell_0 - 1$ do
17:     $a_{2^{d_v}\ell_1 + j} \leftarrow w\, a_{2^{d_v}\ell_1 + j}$
18: InverseTaylorExpansion$(2^{d_v}, \ell, (a_0, a_1, \ldots, a_{\ell-1}))$

output whenever $v_\alpha$ or $v_\delta$ is given as an input, then it produces the correct output whenever $v$ and $\ell \in \{1, \ldots, 2^{n_v}\}$ are given as inputs.

Let $v \in V$ be an internal vertex and suppose that Algorithm 7 produces the correct output whenever $v_\alpha$ or $v_\delta$ is given as an input. Suppose that the algorithm is called on $v$ and $\ell \in \{1, \ldots, 2^{n_v}\}$, with $a_i = h_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$. Then the assumption that Algorithm 7 produces the correct output whenever $v_\alpha$ is given as an input implies that Lines 2–5 of the algorithm set $a_i = g_i$ for $i \in \{0, \ldots, \ell - 1\}$, where $g_0, \ldots, g_{\ell-1}$ are the unique elements in $F$ such that (3.23) holds. Similarly, the assumption implies that Lines 6–9 then set $a_i = c_i$ for $i \in \{0, \ldots, \ell - 1\}$, where $c_0, \ldots, c_{\ell-1}$ are the unique elements in $F$ such that (3.24) holds. As $v$ is an internal vertex, Lemma 3.25 implies that $c_0, \ldots, c_{\ell-1}$ also satisfy (3.25).

Let $f_0, \ldots, f_{\ell-1}$ be the unique elements in $F$ such that (3.26) holds. If $\ell \le 2^{d_v}$, then (3.25) and (3.26) imply that $f_i = c_i$ for $i \in \{0, \ldots, \ell - 1\}$. Moreover, Lines 10–18 have no effect in this case. Therefore, the algorithm produces the correct output if $\ell \le 2^{d_v}$. If $\ell > 2^{d_v}$, then Lines 10–17 set $a_{2^{d_v} i + j} = c_{2^{d_v} i + j}/\beta_{v_\delta,0}^i$ for $i \in \{0, \ldots, \lceil\ell/2^{d_v}\rceil - 1\}$ and $j \in \{0, \ldots, \min(\ell - 2^{d_v} i, 2^{d_v}) - 1\}$. Substituting into (3.25), it follows that
$$\sum_{i=0}^{\lceil\ell/2^{d_v}\rceil - 1}\ \sum_{j=0}^{\min(\ell - 2^{d_v} i,\, 2^{d_v}) - 1} a_{2^{d_v} i + j}\, x^j\left(x^{2^{d_v}} - x\right)^i = \sum_{i=0}^{\ell - 1} h_i\, X_{\beta_v,i}(\beta_{v,0}\, x)$$
Algorithm 8
M2X$(v, \ell, (a_0, \ldots, a_{\ell-1}))$

Input: a vertex $v \in V$, $\ell \in \{1, \ldots, 2^{n_v}\}$, and $a_i = f_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$.
Output: $a_i = h_i \in F$ for $i \in \{0, \ldots, \ell - 1\}$ such that (3.26) holds.

1: if $\ell \le 2$ then return
2: $\ell_1 \leftarrow \lceil\ell/2^{d_v}\rceil - 1$, $\ell_0 \leftarrow \ell - 2^{d_v}\ell_1$, $\ell' \leftarrow \min(2^{d_v}, \ell)$
3: TaylorExpansion$(2^{d_v}, \ell, (a_0, a_1, \ldots, a_{\ell-1}))$
4: if $\ell_1 \neq 0$ and $\beta_{v_\delta,0} \neq 1$ then
5:   $w \leftarrow \beta_{v_\delta,0}$
6:   for $i = 1, \ldots, \ell_1 - 1$ do
7:     for $j = 0, \ldots, 2^{d_v} - 1$ do
8:       $a_{2^{d_v} i + j} \leftarrow w\, a_{2^{d_v} i + j}$
9:     $w \leftarrow \beta_{v_\delta,0}\, w$
10:   for $j = 0, \ldots, \ell_0 - 1$ do
11:     $a_{2^{d_v}\ell_1 + j} \leftarrow w\, a_{2^{d_v}\ell_1 + j}$
12: for $j = 0, \ldots, \ell_0 - 1$ do
13:   M2X$(v_\delta, \ell_1 + 1, (a_j, a_{2^{d_v}+j}, \ldots, a_{2^{d_v}\ell_1+j}))$
14: for $j = \ell_0, \ldots, \ell' - 1$ do
15:   M2X$(v_\delta, \ell_1, (a_j, a_{2^{d_v}+j}, \ldots, a_{2^{d_v}(\ell_1-1)+j}))$
16: for $i = 0, \ldots, \ell_1 - 1$ do
17:   M2X$(v_\alpha, 2^{d_v}, (a_{2^{d_v} i}, a_{2^{d_v} i+1}, \ldots, a_{2^{d_v}(i+1)-1}))$
18: M2X$(v_\alpha, \ell_0, (a_{2^{d_v}\ell_1}, a_{2^{d_v}\ell_1+1}, \ldots, a_{\ell-1}))$

when InverseTaylorExpansion is called in Line 18. Thus, the algorithm produces the correct output if $\ell > 2^{d_v}$. Hence, for internal $v \in V$, if the algorithm produces the correct output whenever $v_\alpha$ or $v_\delta$ is given as an input, then it produces the correct output whenever $v$ and $\ell \in \{1, \ldots, 2^{n_v}\}$ are given as inputs. $\square$

Algorithm 8 requires the precomputation and storage of the elements $\beta_{v_\delta,0}$, while their inverses are required for Algorithm 7. Consequently, the algorithms require auxiliary storage for $O(n)$ field elements, while all precomputations can be performed with $O(n)$ field operations.
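When $\beta$ is a Cantor basis, the twisting constants are all equal to one and Algorithms 7 and 8 reduce to recursive calls combined with (inverse) generalised Taylor expansions. The following self-contained Python sketch mirrors this multiplication-free case over $\mathbb{F}_2$ for full length $\ell = 2^n$, using the reduction tree with $d_v = 2^{\lceil\log_2 n_v\rceil - 1}$. The function names are ours, and the sketch is an illustration of the structure of the algorithms under these assumptions, not a drop-in implementation for general bases.

```python
def _taylor(t, a, forward):
    # Generalised Taylor expansion at x^t - x over GF(2) (forward=True),
    # or its inverse (forward=False), in place on the bit list a.
    l = len(a)
    levels = ((l + t - 1) // t - 1).bit_length()
    for k in (range(levels - 1, -1, -1) if forward else range(levels)):
        block, half = (1 << (k + 1)) * t, (1 << k) * t
        spans = [(block * i, half) for i in range(l // block)]
        spans.append((block * (l // block), max(l % block - half, 0)))
        for base, width in spans:
            for j in (range(width - 1, -1, -1) if forward else range(width)):
                a[base + (1 << k) + j] ^= a[base + half + j]
    return a


def m2x(a, n):
    # Monomial -> LCH coefficients (sketch of Algorithm 8, Cantor-basis case)
    # for a vector of length 2**n of GF(2) coefficients.
    if n <= 1:
        return a                                # X_0 = 1 and X_1 = x
    d = 1 << ((n - 1).bit_length() - 1)         # d_v = 2^(ceil(log2 n) - 1)
    t = 1 << d
    _taylor(t, a, forward=True)                 # Taylor expansion at x^(2^d) - x
    for j in range(t):                          # recurse on strided columns (v_delta)
        a[j::t] = m2x(a[j::t], n - d)
    for i in range(1 << (n - d)):               # recurse on contiguous rows (v_alpha)
        a[t * i:t * (i + 1)] = m2x(a[t * i:t * (i + 1)], d)
    return a


def x2m(a, n):
    # LCH -> monomial coefficients (sketch of Algorithm 7): inverts m2x by
    # applying the inverse steps in the opposite order.
    if n <= 1:
        return a
    d = 1 << ((n - 1).bit_length() - 1)
    t = 1 << d
    for i in range(1 << (n - d)):
        a[t * i:t * (i + 1)] = x2m(a[t * i:t * (i + 1)], d)
    for j in range(t):
        a[j::t] = x2m(a[j::t], n - d)
    _taylor(t, a, forward=False)                # inverse Taylor expansion
    return a
```

Under this all-ones normalisation one finds, for $n = 2$, that $X_2(x) = x^2 + x$ and $X_3(x) = x^3 + x^2$, and the roundtrip x2m(m2x(a, n), n) == a holds for every input, which gives a cheap internal consistency check on the recursion.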
If $\ell_1 = \lceil\ell/2^{d_v}\rceil - 1 \neq 0$, then Lines 10–17 of Algorithm 7, or Lines 4–11 of Algorithm 8, perform at most $(\ell_1 - 1)\left(2^{d_v} + 1\right) + \ell_0 = \ell + \lceil\ell/2^{d_v}\rceil - 2^{d_v} - 2$ multiplications, while TaylorExpansion or InverseTaylorExpansion performs at most $\lfloor\ell/2\rfloor\left\lceil\log_2\lceil\ell/2^{d_v}\rceil\right\rceil$ additions. It follows that we should once again aim to avoid small values of $d_v$ when choosing a reduction tree for the algorithms. However, compared to the algorithms for conversion between the LCH and the Newton and Lagrange bases, a much greater cost in terms of multiplications and additions is incurred if one fails to do so. If $\beta$ is a Cantor basis, then $\beta_{v_\delta,0} = 1$ for all internal $v \in V$, regardless of the choice of reduction tree (see Remark 3.1). Thus, Algorithms 7 and 8 perform no multiplications in this case, and require no precomputations.

Lin et al. [20] provide two algorithms for converting from the monomial basis to the LCH basis when $\ell$ is a power of two, one for arbitrary bases and one for Cantor bases. The reduction strategy they apply for the arbitrary bases corresponds to reduction trees with $\operatorname{Im}(d) \subseteq \{1, 2\}$. For such reduction trees, Algorithm 8 performs the same number of additions as their algorithm, but fewer multiplications in the recursive case (after equalising precomputations). For Cantor bases, Algorithm 8 reduces to their algorithm by choosing the reduction tree so that $d_v = 2^{\lceil\log_2 n_v\rceil - 1}$ for all internal $v \in V$.

Theorem 3.27.
Algorithms 7 and 8 perform at most ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 multiplications and ⌊ℓ/2⌋ \binom{⌈log₂ ℓ⌉ + 1}{2} additions in F. If d_v = 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V, then the algorithms perform at most ⌊ℓ/2⌋⌈log₂ ℓ⌉⌈log₂ log₂ max(ℓ, 4)⌉ additions in F. If β is a Cantor basis, then the algorithms perform no multiplications.

We have already shown that Algorithms 7 and 8 perform no multiplications when β is a Cantor basis. We split the remainder of the proof of Theorem 3.27 into three lemmas, one for each of the three remaining bounds. It is clear that Algorithms 7 and 8 perform the same number of multiplications when given identical inputs. Consequently, we only prove the bounds for Algorithm 7. All three bounds are equal to zero or one for ℓ ≤ 2, while Algorithm 7 performs no additions or multiplications for such input values of ℓ. In particular, it follows that all three bounds hold if the input vertex is a leaf. Consequently, for each of the three bounds it is sufficient to show that if v ∈ V is an internal vertex such that the bound holds whenever the input vertex is v_α or v_δ, then the bound holds whenever the input vertex is v and ℓ ∈ {1, . . . , 2^{n_v}}.

Lemma 3.28.
Algorithms 7 and 8 perform at most ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 multiplications in F.

Proof. Suppose that for some internal vertex v ∈ V, Algorithm 7 performs at most ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 multiplications in F whenever v_α or v_δ is given as the input vertex. Furthermore, suppose that v and ℓ ∈ {1, . . . , 2^{n_v}} are given as inputs to the algorithm. If ℓ_1 = 0, then ℓ_2 = ℓ′ = ℓ, and the algorithm performs at most

⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 + ℓ_2 × 0 = ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1

multiplications. Therefore, suppose that ℓ_1 ≠ 0. Then, as ℓ_2 ≤ 2^{d_v}, Lines 3–5 of the algorithm perform at most

ℓ_1 (2^{d_v−1}(3d_v − 4) + 1) + ⌊ℓ_2/2⌋(3⌈log₂ ℓ_2⌉ − 4) + 1 ≤ ⌊ℓ/2⌋(3d_v − 4) + ℓ_1 + 1

multiplications. Lines 6–9 perform at most

ℓ_2 ⌊(ℓ_1 + 1)/2⌋(3⌈log₂(ℓ_1 + 1)⌉ − 4) + (2^{d_v} − ℓ_2)⌊ℓ_1/2⌋(3⌈log₂ ℓ_1⌉ − 4) + 2^{d_v}
  ≤ ⌊(ℓ_2(ℓ_1 + 1) + (2^{d_v} − ℓ_2)ℓ_1)/2⌋(3⌈log₂(ℓ_1 + 1)⌉ − 4) + 2^{d_v}
  = ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 3d_v − 4) + 2^{d_v}

multiplications, Lines 10–17 perform (ℓ_1 − 1)(2^{d_v} + 1) + ℓ_2 multiplications, and Line 18 performs no multiplications. Summing these bounds, it follows that Algorithm 7 performs at most

⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 − 4⌊ℓ/2⌋ + (2^{d_v} + 2)ℓ_1 + ℓ_2 − 1
  ≤ ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 − 2ℓ + (2^{d_v} + 2)ℓ_1 + ℓ_2 + 1
  = ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1 − (2^{d_v} − 2)ℓ_1 − (ℓ_2 − 1)
  ≤ ⌊ℓ/2⌋(3⌈log₂ ℓ⌉ − 4) + 1

multiplications. ∎
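The bookkeeping in the inductive step above can be sanity-checked numerically. In the sketch below (helper names are ours), B is the claimed bound ⌊x/2⌋(3⌈log₂ x⌉ − 4) + 1, extended by B(1) = 1 as in the proof's accounting, and inductive_step totals the per-line bounds used for a vertex with block size 2^d.

```python
from math import ceil, log2

def B(x):
    """Claimed multiplication bound: floor(x/2)(3*ceil(log2 x) - 4) + 1."""
    return (x // 2) * (3 * ceil(log2(x)) - 4) + 1 if x >= 2 else x

def inductive_step(l, d):
    """Sum of the per-line bounds from the proof of Lemma 3.28 at a vertex
    with block size 2^d, for an input length l > 2^d (so that l_1 != 0)."""
    l1 = -(-l // 2 ** d) - 1                  # ceil(l / 2^d) - 1
    l2 = l - 2 ** d * l1
    lines_3_5 = l1 * B(2 ** d) + B(l2)        # recursive calls on v_alpha
    lines_6_9 = l2 * B(l1 + 1) + (2 ** d - l2) * B(l1)  # calls on v_delta
    lines_10_17 = (l1 - 1) * (2 ** d + 1) + l2          # multiplications
    return lines_3_5 + lines_6_9 + lines_10_17
```

Exhaustively checking small parameters confirms that the total never exceeds B(ℓ), and that the bound is attained (e.g. at d = 1, ℓ = 63), so the constant in Theorem 3.27 cannot be improved by this argument alone.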
Lemma 3.29.
Algorithms 7 and 8 perform at most ⌊ℓ/2⌋ \binom{⌈log₂ ℓ⌉ + 1}{2} additions in F.

Proof. Suppose that for some internal vertex v ∈ V, Algorithm 7 performs at most ⌊ℓ/2⌋ \binom{⌈log₂ ℓ⌉ + 1}{2} additions in F whenever v_α or v_δ is given as the input vertex. Furthermore, suppose that v and ℓ ∈ {1, . . . , 2^{n_v}} are given as inputs to the algorithm. If ℓ_1 = 0, then ℓ_2 = ℓ′ = ℓ, and the algorithm performs at most

⌊ℓ/2⌋ \binom{⌈log₂ ℓ⌉ + 1}{2} + ℓ_2 × 0 = ⌊ℓ/2⌋ \binom{⌈log₂ ℓ⌉ + 1}{2}

additions. Therefore, suppose that ℓ_1 > 0. Then, as ℓ_2 ≤ 2^{d_v}, Lines 3–5 of the algorithm perform at most

ℓ_1 2^{d_v−1} \binom{d_v + 1}{2} + ⌊ℓ_2/2⌋ \binom{⌈log₂ ℓ_2⌉ + 1}{2} ≤ ℓ_1 2^{d_v−1} \binom{d_v + 1}{2} + ⌊ℓ_2/2⌋ \binom{d_v + 1}{2} = ⌊ℓ/2⌋ \binom{d_v + 1}{2}

additions. Lines 6–9 of the algorithm perform at most

ℓ_2 ⌊(ℓ_1 + 1)/2⌋ \binom{⌈log₂(ℓ_1 + 1)⌉ + 1}{2} + (2^{d_v} − ℓ_2)⌊ℓ_1/2⌋ \binom{⌈log₂ ℓ_1⌉ + 1}{2} ≤ ⌊ℓ/2⌋ \binom{⌈log₂(ℓ_1 + 1)⌉ + 1}{2}

additions, since ℓ_2(ℓ_1 + 1) + (2^{d_v} − ℓ_2)ℓ_1 = ℓ. Lines 10–17 perform no additions, while Lemma 3.24 implies that Line 18 performs at most ⌊ℓ/2⌋⌈log₂⌈ℓ/2^{d_v}⌉⌉ = ⌊ℓ/2⌋⌈log₂(ℓ_1 + 1)⌉ additions. As ⌈log₂(ℓ_1 + 1)⌉ = ⌈log₂ ℓ⌉ − d_v, it follows by summing these bounds that Algorithm 7 performs at most

⌊ℓ/2⌋ ( \binom{⌈log₂ ℓ⌉ + 1}{2} − ⌈log₂(ℓ_1 + 1)⌉(d_v − 1) ) ≤ ⌊ℓ/2⌋ \binom{⌈log₂ ℓ⌉ + 1}{2}

additions. ∎

Lemma 3.30.
Suppose that d_v = 2^{⌈log₂ n_v⌉−1} for all internal v ∈ V. Then Algorithms 7 and 8 perform at most ⌊ℓ/2⌋⌈log₂ ℓ⌉⌈log₂ log₂ max(ℓ, 4)⌉ additions in F.

Proof. Suppose that d_v = 2^{⌈log₂ n_v⌉−1} for all internal vertices v ∈ V. Furthermore, suppose that for some internal vertex v ∈ V, Algorithm 7 performs at most ⌊ℓ/2⌋⌈log₂ ℓ⌉⌈log₂ log₂ max(ℓ, 4)⌉ additions in F whenever v_α or v_δ is given as the input vertex. Finally, suppose that v and ℓ ∈ {1, . . . , 2^{n_v}} are given as inputs to the algorithm. If ℓ_1 = 0, then ℓ_2 = ℓ′ = ℓ, and the algorithm performs at most

⌊ℓ/2⌋⌈log₂ ℓ⌉⌈log₂ log₂ max(ℓ, 4)⌉ + ℓ_2 × 0 = ⌊ℓ/2⌋⌈log₂ ℓ⌉⌈log₂ log₂ max(ℓ, 4)⌉

additions. Therefore, suppose that ℓ_1 > 0. Then, as ℓ_2 ≤ 2^{d_v} < ℓ, Lines 3–5 of the algorithm perform at most

(3.27)  ℓ_1 2^{d_v−1} d_v ⌈log₂ log₂ ℓ⌉ + ⌊ℓ_2/2⌋ d_v ⌈log₂ log₂ ℓ⌉ = ⌊ℓ/2⌋ d_v ⌈log₂ log₂ ℓ⌉

additions. Lines 6–7 of the algorithm perform at most

ℓ_2 ⌊(ℓ_1 + 1)/2⌋⌈log₂(ℓ_1 + 1)⌉⌈log₂ log₂(ℓ_1 + 1)⌉

additions, while Lines 8–9 perform at most

(2^{d_v} − ℓ_2)⌊ℓ_1/2⌋⌈log₂ ℓ_1⌉⌈log₂ log₂ max(ℓ_1, 4)⌉

additions. As ℓ_2(ℓ_1 + 1) + (2^{d_v} − ℓ_2)ℓ_1 = ℓ, it follows that Lines 6–9 of the algorithm perform at most ⌊ℓ/2⌋⌈log₂(ℓ_1 + 1)⌉⌈log₂ log₂(ℓ_1 + 1)⌉ additions. If ℓ_1 ≥ 2, then there exists an integer k ≥ 1 such that 2^{2^{k−1}} < ℓ_1 + 1 ≤ 2^{2^k}. Then ⌈log₂ log₂(ℓ_1 + 1)⌉ = k, 2^{k−1} < ⌈log₂(ℓ_1 + 1)⌉ ≤ n_v − d_v ≤ d_v and ℓ = 2^{d_v}(ℓ_1 + ℓ_2/2^{d_v}) > 2^{d_v + 2^{k−1}} ≥ 2^{2^k}. Thus, ⌈log₂ log₂(ℓ_1 + 1)⌉ ≤ ⌈log₂ log₂ ℓ⌉ − 1 if ℓ_1 ≥ 2. As ℓ ≥ 3, the inequality also holds if ℓ_1 = 1. Therefore, Lines 6–9 of the algorithm perform at most

(3.28)  ⌊ℓ/2⌋(⌈log₂ ℓ⌉ − d_v)⌈log₂ log₂ ℓ⌉ − ⌊ℓ/2⌋⌈log₂(ℓ_1 + 1)⌉

additions. Lines 10–17 of the algorithm perform no additions, while Lemma 3.24 implies that Line 18 performs at most ⌊ℓ/2⌋⌈log₂⌈ℓ/2^{d_v}⌉⌉ = ⌊ℓ/2⌋⌈log₂(ℓ_1 + 1)⌉ additions. By combining this last bound with the bounds (3.27) and (3.28) on the number of additions performed by Lines 3–5 and Lines 6–9, it follows that Algorithm 7 performs at most ⌊ℓ/2⌋⌈log₂ ℓ⌉⌈log₂ log₂ ℓ⌉ additions, which is the required bound since ℓ ≥ 3. ∎

Constructing a basis and reduction tree
Let β = (β_0, . . . , β_{n−1}) ∈ F^n have entries that are linearly independent over F_2. If n = 1, then there exists a unique reduction tree for β, the tree consisting of a single vertex. If n > 1, then a full binary tree is a reduction tree for β if it has n leaves, the subtrees rooted on the children r_α and r_δ of the tree's root vertex r are themselves reduction trees for α(β, d(r)) and δ(β, d(r)), respectively, and the quotients β_1/β_0, . . . , β_{d(r)−1}/β_0 belong to F_{2^{d(r)}}. The requirement on the quotients is trivially satisfied if d(r) = 1. Consequently, the full binary tree with n leaves and Im(d) ⊆ {0, 1} is a reduction tree for β (Proposition 2.7). We view this tree as the trivial choice of reduction tree for the basis, and as capturing the approach used by existing algorithms. We expect such trees to yield the worst algebraic complexity for the algorithms of Section 3. Accordingly, when we have freedom to choose the basis vector, our choice should enable us to avoid their use. Cantor bases provide such a choice, and we witnessed the benefits they provide in Section 3. However, Cantor bases are restricted to extensions with degree divisible by sufficiently large powers of two. In this section, we propose new basis constructions that allow us to benefit similarly in other extensions.

Regardless of the chosen basis vector, the choice of reduction trees is limited by the subfield structure of F.
Proposition 4.1. If (V, E) is a reduction tree for some vector in F^n, then d(v_α) < d(v) < n ≤ [F : F_2] and max(d(v_α), 1) | d(v) | [F : F_2] for all internal v ∈ V.

Proof. Suppose that (V, E) is a reduction tree for some vector β ∈ F^n. Then n ≤ [F : F_2], since Definition 2.6 requires β to have linearly independent entries over F_2. The definition also requires the tree to have n leaves. Thus,

d(v_α) < |L_{v_α}| = d(v) < |L_v| ≤ n ≤ [F : F_2]

for all internal v ∈ V.

Let v ∈ V be an internal vertex. Then it follows from Definition 2.6 that the subtree rooted on v is a reduction tree for some vector β_v = (β_{v,0}, . . . , β_{v,|L_v|−1}) ∈ F^{|L_v|} that has linearly independent entries over F_2. Moreover, as |L_v| > 1, the definition implies that the quotients β_{v,1}/β_{v,0}, . . . , β_{v,d(v)−1}/β_{v,0} belong to F_{2^{d(v)}}. Together with β_{v,0}/β_{v,0} = 1, these quotients inherit linear independence over F_2. Thus, they form a basis of the extension F_{2^{d(v)}}/F_2. As the quotients also belong to F, it follows that F_{2^{d(v)}} is a subfield of F. Therefore, d(v) divides [F : F_2]. Similarly, if v_α is an internal vertex, then F_{2^{d(v_α)}} is a subfield of F_{2^{d(v)}}, since the subtree rooted on v_α is a reduction tree for α(β_v, d(v)) = (β_{v,0}, . . . , β_{v,d(v)−1}). As v_α is an internal vertex if and only if d(v_α) ≥ 1, it follows that max(d(v_α), 1) divides d(v). ∎

Corollary 4.2.
Suppose that the entries of β ∈ F^n are linearly independent over F_2, and that [F : F_2] has no proper factor less than n. Then a full binary tree is a reduction tree for β if and only if it has n leaves and Im(d) ⊆ {0, 1}.

Proof. Proposition 2.7 implies that it is sufficient to have n leaves and Im(d) ⊆ {0, 1}, while Proposition 4.1 implies that it is also necessary. ∎

A construction for arbitrary fields.
It follows from Proposition 4.1 that a path in a reduction tree that consists of two or more edges of the form {v, v_α} admits a nontrivial tower of subfields of F. However, the existence of a basis vector of a prescribed dimension that has a nontrivial reduction tree is not guaranteed by the existence of a nontrivial tower of subfields. Indeed, Corollary 4.2 shows that it is necessary for the tower to contain a subfield other than F_2 of degree bounded by the dimension. In this section, we show that this requirement is also sufficient.

Theorem 4.3.
Suppose there exists a tower of subfields

(4.1)  F_2 = F_{2^{n_0}} ⊂ F_{2^{n_1}} ⊂ · · · ⊂ F_{2^{n_m}} = F.

Let {ϑ_{k,0}, . . . , ϑ_{k,n_{k+1}/n_k−1}} be a basis of F_{2^{n_{k+1}}}/F_{2^{n_k}} for k ∈ {0, . . . , m − 1}, and

β_i = ∏_{k=0}^{m−1} ϑ_{k,i_k}  such that  ∑_{k=0}^{m−1} i_k n_k = i  with  i_k ∈ {0, . . . , n_{k+1}/n_k − 1},

for i ∈ {0, . . . , n_m − 1}. Then β_0, . . . , β_{n_m−1} ∈ F are linearly independent over F_2. Moreover, a full binary tree (V, E) with n ≤ n_m leaves is a reduction tree for (β_0, . . . , β_{n−1}) if Im(d) ⊆ {0, n_0, . . . , n_{m−1}} and d(v_δ) ≤ d(v) for all internal v ∈ V.

The requirements of Theorem 4.3 are satisfied by the full binary tree with n leaves and Im(d) ⊆ {0, 1}. Consequently, Proposition 2.7 follows from the case m = 1. We delay the proof of the theorem until the end of the section. Instead, we now show that the basis vectors given by the construction of the theorem allow a nontrivial and, more importantly, beneficial choice of reduction trees.

Proposition 4.4. If I ⊆ N and (V, E) is a full binary tree such that d(v) = max{i ∈ I | i < |L_v|} for all internal v ∈ V, then d(v_δ) ≤ d(v) for all internal v ∈ V.

Proof. Suppose that I ⊆ N and a full binary tree (V, E) satisfy the conditions of the proposition. Then d(v_δ) < |L_{v_δ}| < |L_v| and d(v_δ) ∈ I ∪ {0} for all internal v ∈ V. Hence, d(v_δ) ≤ max{i ∈ I | i < |L_v|} = d(v) for all internal v ∈ V. ∎

For tuples of positive integers (n_0, . . . , n_m) such that (4.1) holds, let T_n^{(n_0,...,n_m)} denote the full binary tree with n leaves and d(v) = max{n_k | n_k < |L_v|} for all internal vertices v. If n_1 < n ≤ n_m, then the root vertex r of T_n^{(n_0,...,n_m)} satisfies d(r) ≥ n_1 > 1, establishing the existence of a tree with Im(d) ⊄ {0, 1} that satisfies the conditions of Theorem 4.3. Moreover, we expect this tree to approximately minimise the algebraic complexity of the conversion algorithms of Section 3 over all trees that satisfy the conditions of the theorem.

Example 4.5.
Suppose that F = F_{2^{12}}. Then there are eight tuples of positive integers (n_0, . . . , n_m) such that (4.1) holds. For each such tuple, Figure 6 displays the relative number of additions performed by the basis conversion algorithms of Section 3 for β = (β_0, . . . , β_{11}) given by the construction of Theorem 4.3 (the choice of the bases for the extensions F_{2^{n_{k+1}}}/F_{2^{n_k}} does not matter here), the reduction tree T_{12}^{(n_0,...,n_m)}, and polynomial length ℓ ranging over {1, . . . , 2^{12}}. The number of additions performed in each case is given as a fraction of the number performed for the tuple (1, 12), since the corresponding tree has Im(d) = {0, 1}. Thus, it represents the complexity obtained with the reduction strategy of existing algorithms. The additional parameters c = ℓ and b = 0 are used for Algorithm 3, and c = ℓ is used for Algorithm 4. Figure 7 similarly displays the relative number of multiplications performed by Algorithms 7 and 8, under the assumption that β_{v_δ,0} is never equal to one in Line 10 of Algorithm 7 and Line 4 of Algorithm 8. The daggered tuples that appear in the figure are discussed in the next section.

Example 4.5 demonstrates that the construction of Theorem 4.3, and the choice of reduction trees it provides, allows us to achieve a lower algebraic complexity if [F : F_2] has even a single sufficiently small factor. The potential benefits are greater still when the degree of the field contains many small prime factors, echoing the benefits obtained by using roots of unity with smooth order in multiplicative FFTs. While a reduction in algebraic complexity is certainly desirable, it is not the only consideration in practice. For example, Harvey's "cache-friendly" variant [15] of the radix-2 truncated Fourier transform [25] obtains better practical performance by optimising cache effects. This variant employs a reduction strategy that more rapidly reduces to problems of a size that fits into cache, helping to reduce data exchanges with RAM. An analogous approach for the algorithms of Section 3 is to use reduction trees that balance the sizes of |L_{v_α}| and |L_{v_δ}| for internal vertices. If [F : F_2] is smooth and the construction of Theorem 4.3 is applied with the number of subfields in the tower taken as large as possible, then the following proposition shows that it is possible to construct such trees while meeting the requirements of
Figure 6.
Relative number of additions performed by the algorithms of Section 3 (see Example 4.5).

the theorem by choosing d(v) ∈ {n_0, . . . , n_{m−1}} to minimise max(n_k, |L_v| − n_k) for all internal vertices. While there are trade-offs between optimising cache effects and reducing the operation count to be considered in practice, it appears that Theorem 4.3 offers us some freedom, especially when the degree of the field is smooth, to tune the algorithms of Section 3.

Proposition 4.6. If I ⊆ N and (V, E) is a full binary tree such that d(v) ∈ arg min_{i∈I} max(i, |L_v| − i) for all internal v ∈ V, then d(v_δ) ≤ d(v) for all internal v ∈ V.

Proof. We prove the proposition by contradiction. Suppose that I ⊆ N and a full binary tree (V, E) satisfy the conditions of the proposition. Furthermore, suppose there exists an internal vertex v ∈ V such that d(v_δ) > d(v). Then v_δ is an internal

Figure 7.
Relative number of multiplications performed by Algorithms 7 and 8 (see Example 4.5).

vertex, since d(v_δ) > 1. Thus, d(v_δ) ∈ I and

max(d(v_δ), |L_v| − d(v_δ)) < max(|L_{v_δ}|, |L_v| − d(v)) = max(|L_v| − |L_{v_α}|, |L_v| − d(v)) = max(|L_v| − d(v), |L_v| − d(v)) ≤ max(d(v), |L_v| − d(v)),

which contradicts the minimality of max(d(v), |L_v| − d(v)). ∎

We now turn our attention to the proof of Theorem 4.3, which we obtain as a consequence of the following more general result.
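Before that proof, note that the two tree-shaping rules just described — taking d(v) as large as possible below |L_v| (Proposition 4.4) or balancing the two subtrees (Proposition 4.6) — are easy to prototype. The sketch below uses a hypothetical nested-tuple representation of full binary trees; only leaf counts and the function d are modelled, not the associated basis vectors.

```python
def build_tree(n, sizes, rule):
    """Full binary tree with n leaves; d(v) = |L_{v_alpha}| is drawn from
    `sizes` (e.g. a divisor chain n_0, ..., n_{m-1}) according to `rule`."""
    if n == 1:
        return ('leaf',)
    candidates = [s for s in sizes if 0 < s < n]
    if rule == 'greedy':                       # rule of Proposition 4.4
        d = max(candidates)
    else:                                      # balanced rule of Proposition 4.6
        d = min(candidates, key=lambda s: max(s, n - s))
    return ('node', d, build_tree(d, sizes, rule), build_tree(n - d, sizes, rule))

def check_delta_monotone(tree):
    """Verify d(v_delta) <= d(v) for every internal vertex (leaves have d = 0)."""
    if tree[0] == 'leaf':
        return True
    _, d, left, right = tree
    d_right = right[1] if right[0] == 'node' else 0
    return d_right <= d and check_delta_monotone(left) and check_delta_monotone(right)
```

Running both rules over small leaf counts confirms empirically that each produces trees satisfying the condition d(v_δ) ≤ d(v) required by Theorem 4.3.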
Lemma 4.7.
Suppose there exists a tower of subfields F_2 = F_{2^{n_0}} ⊂ · · · ⊂ F_{2^{n_m}} = F. Let β = (β_0, . . . , β_{n−1}) ∈ F^n have entries that are linearly independent over F_2, and let (V, E) be a full binary tree with n leaves and root vertex r ∈ V. Then (V, E) is a reduction tree for β if the following conditions are satisfied:

(1) Im(d) ⊆ {0, n_0, . . . , n_{m−1}},
(2) d(v_δ) ≤ d(v) for all internal v ∈ V, and
(3) β_i/β_{n_k⌊i/n_k⌋} ∈ F_{2^{n_k}} for i ∈ {0, . . . , n − 1} and k ∈ {0, . . . , m − 1} such that n_k ≤ d(r).

Proof. We prove the lemma by induction on n. The lemma holds trivially if n = 1, since it is sufficient for (V, E) to have n leaves in this case. Therefore, let n ≥ 2, and suppose that the lemma holds for all smaller values of n. Suppose that β = (β_0, . . . , β_{n−1}) ∈ F^n and a full binary tree (V, E) satisfy the conditions of the lemma. Then the root vertex r ∈ V of the tree is not a leaf, since |L_r| = n ≥ 2. Moreover, (1) implies that d(r) = n_ℓ for some ℓ ∈ {0, . . . , m − 1} such that n_ℓ < n.

Let α(β, d(r)) = (α_0, . . . , α_{n_ℓ−1}) and δ(β, d(r)) = (δ_0, . . . , δ_{n−n_ℓ−1}), which have linearly independent entries over F_2 by Lemma 2.1. Then, as d(r_α) < d(r), (3) implies that α_i/α_{n_k⌊i/n_k⌋} = β_i/β_{n_k⌊i/n_k⌋} ∈ F_{2^{n_k}} for i ∈ {0, . . . , n_ℓ − 1} and k ∈ {0, . . . , m − 1} such that n_k ≤ d(r_α). Moreover, as n_0, . . . , n_ℓ divide n_ℓ, we have

β_{n_ℓ+i}/β_0 = (β_{n_ℓ+i}/β_{n_ℓ+n_k⌊i/n_k⌋})(β_{n_ℓ+n_k⌊i/n_k⌋}/β_0) = (β_{n_ℓ+i}/β_{n_k⌊(n_ℓ+i)/n_k⌋})(β_{n_ℓ+n_k⌊i/n_k⌋}/β_0),

where β_{n_ℓ+i}/β_{n_k⌊(n_ℓ+i)/n_k⌋} ∈ F_{2^{n_k}} ⊆ F_{2^{n_ℓ}}, for i ∈ {0, . . . , n − n_ℓ − 1} and k ∈ {0, . . . , ℓ}. It follows that

δ_i = (β_{n_ℓ+i}/β_{n_k⌊(n_ℓ+i)/n_k⌋}) ((β_{n_ℓ+n_k⌊i/n_k⌋}/β_0)^{2^{n_ℓ}} − β_{n_ℓ+n_k⌊i/n_k⌋}/β_0) = (β_{n_ℓ+i}/β_{n_k⌊(n_ℓ+i)/n_k⌋}) δ_{n_k⌊i/n_k⌋}

for i ∈ {0, . . . , n − n_ℓ − 1} and k ∈ {0, . . . , ℓ}. As d(r_δ) ≤ d(r) = n_ℓ by (2), it follows that δ_i/δ_{n_k⌊i/n_k⌋} ∈ F_{2^{n_k}} for i ∈ {0, . . . , n − n_ℓ − 1} and k ∈ {0, . . . , m − 1} such that n_k ≤ d(r_δ). Conditions (1) and (2) are satisfied by the subtrees of (V, E) rooted on r_α and r_δ through inheritance. The subtree rooted on r_α has |L_{r_α}| = d(r) = n_ℓ < n leaves, while the subtree rooted on r_δ has |L_{r_δ}| = |L_r| − |L_{r_α}| = n − n_ℓ < n leaves. Therefore, the induction hypothesis implies that the subtree rooted on r_α is a reduction tree for α(β, d(r)), and the subtree rooted on r_δ is a reduction tree for δ(β, d(r)). Finally, (3) implies that β_i/β_0 = β_i/β_{n_ℓ⌊i/n_ℓ⌋} ∈ F_{2^{d(r)}} for i ∈ {0, . . . , d(r) − 1}. Hence, (V, E) is a reduction tree for β. ∎
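The product defining β_i in Theorem 4.3 is driven by the mixed-radix expansion i = ∑_k i_k n_k. A small sketch (function names are our own; the field elements ϑ_{k,i_k} are left symbolic as index pairs) makes the bookkeeping concrete:

```python
def mixed_radix_digits(i, chain):
    """Digits (i_0, ..., i_{m-1}) of i with respect to the divisor chain
    (n_0, ..., n_m), n_0 = 1: i = sum(i_k * n_k) with
    0 <= i_k < n_{k+1}/n_k, as in the construction of Theorem 4.3."""
    digits = []
    for k in range(len(chain) - 1):
        radix = chain[k + 1] // chain[k]
        digits.append((i // chain[k]) % radix)
    return tuple(digits)

def beta_factors(i, chain):
    """beta_i = prod_k theta[k][i_k], returned symbolically as (k, i_k) pairs."""
    return tuple((k, ik) for k, ik in enumerate(mixed_radix_digits(i, chain)))
```

Since the digit map is a bijection onto the product of the digit ranges, the n_m products β_i are pairwise distinct monomials in the ϑ_{k,j}, which is the combinatorial backbone of the linear-independence claim proved in Lemma 4.8 below.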
4, then properties (1)and (2) of Lemma 2.9 imply that β ∈ F \ F . Thus, β /β does not belong to F ,since β = β /β + 1. Consequently, Proposition 2.8 implies that the converseof Lemma 4.7 does not hold. We now complete the proof of Theorem 4.3 byestablishing linear independence and showing that the vectors ( β , . . . , β n − ) for n ∈ { , . . . , n m } always satisfy the third condition of Lemma 4.7. Lemma 4.8.
Suppose that β_0, . . . , β_{n_m−1} are given by the construction of Theorem 4.3. Then β_0, . . . , β_{n_m−1} are linearly independent over F_2, and β_i/β_{n_k⌊i/n_k⌋} ∈ F_{2^{n_k}} for i ∈ {0, . . . , n_m − 1} and k ∈ {0, . . . , m − 1}.

Proof. Suppose that β_0, . . . , β_{n_m−1} are given by the construction of Theorem 4.3. Let i ∈ {0, . . . , n_m − 1} and i = ∑_{k=0}^{m−1} i_k n_k with i_k ∈ {0, . . . , n_{k+1}/n_k − 1} for k ∈ {0, . . . , m − 1}. Then

β_i/β_{n_k⌊i/n_k⌋} = (ϑ_{0,i_0} · · · ϑ_{k−1,i_{k−1}} ϑ_{k,i_k} · · · ϑ_{m−1,i_{m−1}})/(ϑ_{0,0} · · · ϑ_{k−1,0} ϑ_{k,i_k} · · · ϑ_{m−1,i_{m−1}}) = (ϑ_{0,i_0} · · · ϑ_{k−1,i_{k−1}})/(ϑ_{0,0} · · · ϑ_{k−1,0}) ∈ F_{2^{n_k}}

for k ∈ {0, . . . , m − 1}, as required. Similarly,

(4.2)  β_{in_k+j}/β_0 = (ϑ_{k,i}/ϑ_{k,0})(β_j/β_0)  and  β_j/β_0 = β_j/β_{n_k⌊j/n_k⌋} ∈ F_{2^{n_k}}

for i ∈ {0, . . . , n_{k+1}/n_k − 1}, j ∈ {0, . . . , n_k − 1} and k ∈ {0, . . . , m − 1}. Now {ϑ_{k,0}/ϑ_{k,0}, . . . , ϑ_{k,n_{k+1}/n_k−1}/ϑ_{k,0}} is a basis of F_{2^{n_{k+1}}}/F_{2^{n_k}} for k ∈ {0, . . . , m − 1}. Therefore, if β_0/β_0, . . . , β_{n_k−1}/β_0 are linearly independent over F_2 for some k ∈ {0, . . . , m − 1}, then (4.2) implies that β_0/β_0, . . . , β_{n_{k+1}−1}/β_0 are linearly independent over F_2. As n_0 = 1, it follows that β_0, . . . , β_{n_m−1} are linearly independent over F_2. ∎

Fewer multiplications for quadratic extensions.
Theorem 4.3 does not place any restrictions on the choice of bases for the extensions F_{2^{n_{k+1}}}/F_{2^{n_k}}. In this section, we show that if some of these extensions are quadratic, then it is possible to choose their bases so that Algorithms 7 and 8 perform fewer multiplications.

Theorem 4.9.
Let β_0, . . . , β_{n_m−1} ∈ F be constructed as per Theorem 4.3, let n ∈ {1, . . . , n_m}, and let V be the vertex set of the tree T_n^{(n_0,...,n_m)}. Recursively define vectors β_v = (β_{v,0}, . . . , β_{v,|L_v|−1}) for v ∈ V as follows: if v is the root of the tree, then β_v = (β_0, . . . , β_{n−1}); and if v is an internal vertex, then β_{v_α} = α(β_v, d(v)) and β_{v_δ} = δ(β_v, d(v)). Suppose there exists t ∈ {0, . . . , m − 1} such that

(4.3)  n_{t+1}/n_t = 2  and  Tr_{F_{2^{n_{t+1}}}/F_{2^{n_t}}}(ϑ_{t,1}/ϑ_{t,0}) = 1.

Then β_{v_δ,0} = 1 for all v ∈ V such that d(v) = n_t.

The definition of the vectors β_v in Theorem 4.9 matches that of Section 3. Consequently, for β = (β_0, . . . , β_{n−1}) given by the construction of Theorem 4.3, and the reduction tree T_n^{(n_0,...,n_m)}, Lines 10–17 of Algorithm 7 (similarly, Lines 4–11 of Algorithm 8) perform no multiplications if the input vertex v satisfies d(v) = n_t for some t such that (4.3) holds. If n_{t+1}/n_t = 2, then we may ensure that the condition on the trace in (4.3) is satisfied by taking ϑ_{t,0} = 1 and ϑ_{t,1} ∈ F_{2^{n_{t+1}}} with Tr_{F_{2^{n_{t+1}}}/F_{2^{n_t}}}(ϑ_{t,1}) = 1, which yields a basis since the trace of one is equal to zero. Therefore, it is possible to achieve a significant reduction in the number of multiplications performed by Algorithms 7 and 8 when the tower used in the construction contains several quadratic extensions. Returning to Example 4.5, we see such an improvement for F = F_{2^{12}} by looking at the relative number of multiplications performed for the daggered tuples in Figure 7. For these tuples, it is assumed that β_{v_δ,0} = 1 if and only if d(v) = n_t for some t such that n_{t+1}/n_t = 2. We now turn our attention to the proof of Theorem 4.9.

Lemma 4.10.
Assume the hypothesis and notation of Theorem 4.9. If v ∈ V is an internal vertex and d(v) = n_ℓ, then there exist elements ϑ′_{ℓ,0}, ϑ′_{ℓ,1}, . . . ∈ F_{2^{n_{ℓ+1}}} that are linearly independent over F_{2^{n_ℓ}}, for which

(4.4)  β_{v_δ,i} = (ϑ_{0,i_0}/ϑ_{0,0}) · · · (ϑ_{ℓ−1,i_{ℓ−1}}/ϑ_{ℓ−1,0}) ϑ′_{ℓ,i_ℓ}  such that  ∑_{k=0}^{ℓ} i_k n_k = i,

for i ∈ {0, . . . , |L_{v_δ}| − 1}. Furthermore,

(4.5)  ϑ′_{ℓ,0} = (ϑ_{ℓ,1}/ϑ_{ℓ,0})^{2^{n_ℓ}} − ϑ_{ℓ,1}/ϑ_{ℓ,0}

if there is no vertex u ∈ V such that u_δ = v and d(u) = n_ℓ.

Proof. Throughout the proof, if i ∈ {0, . . . , n_m − 1}, then i_0, . . . , i_{m−1} denote the coefficients of the expansion i = ∑_{k=0}^{m−1} i_k n_k such that i_k ∈ {0, . . . , n_{k+1}/n_k − 1} for k ∈ {0, . . . , m − 1}. We note in particular that {0, . . . , |L_v| − 1} ⊆ {0, . . . , n_m − 1} for v ∈ V, since |L_v| ≤ n ≤ n_m. We begin by showing that the assertions of the lemma hold for a special subset of the internal vertices of the tree, before completing the proof by induction.

Let v ∈ V be an internal vertex such that the (possibly trivial) path r = v_0, . . . , v_h = v that connects v to the root vertex r ∈ V satisfies v_i = (v_{i−1})_α for i ∈ {1, . . . , h}. Then

β_{v,i} = β_{v_{h−1},i} = · · · = β_{v_1,i} = β_{r,i} = ∏_{k=0}^{m−1} ϑ_{k,i_k}

for i ∈ {0, . . . , |L_v| − 1}. As v is an internal vertex, d(v) = max{n_k | n_k < |L_v|} by the definition of the tree T_n^{(n_0,...,n_m)}. Thus, d(v) = n_ℓ for some ℓ ∈ {0, . . . , m − 1}. Moreover, |L_{v_δ}| = |L_v| − n_ℓ < n_{ℓ+1}, since otherwise the maximality of n_ℓ is contradicted. Consequently,

(4.6)  β_{v,n_ℓ+i}/β_{v,0} = (ϑ_{0,i_0}/ϑ_{0,0}) · · · (ϑ_{ℓ−1,i_{ℓ−1}}/ϑ_{ℓ−1,0})(ϑ_{ℓ,i_ℓ+1}/ϑ_{ℓ,0})

for i ∈ {0, . . . , |L_{v_δ}| − 1}. As the quotients ϑ_{k,i_k}/ϑ_{k,0} ∈ F_{2^{n_{k+1}}} ⊆ F_{2^{n_ℓ}} for k ∈ {0, . . . , ℓ − 1}, it follows that (4.4) holds with

ϑ′_{ℓ,i} = (ϑ_{ℓ,i+1}/ϑ_{ℓ,0})^{2^{n_ℓ}} − ϑ_{ℓ,i+1}/ϑ_{ℓ,0}  for i ∈ {0, . . . , n_{ℓ+1}/n_ℓ − 2}.

Consequently, (4.5) also holds. Finally, ϑ′_{ℓ,0}, . . . , ϑ′_{ℓ,n_{ℓ+1}/n_ℓ−2} inherit linear independence over F_{2^{n_ℓ}} from ϑ_{ℓ,0}, . . . , ϑ_{ℓ,n_{ℓ+1}/n_ℓ−1}, since

∑_{i=0}^{n_{ℓ+1}/n_ℓ−2} λ_{i+1} ϑ′_{ℓ,i} = 0  if and only if  ∑_{i=1}^{n_{ℓ+1}/n_ℓ−1} λ_i (ϑ_{ℓ,i}/ϑ_{ℓ,0}) ∈ F_{2^{n_ℓ}},

for λ_1, . . . , λ_{n_{ℓ+1}/n_ℓ−1} ∈ F_{2^{n_ℓ}}.

We now proceed by induction on the depth of v, i.e., on the length h of the path r = v_0, . . . , v_h = v that connects v to the root vertex of the tree. If v ∈ V is an internal vertex of depth zero, then v is the root of the tree, which is covered by the case already proved. Therefore, let h be a positive integer, and suppose that the assertions of the lemma hold for all internal vertices with depth less than h. Let v ∈ V be an internal vertex of depth h (if no such vertex exists, then we are done) and let r = v_0, . . . , v_h = v be the path that connects v to the root vertex r ∈ V. We may assume that v_i = (v_{i−1})_δ for some i ∈ {1, . . . , h}. Let j be the maximum of all such indices. Then v_{j−1} is an internal vertex. Thus, d(v_{j−1}) = n_k for some k ∈ {0, . . . , m − 1}. Consequently, the induction hypothesis and the choice of j imply that there exists a basis {ϑ″_{k,0}, . . . , ϑ″_{k,n_{k+1}/n_k−1}} of F_{2^{n_{k+1}}}/F_{2^{n_k}} such that

(4.7)  β_{v,i} = β_{v_{h−1},i} = · · · = β_{v_j,i} = (ϑ_{0,i_0}/ϑ_{0,0}) · · · (ϑ_{k−1,i_{k−1}}/ϑ_{k−1,0}) ϑ″_{k,i_k}

for i ∈ {0, . . . , |L_v| − 1}. As v is an internal vertex, d(v) = n_ℓ for some ℓ ∈ {0, . . . , m − 1}.
Moreover, the definition of the tree T_n^{(n_0,...,n_m)} implies that

n_ℓ = max{n_t | n_t < |L_v|} ≤ max{n_t | n_t < |L_{v_{j−1}}|} = n_k,

since v is descended from v_{j−1}. If ℓ = k, then (4.7) implies that (4.4) holds with

ϑ′_{ℓ,i} = (ϑ″_{ℓ,i+1}/ϑ″_{ℓ,0})^{2^{n_ℓ}} − ϑ″_{ℓ,i+1}/ϑ″_{ℓ,0}  for i ∈ {0, . . . , n_{ℓ+1}/n_ℓ − 2},

which inherit linear independence over F_{2^{n_ℓ}} from ϑ″_{k,0}, . . . , ϑ″_{k,n_{k+1}/n_k−1}. Moreover, we must have j = h, since otherwise the maximality of j implies that v is descended from (v_j)_α, which in turn implies that n_k = n_ℓ < |L_v| ≤ d(v_j) ≤ n_k. It follows that u = v_{h−1} satisfies u_δ = v and d(u) = n_ℓ. Consequently, we are not required to show that (4.5) holds in this case.

If ℓ < k, then |L_{v_δ}| = |L_v| − n_ℓ < n_{ℓ+1} ≤ n_k, since n_ℓ = max{n_t | n_t < |L_v|}. Thus, (4.7) implies that (4.6) holds, which we have already shown to be sufficient for the two assertions of the lemma to hold. Hence, the lemma follows by induction. ∎

Remark 4.11. Lemma 4.10 implies that β_{v_δ,0} = ϑ′_{ℓ,0} lies in the subfield F_{2^{n_{ℓ+1}}}, regardless of the choice of bases used in the construction of Theorem 4.3. Consequently, if β_{v_δ,0} cannot be forced to equal one, then it may still be possible to reduce the cost of the multiplications performed in Lines 10–17 of Algorithm 7 (similarly, Lines 4–11 of Algorithm 8) by choosing the representation of the elements of F so that the cost of multiplication is reduced whenever one of the multiplicands belongs to F_{2^{n_{ℓ+1}}}. Such optimisations have previously been shown to be beneficial in practice, particularly for multiplications by elements of small subfields, by Bernstein and Chou [3] and Chen et al. [8]. These considerations also extend to the algorithms for conversion between the LCH and the Newton or Lagrange bases of Sections 3.2–3.4. For these algorithms, if the tower used in the construction of the basis contains small subfields, then Lemma 4.10 can be used to show that some of the precomputed elements ϕ_v(u, σ_{v,i}) belong to small subfields. Consequently, if the initial shift parameter λ also lies in a small subfield, as is the case when it is zero, then so too do some of the multiplicands in the base cases of the algorithms.

Proof of Theorem 4.9.
Suppose there exists t ∈ {0, . . . , m − 1} such that (4.3) holds, and there exists a vertex v ∈ V such that d(v) = n_t. If v = u_δ for some u ∈ V, then d(u) ≠ n_t, since otherwise n_{t+1} = 2n_t < n_t + |L_v| = n_t + (|L_u| − n_t) = |L_u|, contradicting the maximality of n_t. Therefore, Lemma 4.10 implies that

β_{v_δ,0} = (ϑ_{0,1}/ϑ_{0,0}) ⋯ (ϑ_{t−1,1}/ϑ_{t−1,0}) ((ϑ_{t,1}/ϑ_{t,0})^{2^{n_t}} − ϑ_{t,1}/ϑ_{t,0}) = Tr_{F_{2^{n_{t+1}}}/F_{2^{n_t}}}(ϑ_{t,1}/ϑ_{t,0}) = 1,

as required. □

Generalised Cantor basis.
Gao and Mateer propose a generic method of constructing Cantor bases in the appendix of their paper [14]. We generalise their construction to one that extends an arbitrarily chosen basis of F_{2^t}/F_2 to a basis of F_{2^{2^m t}}/F_2 that enjoys properties similar to those offered by Cantor bases. The original construction of Gao and Mateer then corresponds to the case t = 1. By generalising their construction, we are able to take advantage of quadratic extensions in a different manner to the previous section in order to provide a greater selection of reduction trees.

Hereafter, we assume that F_{q^{2^m}} ⊆ F with positive m ∈ N and q = 2^t for some positive t ∈ N. We also fix a basis β_0, . . . , β_{2^m t−1} of F_{q^{2^m}}/F_2 which is given by the following generalisation of the construction of Gao and Mateer: choose a basis {ϑ_0, . . . , ϑ_{t−1}} of F_q/F_2, choose β_{(2^m−1)t}, . . . , β_{2^m t−1} ∈ F_{q^{2^m}} such that

Tr_{F_{q^{2^m}}/F_q}(β_{(2^m−1)t+i}) = ϑ_i for i ∈ {0, . . . , t − 1},

and recursively define β_i = β_{i+t}^q − β_{i+t} for i ∈ {0, . . . , (2^m − 1)t − 1}.

[Figure 8. Construction of Theorem 4.12: the trees T_1, . . . , T_{⌈n/t⌉} are grafted onto the leaves u_1, . . . , u_{⌈n/t⌉} of the tree T_0.]

For this construction, we provide generalisations of the properties of Cantor bases given in Lemma 2.9. The properties are then used to prove the following theorem, which provides a method of constructing reduction trees for the vectors (β_0, . . . , β_{n−1}) for n ∈ {1, . . . , 2^m t}.

Theorem 4.12.
Let n ∈ {1, . . . , 2^m t}, T_0 = (V_0, E_0) be a full binary tree with ⌈n/t⌉ leaves such that Im(d) ⊆ {t, 2t, 4t, . . . , 2^{⌈log₂⌈n/t⌉⌉−1} t}, and T_i = (V_i, E_i) be a reduction tree for (ϑ_0, . . . , ϑ_{min(n−(i−1)t, t)−1}), for i ∈ {1, . . . , ⌈n/t⌉}. Let u_1, . . . , u_{⌈n/t⌉} ∈ V_0 be the leaves of T_0, ordered such that for all i, j ∈ {1, . . . , ⌈n/t⌉} with i < j, there exists an internal vertex v ∈ V_0 with u_i ∈ L_{v_α} and u_j ∈ L_{v_δ}. Construct a new tree T = (V, E) by identifying the root vertex of T_i with u_i for i ∈ {1, . . . , ⌈n/t⌉}, as shown in Figure 8. Then T is a reduction tree for (β_0, . . . , β_{n−1}).

Theorem 4.12 provides greater freedom than Theorem 4.3 by not requiring the inequality d(v_δ) ≤ d(v) to hold for v ∈ V that are initially internal vertices in the tree T_0. Proposition 2.7 guarantees the existence of trees T_1, . . . , T_{⌈n/t⌉} to use in the construction. We can of course provide a better selection for these trees if the methods of Sections 4.1 and 4.2 are used to construct the basis {ϑ_0, . . . , ϑ_{t−1}}. The remainder of the section is dedicated to the proof of Theorem 4.12.

Lemma 4.13.
The following hold:
(1) β_i = Σ_{r=0}^{j} \binom{j}{r} β_{i+jt}^{q^r} for i ∈ {0, . . . , (2^m − j)t − 1} and j ∈ {0, . . . , 2^m − 1},
(2) β_i = β_{i+2^k t}^{q^{2^k}} − β_{i+2^k t} for i ∈ {0, . . . , (2^m − 2^k)t − 1} and k ∈ {0, . . . , m − 1},
(3) β_i = ϑ_i for i ∈ {0, . . . , t − 1},
(4) β_0, . . . , β_{2^k t−1} ∈ F_{q^{2^k}} for k ∈ {0, . . . , m − 1}, and
(5) β_0, . . . , β_{2^m t−1} are linearly independent over F_2.

Our proof of Lemma 4.13 generalises arguments found in Cantor's paper [7] and the paper of Gao and Mateer [14].
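Before the proof, a minimal numerical sketch (not from the paper) may help make the construction and the lemma concrete. The code below carries out the construction in the original Gao–Mateer case t = 1, q = 2 with m = 3, so that the ambient field is GF(2^8); the defining polynomial of the field and the search for a trace-one element are arbitrary illustrative choices. It then checks properties (3), (4) and (5) for the resulting basis:

```python
M = 8
MOD = 0b100011011  # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication in GF(2^M), reduced modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r

def gf_sq(a):
    return gf_mul(a, a)

def trace(a):
    """Absolute trace from GF(2^M) to GF(2): a + a^2 + a^4 + ... + a^(2^(M-1))."""
    t, x = 0, a
    for _ in range(M):
        t ^= x
        x = gf_sq(x)
    return t

def gf2_rank(vectors):
    """Rank over GF(2) of int-encoded vectors, by row reduction on pivot bits."""
    pivots, rank = {}, 0
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                rank += 1
                break
            v ^= pivots[h]
    return rank

# Choose beta_7 with Tr(beta_7) = 1 (here {theta_0} = {1} is the basis of
# GF(2)), then set beta_i = beta_{i+1}^2 - beta_{i+1}; subtraction is XOR
# in characteristic two.
beta = [0] * M
beta[M - 1] = next(a for a in range(1, 1 << M) if trace(a) == 1)
for i in range(M - 2, -1, -1):
    beta[i] = gf_sq(beta[i + 1]) ^ beta[i + 1]

assert beta[0] == 1                      # property (3): beta_0 = theta_0 = 1
assert gf_sq(gf_sq(beta[1])) == beta[1]  # property (4): beta_0, beta_1 lie in GF(4)
assert gf2_rank(beta) == M               # property (5): a basis of GF(2^8)/GF(2)
```

Here the first assertion recovers β_0 = Tr(β_7) = 1, and the rank computation confirms that the recursion really does extend the chosen trace-one element to a basis.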
Proof.
We prove (1) by induction on j. It is clear that (1) holds if j = 0, regardless of the value of i. Therefore, suppose that (1) holds for some j ∈ {0, . . . , 2^m − 2} and each i ∈ {0, . . . , (2^m − j)t − 1}. Then

β_{i+t} = Σ_{r=0}^{j} \binom{j}{r} β_{i+t+jt}^{q^r} = Σ_{r=0}^{j} \binom{j}{r} β_{i+(j+1)t}^{q^r}

for i ∈ {0, . . . , (2^m − j − 1)t − 1}. As (2^m − j − 1)t − 1 ≤ (2^m − 1)t − 1, it follows that

β_i = β_{i+t}^q − β_{i+t} = Σ_{r=0}^{j} \binom{j}{r} (β_{i+(j+1)t}^{q^{r+1}} − β_{i+(j+1)t}^{q^r}) = β_{i+(j+1)t} + Σ_{r=1}^{j+1} (\binom{j}{r−1} + \binom{j}{r}) β_{i+(j+1)t}^{q^r} = β_{i+(j+1)t} + Σ_{r=1}^{j+1} \binom{j+1}{r} β_{i+(j+1)t}^{q^r} = Σ_{r=0}^{j+1} \binom{j+1}{r} β_{i+(j+1)t}^{q^r}

for i ∈ {0, . . . , (2^m − j − 1)t − 1}. Thus, property (1) holds.

For i, j ∈ N, Lucas' lemma [22, p. 230] (see also [13]) implies that \binom{i}{j} ≡ 1 (mod 2) if and only if [j]_k ≤ [i]_k for all k ∈ N. Using the lemma, property (2) follows from property (1) by setting j = 2^k. Similarly, Lucas' lemma and property (1) with j = (2^{m−k} − 1)2^k imply that

β_{(2^k−1)t+i} = Σ_{r=0}^{(2^{m−k}−1)2^k} \binom{(2^{m−k}−1)2^k}{r} β_{(2^m−1)t+i}^{q^r} = Σ_{r=0}^{2^{m−k}−1} β_{(2^m−1)t+i}^{q^{2^k r}} = Tr_{F_{q^{2^m}}/F_{q^{2^k}}}(β_{(2^m−1)t+i})

for i ∈ {0, . . . , t − 1} and k ∈ {0, . . . , m − 1}. Setting k = 0, property (3) follows by the choice of β_{(2^m−1)t}, . . . , β_{2^m t−1}. Moreover, the trace formula implies that β_{(2^k−1)t}, . . . , β_{2^k t−1} ∈ F_{q^{2^k}} for k ∈ {0, . . . , m − 1}, after which the recursive definition of β_0, . . . , β_{(2^m−1)t−1} implies that property (4) holds.

Property (3) implies that β_0, . . . , β_{t−1} are linearly independent over F_2, and belong to the kernel of the F_2-linear map ϕ : F → F given by ω ↦ ω^q − ω. Thus, for i ∈ {2, . . . , 2^m}, any nontrivial F_2-linear relation amongst β_0, . . . , β_{it−1}, which necessarily involves at least one of β_{(i−1)t}, . . . , β_{it−1}, translates under ϕ to a nontrivial relation amongst β_0, . . . , β_{(i−1)t−1}. It follows that property (5) holds by induction on i. □

Lemma 4.13 provides generalisations of the properties of Cantor bases given in Lemma 2.9. Lemma 4.14 below may be viewed as a partial generalisation of Proposition 2.8.
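As an aside, the parity criterion from Lucas' lemma used in the proof of Lemma 4.13 has a convenient bitwise form: \binom{i}{j} is odd exactly when (i & j) == j, that is, when every binary digit of j is at most the corresponding digit of i. A quick check in Python:

```python
from math import comb

# Lucas' lemma mod 2: binom(i, j) is odd iff each binary digit of j is at
# most the corresponding digit of i, i.e. iff (i & j) == j.
for i in range(64):
    for j in range(64):
        assert (comb(i, j) % 2 == 1) == ((i & j) == j)

# For j = 2^k, only r = 0 and r = 2^k give an odd binom(2^k, r), which is
# how property (2) follows from property (1).
assert [r for r in range(9) if comb(8, r) % 2 == 1] == [0, 8]
```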
Lemma 4.14.
Let (V, E) be a full binary tree with n ≤ 2^m t leaves that satisfies the following conditions:
(1) if v ∈ V such that |L_v| > t, then d(v) ∈ {2^k t | k < ⌈log₂⌈n/t⌉⌉}, and
(2) if v ∈ V such that |L_v| ≤ t, and v is either the root of the tree or a child of a vertex v′ ∈ V with |L_{v′}| > t, then the subtree of (V, E) rooted on v is a reduction tree for (β_0, . . . , β_{|L_v|−1}).
Then (V, E) is a reduction tree for (β_0, . . . , β_{n−1}).

The following technical lemma is required for the proof of Lemma 4.14.
Lemma 4.15.
Let μ ∈ F^n have linearly independent entries over F_2, (V, E) be a reduction tree for μ, and ω ∈ F be nonzero. Then (V, E) is a reduction tree for ωμ.

Proof. We prove the lemma by induction on n. The lemma holds trivially for n = 1. Therefore, let n ≥ 2 and suppose that the lemma is true for all smaller values of n. Let μ = (μ_0, . . . , μ_{n−1}) ∈ F^n have linearly independent entries over F_2, (V, E) be a reduction tree for μ, r ∈ V be the root vertex of the tree, and ω ∈ F be nonzero. Then μ_i/μ_0 = ωμ_i/(ωμ_0) ∈ F_{2^{d(r)}} for i ∈ {0, . . . , d(r) − 1}, the induction hypothesis implies that the subtree rooted on r_α is a reduction tree for ω α(μ, d(r)) = α(ωμ, d(r)), and the subtree rooted on r_δ is a reduction tree for δ(μ, d(r)) = δ(ωμ, d(r)). Therefore, (V, E) is a reduction tree for ωμ. Hence, the lemma follows by induction. □

Proof of Lemma 4.14.
We prove the lemma by induction on n. Condition (2) implies that the lemma holds trivially if n ≤ t. Therefore, let n ∈ {t + 1, . . . , 2^m t} and suppose that the lemma is true for all smaller values of n. Let (V, E) be a full binary tree with n leaves that satisfies conditions (1) and (2) of the lemma. Let β = (β_0, . . . , β_{n−1}) and r ∈ V be the root vertex of the tree. Then |L_r| = n > t. Thus, (1) implies that d(r) = 2^k t for some k < ⌈log₂⌈n/t⌉⌉ ≤ m. Therefore, property (4) of Lemma 4.13 implies that β_i/β_0 ∈ F_{2^{d(r)}} for i ∈ {0, . . . , d(r) − 1}. Moreover, properties (2) and (4) of Lemma 4.13 imply that

(4.8) α(β, d(r)) = (β_0, . . . , β_{d(r)−1}) and δ(β, d(r)) = (1/β_0)(β_0, . . . , β_{n−d(r)−1}).

The subtrees of (V, E) rooted on r_α and r_δ satisfy the conditions of the lemma through inheritance. Thus, the induction hypothesis and (4.8) imply that the subtree rooted on r_α is a reduction tree for α(β, d(r)). Similarly, the induction hypothesis, Lemma 4.15 and (4.8) imply that the subtree rooted on r_δ is a reduction tree for δ(β, d(r)). Therefore, (V, E) is a reduction tree for β. Hence, the lemma follows by induction. □

We now complete the proof of Theorem 4.12 by showing that its construction produces binary trees that satisfy the conditions of Lemma 4.14.
Proof of Theorem 4.12.
For i ∈ {1, . . . , ⌈n/t⌉}, (V_i, E_i) is a full binary tree with min(n − (i − 1)t, t) leaves. Therefore, it is clear that (V, E) is a full binary tree with

Σ_{i=1}^{⌈n/t⌉} min(n − (i − 1)t, t) = n

leaves. We show that (V, E) satisfies the conditions of Lemma 4.14.

Suppose there exists a vertex v ∈ V such that |L_v| > t. Then v is not descended from or equal to u_i for i ∈ {1, . . . , ⌈n/t⌉}, since (V_i, E_i) has at most t leaves. By the choice of (V_0, E_0), it follows that 2^k of the vertices u_i are descended from v_α for some k < ⌈log₂⌈n/t⌉⌉. Let i_1, . . . , i_{2^k} be the indices of these vertices. Then i_1, . . . , i_{2^k} < ⌈n/t⌉, since the ordering of the vertices u_i implies that u_{⌈n/t⌉} must be equal to v_δ or one of its descendants. It follows that the subtrees of (V, E) rooted on u_{i_1}, . . . , u_{i_{2^k}} each have t leaves. Thus, d(v) = 2^k t for some k < ⌈log₂⌈n/t⌉⌉. Therefore, (V, E) satisfies condition (1) of Lemma 4.14.

Suppose there exists a vertex v ∈ V such that |L_v| ≤ t, and v is either the root of the tree or the child of a vertex v′ ∈ V with |L_{v′}| > t. Then v is descended from or equal to u_i for some i ∈ {1, . . . , ⌈n/t⌉}, since the subtrees rooted on u_1, . . . , u_{⌈n/t⌉−1} each have t leaves, while the subtree rooted on u_{⌈n/t⌉} has at least one leaf. If v is the root of (V, E), then (V, E) = (V_i, E_i). If v is the child of a vertex v′ ∈ V such that |L_{v′}| > t, and thus |L_{v′}| > |L_{u_i}|, then u_i is a descendant of v′. In either case, v is equal to u_i. Thus, the choice of (V_i, E_i) and property (3) of Lemma 4.13 imply that the subtree of (V, E) rooted on v is a reduction tree for (β_0, . . . , β_{|L_v|−1}). Hence, (V, E) satisfies condition (2) of Lemma 4.14. □
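The leaf count at the start of the proof above can be sanity-checked numerically; the sketch below is illustrative only, and the values of n and t are arbitrary:

```python
import math

# Leaves of the grafted trees T_1, ..., T_{ceil(n/t)} in Theorem 4.12:
# T_i has min(n - (i - 1)t, t) leaves, and the counts sum to n.
def leaf_counts(n, t):
    return [min(n - (i - 1) * t, t) for i in range(1, math.ceil(n / t) + 1)]

assert leaf_counts(13, 4) == [4, 4, 4, 1]
assert all(sum(leaf_counts(n, t)) == n
           for t in range(1, 8) for n in range(1, 40))
```

All trees but the last have exactly t leaves, which is the fact used to verify condition (1) of Lemma 4.14.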
References
1. Eli Ben-Sasson, Iddo Bentov, Alessandro Chiesa, Ariel Gabizon, Daniel Genkin, Matan Hamilis, Evgenya Pergament, Michael Riabzev, Mark Silberstein, Eran Tromer, and Madars Virza, Computational integrity with a public random string from quasi-linear PCPs, Advances in cryptology—EUROCRYPT 2017. Part III, Lecture Notes in Comput. Sci., vol. 10212, Springer, Cham, 2017, pp. 551–579.
2. Eli Ben-Sasson, Iddo Bentov, Yinon Horesh, and Michael Riabzev, Scalable, transparent, and post-quantum secure computational integrity, Cryptology ePrint Archive, Report 2018/046, 2018, https://eprint.iacr.org/2018/046.
3. Daniel J. Bernstein and Tung Chou, Faster binary-field multiplication and faster binary-field MACs, Selected areas in cryptography—SAC 2014, Lecture Notes in Comput. Sci., vol. 8781, Springer, Cham, 2014, pp. 92–111.
4. Daniel J. Bernstein, Tung Chou, and Peter Schwabe, McBits: Fast constant-time code-based cryptography, Cryptographic Hardware and Embedded Systems—CHES 2013, Lecture Notes in Comput. Sci., vol. 8086, Springer, Berlin, 2013, pp. 250–272.
5. James R. Bitner, Gideon Ehrlich, and Edward M. Reingold, Efficient generation of the binary reflected Gray code and its applications, Comm. ACM (1976), no. 9, 517–521.
6. Richard P. Brent, Pierrick Gaudry, Emmanuel Thomé, and Paul Zimmermann, Faster multiplication in GF(2)[x], Algorithmic number theory—ANTS 2008, Lecture Notes in Comput. Sci., vol. 5011, Springer, Berlin, 2008, pp. 153–166.
7. David G. Cantor, On arithmetical algorithms over finite fields, J. Combin. Theory Ser. A (1989), no. 2, 285–300.
8. Ming-Shing Chen, Chen-Mou Cheng, Po-Chun Kuo, Wen-Ding Li, and Bo-Yin Yang, Faster multiplication for long binary polynomials, 2017, arXiv:1708.09746 [cs.SC].
9. ———, Multiplying boolean polynomials with Frobenius partitions in additive fast Fourier transform, 2018, arXiv:1803.11301 [cs.SC].
10. Tung Chou, McBits revisited, Cryptographic Hardware and Embedded Systems—CHES 2017, Lecture Notes in Comput. Sci., vol. 10529, Springer, Cham, 2017, pp. 213–231.
11. Nicholas Coxon, Fast systematic encoding of multiplicity codes, arXiv:1704.07083 [cs.IT], Apr 2017.
12. ———, Fast Hermite interpolation and evaluation over finite fields of characteristic two, arXiv:1807.00645 [cs.SC], July 2018.
13. N. J. Fine, Binomial coefficients modulo a prime, Amer. Math. Monthly (1947), 589–592.
14. Shuhong Gao and Todd Mateer, Additive fast Fourier transforms over finite fields, IEEE Trans. Inform. Theory (2010), no. 12, 6265–6272.
15. David Harvey, A cache-friendly truncated FFT, Theoret. Comput. Sci. (2009), no. 27-29, 2649–2658.
16. Donald E. Knuth, The art of computer programming. Vol. 4, Fasc. 2, Addison-Wesley, Upper Saddle River, NJ, 2005.
17. Robin Larrieu, The truncated Fourier transform for mixed radices, ISSAC'17—Proceedings of the 2017 ACM International Symposium on Symbolic and Algebraic Computation, ACM, New York, 2017, pp. 261–268.
18. Wen-Ding Li, Ming-Shing Chen, Po-Chun Kuo, Chen-Mou Cheng, and Bo-Yin Yang, Frobenius additive fast Fourier transform, 2018, arXiv:1802.03932 [cs.SC].
19. Sian-Jheng Lin, Tareq Y. Al-Naffouri, and Yunghsiang S. Han, FFT algorithm for binary extension finite fields and its application to Reed-Solomon codes, IEEE Trans. Inform. Theory (2016), no. 10, 5343–5358.
20. Sian-Jheng Lin, Tareq Y. Al-Naffouri, Yunghsiang S. Han, and Wei-Ho Chung, Novel polynomial basis with fast Fourier transform and its application to Reed–Solomon erasure codes, IEEE Trans. Inform. Theory (2016), no. 11, 6284–6299.
21. Sian-Jheng Lin, Wei-Ho Chung, and Yunghsiang S. Han, Novel polynomial basis and its application to Reed-Solomon erasure codes, 55th Annual IEEE Symposium on Foundations of Computer Science—FOCS 2014, IEEE Computer Soc., Los Alamitos, CA, 2014, pp. 316–325.
22. Édouard Lucas, Théorie des Fonctions Numériques Simplement Périodiques. [Continued], Amer. J. Math. (1878), no. 3, 197–240.
23. Todd Mateer, Fast Fourier Transform algorithms with applications, ProQuest LLC, Ann Arbor, MI, 2008, Ph.D. thesis, Clemson University.
24. J. van der Hoeven, Notes on the Truncated Fourier Transform, Tech. Report 2005-5, Université Paris-Sud, Orsay, France, 2005.
25. Joris van der Hoeven, The truncated Fourier transform and applications, ISSAC 2004—Proceedings of the 2004 International Symposium on Symbolic and Algebraic Computation, ACM, New York, 2004, pp. 290–296.
26. Joris van der Hoeven and Éric Schost, Multi-point evaluation in higher dimensions, Appl. Algebra Engrg. Comm. Comput. (2013), no. 1, 37–52.
27. Joachim von zur Gathen, Functional decomposition of polynomials: the tame case, J. Symbolic Comput. (1990), no. 3, 281–299.
28. Joachim von zur Gathen and Jürgen Gerhard, Arithmetic and factorization of polynomials over F_2 (extended abstract), ISSAC '96—Proceedings of the 1996 International Symposium on Symbolic and Algebraic Computation, ACM, New York, 1996, pp. 1–9.
29. Y. Wang and X. Zhu, A fast algorithm for the Fourier transform over finite fields and its VLSI implementation, IEEE Journal on Selected Areas in Communications (1988), no. 3, 572–577.

INRIA Saclay–Île-de-France & Laboratoire d'Informatique, École polytechnique, 91128 Palaiseau Cedex, France
E-mail address: