Computing isomorphisms and embeddings of finite fields
Ludovic Brieulle, Luca De Feo, Javad Doliskani, Jean-Pierre Flori, Éric Schost
May 4, 2017
Abstract
Let $\mathbb{F}_q$ be a finite field. Given two irreducible polynomials $f, g$ over $\mathbb{F}_q$, with $\deg f$ dividing $\deg g$, the finite field embedding problem asks to compute an explicit description of a field embedding of $\mathbb{F}_q[X]/f(X)$ into $\mathbb{F}_q[Y]/g(Y)$. When $\deg f = \deg g$, this is also known as the isomorphism problem.

This problem, a special instance of polynomial factorization, plays a central role in computer algebra software. We review previous algorithms, due to Lenstra, Allombert, Rains, and Narayanan, and propose improvements and generalizations. Our detailed complexity analysis shows that our newly proposed variants are at least as efficient as previously known algorithms, and in many cases significantly better.

We also implement most of the presented algorithms, compare them with the state of the art computer algebra software, and make the code available as open source. Our experiments show that our new variants consistently outperform available software.

Contents
Algorithm selection
Appendices
A Rains' conic algorithm
B Using j = 0, in the elliptic Rains' algorithm
B.1 The ordinary case
B.2 The supersingular case
C Elliptic periods
C.1 Normality of elliptic periods
C.2 Experimental evidence for the conjecture
Let $q$ be a prime power and let $\mathbb{F}_q$ be a field with $q$ elements. Let $f$ and $g$ be irreducible polynomials over $\mathbb{F}_q$, with $\deg f$ dividing $\deg g$. Define $k = \mathbb{F}_q[X]/f(X)$ and $K = \mathbb{F}_q[Y]/g(Y)$; then, there is an embedding $\varphi : k \hookrightarrow K$, unique up to $\mathbb{F}_q$-automorphisms of $k$. The goal of this paper is to describe algorithms to efficiently represent and evaluate one such embedding.

All the algorithms we are aware of split the embedding problem in two sub-problems:

1. Determine elements $\alpha \in k$ and $\beta \in K$ such that $k = \mathbb{F}_q(\alpha)$, and such that there exists an embedding $\varphi$ mapping $\alpha \mapsto \beta$. We refer to this problem as the embedding description problem. It is easily seen that $\alpha$ and $\beta$ describe an embedding if and only if they share the same minimal polynomial.

2. Given elements $\alpha$ and $\beta$ as above, given $\gamma \in k$ and $\delta \in K$, solve the following problems:
   • Compute $\varphi(\gamma) \in K$.
   • Test if $\delta \in \varphi(k)$.
   • If $\delta \in \varphi(k)$, compute $\varphi^{-1}(\delta) \in k$.
   We refer collectively to these problems as the embedding evaluation problem.

Motivation, previous work
The first to get interested in this problem was H. Lenstra: in his seminal paper [38] he shows that it can be solved in deterministic polynomial time, by using a representation for finite fields that he calls explicit data. In practice, the embedding problem arises naturally when designing a computer algebra system: as soon as a system is capable of representing arbitrary finite fields, it is natural to ask it to compute the morphisms between them. Ultimately, by representing effectively the lattice of finite fields with inclusions, the user is given access to the algebraic closure of $\mathbb{F}_q$. The first system to implement a general embedding algorithm was Magma [4]. As detailed by its developers [5], it used a much simpler approach than Lenstra's algorithm, entirely based on polynomial factorization and linear algebra. Lenstra's algorithm was later revived by Allombert [2, 3], who modified some key steps in order to make it practical; his implementation has since been part of the PARI/GP system [61].

Meanwhile, a distinct family of algorithms for the embedding problem was started by Pinch [51], and later improved by Rains [53]. These algorithms, based on principles radically different from Lenstra's, are intrinsically probabilistic. Although their worst-case complexity is no better than that of Allombert's algorithm, they are potentially much more efficient on a large set of parameters. This potential was understood by Magma's developers, who implemented Rains' algorithm in Magma v2.14.

With the exception of Lenstra's work, the aforementioned papers were mostly concerned with the practical aspects of the embedding problem. While it was generally understood that computing embeddings is an easier problem than general polynomial factoring, no results on its complexity more precise than Lenstra's had appeared until recently. A few months before the present paper was finalized, Narayanan published a novel generalization of Allombert's algorithm [48], based on elliptic curve computations, and showed that its (randomized) complexity is at most quadratic. Narayanan's generalization relies on the fact that Artin–Schreier and Kummer theories are special cases of a more general situation: as already emphasized by Couveignes and Lercier [13], whereas the former theory acts on the additive group of a finite field, and the latter on its multiplicative group, they can be extended to more general commutative algebraic groups, in particular to elliptic curves.
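To make the two sub-problems listed above concrete, here is a minimal Python sketch on a toy instance, $k = \mathbb{F}_2[X]/(X^2+X+1)$ and $K = \mathbb{F}_2[Y]/(Y^4+Y+1)$. It is an illustration only: the brute-force search for $\beta$, the lookup table for $\varphi^{-1}$, and the helper names (`mul_mod`, `evaluate`, `phi`, `phi_inverse`) are assumptions of this sketch, not the algorithms studied in this paper. Field elements are encoded as integers whose bits are the coefficients.

```python
# Toy instance over F_2 (an assumption of this sketch): bit i of an integer
# is the coefficient of X^i, so 0b111 encodes X^2 + X + 1.
F = 0b111      # f = X^2 + X + 1, defining k = F_4
G = 0b10011    # g = Y^4 + Y + 1, defining K = F_16


def mul_mod(a, b, modulus):
    """Multiply two reduced bit-encoded polynomials modulo `modulus`."""
    deg = modulus.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        a <<= 1                  # multiply a by the variable
        if (a >> deg) & 1:
            a ^= modulus         # reduce as soon as the degree reaches deg
        b >>= 1
    return result


def evaluate(poly, point, modulus):
    """Horner evaluation of a bit-encoded polynomial over F_2 at `point`."""
    acc = 0
    for bit in reversed(range(poly.bit_length())):
        acc = mul_mod(acc, point, modulus) ^ ((poly >> bit) & 1)
    return acc


# Embedding description: alpha is the class of X in k (encoded 0b10) and
# beta is any root of f in K, so both have minimal polynomial f.
ALPHA = 0b10
BETA = next(b for b in range(16) if evaluate(F, b, G) == 0)


def phi(gamma):
    """Embedding evaluation: send gamma = c0 + c1*alpha to c0 + c1*beta."""
    return evaluate(gamma, BETA, G)


# Table of the image phi(k), used for membership tests and for phi^{-1}.
IMAGE = {phi(gamma): gamma for gamma in range(4)}


def phi_inverse(delta):
    """Return the preimage of delta in k, or None if delta is not in phi(k)."""
    return IMAGE.get(delta)
```

For fields of realistic size neither the exhaustive root search nor the image table is feasible, which is precisely why the dedicated algorithms reviewed in this paper are needed.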
Our contribution
This work aims to be, in large part, a complete review of all known algorithms for the embedding problem; we analyze in detail the cost of existing algorithms and introduce several new variants. After laying out the foundations in the next section, we start with algorithms for the embedding description problem.

Section 3 describes the family of algorithms based (more or less loosely) on Lenstra's work; we call these Kummer-type algorithms. In doing so, we pay particular attention to Allombert's algorithm: to our knowledge, this is the first detailed and complete complexity analysis of this algorithm and its variants. Thanks to our work on asymptotic complexity, we were able to devise improvements to the original variants of Allombert that largely outperform them both in theory and practice. One notable omission in this section is Narayanan's [...]

Footnote: Technically, Lenstra only proved his theorem in the case where $k$ and $K$ are isomorphic; however, the generalization to the embedding problem poses no difficulties.

Footnote: As a matter of fact, Rains' algorithm was never published; the only publicly available source for it is in Magma's source code (file package/Ring/FldFin/embed.m, since v2.14).

[...] elliptic periods to solve the embedding description problem, and that the resulting algorithm behaves well both in theory and in practice. While working out the correctness proof of the elliptic variant of Rains' algorithm, we encounter an unexpected difficulty: whereas roots of unity enjoy Galois properties that guarantee the success of Rains' original algorithm, points of elliptic curves fail to provide the same. Heuristically, the failure probability of the elliptic variant is extremely small; however, we are not able to prove this formally. Our experimental searches even seem to suggest that the failure probability might be, surprisingly, zero. We state this as a conjecture on elliptic periods (see Conjecture 23); our findings and supporting evidence are summarized in Appendix C.

Section 6 gives a global comparison of all the algorithms presented previously. In particular, Rains' algorithm and variants require a non-trivial search for parameters, which we discuss thoroughly. Then we present an algorithm to select the best performing embedding description algorithm from a practical point of view. This theoretical study is complemented by the experimental Section 7, where we compare our implementations of all the algorithms; our source code is made available through the Git repository https://github.com/defeo/ffisom for replication and further scrutiny.

Our review could not be complete without a presentation of all the embedding evaluation algorithms, which we undertake in Section 8. Given that the algorithms of this section are much more classical and well understood, we only give a theoretical presentation, with no experimental support.

In conclusion, we hope that our review will constitute a reference guide for researchers and engineers interested in implementing embeddings of finite fields in a computer algebra system.
Acknowledgments
We would like to thank Eric M. Rains for sharing his preprint with us. We also thank Bill Allombert, Christian Berghoff, Jean-Marc Couveignes, Reynald Lercier, and Benjamin Smith for fruitful discussions.
We review the fundamental building blocks that constitute the algorithms presented next. We are going to measure all complexities in number of operations $+$, $\times$, $\div$ in $\mathbb{F}_q$, unless explicitly stated otherwise. Most of the algorithms we present are randomized; we use the big-Oh notation $O(\,)$ to express average asymptotic complexity, and we will make it clear when this complexity depends on heuristics. We also occasionally use the notation $\tilde{O}(\,)$ to neglect logarithmic factors in the parameters.

We let $\mathsf{M}(m)$ be a function such that polynomials in $\mathbb{F}_q[X]$ of degree less than $m$ can be multiplied in $\mathsf{M}(m)$ operations in $\mathbb{F}_q$, under the assumptions of [62, Ch. 8.3], together with the slightly stronger one, that $\mathsf{M}(mn)$ is in $O(m^{1+\varepsilon} \mathsf{M}(n))$ for all $\varepsilon > 0$. Using FFT multiplication, one can take $\mathsf{M}(m) \in O(m \log(m) \log\log(m))$ [9].

We denote by $\omega$ the exponent of linear algebra, i.e. a constant such that $m \times m$ matrices with coefficients in any field $k$ can be multiplied using $O(m^\omega)$ additions and multiplications in $k$. One can take $\omega < 2.38$, the best result to date being in [36]; on the other hand, we also suppose that $\omega > 2$.

If $k$ is a finite field and $\xi$ is some element of an algebraic extension of $k$, we will write $k[\xi]$ for the ring generated by $\xi$. To avoid confusion, when the extension generated by $\xi$ is a finite field, we will write instead $k(\xi)$.

Some algorithms will operate in a polynomial ring $k[Z]$, where $k$ is a field extension of $\mathbb{F}_q$; some other algorithms will operate in $k[Z]/h(Z)$, where $h$ is a monic polynomial in $k[Z]$. We review the basic operations in these rings. We assume that $k$ is represented as a quotient ring $\mathbb{F}_q[X]/f(X)$, with $m = \deg f$, and we let $s = \deg h$ in the complexity estimates.

Multiplying and dividing polynomials of degree at most $s$ in $k[Z]$ is done in $O(\mathsf{M}(sm))$ operations in $\mathbb{F}_q$, using Kronecker's substitution [44, 30, 62, 63, 28]. Multiplication in $k[Z]/h(Z)$ is also done in $O(\mathsf{M}(sm))$ operations using the technique in [49]. By the same techniques, gcds of degree $m$ polynomials in $k[Z]$ and inverses in $k[Z]/h(Z)$ are computed in $O(\mathsf{M}(sm) \log(sm))$ operations.

Given polynomials $e, g, h \in k[Z]$ of degree at most $s$, modular composition is the problem of computing $e(g) \bmod h$. An upper bound on the algebraic complexity of modular composition is obtained by the Brent–Kung algorithm [8]; under our assumptions on the respective costs of polynomial and matrix multiplication, its cost is $O(s^{(\omega+1)/2} \mathsf{M}(m))$ operations in $\mathbb{F}_q$ (so if $k = \mathbb{F}_q$, this is $O(s^{(\omega+1)/2})$). In the binary RAM complexity model, the Kedlaya–Umans algorithm [34] and its extension in [52] yield an algorithm with essentially linear complexity in $s$, $m$ and $\log(q)$. Unfortunately, making these algorithms competitive in practice is challenging; we are not aware of any implementation of them that would outperform Brent and Kung's algorithm.

Note 1. If we have several modular compositions of the form $e_1(g) \bmod h, \dots, e_t(g) \bmod h$ to compute, we can slightly improve the obvious bound $O(t s^{(\omega+1)/2})$ (we discuss here $k = \mathbb{F}_q$, so $m = 1$). If $t = O(s)$, using [32, Lemma 4], this can be done in time $O(t^{(\omega-1)/2} s^{(\omega+1)/2})$. If $t = \Omega(s)$, this can be done in $O(t s^{\omega-1})$ operations, by computing $1, g, \dots, g^{s-1}$ modulo $h$, and doing a matrix product in size $s \times s$ by $s \times t$.

Frobenius evaluation
Consider an $\mathbb{F}_q$-algebra $Q$, and an element $\alpha$ in $Q$. Given integers $c, d$, we will have to compute expressions of the form
$$\sigma_d = \alpha^{q^d}, \qquad \tau_d = \sum_{i=0}^{d-1} \alpha^{q^{ci}}, \qquad \mu_d = \alpha^{\lfloor q^d/c \rfloor}.$$
A direct approach by repeated squaring takes $O(d \log(q))$ multiplications in $Q$ for the first expression.

To do better, we use a recursive approach that goes back to [64], with further ideas borrowed from [56, 31]. For $i \ge 1$, define integers $A_i, B_i$ as follows:
$$q^i = A_i c + B_i, \qquad 0 \le B_i < c.$$
Then, we have the relations
$$\sigma_{i+j} = \sigma_j^{q^i}, \qquad \tau_{i+j} = \tau_i + \tau_j^{q^{ic}}, \qquad \mu_{i+j} = \mu_j^{q^i} \, \mu_i^{B_j} \, \alpha^{\lfloor B_i B_j / c \rfloor}.$$
Since we are interested in $\sigma_d$, $\tau_d$ and $\mu_d$, using an addition chain for $d$, we are left to perform $O(\log(d))$ steps as above.

To perform these operations, we will make heavy use of a technique originating in [64]. In its simplest form, it amounts to the following: if $Q = \mathbb{F}_q[X]/f(X)$, for some polynomial $f$ in $\mathbb{F}_q[X]$, and $\beta$ is in $Q$, we can compute $\beta^q$ by means of the modular composition $\beta(\xi)$, where $\xi = x^q$ and $x$ is the image of $X$ modulo $f$.

In the following proposition, we discuss versions of this idea for various kinds of algebras $Q$, and how they allow us to compute the expressions $\sigma_d$, $\tau_d$, $\mu_d$ defined above.

Proposition 2. Let $f \in \mathbb{F}_q[X]$ be a polynomial of degree $m$, and define the $\mathbb{F}_q$-algebra $Q = \mathbb{F}_q[X]/f(X)$. Let $h \in Q[Z]$ be a polynomial of degree $s$, and define the $Q$-algebra $S = Q[Z]/h(Z)$. Finally, whenever $h \in \mathbb{F}_q[Z]$, define the $\mathbb{F}_q$-algebra $Q' = \mathbb{F}_q[Z]/h(Z)$.

Denote by $T_Q$, $T_S$, $T_{Q'}$ the cost, in terms of $\mathbb{F}_q$-operations, of one modular composition in $Q$, $S$, $Q'$ respectively. Also denote by $T_{Q,t} \le t\,T_Q$ (resp. $T_{S,t}$, $T_{Q',t}$) the cost of $t$ modular compositions sharing the same polynomial (see Note 1).

Then the expressions
$$\sigma_d = \alpha^{q^d}, \qquad \tau_d = \sum_{i=0}^{d-1} \alpha^{q^{ci}}, \qquad \mu_d = \alpha^{\lfloor q^d/c \rfloor}$$
can be computed using the following number of operations:

Case 1. $\alpha \in Q$:
• $\sigma_d$: $O(\mathsf{M}(m) \log(q) + T_Q \log(d))$,
• $\tau_d$: $O(\mathsf{M}(m) \log(q) + T_Q \log(dc))$,
• $\mu_d$: $O(\mathsf{M}(m) \log(q) + (T_Q + \mathsf{M}(m) \log(c)) \log(d))$;

Case 2. $\alpha \in Q$ with $f \mid X^r - 1$:
• $\sigma_d$: $O(\mathsf{M}(m) \log(q) + \mathsf{M}(r) \log(d))$,
• $\tau_d$: $O(\mathsf{M}(m) \log(q) + \mathsf{M}(r) \log(dc))$,
• $\mu_d$: $O(\mathsf{M}(m) \log(q) + (\mathsf{M}(r) + \mathsf{M}(m) \log(c)) \log(d))$;

Case 3. $\alpha \in S$:
• $\sigma_d$: $O(\mathsf{M}(ms) \log(q) + (T_{Q,s} + T_S) \log(d))$,
• $\tau_d$: $O(\mathsf{M}(ms) \log(q) + (T_{Q,s} + T_S) \log(dc))$,
• $\mu_d$: $O(\mathsf{M}(ms) \log(q) + (T_{Q,s} + T_S + \mathsf{M}(ms) \log(c)) \log(d))$;

Case 4. $\alpha \in S$ with $h \in \mathbb{F}_q[Z]$:
• $\sigma_d$: $O((\mathsf{M}(m) + \mathsf{M}(s)) \log(q) + (T_{Q,s} + T_{Q',m}) \log(d))$,
• $\tau_d$: $O((\mathsf{M}(m) + \mathsf{M}(s)) \log(q) + (T_{Q,s} + T_{Q',m}) \log(dc))$,
• $\mu_d$: $O((\mathsf{M}(m) + \mathsf{M}(s)) \log(q) + (T_{Q,s} + T_{Q',m} + (m\,\mathsf{M}(s) + s\,\mathsf{M}(m)) \log(c)) \log(d))$;

Case 5. $\alpha \in S$ with $h \mid Z^r - a$ for $a \in Q$:
• $\sigma_d$: $O(\mathsf{M}(m) \log(q) + (T_{Q,s} + \mathsf{M}(mr)) \log(d))$,
• $\tau_d$: $O(\mathsf{M}(m) \log(q) + (T_{Q,s} + \mathsf{M}(mr)) \log(dc))$,
• $\mu_d$: $O(\mathsf{M}(m) \log(q) + (T_{Q,s} + \mathsf{M}(mr) + \mathsf{M}(m) \log(c)) \log(d))$.

Proof. The complexity estimates mostly rely on the complexity of modular composition.
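Before going through the cases, the recursive relations for $\sigma_{i+j}$, $\tau_{i+j}$ and $\mu_{i+j}$ on which all these estimates rest can be sanity-checked numerically. The sketch below does so in the toy algebra $Q = \mathbb{F}_2[X]/(X^8+X^4+X^3+X+1)$ with $q = 2$. It uses plain square-and-multiply rather than the modular-composition technique of the proposition; the choice of $Q$ and all function names are assumptions made for illustration.

```python
# Q = F_2[X]/(X^8 + X^4 + X^3 + X + 1); elements are integers below 256 whose
# bits are polynomial coefficients. Toy setting chosen for this sketch only.
MOD = 0x11B   # X^8 + X^4 + X^3 + X + 1, irreducible over F_2
BASE = 2      # the field size q


def mul(a, b):
    """Product in Q: shift-and-add multiplication with reduction modulo MOD."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= MOD
        b >>= 1
    return result


def power(a, e):
    """Square-and-multiply exponentiation in Q (with a^0 = 1)."""
    result = 1
    while e:
        if e & 1:
            result = mul(result, a)
        a = mul(a, a)
        e >>= 1
    return result


def sigma(alpha, d):
    """sigma_d = alpha^(q^d)."""
    return power(alpha, BASE ** d)


def tau(alpha, d, c):
    """tau_d = sum over 0 <= i < d of alpha^(q^(c*i)); addition is XOR."""
    total = 0
    for i in range(d):
        total ^= power(alpha, BASE ** (c * i))
    return total


def mu(alpha, d, c):
    """mu_d = alpha^floor(q^d / c)."""
    return power(alpha, BASE ** d // c)


def check_relations(alpha, i, j, c):
    """Check the three recursive relations for the index pair (i, j)."""
    q = BASE
    b_i, b_j = q ** i % c, q ** j % c     # remainders B_i and B_j
    ok_sigma = sigma(alpha, i + j) == power(sigma(alpha, j), q ** i)
    ok_tau = tau(alpha, i + j, c) == (
        tau(alpha, i, c) ^ power(tau(alpha, j, c), q ** (i * c)))
    rhs = mul(mul(power(mu(alpha, j, c), q ** i), power(mu(alpha, i, c), b_j)),
              power(alpha, b_i * b_j // c))
    ok_mu = mu(alpha, i + j, c) == rhs
    return ok_sigma and ok_tau and ok_mu
```

Calling `check_relations(alpha, i, j, c)` for every `alpha` in `range(256)` and a few small values of `i`, `j`, `c` exercises exactly the step that an addition chain for $d$ repeats $O(\log(d))$ times.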
Case 1. We let $x$ be the image of $X$ in $Q$, and we start by computing $x^q$, using $O(\mathsf{M}(m) \log(q))$ operations in $\mathbb{F}_q$.

For $i \ge 0$, given $\xi_i = x^{q^i}$ and $\beta$ in $Q$, we can compute $\beta^{q^i}$ as $\beta^{q^i} = \beta(\xi_i)$, using $T_Q = O(m^{(\omega+1)/2})$ operations; in particular, this allows us to compute $\xi_{i+j}$ from the knowledge of $\xi_i$ and $\xi_j$. Given an addition chain for $d$, we thus compute all corresponding $\xi_i$'s, and we deduce the $\sigma_i$'s similarly, since $\sigma_{i+j} = \sigma_j(\xi_i)$. Altogether, starting from $\xi_1 = x^q$, this gives us $\sigma_d$ for $O(T_Q \log(d))$ further operations in $\mathbb{F}_q$.

The same holds for $\tau_d$, with a cost in $O(T_Q \log(cd))$, since we have to compute $\xi_c$ first; and for $\mu_d$, with a cost in $O((T_Q + \mathsf{M}(m) \log(c)) \log(d))$ operations, as the formula for $\mu_{i+j}$ shows that we can obtain it by means of a modular composition (to compute $\mu_j^{q^i} = \mu_j(\xi_i)$), together with two exponentiations of indices less than $c$. The costs for computing $\sigma_d$, $\tau_d$, $\mu_d$ follow immediately.

Case 2. When $f$ divides $X^r - 1$, we obtain $\beta(\xi_i)$ by computing $\beta(X^{q^i \bmod r}) \bmod (X^r - 1)$ and reducing the result modulo $f(X)$. Thus, the cost of one modular composition is $T_Q = O(\mathsf{M}(r))$, and the total cost is obtained by replacing this value in the estimates for the previous case.

Case 3. We let $x$ and $z$ be the respective images of $X$ in $Q$ and $Z$ in $S$, and as a first step, we compute $z^q$ (and $x^q$, unless $f$ is as in Case 2 above), in $O(\mathsf{M}(ms) \log(q))$ operations.

In order to compute the quantities $\sigma_d$, $\tau_d$, $\mu_d$, we apply the same strategy as above; the key factor for complexity is thus the cost of computing $\beta^{q^i}$, for $\beta$ in $S$, given $\zeta_i = z^{q^i}$ and $\xi_i = x^{q^i}$ (as we did in Case 1, we apply this procedure to our input element $\alpha$, as well as to $\zeta_i$ itself, and $\xi_i$, in order to be able to continue the calculation).

To do so, we use an algorithm by Kaltofen and Shoup [31], which boils down to writing $\beta = \sum_{j=0}^{s-1} c_j(x) z^j$, so that $\beta^{q^i} = \sum_{j=0}^{s-1} c_j(x)^{q^i} \zeta_i^j$. The $s$ coefficients $c_j(x)^{q^i}$ are computed by applying the previous algorithms in $Q$ to $s$ inputs. This takes time at most $s\,T_Q$, but as pointed out in Note 1, improvements are possible if we base our algorithm on modular composition; we thus denote the cost $T_{Q,s}$.

Then, we do a modular composition in $S$ to evaluate the result at $\zeta_i$; this latter step takes $T_S = O(s^{(\omega+1)/2} \mathsf{M}(m))$ operations in $\mathbb{F}_q$.

Case 4. The cost for computing $z^q$ is $O(\mathsf{M}(s) \log(q))$ and that for computing $x^q$ is $O(\mathsf{M}(m) \log(q))$.

In the last step, the cost $T_S$ of modular composition in $S$ is now that of $m$ modular compositions in degree $s$ (with the same argument), as detailed in Note 1, that we denote $T_{Q',m}$. Similarly, the cost of multiplication in $S$ can be reduced from $O(\mathsf{M}(ms))$ to $O(s\,\mathsf{M}(m) + m\,\mathsf{M}(s))$ operations.

Case 5. We start by computing $x^q$, using $O(\mathsf{M}(m) \log(q))$ operations in $\mathbb{F}_q$.

For $\beta$ as above, suppose that we have already computed all coefficients $d_j(x) = c_j(x)^{q^i}$ in $O(T_{Q,s})$ operations; we now have to compute $\beta^{q^i} = \sum_{j=0}^{s-1} d_j(x) \zeta_i^j$.

We first do the calculation modulo $Z^r - a$ rather than modulo $h$; that is, we compute $\sum_{j=0}^{s-1} d_j(x) z_i^j$ where $z_i = z^{q^i}$. Because $z^r = a$, we have $z_i = a_i z^{q^i \bmod r}$, with $a_i = a^{\lfloor q^i/r \rfloor}$. If we assume that $a_i$ is known, we can compute $\sum_{j=0}^{s-1} d_j(x) z_i^j$ using Horner's method, in time $O(s\,\mathsf{M}(m))$, and we reduce this result modulo $h$, for the cost $O(\mathsf{M}(mr))$ of a Euclidean division in degree $r$ in $Q[Z]$.

In order to continue the calculation for all indices in our addition chain, we must thus compute the corresponding $a_i$'s as well, just like the $\mu_i$'s; this takes $O(T_Q + \mathsf{M}(m) \log(r))$ operations.

Since the first stage of the algorithm took $O(T_{Q,s})$ operations, we can take $T_S = O(\mathsf{M}(mr))$ for computing $\beta^{q^i}$.

To initiate the procedure, the algorithm also needs to compute $a_1 = a^{\lfloor q/r \rfloor}$, using $O(\log(q))$ multiplications in $Q$ for a cost $O(\mathsf{M}(m) \log(q))$.

Computing subfields
With $k = \mathbb{F}_q[X]/f(X)$ and $\deg f = m$ as above, we are given a divisor $r$ of $m$, and we want to construct an intermediate extension $\mathbb{F}_q \subset L \subset k$ of degree $r$ over $\mathbb{F}_q$. More precisely, we want to compute a monic irreducible polynomial $g \in \mathbb{F}_q[X]$ of degree $r$, and a polynomial $h \in \mathbb{F}_q[X]$ such that $x \mapsto h(x) \bmod f$ defines an embedding $L = \mathbb{F}_q[X]/g(X) \hookrightarrow k$. We proceed as follows.

Let $\alpha \in k$ be a random element. Then $\alpha$ has a minimal polynomial of degree $m$ over $\mathbb{F}_q$ with high probability. In other words, one needs $O(1)$ such random elements to find one with degree $m$ minimal polynomial. Now, the trace
$$\mathrm{Tr}_{k/L}(\alpha) = \alpha + \alpha^{q^r} + \cdots + \alpha^{q^{m-r}} \qquad (1)$$
has a minimal polynomial of degree $r$ over $\mathbb{F}_q$ with high probability as well. This means we can compute, after $O(1)$ random trials, the desired data: the element $\beta = \mathrm{Tr}_{k/L}(\alpha)$, its minimal polynomial $g$, and the polynomial $h$ of degree less than $m$ representing $\beta$.

Proposition 3. Let $\mathbb{F}_q \subset k$ be a finite extension of degree $m$, and let $r$ be a divisor of $m$. Computing an intermediate field $\mathbb{F}_q \subset L \subset k$ with $[L : \mathbb{F}_q] = r$ takes an expected $O(m^{(\omega+1)/2} \log(m) + \mathsf{M}(m) \log(q))$ operations in $\mathbb{F}_q$. Once $L$ is computed, any element $\gamma \in L$ can be lifted to its image in $k$ using $O(m^{(\omega+1)/2})$ operations.

Proof. Computing the minimal polynomial of an element in $k$ takes $O(m^{(\omega+1)/2})$ operations in $\mathbb{F}_q$, see [55]. The trace in Eq. (1) is computed as the expression $\tau_d$ of the previous paragraph (with $c = r$ and $d = m/r$), at a cost of $O(m^{(\omega+1)/2} \log(m) + \mathsf{M}(m) \log(q))$ operations in $\mathbb{F}_q$.

Finally, given an element $\gamma \in L$, identified with its polynomial representative modulo $g$, its image in $k$ is computed as $\gamma(h) \bmod f$, where $h$ is the polynomial representation of $\mathrm{Tr}_{k/L}(\alpha)$. This can be done by a modular composition at cost $O(m^{(\omega+1)/2})$.

Root finding in cyclotomic extensions
Given a field $k = \mathbb{F}_q[X]/f(X)$ of degree $m$ as above, we will need to factor some special polynomials in $k[Z]$: we are interested in finding one factor of a polynomial that splits into factors of the same, known, degree. This problem is known as equal degree factorization (EDF), and the best generic algorithm for it is the Cantor–Zassenhaus method [10, 64], which runs in $O(\mathsf{M}(sm)(dm \log(q) + \log(sm)))$ operations in $\mathbb{F}_q$ [62, Th. 14.9], where $s$ is the degree of the polynomial to factor, and $d$ is the degree of the factors.

More efficient variants of the Cantor–Zassenhaus method are known for special cases. When the degree $s$ of the polynomial is small compared to the extension degree $m$, Kaltofen and Shoup [31] give an efficient algorithm which is as follows.

Algorithm 1 Kaltofen–Shoup EDF for extension fields
Input: A polynomial $h$ with irreducible factors of degree $d$ over $k = \mathbb{F}_q[X]/f(X)$.
Output: An irreducible factor of $h$ over $k$.
1: If $\deg h = d$ return $h$.
2: Take a random polynomial $a_0 \in k[Z]$ of degree less than $\deg h$,
3: Compute $a_1 \leftarrow \sum_{i=0}^{md-1} a_0^{q^i} \bmod h$,
4: if $q$ is an even power $q = 2^e$ then
5:   Compute $a_2 \leftarrow \sum_{i=0}^{e-1} a_1^{2^i} \bmod h$
6: else
7:   Compute $a_2 \leftarrow a_1^{(q-1)/2} \bmod h$
8: end if
9: Compute $h_0 \leftarrow \gcd(a_2, h)$ and $h_1 \leftarrow \gcd(a_2 - 1, h)$ and $h_{-1} \leftarrow h/(h_0 h_1)$,
10: Apply recursively to the smallest non-constant polynomial among $h_0$, $h_1$, $h_{-1}$.

We refer the reader to the original paper [31] for the correctness of the Kaltofen–Shoup algorithm. We are mainly interested here in its application to root extraction in cyclotomic extensions. Let $r$ be a prime power and let $f$ be an irreducible factor of the $r$-th cyclotomic polynomial $\Phi_r$, with $s = \deg f$. Denote $\mathbb{F}_q[X]/f(X)$ by $\mathbb{F}_q(\zeta)$, where $\zeta$ is the image of $X$ in the quotient. Given an $r$-th power $\alpha \in \mathbb{F}_q(\zeta)$ we want to compute an $r$-th root $\alpha^{1/r}$, or equivalently a linear factor of $Z^r - \alpha$ over $\mathbb{F}_q(\zeta)$.

We propose two different algorithms; one of them is quadratic in $r$, whereas the other one has a runtime that depends on $r$ and $s$, and will perform better for small values of $s$.

Proposition 4. Let $r$ be a prime power and let $\zeta$ be a primitive $r$-th root of unity; let also $s = [\mathbb{F}_q(\zeta) : \mathbb{F}_q]$. One can take $r$-th roots in $\mathbb{F}_q(\zeta)$ using either
$$O(\mathsf{M}(s) \log(q) + r s^{\omega-1} \log(r) \log(s) + \mathsf{M}(rs) \log(s) \log(r))$$
or
$$O(\mathsf{M}(s) \log(q) + r\,\mathsf{M}(r) \log(s) + \mathsf{M}(rs) \log(s) \log(r))$$
operations in $\mathbb{F}_q$.

Proof. We use Algorithm 1 with $k = \mathbb{F}_q(\zeta)$, to get a linear factor of the polynomial $Z^r - \alpha$, so that $d = 1$ (note that $Z^r - \alpha$ splits into linear factors in $k[Z]$). We discuss Step 3, which is the dominant step. Let $f \in \mathbb{F}_q[X]$ be the defining polynomial of $\mathbb{F}_q(\zeta)$ and let $h$ be a factor of $Z^r - \alpha$ of degree $n$.

We are in Case 5 of our discussion on Frobenius evaluation, and we want to compute a trace-like expression of the form $\tau_s$. As per that discussion, two algorithms are available to do Frobenius evaluation in $k$ (one of them uses modular composition, the other the fact that $f$ divides $X^r - 1$). Since $s \le r$, we deduce that $a_1$ can be computed in either
$$O(\mathsf{M}(s) \log(q) + r s^{\omega-1} \log(s) + \mathsf{M}(rs) \log(s))$$
or
$$O(\mathsf{M}(s) \log(q) + n\,\mathsf{M}(r) \log(s) + \mathsf{M}(rs) \log(s))$$
operations in $\mathbb{F}_q$, where the first term accounts for computing $\alpha^{\lfloor q/r \rfloor}$ (so we need only compute it once). The depth of the recursion in Algorithm 1 is $\log(r)$, and the degree $n$ is halved each time, so we obtain the desired result.

Root finding in some extensions of cyclotomic extensions
Let $r = v^d$, where $v \ne p$ is a prime and $d$ is a positive integer, and let $s$ be the order of $q$ in $\mathbb{Z}/v\mathbb{Z}$. We assume that $d \ge 2$, since this will be the case whenever we want to apply the following.

Consider an extension $\mathbb{F}_q \subset k = \mathbb{F}_q[X]/f(X)$ of degree $r$, and let $\mathbb{F}_q(\zeta)$ and $k(\zeta)$ be extensions of degree $s$ over $\mathbb{F}_q$ and $k$ respectively, defined by an irreducible factor of the $v$-th cyclotomic polynomial over $\mathbb{F}_q$. In this paragraph, we discuss the cost of computing a $v$-th root in $k(\zeta)$, by adapting the root extraction algorithm given in [24].

Following [24, Algorithm 3], one reduces the root extraction in $k(\zeta)$ to a root extraction in $\mathbb{F}_q(\zeta)$; note that [24, Algorithm 3] reduces the root extraction to the smallest possible extension of $\mathbb{F}_p$, but projecting to $\mathbb{F}_q(\zeta)$ is more convenient here. The critical computation in this algorithm is a trace-like computation performing the reduction.

Algorithm 2 $v$-th root in $k(\zeta)$
Input: $a \in k(\zeta)^v$
Output: a $v$-th root of $a$
1: repeat
2:   choose a random $c \in k(\zeta)$
3:   $a' \leftarrow a c^v$
4:   $\lambda \leftarrow a'^{(q^s-1)/v}$
5:   $b \leftarrow 1 + \lambda + \lambda^{1+q^s} + \cdots + \lambda^{1+q^s+\cdots+q^{(r-2)s}}$
6: until $b \ne 0$
7: $\beta \leftarrow (a' b^v)^{1/v}$ in $\mathbb{F}_q(\zeta)$
8: return $\beta b^{-1} c^{-1}$

One multiplication in $k(\zeta)$ amounts to doing $r$ multiplications modulo a degree $s$ factor of $\Phi_v$, and $s$ multiplications modulo $f$; since $s \le r$, this takes $O(s\,\mathsf{M}(r))$ operations in $\mathbb{F}_q$. The computation of $\lambda = a'^{(q^s-1)/v} = a'^{\lfloor q^s/v \rfloor}$ can then be done as explained in our discussion on Frobenius evaluation (Case 4). The cost of each modular composition is $O(s^{(\omega-1)/2} r^{(\omega+1)/2})$, for a total of $O(s^{(\omega-1)/2} r^{(\omega+1)/2} \log(s) + s\,\mathsf{M}(r) \log(q))$ operations in $\mathbb{F}_q$.

The trace-like computation of $1 + \lambda + \lambda^{1+q^s} + \cdots + \lambda^{1+q^s+\cdots+q^{(r-2)s}}$ can be done as follows. Let $x$ be the image of $X$ in $k = \mathbb{F}_q[X]/f(X)$. To compute $x^{q^s}$ we first compute $x^q$ using $O(\mathsf{M}(r) \log(q))$ operations in $\mathbb{F}_q$, and then do $\log(s)$ modular compositions in $k$. To compute $\lambda^{q^s}$, note that an element $\lambda \in k(\zeta)$ can be written as
$$\lambda = \lambda_0(x) + \lambda_1(x)\zeta + \cdots + \lambda_{s-1}(x)\zeta^{s-1}$$
and that $\zeta^{q^s} = \zeta$. Therefore for any $i$,
$$\lambda^{q^{is}} = \sum_{j=0}^{s-1} \lambda_j(x^{q^{is}}) \left(\zeta^{q^{is}}\right)^j = \sum_{j=0}^{s-1} \lambda_j(x^{q^{is}}) \zeta^j.$$
In particular, given $x^{q^{is}}$, $\lambda^{q^{is}}$ can be computed using $O(s^{(\omega-1)/2} r^{(\omega+1)/2})$ operations in $\mathbb{F}_q$, and [24, Algorithm 2] can be applied in a direct way, with a cost of $O(s^{(\omega-1)/2} r^{(\omega+1)/2} \log(r) + \mathsf{M}(r) \log(q))$ operations in $\mathbb{F}_q$.

The root extraction in $\mathbb{F}_q(\zeta)$ is done as in the previous paragraph, and has a negligible cost, since we assumed that $s \le v \le \sqrt{r}$. Therefore, we arrive at the following result.

Proposition 5.
With $k$, $\zeta$ and $v$ as above, one can extract $v$-th roots in $k(\zeta)$ using an expected $O(s^{(\omega-1)/2} r^{(\omega+1)/2} \log(r) + s\,\mathsf{M}(r) \log(q))$ operations in $\mathbb{F}_q$.

We are finally ready to address the problem of describing the embedding of $k = \mathbb{F}_q[X]/f(X)$ in $K = \mathbb{F}_q[Y]/g(Y)$; throughout the paper we let $m = \deg f$ and $n = \deg g$, so that $m \mid n$. The embedding description problem asks to find two elements $\alpha \in k$ and $\beta \in K$ such that $\alpha \mapsto \beta$ for some field embedding $\varphi : k \to K$. This is equivalent to $\alpha$ and $\beta$ having the same minimal polynomial.

The most obvious way to solve this problem is to take the class of $X$ in $k = \mathbb{F}_q[X]/f(X)$ for $\alpha$, and a root of $f$ in $K$ for $\beta$. Since $f$ splits completely in $K$, we can apply Algorithm 1 for the special case $d = 1$. Using our discussion on the cost of Frobenius evaluation (precisely, Case 4), we obtain an upper bound of
$$O\left( \left( n m^{(\omega+1)/2} + \mathsf{M}(m)\, n^{(\omega+1)/2} + m\,\mathsf{M}(n) \log(q) \right) \log(m) \right)$$
expected operations in $\mathbb{F}_q$ for the problem. We remark that this complexity is strictly larger than $\tilde{O}(m^2)$.

For a more specialized approach, we note that it is enough to solve the following problem: let $r$ be a prime power such that $r \mid m$ and $\gcd(r, m/r) = 1$, find $\alpha_r \in k$ and $\beta_r \in K$ such that $\alpha_r$ and $\beta_r$ have the same minimal polynomial, of degree $r$.

Indeed, once such $\alpha_r$ and $\beta_r$ are known for every primary factor $r$ of $m$, possible solutions to the embedding problem are
$$\alpha = \prod_{r \mid m,\ \gcd(r, m/r) = 1} \alpha_r, \qquad \beta = \prod_{r \mid m,\ \gcd(r, m/r) = 1} \beta_r,$$
or
$$\alpha = \sum_{r \mid m,\ \gcd(r, m/r) = 1} \alpha_r, \qquad \beta = \sum_{r \mid m,\ \gcd(r, m/r) = 1} \beta_r.$$

Moreover, to treat the general embedding description problem, it is sufficient to treat the case where $[k : \mathbb{F}_q] = [K : \mathbb{F}_q] = r$. Indeed, we can reduce to this situation by applying Proposition 3, at an additional cost of $O(n^{(\omega+1)/2} \log(n) + \mathsf{M}(n) \log(q))$ for each primary factor $r$. Therefore, to simplify the exposition, we focus on algorithms solving the problem stated below.
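The gluing of primary factors by sums or products can be observed on a toy case with $m = 6 = 2 \cdot 3$. The Python sketch below works inside $\mathbb{F}_{64} = \mathbb{F}_2[Y]/(Y^6+Y+1)$, lists the elements of degree 2 and of degree 3 over $\mathbb{F}_2$, and records the degree of every sum and product of such a pair. The field, the exhaustive enumeration and the function names are assumptions of this illustration; in this small example every pair yields an element of degree 6, and the sketch makes no claim beyond that.

```python
# K = F_2[Y]/(Y^6 + Y + 1) = F_64; bit i of an integer is the coefficient of
# Y^i. Toy parameters chosen for this sketch only.
MODULUS = 0b1000011   # Y^6 + Y + 1, irreducible over F_2
DEGREE = 6


def mul(a, b):
    """Product of two elements of K."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if (a >> DEGREE) & 1:
            a ^= MODULUS
        b >>= 1
    return result


def frobenius_degree(a):
    """Degree of the minimal polynomial of a over F_2.

    This equals the size of the orbit of a under the Frobenius map x -> x^2.
    """
    size, conj = 1, mul(a, a)
    while conj != a:
        conj = mul(conj, conj)
        size += 1
    return size


# Sort the 64 elements by the degree of their minimal polynomial.
by_degree = {}
for element in range(1 << DEGREE):
    by_degree.setdefault(frobenius_degree(element), []).append(element)

# Glue every degree-2 element with every degree-3 element, by sum and by
# product, and record the degrees that occur.
glued_degrees = set()
for a2 in by_degree[2]:
    for a3 in by_degree[3]:
        glued_degrees.add(frobenius_degree(a2 ^ a3))      # sum (XOR)
        glued_degrees.add(frobenius_degree(mul(a2, a3)))  # product
```

After running the block, `glued_degrees` contains the degrees reached by the glued elements, and `by_degree` gives the number of elements of each degree.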
Problem. Let $r$ be a prime power and $k, K$ a pair of extensions of $\mathbb{F}_q$ of degree $r$. Describe an isomorphism between $k$ and $K$.

Note that although some algorithms are restricted to this situation, especially those presented in Section 3, some of them could still be readily applied to a more general situation, especially those from Sections 4 and 5.

All algorithms presented next are going to rely on one common principle: construct an element in $k$ (and in $K$) such that its minimal polynomial (or, equivalently, its orbit under the absolute Galois group of $\mathbb{F}_q$) is uniquely (or almost uniquely) defined.

In this section, we review what we call Kummer-type approaches to the embedding problem for prime power degree extensions. We briefly review the works of Lenstra [38], and Allombert [2, 3], then we give variants of these algorithms with significantly lower complexities. As stated above, we let $k, K$ be degree $r$ extensions of $\mathbb{F}_q$, where $r$ is a prime power, and we let $p$ be their characteristic. We give our fast versions of the algorithms for two separate cases: the case $p \nmid r$ is treated in Section 3.1, the case $r = p^d$, where $d$ is a positive integer, is treated in Section 3.2. Finally, in Section 3.3 we give a variant of the case $p \nmid r$ better suited for the case where $r$ is a high-degree prime power.

In [38], Lenstra proves that given two finite fields of the same size, there exists a deterministic polynomial time algorithm that finds an isomorphism between them. The focus of the paper is on theoretical computational complexity; in particular, it avoids using randomized subroutines, such as polynomial factorization. In [2, 3], Allombert gives a similar approach with more focus on practical efficiency. In contrast to Lenstra's, his algorithm relies on polynomial factorization, thus it is polynomial time Las Vegas. Even though neither of the two algorithms is given a detailed complexity analysis, both rely on solving linear systems, thus a rough analysis yields an estimate of $O(r^\omega)$ operations in $\mathbb{F}_q$ in both cases.

The idea of Lenstra's algorithm is as follows. Assume that $r$ is prime, and let $\mathbb{F}_q[\zeta]$ denote the ring extension $\mathbb{F}_q[Z]/\Phi_r(Z)$ where $\Phi_r$ is the $r$-th cyclotomic polynomial. Let $\tau$ be a non $r$-adic residue of $\mathbb{F}_q[\zeta]$, and let $\mathbb{F}_q[\zeta][\theta]$ denote the quotient $\mathbb{F}_q[\zeta][Y]/(Y^r - \tau)$ such that $\theta = \tau^{1/r}$ is the residue class of $Y$. Lenstra shows that $\mathbb{F}_q[\zeta][\theta]$ is isomorphic to $k[\zeta]$ as a ring (Lenstra actually goes the other way around and constructs $\tau$ from $\theta$ as $\tau = \theta^r$ whereas $\theta$ itself comes from a normal basis of $k$ computed using linear algebra. In Lenstra's terminology, $\theta$ and $\tau = \theta^r$ are generators of the Teichmüller subgroups of $k[\zeta]$ and $\mathbb{F}_q[\zeta]$ and solutions to Hilbert's theorem 90).

Furthermore, the algorithm constructs $\theta_1$, $\theta_2$, and $\tau_1$, $\tau_2$ in such a way that there is an integer $j > 0$ for which the map
$$\psi : \mathbb{F}_q[\zeta][\theta_1] \to \mathbb{F}_q[\zeta][\theta_2], \qquad \theta_1 \mapsto \theta_2^j$$
is an isomorphism of rings. Finally, denoting by $\Delta$ the automorphism group of $k[\zeta]$ over $k$, an embedding $k \hookrightarrow K$ is obtained by restricting the above isomorphism $\psi$ to the fixed field $k[\zeta]^\Delta$. To summarize, the algorithm is made of three steps:
• Construct elements $\theta_1 \in k[\zeta]$ and $\theta_2 \in K[\zeta]$;
• Letting $\tau_i = \theta_i^r$, find the integer $j$ such that $\tau_1 = \tau_2^j$ by a discrete logarithm computation in $\mathbb{F}_q[\zeta]$;
• Compute $\alpha \in k$ and $\beta \in K$ as some functions of $\theta_1$, $\theta_2^j$ invariant under $\Delta$.

The algorithm is readily generalized to prime powers $r$ by iterating this procedure.

Allombert's algorithms differ from Lenstra's in two key steps, both resorting to polynomial factorization. First, he computes an irreducible factor $h$ of the cyclotomic polynomial $\Phi_r$ of degree $s$, and so constructs a field extension $\mathbb{F}_q(\zeta)$ as $\mathbb{F}_q[Z]/h(Z)$. Then he defines $k[\zeta] = k[Z]/h(Z)$ and $K[\zeta] = K[Z]/h(Z)$ (note that these are not fields if $r$ is not prime), and constructs $\theta_1 \in k[\zeta]$ and $\theta_2 \in K[\zeta]$ in a way equivalent to Lenstra's using linear algebra. At this point, rather than computing a discrete logarithm, Allombert points out that there exists a $c \in \mathbb{F}_q(\zeta)$ such that $\theta_1 \mapsto c\theta_2$ defines an isomorphism, and that such value can be computed as the $r$-th root of $\theta_1^r/\theta_2^r$. Finally, by making the automorphism group of $k[\zeta]$ over $k$ act on $\theta_1$ and $\theta_2$, he obtains an embedding $k \hookrightarrow K$.
In this section, we analyze the complexity of Allombert’s original algorithm [2], that of its revised version [3], and we present new variants with the best known asymptotic complexities. The main difference with respect to the versions presented in [2, 3] is in the way we compute $\theta_1, \theta_2$, which are solutions to Hilbert’s theorem 90, as will become clear below. Whereas Allombert resorts to linear algebra, we rely instead on evaluation formulas that have a high probability of yielding a solution. Recently, Narayanan [48, Sec. 3] independently described a variant which is similar to our Proposition 8 in the special case $s = 1$.

3.1.1 General strategy

Let $k = \mathbb{F}_q[X]/f(X)$ where $f$ has degree $r$, a prime power, and let $x$ be the image of $X$ in $k$. Let $h(Z)$ be an irreducible factor of the $r$-th cyclotomic polynomial over $\mathbb{F}_q$. Then $h$ has degree $s$, where $s$ is the order of $q$ in the multiplicative group $(\mathbb{Z}/r\mathbb{Z})^\times$. We form the field extension $\mathbb{F}_q(\zeta) \cong \mathbb{F}_q[Z]/h(Z)$ and the ring extension $k[\zeta] = k[Z]/h(Z) \cong k \otimes \mathbb{F}_q(\zeta)$, where $\zeta$ is the image of $Z$ in the quotients. The action of the Galois group $\mathrm{Gal}(k/\mathbb{F}_q)$ can be extended to $k[\zeta]$ by
$$\sigma : k[\zeta] \to k[\zeta], \qquad x \otimes \zeta \mapsto x^q \otimes \zeta.$$
Allombert shows (see [2, Prop. 3.2]) that $\sigma$ is an automorphism of $\mathbb{F}_q(\zeta)$-algebras, and that its fixed set is isomorphic to $\mathbb{F}_q(\zeta)$. The same can be done for the ring $K[\zeta]$. Let us restate the algorithm for clarity.

Algorithm 3
Allombert’s algorithm
Input: field extensions $k, K$ of $\mathbb{F}_q$ of degree $r$.
Output:
The description of a field embedding $k \to K$.
1. Factor the $r$-th cyclotomic polynomial and make the extensions $\mathbb{F}_q(\zeta)$, $k[\zeta]$, $K[\zeta]$;
2. Find $\theta_1 \in k[\zeta]$ such that $\sigma(\theta_1) = \zeta\theta_1$;
3. Find $\theta_2 \in K[\zeta]$ such that $\sigma(\theta_2) = \zeta\theta_2$;
4. Compute an $r$-th root $c$ of $\theta_1^r/\theta_2^r$ in $\mathbb{F}_q(\zeta)$;
5. Let $\alpha, \beta$ be the constant terms of $\theta_1, c\theta_2$ respectively;
6. return the field embedding defined by $\alpha \mapsto \beta$.

The cyclotomic polynomial $\Phi_r$ is factored over $\mathbb{F}_q$ using [56, Theorem 9], and $r$-th root extraction in $\mathbb{F}_q(\zeta)$ is done using Proposition 4, so we are left with the problem of finding $\theta_1$ (and $\theta_2$), that is, instances of Hilbert’s theorem 90.

We now show how to do it in the extension $k[\zeta]/\mathbb{F}_q(\zeta)$, the case of $K[\zeta]$ being analogous. We review approaches due to Allombert, which rely on linear algebra, and propose new algorithms that rely on evaluation formulas and ultimately polynomial arithmetic. Note that all these variants can be directly applied to any extension degree $r$ as long as $p \nmid r$, and do not require $r$ to be a prime power. Nevertheless, in practice, it is more efficient to perform computations for each primary factor independently and glue the results together in the end.

If $A$ is a polynomial with coefficients in $\mathbb{F}_q(\zeta)$, we will denote by $\hat{A}$ the morphism $A(\sigma)$ of the algebra $k[\zeta]$; note that the usual property of $q$-polynomials holds: $\widehat{AB} = \hat{A} \circ \hat{B}$.

As some algorithmic details were omitted in Allombert’s publications, and no precise complexity analysis was performed, we extracted the details from PARI/GP source code [61] and perform the complexity analysis here. We also propose another variant, using an algorithm by Paterson and Stockmeyer.

Allombert’s original algorithm.
A direct solution to Hilbert’s theorem 90 is to find a non-zero $\theta_1 \in k[\zeta]$ such that $\widehat{(S - \zeta)}(\theta_1) = 0$.

The original version of Allombert’s algorithm [2] does precisely this, by computing the matrix of the Frobenius automorphism $\sigma$ of $k/\mathbb{F}_q$ using $O(M(r)\log(q) + rM(r))$ operations in $\mathbb{F}_q$ and then an eigenvector of $\sigma$ for the eigenvalue $\zeta$ over $\mathbb{F}_q(\zeta)$ using linear algebra, at a cost of $O((rs)^\omega)$ operations in $\mathbb{F}_q$. This gives a total cost of $O(sM(r)\log(q) + (rs)^\omega)$ operations in $\mathbb{F}_q$.

Allombert’s revised algorithm.
Allombert’s revision of his own algorithm [3] uses the factorization
$$h(S) = (S - \zeta)\, b(S). \qquad (2)$$
If we set $h(S) = S^s + \sum_{i=0}^{s-1} h_i S^i$, we can explicitly write $b$ as
$$b(S) = \sum_{i=0}^{s-1} b_i(S)\,\zeta^i, \quad \text{where} \quad \begin{cases} b_{s-1}(S) = 1, \\ b_{i-1}(S) = b_i(S)\,S + h_i. \end{cases} \qquad (3)$$
Indeed, Horner’s rule shows that $b_{-1}(S) = h(S)$, and by direct calculation we find that $(S - \zeta) \cdot b(S) = b_{-1}(S)$.

We get a solution to Hilbert’s theorem 90 by evaluating $b(S) = h(S)/(S - \zeta)$ on an element in the kernel of $\hat{h}$ over $k$, linear algebra now taking place over $\mathbb{F}_q$ rather than $\mathbb{F}_q(\zeta)$. The details on the computation of $\hat{h}$ were extracted from PARI/GP source code and yield the following complexity.

Proposition 6.
Using Allombert’s revised algorithm, a solution $\theta$ to Hilbert’s theorem 90 can be computed in $O(M(r)\log(q) + srM(r) + r^\omega)$ operations in $\mathbb{F}_q$.

Proof. As in Allombert’s original algorithm, one first computes the matrix of $\sigma$ over $k$ at a cost of $O(M(r)\log(q) + rM(r))$ operations in $\mathbb{F}_q$.

To get the matrix of $\hat{h}$ over $k$, one first computes the powers $x^{q^i}$ for $0 \le i \le s$ using the matrix of $\sigma$, at a cost of $O(sr^2)$ operations in $\mathbb{F}_q$. From them, one can iteratively compute the powers $x^{jq^i}$ for $2 \le j \le r$ for a total cost of $O(srM(r))$ operations in $\mathbb{F}_q$, and iteratively compute the matrix of $\hat{h}$ for an additional total cost of $O(sr^2)$ operations in $\mathbb{F}_q$, accounting for the scalar multiplications by the coefficients of $h$. The total cost is therefore dominated by $O(srM(r))$ operations in $\mathbb{F}_q$.

Given the matrix of $\hat{h}$ over $k$, computing an element in its kernel costs $O(r^\omega)$ operations in $\mathbb{F}_q$. The final evaluation of $\hat{b}$ is done using Eq. (3) and the matrix of $\sigma$ for Frobenius computations, for a cost of $O(sr^2)$ operations in $\mathbb{F}_q$.

Using the Paterson–Stockmeyer algorithm.
Given the matrix $M_\sigma$ of $\sigma$, there is a natural way of evaluating $\hat{h}$ at a reduced cost: the Paterson–Stockmeyer algorithm [50] computes the matrix of $\hat{h}$ as $h(M_\sigma)$, using $O(\sqrt{s}\, r^\omega)$ operations in $\mathbb{F}_q$. The evaluations of $\sigma$, which take a total of $O(sr^2)$ operations in $\mathbb{F}_q$ when done with the matrix, can instead be done directly using modular exponentiations, for a total of $O(sM(r)\log(q))$ operations.

Proposition 7.
Using the Paterson–Stockmeyer algorithm and modular exponentiations, a solution $\theta$ to Hilbert’s theorem 90 can be computed in $O(sM(r)\log(q) + \sqrt{s}\, r^\omega)$ operations in $\mathbb{F}_q$.

It is immediate to see that the minimal polynomial of $\sigma$ over $k[\zeta]$ is $S^r - 1$; by direct calculation, we verify that it factors as
$$S^r - 1 = (S - \zeta) \cdot \Theta(S) = (S - \zeta) \sum_{i=0}^{r-1} \zeta^{-i-1} S^i. \qquad (4)$$
Hence, we can set
$$\theta_a = \hat{\Theta}(a) = a \otimes \zeta^{-1} + \sigma(a) \otimes \zeta^{-2} + \cdots + \sigma^{r-1}(a) \otimes \zeta^{-r} \qquad (5)$$
for some $a \in k$ chosen at random. Because of Eq. (4), $\theta_a$ is a solution as long as it is non-zero. This is reminiscent of Lenstra’s algorithm [38, Th. 5.2].

To ensure the existence of $a$ such that $\theta_a \ne 0$, we only need to prove that $k$ is not entirely contained in $\ker \hat{\Theta}$. But the maps $\sigma^i$ restricted to $k$ are all distinct, thus Artin’s theorem on character independence (see [35, Ch VI, Theorem 4.1]) shows that they are linearly independent, and therefore $\hat{\Theta}$ is not identically zero on $k$. In practice, we take $a \in k$ at random until $\theta_a \ne 0$. Since the map $\hat{\Theta}$ is $\mathbb{F}_q$-linear and non-zero, it has rank at least 1, thus a random $\theta_a$ is zero with probability at most $1/q$. Therefore, we only need $O(1)$ trials to find $\theta_1$ (and $\theta_2$).

Using the polynomial $b(S)$ introduced in Eq. (2), and defining $g(S) = (S^r - 1)/h(S)$, we can rewrite Eq. (4) as
$$\Theta(S) = b(S) \cdot g(S). \qquad (6)$$
Then, the morphism $\hat{\Theta}$ can be evaluated as $\hat{b} \circ \hat{g}$, the advantage being that $g$ has coefficients in $\mathbb{F}_q$, rather than in $\mathbb{F}_q(\zeta)$: we set $\tau_a = \hat{g}(a)$ for some $a \in k$ chosen at random and compute $\theta_a = \hat{b}(\tau_a)$ using Eq. (3), yielding a solution to Hilbert’s theorem 90 as soon as $\tau_a \ne 0$. As before, $O(1)$ trials are enough to get $\theta_a \ne 0$.

We now give three variations on the above algorithm to compute a candidate solution $\theta_a$ more efficiently. Which algorithm has the best asymptotic complexity depends on the value of $s$ with respect to $r$; we arrange them by increasing $s$.

First solution: divide-and-conquer recursion.
We use a recursive algorithm similar to the computation of trace-like functions in Proposition 2, to directly evaluate $\theta_a$ using Eq. (5). Let $\xi_1 = x^q$ and $\theta_{a,1} = a\zeta^{-1}$, and set the following recursive relations:
$$\xi_j = \begin{cases} \sigma^{j/2}(\xi_{j/2}) & j \text{ even}, \\ \sigma(\xi_{j-1}) & j \text{ odd}, \end{cases} \qquad \theta_{a,j} = \begin{cases} \theta_{a,j/2} + \zeta^{-j/2}\, \sigma^{j/2}(\theta_{a,j/2}) & j \text{ even}, \\ (a + \sigma(\theta_{a,j-1}))\, \zeta^{-1} & j \text{ odd}. \end{cases} \qquad (7)$$
Then $\theta_a = \theta_{a,r}$.

Proposition 8. Given $a \in k$, the value $\theta_a$ in Eq. (5) can be computed using $O(s^{(\omega-1)/2} r^{(\omega+1)/2} \log(r) + M(r)\log(q))$ operations in $\mathbb{F}_q$.

Proof. The value $\xi_1$ is computed by binary powering using $O(M(r)\log(q))$ operations, while the value $\theta_{a,1}$ is deduced from the polynomial $h$ using $O(rs)$ operations.

To compute the recursive formulas in Eq. (7) we use the same technique as in Proposition 2: given $b \in k[\zeta]$, the value $\sigma^j(b)$ is computed as the modular composition of the polynomial $b(x, z)$ with the polynomial $\xi_j(x)$ in the first argument. Each modular composition in $k[\zeta]$ is done using $s$ modular compositions in $k$, at a cost of $O(s^{(\omega-1)/2} r^{(\omega+1)/2})$ operations (see Note 1). Multiplications by $\zeta^{-j}$ are done by seeing the elements of $k[\zeta]$ as polynomials in $x$ over $\mathbb{F}_q(\zeta)$, thus performing $r$ multiplications modulo $h$, at a cost of $O(rM(s))$ operations. Given that the total depth of the recursion is $O(\log(r))$, we obtain the stated bound.

Second solution: automorphism evaluation.
We use Eq. (6) and Eq. (3) to compute $\theta_a$ as $\theta_a = \hat{b} \circ \hat{g}(a)$.

Proposition 9. Given $a \in k$, the value $\theta_a$ in Eq. (5) can be computed using $O(r^{(\omega^2 - 4\omega - 1)/(\omega - 5)} + (s + r^{2/(5-\omega)})\, M(r)\log(q))$ operations in $\mathbb{F}_q$.

Proof. We proceed in two steps. We first compute $\hat{g}(a)$ using the automorphism evaluation algorithm of Kaltofen and Shoup [32, Algorithm AE], at a cost of
$$O\big(r^{(\omega+1)/2 + (3-\omega)|\beta - 1/2|} + r^{(\omega+1)/2 + (1-\beta)(\omega-1)/2} + r^{\beta} M(r)\log(q)\big)$$
for any $0 \le \beta \le 1$. Choosing $\beta = 2/(5 - \omega)$ minimizes the overall runtime, giving the exponents reported above.

We then use Eq. (3) to compute $\theta_a = \sum_{i=0}^{s-1} a_i \otimes \zeta^i$, where $a_{s-1} = \hat{g}(a)$ and $a_{i-1} = \sigma(a_i) + h_i\, \hat{g}(a)$. The cost of this computation is dominated by the evaluations of $\sigma$, which take $O(M(r)\log(q))$ operations each, thus contributing $O(sM(r)\log(q))$ total operations.

Third solution: multipoint evaluation.
Finally, we can compute all the values σ ( a ) , . . . , σ r − ( a )directly, write θ a as a polynomial in x and ζ of degree r − h for each power x i . Proposition 10.
Given a ∈ k , the value θ a in Eq. (5) can be computed using O ( M ( r ) log( r ) + M ( r ) log( q )) operations in F q .Proof. The values σ ( a ) , . . . , σ r − ( a ) can be computed by binary powering using O ( r M ( r ) log( q )).We can do slightly better using the iterated Frobenius technique of von zur Gathen andShoup [64, Algorithm 3.1] (see also [62, Ch. 14.7]), which costs of O ( M ( r ) log( r )+ M ( r ) log( q ))operations. The final reduction modulo h costs O ( r M ( r ) log( r )) operations, which is negli-gible in front of the previous step. 17he following proposition summarizes our analysis. To clarify the order of magnitude ofthe exponents, let us assume q = O (1) and neglect polylogarithmic factors; then, if ω = 2 . O ( s . r . ) for s ∈ O ( r . ), O ( r . + s . r ) for s ∈ Ω( r . ) and s ∈ O ( r . ), and ˜ O ( r ) otherwise. For ω = 3, all costs are at bestquadratic. Proposition 11.
Given k, K of degree r over F q , assuming that s is the order of q in ( Z /r Z ) × , Algorithm 3 computes its output using • O ( s ( ω − / r ( ω +1) / log( r ) + M ( r ) log( q )) expected operations in F q if s ∈ O ( r ( ω − / ( ω − ) ,or • O ( r ( ω − ω − / ( ω − + ( s + r / (5 − ω ) ) M ( r ) log( q ) + s ω − r log( r ) log( s )) expected operationsin F q if if s ∈ Ω( r ( ω − / ( ω − ) and s ∈ O ( r / ( w − ) , or • O ( M ( r ) log ( r ) + M ( r ) log( r ) log( q )) expected operations in F q otherwise.Proof. The cost of factoring the r -th cyclotomic polynomial is an expected O ( M ( r ) log( rq ))operations in F q , using [56, Theorem 9]. This is negligible compared with other steps. Thesolutions θ , θ to Hilbert’s theorem 90 are computed as described above, according to thesize of s . The powers θ r , θ r are computed using Kronecker substitution in O ( M ( sr ) log( r ))operations, which is also negligible. Finally, the cost of computing an r -th root in F q ( ζ ) isgiven by Proposition 4 and can not be neglected.Combining the costs coming from the solution to Hilbert’s theorem 90 and the r -th rootextraction, we obtain the following complexities according to s . • If we use the algorithm described in our first solution, combining Proposition 8 withthe first case of Proposition 4, we obtain an estimate of O ( s ( ω − / r ( ω +1) / log( r ) + M ( r ) log( q )) operations. • If we use the algorithm described in our second solution, combining Proposition 9with the first case of Proposition 4, we obtain an estimate of O ( r ( ω − ω − / ( ω − + ( s + r / (5 − ω ) ) M ( r ) log( q ) + s ω − r log( r ) log( s ) + M ( rs ) log( r ) log( s )) operations. • Otherwise, we use the algorithm described in our third solution. 
Combining Proposi-tion 10 with the second case of Proposition 4, and replacing s with r everywhere, weobtain an estimate of O ( M ( r ) log ( r ) + M ( r ) log( r ) log( q )) expected operations.For s ∈ O ( r ( ω − / ( ω − ), the first solution has the better runtime. Assuming s ∈ Ω( r ( ω − / ( ω − ),the runtime in the second case can be written as O ( r ( ω − ω − / ( ω − +( s + r / (5 − ω ) ) M ( r ) log( q )+ s ω − r log( r ) log( s )). If in addition s is in O ( r / ( w − ), this runtime is subquadratic, that is,better than that in our third solution. This section is devoted to the case r = p d for some positive integer d . The technique wepresent here originates in Adleman and Lenstra’s work [1, Lemma 5], and appears againin Lenstra’s [38] and Allombert’s [2]. The chief difference with previous work once again18onsists in replacing linear algebra with a technique to solve the additive version of Hilbert’stheorem 90 similar to the one in the previous section. Recently, Narayanan [48, Sec. 4]independently described a related variant with a similar complexity.The idea is to build a tower inside the extension k/ F q using polynomials of the form X p − X − a where a ∈ k . To start, let a ∈ F q be such that Tr F q / F p ( a ) (cid:54) = 0. Let σ ∈ Gal( F q / F p ) be a generator of the Galois group. Then by the additive version of Hilbert’stheorem 90 there is no element α ∈ F q such that σ ( α ) − α = a . Equivalently, the polynomial f = X p − X − a has no root in F q . By the Artin–Schreier theorem in [35, Ch VI] f isirreducible over F q . For a root α ∈ k of f the extension F q ( α ) / F q is of degree p . Now let a = a α p − . Then by [1, Lemma 5] the polynomial f = X p − X − a is irreducible over F q ( α ). So, for a root α ∈ k of f the extension F q ( α , α ) / F q ( α ) is of degree p . Continuingthe above process we build a tower F q ⊂ F q ( α ) ⊂ · · · ⊂ F q ( α , · · · , α d ) = k. 
(8)

The idea of building such a tower using the Artin–Schreier polynomials $f_i$ can also be found in [38, 2, 55]. By construction, $\alpha_i \notin \mathbb{F}_q(\alpha_1, \dots, \alpha_{i-1})$ for all $1 \le i \le d$. This means that the minimal polynomial of $\alpha_d$ over $\mathbb{F}_q$ is of degree $r = p^d$. Therefore, $k = \mathbb{F}_q(\alpha_d)$, and the element $\alpha_d$ is uniquely defined up to $\mathbb{F}_q$-isomorphism.

The above construction boils down to computing a root of the polynomial $f = X^p - X - a \in k[X]$. We now show how to efficiently compute such a root. By construction, $a$ is always in an intermediate subfield $\mathbb{F}_q \subseteq k' \subset k$. This means
$$\mathrm{Tr}_{k/\mathbb{F}_p}(a) = \mathrm{Tr}_{k'/\mathbb{F}_p}(\mathrm{Tr}_{k/k'}(a)) = \mathrm{Tr}_{k'/\mathbb{F}_p}(p^i a) = 0$$
for some $i > 0$. By Hilbert’s theorem 90 there exists $\alpha \in k$ such that $\alpha - \sigma(\alpha) = -a$ for a generator $\sigma \in \mathrm{Gal}(k/\mathbb{F}_p)$. In other words, $\alpha^p - \alpha - a = 0$. Therefore, $\alpha$ is a root of $f$. On the other hand, for a random element $\theta \in k$ with nonzero trace, $\alpha$ can be explicitly set as
$$\alpha = \frac{1}{\mathrm{Tr}(\theta)} \Big[ a\,\sigma(\theta) + (a + \sigma(a))\,\sigma^2(\theta) + \cdots + \big(a + \sigma(a) + \cdots + \sigma^{rt-2}(a)\big)\,\sigma^{rt-1}(\theta) \Big] \qquad (9)$$
where $t = [\mathbb{F}_q : \mathbb{F}_p]$. To compute $\alpha$ using Eq. (9) efficiently, we define
$$\xi_i = \sigma^i(x), \qquad \beta_i(u) = u + \sigma(u) + \cdots + \sigma^{i-1}(u), \qquad \alpha_i(v) = \beta_1(a)\,\sigma(v) + \cdots + \beta_i(a)\,\sigma^i(v).$$
A simple calculation gives
$$\alpha_{j+k}(v) = \alpha_j(v) + \sigma^j(\alpha_k(v)) + \beta_j(a)\,\sigma^{j+1}(\beta_k(v)).$$
From these we can extract the following recursive relations:
$$\xi_j = \begin{cases} \sigma^{j/2}(\xi_{j/2}) & j \text{ even}, \\ \sigma(\xi_{j-1}) & j \text{ odd}, \end{cases} \qquad \beta_j(u) = \begin{cases} \beta_{j/2}(u) + \sigma^{j/2}(\beta_{j/2}(u)) & j \text{ even}, \\ u + \sigma(\beta_{j-1}(u)) & j \text{ odd}, \end{cases}$$
$$\alpha_j(v) = \begin{cases} \alpha_{j/2}(v) + \sigma^{j/2}(\alpha_{j/2}(v)) + \beta_{j/2}(a)\,\sigma^{j/2+1}(\beta_{j/2}(v)) & j \text{ even}, \\ \alpha_1(v) + \sigma(\alpha_{j-1}(v)) + a\,\sigma^2(\beta_{j-1}(v)) & j \text{ odd}. \end{cases}$$
Therefore, the values $\mathrm{Tr}(\theta) = \beta_{rt}(\theta)$ and $\alpha = \beta_{rt}(\theta)^{-1} \alpha_{rt}(\theta)$ can be computed recursively, in $O(\log(rt))$ steps. At step $j$ of the recursive algorithm, $\xi_j, \beta_j(a), \beta_j(\theta), \alpha_j(\theta)$ are computed. As before, the action of $\sigma^j$ is the same as composing with $\xi_j$. So each step of the recursion is dominated by $O(1)$ modular compositions over $\mathbb{F}_q$ at the cost of $O(r^{(\omega+1)/2})$ operations in $\mathbb{F}_q$. The initial value of $\xi_1 = x^q$ is computed using $O(M(r)\log(q))$ operations in $\mathbb{F}_q$. Therefore, the cost of computing a root of $f$ is $O(r^{(\omega+1)/2}\log(rt) + M(r)\log(q))$ operations in $\mathbb{F}_q$.

Now, to compute $\alpha_d$ in Eq. (8) we need to take $d$ roots, where $d \in O(\log(r)/\log(p))$, which leads to the following result. (Note that $\xi_1$ is computed only once and reused thereafter.)

Proposition 12.
Let $r = p^d$ for a positive integer $d$, and let $t = [\mathbb{F}_q : \mathbb{F}_p]$. An isomorphism of two extensions $k/\mathbb{F}_q$, $K/\mathbb{F}_q$ of degree $r$ can be constructed using $O(r^{(\omega+1)/2}\log(rt)\log(r) + M(r)\log(q))$ operations in $\mathbb{F}_q$.

We end this section with an algorithm that is particularly efficient when the extension degree $r$ is a high-degree prime power. Allombert’s algorithm works well in this case; however, its complexity depends linearly on the order $s$ of $q$ modulo $r$. If $r = v^d$ for some prime $v \ne p$, it is natural to seek an algorithm which depends on the order of $q$ modulo $v$ instead. The idea we present is a variation on Lenstra’s algorithm, using successive $v$-th root extractions. We are not aware of this algorithm appearing anywhere in the literature. We also note that Narayanan [48, Sec. 5] recently published a radically different generalization of Allombert’s algorithm with a very similar complexity in $r$ (his algorithm has much worse complexity in $q$, though).

An overview of our construction is as follows. Let $r = v^d$ where $v \ne p$ is a prime and $d$ is a positive integer. Suppose the extension $k/\mathbb{F}_q$ is of degree $r$. Let $s$ be the order of $q$ in $(\mathbb{Z}/v\mathbb{Z})^\times$, and write $q^s - 1 = uv^t$ where $\gcd(v, u) = 1$. We first move to cyclotomic field extensions $\mathbb{F}_q(\zeta)$, $k(\zeta)$, $K(\zeta)$ of degree $s$ over $\mathbb{F}_q$, $k$, $K$ respectively, by obtaining an irreducible factor of the $v$-th cyclotomic polynomial over $\mathbb{F}_q$. Then we obtain a random non-$v$-adic residue $\eta \in \mathbb{F}_q(\zeta)$.

We have $[k(\zeta) : \mathbb{F}_q(\zeta)] = r$, so we can compute an $r$-th root $\theta$ of $\eta$ in $k(\zeta)$ using $d$ successive $v$-th root extractions in $k(\zeta)$. Therefore, $\theta$ is a generator for the unique subgroup of $k(\zeta)^*$ of order $v^{d+t}$. Then the constant term $\alpha$ of $\theta$ is such that $k = \mathbb{F}_q(\alpha)$. Doing the same in $K$ yields an element $\beta \in K$ such that the map $\alpha \mapsto \beta$ defines an isomorphism.
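The integer parameters of this construction are easy to compute. The sketch below, in Python, returns the order $s$ of $q$ modulo $v$ together with the decomposition $q^s - 1 = uv^t$, and tests $v$-adic residuosity by the power criterion $\eta^{(q^s-1)/v} = 1$. The function names are ours, and the residuosity test is restricted, as an assumption made for simplicity, to the case where $q$ is prime and $s = 1$, so that plain integer arithmetic modulo $q$ suffices; the general case requires arithmetic in $\mathbb{F}_q(\zeta)$.

```python
def kummer_parameters(q: int, v: int) -> tuple[int, int, int]:
    """Return (s, u, t) where s is the order of q modulo the prime v
    and q**s - 1 == u * v**t with gcd(u, v) == 1."""
    if q % v == 0:
        raise ValueError("v must not divide q")
    s, power = 1, q % v
    while power != 1:
        power = (power * q) % v
        s += 1
    u, t = q**s - 1, 0
    while u % v == 0:  # strip the full power of v
        u //= v
        t += 1
    return s, u, t


def is_v_adic_residue(eta: int, q: int, v: int) -> bool:
    """Power test in the prime field F_q, assuming q prime and v | q - 1."""
    if (q - 1) % v != 0 or eta % q == 0:
        raise ValueError("need v | q - 1 and eta non-zero modulo q")
    return pow(eta, (q - 1) // v, q) == 1
```

With $q = 7$ and $v = 3$ one gets $s = 1$ and $q - 1 = 2 \cdot 3$; the element 2 is not a cube modulo 7, so it could serve as $\eta$ in that toy case.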
The main difficulty in applying such an algorithm resides in computing efficiently $v$-th roots in $k(\zeta)$, for which we use Proposition 5; this yields the main result of this section.

Theorem 13. Let $r = v^d$ where $v \ne p$ is a prime and $d$ is a positive integer. Also let $s$ be the order of $q$ in $(\mathbb{Z}/v\mathbb{Z})^\times$. Given extensions $k/\mathbb{F}_q$, $K/\mathbb{F}_q$ of degree $r$, an embedding $k \hookrightarrow K$ can be constructed at the cost of an expected $O(s^{(\omega-1)/2} r^{(\omega+1)/2}\log(r) + sM(r)\log(r)\log(q))$ operations in $\mathbb{F}_q$.

Proof. We can construct the embedding of Theorem 13 as follows. We first build the extensions $k(\zeta)/\mathbb{F}_q(\zeta)$ and $K(\zeta)/\mathbb{F}_q(\zeta)$. Let $\eta$ be a non-$v$-adic residue in $\mathbb{F}_q(\zeta)$. Then $\eta$ is an $r$-th power in $k(\zeta)$ and $K(\zeta)$. To obtain $r$-th roots $\theta_1 \in k(\zeta)$, $\theta_2 \in K(\zeta)$ of $\eta$ we take $d$ successive $v$-th roots.

Algorithm 4 Kummer-type algorithm for extension towers
Input: extensions $k/\mathbb{F}_q$, $K/\mathbb{F}_q$ of prime-power degree $r = v^d$, with $v \ne p$.
Output:
The description of a field embedding $k \hookrightarrow K$.
1. Factor the $v$-th cyclotomic polynomial over $\mathbb{F}_q$ to build the extensions $k(\zeta)/\mathbb{F}_q(\zeta)$ and $K(\zeta)/\mathbb{F}_q(\zeta)$;
2. Find a random non-$v$-adic residue $\eta \in \mathbb{F}_q(\zeta)$;
3. Compute $r$-th roots $\theta_1, \theta_2$ of $\eta$ in $k(\zeta), K(\zeta)$;
4. Let $\alpha, \beta$ be the constant terms of $\theta_1, \theta_2$ respectively;
5. return the field embedding defined by $\alpha \mapsto \beta$.

Step 1 is done using [56, Theorem 9], which takes $O(M(v)\log(vq))$ operations in $\mathbb{F}_q$.

We do Step 2 by taking random elements in $\mathbb{F}_q(\zeta)$ until a non-$v$-adic residue is found. Testing $v$-adic residuosity of $\eta$ amounts to computing $\eta^{(q^s-1)/v}$ in $\mathbb{F}_q(\zeta)$, which can be done in $O(s^{(\omega-1)/2}\log(s) + M(s)\log(v)\log(s) + M(s)\log(q))$ operations in $\mathbb{F}_q$, in view of our discussion in Section 2.

Step 3 is done using $d = O(\log(r)/\log(v))$ successive root extractions, each of which takes an expected $O(s^{(\omega-1)/2} r^{(\omega+1)/2}\log(r) + sM(r)\log(q))$ operations in $\mathbb{F}_q$. Therefore Algorithm 4 runs in an expected $O(s^{(\omega-1)/2} r^{(\omega+1)/2}\log(r) + sM(r)\log(r)\log(q))$ operations in $\mathbb{F}_q$.

We now move on to a different family of algorithms based on the theory of algebraic groups. The simplest of these is Pinch’s cyclotomic algorithm [51]. The idea is very simple: given $r$, select an integer $\ell$ such that $[\mathbb{F}_q(\mu_\ell) : \mathbb{F}_q] = r$, where $\mu_\ell$ is the group of $\ell$-th roots of unity. Then, any embedding $k \to K$ takes $\mu_\ell \subset k^*$ to $\mu_\ell \subset K^*$, and the minimal polynomial of any primitive $\ell$-th root of unity has degree exactly $r$.

Pinch’s algorithm is very effective when $r = \varphi(\ell)$. Indeed in this case the $\ell$-th cyclotomic polynomial $\Phi_\ell$ is irreducible over $\mathbb{F}_q$, and its roots form a unique orbit under the action of the absolute Galois group of $\mathbb{F}_q$.
Thus we can take any primitive $\ell$-th roots of unity $\alpha \in k$ and $\beta \in K$ to describe the embedding.

In the general case, however, the roots of $\Phi_\ell$ are partitioned in $\varphi(\ell)/r$ orbits, thus for two randomly chosen $\ell$-th roots of unity $\zeta_1 \in k$ and $\zeta_2 \in K$, we can only say that there exists an exponent $e$ such that
$$\alpha = \zeta_1 \mapsto \zeta_2^e = \beta$$
defines a valid embedding. Pinch’s algorithm tests all possible exponents $e$, until a suitable one is found. To test for the validity of a given $e$, it applies the embedding $\phi : \zeta_1 \mapsto \zeta_2^e$ to the class of $X$ in $k$, and verifies that its image is a root of $f$ in $K$ (see Section 8 for details on embedding evaluation).

The trial-and-error nature of Pinch’s algorithm makes it impractical, except for rare favorable cases where a small $\ell$ such that $r = \varphi(\ell)$ can be found. One possible workaround, suggested by Pinch himself, is to replace the group of roots of unity with a group of torsion points of a well chosen elliptic curve. We analyze this idea in greater detail in Section 5.

This section is devoted to a different way of improving Pinch’s algorithm, imagined by Rains [53], and implemented in the Magma computer algebra system [4]. Rains’ technical contribution is twofold: first he replaces roots of unity with Gaussian periods to avoid trial-and-error, second he moves to slightly larger extension fields to ensure the existence of a small $\ell$ as above.

For the rest of the section, we are going to assume that $q$ is prime. The case where $q$ is a higher power of a prime is discussed in Note 18.

Suppose that we have an $\ell$, coprime with $q$, such that $[\mathbb{F}_q(\mu_\ell) : \mathbb{F}_q] = r$; then the cyclotomic polynomial $\Phi_\ell$ factors over $\mathbb{F}_q$ into $\varphi(\ell)/r$ distinct factors of degree $r$. Pinch’s method, by choosing random roots of $\Phi_\ell$ in $k$ and $K$, randomly selects one of these factors as minimal polynomial.
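The count of $\varphi(\ell)/r$ factors can be checked on small cases: the Frobenius sends a primitive root $\zeta^e$ to $\zeta^{eq}$, so the irreducible factors of $\Phi_\ell$ correspond to the orbits of the units modulo $\ell$ under multiplication by $q$. The Python sketch below enumerates these orbits; the function name is ours and the code is only an illustration of the counting argument, not part of Pinch’s or Rains’ algorithms.

```python
from math import gcd


def frobenius_orbits(q: int, ell: int) -> list[list[int]]:
    """Partition the units modulo ell into orbits under e -> e * q (mod ell)."""
    if gcd(q, ell) != 1:
        raise ValueError("q and ell must be coprime")
    seen: set[int] = set()
    orbits: list[list[int]] = []
    for e in range(1, ell):
        if gcd(e, ell) != 1 or e in seen:
            continue
        orbit, x = [], e
        while x not in seen:  # follow e, eq, eq^2, ... until the cycle closes
            seen.add(x)
            orbit.append(x)
            x = (x * q) % ell
        orbits.append(orbit)
    return orbits
```

For $q = 2$ and $\ell = 7$ this yields two orbits of size 3, matching the two cubic factors of $\Phi_7$ over $\mathbb{F}_2$; Pinch’s trial-and-error search amounts to finding which orbit the randomly chosen root in $K$ falls into.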
By combining the roots of $\Phi_\ell$ into Gaussian periods, Rains’ method uniquely selects a minimal polynomial of degree $r$.

Definition 14.
Let $q$ be a prime, and let $\ell$ be a squarefree integer such that $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle q \rangle \times S$ for some $S$. For any generator $\zeta_\ell$ of $\mu_\ell$ in $\mathbb{F}_q(\mu_\ell)$, define the Gaussian period $\eta_q(\zeta_\ell)$ as
$$\eta_q(\zeta_\ell) = \sum_{\sigma \in S} \zeta_\ell^{\sigma}. \qquad (10)$$
It is evident from the definition that the Galois orbit of $\eta_q(\zeta_\ell)$ is independent of the initial choice of $\zeta_\ell$. Much less evident is the fact that this orbit has maximal size and forms a normal basis of $\mathbb{F}_q(\mu_\ell)$, as stated in the following lemma.

Lemma 15.
Let $q$ be a prime, and let $\ell$ be a squarefree integer such that $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle q \rangle \times S$ for some $S$. The periods $\eta_q(\zeta_\ell^\tau)$ for $\tau$ running through $\langle q \rangle$ form a normal basis of $\mathbb{F}_q(\mu_\ell)$ over $\mathbb{F}_q$, independent of the choice of $\zeta_\ell$.

Proof. See [25, Main Theorem]. The main idea of the proof is to show that cyclotomic units are normal in characteristic zero, then that integrality conditions carry normality through reduction modulo $q$.

In what follows we are going to write $\eta(\zeta_\ell)$ when $q$ is clear from the context.

Example 16.
Consider the extension $\mathbb{F}_8/\mathbb{F}_2$ of degree 3, which is generated by the 7-th roots of unity. We have a decomposition $(\mathbb{Z}/7\mathbb{Z})^\times = \langle 2 \rangle \times \langle -1 \rangle$, and the cyclotomic polynomial factors as
$$\Phi_7(X) = (X^3 + X + 1)(X^3 + X^2 + 1). \qquad (11)$$
For any root $\zeta$, we define the period
$$\eta(\zeta) = \zeta + \zeta^{-1}. \qquad (12)$$
The three periods $\eta(\zeta)$, $\eta(\zeta^2)$ and $\eta(\zeta^4)$ are all roots of the polynomial $x^3 + x^2 + 1$ and form a normal basis of $\mathbb{F}_8/\mathbb{F}_2$.

4.2 Rains’ cyclotomic algorithm

The bottom line of Rains’ algorithm follows immediately from the previous section: given $k$, $K$ and $r$,
1. find a small $\ell$ satisfying the conditions of Lemma 15 with $[\mathbb{F}_q(\mu_\ell) : \mathbb{F}_q] = r$;
2. take random $\ell$-th roots of unity $\zeta_\ell \in k$ and $\zeta'_\ell \in K$;
3. return the Gaussian periods $\alpha_r = \eta(\zeta_\ell)$ and $\beta_r = \eta(\zeta'_\ell)$.

The problem with this algorithm is the vaguely defined smallness requirement on $\ell$. Indeed the conditions of Lemma 15 imply that $\ell$ divides $\Phi_r(q)$, thus in the worst case $\ell$ can be as large as $O(q^{\varphi(r)})$, which yields an algorithm of exponential complexity in the field size.

To circumvent this problem, Rains allows the algorithm to work in small auxiliary extensions of $k$ and $K$, and then descend the results to $k$ and $K$ via a field trace. In other words, Rains’ algorithm looks for $\ell$ such that $[\mathbb{F}_q(\mu_\ell) : \mathbb{F}_q] = rs$ for some small $s$. We summarize this method in Algorithm 5; we only give the procedure for the field $k$, the procedure for the field $K$ being identical.

Algorithm 5
Rains’ cyclotomic algorithm
Input: a field extension $k/\mathbb{F}_q$ of degree $r$; a squarefree integer $\ell$ such that
• $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle q \rangle \times S$ for some $S$,
• $\#\langle q \rangle = rs$ for some integer $s$;
a polynomial $h$ of degree $s$ irreducible over $k$.
Output:
A normal generator of $k$ over $\mathbb{F}_q$, with a uniquely defined Galois orbit.
1. Construct the field extension $k' = k[Z]/h(Z)$;
2. repeat
3.   Compute $\zeta \leftarrow \theta^{(\#k' - 1)/\ell}$ for a random $\theta \in k'$
4. until $\zeta$ is a primitive $\ell$-th root of unity;
5. Compute $\eta(\zeta) \leftarrow \sum_{\sigma \in S} \zeta^\sigma$;
6. return $\alpha \leftarrow \mathrm{Tr}_{k'/k}\, \eta(\zeta) = \sum_{i=0}^{s-1} \eta(\zeta)^{q^{ri}}$.

Proposition 17.
Algorithm 5 is correct. On input $q, r, \ell, s$ it computes its output using $O(sr^{(\omega+1)/2}\log(sr) + M(sr)(\log(q) + (\ell/r)\log(\ell)))$ operations in $\mathbb{F}_q$ on average.

Proof. By construction $k'$ is isomorphic to $\mathbb{F}_q(\mu_\ell)$. By Lemma 15 $\eta(\zeta)$ is a normal generator of $k'$, and by [46, Prop. 5.2.3.1] $\alpha$ is a normal generator of $k$. This proves correctness.

According to Proposition 2, computing $\zeta$ in Step 3 costs
$$O\Big(\big(s^{(\omega+1)/2} M(r) + sr^{(\omega+1)/2} + M(sr)\log(\ell)\big)\log(sr) + M(sr)\log(q)\Big),$$
and the loop is executed $O(1)$ times on average. By observing that $s^{(\omega-1)/2} \in O(\ell/r)$, this fits into the stated bound.

Steps 5 and 6 can be performed at once by observing that
$$\alpha = \sum_{i=0}^{s-1} \eta\big(\zeta^{q^{ri}}\big) = \sum_{i=0}^{s-1} \sum_{\sigma \in S} \zeta^{q^{ri}\sigma}.$$
By reducing $q^{ri}\sigma$ modulo $\ell$, we can compute this sum at the cost of $\varphi(\ell)/r$ exponentiations of degree at most $\ell$ in $k'$, for a total cost of $O(M(sr)(\ell/r)\log(\ell))$, using the techniques of Section 2. The final result is obtained as an element of $k$.

The attentive reader will have noticed the irreducible polynomial $h$ of degree $s$ given as input to Rains’ algorithm. Computing this polynomial may be expensive. For a start, we may ask $s$ to be coprime with $r$, so that $h$ can be taken with coefficients in $\mathbb{F}_q$. Then, for small values of $s$ and $q$, one may use a table of irreducible polynomials. For larger values, the constructions [14, 18, 19] are reasonably efficient, and yield an irreducible polynomial in time less than quadratic in $s$. Although negligible from an asymptotic point of view, the construction of the polynomial $h$ and of the field $k'$ takes a serious toll on the practical performance of Rains’ algorithm.

This concludes the presentation of Rains’ algorithm.
However, we are still left with a problem: how to find $\ell$ satisfying the conditions of the algorithm, and what bounds can be given on it. These questions will be analyzed in Section 6.
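As a sanity check, the smallest instance of Rains’ cyclotomic algorithm, namely Example 16 with $q = 2$, $r = 3$, $\ell = 7$ and no auxiliary extension ($s = 1$), can be run by hand. The Python sketch below is only an illustration under our own conventions: elements of $\mathbb{F}_8 = \mathbb{F}_2[X]/(X^3 + X + 1)$ are encoded as 3-bit integers, and all helper names are hypothetical. Since $(\#k' - 1)/\ell = 1$, every element other than 0 and 1 is already a primitive 7-th root of unity.

```python
MODULUS = 0b1011  # X^3 + X + 1, one of the two cubic factors of Phi_7 over F_2


def gf8_mul(a: int, b: int) -> int:
    """Multiply two elements of F_8 encoded as bit vectors of polynomials over F_2."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:  # degree reached 3: reduce modulo X^3 + X + 1
            a ^= MODULUS
    return product


def gf8_pow(a: int, n: int) -> int:
    """Naive exponentiation in F_8 (n is tiny here)."""
    result = 1
    for _ in range(n):
        result = gf8_mul(result, a)
    return result


def gaussian_period(zeta: int) -> int:
    """eta(zeta) = zeta + zeta^(-1): the sum over S = {1, -1} = {1, 6} modulo 7."""
    return zeta ^ gf8_pow(zeta, 6)


def is_root_of_period_polynomial(y: int) -> bool:
    """Check whether y is a root of x^3 + x^2 + 1."""
    return (gf8_pow(y, 3) ^ gf8_pow(y, 2) ^ 1) == 0


# Every primitive 7-th root of unity gives a period in the same Galois orbit.
periods = {gaussian_period(zeta) for zeta in range(2, 8)}
```

Whatever root of unity is drawn, the period lands in the same set of three conjugates, each a root of $x^3 + x^2 + 1$; this is the uniqueness property that removes the trial-and-error step of Pinch’s method.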
Note 18. Rains’ algorithm is easily extended to a non-prime field $\mathbb{F}_q$, as long as $q = p^d$ with $\gcd(d, r) = 1$. In this case, indeed, any generator of $\mathbb{F}_{p^r}$ over $\mathbb{F}_p$ is also a generator of $\mathbb{F}_{q^r}$ over $\mathbb{F}_q$. The algorithm is unchanged, except for the additional requirement that $\gcd(\varphi(\ell), d) = 1$, which ensures that the Gaussian periods indeed generate $\mathbb{F}_{p^r}$.

However, when $\gcd(d, r) \ne 1$, it is impossible to have $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle q \rangle \times S$, so Rains’ algorithm simply cannot be applied to this case. In the next section we are going to present a variant that does not suffer from this problem.

The Pinch/Rains algorithm presented in the previous section relies on the use of the multiplicative group of finite fields. It is natural to try to extend it to other types of algebraic groups in order to cover a wider range of parameters. And indeed Pinch [51] showed how to use torsion points of elliptic curves in place of roots of unity. Rains also considered this possibility, but did not investigate it thoroughly as no theoretical gain was to be expected.

However, the situation in practice is quite different. In particular, the need for auxiliary extensions in the cyclotomic method is very costly, whereas the elliptic variant has naturally more chances to work in the base fields, and to be therefore very competitive.

In the next sections, we first introduce elliptic periods, a straightforward generalization of Gaussian periods for torsion points of elliptic curves, then analyze the cost of their computation. The main issue with this generalization is that, contrary to Gaussian periods,

A straightforward way to avoid these constructions consists in computing a factor $h$ of the cyclotomic polynomial $\Phi_\ell$ over the extension $k$ following case 5 from Section 2.1. Then, using Newton’s identities, the period can be recovered from the logarithmic derivative of the reciprocal of $h$.
Nevertheless, the cost of factoring $\Phi_\ell$ renders this approach impractical.

An elliptic curve $E/L$ defined over a field $L$ is given by an equation of the form
$$E : y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$$
with $a_1, a_2, a_3, a_4, a_6 \in L$. For any field extension $M/L$ the group of $M$-rational points of $E$ is the set
$$E(M) = \{(x, y) \in M^2 \mid E(x, y) = 0\} \cup \{\mathcal{O}\}$$
endowed with the usual group law, where $\mathcal{O}$ is the point at infinity.

For an integer $\ell$, we denote by $E[\ell]$ the $\ell$-torsion subgroup of $E(\bar{L})$, where $\bar{L}$ denotes the algebraic closure of $L$. In this section we are going to consider integers $\ell$ coprime with the characteristic of $L$, so that $E[\ell]$ is a group of rank 2.

For an elliptic curve $E/\mathbb{F}_q$ defined over a finite field, we denote by $\pi$ its Frobenius endomorphism. It is well known that $\pi$ satisfies a quadratic equation $\pi^2 - t\pi + q = 0$, where $t$ is called the trace of $E$, and that this equation determines the cardinality of $E$ as $\#E(\mathbb{F}_q) = q + 1 - t$.

Like in the cyclotomic case, the Frobenius endomorphism partitions $E[\ell]$ into orbits. Our goal is to take traces of points in $E[\ell]$ so that a uniquely defined orbit arises. This task is made more complex by the fact that $E[\ell]$ has rank 2, hence we are going to restrict to a family of primes $\ell$ named Elkies primes.

Definition 19 (Elkies prime). Let $E/\mathbb{F}_q$ be an elliptic curve, and let $\ell$ be a prime number not dividing $q$. We say that $\ell$ is an Elkies prime for $E$ if the characteristic polynomial of the Frobenius endomorphism $\pi$ splits into two distinct factors over $\mathbb{Z}/\ell\mathbb{Z}$:
$$\pi^2 - t\pi + q = (\pi - \lambda)(\pi - \mu) \bmod \ell \quad \text{with } \lambda \ne \mu. \qquad (13)$$
Note that if $\ell$ is an Elkies prime for $E$, then $E[\ell]$ splits into two eigenspaces for $\pi$ which are defined on extensions of $\mathbb{F}_q$ of degrees $\mathrm{ord}_\ell(\lambda)$ and $\mathrm{ord}_\ell(\mu)$. We are now ready to define the elliptic curve analogue of Gaussian periods.

Definition 20.
Let E/ F q be an elliptic curve of j -invariant not 0 or 1728. Let (cid:96) > E , λ an eigenvalue of π , and P a point of order (cid:96) in the eigenspacecorresponding to λ (i.e., such that π ( P ) = λP ). Suppose that there is a subgroup S of( Z /(cid:96) Z ) × such that ( Z /(cid:96) Z ) × = (cid:104) λ (cid:105) × S. (14)Then we define an elliptic period as η λ,S ( P ) = (cid:40)(cid:80) σ ∈ S/ {± } x ([ σ ] P ) if − ∈ S , (cid:80) σ ∈ S x ([ σ ] P ) otherwise, (15)where x ( P ) denotes the abscissa of P . 25or a generalization of this definition that also covers the cases j = 0 , Lemma 21.
With the same notation as in Definition 20, let
$$\#\langle\lambda\rangle = \begin{cases} r & \text{if } -1 \notin \langle\lambda\rangle, \\ 2r & \text{otherwise.} \end{cases}$$
Then, for any point $P$ in the eigenspace of $\lambda$, the period $\eta_{\lambda,S}(P)$ is in $\mathbb{F}_{q^r}$, and its minimal polynomial does not depend on the choice of $P$.

Proof. By construction, the Frobenius endomorphism $\pi$ acts on $\langle P\rangle$ as multiplication by the scalar $\lambda$. It is well known that two points have the same abscissa if and only if they are equal or opposite, hence the Galois orbit of $x(P)$ has size $r$, and we conclude that both $x(P)$ and $\eta_{\lambda,S}(P)$ are in $\mathbb{F}_{q^r}$.

Now let $P' = [a]P$ be another point in the eigenspace of $\lambda$. By construction, $a = \pm\lambda^i\sigma$ for some $0 \le i < r$ and some $\sigma \in S$. Hence $\eta_{\lambda,S}(P') = \eta_{\lambda,S}([\lambda^i]P)$, implying that $\eta_{\lambda,S}(P)$ and $\eta_{\lambda,S}(P')$ are conjugates in $\mathbb{F}_{q^r}$.

We remark that the previous lemma only states that the elliptic periods $\eta_{\lambda,S}([\lambda^i]P)$ uniquely define an orbit inside $\mathbb{F}_{q^r}$, but gives no guarantee that they generate the whole of $\mathbb{F}_{q^r}$. At this point, one would like to have an equivalent of Lemma 15 for elliptic periods, i.e., that the elliptic period $\eta_{\lambda,S}(P)$ is a normal generator of $\mathbb{F}_q(x(P))$. However, it is easy to find non-normal elliptic periods, as the following example shows.

Example 22.
Let $E/\mathbb{F}_7$ be defined by $y^2 = x^3 + 5x + 4$, and consider the degree 3 extension of $\mathbb{F}_7$ defined by $k = \mathbb{F}_7[X]/(X^3 + 6X^2 + 4)$. Then
• $\ell = 31$ is an Elkies prime for $E$;
• the eigenvalues of the Frobenius modulo $\ell$ are $\lambda = 25$ of multiplicative order 3 and $\mu = 4$ of multiplicative order 5;
• $P = (5a^2 + 2a, 4)$ is a point of order 31 of $E/k$;
• $\eta = \eta_{\lambda,S}(P) = 5a^2 + 5a + 4$ is not a normal element, indeed $\eta + 4\eta^7 + 2\eta^{49} = 0$.

All well-known proofs of Lemma 15 rely on the fact that the $\ell$-th cyclotomic polynomial is irreducible over $\mathbb{Q}$, and its roots form a normal basis of $\mathbb{Q}(\zeta_\ell)$. This fails in the elliptic case: there is indeed no guarantee that the eigenspace of $\lambda$ can be lifted to a normal basis over some number field.

Note however that, even if the elliptic period is not normal, it is enough for our purpose that it generates $\mathbb{F}_q(x(P))$ as a field, as in the example above. In Appendix C we gather some experimental evidence suggesting that this might always be the case. Thus, we state this as a conjecture.

Conjecture 23.
With the above notation, the elliptic period $\eta_{\lambda,S}(P)$ generates $\mathbb{F}_q(x(P))$ over $\mathbb{F}_q$.

If the conjecture is false, the only arguments we can give are of a heuristic nature. First and most simply, we can assume that the elliptic period behaves like a random element of $\mathbb{F}_q(x(P))$. In this case the chance of it not being a generator is approximately $1/q^r$. Secondly, in Appendix C we give a sufficient condition for the period to be a normal generator of $\mathbb{F}_q(x(P))$. This is a weak counterpart to Lemma 15, based on the polynomially cyclic algebras setting of [43]. Heuristically, it suggests that the chance of the period being normal is approximately $1 - 1/q$. We are now ready to present the generalization of Rains’ algorithm, with the warning that the algorithm may fail, with low probability, if Conjecture 23 is false.

Rains’ cyclotomic algorithm needs auxiliary extensions to accommodate sufficiently small subgroups $\mu_\ell$ of the unit group. By replacing unit groups with torsion groups of elliptic curves, we gain more freedom in the choice of the size of the group, and are thus able to work with smaller fields.

The algorithm is very similar to Algorithm 5, and follows immediately from the previous section. For simplicity, we state it only for $r$ odd. Given $k$, $K$ and $r$:
1. find a prime $\ell$, an elliptic curve $E$, and an eigenvalue $\lambda$ of the Frobenius endomorphism, satisfying the conditions of Definition 20, and such that $\mathrm{ord}_\ell(\lambda) = r$;
2. take random points $P \in E(k)[\ell]$ and $P' \in E(K)[\ell]$ in the eigenspace of $\lambda$;
3. return the elliptic periods $\alpha := \eta_{\lambda,S}(P)$ and $\beta := \eta_{\lambda,S}(P')$.

Here we are faced with a difficulty: given $E$ and $\lambda$ it is easy to pick a random point in $E[\ell]$, but it is potentially much more expensive to compute a point in the eigenspace of $\lambda$. We circumvent the problem by forcing $E(\mathbb{F}_{q^r})[\ell]$ to be of rank 1, and to coincide exactly with the eigenspace of $\lambda$. If we write $\mu = q/\lambda$ for the other eigenvalue of $\pi$, this is easily ensured by further asking that $\mathrm{ord}_\ell(\mu) \nmid r$.

We defer the discussion on the search for the elliptic curve $E$ to Section 6. Here we suppose that we are already given suitable parameters $\ell$, $E$ and $\lambda$, and analyze the last two steps of the algorithm, summarized below. We only give the procedure for $k$, the procedure for the field $K$ being identical.

Algorithm 6
Elliptic Rains’ algorithm
Input: A field extension $k/\mathbb{F}_q$ of odd degree $r$, an elliptic curve $E/\mathbb{F}_q$, its trace $t$, a prime $\ell$ not dividing $q$, an integer $\lambda$ such that:
• $X^2 - tX + q = (X - \lambda)(X - q/\lambda) \bmod \ell$,
• $\mathrm{ord}_\ell(\lambda) = r$, $\mathrm{ord}_\ell(q/\lambda) \nmid r$,
• $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle\lambda\rangle \times S$ for some $S$.
Output: A generator of $k$ over $\mathbb{F}_q$, with a uniquely defined Galois orbit, or FAIL.
repeat
    Compute $P \leftarrow [\#E(k)/\ell]\,Q$ for a random $Q \in E(k)$;
until $P \neq \mathcal{O}$;
Compute $\alpha \leftarrow \eta_{\lambda,S}(P)$;
return $\alpha$ if $k = \mathbb{F}_q(\alpha)$, FAIL otherwise.

Proposition 24.
Algorithm 6 is correct. Assuming the heuristics developed in Appendix C, it fails with probability $\le 1/q^r$. On input $r, q, E, t, \ell, \lambda$ it computes its output using $O(\mathsf{M}(r)(r\log(q) + (\ell/r)\log(\ell)))$ operations in $\mathbb{F}_q$ on average, or $\tilde{O}(r^2\log(q))$ assuming $\ell \in o(r^2)$.

Proof. Correctness follows immediately from Lemma 21. The success probability comes from the assumption that $\eta_{\lambda,S}(P)$ behaves like a random element of $\mathbb{F}_q(x(P))$, as discussed in Appendix C.

From the knowledge of the trace $t$, we immediately determine the zeta function of $E$, and hence the cardinality $\#E(k)$, at no algebraic cost.

To select the random point $Q \in E(k)$ we take a random element $x \in k$, then we verify that it is the abscissa of a point using a squareness test, at a cost of $O(r\,\mathsf{M}(r)\log(q))$ operations. Then, using Montgomery’s formulas for scalar multiplication [45], we can compute the points $P$ and $[\ell]P$ without the knowledge of the ordinate of $Q$, at a cost of $O(r\,\mathsf{M}(r)\log(q))$ operations. A valid point is obtained after $O(1)$ tries on average.

The computation of the elliptic period $\alpha$ requires $O(\ell/r)$ scalar multiplications by an integer less than $\ell$, for a total cost of $O(\mathsf{M}(r)(\ell/r)\log(\ell))$.

Finally, testing that $\alpha$ generates $k$ is done by computing its minimal polynomial, at a cost of $O(r^{(\omega+1)/2})$ operations in $\mathbb{F}_q$ using [55].

The algorithms presented in the previous sections have very similar complexities, and none stands out as the absolute winner. The complexity of all algorithms, besides the naive one, depends in a non-trivial way on the parameters $q$ and $r$, and, for Rains’ algorithms, on the search for a parameter $\ell$ and an associated elliptic curve.

This section studies the complexity of the embedding description problem from a global perspective.
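The data of Example 22 give a small, concrete instance of the inputs required by Algorithm 6 and of the cardinality computation used in the proof of Proposition 24. The following Python sketch checks them by brute force. All function names are illustrative assumptions, not part of the authors' code, and exhaustive point counting stands in for the point-counting algorithms one would use for realistic sizes.

```python
# Sketch only: brute-force checks on the data of Example 22
# (q = 7, r = 3, ell = 31, E: y^2 = x^3 + 5x + 4). Helper names are hypothetical.

def count_points(a, b, p):
    """Number of points of y^2 = x^3 + a*x + b over F_p (p an odd prime)."""
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    affine = sum(square_roots.get((x**3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1  # add the point at infinity

def mult_order(x, ell):
    """Multiplicative order of x modulo the prime ell (x not divisible by ell)."""
    k, y = 1, x % ell
    while y != 1:
        y = y * x % ell
        k += 1
    return k

def frobenius_eigenvalues(t, q, ell):
    """Roots of X^2 - t*X + q modulo ell, found by exhaustive search."""
    return [x for x in range(ell) if (x * x - t * x + q) % ell == 0]

def trace_over_extension(t, q, r):
    """Trace of the q^r-Frobenius, via the recurrence t_n = t*t_{n-1} - q*t_{n-2}."""
    prev, cur = 2, t
    for _ in range(r - 1):
        prev, cur = cur, t * cur - q * prev
    return cur

q, r, ell = 7, 3, 31
t = q + 1 - count_points(5, 4, q)                # trace of E over F_7
eigs = frobenius_eigenvalues(t, q, ell)          # two distinct roots: ell is Elkies
lam = next(x for x in eigs if mult_order(x, ell) == r)
mu = q * pow(lam, -1, ell) % ell                 # other eigenvalue, mu = q / lam
cardinality = q**r + 1 - trace_over_extension(t, q, r)   # #E(F_{q^r})

# Preconditions of Algorithm 6: ord(lam) = r and ord(mu) does not divide r.
assert len(eigs) == 2 and r % mult_order(mu, ell) != 0
# ell divides #E(k) exactly once, so E(k)[ell] is cyclic of order ell.
assert cardinality % ell == 0 and cardinality % ell**2 != 0
print(t, eigs, lam, mu, cardinality)
```

The last assertion is a sufficient (not necessary) test that the $\ell$-torsion over $k$ has rank 1, which is the situation Algorithm 6 relies on when it multiplies a random point by $\#E(k)/\ell$.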
We give procedures to find parameters for Rains’ algorithms, criteria to choose the best among the embedding algorithms, and asymptotic bounds on the embedding description problem.

Given parameters $q = p^d$ and $r$, Rains’ cyclotomic algorithm asks for a small parameter $\ell$ such that:
1. $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle q\rangle \times S$ for some $S$,
2. $\#\langle q\rangle = rs$ for some integer $s$,
3. $\gcd(\varphi(\ell), d) = 1$ (see Note 18).

Since $r$ is a prime power, the second condition lets us take a prime power for $\ell$ too. Indeed if $\mathbb{Z}/\ell\mathbb{Z} \simeq \mathbb{Z}/\ell_1\mathbb{Z} \times \mathbb{Z}/\ell_2\mathbb{Z}$, then either $q \bmod \ell_1$ or $q \bmod \ell_2$ has order a multiple of $r$. Furthermore, if $\gcd(\ell, r) = 1$, then we can take $\ell$ prime, since higher powers would not help satisfy the conditions. On the other hand, if $\gcd(\ell, r) \neq 1$, then the algorithms of Section 3 have much better complexity. Hence we shall take $\ell$ prime.

Given the above constraints, we can rewrite the conditions as:
1. $\ell = rsv + 1$ for some $s, v$ such that $\gcd(rs, v) = 1$,
2. $\mathrm{ord}_\ell(q) = rs$,
3. $\gcd(rsv, d) = 1$.

Remark. Rains remarked that, when $q = 2$ and $r$ is a power of 2 greater than 4, no $\ell$ can satisfy these constraints because 2 is a quadratic residue modulo any prime of the form $8u + 1$. This case, however, is covered by the Artin–Schreier technique in Section 3.2, so we ignore it.

In the elliptic algorithm we look for an integer $\ell$ and a curve $E/\mathbb{F}_q$ that satisfy the preconditions of Algorithm 6, i.e., such that
1. the Frobenius endomorphism $\pi$ satisfies a characteristic equation $(\pi - \lambda)(\pi - \mu) = 0 \bmod \ell$,
2. $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle\lambda\rangle \times S$ for some $S$,
3. $\#\langle\lambda\rangle = r$, and
4. $\mu^r \neq 1 \bmod \ell$.

As before, we only need to look at prime $\ell$. Because $\mu = q/\lambda$, the last condition is equivalent to $q^r \neq 1 \bmod \ell$. Hence, we can restate the conditions on $\ell$ as
1. $\ell = ru + 1$ for some $u$ such that $\gcd(r, u) = 1$,
2. $q^r \neq 1 \bmod \ell$.

Once $\ell$ is found, we compile a list of all integers of order $r$ in $(\mathbb{Z}/\ell\mathbb{Z})^\times$, and look for a curve of trace $t = \lambda + q/\lambda \bmod \ell$ for any $\lambda$ in the list. Note, however, that for there to be such a curve, $t$ must have a representative in the interval $[-2\sqrt{q}, 2\sqrt{q}]$. In order to have a good chance of finding such curves, we set an even more stringent bound $\ell \in o(\sqrt{q})$.

We propose a procedure to simultaneously find parameters for the cyclotomic and the elliptic case in Algorithm 7. The procedure is given bounds on the size of the parameters sought, and outputs all suitable parameters within those bounds.

Algorithm 7
Parameter selection for Rains’ algorithms
Input: Integers $q = p^d$ and $r$, bounds $\bar{u}$, $\bar{s}$, and $\bar{e}$;
Output: $\mathcal{C}$ and $\mathcal{E}$, lists of parameters for the cyclotomic and elliptic algorithm resp.
 1: $\mathcal{C} \leftarrow \{\}$, $\mathcal{E} \leftarrow \{\}$;
 2: for $u = 1$ to $\bar{u}$ do
 3:     if $\ell = ur + 1$ is prime then
 4:         if $\mathrm{ord}_\ell(q) = rs$ with $s \le \bar{s}$, and $\gcd(rs, u/s) = 1$, and $\gcd(ur, d) = 1$ then
 5:             Add $\ell$ to $\mathcal{C}$.
 6:         end if
 7:         if $\ell \le \bar{e}$ and $\gcd(u, r) = 1$ and $q^r \neq 1 \bmod \ell$ then
 8:             Compute $T \leftarrow \{\lambda + q/\lambda \bmod \ell \mid \mathrm{ord}_\ell(\lambda) = r\}$;
 9:             repeat
10:                 Take random $E/\mathbb{F}_q$ and compute its trace $t$;
11:             until $(t \bmod \ell) \in T$
12:             Add $(\ell, E, t)$ to $\mathcal{E}$.
13:         end if
14:     end if
15: end for
16: return $\mathcal{C}$ and $\mathcal{E}$.

Proposition 25.
On input q , r , ¯ u and ¯ e , Algorithm 7 computes its output using ˜ O (cid:16) √ r ¯ u + ¯ u log ( q ) (cid:17) binary operations, assuming ¯ e ∈ o ( √ q ) .Proof. We only consider naive integer arithmetic, since it is unrealistic to apply embeddingalgorithms to very large sizes.In Step 3 we need to test for the primality of (cid:96) , while Steps 4 and 8 require the factorizationof (cid:96) −
1. Both operations can be performed in ˜ O ( √ r ¯ u ) operations using naive algorithms.In Step 10 we need to count the number of points of an elliptic curve over F q . This can bedone in O (cid:0) log ( q ) (cid:1) binary operations using the Schoof–Elkies–Atkin algorithm with naiveinteger arithmetic [54, 39].All other operations have negligible cost compared to these ones. We finally need toaccount for the two loops in the algorithm. The inner loop at Step 9 stops when a curvewith t mod (cid:96) in T is found. The set T has size ϕ ( r ), hence, assuming that traces are evenlydistributed modulo (cid:96) , we expect to find a suitable curve after O ( (cid:96)/r ) ⊂ O (¯ u ) tries. Althoughit is well known that traces are not evenly distributed modulo prime numbers [37], it is shownin [11, Th. 1] that the probability that a random curve has trace congruent to a fixed t mod (cid:96) approaches 1 /(cid:96) , as (cid:96) and q go to infinity, subject to (cid:96) ∈ o ( √ q ). Hence, we shall assume that¯ e ∈ o ( √ q ) for the complexity analysis to hold.The outer loop multiplies the whole complexity by ¯ u , we conclude that the overall com-plexity is in ˜ O (cid:16) √ r ¯ u + ¯ u log ( q ) (cid:17) . A natural question arises: what bounds ¯ u, ¯ s, ¯ e must be taken to ensure that the lists C , E inAlgorithm 7 are non-empty?It is not easy to give a precise answer: already the condition that (cid:96) = ur + 1, in Step 3,poses some difficulties. Heuristically, we expect that about ¯ u/ log(¯ ur ) of those numbers areprime. However the best lower bound on primes of the form (cid:96) = ur + 1, even under GRH,is (cid:96) ∈ O ( r . (cid:15) ) [29]. Empirical data show that the reality is much closer to the heuristic30 degree 3 polynomial regression − − − − − − − Figure 1: Prime powers r (abscissa) versus smallest integer u (ordinate) such that ur + 1is prime. 
Abscissa in logarithmic scale, density normalized by log( x ) /x and colored inlogarithmic scale.bound: in Figure 1 we plot for all prime powers r < the smallest u such that ur + 1 isprime. It appears that u is effectively bounded by O (log( r )) for any practical purpose.For the cyclotomic algorithm we also require that ord (cid:96) ( q ) is a multiple of r . Assumingthat q is uniformly distributed in ( Z /(cid:96) Z ) × , its order is exactly (cid:96) − (cid:96) − /(cid:96) ,hence we can assume that asymptotically ord (cid:96) ( q ) ∈ O ( (cid:96) ) = O ( r log( r )). Similar consider-ations can be made for the elliptic algorithm, assuming that (cid:96) ∈ o ( √ q ). Finally, we mustalso take into account the possibility that the elliptic algorithm fails. Under the heuristicsof Appendix C, this possibility only discards one in O ( q r ) curves, and is thus negligible.Summarizing, if we take ¯ u, ¯ s ∈ O (log( r )), we can expect Algorithm 7 to find suitableparameters for the cyclotomic algorithm, leading to expected parameters (cid:96) ∈ O ( r log( r )),and to an expected running time of ˜ O ( √ r ) binary operations and ˜ O ( r ( ω +1) / + M ( r ) log( q ))operations in F q . Similarly, if we also take ¯ e ∈ O ( r log( r )), assuming that r log( r ) ∈ o ( √ q ),we can expect Algorithm 7 to find suitable parameters for the elliptic algorithm, leading toan expected running time of ˜ O ( √ r + (log( r )) (log( q )) ) binary operations and ˜ O ( r log( q ))operations in F q .Although the complexity of the cyclotomic algorithm looks better, it must not be ne-glected that the ˜ O notation hides the cost of taking an auxiliary extension of degree O (log( r ));whereas the elliptic algorithm, when it applies, does not incur such overhead. The impactof the hidden terms in the complexity can be extremely important, as we will show in thenext section.The same considerations also apply when comparing Rains’ algorithms to Allombert’s. 
This assumption is obviously false for any fixed $q$, but it is a good enough approximation in practice. s of the auxiliary extension is small, but becomes slower as this degree increases.

In practice, it is hopeless to try to determine the appropriate bounds for each algorithm from a purely theoretical point of view. The best approach we can suggest is to determine parameters at runtime, and to set bounds and thresholds experimentally. To summarize, given parameters $q$ and $r$, we suggest the following approach:
1. If $\gcd(q, r) \neq 1$, run the Artin–Schreier algorithm of Section 3.2.
2. If $r$ is a power of a small prime $v$, run the algorithm of Section 3.3.
3. Determine the order $s$ of $q$ in $(\mathbb{Z}/r\mathbb{Z})^\times$. If it is small enough, run one of the variants of Allombert’s algorithm presented in Section 3.
4. Run Algorithm 7 with bounds $\bar{u}$, $\bar{s}$ and $\bar{e}$ determined according to $s$. Depending on the best parameters found by Algorithm 7, run the best option among Rains’ cyclotomic algorithm, Rains’ elliptic algorithm, and Allombert’s algorithm.

In the next section we shall focus on the last two steps, by comparing our implementations of the algorithms involved, thus giving an estimate of the various thresholds between them. However, we stress that these thresholds are bound to vary depending on the implementation and the target platform, so it is the implementer’s responsibility to determine them when configuring the system.

Remark.
Although our exposition focused on the case where $r$ is a prime power and $k \simeq K$, most of the algorithms presented here can be easily adapted to work more generally.

In particular, it is a non-negligible practical improvement to work with composite $r$ by “gluing” together many prime powers. For example, let $r$ and $r'$ be two prime powers, and let $\ell$ and $\ell'$ be two primes selected for use in Rains’ cyclotomic algorithm. If $\ell = \ell'$, then $\#\langle q\rangle = rr's$ for some $s$, and Rains’ algorithm can be run only once for both $r$ and $r'$ at the same time, with the added benefit of requiring a smaller auxiliary extension. Similarly, if $\ell \neq \ell'$, then we can run only once a straightforward generalization of Rains’ algorithm using $(\ell\ell')$-th roots of unity; this is especially advantageous when $\mathrm{ord}_\ell(q) = rs$ and $\mathrm{ord}_{\ell'}(q) = r's$, so that the degree of the auxiliary extension is unchanged.

In general, given the output of Algorithm 7 for many prime powers, combining the results to obtain the best “gluing” requires solving an integer linear program (ILP). Given the availability of very fast ILP solvers, the practical speed-up can be significant. Similar techniques also apply to Allombert’s algorithm. On the other hand, it seems much less likely that they can be applied to the elliptic Rains’ algorithm.

To validate our results, we implemented the algorithms described in the previous sections, and compared them to the implementation of Allombert’s algorithm available in PARI/GP [61], and to that of Rains’ algorithm available in Magma [4]. The variants of Allombert’s algorithm described in Section 3.1 were implemented in C on top of the Flint library [27].
Rains’ cyclotomic and elliptic algorithms were implemented in Sage [23] (which it-self uses PARI and Flint to implement finite fields), with critical code rewritten in C/Cython.Our code only handles q prime and m, n odd.We ran tests for a wide range of primes q between 3 and 2 + 253, and prime powers r between 3 and 2069. All tests were run on an Intel(R) Xeon(R) CPU E5-4650 v2 clockedat 2.40GHz. We report in Figure 2 statistics only on the runs for 100 < q < ; otherranges show very similar trends. The source code and the full datasets can be downloadedat https://github.com/defeo/ffisom .We start by comparing our implementation of the three variants of Allombert’s algorithmpresented in Section 3.1.3 with the original one in PARI. In Figure 2a we plot running timesagainst the extension degree r , only for cases where the auxiliary degree s = ord q ( r ) is atmost 10: dots represent individual runs, continuous lines represent degree 2 linear regressions.Analyzing the behavior for arbitrary auxiliary degree s is more challenging. Based on theobservation that all variants have essentially quadratic cost in r , in Figure 2b we take runningtimes, we scale them down by r , and we plot them against the auxiliary degree s .The first striking observation is the extremely poor performance of PARI, especially as s grows. To provide a fairer comparison, we re-implemented Allombert’s revised algorithm [3],as faithfully as possible, as described in Section 3.1.2; this is the curve labeled “Allombert(rev)” in the graphs. For completeness we also implemented the Paterson-Stockmeyer variantdescribed previously; we do not plot it here, because it overlaps almost perfectly with our “Divide & conquer” curve. 
Although our re-implementations are considerably faster than PARI, it is apparent that Allombert’s original algorithm does not behave as well as our new variants.

Focusing now on our three new variants presented in Section 3.1.3, one cannot fail to notice that the second one, named “Automorphism evaluation”, beats the other two by a large margin, both for small and large auxiliary degree. Although the “Multipoint evaluation” approach is expected to eventually beat the other variants as $s$ grows, the crossover point seems to be extremely far from the parameters we explored. However, we notice that the naive variant of “Multipoint evaluation” not using the iterated Frobenius technique (labeled “Multipoint evaluation (var)” in the graphs) starts poorly, then quickly catches up with “Automorphism evaluation” as $s$ grows.

Now we shift to Rains’ algorithm and its variants. In comparing our implementation with Magma’s, discarding outliers, we obtain a fairly consistent speed-up of about 30% (see Figure 3); hence we will compare these algorithms only based on our timings. In Figure 2c we group runs of the cyclotomic algorithm by the degree $s$ of the auxiliary extension, and we plot median times against the degree $r$; only the graphs for $s < 10$ are shown in the figure. We observe a very large gap between $s = 1$ and larger $s$ ($s = 2$ is 8 to 16 times slower). This is partly due to the fact that we use generic Python code to construct auxiliary extensions, rather than dedicated C; however, a large gap is unavoidable, due to the added cost of computing in extension fields. We also plot median times for the elliptic variant and for the conic variant (see Appendix A). It is apparent that the elliptic algorithm outperforms the cyclotomic one as soon as $s \ge 3$, and that the conic algorithm conveniently replaces the case $s = 2$. Thus, at least for the parameter ranges we have tested, the cyclotomic algorithm with auxiliary extensions seems of limited interest.
Figure 2: Benchmarks for Rains’ and Allombert’s algorithms. $q$ is a prime greater than 100, $r$ is an odd prime power varying between 3 and 2069. Plots (c) and (d) are in doubly logarithmic scale. Full dataset available at https://github.com/defeo/ffisom .
(a) Comparison of various implementations of Allombert’s algorithm (Divide & conquer, Automorphism eval., Multipoint eval., Multipoint eval. (var), PARI/GP, Allombert (rev)), in the case where the auxiliary degree $s = \mathrm{ord}_r(q) \le 10$. Running time in seconds against the degree $r$. Dots represent individual runs, lines represent degree 2 linear regressions.
(b) Comparison of various implementations of Allombert’s algorithm, as a function of the auxiliary degree $s = \mathrm{ord}_r(q)$. Individual running times are scaled down by $r^2$. Dots represent individual runs, lines represent degree 2 linear regressions.
(c) Cyclotomic, conic and elliptic variants of Rains’ algorithm. Running time in seconds against the degree $r$. Auxiliary extension degrees $s$ for cyclotomic Rains’ range between 1 and 9. Lines represent median times.
(d) Comparison of Allombert’s (Automorphism evaluation variant) and Rains’ algorithms at some fixed auxiliary extension degrees $s$. Running time in seconds against the degree $r$. Lines represent median times, shaded areas minimum and maximum times.

Figure 3: Comparison of our implementation of Rains’ algorithm and Magma’s. Running time of our implementation in seconds vs ratio of Magma running time over ours. Plot in doubly logarithmic scale.

Finally, in Figure 2d we compare Rains’ algorithms against Allombert’s. In light of the excellent performance of the “Automorphism evaluation” variant of Allombert’s algorithm, we only plot this variant. We plot against the degree $r$ runs of Allombert’s algorithm grouped by ranges of the auxiliary degree $\mathrm{ord}_r(q)$: we shade the area between minimum and maximum running times, and trace the median time.
We also take from Figure 2c the graphs for the cyclotomic (only $s = 1$), the conic and the elliptic variants of Rains’ algorithm. We notice that Allombert’s algorithm, even with relatively large auxiliary degrees, is extremely fast; the cyclotomic algorithm only beats it when $\mathrm{ord}_r(q)$ goes beyond 10 to 50, the conic algorithm only beats it for extremely large $\mathrm{ord}_r(q)$, and the elliptic algorithm is never better. We also observe that Allombert’s algorithm has a better asymptotic behavior as the degree $r$ grows.

In light of these comparisons, it seems that the absolute winner is our Automorphism evaluation variant of Allombert’s algorithm, with Rains’ cyclotomic algorithm being only occasionally more interesting. Obviously, the comparisons are only relevant to our own code and test conditions. Other implementations and benchmarks will likely find slightly different crossover points for the algorithms.
We end this work with a review of the known methods for embedding evaluation. We recall the problem statement: given two finite fields $k = \mathbb{F}_q[X]/f(X)$ and $K = \mathbb{F}_q[Y]/g(Y)$, and given an embedding $\phi : k \hookrightarrow K$ represented by two elements $\alpha \in k$ and $\beta \in K$ such that $k = \mathbb{F}_q(\alpha)$ and $\phi(\alpha) = \beta$, answer the following questions:
• Given $\gamma \in k$, compute $\phi(\gamma)$;
• Given $\delta \in K$, determine whether $\delta \in \phi(k)$;
• Given $\delta \in \phi(k)$, compute $\phi^{-1}(\delta)$.

An additional natural question would be the following: given $\delta \in K$, compute an expression $\delta = \sum_{i=0}^{n/m-1} \phi(\gamma_i) Y^i$ with all $\gamma_i \in k$. The techniques presented here also apply to this more general problem, however we will skip them for conciseness.

We are going to assume that elements of $k$ are represented on the monomial basis $(1, X, \dots, X^{m-1})$, and elements of $K$ on the monomial basis $(1, Y, \dots, Y^{n-1})$.

We review three solutions for this problem, each built on top of the previous one. The first one uses basic linear algebra; it is simple and effective, but has large space and time complexities. The second one improves the complexity by avoiding matrix inversion; however it is still based on linear algebra, thus it has the same storage requirements as the previous method. The last one replaces linear algebra with modular composition, thus providing the best space and time complexities overall. Unlike the algorithms of the previous sections, all algorithms presented in this section are deterministic, unless a randomized algorithm is used for modular composition.

All the methods presented in this section are classic and well understood, thus we will keep the presentation short, and will not discuss implementation details.

Since the map $\phi$ is $\mathbb{F}_q$-linear, one obvious solution is to explicitly write its matrix on the monomial bases of $k$ and $K$. This is the solution employed by both the PARI/GP [61] and the Magma [4, 5] computer algebra systems. To stress the fact that $k$ is seen as a vector space with a fixed basis $(1, X, \dots, X^{m-1})$, we will write $V_X$ for it, and we will similarly write $V_Y$ for $K$ with its monomial basis.

The element $\alpha$ defines another basis $(1, \alpha, \dots, \alpha^{m-1})$ of $k$, and we denote by $V_\alpha$ this vector space isomorphic to $V_X$. Similarly, $\beta$ defines a basis of a subspace $V_\beta \subset V_Y$, also isomorphic to $V_\alpha$. Hence, we decompose the map $\phi : V_X \to V_Y$ as a composition of three maps:
$$V_X \xrightarrow{\sim} V_\alpha \xrightarrow{\sim} V_\beta \hookrightarrow V_Y.$$
The middle map from $V_\alpha$ to $V_\beta$ is trivially represented by an identity matrix; the only maps that require actual computation are the other two.

For example, the map $V_\beta \hookrightarrow V_Y$ is represented by an $n \times m$ matrix whose columns are the coefficients of the elements $1, \beta, \dots, \beta^{m-1}$ written on the monomial basis of $V_Y$. The inverse map is then computed by solving a linear system; this operation can be sped up by precomputing, e.g., an LU decomposition for the matrix. The computation of the map $V_X \xrightarrow{\sim} V_\alpha$ is done analogously.

Summarizing, both the full map $\phi : V_X \to V_Y$ and its inverse can be computed by performing one matrix-vector product and solving one linear system. The complexity is dominated by the cost of solving the linear system, that is $O(m^{\omega-1} n)$ operations over $\mathbb{F}_q$. However this cost can be counted as a precomputation if we perform an LU decomposition, and so can the computation of the powers $\alpha^i$ and $\beta^i$; in this case, the cost for evaluating $\phi$ or its inverse drops down to $O(mn)$ operations. At any rate, the biggest drawback of this approach is the large memory complexity: indeed storing the precomputed matrices requires $O(mn)$ elements of $\mathbb{F}_q$.

Inverse maps and duality

The first improvement to the linear algebraic method consists in replacing the linear system solving with a much simpler matrix-vector product, combined with an efficient change of basis.
This technique is not new [56, 57, 58, 7], however it is seldom found in the literature. We briefly recall it, following the presentation of [19].

Let $k = \mathbb{F}_q[X]/f(X)$ be a finite field with monomial basis $(1, X, \dots, X^{m-1})$. The trace $\mathrm{Tr}$ from $k$ to $\mathbb{F}_q$ defines a non-degenerate bilinear form by $\langle\gamma, \delta\rangle_k \equiv \mathrm{Tr}(\gamma\delta)$, which itself determines the dual basis $(X_0^*, X_1^*, \dots, X_{m-1}^*)$ to $(1, X, \dots, X^{m-1})$, characterized by
$$\langle X^j, X_i^*\rangle_k = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$
Given the polynomial $f$, conversions between the monomial and the dual basis can be performed very efficiently at a cost of $O(\mathsf{M}(m)\log(m))$ operations in $\mathbb{F}_q$ (see [19]).

Let $K = \mathbb{F}_q[Y]/g(Y)$ be another finite field, let $\langle\,,\rangle_K$ be the bilinear form defined by its trace to $\mathbb{F}_q$, and let $\phi : k \hookrightarrow K$ be a field embedding. There exists a unique linear map $\phi^t : K \to k$, called the dual map of $\phi$, such that
$$\langle\phi(\gamma), \delta\rangle_K = \langle\gamma, \phi^t(\delta)\rangle_k$$
for any $\gamma \in k$ and $\delta \in K$.

If $M = (m_{i,j})$ is the matrix of $\phi$ in the monomial bases of $k$ and $K$, then its transpose $M^t$ is the matrix of $\phi^t$ in their dual bases. Indeed,
$$m_{i,j} = \langle\phi(X^j), Y_i^*\rangle = \langle X^j, \phi^t(Y_i^*)\rangle. \tag{16}$$

We are now going to show that $\phi^t$ is closely related to the inverse map of $\phi$. If $\phi$ is an isomorphism of fields, then we immediately have $\phi^t = \phi^{-1}$. Indeed, in this case $\phi$ preserves the bilinear forms:
$$\langle\gamma, \delta\rangle_k = \langle\phi(\gamma), \phi(\delta)\rangle_K \quad \text{for any } \gamma, \delta \in k.$$
Hence, duality implies that
$$\langle\gamma, \delta\rangle_k = \langle\gamma, \phi^t \circ \phi(\delta)\rangle_k,$$
but then the non-degeneracy of the trace implies that $\phi^t \circ \phi$ is the identity map.

The case where $\phi$ is a proper embedding calls for more careful handling. In this case, $\phi$ does not preserve traces; indeed, if $n = [K : \mathbb{F}_q]$,
$$\frac{n}{m}\langle\gamma, \delta\rangle_k = \langle\phi(\gamma), \phi(\delta)\rangle_K.$$
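The scaling factor comes from the transitivity of the trace. A short derivation, written under the assumptions that $m = [k : \mathbb{F}_q]$ and that $k$ is identified with its image $\phi(k) \subset K$ (so that $n/m = [K : k]$, read modulo the characteristic), could go as follows:

```latex
\begin{align*}
\langle \phi(\gamma), \phi(\delta) \rangle_K
  &= \mathrm{Tr}_{K/\mathbb{F}_q}\bigl(\phi(\gamma\delta)\bigr) \\
  &= \mathrm{Tr}_{k/\mathbb{F}_q}\Bigl(\mathrm{Tr}_{K/k}(\gamma\delta)\Bigr)
     && \text{(transitivity of the trace)} \\
  &= \mathrm{Tr}_{k/\mathbb{F}_q}\Bigl(\tfrac{n}{m}\,\gamma\delta\Bigr)
     && \text{(since } \mathrm{Tr}_{K/k}(c) = [K:k]\,c \text{ for } c \in k\text{)} \\
  &= \tfrac{n}{m}\,\langle \gamma, \delta \rangle_k .
\end{align*}
```

In particular the right-hand side vanishes identically when the characteristic divides $n/m$, which is why that case needs a separate treatment.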
If n/m is not a multiple of the characteristic, proceeding like before we can show that( m/n ) φ t is the inverse of φ . To handle the general case, take an element η ∈ K such thatTr K/k ( η ) = 1, and let H be the map defined by γ (cid:55)→ ηγ . Then, by composition of traces, weprove that (cid:104) γ, δ (cid:105) k = Tr k/ F q ( γδ ) = Tr K/ F q ( γηδ ) = (cid:104) φ ( γ ) , H ◦ φ ( δ ) (cid:105) K = (cid:104) H ◦ φ ( γ ) , φ ( δ ) (cid:105) K More generally, in the category of finite-dimensional vector spaces with nondegenerate bilinear formsand morphisms that preserve bilinear forms, we have a natural isomorphism between the identity functorand the dual functor.
We have thus shown that both $\phi^t \circ H \circ \phi$ and $\phi^t \circ H^t \circ \phi$ are the identity map.

Let us apply these findings to the linear algebraic approach of the previous subsection. We had an embedding of fields, decomposed as three maps
\[ V_X \xrightarrow{\sim} V_\alpha \xrightarrow{\sim} V_\beta \hookrightarrow V_Y. \]
The maps $V_\alpha \xrightarrow{\sim} V_X$ and $V_\beta \hookrightarrow V_Y$ are both field embeddings; the first one is represented by its matrix on the monomial bases generated by $\alpha$ and $X$; the second one on those generated by $\beta$ and $Y$. For both maps, we are interested in computing their inverse; instead of solving a linear system, we switch to dual bases and apply the discussion above. Thus, the inverse of $V_\beta \hookrightarrow V_Y$ is evaluated as a multiplication by a fixed element $\eta$, followed by a conversion to the dual basis of $V_Y$, then a matrix-vector product with the transposed matrix, and finally a conversion back to the monomial basis of $V_\beta$. The inverse of $V_\alpha \xrightarrow{\sim} V_X$ is computed similarly.

Note that the inverse map to $V_\beta \hookrightarrow V_Y$ is not everywhere defined. Interestingly, while linear system solving could immediately recognize elements of $V_Y$ that are not in the image of $V_\beta$, the new solution will just project them onto an arbitrary element of $V_\beta$. Indeed, any $\delta \in V_Y$ can be rewritten as
\[ \delta = \bigl(\delta - \mathrm{Tr}_{K/k}(\eta\delta)\bigr) + \mathrm{Tr}_{K/k}(\eta\delta) = \delta' + \mathrm{Tr}_{K/k}(\eta\delta), \]
for the same $\eta$ chosen above. Direct calculation shows that
\[ \langle \phi(\gamma), \eta\delta \rangle_K = \langle \phi(\gamma), \eta\delta' \rangle_K + \langle \phi(\gamma), \eta\,\mathrm{Tr}_{K/k}(\eta\delta) \rangle_K = \langle \gamma, \mathrm{Tr}_{K/k}(\eta\delta) \rangle_k, \tag{17} \]
for any $\gamma \in k$. Hence, applying the above algorithm to an arbitrary element $\delta \in V_Y$ yields the element $\mathrm{Tr}_{K/k}(\eta\delta)$ of $V_\beta$, which coincides with $\delta$ whenever $\delta \in \phi(k)$.
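To make the projection $\delta \mapsto \mathrm{Tr}_{K/k}(\eta\delta)$ concrete, here is a small worked example. The fields are chosen for illustration only and do not come from the text: take $k = \mathbb{F}_2$ and $K = \mathbb{F}_4 = \mathbb{F}_2[Y]/(Y^2 + Y + 1)$, so that $n/m = 2$ is divisible by the characteristic and a nontrivial $\eta$ is needed. Writing $\delta = a + bY$ with $a, b \in \mathbb{F}_2$:

```latex
\begin{align*}
\mathrm{Tr}_{K/k}(Y) &= Y + Y^2 = Y + (Y + 1) = 1,
  \qquad\text{so we may take } \eta = Y,\\
\mathrm{Tr}_{K/k}(1) &= 1 + 1 = 0,\\
\mathrm{Tr}_{K/k}(\eta\delta)
  &= a\,\mathrm{Tr}_{K/k}(Y) + b\,\mathrm{Tr}_{K/k}(Y^2)
   = a + b\,\mathrm{Tr}_{K/k}(Y + 1)
   = a + b .
\end{align*}
```

For $\delta \in k$ (that is, $b = 0$) the projection returns $\delta = a$, as expected. For $\delta = Y \notin k$ it returns $1$, an element of $k$ unrelated to $\delta$, so membership in the image is not detected by the projection alone.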
Hence, the best way to test that an element $\delta \in K$ is in the image of $\phi$ would be to project it to $\gamma = \mathrm{Tr}_{K/k}(\eta\delta)$ using this algorithm, and then test that $\phi(\gamma) = \delta$.

It is easily seen that the complexity is dominated by the transposed matrix-vector product, which costs $O(mn)$ operations (plus a cost of $O(m\,M(n))$ operations for precomputing the matrices). Hence, by paying a little overhead in the changes of basis, we have completely removed the cost of solving a linear system. We have not yet reduced the large storage cost, however.

Our final improvement consists in replacing the matrix computations with modular composition. The technique originates in Shoup's work [56, 57, 58].

Considering again the map $V_\beta \hookrightarrow V_Y$, we observe that its evaluation is precisely a modular composition problem: given polynomials $\gamma = \sum \gamma_i \beta^i$ and $\beta = \sum \beta_i Y^i$ with coefficients in $\mathbb{F}_q$, of degree bounded by $n$, compute $\gamma(\beta) \bmod g$. As seen previously, this computation can be done more efficiently by a dedicated algorithm than by a naive Horner rule.

However, we also need to compute the inverse map to $V_\beta \hookrightarrow V_Y$, and this problem is clearly not a modular composition one. In the previous subsection we have reduced the computation of this inverse map to a change of bases, combined with a transposed matrix-vector product. A very powerful generalization of Eq. (16), called the transposition principle, allows us to transpose any modular composition algorithm, much like one would transpose a matrix. This technique was also introduced by Shoup, and then refined by many authors [6, 20, 17]. The dual problem to modular composition was named power projection by Shoup; its inputs are the polynomials $\beta, g$, and an element $\gamma^*$ in the dual space of $\mathbb{F}_q[X]$ (i.e., the linear forms on $\mathbb{F}_q[X]$); its output is the list of elements $\gamma^*(\beta^0), \dots, \gamma^*(\beta^{n-1})$.
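The duality between modular composition and power projection can be checked numerically. The Python sketch below uses naive quadratic-time routines, not the fast algorithms of [57, 33], and its parameters ($p = 7$, $g = Y^3 + Y + 1$) and function names are assumptions chosen for illustration. It verifies the identity $\gamma^*(\gamma(\beta)) = \sum_i \gamma_i\, \gamma^*(\beta^i)$, which says exactly that power projection is the transpose of the linear map $\gamma \mapsto \gamma(\beta) \bmod g$.

```python
# Modular composition and its transpose, power projection, over F_p[Y]/g(Y).
# Hypothetical toy parameters: p = 7, g = Y^3 + Y + 1 (irreducible over F_7).
p = 7
g = [1, 1, 0, 1]          # coefficients of g, lowest degree first, monic
n = len(g) - 1

def mulmod(a, b):
    """Product of two length-n coefficient lists modulo g and p."""
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for d in range(2 * n - 2, n - 1, -1):      # reduce modulo the monic g
        c = prod[d]
        if c:
            for t in range(n + 1):
                prod[d - n + t] = (prod[d - n + t] - c * g[t]) % p
    return prod[:n]

def compose(gamma, beta):
    """Modular composition gamma(beta) mod g, by a naive Horner rule."""
    acc = [0] * n
    for c in reversed(gamma):
        acc = mulmod(acc, beta)
        acc[0] = (acc[0] + c) % p
    return acc

def apply_form(form, a):
    """Evaluate the linear form with coefficient vector `form` at a."""
    return sum(c * v for c, v in zip(form, a)) % p

def power_projection(form, beta, count):
    """Return form(beta^0), ..., form(beta^(count-1))."""
    out, power = [], [1] + [0] * (n - 1)
    for _ in range(count):
        out.append(apply_form(form, power))
        power = mulmod(power, beta)
    return out

# Arbitrary sample data: an element beta, a polynomial gamma, a linear form.
beta, gamma, form = [4, 1, 6], [3, 0, 5], [2, 5, 3]
lhs = apply_form(form, compose(gamma, beta))
rhs = apply_form(gamma, power_projection(form, beta, len(gamma)))
print(lhs == rhs)
```

Because the identity is linear in both `gamma` and `form`, it holds for any sample data, which is what allows a fast composition algorithm to be turned into a fast power projection algorithm.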
Thanks to the transposition principle, the power projection problem can be solved within the same complexity bound as modular composition [57, 33].

Summarizing, the inverse map to $V_\beta \hookrightarrow V_Y$ is computed by Algorithm 8.

Algorithm 8 Inverse embedding
Input: An element $\delta \in K$, and precomputed values:
  • $\beta \in K$ generating a subfield isomorphic to $k$,
  • $\eta \in K$ such that $\mathrm{Tr}_{K/k}\,\eta = 1$.
Output: $\mathrm{Tr}_{K/k}(\eta\delta)$ written in the basis $(1, \beta, \dots, \beta^{m-1})$.
1: Compute the minimal polynomial of $\beta$ over $\mathbb{F}_q$;
2: Compute $\delta' = \eta\delta$;
3: Convert $\delta'$ to the dual basis $(Y_0^*, \dots, Y_{n-1}^*)$;
4: Compute $\gamma = \mathrm{Tr}_{K/k}\,\delta'$ using power projection;
5: Convert $\gamma$ to the monomial basis $(1, \beta, \dots, \beta^{m-1})$;
6: return $\gamma$.

Theorem 26.
Algorithm 8 is correct. When the input $\delta$ is in the image of $k$, it returns $\delta$ itself written on the basis $(1, \beta, \dots, \beta^{m-1})$. It computes its output using $O(n^{(\omega+1)/2})$ operations in $\mathbb{F}_q$ in the worst case.

Proof. Correctness follows from the discussion above, and the obvious fact that $\delta = \mathrm{Tr}_{K/k}(\eta\delta)$ whenever $\delta \in k$.

The minimal polynomial of $\beta$, required to compute conversions between the monomial and the dual bases generated by $\beta$, can be computed in $O(M(n)\log(n))$ operations using the Berlekamp–Massey algorithm. Conversions between monomial and dual bases are also done in $O(M(n)\log(n))$ as shown in [19]. Power projection takes $O(n^{(\omega+1)/2})$ operations, using any of the algorithms in [57, 33].

We have not specified how the element $\eta$ is computed. If $n/m$ is not divisible by the characteristic, then one can simply take $\eta = m/n$. In the general case, it suffices to know an element whose trace $\mathrm{Tr}_{K/k}$ is nonzero, and to divide it by its trace. If the coefficient of degree $n-1$ of $g$ is not 0, then $Y$ is one such element; otherwise we take elements at random until a suitable one is found: only $O(1)$ trials are expected on average. In any case, computing one trace can be done using $O(n^{(\omega+1)/2}\log(n) + M(n)\log(q))$ operations, thanks to Section 2.1.

Corollary 27.
After a precomputation costing $O(n^{(\omega+1)/2}\log(n) + M(n)\log(q))$ operations in $\mathbb{F}_q$ on average, all sub-questions of the embedding evaluation problem can be answered using $O(n^{(\omega+1)/2})$ operations in $\mathbb{F}_q$.

Footnote: The minimal polynomial could be precomputed along with $\beta$; however, we include this computation in the algorithm, as it does not change the total complexity.

Appendices

A Rains' conic algorithm
We have seen that Rains' cyclotomic algorithm suffers in practice from the need to build a field extension $k'$ of $k$. The conic variant we are going to present reduces the degree of the field extension from $s = [k' : k]$ to $s/2$ when $s$ is even. This is especially useful when $s = 2$, as highlighted in Section 7. The algorithm is similar in spirit to Williams' $p + 1$ factoring method [66], where the arithmetic of the norm 1 subgroup of $k'^*$ is performed using Lucas sequences on a subfield of index 2 of $k'$.

Let $F$ be a finite field of odd characteristic, let $\Delta \in F$ be a quadratic non-residue, let $\delta$ be an element of the algebraic closure of $F$ such that $\delta^2 = \Delta$, and define the norm 1 subgroup of $F[\delta]^*$ as
\[ T(F) = \{ (x + \delta y)/2 \mid x, y \in F \text{ and } x^2 - \Delta y^2 = 4 \}; \]
it is easy to verify that $T(F)$ forms a group under multiplication. If we see the elements $(x + \delta y)/2$ as points $(x, y)$ on the conic $x^2 - \Delta y^2 = 4$, the group law of $T(F)$ induces a group law on the conic. By projecting onto the $x$-coordinate, a straightforward calculation shows that, for any point $(\theta, *)$ on the conic, its $n$-th power has coordinates $(\theta_n, *)$, where $\theta_n$ is defined by the Lucas sequence
\[ \theta_0 = 2, \quad \theta_1 = \theta, \quad \theta_{i+1} = \theta\theta_i - \theta_{i-1}. \]
We shall denote by $[n]$ the map $\theta \mapsto \theta_n$; notice how it does not depend on the choice of $\Delta$.

The generalization of Rains' algorithm is now obvious: by projecting on the $x$-coordinate, we work in a field extension twice as small compared to the original algorithm. This is summarized in Algorithm 9.

Algorithm 9
Rains' conic algorithm
Input: A field extension $k/\mathbb{F}_q$ of degree $r$; a prime $\ell$ such that
  • $(\mathbb{Z}/\ell\mathbb{Z})^\times = \langle q \rangle \times S$ for some $S$,
  • $\#\langle q \rangle = 2rs$ for some integer $s$;
a polynomial $h$ of degree $s$ irreducible over $k$.
Output: A normal generator of $k$ over $\mathbb{F}_q$, with a uniquely defined Galois orbit.
1: Construct the field extension $k' = k[Z]/h(Z)$;
2: repeat
3:   repeat
4:     Take a random element $\theta \in k'$,
5:   until $\theta^2 - 4$ is a quadratic non-residue in $k'$;
6:   Compute $\zeta = [(\#k' + 1)/\ell]\theta$,
7: until $\zeta \neq 2$;
8: Compute $\eta(\zeta) \leftarrow \sum_{\sigma \in S} [\sigma]\zeta$;
9: return $\alpha \leftarrow \mathrm{Tr}_{k'/k}\,\eta(\zeta) = \sum_{i=0}^{s-1} [q^{ri}]\eta(\zeta)$.

Proposition 28. Algorithm 9 is correct: on input $q, r, \ell, s$ it returns an element in the same Galois orbit as Algorithm 5 on input $q, r, \ell, s$. It computes its output using $O(M(sr)(sr\log(q) + (\ell/r)\log(\ell)))$ operations in $\mathbb{F}_q$ on average, or $\tilde{O}((sr)^2\log(q))$ assuming $\ell \in o(sr)$.

Proof. By construction, all the $\ell$-th roots of unity are in $T(k')$. Observe that if $(x + \delta y)/2 \in T(k')$, then its trace over $k'$ is equal to $x$. Hence, the value $\zeta$ computed in Step 6 is the trace over $k'$ of a primitive $\ell$-th root of unity. We conclude by comparing this algorithm with Algorithm 5.

The non-residuosity test in Step 5 is done by verifying that the $(\#k' - 1)/2$-th power of $\theta^2 - 4$ is equal to $-1$. We do this in $O(sr\log(q))$ operations in $k'$, or $O(sr\,M(sr)\log(q))$ operations in $\mathbb{F}_q$.

To implement the other steps, we need to evaluate the map $[n]$ efficiently. We have the following classical relationships for the Lucas sequence of $\theta$:
\[ \theta_{2i} = \theta_i^2 - 2, \quad \theta_{2i+1} = \theta_i\theta_{i+1} - \theta, \quad \theta_{2i+2} = \theta_{i+1}^2 - 2. \]
Starting with $\theta_0 = 2$ and $\theta_1 = \theta$, we use a binary scheme to deduce $\theta_i, \theta_{i+1}$ from $\theta_{\lfloor i/2 \rfloor}, \theta_{\lfloor i/2 \rfloor + 1}$. We reach $\theta_n$ after $O(\log(n))$ steps, each requiring a constant number of operations in $k'$. Hence, Step 6 costs $O(sr\,M(sr)\log(q))$ operations in $\mathbb{F}_q$, while Steps 8 and 9 together cost $O(M(sr)(\ell/r)\log(\ell))$.

Although this variant does not exploit the asymptotic improvement offered by Proposition 2, the fact that its auxiliary degree $s$ is half that of the original algorithm usually gives an interesting practical improvement. Step 6 can be modified so as to avoid the premature projection on the $x$-axis, so that the algorithms of Proposition 2 apply. We leave the details of this variant to the reader.

B Using j = 0, 1728 in the elliptic Rains' algorithm

When we defined elliptic periods in Section 5.1, we explicitly ruled out the case where the elliptic curve has $j$-invariant 0 or 1728. Indeed, the definition of elliptic periods for these curves is complicated by the fact that they have additional automorphisms: had we applied Definition 20 to them, we would have obtained periods that are always equal to 0.

In this section we sketch how to extend the elliptic variant of Rains' algorithm to these curves.
Although this does not change the overall complexity of Rains' algorithm, it makes for a small practical improvement, and a nice recreational mathematics read.

Recall (see [60, III.10]) that the order of the automorphism group $\mathrm{Aut}(E)$ of a curve $E$ defined over a field $\mathbb{F}_q$ of characteristic $p$ is one of the following:
• 2 if $j(E) \neq 0, 1728$,
• 4 if $j(E) = 1728$ and $p \neq 2, 3$,
• 6 if $j(E) = 0$ and $p \neq 2, 3$,
• 12 if $j(E) = 0 = 1728$ and $p = 3$,
• 24 if $j(E) = 0 = 1728$ and $p = 2$.

We can now give a meaningful definition of elliptic periods that covers all elliptic curves.

Definition 29.
Let E/ F q be an elliptic curve with automorphism group Aut( E ) of order2 n , and Frobenius endomorphism π . Let (cid:96) > E , λ an eigenvalue of π , and P a point of order (cid:96) in the eigenspace corresponding to λ (i.e., such that π ( P ) = λP ).Suppose that there is a subgroup S of ( Z /(cid:96) Z ) × containing Aut( E ), and such that( Z /(cid:96) Z ) × = (cid:104) λ (cid:105) × S. Then we define an elliptic period as η λ,S ( P ) = (cid:88) σ ∈ S/ Aut( E ) x ([ σ ] P ) n where x ( P ) denotes the abscissa of P .This definition is equivalent to Definition 20 when Aut( E ) = {± } . When the automor-phism group is larger, it avoids unnecessary cancellations by quotienting out S by Aut( E ),and at the same time it ensures unicity thanks the the n -th power in the sum. We leaveas an exercise the proof of the statement analogous to Lemma 21. Note that it would bepossible to generalize this definition to the case where Aut( E ) is (partially) contained in (cid:104) λ (cid:105) ,however we leave out this detail, as it is not necessary for Rains’ algorithm. B.1 The ordinary case
The curves of j -invariant 0 and 1728 are ordinary if and only if all automorphisms are definedover F p . For j = 0, this is equivalent to p ≡ j = 1728, this is equivalent to p ≡ E is well known in this case, with earlyresults dating back to Gauss. The two statements below follow easily from [59, Th. 2.5,2.6]. Proposition 30.
Let p be a prime congruent to modulo , and let q = p d . Let p = π ¯ π bethe unique decomposition of p in the Eisenstein integers Z [ ω ] with π ≡ ¯ π ≡ . Let E be the curve defined by y = x + b , and let (cid:0) bπ (cid:1) be the unique sixth root of unity of Z [ ω ] congruent to (4 b ) ( q − / mod π . Then the minimal polynomial of the Frobenius endomorphismof E splits in Z [ ω ] as X − tX + q = (cid:16) X − (cid:0) bπ (cid:1) − π d (cid:17)(cid:16) X − (cid:0) bπ (cid:1) ¯ π d (cid:17) . Proposition 31.
Let p be a prime congruent to modulo , and let q = p d . Let p = π ¯ π bethe unique decomposition of p in the Gaussian integers Z [ i ] with π ≡ ¯ π ≡ i ) . Let E be the curve defined by y = x − ax , and let (cid:0) aπ (cid:1) be the unique fourth root of unity of Z [ i ] congruent to a ( q − / mod π . Then the minimal polynomial of the Frobenius endomorphismof E splits in Z [ i ] as X − tX + q = (cid:16) X − (cid:0) aπ (cid:1) − π d (cid:17)(cid:16) X − (cid:0) aπ (cid:1) ¯ π d (cid:17) . (cid:96) is an Elkies prime for y = x + b if and only if (cid:96) ≡ y = x − ax if and only if (cid:96) ≡ p in Z [ ω ] or Z [ i ] using Cornacchia’salgorithm [12]; then, we deduce the image of the eigenvalues in ( Z /(cid:96) Z ) × directly from thesplitting. We give below more detailed step-by-step instructions for the case j = 1728; thecase j = 0 is analogous and can be treated in a similar way.Our goal is to find a uniquely defined generator for F q r , with q = p d , given a prime (cid:96) suchthat r | ( (cid:96) − q r (cid:54) = 1 mod (cid:96) and that p ≡ (cid:96) ≡ p = x + 4 y ;2. Choose signs so that p = π ¯ π = ( x − iy )( x + 2 iy ) and x − y ≡ ı ∈ F p such that x = 2ˆ ıy ;4. Find a primitive fourth root of unity I ∈ Z /(cid:96) Z ;5. Look for an element of ( Z /(cid:96) Z ) × of order r among the I ± c ( x ± Iy ) d for 0 ≤ c ≤ y = x − ax , where a ∈ F q is such that a ( q − / = ˆ ı c .This procedure can be plugged inside Algorithm 7 to help find candidates for Rains’algorithm. The most computationally intensive step is Cornacchia’s algorithm, which is neg-ligible if compared to the point counting operations needed for general elliptic curves. If anelliptic curve is found by this procedure, then it can be readily used inside the elliptic variantof Rains’ algorithm, by replacing the earlier definition of elliptic period with Definition 29above. B.2 The supersingular case
The supersingular case is of much lesser interest, however we treat it briefly for completeness.To keep things simple, we will not consider the cases p = 2 ,
3. The possibilities for the trace of E are much more constrained in this case. Indeed, the only possibilities are t = 0 , ±√ q, ± √ q (see [65]). Of these, only the case ±√ q is interesting in the context of Rains’ algorithm;indeed when t = ± √ q the two eigenvalues are equal, and Rains’ algorithm does not apply;when t = 0 the two eigenvalues are opposite, thus if one has odd order r modulo (cid:96) , q musthave order 2 r , but in this case the conic variant presented in Appendix A is simpler and moreefficient. The case t = ±√ q is never realized by j = 1728, however for j = 0 it constitutesan interesting optimization whenever q is a square and ord (cid:96) ( q ) = 3 r .Assume that q is an even power of p , and p ≡ j ∈ F q be a primitive cuberoot of unity, and let E be the curve y = x + j . Then E has trace √ q , and the minimalpolynomial of the Frobenius endomorphism splits in Z [ ω ] as X − X √ q + q = ( X + ω √ q )( X + ω √ q ) . r be a prime power not divisible by 2 or 3, and let (cid:96) be a prime such that (cid:96) = 1 mod 3and ord (cid:96) ( q ) = 3 r . Let ˆ ω ∈ Z /(cid:96) Z such that q r = ˆ ω mod (cid:96) , then by direct calculation we seethat ord (cid:96) ( − ˆ ω r √ q ) = r and ord (cid:96) ( − ˆ ω − r √ q ) = 3 r . Hence, there is only one cyclic rationalsubgroup of (cid:96) -torsion in F q r , and we are in the conditions to apply Rains’ algorithm. Thiscomputation also shows that the case t = −√ q is not useful in Rains’ algorithm. Example 32.
We illustrate the algorithm sketched above with a numerical example. Let p = 5, q = 25, r = 5 and (cid:96) = 61. In what follows, elements of F q are represented aspolynomials in j in the quotient ring F [ j ] / ( j + j + 1).The elliptic curve y = x + j has trace 5, and its two Frobenius eigenvalues are − ωp and − ω p . We set ˆ ω = q r = 13 mod 61, and we verify that ord( − ˆ ω r p ) = ord( − ·
5) = r ,whereas ord( − ˆ ω − r p ) = ord( − ·
5) = 3 r . Indeed E ( F q r ) = 3 · · · P ∈ E [61] defined over F q r . To applyDefinition 29, we need a generator of the order 12 subgroup of Z / Z , e.g., 21. Quotientingout the order 6 subgroup, we obtain the period η = x ( P ) + x ([21] P ) , and we verify that its minimal polynomial is x + (3 j − x + ( j + 3) x − jx + ( j + 1) x + (2 j − P . C Elliptic periods
In this section, we cover additional topics related to the correctness of the elliptic variant of Rains' algorithm. Specifically, we first prove a sufficient condition for an elliptic period to be a normal element. This, together with heuristic assumptions, gives a lower bound for the success probability of Algorithm 6.

Secondly, we review our attempts at finding a counterexample to Conjecture 23, i.e. an elliptic point whose period is defined over a smaller field than its abscissa. Such examples are not expected to be common, but our extensive searches have produced absolutely none, giving a small argument in favor of the conjecture.
C.1 Normality of elliptic periods
We prove here an analogue of Lemma 15 for elliptic periods. All well-known proofs of that lemma follow a common scheme: for a squarefree $\ell$ and $q$ prime,
1. the $\ell$-th cyclotomic polynomial is irreducible over $\mathbb{Q}$, and its roots form a normal basis of the splitting field;
2. Gaussian periods, seen as field traces of roots of unity, then yield normal bases of subextensions;
3. for any inert prime $q \neq \ell$, integrality properties guarantee that the basis stays normal after reduction modulo $q$.

In the elliptic setting, nothing guarantees that torsion subgroups yield normal bases, thus the first step above crucially fails; and indeed we have seen examples where elliptic periods are not normal. However, at least when the torsion subgroup satisfies some normality conditions, we may hope to carry the rest of the proof through. The framework to express the appropriate normality conditions is given to us by the polynomially cyclic algebras of [43].

Definition 33.
Let E/ F q be an elliptic curve of j -invariant not 0 or 1728. Let (cid:96) > E , let λ be an eigenvalue of π , and let (cid:104) P (cid:105) be the corresponding eigenspace.We define the kernel polynomial of λ as f λ ( X ) = (cid:89) i ∈ ( Z /(cid:96) Z ) ∗ / {± } ( X − x ([ i ] P )) , and the algebra associated to λ as A λ = F q [ X ] /f λ ( X ). Proposition 34.
The algebra $A_\lambda$ is polynomially cyclic in the sense of [43]: the automorphism group of $A_\lambda$ over $\mathbb{F}_q$ is cyclic; if $\nu$ is a generator of the automorphism group, there exists a polynomial $C \in \mathbb{F}_q[X]$ such that for any $a(X) \in A_\lambda$
\[ \nu(a) = a(C(X)) \bmod f_\lambda(X). \]

Proof. Let $c$ be a generator of $(\mathbb{Z}/\ell\mathbb{Z})^*/\{\pm 1\}$. The multiplication-by-$c$ map on $E$ cyclically permutes the generators of $\langle P \rangle/\{\pm 1\}$, thus it induces a cyclic automorphism $\nu$ of $A_\lambda$. Multiplication by $c$ is defined by a rational function with no poles in $\langle P \rangle$, hence its action on $\langle P \rangle/\{\pm 1\}$ can be expressed by a polynomial $C$ modulo $f_\lambda$.

We are now ready to state our main result on the normality of elliptic periods.

Lemma 35.
Let E , (cid:96) , λ , f λ and A λ be as in Definition 33, and let ν be a generator ofthe automorphism group of A λ . We say that an element a ∈ A λ is normal if the ν i ( a ) for ≤ i < ( (cid:96) − / are linearly independent. We say that f λ is normal if the image of X in A λ is normal.Suppose that ( Z /(cid:96) Z ) ∗ = (cid:104) λ (cid:105) × S . If f λ is normal, then the elliptic period η λ,S ( P ) is anormal element of F q ( x ( P )) , for any point P in the eigenspace of λ .Proof. If λ is of order r modulo q , then f λ splits as f λ = h · · · h d where the h i are pairwise coprime irreducible polynomials of degree r , and dr = ( (cid:96) − / A λ splits as d different copies of F q r : A λ (cid:39) F q [ X ] /h ( X ) × · · · × F q [ X ] /h d ( X ) . The h i correspond to the different Galois orbits of (cid:104) P (cid:105) / {± } . Up to permutation of indices,we can suppose that ν sends roots of h i onto roots of h i +1 . We denote by σ the automorphismof algebras ν r and by S the corresponding polynomial (there should be no confusion withthe subgroup S ). Note that ν d is the Frobenius automorphism of A λ .Gathering the above remarks we obtain the following diagram:45 λ = F q [ X ] / ( f λ ( X )) x F q [ X ] / ( h ( X )) × F q [ X ] / ( h ( X )) × · · · × F q [ X ] / ( h d ( X ))( x, x, . . . , x ) A Sλ (cid:39) F q r Tr( x ) = (cid:80) d − i =0 S ( i ) ( x ) F q [ X ] / ( h ( X )) × F q [ X ] / ( h ( X )) × · · · × F q [ X ] / ( h ( X ))( x, S ( x ) , . . . , S ( d − ( x ))Let x be the image of X in A λ . Suppose that x is normal in A λ , then so is its trace in A Sλ . But taking the trace of x in A λ and going through the two isomorphisms of the diagramis the same thing as computing the period in the d copies of F q r : x (cid:55)→ d − (cid:88) i =0 S ( i ) ( x ) (cid:55)→ (cid:32) d − (cid:88) i =0 S ( i ) ( x ) (mod h ( x )) , . . . , d − (cid:88) i =0 S ( i ) ( x ) (mod h d ( x )) (cid:33) . 
If a linear relation existed between η λ,S ( P ) ∈ F q ( x ( P )) and its Galois conjugates for aneigenpoint P of λ , then it would be verified in the d copies of F q r and it would lift up to A λ ,thus forcing the same linear relation on Tr( x ) and its Galois conjugates. Therefore, if x isnormal in A λ , then η λ,S ( P ) is normal in F q ( x ( P )).The converse is easily shown not to be true by experimentation. There is also no rela-tionship between the normality of f λ and that of x ( P ) ∈ F q ( x ( P )) for any eigenpoint P of λ , nor between the normality of x ( P ) and that of η λ,S ( P ).We conclude this section with a proof that normal elements are abundant in the algebra A λ . This can be used as a further heuristic argument that Algorithm 6 fails with very lowprobability. Proposition 36.
Let A be a polynomially cyclic algebra of degree dr over the finite field F q .The number of normal elements in A is the same as the number of normal elements in F q dr : Φ q ( x dr − ∈ O ( q dr ) where Φ q is the Euler function for polynomials [40, Lemma 3.69].Proof. There is at least one normal element in A [43, Theorem 4] (and it can be constructedfrom one in F q ). One such element α can be used to count the number of normal elementsas in the finite field case [40, Theorem 3.73]:1. every element a can be written using the normal basis associated to α : a = (cid:80) dr − i =0 a i ν ( i ) ( α );2. as the Galois group of A is cyclic, the decomposition of the Galois conjugates of a isobtained by shifting the coefficients a i ;3. the circulant matrix M a associated to the a i ’s is invertible if and only if a is a normalelement;4. the polynomial (cid:80) dr − i =0 a i X i in F q [ X ] / ( X dr −
1) is invertible if and only if M a is.Therefore normal elements in A are in one-to-one correspondence with units of F q [ X ] / ( X dr − .2 Experimental evidence for the conjecture We now present experimental evidence supporting the validity of Conjecture 23. Assumingelliptic periods behave as random elements, the chance that one of them does not generate F q r decreases as q − r asymptotically. Thus we may focus on looking for counterexamples withsmall values of q and r , as that increases the probability of finding one. We now present thevarious strategies by which we tried to obtain counterexamples; surprisingly, none of themproduced any. Search strategy 1
For small values of $q$, one can generate random elliptic curves over $\mathbb{F}_q$, look for small values of $\ell$ which are Elkies primes, compute a point of $\ell$-torsion defined over $\mathbb{F}_{q^r}$, the associated elliptic period and its minimal polynomial over $\mathbb{F}_q$.

If one restricts to prime values of $r$ and supposes elements of $\mathbb{F}_{q^r}$ are represented as polynomials with coefficients in $\mathbb{F}_q$, the computation of the minimal polynomial can be avoided: one simply checks that the elliptic period is not in $\mathbb{F}_q$, i.e. a constant polynomial.

Search strategy 2
With further restrictions on r and (cid:96) it is possible to avoid the compu-tation of an actual (cid:96) -torsion point as well. For example, let r be a prime such that (cid:96) = 2 dr +1with d = 2 is prime and suppose that (cid:96) is an Elkies prime for E : the (cid:96) -division polynomial f (cid:96) of E has two factors h and h of degree r over F q (whose product is f λ the kernel polyno-mial of an (cid:96) -isogeny). The elliptic period is defined as x ( P ) + x ([ i ] P ) for some i ∈ ( Z/(cid:96) Z ) ∗ such that i ≡ − (cid:96) ) and any P ∈ E ( F q r )[ (cid:96) ]. Let m i be multiplication by i map onthe curve. Then the mapping x + m i ( x ) sends the roots of h and the roots of h ontothe elliptic periods. If we suppose that the period is not generating F q , which implies thatRes x ( h ( x ) , y − x − m i ( x )) = Res x ( h , y − x − m i ( x )) splits over F q , then by Galois invariance x + m i ( x ) ≡ t (mod h ) for a constant t ∈ F q , and since the periods do not depend on theinitial choice of h or h , we conclude that m i ( x ) ≡ ( t − x ) (mod f λ ). Hence, we can simplycompute x + m i ( x ) (mod f λ ) (or (mod h ) or (mod h )) and see if it is a constant. Thedominating step in this approach is the factorization of f (cid:96) . Such an approach can easily begeneralized to larger values of d . Search strategy 3
Finally, one can apply the above approach in a global way: pick arational elliptic curve with a rational (cid:96) -isogeny to ensure that (cid:96) is an Elkies prime over finitefields, and such that the (cid:96) -torsion (or its abscissas) is defined over an extension of degree r (or 2 r ) with (cid:96) = 2 dr + 1, then compute the kernel polynomial of the isogeny f λ = h . . . h d with h , . . . , h d of degree r and check that there is no prime, outside those with discriminant D = 0 , ,
4, dividing all the coefficients, except possibly the constant one, of x + m i ( x ) + · · · + ( m i ◦ · · · ◦ m i (cid:124) (cid:123)(cid:122) (cid:125) d − )( x ) (mod f λ )where i a d -th root of − Z/(cid:96) Z ) ∗ . If such a prime q is found, the reduction modulo q is acandidate for a counterexample. It should still be checked that over F q the kernel polynomialdoes not split into more factors than over Q , and the corresponding counterexample shouldbe explicitly computed. 47or r = 3, d = 2 and (cid:96) = 2 dr + 1 = 13, there exists an infinite family of elliptic curvessuitable for this approach. Indeed, Daniels et al. [16] showed that all rational elliptic curveswith 13-torsion defined over a cubic extension of Q belong to the parametrized family with j -invariant: j ( t ) = ( t − t + 5 t + t + 1) ( t − t + 7 t − t + 5 t + 7 t + 5 t + 1) t ( t − t − . In practice, the limiting step when using this family is computing the j -invariant of thecurves because its height quickly grows.The next candidate for prime r and (cid:96) , and d = 2, can not be used. Indeed, for r = 7 and (cid:96) = 29, there is no rational elliptic curve with a rational 29-isogeny [42] (or elliptic curvesdefined over a septic number field with 29-torsion [21]). More generally, rational ellipticcurves with rational (cid:96) -isogenies exist only for a finite set of values of (cid:96) corresponding tonon-cuspidal rational points on X ( (cid:96) ), leaving few chances to find other interesting curves.Among the suitable prime values for (cid:96) ≥
13 [42], only (cid:96) = 13 provides an infinite fam-ily [41] parametrized by j ( t ) = ( t + 5 t + 13) ( t + 7 t + 20 t + 19 t + 1) t . Within this family are the aforementioned curves with 13-torsion defined over a cubic exten-sion of Q . The other curves might only have the abscissas of the 13-torsion points definedover a cubic extension of Q , which is completely equivalent for this approach, or over a sexticextension of Q , in which case the kernel polynomials are irreducible of degree 6. These curvescan still be used with r = 3 and d = 2, but if a candidate counterexample is found, it shouldbe checked that the kernel polynomial splits as two cubic factors over F q . Indeed, ellipticperiods are not normal, and the trace trick used in the cyclotomic Rains algorithm with anauxiliary extension degree s = 2 does not apply. Prime values of (cid:96)
We now treat the other prime values for (cid:96) ≥
13 for which onlya finite number of Q -isomorphism classes of rational elliptic curves come with a rational (cid:96) -isogeny. Curves labels come from Cremona’s tables [15].For (cid:96) = 17, with ( (cid:96) − / , there are two rational elliptic curves with a rational17-isogeny. The factorization of ( (cid:96) − / r and d exceptin trivial cases where r = 2 and d = 1. Therefore, none of these curves can be used.For (cid:96) = 19, with ( (cid:96) − / , there are two rational elliptic curves with a rational19-isogeny. The factorization of ( (cid:96) − / r and d exceptin trivial cases where r = 3 and d = 1. Therefore, none of these curves can be used.For (cid:96) = 37, with ( (cid:96) − / · , there are two rational elliptic curves with a rational37-isogeny. The first curve is curve and its kernel polynomial splits into three sexticfactors. It implies that 3 must divide d and can not divide r . Therefore, the curve can onlybe used with r = 2 and d = 3 . The second curve is curve and its kernel polynomialis irreducible. Therefore, it can be used with r = 2 and d = 3 , and r = 3 and d = 2.For (cid:96) = 43, with ( (cid:96) − / · r = 3 and d = 7, and r = 7 and d = 3. 48or (cid:96) = 67, with ( (cid:96) − / ·
11 there is one rational elliptic curve with a rational67-isogeny and its kernel polynomial is irreducible. Therefore, it can be used with r = 3 and d = 11, and r = 11 and d = 3.For (cid:96) = 163, with ( (cid:96) − / , there is one rational elliptic curve with a rational163-isogeny. The factorization of ( (cid:96) − / r and d exceptin trivial cases where r = 3 and d = 1. Therefore, this curve can not be used. Composite values of (cid:96)
There are also a few odd composite values $\ell \geq 13$ for which rational elliptic curves with a rational $\ell$-isogeny exist. In this case, only points of exact order $\ell$ should be used. If moreover $\ell$ is not squarefree, periods should be computed using the more general formula from [25].

For $\ell = 15$, with $\varphi(15)/2 = 2^2$, there are four rational elliptic curves with a rational 15-isogeny. The factorization of $\varphi(\ell)/2$ does not allow coprime $r$ and $d$ except in the trivial case $r = 2^2$ and $d = 1$. Therefore, none of these curves can be used.

For $\ell = 21$, with $\varphi(21)/2 = 2 \cdot 3$, there are four rational elliptic curves with a rational 21-isogeny. All of them have a rational point of 3-torsion, and among them, one curve is the unique rational elliptic curve with 21-torsion over a cubic extension of $\mathbb{Q}$ [47]. For that curve, the part of the kernel polynomial corresponding to primitive 21-torsion splits into two cubic factors, whereas for the three other curves this part is an irreducible sextic. Therefore, the former curve can only be used with $r = 3$ and $d = 2$, whereas the latter ones can be used both with $r = 2$ and $d = 3$, and with $r = 3$ and $d = 2$.

For $\ell = 25$, with $\varphi(25)/2 = 2 \cdot 5$, there are infinitely many rational elliptic curves with a rational 25-isogeny, parametrized [41] by

$$j(t) = \frac{(t^{10} + 10t^8 + 35t^6 - 12t^5 + 50t^4 - 60t^3 + 25t^2 - 60t + 16)^3}{t^5 + 5t^3 + 5t - 11}.$$

There are also infinitely many elliptic curves defined over a quintic extension $K$ of $\mathbb{Q}$ with 25-torsion defined over $K$ [21, 22]. If one of these curves is actually defined over $\mathbb{Q}$, it has rational 5-torsion [26]. Proving that there are infinitely many such curves, or parametrizing them, is outside the scope of this work. One can still sample curves within the larger family, compute the kernel polynomial over the rationals, and compute periods for $r = 2$ and $d = 5$, or $r = 5$ and $d = 2$, if it is irreducible, and for $r = 5$ and $d = 2$ if it splits into two quintic factors. Finally, one checks that the kernel polynomials of the candidate counterexamples factor as expected over $\mathbb{F}_q$.

For $\ell = 27$, with $\varphi(27)/2 = 3^2$, there is one rational elliptic curve with a rational 27-isogeny. The factorization of $\varphi(\ell)/2$ does not allow coprime $r$ and $d$ except in the trivial case $r = 3^2$ and $d = 1$. Therefore, this curve cannot be used.

Experimental results
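The case analysis above repeatedly asks whether $\varphi(\ell)/2$ splits as a product $r \cdot d$ with $r$ and $d$ coprime and both larger than 1. The following Python sketch restates that check. It is only an illustration and not the authors' code: the function names are hypothetical, and it assumes that the only constraints are $r \cdot d = \varphi(\ell)/2$ and $\gcd(r, d) = 1$. The further restrictions coming from how each kernel polynomial factors are not modelled.

```python
from math import gcd


def euler_phi(n):
    """Euler's totient function, computed by trial division."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def coprime_splittings(ell):
    """List pairs (r, d) with r * d = phi(ell) / 2, gcd(r, d) = 1, r > 1, d > 1.

    Assumption: these are exactly the non-trivial parameter choices
    discussed in the text; d = 1 and r = 1 are excluded as trivial.
    """
    half = euler_phi(ell) // 2
    return [
        (r, half // r)
        for r in range(2, half)
        if half % r == 0 and gcd(r, half // r) == 1
    ]


if __name__ == "__main__":
    for ell in (17, 19, 37, 43, 67, 163, 15, 21, 25, 27):
        print(ell, coprime_splittings(ell))
```

Under these assumptions, the values of $\ell$ for which the list is empty are exactly those for which the text concludes that no curve can be used.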
All the above strategies were implemented and run on different ranges of parameters.

For strategy 1, for prime values of $\ell$ in different ranges, we tested all elliptic curves defined over $\mathbb{F}_q$ with $q$ prime up to some bound and such that $\ell$ is an Elkies prime with an eigenvalue of odd prime order $r$ (without any bound on $r$). The greatest values of $q$ and the total number of curves tested are given in Table 1. Across all ranges, we tested more than 43 million curves.

Table 1: Largest value of $q$ tested for a given range of $\ell$ with strategy 1.

For strategy 2, we picked small values of $r$ for which $\ell = 2 \cdot d \cdot r + 1$ with $d = 2$ is prime, and checked all elliptic curves defined over $\mathbb{F}_q$ with $q$ prime up to some bound. The greatest values of $q$ tested are given in Table 2.

r        67    73    79    97    127
q_max    691   743   577   419   271

Table 2: Largest value of $q$ tested for a given $r$ with strategy 2.

For strategy 3, we tested all rational elliptic curves with a rational 13-isogeny and 13-torsion defined over a cubic extension in the infinite family parametrized by

$$j(t) = \frac{(t^2 + 5t + 13)(t^4 + 7t^3 + 20t^2 + 19t + 1)^3}{t}$$

for rational values of $t$ with numerator and denominator of absolute value less than 1538. We also tested the sporadic curves for the prime values $\ell = 37, 43, 67$.

Final remark
As to the significance of our experiments, it is worth noting that a natural generalization of periods consists in replacing the sum by a product (or, indeed, any other symmetric function). Globally, this corresponds to replacing a trace by a norm in some number field, or polynomially cyclic algebra. This idea is bound to fail for Gaussian periods, because roots of unity are conjugate to their inverses; however, it looks in principle as good as our definition in the elliptic case.

Looking more closely, however, we notice that using traces to define periods grants them some very special properties. For one, there is no analogue of Proposition 35 for symmetric functions other than the trace. Most strikingly, using only small-scale experiments, we were able to find examples of generalized periods not generating their field of definition for most elementary symmetric functions, except for the trace.

In conclusion, even though we have little theoretical evidence to support Conjecture 23, disproving it may turn out to be a very challenging task.

References

[1] Leonard M. Adleman and Hendrik W. Lenstra. Finding irreducible polynomials over finite fields. In
Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing, STOC '86, pages 350–355, New York, NY, USA, 1986. ACM.
[2] Bill Allombert. Explicit computation of isomorphisms between finite fields. Finite Fields Appl., 8(3):332–342, 2002.
[3] Bill Allombert. Explicit computation of isomorphisms between finite fields. Revised version, 2002.
[4] Wieb Bosma, John Cannon, and Catherine Playoust. The MAGMA algebra system I: the user language. J. Symbolic Comput., 24(3-4):235–265, 1997.
[5] Wieb Bosma, John Cannon, and Allan Steel. Lattices of compatibly embedded finite fields. Journal of Symbolic Computation, 24(3-4):351–369, 1997.
[6] Alin Bostan, Grégoire Lecerf, and Éric Schost. Tellegen's principle into practice. In ISSAC'03, pages 37–44. ACM, 2003.
[7] Alin Bostan, Bruno Salvy, and Éric Schost. Fast algorithms for zero-dimensional polynomial systems using duality. Applicable Algebra in Engineering, Communication and Computing, 14(4):239–272, November 2003.
[8] Richard P. Brent and H.-T. Kung. Fast algorithms for manipulating formal power series. Journal of the ACM, 25(4):581–595, 1978.
[9] D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28(7):693–701, July 1991.
[10] David G. Cantor and Hans Zassenhaus. A new algorithm for factoring polynomials over finite fields. Mathematics of Computation, 36:587–592, 1981.
[11] Wouter Castryck and Hendrik Hubrechts. The distribution of the number of points modulo an integer on elliptic curves over finite fields. The Ramanujan Journal, 30(2):223–242, 2013.
[12] Giuseppe Cornacchia. Su di un metodo per la risoluzione in numeri interi dell'equazione $\sum_{h=0}^{n} c_h x^{n-h} y^h = p$. Giornale di Matematiche di Battaglini, 46:33–90, 1908.
[13] Jean-Marc Couveignes and Reynald Lercier. Galois invariant smoothness basis. Series on Number Theory and Its Applications, 5:142–167, May 2008. World Scientific.
[14] Jean-Marc Couveignes and Reynald Lercier. Fast construction of irreducible polynomials over finite fields. To appear in the Israel Journal of Mathematics, July 2011.
[15] John Cremona. Cremona tables, October 2016. http://johncremona.github.io/ecdata/.
[16] Harris B. Daniels, Álvaro Lozano-Robledo, Filip Najman, and Andrew V. Sutherland. Torsion subgroups of rational elliptic curves over the compositum of all cubic fields. arXiv:1509.00528 [math], September 2015.
[17] Luca De Feo. Algorithmes Rapides pour les Tours de Corps Finis et les Isogénies. PhD thesis, Ecole Polytechnique X, December 2010.
[18] Luca De Feo, Javad Doliskani, and Éric Schost. Fast algorithms for $\ell$-adic towers over finite fields. In ISSAC'13, pages 165–172. ACM, 2013.
[19] Luca De Feo, Javad Doliskani, and Éric Schost. Fast arithmetic for the algebraic closure of finite fields. In Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, ISSAC '14, pages 122–129, New York, NY, USA, 2014. ACM.
[20] Luca De Feo and Éric Schost. transalpyne: a language for automatic transposition. SIGSAM Bulletin, 44(1/2):59–71, 2010.
[21] Maarten Derickx and Mark van Hoeij. Gonality of the modular curve $X_1(N)$. Journal of Algebra, 417:52–71, 2014.
[22] Maarten Derickx and Andrew V. Sutherland. Torsion subgroups of elliptic curves over quintic and sextic number fields. ArXiv e-prints, August 2016.
[23] The Sage Developers. SageMath, the Sage Mathematics Software System (Version 7.2.0), 2016.
[24] Javad Doliskani and Éric Schost. Taking roots over high extensions of finite fields. Mathematics of Computation, 83(285):435–446, 2014.
[25] Sandra Feisel, Joachim von zur Gathen, and M. Amin Shokrollahi. Normal bases via general Gauss periods. Mathematics of Computation, 68(225):271–290, 1999.
[26] Enrique González-Jiménez. Complete classification of the torsion structures of rational elliptic curves over quintic number fields. arXiv:1607.01920 [math], July 2016.
[27] William Hart. Fast library for number theory: an introduction. Mathematical Software - ICMS 2010, pages 88–91, 2010.
[28] David Harvey. Faster polynomial multiplication via multipoint Kronecker substitution. J. Symbolic Comput., 44(10):1502–1510, October 2009.
[29] D. Roger Heath-Brown. Zero-free regions for Dirichlet L-functions, and the least prime in an arithmetic progression. In Proceedings of the London Mathematical Society (3), volume 64, pages 265–338, 1992.
[30] Erich Kaltofen. Computer algebra algorithms. Annual Review in Computer Science, 2:91–118, 1987.
[31] Erich Kaltofen and Victor Shoup. Fast polynomial factorization over high algebraic extensions of finite fields. In ISSAC '97: Proceedings of the 1997 international symposium on Symbolic and algebraic computation, pages 184–188, New York, NY, USA, 1997. ACM.
[32] Erich Kaltofen and Victor Shoup. Subquadratic-time factoring of polynomials over finite fields. Math. Comp., 67(223):1179–1197, 1998.
[33] Kiran S. Kedlaya and Christopher Umans. Fast modular composition in any characteristic. In FOCS '08: Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science, pages 146–155, Washington, DC, USA, 2008. IEEE Computer Society.
[34] Kiran S. Kedlaya and Christopher Umans. Fast polynomial factorization and modular composition. SICOMP, 40(6):1767–1802, 2011.
[35] Serge Lang. Algebra. Springer, 3rd edition, January 2002.
[36] François Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC'14, pages 296–303. ACM, 2014.
[37] Hendrik W. Lenstra. Factoring integers with elliptic curves. Annals of Mathematics, 126:649–673, 1987.
[38] Hendrik W. Lenstra. Finding isomorphisms between finite fields. Mathematics of Computation, 56(193):329–347, 1991.
[39] Reynald Lercier and Thomas Sirvent. On Elkies subgroups of $\ell$-torsion points in elliptic curves defined over a finite field. Journal de théorie des nombres de Bordeaux, 20(3):783–797, 2008.
[40] Rudolf Lidl and Harald Niederreiter. Finite Fields (Encyclopedia of Mathematics and its Applications). Cambridge University Press, October 1996.
[41] Álvaro Lozano-Robledo. On the field of definition of $p$-torsion points on elliptic curves over the rationals. Mathematische Annalen, 357(1):279–305, 2013.
[42] Barry Mazur and Peter Swinnerton-Dyer. Arithmetic of Weil curves. Inventiones mathematicae, 25:1–62, 1974.
[43] Preda Mihailescu and Victor Vuletescu. Elliptic Gauss sums and applications to point counting. Journal of Symbolic Computation, 45(8):825–836, 2010.
[44] Robert T. Moenck. Another polynomial homomorphism. Acta Informatica, 6(2):153–169, June 1976.
[45] Peter L. Montgomery. Speeding the Pollard and elliptic curve methods of factorization. Math. Comp., 48(177), 1987.
[46] Gary L. Mullen and Daniel Panario. Handbook of finite fields. CRC Press, 2013.
[47] Filip Najman. Torsion of rational elliptic curves over cubic fields and sporadic points on $X_1(n)$. Mathematical Research Letters, 23(1), 2016.
[48] Anand Kumar Narayanan. Fast computation of isomorphisms between finite fields using elliptic curves. arXiv preprint arXiv:1604.03072, 2016.
[49] Cyril Pascal and Éric Schost. Change of order for bivariate triangular sets. In ISSAC '06: Proceedings of the 2006 international symposium on Symbolic and algebraic computation, pages 277–284, New York, NY, USA, 2006. ACM.
[50] Michael S. Paterson and Larry J. Stockmeyer. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM Journal on Computing, 2(1):60–66, 1973.
[51] Richard G. E. Pinch. Recognising elements of finite fields. In Cryptography and Coding II, pages 193–197. Oxford University Press, 1992.
[52] Adrien Poteaux and Éric Schost. Modular composition modulo triangular sets and applications. Computational Complexity, 22(3):463–516, 2013.
[53] Eric M. Rains. Efficient computation of isomorphisms between finite fields. Personal communication, 1996.
[54] René Schoof. Counting points on elliptic curves over finite fields. Journal de Théorie des Nombres de Bordeaux, 7(1):219–254, 1995.
[55] Victor Shoup. Fast construction of irreducible polynomials over finite fields. In SODA '93: Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms, pages 484–492, Philadelphia, PA, USA, 1993. Society for Industrial and Applied Mathematics.
[56] Victor Shoup. Fast construction of irreducible polynomials over finite fields. Journal of Symbolic Computation, 17(5):371–391, 1994.
[57] Victor Shoup. A new polynomial factorization algorithm and its implementation. J. Symbolic Comput., 20(4):363–397, 1995.
[58] Victor Shoup. Efficient computation of minimal polynomials in algebraic extensions of finite fields. In ISSAC'99, pages 53–58. ACM, 1999.
[59] Alice Silverberg. Group order formulas for reductions of CM elliptic curves. In Proceedings of the Conference on Arithmetic, Geometry, Cryptography and Coding Theory, Contemporary Mathematics, volume 521, pages 107–120, 2010.
[60] Joseph H. Silverman. The arithmetic of elliptic curves, volume 106 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1992.
[61] The PARI Group, Bordeaux. PARI/GP, version , 2014.
[62] Joachim von zur Gathen and Jürgen Gerhard. Modern Computer Algebra. Cambridge University Press, New York, NY, USA, 1999.
[63] Joachim von zur Gathen and Victor Shoup. Computing Frobenius maps and factoring polynomials. In STOC '92: Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, pages 97–105, New York, NY, USA, 1992. ACM.
[64] Joachim von zur Gathen and Victor Shoup. Computing Frobenius maps and factoring polynomials. Computational Complexity, 2(3):187–224, 1992.
[65] William C. Waterhouse. Abelian varieties over finite fields.