Number Field Sieve with Provable Complexity

Abstract

In this thesis we give an in-depth introduction to the General Number Field Sieve, as it was used by Buhler, Lenstra, and Pomerance, before looking at one of the modern developments of this algorithm: A randomized version with provable complexity. This version was posited in 2017 by Lee and Venkatesan and will be preceded by ample material from both algebraic and analytic number theory, Galois theory, and probability theory.


Number Field Sieve with provable complexity

Barry van Leeuwen

Supervisor: Dr. A.R. Booker
Chair: Dr. T. Dokchitser (University of Bristol)
Examiners: Dr. J. Bober (University of Bristol), Dr. S. Siksek (University of Warwick)

A dissertation submitted to the

University of Bristol in accordance with the requirements for award of the degree of

Master of Science by Research in Mathematics at the

Faculty of Science

School of Mathematics

July 14, 2020

Word count: 28557

Dedication and Acknowledgements

I want to thank Dr. Andrew Booker, who as my supervisor managed to find what I needed even though it may not have been what I wanted. I also want to thank Dr. James Milne, Dr. Florian Bouyer, and Dr. Lynne Walling for providing some of the material used. I also want to thank Dr. Dan Fretwell, who helped me find my footing when I just started (which feels very long ago) and my mother, Cokky van Leeuwen, who with her sorcery managed to find typos that I could not.

To my wife, Sarah van Leeuwen, who with her continued support and motivation made possible what I thought impossible.

Declaration

I declare that the work in this dissertation was carried out in accordance with the requirements of the University’s Regulations and Code of Practice for Research Degree Programmes and that it has not been submitted for any other academic award. Except where indicated by specific reference in the text, the work is the candidate’s own work. Work done in collaboration with, or with the assistance of, others, is indicated as such. Any views expressed in the dissertation are those of the author.

Signed: Barry van Leeuwen
Date: July 14, 2020

Contents

Abstract
Contents
Notation
1 Introduction
2 Algebraic fundamentals
[...] Z
3.2.3 Sieving over Z[α]
3.3 Dealing with obstructions
3.4 Computational efficiency
[...] L-functions
4.2.3 Dedekind zeta functions and Siegel zeroes
4.2.4 Siegel zeroes
4.3 Arithmetic progressions
4.3.1 Extensions and Refinements
4.3.2 Smooth numbers on Arithmetic Progressions
4.4 Probability measures and moments
[...]
References

Notation

A brief introduction to some of the notation used. Most of the notation listed will be formally introduced, but this can be used as a reference guide.

$R[\alpha]$ - A ring extended by an element $\alpha$
$F(\alpha)$ - A field extended by an element $\alpha$
$\mathcal{O}_K$ - The ring of algebraic integers in a field extension $K$
$\mathrm{f.f.}(R)$ - The field of fractions generated by a ring $R$
$\mathrm{irr}_F(a)$ - The minimal polynomial of $a$ over $F$
$[a] \in B$ - The coset of $a$ in structure $B$
$\langle a \rangle \subset B$ - The ideal generated by $a$ in structure $B$
$L(s, \chi)$ - Dirichlet L-function with character $\chi$ and exponent $s$
$a \sim B$ - Draw $a$ according to distribution $B$
$\pi(X)$ - The number of primes below $X$

In this paper we use Bachmann–Landau Big-O notation extended by Knuth Big-Ω notation, such that for functions $f(x)$ and $g(x)$ and $N \in \mathbb{N}$:

$f(x) = o(g(x))$ - $\forall k \in \mathbb{R}_{>0}\ \exists N$ such that $\forall x \geq N : |f(x)| \leq k \cdot g(x)$
$f(x) = O(g(x))$ - $\exists k \in \mathbb{R}_{>0}\ \exists N$ such that $\forall x \geq N : |f(x)| \leq k \cdot g(x)$
$f(x) = \omega(g(x))$ - $\forall k \in \mathbb{R}_{>0}\ \exists N$ such that $\forall x \geq N : |f(x)| \geq k \cdot g(x)$
$f(x) = \Omega(g(x))$ - $\exists k \in \mathbb{R}_{>0}\ \exists N$ such that $\forall x \geq N : |f(x)| \geq k \cdot g(x)$

1 Introduction

In 1988 Pollard introduced a brand new factorization algorithm: the Number Field Sieve (NFS). This saw a first implementation in 1994 by Lenstra et al. and was used to factor the ninth Fermat number, $F_9$. The grandeur of this algorithm was in the conjectured complexity of

$$L_n\left(\tfrac{1}{3}, \sqrt[3]{\tfrac{64}{9}} + o(1)\right),$$

where $n \in \mathbb{N}$ and for $a, b, x \in \mathbb{R}$:

$$L_x(a, b) = \exp\left(b\,(\log x)^{a} (\log\log x)^{1-a}\right).$$

This was a major improvement on the best algorithms at the time, such as the quadratic sieve, which were of complexity $L_n(\tfrac{1}{2}, b)$ for some constant $b$.

Despite abundant heuristics indicating the bound and copious research poured into a proof, it was very difficult to show the complexity of the two major parts of the sieve:
1. Using the concept of smoothness to find a factor base.
2. Reducing this factor base to a factorization of a given $n \in \mathbb{N}$.

In chapter 3 we will be looking at the General Number Field Sieve (GNFS) as it was explicitly described in [17], and at how the sieve is structured.

It wasn't until 2017 that Lee and Venkatesan, [14], used a probabilistic and combinatorial approach to alter the algorithm in such a way that they managed to prove the following ([14], theorem 2.1 & 2.3):

Theorem 1.1. There is a randomised variant of the Number Field Sieve which for each $n$ finds congruences of squares $x^2 \equiv y^2 \bmod n$ in expected time:

$$L_n\left(\tfrac{1}{3}, \sqrt[3]{\tfrac{64}{9}} + o(1)\right).$$

Remark.
Note that no claims are made regarding the factorization of $n$. This theorem only proves that there exists an algorithm which completes the sieving process up until finding a congruence of squares, i.e. a pair $(x, y)$ for which $x^2 \equiv y^2 \bmod n$, with non-zero probability within the complexity bound given in the theorem above.

This version of the algorithm will be extensively studied in chapter 5, which finishes by proving the theorem above. To do this we need a strong basis in analytic number theory, which we will introduce in chapter 4 along with a review of some probability theory. In chapter 2 we will recall the necessary concepts from algebraic number theory and Galois theory, but for those familiar with these subjects this chapter can be skipped.

2 Algebraic fundamentals

The reason the General Number Field Sieve is so effective is that it uses strong algebraic constructions to get a factorization, which allows for far fewer computations to be made to get an effective result. The discussion starts by considering these constructions and concepts, such as field extensions, in all generality, but will swiftly collapse down to the specific cases needed for this sieve. We begin by introducing some notation that we will use throughout.

Definition 2.1.
• Let $K$ and $L$ be fields such that $K \subset L$, then $L$ is called a field extension of $K$ and is denoted $L/K$.
• $K(\alpha)$ is the smallest field extension of $K$ that contains $\alpha$. If $\alpha \in K$ then $K(\alpha) = K$.
• $K[x]$ is the polynomial extension of $K$, such that
$$K[x] = \Big\{ \sum_{i=0}^{m} a_i x^i \;\Big|\; m \in \mathbb{Z}_{\geq 0},\ \forall i : a_i \in K \Big\}.$$

Remark.
To prevent confusion ⊂ and ⊃ are proper sub- and super-constructions, while ⊆ and ⊇ can have equality.

A field extension $L/K$ can be seen as a $K$-vector space, and when considered as such it becomes a natural next step to think about the dimension of $L$ as a $K$-vector space. This is called the degree and is denoted $[L : K] = \dim_K(L)$. If there are multiple extensions, $L \supseteq M \supseteq K$, called a tower of extensions, then the degrees of these extensions are very well behaved.

Theorem 2.2. Tower Law
Let $K_n \supseteq K_{n-1} \supseteq \ldots \supseteq K_0$ be fields. Then
$$[K_n : K_0] = [K_n : K_{n-1}] \cdots [K_1 : K_0].$$

If $[L : K] < \infty$ the extension is called finite, else it is infinite. To use the sieve it is important to understand when such an extension is finite and when it isn't. Let $f(x)$ be an irreducible monic univariate polynomial with coefficients in $K$ of degree $\deg(f) = d$. As $f(x)$ is irreducible it has no roots in $K$ (for $d \geq 2$), but it might have roots over other fields that contain $K$.

Definition 2.3.

An irreducible polynomial $f$ over a field $K$ is said to be separable over $K$ if it has no multiple zeros in a splitting field. That is, there exists a field $L \supseteq K$ such that
$$f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_d)$$
in $L$, where $f(\alpha_i) = 0$ and the $\alpha_i$ are pairwise distinct.

This definition is not extremely helpful and in practice it will be more useful to consider [[4], proposition 9.14] instead:

Proposition 2.4. If $K$ is a subfield of $\mathbb{C}$ then every irreducible polynomial over $K$ is separable.

By the fundamental theorem of algebra it is known that a polynomial $f(x)$ with $\deg(f) = d$ has exactly $d$ roots in $\mathbb{C}$, counted with multiplicity. These roots are not exclusive to $f$: a root $\alpha$ is the root of many polynomials. However, there is a unique monic polynomial of lowest degree that is of specific importance:

Definition 2.5.
Let $f \in K[x]$ be a polynomial with coefficients in a field $K$ for which $\alpha$ is a root. Then $f$ is the minimal polynomial of $\alpha$ in $K$ if
• $f$ is monic.
• $f$ is irreducible.
• $\forall g \in K[x]$ such that $g(\alpha) = 0$, $f$ divides $g$.
Being monic, this polynomial is unique.

Throughout this text we adopt the convention to denote the minimal polynomial of $\alpha$ over $K$ as $\mathrm{irr}_K(\alpha)$. The minimal polynomial gives rise to the following equivalences, which are vital in the background throughout.

Theorem 2.6. Let $f \in K[x]$ with $\deg(f) = d$ be a monic irreducible univariate polynomial with root $\alpha$. Let $\langle f(x) \rangle$ denote the ideal generated by $f(x)$. Then the following are equivalent:
1. $f(x)$ is the minimal polynomial for $\alpha$ in $K[x]$; $f(x) = \mathrm{irr}_K(\alpha)$.
2. $K(\alpha) \cong K[x] / \langle f(x) \rangle$.
3. $K(\alpha)$ is a field.
4. $[K(\alpha) : K] = d$ and $\{1, \alpha, \alpha^2, \ldots, \alpha^{d-1}\}$ is a basis for $K(\alpha)$ over $K$.

The above theorem reveals a condition under which the extension $K(\alpha)$ is a finite degree extension of $K$. This condition is called algebraicity, and for an extension $L/K$ an element $\alpha \in L$ is called algebraic exactly if it is a root of a non-zero polynomial $f(x) \in K[x]$.

Remark.

There are two special cases to consider:
• A complex number, $\alpha \in \mathbb{C}$, that is the root of a monic polynomial with coefficients in $\mathbb{Z}$ is called an algebraic integer.
• A complex number, $\alpha \in \mathbb{C}$, that is the root of a non-zero polynomial with coefficients in $\mathbb{Q}$ is called an algebraic number.

It turns out that the structure of the algebraic numbers is very simple to describe.

Theorem 2.7.
Every algebraic number is of the form $\alpha / b$ where $\alpha$ is an algebraic integer and $b \in \mathbb{Z} \setminus \{0\}$.

If $\mathrm{irr}_K(\alpha)$ has degree $d$, then it is only natural to ask about the other roots of $\mathrm{irr}_K(\alpha)$. These are called the conjugates of $\alpha$ and have some interesting properties.

Definition 2.8.
Let $\alpha \in \mathbb{C}$ be algebraic over $K \subseteq \mathbb{C}$. The conjugates of $\alpha$, $\alpha = \alpha_1, \ldots, \alpha_d$, are the roots of $\mathrm{irr}_K(\alpha)$.

As $K \subseteq \mathbb{C}$, proposition 2.4 shows that any irreducible polynomial over $K$ is separable, hence the $\alpha_i$ are distinct:
$$\mathrm{irr}_K(\alpha) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_d).$$

Using the conjugates, [[1], theorem 5.6.1] proves that we only need to consider extensions by a single algebraic number:

Theorem 2.9.
Let $K \subseteq \mathbb{C}$ and let $\alpha, \beta \in \mathbb{C}$ be algebraic over $K$. Then there exists $\gamma \in \mathbb{C}$ algebraic over $K$ such that
$$K(\alpha, \beta) = K(\gamma).$$

This can easily be extended to any finite extension $K(\alpha_1, \ldots, \alpha_n)$ by using the fact that $K(\alpha_1, \alpha_2) = (K(\alpha_1))(\alpha_2)$.

To work effectively with the general number field sieve there are some restrictions we place on our extensions. The first restriction is that they be finite. Hence, from now on, every extension $L/K$ with $L \subseteq \mathbb{C}$ is assumed finite. Special focus will be on algebraic number fields:

Definition 2.10.
Let $K \subseteq \mathbb{C}$, then $K$ is an algebraic number field if $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$ such that for all $i \in \{1, \ldots, n\}$, $\alpha_i$ is an algebraic number.

By theorem 2.9 we can write $K = \mathbb{Q}(\alpha)$ for some algebraic number $\alpha \in \mathbb{C}$. Moreover we know that any algebraic number can be written as $\alpha = \beta / b$ where $\beta$ is an algebraic integer and $b \in \mathbb{Z} \setminus \{0\}$, but $\mathbb{Q}(\beta / b) = \mathbb{Q}(\beta)$ as $b \in \mathbb{Q}$, so it is safe to assume that $\alpha$ is in fact an algebraic integer.

Definition 2.11.

Let $\mathbb{A}$ be the set of all algebraic integers and let $K$ be an algebraic number field. Then the set $\mathcal{O}_K = \mathbb{A} \cap K$ is called the ring of algebraic integers of $K$.

As the name suggests, $\mathcal{O}_K$ is a ring. In fact, by [[1], theorem 6.1.4 and 6.1.6],

Theorem 2.12.
Let $K$ be an algebraic number field, then $\mathcal{O}_K$ is integrally closed, i.e.
1. $\mathcal{O}_K$ is an integral domain.
2. $\forall \alpha \in K$: $\alpha$ integral over $\mathcal{O}_K \Rightarrow \alpha \in \mathcal{O}_K$.

In general $\mathcal{O}_K$ is surprisingly well behaved, allowing for some interesting properties, including that its field of fractions is $K$:
$$\mathrm{f.f.}(\mathcal{O}_K) = \Big\{ \tfrac{\alpha}{\beta} \;\Big|\; \alpha, \beta \in \mathcal{O}_K,\ \beta \neq 0 \Big\} = K.$$

Another important property is [[1], theorem 6.5.3]:

Theorem 2.13.
Let $K$ be an algebraic number field. Then $\mathcal{O}_K$ is a Noetherian domain.

Remark.
A terribly inconvenient, yet common, convention is that the ideals $I \subseteq \mathcal{O}_K$ are just called ideals of $K$. Despite the confusion this might cause it will not pose a large problem, and will be clear from context, as in practice $K$ will be a number field, hence only has trivial ideals.

It is nearly by definition that a maximal ideal of an integral domain is a prime ideal, but in general the converse is not true. However, in the case of $\mathcal{O}_K$ for an algebraic number field $K$ this converse is true for non-zero prime ideals.

Theorem 2.14.
Let $K = \mathbb{Q}(\alpha)$ be an algebraic number field. Let $P \subseteq \mathcal{O}_K$ be a non-zero prime ideal of $\mathcal{O}_K$, then $P$ is a maximal ideal of $\mathcal{O}_K$.

Now we have shown that $\mathcal{O}_K$ is integrally closed, is a Noetherian domain, and that each non-zero prime ideal of $\mathcal{O}_K$ is a maximal ideal. These three properties define an important class of domains, which are very useful for the General Number Field Sieve.

Definition 2.15.

An integral domain $D$ that satisfies the following properties:
• $D$ is a Noetherian domain.
• $D$ is integrally closed.
• Each non-zero prime ideal of $D$ is a maximal ideal.
is called a Dedekind domain.

Dedekind domains admit a very interesting property, as they generalize the idea of unique factorization into primes, which we know by the fundamental theorem of arithmetic holds for all $z \in \mathbb{Z}$ up to units, $\pm 1$. In Dedekind domains this holds as well, however we must use the equivalent construction for number fields: the factorization of ideals into prime ideals.
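To see why unique factorization of elements cannot simply be carried over to rings of integers, consider the standard example $\mathbb{Z}[\sqrt{-5}]$, which is an illustration and not taken from the text above. The following Python sketch represents $a + b\sqrt{-5}$ as a pair `(a, b)`; the helper names `norm`, `multiply` and `elements_of_norm` are hypothetical. It checks that $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ and that no element has norm 2 or 3.

```python
from itertools import product


def norm(a, b):
    """Field norm of a + b*sqrt(-5), i.e. a^2 + 5*b^2."""
    return a * a + 5 * b * b


def multiply(x, y):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5); elements are pairs (a, b)."""
    a, b = x
    c, d = y
    return (a * c - 5 * b * d, a * d + b * c)


def elements_of_norm(target, bound=10):
    """All (a, b) with |a|, |b| <= bound and a^2 + 5*b^2 == target."""
    rng = range(-bound, bound + 1)
    return [(a, b) for a, b in product(rng, rng) if norm(a, b) == target]


two, three = (2, 0), (3, 0)
u, v = (1, 1), (1, -1)  # 1 + sqrt(-5) and 1 - sqrt(-5)

# Both products equal 6.
print(multiply(two, three), multiply(u, v))
# The norms of the four factors are 4, 9, 6 and 6.
print(norm(*two), norm(*three), norm(*u), norm(*v))
# A proper factor of any of them would need norm 2 or 3; none exists.
print(elements_of_norm(2), elements_of_norm(3))
```

Since the norm is multiplicative, a proper factor of 2, 3 or $1 \pm \sqrt{-5}$ would need norm 2 or 3, so the two factorizations of 6 are genuinely different at the level of elements. At the level of ideals uniqueness is restored, which is the content of the theorem that follows.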

Theorem 2.16.
Let $D$ be a Dedekind domain. Any non-trivial integral ideal is a product of prime ideals and this factorization is unique up to order, i.e. for $\mathcal{I} \subset D$ an ideal with factorization
$$\mathcal{I} = \prod_{i \in I} P_i^{e_i} = \prod_{j \in J} Q_j^{f_j},$$
with prime ideals $P_i, Q_j$, there exists a bijection $\sigma : I \to J$ such that $P_i = Q_{\sigma(i)}$ and $e_i = f_{\sigma(i)}$.

To prove this we need some auxiliary results:

Proposition 2.17.
Let $D$ be a Dedekind domain. Then every non-zero ideal $I \subseteq D$ contains a product of one or more prime ideals.

Definition 2.18.
Let $D$ be an integral domain and let $K$ be the quotient field of $D$. For each prime ideal $P$ of $D$ we define the set $\bar{P}$ by
$$\bar{P} = \{ \alpha \in K : \alpha P \subseteq D \}.$$
This set is a $D$-submodule of $K$ and is called a fractional ideal of $D$.

Proposition 2.19.
Let $D$ be a Dedekind domain. Let $P$ be a non-zero prime ideal of $D$. Then
$$P \bar{P} = D.$$

With these facts recorded we proceed with the proof of theorem 2.16:

Proof.
This proof will come in two parts: existence and uniqueness.

Existence: Let $D$ be a Dedekind domain and suppose, for contradiction, that there exists a non-trivial ideal of $D$ that is not a product of prime ideals. As $D$ is a Dedekind domain, it is Noetherian, and so by the maximality principle there is a non-trivial ideal $A$ of $D$ that is maximal with respect to the property of not being a product of prime ideals. Then by proposition 2.17 there exist prime ideals $P_1, \ldots, P_k$ of $D$ such that
$$P_1 \cdots P_k \subseteq A.$$
Now let $k$ be the smallest positive integer for which such a product exists. If $k = 1$ then $P_1 \subseteq A \subset D$, where $P_1$ is a prime ideal, hence maximal. So $A = P_1$. This contradicts our assumption that $A$ was not a product of prime ideals. Hence $k \geq 2$. By proposition 2.19 we have that $\bar{P}_1 P_1 = D$ and so
$$\bar{P}_1 P_1 P_2 \cdots P_k = D P_2 \cdots P_k = P_2 \cdots P_k.$$
Hence we have that $\bar{P}_1 A \supseteq P_2 \cdots P_k$. Moreover, as $D \subseteq \bar{P}_1$ by definition 2.18, we also have that $A \subseteq \bar{P}_1 A$. Now assume that $A = \bar{P}_1 A$, then
$$A \supseteq P_2 \cdots P_k,$$
which contradicts the minimality of $k$. Now assume $A \subset \bar{P}_1 A$, then as the latter is an ideal of $D$, by the maximality property of $A$, we get
$$\bar{P}_1 A = Q_2 \cdots Q_l,$$
for some prime ideals $Q_j$, $j \in \{2, \ldots, l\}$. But then
$$A = A D = A \bar{P}_1 P_1 = P_1 Q_2 \cdots Q_l,$$
which contradicts the way $A$ was chosen. Hence every ideal of $D$ is a product of prime ideals.

Uniqueness: Suppose now that there exists an ideal of $D$ with two factorizations into prime ideals. By the maximality principle we can choose such an ideal $A$ that is maximal with this property, say
$$A = P_1 \cdots P_k = Q_1 \cdots Q_l.$$
Then as $P_1 \cdots P_k \subseteq Q_1$ and $Q_1$ is a prime ideal we have that $P_i \subseteq Q_1$ for some $i \in \{1, \ldots, k\}$. Relabelling $P_i$ as $P_1$ we get $Q_1 = P_1$, as $P_1$ is a prime, hence maximal, ideal of $D$. Thus
$$Q_2 \cdots Q_l = \bar{Q}_1 Q_1 \cdots Q_l = A \bar{Q}_1 = A \bar{P}_1 = \bar{P}_1 P_1 \cdots P_k = P_2 \cdots P_k.$$
Now assume that $A = A \bar{P}_1$, then $A \bar{P}_1 P_1 = A P_1$ and so $A = A P_1$. Define the fractional ideal of $A$ as $\bar{A} = \bar{P}_1 \cdots \bar{P}_k$. Then
$$A \bar{A} = P_1 \cdots P_k \bar{P}_1 \cdots \bar{P}_k = D,$$
and so
$$D = \bar{A} A = A P_1 \bar{A} = P_1,$$
but this contradicts the primality of $P_1$, as prime ideals are proper, so our equality assumption was faulty. Hence $A \subset A \bar{P}_1$. By the maximality of $A$, the ideal $A \bar{P}_1$ has exactly one factorization as a product of prime ideals. Thus we deduce that
$$k - 1 = l - 1 \Rightarrow k = l.$$
Relabelling gives that $P_i = Q_i$ for all $i \in \{1, \ldots, k\}$, proving that the decomposition into prime ideals is unique.

2.3 Galois theory and the Frobenius element

In this section we give a very concise introduction to a few ideas from Galois Theory. If you have a basic understanding of finite Galois theory then there is no loss in simply skipping this section.

Definition 2.20.
Let $L/K$ be number fields. Then $L$ is said to be normal over $K$ if every irreducible polynomial $f \in K[x]$ that has a zero in $L$ splits over $L$.

Normality is easily checked under certain conditions:

Definition 2.21.
Let $L/K$ be number fields. Then $L$ is a splitting field over $K$ for the polynomial $f \in K[x]$ if:
• $f$ splits over $L$.
• There is no $M$, $L/M/K$, such that $M \neq L$ and $f$ splits over $M$.

Theorem 2.22.
A field extension $L/K$ is normal and finite if and only if $L$ is a splitting field for some polynomial $f \in K[x]$.

We can also speak of separable extensions.

Definition 2.23.
Let $K$ be a number field. Then a polynomial $f \in K[x]$ is said to be separable if it splits into distinct linear factors over a splitting field.

Definition 2.24.
Let $L/K$ be number fields. Then $L$ is said to be a separable extension of $K$ if for every $\alpha \in L$, $\mathrm{irr}_K(\alpha)$ is separable over $K$.

Remark.
A finite, normal, and separable extension of a number field is also called a Galois extension.

For separable extensions we have an equivalence allowing us to reduce to one irreducible polynomial:

Proposition 2.25.
For an extension $L/K$ the following are equivalent:
1. $L$ is a Galois extension of $K$.
2. $L$ is the splitting field of a separable polynomial $f \in K[x]$.

With this in place we will now start talking about automorphisms. First we define the Galois group:

Definition 2.26.

Let $L/K$ be number fields. Then the Galois group of $L$ over $K$ is defined as the group of $K$-automorphisms, i.e. the group of automorphisms $\pi : L \to L$ that fix every element of $K$. The group operation is composition of maps and the group is denoted $\mathrm{Gal}(L/K)$.

The existence of such a group is trivial as the identity function on $L$ is a $K$-automorphism, so will always lie in the Galois group. We make a distinction between real and complex $K$-automorphisms: a $K$-automorphism $\sigma \in \mathrm{Gal}(L/K)$ is real if $\sigma(L) \subseteq \mathbb{R}$ and complex otherwise.

Definition 2.27.
Let $L/K$ be number fields with Galois group $\mathrm{Gal}(L/K)$ such that $|\mathrm{Gal}(L/K)| = n$. Then the signature of $\mathrm{Gal}(L/K)$ is
$$\mathrm{sign}(\mathrm{Gal}(L/K)) = (r, s),$$
where $r$ is the number of real $K$-automorphisms and $s$ is half the number of complex $K$-automorphisms in $\mathrm{Gal}(L/K)$, such that $n = r + 2s$.

There is one element of the Galois group that is particularly interesting, which is the Frobenius element. The idea of this comes from the Frobenius automorphism, which we recall for good measure.

Definition 2.28. If $K$ is a field of characteristic $p > 0$, the map $\varphi : K \to K$ defined by $k \mapsto k^p$ is called the Frobenius monomorphism of $K$. When $K$ is finite, $\varphi$ is called the Frobenius automorphism.

To be able to define the Frobenius elements over extensions we have a little setting up to do. For this let $L/K$ be a Galois extension with Galois group $\mathrm{Gal}(L/K)$. Let $\mathfrak{P} \subseteq \mathcal{O}_L$ be a prime ideal lying over $\mathfrak{p} \subseteq \mathcal{O}_K$. Then we can define the decomposition group $D(\mathfrak{P}|\mathfrak{p})$ as the set of automorphisms in $\mathrm{Gal}(L/K)$ which fix $\mathfrak{P}$:
$$D(\mathfrak{P}|\mathfrak{p}) = \{ \sigma \in \mathrm{Gal}(L/K) \mid \sigma(\mathfrak{P}) = \mathfrak{P} \}.$$

Definition 2.29.
Let $L/K$ be a Galois extension with Galois group $\mathrm{Gal}(L/K)$, let $\mathfrak{P} \subseteq \mathcal{O}_L$ be a prime ideal unramified in $L/K$ lying over $\mathfrak{p} \subseteq \mathcal{O}_K$. Then the Frobenius element, $\mathrm{Frob}_{\mathfrak{P}}$, is the element of $D(\mathfrak{P}|\mathfrak{p})$ that acts as the Frobenius automorphism on the residue field. I.e. the unique $\mathrm{Frob}_{\mathfrak{P}}$ such that
• $\mathrm{Frob}_{\mathfrak{P}} \in D(\mathfrak{P}|\mathfrak{p})$.
• For all $\alpha \in \mathcal{O}_L$, $\mathrm{Frob}_{\mathfrak{P}}(\alpha) \equiv \alpha^q \bmod \mathfrak{P}$, where $q = |\mathcal{O}_K / \mathfrak{p}|$.

The Frobenius element for $\mathfrak{P}$ is uniquely determined, and the Frobenius elements of different primes over the same $\mathfrak{p}$ are related. For example, if $\mathfrak{P}$ and $\mathfrak{P}'$ both lie over $\mathfrak{p} \subset \mathcal{O}_K$ then the Frobenius elements are conjugate. This means that each unramified prime ideal in $\mathcal{O}_K$ gives rise to a conjugacy class of Frobenius elements in $\mathrm{Gal}(L/K)$.

Proposition 2.30.
Let $L/K$ be a Galois extension with Galois group $\mathrm{Gal}(L/K)$, let $\mathfrak{P}, \mathfrak{P}' \subseteq \mathcal{O}_L$ be prime ideals unramified in $L/K$, both lying over $\mathfrak{p} \subseteq \mathcal{O}_K$. Then there exists $\sigma \in \mathrm{Gal}(L/K)$ such that
$$\mathrm{Frob}_{\mathfrak{P}} = \sigma\, \mathrm{Frob}_{\mathfrak{P}'}\, \sigma^{-1}.$$

Proof.
The Galois group acts transitively on the prime ideals lying over $\mathfrak{p}$, so choose $\sigma$ with $\sigma \mathfrak{P}' = \mathfrak{P}$. Let $\alpha \in \mathcal{O}_L$. Then
$$\sigma\, \mathrm{Frob}_{\mathfrak{P}'}\, \sigma^{-1}(\alpha) = \sigma\big( (\sigma^{-1}(\alpha))^q + a \big),$$
for some $a \in \mathfrak{P}'$, and
$$\sigma\big( (\sigma^{-1}(\alpha))^q + a \big) = \alpha^q + \sigma(a) \equiv \alpha^q \bmod \sigma\mathfrak{P}'.$$
As $\sigma \mathfrak{P}' = \mathfrak{P}$, the uniqueness of the Frobenius element gives $\sigma\, \mathrm{Frob}_{\mathfrak{P}'}\, \sigma^{-1} = \mathrm{Frob}_{\mathfrak{P}}$.

This is all we need for now, but in chapter 5 we will expand our theory a little more.
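As an illustration of the definition of the Frobenius element, consider the assumed example $L = \mathbb{Q}(i)$, $K = \mathbb{Q}$, with $\mathcal{O}_L = \mathbb{Z}[i]$ and $\mathrm{Gal}(L/K)$ consisting of the identity and complex conjugation. Here $q = p$ for an odd rational prime $p$, and as $p$ lies in every prime ideal above it, a congruence modulo $p$ implies the same congruence modulo that prime ideal. The sketch below, whose function names are hypothetical, computes $\alpha^p$ in $\mathbb{Z}[i]$ with coordinates reduced modulo $p$ and reports which of the two automorphisms it agrees with.

```python
def gauss_mul(x, y, p):
    """Multiply Gaussian integers x = (a, b) ~ a + bi and y, coordinates mod p."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def gauss_pow(x, e, p):
    """Compute x**e in Z[i] with coordinates reduced mod p (square and multiply)."""
    result = (1 % p, 0)
    base = (x[0] % p, x[1] % p)
    while e > 0:
        if e & 1:
            result = gauss_mul(result, base, p)
        base = gauss_mul(base, base, p)
        e >>= 1
    return result


def frobenius_action(p):
    """Classify alpha -> alpha**p on Z[i]/(p) as 'identity' or 'conjugation'."""
    residues = [(a, b) for a in range(p) for b in range(p)]
    if all(gauss_pow(x, p, p) == x for x in residues):
        return "identity"
    if all(gauss_pow(x, p, p) == (x[0], (-x[1]) % p) for x in residues):
        return "conjugation"
    return "neither"


for p in (3, 5, 7, 13):
    print(p, frobenius_action(p))
```

For $p \equiv 1 \bmod 4$ the map is the identity and for $p \equiv 3 \bmod 4$ it is complex conjugation, which matches the fact that the Frobenius element depends only on the prime below, up to conjugacy (and the group here is abelian).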

Let $K = \mathbb{Q}(\alpha)$ be a number field with Galois group $\mathrm{Gal}(K/\mathbb{Q})$. Let $\mathcal{O}_K$ be the ring of algebraic integers of $K$. It will prove useful in what is to come that we can relate the elements of a number field to rational numbers in $\mathbb{Q}$. There are two natural maps to consider:

Definition 2.31.
Let $L/K$ be an extension of number fields with $[L : K] = n$. Let $\sigma_1, \ldots, \sigma_n$ be the $K$-embeddings of $L$ in $\mathbb{C}$. For any $\alpha \in L$ we define the following maps:
1. The trace of an element $\alpha$:
$$\mathrm{Tr}_{L/K} : L \to \mathbb{C}, \quad \alpha \mapsto \sum_{i=1}^{n} \sigma_i(\alpha).$$
2. The field norm of an element $\alpha$:
$$N_{L/K} : L \to \mathbb{C}, \quad \alpha \mapsto \prod_{i=1}^{n} \sigma_i(\alpha).$$
Equivalently, the norm is the product of the distinct conjugates of $\alpha$ over $K$ raised to the power $[L : K(\alpha)]$.

Remark.
If it is clear from context what field we are taking the norm over we will often just write $N_{L/K} = N$.

We will mostly be concerned with Galois extensions, which makes the field norm far simpler. For this consider $L/K$ with Galois group $\mathrm{Gal}(L/K)$, then for any $\alpha \in L$ the norm becomes
$$N_{L/K}(\alpha) = \prod_{\sigma \in \mathrm{Gal}(L/K)} \sigma(\alpha).$$
These two maps have a few important properties we will make use of throughout.

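Proposition 2.32.
1. $N_{L/K}$ is a multiplicative and $\mathrm{Tr}_{L/K}$ an additive homomorphism from $L$ to $K$.
2. If $\alpha \in \mathcal{O}_L$ then $N_{L/K}(\alpha) \in \mathcal{O}_K$ and $\mathrm{Tr}_{L/K}(\alpha) \in \mathcal{O}_K$.
3. Let $\alpha, \beta \in \mathcal{O}_L$ such that $\alpha \mid \beta$, then $N_{L/K}(\alpha) \mid N_{L/K}(\beta)$.
4. If $k \in K$, then $N_{L/K}(k) = k^n$ and $\mathrm{Tr}_{L/K}(k) = nk$, where $[L : K] = n$.

Remark.
Note that (2) implies that for all $\alpha \in \mathcal{O}_L$: $\alpha \in \mathcal{O}_L^* \Leftrightarrow N_{L/\mathbb{Q}}(\alpha) = \pm 1$, as $\mathcal{O}_{\mathbb{Q}} = \mathbb{Z}$ and $\mathbb{Z}^* = \{\pm 1\}$.

Definition 2.33.
Let $K$ be a number field of finite degree. Let $\mathcal{O}_K$ be the ring of algebraic integers of $K$. Let $I \subset \mathcal{O}_K$ be a non-trivial ideal of $\mathcal{O}_K$. Then the ideal norm of $I$ is
$$N(I) = |\mathcal{O}_K / I|.$$

However, the field norm and the ideal norm are not completely independent from each other, as they are related for principal ideals. In fact, for principal ideals the field and ideal norm coincide.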

Theorem 2.34.
Let $K$ be a number field of finite degree. Let $I = \langle \alpha \rangle \subset \mathcal{O}_K$. Then
$$|\mathcal{O}_K / I| = N(I) = N(\langle \alpha \rangle) = |N(\alpha)|.$$

It will not come as a surprise that the ideal norm is also a homomorphism, such that for every two non-zero integral ideals $A, B \subseteq \mathcal{O}_K$, $N(AB) = N(A) N(B)$. This is absolutely not trivial, but intuitive enough for us to leave the proof to [[1], theorem 9.3.2]. Now we have a definition for the ideal norm and an explicit, and reasonably simple, way to compute this for principal ideals. (There are multiple ways to define the norm of an ideal. The definition we adopt has the benefit of being completely algebraic, which is why we use it.) Up until now we do not have a general way to compute the ideal norm, but we can get closer by recalling theorem 2.16: as the ideal norm is a homomorphism, we just have to consider how to compute the ideal norm for prime ideals.
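Theorem 2.34 can be checked by brute force in a small case. The sketch below assumes $K = \mathbb{Q}(i)$ with $\mathcal{O}_K = \mathbb{Z}[i]$, an example chosen for illustration and not taken from the text; the function names are hypothetical. A Gaussian integer $a + bi$ is stored as the pair `(a, b)`, its field norm is $a^2 + b^2$, and the ideal norm is obtained by literally counting the cosets of $\mathbb{Z}[i] / \langle \alpha \rangle$.

```python
def divisible_by(alpha, z):
    """Return True if the Gaussian integer z = (c, d) lies in the ideal <alpha>."""
    a, b = alpha
    c, d = z
    n = a * a + b * b  # field norm N(alpha) = alpha * conj(alpha)
    # z / alpha = z * conj(alpha) / n, so both coordinates must be multiples of n.
    real = c * a + d * b
    imag = d * a - c * b
    return real % n == 0 and imag % n == 0


def quotient_size(alpha):
    """Count the cosets of Z[i] / <alpha> by brute force (alpha must be non-zero)."""
    a, b = alpha
    n = a * a + b * b
    # n and n*i lie in <alpha>, so every coset has a representative in [0, n) x [0, n).
    representatives = []
    for u in range(n):
        for v in range(n):
            if not any(divisible_by(alpha, (u - r, v - s)) for r, s in representatives):
                representatives.append((u, v))
    return len(representatives)


for alpha in [(1, 1), (2, 1), (3, 0), (2, 3)]:
    a, b = alpha
    print(alpha, "field norm:", a * a + b * b, "ideal norm:", quotient_size(alpha))
```

In each case the two numbers agree, as the theorem predicts. The counting approach is only meant to make the definition $N(I) = |\mathcal{O}_K / I|$ tangible; it is far too slow to be used for anything but tiny examples.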

Theorem 2.35.
Let $K$ be an algebraic number field. Let $P \subset \mathcal{O}_K$ be a non-zero prime ideal. Then there exists a unique prime $p \in \mathbb{Z}$ such that
$$P \mid \langle p \rangle.$$
As this $p \in \mathbb{Z}$ is unique we say $P$ lies over $p$, or equivalently $p$ lies below $P$. This has an immediate effect on the ideal norm, as can be seen in the following theorem:

Theorem 2.36.
Let $K$ be an algebraic number field with $[K : \mathbb{Q}] = n$. Let $P \subset \mathcal{O}_K$ be a prime ideal lying over $p \in \mathbb{Z}$. Then
$$N(P) = p^f,$$
for some integer $f \in \{1, \ldots, n\}$.

Proof. As $P$ lies above $p$ we have that $P \mid \langle p \rangle$. Hence $\langle p \rangle = PQ$ for some integral ideal $Q \subseteq \mathcal{O}_K$. As the norm is multiplicative we have that
$$N(\langle p \rangle) = N(P) N(Q).$$
As each of the $n$ $K$-conjugates of $p$ equals $p$, we have $N(p) = p^n$, and so
$$N(\langle p \rangle) = |N(p)| = N(P) N(Q) = p^n.$$
Hence $N(P)$ divides $p^n$, and as $P$ is a proper ideal $N(P) > 1$, so $N(P) = p^f$ with $1 \leq f \leq n$.

The $f$ in the theorem is called the inertial degree and is used to define one of the most interesting objects in number theory, the ideal class group, which we will define momentarily. First we need some preliminary definitions and properties.

Definition 2.37. Let $K$ be an algebraic number field with $[K : \mathbb{Q}] = n$. Let $p \in \mathbb{Z}$ be prime such that
$$\langle p \rangle = \prod_{i=1}^{g} P_i^{e_i}.$$
Then $g$ is called the decomposition number, $e_i$ is the ramification index for $P_i$, and the following properties hold:
1. The inertia degree, $f_i$, can be computed as $f_i = [\mathcal{O}_K / P_i : \mathbb{Z} / \langle p \rangle]$.
2. $p$ is said to be ramified in $K$ if, for some $i$, $e_i > 1$.
3. $p$ is said to be unramified in $K$ if for all $i$: $e_i = 1$.
4. The following equation holds:
$$\sum_{i=1}^{g} e_i f_i = n.$$
5. $p$ is said to be (totally) split if for all $i$: $e_i = f_i = 1$, equivalently $g = n$.

Remark.
This can also be defined generally for extensions $L/K$, and follows naturally from the above definition. For our purposes the above will suffice however.
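The quantities $e$, $f$ and $g$ can be made concrete in the assumed example $K = \mathbb{Q}(i)$, where $n = 2$. The sketch below relies on a standard fact that is not proved in the text (the Dedekind-Kummer theorem): since $\mathcal{O}_K = \mathbb{Z}[i]$, the factorization of $\langle p \rangle$ mirrors the factorization of $x^2 + 1$ modulo $p$. Two distinct roots mean $p$ splits, one double root means $p$ ramifies, and no root means $p$ stays prime. The function name `splitting_type` is hypothetical.

```python
def splitting_type(p):
    """Return (e, f, g) for the rational prime p in Z[i], via roots of x^2 + 1 mod p."""
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if len(roots) == 2:
        return (1, 1, 2)  # two distinct linear factors: p is totally split
    if len(roots) == 1:
        return (2, 1, 1)  # a double root: p is ramified
    return (1, 2, 1)      # x^2 + 1 irreducible mod p: p is inert


n = 2  # the degree [Q(i) : Q]
for p in (2, 3, 5, 7, 13):
    e, f, g = splitting_type(p)
    # In a Galois extension all e_i and all f_i coincide, so sum(e_i * f_i) = e * f * g.
    assert e * f * g == n
    print(p, "e =", e, "f =", f, "g =", g)
```

Only $p = 2$ ramifies here, which is consistent with the discriminant of $\mathbb{Q}(i)$ being $-4$.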

We now define the concept of the (absolute) discriminant of a number field and immediately apply it to a theorem by Dedekind:

Definition 2.38.
Let $K$ be a number field of finite degree $n$ with ring of algebraic integers $\mathcal{O}_K$. Let $\{a_1, \ldots, a_n\}$ be an integral basis for $\mathcal{O}_K$ and let $\{\sigma_1, \ldots, \sigma_n\}$ be the set of embeddings of $K$ into $\mathbb{C}$. Then the (absolute) discriminant of $K$ is
$$\Delta_K = \det \begin{pmatrix} \sigma_1(a_1) & \cdots & \sigma_1(a_n) \\ \vdots & \ddots & \vdots \\ \sigma_n(a_1) & \cdots & \sigma_n(a_n) \end{pmatrix}^{2}.$$

Theorem 2.39.
Let $K$ be an algebraic number field. Then the prime $p \in \mathbb{Z}$ ramifies in $K$ if and only if $p \mid \Delta_K$.

Now we define the ideal class group:

Definition 2.40.
Let $K$ be an algebraic number field and let $A, B \subseteq \mathcal{O}_K$ be non-zero ideals. Then we say that $A \sim B$, that is they are equivalent, if there exist $\alpha, \beta \in \mathcal{O}_K \setminus \{0\}$ such that $\langle \beta \rangle A = \langle \alpha \rangle B$. The equivalence classes under $\sim$ are called ideal classes and the set of ideal classes of $K$, denoted $\mathrm{Cl}_K$, is called the ideal class group. Moreover, the cardinality of the ideal class group, $|\mathrm{Cl}_K| = h_K$, is called the class number of $K$.

It is well known that $h_K$ is finite and that $\mathrm{Cl}_K$ is an abelian group for $K$ an algebraic number field. More can be said, especially about the generation of $\mathrm{Cl}_K$, and we will record these facts in quick succession.

Theorem 2.41.
Let $I \subseteq \mathcal{O}_K$ be a non-zero ideal. Then there exists a non-zero $\alpha \in I$ with
$$|N_{K/\mathbb{Q}}(\alpha)| \leq c_K N(I),$$
for
$$c_K = \left(\frac{4}{\pi}\right)^{r} \frac{n!}{n^n} \sqrt{|\Delta_K|},$$
where $2r = |\{ \sigma_i \mid \sigma_i \text{ a } \mathbb{Q}\text{-embedding of } K,\ \sigma_i(K) \not\subseteq \mathbb{R} \}|$.

Remark. $c_K$ is called the Minkowski bound.

Theorem 2.42.
Any ideal class $c \in \mathrm{Cl}_K$ contains an ideal $I$ such that $N(I) \leq c_K$.

Theorem 2.43.
Let $K$ be a number field. Then $\mathrm{Cl}_K$ is generated by the classes of the prime ideals $[P]$ satisfying $N(P) \leq c_K$. In particular, any such $P$ must lie above a prime $p \leq c_K$.

In the last section we defined the absolute discriminant. It will not surprise the reader that there is then also a relative discriminant. To define it we need to consider the following:

Definition 2.44.

Let $L/K$ be number fields with rings of integers $\mathcal{O}_L$ and $\mathcal{O}_K$ respectively. The fractional ideal
$$\mathfrak{C}_{\mathcal{O}_L | \mathcal{O}_K} = \{ x \in L \mid \mathrm{Tr}(x \mathcal{O}_L) \subseteq \mathcal{O}_K \}$$
is called Dedekind's complementary module. Its inverse,
$$\mathfrak{D}_{\mathcal{O}_L | \mathcal{O}_K} = \mathfrak{C}^{-1}_{\mathcal{O}_L | \mathcal{O}_K},$$
is called the different of $\mathcal{O}_L / \mathcal{O}_K$.

Remark.
With the usual abuse of notation we will write $\mathfrak{D}_{L/K}$ for the different $\mathfrak{D}_{\mathcal{O}_L | \mathcal{O}_K}$.

By definition Dedekind's complementary module contains $\mathcal{O}_L$ and so $\mathfrak{D}_{L/K}$ is in fact an integral ideal. The different is well-behaved in towers of number fields:

Proposition 2.45.
For a tower of fields $L/M/K/\mathbb{Q}$: $\mathfrak{D}_{L/K} = \mathfrak{D}_{L/M} \mathfrak{D}_{M/K}$.

This different ideal, or simply 'different', is key in the definition of the relative discriminant:

Definition 2.46.
Let $L/K$ be number fields. The relative discriminant $\Delta_{L/K}$ is the ideal of $\mathcal{O}_K$ defined by the relative norm of the different:
$$\Delta_{L/K} = N_{L/K}(\mathfrak{D}_{L/K}).$$

This allows us to define a transitivity condition for the discriminant.

Lemma 2.47.
Let $L/M/K$ be a tower of number fields. Then
$$\Delta_{L/K} = \Delta_{M/K}^{[L : M]} \, N_{M/K}(\Delta_{L/M}).$$

Proof.
Applying to $\mathfrak{D}_{L/K} = \mathfrak{D}_{L/M} \mathfrak{D}_{M/K}$ the norm $N_{L/K} = N_{M/K} \circ N_{L/M}$ we obtain
$$\Delta_{L/K} = N_{M/K}(\Delta_{L/M}) \, N_{M/K}\big(\mathfrak{D}_{M/K}^{[L : M]}\big) = N_{M/K}(\Delta_{L/M}) \, \Delta_{M/K}^{[L : M]}.$$

2.5 Dirichlet's Unit Theorem

In this last preliminary section we want to get an idea of how $\mathcal{O}_K$ looks. Specifically, we want to get an idea of the units in $\mathcal{O}_K$. Note that the units of a ring form a group, called the unit group, which is denoted $\mathcal{O}_K^\times$.

Definition 2.48.
Let $n \in \mathbb{N}$ and $\zeta_n = e^{2\pi i / n}$. Then $\zeta_n$ is an $n$th root of unity and the number field $\mathbb{Q}(\zeta_n)$ is called the $n$th cyclotomic field.

These cyclotomic fields are in many ways the easiest number fields to work with because they behave especially nicely when $n$ is an odd prime. To emphasize this let $p \in \mathbb{Z}$ be an odd prime for the remainder of the section.

Theorem 2.49.
Let $K = \mathbb{Q}(\zeta_p)$, then $\mathcal{O}_K = \mathbb{Z}[\zeta_p]$.

With that little note we return to what is at hand. Let $\mu(K)$ be the group of roots of unity contained in a number field $K$. Note that $\mu(K)$ is a finite cyclic group under multiplication. The following proposition is then a tautology:

Proposition 2.50.
Let $K$ be a number field. Then $\mu(K) \subset \mathcal{O}_K^\times$.

The question now becomes: what are the other elements in $\mathcal{O}_K^\times$? Recall from 2.3 that the signature of a field is $\mathrm{sign}(K) = (r, s)$, where $r$ is the number of real $\mathbb{Q}$-embeddings and $2s$ the number of complex $\mathbb{Q}$-embeddings.

Theorem 2.51.
Let $K$ be a number field. There exists a group homomorphism $l : \mathcal{O}_K^\times \to \mathbb{R}^{r+s-1}$ such that $\ker(l)$ is finite and cyclic, and $\ker(l) \cong \mu(K)$. Moreover
$$l(\mathcal{O}_K^\times) \cong \mathbb{Z}^{r+s-1}.$$

This immediately leads us to the concluding theorem:

Theorem 2.52. Dirichlet's Unit Theorem
Let $K$ be a number field and let $\mathrm{sign}(K) = (r, s)$. Then
$$\mathcal{O}_K^\times \cong \mu(K) \times \mathbb{Z}^{r+s-1}.$$

Proof.
From theorem 2.51 we have that there exists a homomorphism $l$ such that $\ker(l) \cong \mu(K)$, hence by the isomorphism theorems
$$\mathcal{O}_K^\times / \mu(K) = \mathcal{O}_K^\times / \ker(l) \cong \mathrm{im}(l) \cong \mathbb{Z}^{r+s-1},$$
and as $\mathcal{O}_K^\times$ is an abelian group and $\mathbb{Z}^{r+s-1}$ is free we get that $\mathcal{O}_K^\times \cong \mu(K) \times \mathbb{Z}^{r+s-1}$.

3 The General Number Field Sieve

For the entirety of this section let $n \in \mathbb{N}$ be a given integer that we wish to factor. Without loss of generality assume that $n$ is odd, else we could simply consider $m \in \mathbb{N}$ where $n = 2^s m$, and that $n$ is composite, else a polynomial-time primality test, such as the AKS Primality Test [18], will show that $n$ does not need factoring. The General Number Field Sieve (GNFS) then uses a congruence of squares modulo $n$ to perform integer factorization.

Assume that $(x, y)$ is a pair that admits a congruence of squares modulo $n$, so that
$$x^2 \equiv y^2 \bmod n,$$
where $x \neq \pm y$. Then we can use $(x, y)$ to find a non-trivial factorization by using the well-established fact that $x^2 - y^2 = (x + y)(x - y)$. We then compute $\gcd(x + y, n) = d_1$ and $\gcd(x - y, n) = d_2$, in the hope that this gives a non-trivial factorization, that is $d_1, d_2 \neq 1$ or $n$.

Assume that we are able to produce random integers $x$ and $y$ such that the above congruence holds. Then for the hardest case, $n = pq$ semiprime, Table 1 shows an exhaustive truth table:

Table 1: Factorization cases for $n = pq$

| $p \mid x+y$ | $p \mid x-y$ | $q \mid x+y$ | $q \mid x-y$ | $\gcd(x+y, n)$ | $\gcd(x-y, n)$ | Factor? |
|---|---|---|---|---|---|---|
| T | T | T | T | $n$ | $n$ | F |
| T | T | T | F | $n$ | $p$ | T |
| T | T | F | T | $p$ | $n$ | T |
| T | F | T | T | $n$ | $q$ | T |
| T | F | T | F | $n$ | $1$ | F |
| T | F | F | T | $p$ | $q$ | T |
| F | T | T | T | $q$ | $n$ | T |
| F | T | T | F | $q$ | $p$ | T |
| F | T | F | T | $1$ | $n$ | F |

It can be seen that we only have failure if $p$ and $q$ have the same conditions on the factorization of $x + y$ and $x - y$. This is because that means that $x \equiv \pm y \bmod n$. If that is not the case then we always find a non-trivial factorization. How the NFS operates to find such congruent pairs $(x, y)$ is the goal of this section.

3.1 Introducing the algorithm

The Number Field Sieve is so named because of a sieving process happening both over $\mathbb{Z}$ and a ring $\mathbb{Z}[\alpha]$, which are contained in the field $\mathbb{Q}$ and the number field $\mathbb{Q}(\alpha)$ respectively. In this section we will give a description of how this is done, while we go into detail, introducing formal definitions and rigor, in Section 3.3. However, the following commuting diagram captures the whole process:

$$\begin{array}{ccc}
\mathbb{Z}[x] & \xrightarrow{\ x \mapsto m\ } & \mathbb{Z} \\
\downarrow {\scriptstyle \bmod f} & & \downarrow {\scriptstyle \bmod n} \\
\mathbb{Z}[x]/(f) \cong \mathbb{Z}[\alpha] & \xrightarrow{\ x \mapsto m \bmod n\ } & \mathbb{Z}/n\mathbb{Z}
\end{array}$$

Before we can discuss how the NFS obtains a congruence we introduce the concept of smooth numbers.
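To make the congruence-of-squares step from the start of this section concrete, the following is a minimal Python sketch. The function name `split_from_congruence` and the toy modulus $n = 91 = 7 \cdot 13$ are illustrative assumptions, not part of the thesis; the pair $(x, y)$ is simply given here, whereas producing such pairs is the actual work of the sieve.

```python
from math import gcd


def split_from_congruence(x, y, n):
    """Try to split n using x^2 ≡ y^2 (mod n).

    Returns a non-trivial factor of n, or None when both gcds are trivial
    (the failure rows of Table 1, where x ≡ ±y mod n).
    """
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent modulo n")
    for d in (gcd(x + y, n), gcd(x - y, n)):
        if 1 < d < n:
            return d
    return None


# 10^2 = 100 ≡ 9 = 3^2 (mod 91) and 10 is not ±3 modulo 91.
print(split_from_congruence(10, 3, 91))   # 13
# 88 ≡ -3 (mod 91): the congruence holds but only trivial gcds appear.
print(split_from_congruence(88, 3, 91))   # None
```

The second call corresponds to a failure row of the truth table: both primes divide $x + y$ and neither divides $x - y$.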

Deﬁnition 3.1.

Let $z \in \mathbb{Z}$ and let $B \in \mathbb{N}$. Then $z$ is said to be $B$-smooth if in the prime factorization of $z$,
$$z = \pm \prod_{i=1}^{n} p_i^{e_i},$$
it holds that $p_i \le B$ for all $i = 1, \dots, n$.

It is clear that this definition is insufficient for $\mathcal{O}_{\mathbb{Q}(\alpha)}$ as this ring does not solely consist of integers, however that is easily amended.

For the sake of exposition we will consider $\mathcal{O}_{\mathbb{Q}(\alpha)} = \mathbb{Z}[\alpha]$. This is rarely the case for number fields in general, as it only happens for monogenic fields, but in section 3.3 we will see that this assumption can be dealt with algebraically so we only need to concern ourselves with this case. Let $a - b\alpha \in \mathbb{Z}[\alpha]$ and let $\langle a - b\alpha \rangle$ be the ideal generated by it. As $\mathbb{Q}(\alpha)$ is an algebraic number field we get that $\mathbb{Z}[\alpha]$ is a Dedekind domain, hence any ideal splits uniquely into a product of prime ideals:
$$\langle a - b\alpha \rangle = \prod_{i=1}^{n} P_i^{\eta_i}.$$
Moreover by theorem 2.36 we know that any prime ideal of $\mathbb{Z}[\alpha]$ evaluates to a prime power under the norm map. This allows us to define an analogy to smoothness over $\mathbb{Z}$:

Definition 3.2.

Let $a - b\alpha \in \mathbb{Z}[\alpha]$ and let $B \in \mathbb{N}$. Consider $\langle a - b\alpha \rangle \subseteq \mathbb{Z}[\alpha]$ with factorization into prime ideals
$$\langle a - b\alpha \rangle = \prod_{i=1}^{n} P_i^{\eta_i}.$$
Then $a - b\alpha$ is said to be $B$-smooth if for all $N(P_i) = p_i^{f_i}$ it holds that $p_i \le B$, where $N : \mathbb{Q}(\alpha) \to \mathbb{Q}$ is the field norm.

Using these definitions we can, for some fixed $m$ to be defined later, define sets $S_{\mathbb{Z}}$ and $S_{\mathbb{Z}[\alpha]}$ with smoothness bounds $B$ and $B'$ respectively, as follows:
$$S_{\mathbb{Z}} = \{ (a, b) \in \mathbb{Z}^2 \mid \gcd(a, b) = 1,\ a - bm \text{ is } B\text{-smooth as in Definition 3.1} \},$$
$$S_{\mathbb{Z}[\alpha]} = \{ (a, b) \in \mathbb{Z}^2 \mid \gcd(a, b) = 1,\ a - b\alpha \text{ is } B'\text{-smooth as in Definition 3.2} \}.$$
The bulk of the work done in the GNFS is done in determining these sets and we will go into much more detail in the next section. If we succeed in finding a set $S \subseteq S_{\mathbb{Z}} \cap S_{\mathbb{Z}[\alpha]}$ such that $|S| \ge \pi(B) + \pi(B') + 1$, then the GNFS uses the remainder of its computation time to find a set $T \subset S$ such that
$$\prod_{(a,b) \in T} (a - bm) \text{ is a square in } \mathbb{Z}, \qquad (1)$$
$$\prod_{(a,b) \in T} (a - b\alpha) \text{ is a square in } \mathbb{Z}[\alpha]. \qquad (2)$$
It is not a certainty that such a set $T$ fulfilling these equations can be found. If it is not, then the GNFS can be restarted with larger smoothness bounds $B$ and $B'$. For now assume we have found such a set $T$. From here completion of the algorithm is nearly trivial:

First we observe that $f'(m)^2 \prod_{(a,b) \in T} (a - bm)$ must give a square in $\mathbb{Z}$, $z^2$ say, and $f'(\alpha)^2 \prod_{(a,b) \in T} (a - b\alpha)$ must give a square in $\mathbb{Z}[\alpha]$, $\xi^2$ say, for which there exist elementary methods to find the square roots $z$ and $\xi$.

Remark.

We have multiplied equation (1) and (2) by the square of the derivative of a polynomial at $m$ and $\alpha$ respectively. This is to ensure that $z \in \mathbb{Z}$ and $\xi \in \mathbb{Z}[\alpha]$. A more thorough discussion of this follows in 3.3.

From here we compute $\gcd(z - N(\xi), n)$ and $\gcd(z + N(\xi), n)$. If one of these is non-trivial, then the result is a non-trivial factor of $n$. Else we have to conclude failure and start again with different parameters.

This describes the algorithm in broad strokes. Now it is time to introduce some mathematical rigor.

The first step of the NFS is that we must choose a $d \in \mathbb{N}$, $d \neq 1$, and define $m = \lfloor n^{1/d} \rfloor$ such that $\gcd(m, n) = 1$. This choice of $m$ gives rise to a polynomial $f \in \mathbb{Z}[x]$ such that $n \mid f(m)$. Over the years there have been many attempts at making the choice of polynomial as effective as possible, but the simplest and effectively cheapest way to do this (by "cheapest" we mean computationally most effective) is to use the "base-$m$" method. For each integer $m \in \mathbb{Z}$ we can write $n$ as a linear combination of powers of $m$ such that
$$n = \sum_{i=0}^{d} c_i m^i.$$
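The base-$m$ expansion just described can be sketched in a few lines of Python. The helper names `integer_root` and `base_m_coefficients` and the toy value $n = 45113$ with $d = 3$ are assumptions made for illustration only; the sketch does not check $\gcd(m, n) = 1$ or the size conditions under which the leading coefficient is guaranteed to be 1.

```python
def integer_root(n, d):
    """Largest integer m with m**d <= n, computed exactly by bisection."""
    lo, hi = 1, 1 << (n.bit_length() // d + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** d <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def base_m_coefficients(n, d):
    """Return (m, [c_0, ..., c_k]) with n = sum(c_i * m**i) and 0 <= c_i < m."""
    m = integer_root(n, d)
    coeffs, rest = [], n
    while rest:
        rest, c = divmod(rest, m)   # peel off the lowest base-m digit
        coeffs.append(c)
    return m, coeffs


m, coeffs = base_m_coefficients(45113, 3)
print(m, coeffs)   # 35 [33, 28, 1, 1]
# The digits reproduce n when evaluated at m.
print(sum(c * m**i for i, c in enumerate(coeffs)) == 45113)   # True
```

In this toy case the expansion has exactly $d + 1 = 4$ digits and the leading digit is 1, so the digits read as the coefficients of a monic cubic.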

Then we can define a polynomial $f \in \mathbb{Z}[x]$ such that
$$f(x) = \sum_{i=0}^{d} c_i x^i,$$
where the $c_i \in \{0, \dots, m - 1\}$ are the same as in the base-$m$ expansion of $n$. This guarantees that $n \mid f(m)$, in fact $f(m) = n$, and $\deg(f) = d$. By [[17], prop. 3.2] we have that the leading coefficient $c_d = 1$, so that $f \in \mathbb{Z}[x]$ is in fact monic.

Remark.

Note that as $|c_i| \in \{0, \dots, m - 1\}$, hence $c_i < m \le n^{1/d}$, we can see that the discriminant of $f$ satisfies
$$|\Delta(f)| < d^{2d} n^{2 - 3/d}.$$
We may also assume that $f \in \mathbb{Z}[x]$ is irreducible. To see this assume, to the contrary, that $f \in \mathbb{Z}[x]$ is reducible. This means that there exist non-trivial polynomials $g, h \in \mathbb{Z}[x]$ such that
$$f(x) = g(x) h(x).$$
Evaluating at $m$ gives $n = f(m) = g(m) h(m)$, so a simple computation gives a non-trivial factorization, as $g(x)$ and $h(x)$ are assumed to be non-trivial. As we may now assume $f(x)$ to be monic and irreducible we have shown the following:

Proposition 3.3.

Let $f(x)$ be a univariate, monic, irreducible polynomial in $\mathbb{Z}[x]$ with $\deg(f) = d$ and with roots $\alpha_1, \dots, \alpha_d$, not necessarily in $\mathbb{Z}$. Then for all $i = 1, \dots, d$,
$$f(x) = \operatorname{irr}_{\mathbb{Z}}(\alpha_i).$$
Note that by assumption $\deg(f) = d > 1$. As $f$ is an irreducible polynomial over $\mathbb{Z}[x]$ we have that none of the roots of $f$ lie in $\mathbb{Z}$, for else $f$ would have a linear factor over $\mathbb{Z}$. Such a root, $\alpha$ say, lies in $\mathbb{C}$ and as $f(\alpha) = 0$ with $f$ monic this means that $\alpha$ is an algebraic integer. Moreover, by theorem 2.6 we have that $\mathbb{Q}(\alpha)$ is a field and has basis $\{1, \alpha, \dots, \alpha^{d-1}\}$ over $\mathbb{Q}$.

3.2.2 Sieving over $\mathbb{Z}$

Now that we have defined a polynomial $f \in \mathbb{Z}[x]$ and a field extension $\mathbb{Q}(\alpha)$ we can start sieving for pairs $(a, b)$ such that equation (1) and (2) hold. For the sake of exposition we will focus on the mathematical aspects and therefore consider the pairs $(a, b)$ over $\mathbb{Z}$ and $\mathbb{Z}[\alpha]$ separately.

Definition 3.4.

Let $u \in \mathbb{Z}$, then the set of integer pairs defined by
$$U = \{ (a, b) \in \mathbb{Z}^2 \mid |a| \le u,\ 0 < b < u,\ \gcd(a, b) = 1 \}$$
is called the universe of sieving.

Now let $u$ be a large integer dependent on the to-factor $n$ and define
$$\mathcal{B} = \{ p \in \mathbb{Z} \mid p \text{ prime},\ p \le B \} \cup \{ \pm 1 \}$$
as the rational factor base of the sieve, fully dependent on the chosen smoothness bound $B$. (The rational factor base normally only contains the primes $p \le B$. In our case, however, $\pm 1$ is appended to account for the case $a - bm < 0$.) Then there is a standard procedure to work through to fill up the set $S_{\mathbb{Z}}$:

Algorithm 1

Procedure to populate $S_{\mathbb{Z}}$

Input: Universe $U$ of $(a, b)$ pairs, smoothness bound $B$
Output: Set $S_{\mathbb{Z}} = \{ (a, b) \in U \mid a - bm \text{ is } B\text{-smooth},\ \gcd(a, b) = 1 \}$

    Initialize an array comprised of $a - bm$ for all $(a, b) \in U$
    for each element in the array do
        compute the set, $P_{(a,b)}$, of primes $p$, $p \le B$, such that $a \equiv bm \bmod p$
        for each $p \in P_{(a,b)}$ do
            divide $a - bm$ by the maximal power of $p$, such that $p$ does not divide the quotient, and replace $a - bm$ by this quotient
        end for
        if the element in the array is $\pm 1$ and $\gcd(a, b) = 1$ then
            add $(a, b)$ to $S_{\mathbb{Z}}$
        end if
    end for

We will accept that this algorithm terminates by choosing $B$ large enough and returns the set $S_{\mathbb{Z}}$ such that $|S_{\mathbb{Z}}| \ge \pi(B) + 1$. As we have recorded the vectors $\{ e_p(a - bm) \}_{p \le B}$ of $p$-exponents for all primes $p \le B$ for each $a - bm$, we now define $M \in M_{|S_{\mathbb{Z}}| \times \pi(B)}(\mathbb{Z})$ given by, for all $(a_i, b_i) \in S_{\mathbb{Z}}$ and all $p_j \in \mathcal{B}$:
$$M = \begin{pmatrix}
\operatorname{sign}(a_1 - b_1 m) & e_{p_1}(a_1 - b_1 m) & \cdots & e_{p_{\pi(B)}}(a_1 - b_1 m) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{sign}(a_{|S_{\mathbb{Z}}|} - b_{|S_{\mathbb{Z}}|} m) & e_{p_1}(a_{|S_{\mathbb{Z}}|} - b_{|S_{\mathbb{Z}}|} m) & \cdots & e_{p_{\pi(B)}}(a_{|S_{\mathbb{Z}}|} - b_{|S_{\mathbb{Z}}|} m)
\end{pmatrix},$$
where the sign-bit is defined as:
$$\operatorname{sign}(a_i - b_i m) = \begin{cases} 1 & a_i - b_i m < 0, \\ 0 & \text{otherwise}. \end{cases}$$
To achieve the situation in which equation (1) holds we now want to find a dependent subset of $S_{\mathbb{Z}}$. For this we first consider a pair $a - bm \in S_{\mathbb{Z}}$, then it is easily observed that for this to be a square in $\mathbb{Z}$ we must have that
$$x^2 = a - bm = \prod_{p \in \mathcal{B}} p^{e(p)} = \Big( \prod_{p \in \mathcal{B}} p^{e(p)/2} \Big)^2.$$
Hence
$$x = \pm \prod_{p \in \mathcal{B}} p^{e(p)/2}.$$
So it suffices for us to look for the dependent subset of $S_{\mathbb{Z}}$ modulo 2, that is over $\mathbb{F}_2$, the finite field of characteristic 2. Hence we define
$$\overline{M} = M \bmod 2 = (m_{i,j} \bmod 2)_{i \in \{1, \dots, |S_{\mathbb{Z}}|\},\ j \in \{1, \dots, \pi(B)\}}.$$
Let $f(a_i - b_i m)$ be the $i$th row of $\overline{M}$. As $|S_{\mathbb{Z}}| > \pi(B)$ we have that there is a linearly dependent subset in $\overline{M}$ and hence there is a non-trivial solution $T_{\mathbb{Z}} \subset S_{\mathbb{Z}}$ to the linear equation:
$$\sum_{(a,b) \in T_{\mathbb{Z}}} f(a - bm) = 0.$$
With this set we obtain (1):
$$\prod_{(a,b) \in T_{\mathbb{Z}}} (a - bm) \text{ is a square in } \mathbb{Z}.$$

3.2.3 Sieving over $\mathbb{Z}[\alpha]$

To sieve over $\mathbb{Z}[\alpha]$ we will attempt a construction similar to the process used for $\mathbb{Z}$, but to do this we will first have to deal with a few of the obstructions that arise. First recall from theorem 2.36 that for a prime ideal $P$ we have that
$$N(P) = p^f,$$
where $p \in \mathbb{Z}$ is prime and $f \in \mathbb{N}$. Then $p$ is called the prime lying below $P$ and $f$ is called the inertia degree of $P$. Moreover a prime ideal for which $f = 1$ is called a first degree prime. For a first degree prime ideal $P$ we have that the index $[\mathbb{Z}[\alpha] : P] = p$, which gives rise to the isomorphism
$$\mathbb{Z}[\alpha]/P \cong \mathbb{Z}/p\mathbb{Z},$$
hence $\mathbb{Z}[\alpha]/P$ is a field. This gives us a direct link between $\mathbb{Z}[\alpha]$ and $\mathbb{Z}/p\mathbb{Z}$:

Theorem 3.5.

Let $f(x)$ be a monic irreducible polynomial with coefficients in $\mathbb{Z}$. Let $\alpha$ be a root of $f(x)$, then there is a bijective correspondence between the set $\mathcal{P}$ of first degree prime ideals and the set
$$\{ (p, r) : p \text{ prime},\ r \in \mathbb{Z}/p\mathbb{Z},\ f(r) \equiv 0 \bmod p \}.$$

Proof.

Let $\mathfrak{p}$ be a first degree prime ideal of $\mathbb{Z}[\alpha]$. Then $[\mathbb{Z}[\alpha] : \mathfrak{p}] = p$ for some prime integer $p$ so that $\mathbb{Z}[\alpha]/\mathfrak{p} \cong \mathbb{Z}/p\mathbb{Z}$. There is a canonical ring epimorphism $\theta : \mathbb{Z}[\alpha] \to \mathbb{Z}[\alpha]/\mathfrak{p}$ such that $\ker(\theta) = \mathfrak{p}$, hence for $z \in \mathfrak{p}$: $p \mid \theta(z)$, moreover for $n \in \mathbb{Z}$ such that $p \mid n$ there is a $z \in \mathfrak{p}$ so that $\theta(z) = n$. As $\theta$ is a homomorphism $\theta(1) = 1$ and so $\theta(n) \equiv n \bmod p$ for all $n \in \mathbb{Z}$.

Now let $r = \theta(\alpha) \in \mathbb{Z}/p\mathbb{Z}$. If $f(x) = \sum_{i=0}^{d} a_i x^i$ with $a_d = 1$ and $a_i \in \mathbb{Z}$ then $\theta(f(\alpha)) \equiv 0 \bmod p$ as $f(\alpha) = 0$, and so
$$0 \equiv \theta(f(\alpha)) \equiv \theta\Big( \sum_{i=0}^{d} a_i \alpha^i \Big) \equiv \sum_{i=0}^{d} a_i \theta(\alpha)^i \equiv \sum_{i=0}^{d} a_i r^i \equiv f(r) \bmod p,$$
hence $f(r) \equiv 0 \bmod p$ and $\mathfrak{p}$ determines the unique pair $(p, r)$.

Conversely, let $p$ be a prime integer and $r \in \mathbb{Z}/p\mathbb{Z}$ with $f(r) \equiv 0 \bmod p$. Then there is a natural ring epimorphism $\theta : \mathbb{Z}[\alpha] \to \mathbb{Z}/p\mathbb{Z}$ that maps polynomials in $\alpha$ to polynomials in $r$. In particular $\theta(a) \equiv a \bmod p$ for all $a \in \mathbb{Z}$ and $\theta(\alpha) \equiv r \bmod p$. Let $\mathfrak{p} = \ker(\theta)$ so that $\mathfrak{p}$ is an ideal of $\mathbb{Z}[\alpha]$. Since $\theta$ is surjective that means that $\mathbb{Z}[\alpha]/\mathfrak{p} \cong \mathbb{Z}/p\mathbb{Z}$ and so $[\mathbb{Z}[\alpha] : \mathfrak{p}] = p$. This implication is unique, hence we have the two unique implications,
$$(p, r) \hookrightarrow \mathfrak{p} \hookrightarrow (p, r),$$
proving the theorem.

Hence finding these first degree prime ideals is equivalent to finding roots mod $p$ for the minimal polynomial $\operatorname{irr}_K(\alpha)$. Finding roots of polynomials over finite fields has a well documented background [5], for example Berlekamp's algorithm or Cantor–Zassenhaus. We may therefore assume that finding first degree prime ideals is easy (by easy we mean efficient, in that it does not add to our overall complexity, as the size of $\deg(f)$ is far smaller than the size of $n$) and therefore assume we can find sufficient first degree prime ideals.

Now that we have restricted our prime ideals we may generalize the smoothness test for $\mathbb{Z}[\alpha]$. Buhler et al. suggested the following algorithm in [17]: for a smoothness bound $B' \in \mathbb{Z}$, for now further undefined, $S_{\mathbb{Z}[\alpha]}$ as defined above, and the polynomial $f$ with root $\alpha = t$.
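Before turning to that algorithm, the correspondence of Theorem 3.5 can be illustrated with a small Python sketch that lists the pairs $(p, r)$ with $f(r) \equiv 0 \bmod p$. This is a hedged illustration only: it uses a brute-force search over $\mathbb{Z}/p\mathbb{Z}$ rather than Berlekamp's algorithm or Cantor–Zassenhaus, and the function name `first_degree_primes` and the example cubic are assumptions, not taken from the thesis.

```python
def first_degree_primes(coeffs, primes):
    """List the pairs (p, r) with f(r) ≡ 0 (mod p), by brute force over r.

    coeffs = [c_0, ..., c_d] holds the coefficients of f, lowest degree first.
    Each pair stands for one first degree prime ideal in the sense of Theorem 3.5.
    """
    def f_mod(r, p):
        value = 0
        for c in reversed(coeffs):      # Horner's rule, reduced modulo p
            value = (value * r + c) % p
        return value

    return [(p, r) for p in primes for r in range(p) if f_mod(r, p) == 0]


# Illustrative monic cubic f(x) = x^3 + 15x^2 + 29x + 8.
print(first_degree_primes([8, 29, 15, 1], [2, 3, 5, 7]))   # [(2, 0), (7, 6)]
```

Brute force costs about $p$ evaluations per prime, which is fine for a toy factor base; the root-finding algorithms cited above are what make this step cheap at realistic sizes.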
Algorithm 2 Procedure to populate $S_{\mathbb{Z}[\alpha]}$

Input: Universe $U$, polynomial $f$, smoothness bound $B'$
Output: Set $S_{\mathbb{Z}[\alpha]} = \{ (a, b) \mid a - b\alpha \text{ is } B'\text{-smooth},\ \gcd(a, b) = 1 \}$

    for each prime $p \le B'$ do
        for each $(a, b) \in U$ such that $b \not\equiv 0 \bmod p$ do
            initialize an array populated by $N(a - b\alpha)$
        end for
        for each $r \in \mathbb{Z}/p\mathbb{Z}$ do
            compute the set $R(p) = \{ r \mid f(r) \equiv 0 \bmod p \}$
        end for
        for each $r \in R(p)$ do
            if $a \equiv br \bmod p$ then
                retrieve $N(a - b\alpha)$
                divide $N(a - b\alpha)$ by the maximal power of $p$, such that $p$ does not divide the quotient, and replace $N(a - b\alpha)$ by this quotient
            end if
        end for
    end for
    for each element in the array corresponding to the pair $(a, b)$ do
        if the element is $\pm 1$ then
            add the pair $(a, b)$ to $S_{\mathbb{Z}[\alpha]}$
        end if
    end for

Remark.

It is clear that if $b \equiv 0 \bmod p$ then there are no integers $a$ with $(a, b) \in U$ and $N(\langle a - b\alpha \rangle) = N(a - b\alpha) \equiv 0 \bmod p$.

What is left to us is to find the square in $\mathbb{Z}[\alpha]$ that we wish to use. However following the same procedure as the rational sieve would leave us, not with (2), but with
$$N\Big( \prod_{(a,b) \in S_{\mathbb{Z}[\alpha]}} (a - b\alpha) \Big) \text{ is a square in } \mathbb{Z}.$$
This is clearly a necessary condition for (2), but it is not sufficient. To combat this obstruction we recall the pairs $(p, R(p))$ as defined in algorithm 3.2.3 and define the following:

Proposition 3.6.

Let $a, b \in \mathbb{Z}$, $\gcd(a, b) = 1$, and $p \in \mathbb{Z}$ prime. Let $r \in R(p)$. Then the function
$$e_{(p,r)}(a - b\alpha) = \begin{cases} \operatorname{ord}_p(N(a - b\alpha)) & a - br \equiv 0 \bmod p, \\ 0 & \text{otherwise}, \end{cases}$$
where for $z \in \mathbb{Z}$: $\operatorname{ord}_p(z) = f$ if $p^f \,\|\, z$, is well defined. Moreover
$$N(a - b\alpha) = \prod_{(p,r)} p^{e_{(p,r)}(a - b\alpha)}.$$
To see how this can be used to produce the square in $\mathbb{Z}[\alpha]$ we need the last theorem of this section.

Theorem 3.7.

Let $S'$ be a finite set of coprime integer pairs $(a, b)$ fulfilling equation (2). Then for each prime number $p$ and each $r \in R(p)$ we have
$$\sum_{(a,b) \in S'} e_{(p,r)}(a - b\alpha) \equiv 0 \bmod 2.$$

Proof.

For each prime ideal $P \subset \mathbb{Z}[\alpha]$ define the group homomorphism $\varphi_P : \mathbb{Q}(\alpha)^* \to \mathbb{Z}$ such that

1. $\varphi_P(\beta) \ge 0$ for all $\beta \in \mathbb{Z}[\alpha]$, $\beta \neq 0$;
2. if $\beta \in \mathbb{Z}[\alpha]$, $\beta \neq 0$, then $\varphi_P(\beta) > 0$ if and only if $\beta \in P$;
3. for each $\beta \in \mathbb{Q}(\alpha)^*$ one has $\varphi_P(\beta) = 0$ for all but finitely many $P$, and
$$|N(\beta)| = \prod_{P} N(P)^{\varphi_P(\beta)}.$$

Then $\varphi_P$ is a $P$-adic valuation from a multiplicative group of units to an additive group of integers, hence, for $\gcd(a, b) = 1$, if $P$ is not a first degree prime then $\varphi_P(a - b\alpha) = 0$ and if $P$ corresponds to the pair $(p, r)$ as defined in theorem 3.5 then $\varphi_P(a - b\alpha) = e_{(p,r)}(a - b\alpha)$.

Now let $P_i \subset \mathbb{Z}[\alpha]$ be a first degree prime ideal and let
$$\prod_{(a,b) \in S'} (a - b\alpha) = \xi^2.$$
Then, as $\varphi_{P_i}$ is a homomorphism, we get that
$$\sum_{(a,b) \in S'} e_{(p,r)}(a - b\alpha) = \sum_{(a,b) \in S'} \varphi_{P_i}(a - b\alpha) = \varphi_{P_i}\Big( \prod_{(a,b) \in S'} (a - b\alpha) \Big) = \varphi_{P_i}(\xi^2) = 2 \varphi_{P_i}(\xi) \equiv 0 \bmod 2.$$

If our assumptions hold that means we can now use this outcome to do a similar process as we did for the rational factor base, find the final set $S_{\mathbb{Z}[\alpha]}$, and use this to explicitly define the set $S = S_{\mathbb{Z}} \cap S_{\mathbb{Z}[\alpha]}$. However, these assumptions have not been inconsequential and will all be explained in the following section. If we, however, suspend our disbelief for a moment longer we can finish the algorithm.

Assume that we have found $(z^2, \xi^2) \in \mathbb{Z} \times \mathbb{Z}[\alpha]$ such that they fulfil equations (1) and (2). Then all that remains for us to do is to find the square roots. First consider the rational case, so $z^2$ and $\mathbb{Z}$. Then, by the sieving process, we know the prime factorization of $z^2$:
$$z^2 = \prod_{p_i \in \mathbb{Z}} p_i^{e_i},$$
but then it is trivial to find $z$:
$$z = \prod_{p_i \in \mathbb{Z}} p_i^{e_i/2}.$$
In the algebraic case, i.e. $\xi^2$ and $\mathbb{Z}[\alpha]$, there is a bigger challenge, as there is no simple way to factorize $\xi^2$. One naive approach would be to compute the root of $x^2 - \xi^2$, but this is usually not efficiently achievable. We instead take a theoretical approach, with an eye on computational efficiency, and let $q$ be an odd prime.
If $f \bmod q$ is irreducible in $\mathbb{F}_q[x]$, then $\mathbb{Z}[\alpha]/q\mathbb{Z}[\alpha] \cong \mathbb{F}_q[x]/(f \bmod q)$. It can be shown that with significant probability there exists an odd $q$ such that $f \bmod q$ is irreducible. To see this consider

Theorem 3.8.

Let $f \in \mathbb{Z}[x]$ be an irreducible polynomial of degree $d$, $d > 1$. Then the density, inside the set of all prime numbers, of the set of prime numbers $q$ for which $f \bmod q$ factors in $\mathbb{F}_q[x]$ into distinct irreducible non-linear factors exists and is at least $1/d$.

To prove this we need the following proposition:

Proposition 3.9.

Let $G$ be a finite group that acts transitively on a finite set $\Omega$, with $|\Omega| = d > 1$. Then there are at least $|G|/d$ elements of $G$ that act without fixed points on $\Omega$.

Now we prove the theorem.

Proof.

Let $\Gamma = \operatorname{Gal}(f/\mathbb{Q})$, viewed as a permutation group of the set $A = \{ \alpha_1, \dots, \alpha_d \}$ of roots of $f$. For each prime number $q$ that does not divide the discriminant of $f$, there is a Frobenius element $\sigma_q \in \Gamma$ with the property that the degrees of the irreducible factors of $f \bmod q$ are the same as the lengths of the cycles of the permutation $\sigma_q$. Hence, we are interested in those $q$ for which $\sigma_q$ acts without fixed points on $A$. By the Chebotarev Density Theorem, 4.31, for every subset $C \subset \Gamma$ that is closed under conjugation, the set of prime numbers $q$ for which $\sigma_q$ belongs to $C$ has density $|C|/|\Gamma|$. The theorem follows from proposition 3.9.

So except for the extremely small probability that we can not choose any $f$ for a specific $n$ this means we can assume that there exists a $q$ such that $f \bmod q$ is irreducible in $\mathbb{F}_q[x]$. The following theorem completes the square root finding process.

Theorem 3.10.

Let $\mathfrak{q} = q\mathbb{Z}[\alpha]$ be the ideal of $\mathbb{Z}[\alpha]$ generated by $q$. Then there exists a $\delta \in \mathbb{Z}[\alpha]$ such that $\delta^2 \xi^2 \equiv 1 \bmod \mathfrak{q}$.

Proof.

As $\mathbb{Z}[\alpha]/q\mathbb{Z}[\alpha] \cong \mathbb{F}_q[x]/(f \bmod q)$ we know that $|\mathbb{Z}[\alpha]/q\mathbb{Z}[\alpha]| = q^d$, where $d = \deg(f)$. Now consider
$$I = q\mathbb{Z}[\alpha] = \Big\{ \sum_{i=0}^{d-1} a_i \alpha^i : q \mid a_i \Big\},$$
which is a degree $d$ prime ideal in $\mathbb{Z}[\alpha]$. As $f \bmod q$ is assumed irreducible it follows that $f'(\alpha) \notin I$ and for each $(a, b) \in S_{\mathbb{Z}[\alpha]}$ we have that $a - b\alpha \notin I$ since $\gcd(a, b) = 1$. Therefore $\xi^2 = f'(\alpha)^2 \prod_{(a,b) \in S_{\mathbb{Z}[\alpha]}} (a - b\alpha) \notin I$. Using Berlekamp's algorithm, [5], we find an element $\delta \bmod I$ such that $\delta^2 \xi^2 \equiv 1 \bmod I$, completing the proof.

Hence there exists an element $\delta$ that is the inverse of a square root of $\xi^2$ modulo $\mathfrak{q}$. Now we can apply Newton–Raphson iteration ([5],[17]) to find approximations such that
$$\delta_j \equiv \frac{\delta_{j-1} (3 - \delta_{j-1}^2 \xi^2)}{2} \bmod (q^{2^j} \mathbb{Z}[\alpha]).$$
In a finite number of steps we will find a $\gamma$ such that
$$\gamma \equiv \delta_j \xi^2 \bmod (q^{2^j} \mathbb{Z}[\alpha]),$$
where $\gamma^2 = \xi^2$ in $\mathbb{Z}[\alpha]$. To complete the algorithm we compute $\gcd(z - N(\xi), n)$ and $\gcd(z + N(\xi), n)$ to find a factorization as described in section 3.1.

3.3 Dealing with obstructions

So far the technique we have been trying to describe can be captured in the following two bi-implications:
$$\text{Equation (1)} \Leftrightarrow \prod_{(a,b) \in T} (a - bm) \text{ has non-negative even exponents at all primes } p \le B, \qquad (3)$$
$$\text{Equation (2)} \Leftrightarrow \prod_{(a,b) \in T} (a - b\alpha) \text{ has even exponents at all prime ideals } P \subset \mathbb{Z}[\alpha]. \qquad (4)$$
It is clear to see that the first of these bi-implications holds, as we can find the root by simply dividing all exponents by 2. The second of these is not completely clear however. One of the first assumptions we made was that $\mathcal{O}_{\mathbb{Q}(\alpha)} = \mathbb{Z}[\alpha]$, but this is rarely the case. In fact this is something that was believed true until the 19th century and even featured in (eventually disproven) proposed proofs of Fermat's Last Theorem [6]. This assumption leads to four obstructions that have to be mitigated. For this let $\omega = \prod_{(a,b) \in T} (a - b\alpha)$.

1. $\mathbb{Z}[\alpha]$ is not necessarily $\mathcal{O}_{\mathbb{Q}(\alpha)}$. At best we know that $\mathbb{Z}[\alpha] \subseteq \mathcal{O}_{\mathbb{Q}(\alpha)}$, hence we can not assume that $\mathbb{Z}[\alpha]$ is a Dedekind domain, so $\omega \mathcal{O}_{\mathbb{Q}(\alpha)}$ might not be the square of an ideal in $\mathbb{Z}[\alpha]$.
2. If $\omega \mathcal{O}_{\mathbb{Q}(\alpha)}$ is the square of some ideal $I$, it is not certain that $I$ is a principal ideal.
3. If $\omega \mathcal{O}_{\mathbb{Q}(\alpha)}$ is the square of some principal ideal $\gamma \mathcal{O}_{\mathbb{Q}(\alpha)}$, it is not certain that $\omega = \gamma^2$, as $\omega$ agrees with $\gamma^2$ only up to units of $\mathcal{O}_{\mathbb{Q}(\alpha)}$.
4. And even if $\omega = \gamma^2$, then we are not assured that $\gamma \in \mathbb{Z}[\alpha]$.

Clearly these are obstructions that break the number field sieve. To deal with obstruction 4 recall the standard fact that for any $\theta \in \mathcal{O}_{\mathbb{Q}(\alpha)}$ and $f(x) = \operatorname{irr}_{\mathbb{Q}}(\alpha)$ we have that
$$f'(\alpha) \cdot \theta \in \mathbb{Z}[\alpha].$$
So we simply multiply equation (2) by $f'(\alpha)^2$ and equation (1) by $f'(m)^2$. The only condition we must apply is that $\gcd(f'(m), n) = 1$, or else the resulting element given by equation (1) may not be invertible. However this is simply checked, and if it is not the case then we have found a factorization of $n$, so we may simply assume that $f'(m)$ and $n$ are coprime.

Now we attempt to subvert obstruction 1, which consequently allows us to negate both obstructions 2 and 3 as well. To do this we will consider the following chain:
$$V \supset V_1 \supset V_2 \supset V_3 = V \cap \big( \mathbb{Q}(\alpha)^\times \big)^2,$$
where $V$ is the subgroup of $\mathbb{Q}(\alpha)^\times$ consisting of the elements with even exponents, i.e. $e_{(p,r)}(v) \equiv 0 \bmod 2$ for $v \in V$. This is a group with the following subgroups:
$$V_1 = \{ v \in V \mid v \mathcal{O}_{\mathbb{Q}(\alpha)} = I^2 \text{ for some } I \subset \mathcal{O}_{\mathbb{Q}(\alpha)} \},$$
$$V_2 = \{ v \in V \mid v \mathcal{O}_{\mathbb{Q}(\alpha)} = \langle \vartheta \rangle^2 \text{ for some } \langle \vartheta \rangle \subset \mathcal{O}_{\mathbb{Q}(\alpha)} \},$$
$$V_3 = V \cap ( \mathbb{Q}(\alpha)^\times )^2.$$
We attempt to bound the index $[V : V_3]$. Considered as a $\mathbb{Z}$-module $\mathcal{O}_{\mathbb{Q}(\alpha)}$ is free of rank $d = \deg(f)$ and by definition $\mathbb{Z}[\alpha] \subset \mathcal{O}_{\mathbb{Q}(\alpha)}$. It is a well-known identity in algebraic number theory that
$$\Delta(f) = \big[ \mathcal{O}_{\mathbb{Q}(\alpha)} : \mathbb{Z}[\alpha] \big]^2 \cdot \Delta,$$
and as $[\mathbb{Q}(\alpha) : \mathbb{Q}] > 1$ we have $|\Delta| > 1$, so that $\big[ \mathcal{O}_{\mathbb{Q}(\alpha)} : \mathbb{Z}[\alpha] \big]$ is bounded by $\sqrt{|\Delta(f)|}$.

Remark.

Recall that $\Delta$ is the discriminant of the number field as in Definition 2.38.

As we have factorization into prime ideals in $\mathcal{O}_{\mathbb{Q}(\alpha)}$ there is a bijection between the prime ideals $Q$ of $\mathcal{O}_{\mathbb{Q}(\alpha)}$ coprime to $\big[ \mathcal{O}_{\mathbb{Q}(\alpha)} : \mathbb{Z}[\alpha] \big]$ and the ideals $I \subset \mathbb{Z}[\alpha]$ coprime to this index. In fact if we only consider the prime ideals $P \subset \mathbb{Z}[\alpha]$ we get an isomorphism of local rings:
$$\mathbb{Z}[\alpha]_P \to \big( \mathcal{O}_{\mathbb{Q}(\alpha)} \big)_Q, \qquad P = Q \cap \mathbb{Z}[\alpha].$$
This allows us to generalize the definition of the map $e_{(p,r)}$ of proposition 3.6.

Proposition 3.11.

Let $P$ be a prime ideal of $\mathbb{Z}[\alpha]$. Then
$$e_P(a - b\alpha) = \sum_{Q \supset P} f(Q/P) \, e_{(q,r)}(a - b\alpha),$$
where $Q$ lies over a prime $q \in \mathbb{Z}$.

In the case that $P$ does not divide the index $\big[ \mathcal{O}_{\mathbb{Q}(\alpha)} : \mathbb{Z}[\alpha] \big]$, $e_P(a - b\alpha) = e_{(q,r)}(a - b\alpha)$ as there will only be one $Q$ such that $P \subset Q$ and $f(Q/P) = 1$.

Now consider the following map
$$V/V_1 \to \bigoplus_{Q \,\mid\, [\mathcal{O}_{\mathbb{Q}(\alpha)} : \mathbb{Z}[\alpha]]} \mathbb{Z}/2\mathbb{Z}, \qquad x \mapsto ( e_Q(x) \bmod 2 )_Q.$$
This is an injective homomorphism, hence $V/V_1$ is an $\mathbb{F}_2$-vector space of dimension bounded by $\big| \{ Q \mid Q \text{ divides } [\mathcal{O}_{\mathbb{Q}(\alpha)} : \mathbb{Z}[\alpha]] \} \big|$. As the number of rational primes dividing the index is no more than $\frac{\log |\Delta(f)|}{2 \log 2}$ and for each of these primes there are at most $\deg(f)$ prime ideals $Q \subset \mathcal{O}_{\mathbb{Q}(\alpha)}$ that divide it, we obtain
$$\dim_{\mathbb{F}_2}(V/V_1) \le \frac{d \log |\Delta(f)|}{2 \log 2}.$$
This inequality is the first step to resolving the first obstruction. Further we will bound the dimensions $\dim_{\mathbb{F}_2}(V_1/V_2)$ and finally $\dim_{\mathbb{F}_2}(V_2/V_3)$ to obtain our final result.

The class group, $\operatorname{Cl}_{\mathbb{Q}(\alpha)}$, is a finite abelian group and its order $h$ is, by [13], bounded by
$$h < \sqrt{|\Delta(f)|} \cdot \frac{\big( d - 1 + \tfrac{1}{2} \log |\Delta(f)| \big)^{d-1}}{(d-1)!}.$$
Define the map $\kappa : V_1 \to \operatorname{Cl}_{\mathbb{Q}(\alpha)}$ by $x \mapsto I$ where $x \mathcal{O}_{\mathbb{Q}(\alpha)} = I^2$. By definition $\ker(\kappa) = V_2$, hence
$$\dim_{\mathbb{F}_2}(V_1/V_2) \le \frac{\log h}{\log 2}.$$
Note that $V_2$ consists of elements that are squares in $\mathbb{Q}(\alpha)^\times$ up to units in $\mathcal{O}_{\mathbb{Q}(\alpha)}$, hence by [[13], 8.3]
$$|V_2/V_3| \le \Big| \mathcal{O}^*_{\mathbb{Q}(\alpha)} / \big( \mathcal{O}^*_{\mathbb{Q}(\alpha)} \big)^2 \Big| \le 2^d.$$
This leads to the following theorem:

Theorem 3.12.

Let $V$ be as above and suppose that $n > d^{2d^2} > 1$. Then the subgroup $V_3 = V \cap ( \mathbb{Q}(\alpha)^\times )^2$ of squares in $V$ satisfies
$$\dim_{\mathbb{F}_2}(V/V_3) \le \frac{\log(n)}{\log 2}.$$

This shows that the general number field sieve algorithm is certain to generate an element in $V$, however we are not sure yet that this is also a square, hence is in $V_3$. To obtain this we introduce the concept of quadratic characters. Let us start by proposing the idea over $\mathbb{Z}$: assume $x, y \in \mathbb{Z}$, then $x$ is a quadratic residue modulo a prime $p$ if
$$x \equiv y^2 \bmod p.$$
This is easily tested by the Legendre symbol:
$$\Big( \frac{x}{p} \Big) = x^{\frac{p-1}{2}} \equiv (y^2)^{\frac{p-1}{2}} = y^{p-1} \equiv 1 \bmod p.$$
Now assume that $x$ is a square, then it is also a square modulo $p$ for every prime $p$ such that $p \nmid x$. This means that for all such primes $p \in \mathbb{Z}$:
$$\Big( \frac{x}{p} \Big) = 1.$$
So if there is at least one $p$ for which $\big( \frac{x}{p} \big) = -1$ then $x$ is not a square modulo $p$ and by extension not a square element. This generalizes well to $\mathbb{Q}(\alpha)$.

Let $P$ be a first degree prime ideal of $\mathbb{Z}[\alpha]$ lying over $p$ and let $(p, R(p))$ be the set as defined in algorithm 3.2.3. As there exists a ring homomorphism $\pi : \mathbb{Z}[\alpha] \to \mathbb{Z}/p\mathbb{Z}$ with $\ker(\pi) = P$, by definition this gives rise to a Legendre symbol
$$\Big( \frac{\cdot}{P} \Big) : \mathbb{Z}[\alpha] \to \mathbb{Z}/p\mathbb{Z} \to \{ \pm 1, 0 \},$$
such that $\big( \frac{x}{P} \big) = -1$ implies that $x \in \mathbb{Z}[\alpha]$ is not a square. Restricting to Legendre symbols coming from $P$ over a prime $p \in \mathbb{Z}$ such that $p > B'$ we avoid the character value 0, and as a consequence of Chebotarev's Density Theorem we have that the Legendre symbols coming from $P$ are equidistributed over the space of homomorphisms from $V/V_3$ to $\{ \pm 1 \}$, denoted $\operatorname{Hom}(V/V_3, \{ \pm 1 \})$.

Now let $\chi_P = \big( \frac{\cdot}{P} \big)$ and consider the following lemma.

Lemma 3.13.

Let $k, r$ be non-negative integers, and let $E$ be a $k$-dimensional $\mathbb{F}_2$ vector space. Then the probability that $k + r$ elements that are drawn uniformly at random from $E$ form a spanning set for $E$ is at least $1 - 2^{-r}$.

Then the equidistribution over $\operatorname{Hom}(V/V_3, \{ \pm 1 \})$, with the obtained bound on the dimension of $V/V_3$ as an $\mathbb{F}_2$ vector space from theorem 3.12, makes it overwhelmingly likely that a set of $B'' = \big\lfloor \frac{3 \log(n)}{\log 2} \big\rfloor$ quadratic characters span the homomorphism space. If they do, then the converse to the final theorem of this section shows that if $\beta \in \mathbb{Z}[\alpha] \setminus \{0\}$ satisfies $\chi_Q(\beta) = 1$ for all first degree primes $Q$ with $2\beta \notin Q$, then $\beta$ is in fact a square in $\mathbb{Z}[\alpha]$. Furthermore, there can be a finite number of exceptions, which are explored in greater depth in [17].

Theorem 3.14.

Let $S$ be a finite set of integer pairs $(a, b)$ such that $\gcd(a, b) = 1$, fulfilling that for some $\gamma \in \mathbb{Q}(\alpha)$: $\omega = \gamma^2$. Moreover let $Q$ be a first degree prime ideal corresponding to the pair $(q, s)$ that does not divide $\langle a - b\alpha \rangle$ for any pair $(a, b)$ and for which $f'(s) \not\equiv 0 \bmod q$. Then we have
$$\prod_{(a,b) \in S} \Big( \frac{a - bs}{q} \Big) = 1.$$
See [17] for details.

Proof.

Let $\varphi : \mathbb{Z}[\alpha] \to \mathbb{Z}/q\mathbb{Z}$ be the ring homomorphism given by $\varphi(\alpha) = s \bmod q$ and let $\ker(\varphi) = Q$. Then $Q$ is a first degree prime ideal corresponding to the pair $(q, s)$. Let $\chi_Q : \mathbb{Z}[\alpha] \to \{ \pm 1 \}$ and let $\psi_Q : \mathbb{Z}[\alpha]/Q \to \mathbb{Z}/q\mathbb{Z} \setminus \{0\}$, such that $\chi_Q = \operatorname{Leg}_Q \circ \psi_Q$ where $\operatorname{Leg}_Q : \mathbb{Z}/q\mathbb{Z} \to \{ \pm 1 \}$ is the Legendre symbol. Then
$$\chi_Q(a - b\alpha) = \Big( \frac{a - bs}{q} \Big).$$
Let, as discussed,
$$\xi^2 = f'(\alpha)^2 \prod_{(a,b) \in S} (a - b\alpha),$$
for some $\xi \in \mathbb{Z}[\alpha]$. By the hypothesis the factors on the right are not in $Q$, so we have that $\xi^2 \notin Q$. However
$$\chi_Q(\xi^2) = 1, \qquad \chi_Q(f'(\alpha)^2) = 1,$$
so
$$\chi_Q\Big( \prod_{(a,b) \in S} (a - b\alpha) \Big) = \prod_{(a,b) \in S} \Big( \frac{a - bs}{q} \Big) = 1.$$
This completes the proof.

So now we can be almost certain that we can find a square in $\mathbb{Z}[\alpha]$ and therefore have mitigated all obstructions. The only question left is: what is the complexity?

3.4 Computational efficiency

Let us record the number field sieve algorithm step-by-step:

Algorithm 3

General Number Field Sieve

Input: $n \in \mathbb{N}$, $n$ odd composite, degree $d > 1$, universe $U$, smoothness bounds $B$ and $B'$
Output: A factor of $n$ or $\neg$ for failure.

    Choose $m = \lfloor n^{1/d} \rfloor$.
    Define $f(x) \in \mathbb{Z}[x]$ as the polynomial with coefficients corresponding to the base-$m$ expansion of $n$.
    for each $(a, b) \in U$ do
        Run algorithm 3.2.2 to find $S_{\mathbb{Z}}$
        Run algorithm 3.2.3 to find $S_{\mathbb{Z}[\alpha]}$
        Use the techniques from section 3.3 to find a quadratic character base $S_Q$ such that $|S_Q| = B''$.
    end for
    if $|S_{\mathbb{Z}} \cap S_{\mathbb{Z}[\alpha]}| > |B| + |B'| + 1$ then
        Use a reduction algorithm over $\mathbb{F}_2$ to find a dependent subset $S \subset S_{\mathbb{Z}} \cap S_{\mathbb{Z}[\alpha]}$
        Compute $z^2 = \prod_{(a,b) \in S} (a - bm)$ and use its known prime factorization to find $z$
        Compute $\zeta^2 = \prod_{(a,b) \in S} (a - b\alpha)$ and use Newton–Raphson iteration to find $\zeta$
        if $z \in \mathbb{Z}$ and $\zeta \in \mathbb{Z}[\alpha]$ then
            return $\gcd(z - \zeta, n)$, $\gcd(z + \zeta, n)$
        else
            find a new dependent subset $S$. If all such sets produce no result return $\neg$
        end if
    else
        return $\neg$
    end if

It is generally considered difficult to do a complexity analysis on the number field sieve as it is stated in all its generality. Already when it was analysed in [17] there was a conjectured complexity of
$$L_n\Big( \tfrac{1}{3}, \sqrt[3]{\tfrac{64}{9}} + o(1) \Big),$$
and over the 30 years that followed the conjectured complexity held up heuristically. There are, however, some things that we are able to say. Following [13] we are able to carefully choose $f$ and thereby $\deg(f)$ to optimize the smoothness bound $B^* = \max\{B, B'\}$ and parameter $u$ for the universe of sieving $U$. The "basic" cost of the algorithm is computed to be $u^{2 + o(1)} + y^{2 + o(1)}$.

First we will address step 6. As the binary matrix that is formed will be sparse, i.e. there will be significantly more zeroes than ones, it allows us to use fast algorithms to find dependencies. Two of these extremely fast algorithms are Block–Lanczos ([19],[20]) and Block–Wiedemann [21], which both have similar complexity. The size of these matrices is at most $B^* \times B^*$, and by choosing $\log B^* \approx \log u$ we balance the contributions of $U$ and the matrix reduction.

Looking at the sieving procedure we can see that a pair $(a, b)$ is $B^*$-smooth if $(a - bm) N(a - b\alpha)$ is $B^*$-smooth, and thus, using that the size of $m$ is approximately $\sqrt[d]{n}$, we may bound this product by
$$u \sqrt[d]{n} \cdot (d + 1) u^d \sqrt[d]{n} \approx n^{2/d} u^{d+1},$$
by choosing $d$ optimally, namely
$$d \approx \Big( \frac{3 \log n}{\log \log n} \Big)^{1/3}.$$
The probability that a number $x$, of the right size, is smooth can be approximated by $r^{-r}$ where $r = \frac{\log x}{\log y}$. In order to maximize this probability we choose
$$r = \frac{\log x}{\log u} \approx \frac{d'^2}{d'} + d' + 1, \qquad \text{for } d' = \sqrt{\frac{2 \log n}{\log u}}.$$
This means we get dependency in the matrix if we have at least $y \approx u^2 r^{-r}$ pairs $(a, b)$. As we have assumed $\log y \approx \log u$, taking logarithms we get $\log u \approx r \log r$, equivalently $r \approx \frac{\log u}{\log \log u}$. Hence, by taking powers:
$$\frac{\log^3 u}{(\log \log u)^2} \approx 8 \log(n).$$
As we have, for $a, s, t \in \mathbb{R}$, that $s = t (\log t)^a$ goes to $t = (1 + o(1)) s (\log s)^{-a}$ as $t$ goes to infinity, we get that
$$\log y \approx \log u \approx 2 \log^{1/3}(n) \Big( \frac{1}{3} \log \log n \Big)^{2/3} = \Big( \frac{8}{9} \Big)^{1/3} (\log n)^{1/3} (\log \log n)^{2/3}.$$
The final GCD computation, which even with the most simple implementation, Euclid's algorithm, has a complexity at most $O(\log n)$, can be completely disregarded in the complexity analysis.

Hence, with this choice of basic parameters $u$, $y$ and $d \approx d' \approx \big( \frac{3 \log n}{\log \log n} \big)^{1/3}$ we get the conjectured runtime of $L_n\big( \tfrac{1}{3}, \sqrt[3]{\tfrac{64}{9}} + o(1) \big)$. However, this is fully heuristic and very little can be proven rigorously for this particular form of the number field sieve, even with modern techniques.

4 Preparing for randomness

Before now we have been able to suffice with elementary and algebraic number theory and some Galois theory. To introduce randomness we will be drawing strongly on analytic number theory. It was Riemann who really gave analytic number theory a kickstart by introducing the Riemann zeta function.

Remark.

Unless otherwise noted $p, q \in \mathbb{Z}$ are prime numbers.

Definition 4.1.

Let $s \in \mathbb{C}$ and $n \in \mathbb{N}$, then the Riemann zeta function is defined as
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$
This converges for all $s$ such that $\operatorname{Re}(s) > 1$, has an analytic continuation to $\mathbb{C}$ except for a simple pole at $s = 1$, and admits a functional equation
$$\pi^{-\frac{s}{2}}\,\Gamma\left(\frac{s}{2}\right)\zeta(s) = \pi^{-\frac{1-s}{2}}\,\Gamma\left(\frac{1-s}{2}\right)\zeta(1-s),$$
where $\Gamma(s) = \int_0^{\infty} e^{-t}t^{s-1}\,dt$. Moreover for $\operatorname{Re}(s) > 1$, $\zeta(s)$ admits an Euler product:
$$\zeta(s) = \prod_{p} \frac{1}{1 - p^{-s}}.$$

Remark.

The proofs for all the properties listed in the definition can be found in any undergraduate analytic number theory book. We suggest [8] or [10].

One of the first results of analytic number theory is the prime number theorem.

Theorem 4.2.

Let $\pi(x) = \sum_{p \le x} 1$. Then there exists a constant $a > 0$ such that
$$\pi(x) = \operatorname{Li}(x) + O\left(\frac{x}{\log x}\exp\left(-a\sqrt{\log x}\right)\right) = \frac{x}{\log x}(1 + o(1)),$$
where
$$\operatorname{Li}(x) = \int_2^x \frac{1}{\log t}\,dt = \frac{x}{\log x}\left(1 + O\left(\frac{1}{\log x}\right)\right)$$
is the logarithmic integral.

One of the most influential open questions in mathematics is related to the zeta function: the Riemann Hypothesis.

Proposition 4.3. [Riemann Hypothesis]

Let $\zeta(s)$ be the Riemann zeta function. For $\operatorname{Re}(s) > 0$, if $\zeta(s) = 0$ then $\operatorname{Re}(s) = \frac{1}{2}$.

This hypothesis has been so extensively tested that it is often accepted as true in the mathematical community. Doing so there is a whole section of number theory that is considered 'conditional', which will become unconditional the moment that the Riemann Hypothesis is proven. There is a very interesting result by Hardy, proven in 1914, which we will record for good measure, [theorem 14.8, [8]].

Theorem 4.4. [Hardy's Theorem]

There exist infinitely many $t \in \mathbb{R}$ such that
$$\zeta\left(\frac{1}{2} + it\right) = 0.$$

However for the Riemann Hypothesis to be true this is not enough, as we want all the zeroes to be on the critical line $\operatorname{Re}(s) = \frac{1}{2}$. This is still one of the major research areas in analytic number theory, but there are some results. The first such was by Selberg in 1942, who showed that some non-zero fraction of the zeroes lies on the critical line. This was later improved by Levinson, who showed in 1974-1975 that at least 34.7% of the zeroes lie on this line, and this was improved to 40% by Conrey in 1989. Since then there have been, to my knowledge, no significant improvements, leaving this topic wide open for future research.

4.2 Generalizing the Riemann zeta function

The Riemann zeta function is not the only function of its kind. The idea of the Riemann zeta function was generalized in two directions, each with their own implications. Dirichlet introduced a generalization through the use of multiplicative character functions and Dirichlet series to define a whole family of meromorphic analytic arithmetic functions. Dedekind on the other hand chose the algebraic route and defined the Dedekind zeta functions, which are defined over number fields. For the former we shall introduce Dirichlet characters, which extend the idea behind quadratic characters.

Deﬁnition 4.5.

Let $G$ be a group. A character on $G$ is a group homomorphism $\chi : G \to \mathbb{C}^{\times}$ and the set of characters of $G$ is denoted $\hat{G}$.

Generally characters possess a few properties:

Proposition 4.6.

Let $\chi$ be a $G$-character, then:
1. As $\chi$ is a homomorphism: $\forall g, g' \in G : \chi(gg') = \chi(g)\chi(g')$.
2. $\forall G, \forall \chi : \chi(e_G) = 1$.
3. $\forall G\ \exists \chi_0$ s.t. $\forall g \in G : \chi_0(g) = 1$. This is called the trivial or principal character.

These properties reveal that $\hat{G}$ might have a group structure, and this is true for the finite case:

Lemma 4.7. If $G$ is a finite group then so is $\hat{G}$.

There is one more property to consider for general characters before we can introduce Dirichlet characters and that is orthogonality:

Definition 4.8. Let $G$ be a finite group with set (and as $G$ is finite: group) of characters $\hat{G}$. Then $G$ is said to have the orthogonality property if the following two equations hold:
$$\sum_{g \in G} \chi(g) = \begin{cases} |G|, & \chi = \chi_0 \\ 0, & \chi \neq \chi_0 \end{cases}$$
$$\sum_{\chi \in \hat{G}} \chi(g) = \begin{cases} |G|, & g = e_G \\ 0, & g \neq e_G \end{cases}$$

Remark.

For the sake of clarity: Note that the first sum requires a fixed character and sums over all elements, while the second requires a fixed element and sums over all characters of $G$.

It is an elementary result from representation theory that the orthogonality property holds for all finite abelian groups, and in particular finite cyclic groups, which is of importance as we shall now see:

Definition 4.9.

1. Let $q \in \mathbb{N}$. Then a Dirichlet character mod $q$ is a character of the form $\chi : (\mathbb{Z}/q\mathbb{Z})^{\times} \to \mathbb{C}^{\times}$.
2. Let $\chi$ be a Dirichlet character mod $q$, we extend $\chi$ to an arithmetic function $\chi : \mathbb{Z} \to \mathbb{C}$ as
$$\chi(n) = \begin{cases} \chi(n \bmod q), & \gcd(n, q) = 1 \\ 0, & \gcd(n, q) > 1. \end{cases}$$

Moreover a Dirichlet character mod $q$ has the orthogonality property:
$$\sum_{\substack{n = 1 \\ \gcd(n,q) = 1}}^{q} \chi(n) = \begin{cases} \phi(q), & \chi = \chi_0 \\ 0, & \chi \neq \chi_0, \end{cases}$$
and
$$\sum_{\chi \bmod q} \chi(n) = \begin{cases} \phi(q), & n \equiv 1 \bmod q \\ 0, & \text{otherwise.} \end{cases}$$

As the Dirichlet characters are defined over $\mathbb{Z}/q\mathbb{Z}$, and we will often consider $q$ a prime, we have a vested interest in how characters of cyclic groups behave. It turns out that these are rather well-behaved:

Theorem 4.10.

Assume that $G$ is a finite cyclic group such that $|G| = n$ and $\langle a \rangle = G$. Then:
1. $\hat{G}$ has exactly $n$ elements: $\chi_k(a^m) = \zeta_n^{km}$, $k \in \{1, \dots, n\}$, where $\zeta_n = e^{\frac{2\pi i}{n}}$ is a primitive $n$'th root of unity.
2. $G$ has the orthogonality property (either as defined in Definition 4.8 or as a Dirichlet character).
3. $\hat{G}$ is cyclic with generator $\chi_1$, hence $G \cong \hat{G}$.

Proof.

Let $\chi \in \hat{G}$. Then $\chi(a) = \zeta_n^{k}$ for some $k \in \{1, \dots, n\}$. Hence $\chi(a^m) = \chi(a)^m = \zeta_n^{km}$, proving part (1). By (1) $\hat{G}$ is cyclic and generated by $\chi_1$, so $G \cong \hat{G}$ as required. To see that $G$ has orthogonality of characters it is simple to see that $\chi_0$ follows the definition of orthogonality. To illustrate we show it according to Definition 4.8. Note that
$$\sum_{g \in G} \chi_0(g) = |G|$$
is trivial. Hence assume that $k \in \{1, \dots, n-1\}$. Then
$$\sum_{g \in G} \chi_k(g) = \sum_{m=0}^{n-1} \chi_k(a^m) = \sum_{m=0}^{n-1} \zeta_n^{km} = \frac{1 - \zeta_n^{kn}}{1 - \zeta_n^{k}} = 0.$$
By the duality of this identity the identity of $\hat{G}$ holds too, hence we have shown (2).

If $q$ is not prime we will still be dealing with a special group, as $(\mathbb{Z}/q\mathbb{Z})^{\times}$ will still be abelian, hence can be written as a direct product of cyclic groups. Therefore we can use theorem 4.10 to say the following:

Lemma 4.11.

Let $G_1, G_2$ be finite cyclic groups and let $G \cong G_1 \times G_2$. Let $\chi_i \in \hat{G}_i$ for $i = 1, 2$ and define $\chi : G \to \mathbb{C}^{\times}$ via $\chi(g_1, g_2) = \chi_1(g_1)\chi_2(g_2)$. This is a character. Conversely, if $\chi \in \hat{G}$ then there exists a unique choice of $\chi_1 \in \hat{G}_1$ and $\chi_2 \in \hat{G}_2$ such that $\chi(g) = \chi_1(g_1)\chi_2(g_2)$. Furthermore $\hat{G} \cong \hat{G}_1 \times \hat{G}_2$ and $G$ has orthogonality of characters, and if $G$ is finite abelian then $G \cong \hat{G}$.

It is clear that if $q$ is an odd prime, then we can define
$$\chi(n) = \left(\frac{n}{q}\right),$$
where the right hand side is the Legendre symbol. This is a character, and it is easy to show that this is, in fact, a Dirichlet character. It is in this sense that Dirichlet characters are a generalization of quadratic characters, which will be extremely useful when we randomize the number field sieve.

Another very useful property is that, by definition, Dirichlet characters are (quasi-)periodic.

Definition 4.12.

Let $\chi$ be a Dirichlet character mod $q$. We say that $d \in \mathbb{N}$ is a quasi-period if for all $m \equiv n \bmod d$ such that $\gcd(mn, q) = 1$: $\chi(m) = \chi(n)$. Moreover the least such $d$ is called the conductor.

It is obvious from the definition that if there are two quasi-periods $d_1, d_2$ then there is a third: $\gcd(d_1, d_2)$. As $q$ is always a quasi-period we have that the conductor of a Dirichlet character mod $q$ is at most $q$. We can however say something much stronger about the conductor of $\chi$:

Proposition 4.13. Let $\chi$ be a Dirichlet character mod $q$ with conductor $d$. Then $d \mid q$.

Proof.

Let $g = \gcd(d, q)$. Suppose $m \equiv n \bmod g$ and $\gcd(mn, q) = 1$. By Euclid's algorithm there exist $x, y \in \mathbb{Z}$ such that $m - n = dx + qy$, thus
$$\chi(m) = \chi(m - qy) = \chi(dx + n) = \chi(n).$$
Hence $g$ is a quasi-period of $\chi$. As $d$ is the conductor, and hence the least quasi-period, we have $d \le g$; together with $g \mid d$ this gives $g = d$, and since $g = \gcd(d, q)$ we have that $d \mid q$.

As per usual we have now found a way to consider characters “prime”:

Definition 4.14.

A Dirichlet character modulo q is called primitive if it has conductor q . Remark.

Convention dictates that the trivial character $\chi_0$ is not primitive.

This idea of primitive characters is especially useful when you are working with induced characters. Taking inspiration from the notation as it is used in representation theory, consider $d \mid q$ and $\chi^*$, a character mod $d$, and define
$$\chi(n) = \begin{cases} \chi^*(n), & \gcd(n, q) = 1 \\ 0, & \text{otherwise.} \end{cases}$$
Then $\chi(n)$ is a multiplicative character and has period $q$, so it is a Dirichlet character mod $q$. This is called the Dirichlet character mod $q$ induced by a Dirichlet character mod $d$ and we will denote it $\operatorname{Ind}_d^q(\chi^*)$. We conclude with the following theorem about induced Dirichlet characters:

Theorem 4.15.

Let $\chi$ be a Dirichlet character mod $q$ with conductor $d$. Then there exists a unique character $\chi^*$ mod $d$ such that
$$\chi(n) = \operatorname{Ind}_d^q(\chi^*(n)).$$

L-functions

The definition of Dirichlet characters leads directly to the definition of Dirichlet $L$-functions:

Definition 4.16. Let $q \in \mathbb{N}$ and let $\chi$ be a Dirichlet character modulo $q$. Then the Dirichlet $L$-function associated to $\chi$, denoted $L(s, \chi)$, is defined as
$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$
for $\operatorname{Re}(s) > 1$. As $\chi$ is a completely multiplicative character function we can immediately conclude that $L(s, \chi)$ is absolutely convergent for $\operatorname{Re}(s) > 1$ and admits an Euler product
$$L(s, \chi) = \prod_{p} \left(1 - \frac{\chi(p)}{p^s}\right)^{-1},$$
where $p$ is prime. The behaviour of the trivial character and the non-trivial characters is quite different. In the case of the trivial character we can rewrite the Euler product:
$$L(s, \chi_0) = \prod_{p \mid q} \left(1 - p^{-s}\right)\zeta(s).$$
This means $L(s, \chi_0)$ behaves like $\zeta(s)$ and allows for a meromorphic continuation with a single simple pole at $s = 1$. When $\chi$ is not trivial the behaviour changes radically.

Theorem 4.17.

Let $q \in \mathbb{N}$ and let $\chi$ be a Dirichlet character modulo $q$ such that $\chi \neq \chi_0$. Then $L(s, \chi)$ converges and is holomorphic for $\operatorname{Re}(s) > 0$.

Remark. $\operatorname{Re}(s) = 0$ is the abscissa of convergence for $L(s, \chi)$, but despite the convergence we cannot extend the definition of the Euler product of $L(s, \chi)$ to $\operatorname{Re}(s) < 1$.

Since $L(s, \chi)$ converges at $\operatorname{Re}(s) = 1$ in this case one might be interested in what these values actually are. This has been a problem for analytic number theorists for quite some time, but we can say the following:

Theorem 4.18. Let $q \in \mathbb{N}$ and let $\chi$ be a Dirichlet character mod $q$. Then
$$\prod_{\chi \neq \chi_0} L(1, \chi) \neq 0.$$

This is absolutely not trivial and a particularly concise proof is presented in [[8], theorem 4.9]. The first of two jewels in this is the following theorem by Dirichlet.

Theorem 4.19.

Let $q \in \mathbb{N}$ and let $a \in \mathbb{Z}$ such that $\gcd(a, q) = 1$. Then there are infinitely many primes $p \equiv a \bmod q$.

Remark.

Although we do not give the proof explicitly, the important step is to show that
$$\sum_{p \equiv a \bmod q} \frac{1}{p}$$
diverges. The theorem follows immediately.

It might not surprise that with the generalization of the Riemann zeta function there follows a generalization of the Riemann Hypothesis:

Proposition 4.20. [Generalized Riemann Hypothesis]

Let $L(s, \chi)$ be a Dirichlet $L$-function with $\chi$ a primitive character mod $q$, $q > 1$. Then for $\operatorname{Re}(s) \in (0, 1)$, if $L(s, \chi) = 0$ then $\operatorname{Re}(s) = \frac{1}{2}$.

We will come back to Dirichlet $L$-functions when we adapt the prime number theorem to arithmetic progressions in Section 4.3.

Now we consider the other generalization of the zeta function, though it can in this case very accurately be called an extension.

Deﬁnition 4.21.

Let $K$ be an algebraic number field with ring of integers $\mathcal{O}_K$ and let $I \subseteq \mathcal{O}_K$ be a non-zero ideal. Then the Dedekind zeta function over $K$ is defined as
$$\zeta_K(s) = \sum_{I \subseteq \mathcal{O}_K} \frac{1}{(N_K(I))^s}.$$
The Dedekind zeta function is, for $\operatorname{Re}(s) > 1$, absolutely convergent and admits an Euler product. Let $P \subseteq \mathcal{O}_K$ be a prime ideal, then:
$$\zeta_K(s) = \prod_{P \subseteq \mathcal{O}_K} \frac{1}{1 - N_K(P)^{-s}}.$$
It was proven by Hecke that the Dedekind zeta function admits a meromorphic continuation to the whole complex plane, with a simple pole at $s = 1$, and also admits a functional equation, so it is clear that in many ways the Dedekind zeta function behaves like the Riemann zeta function. This correlation is so close that it culminates in the following theorem:

Theorem 4.22. Landau, 1903 - Prime Ideal Theorem

Let $K$ be an algebraic number field and let $P \subseteq \mathcal{O}_K$ be a prime ideal and define
$$\pi_K(x) = |\{P \subseteq \mathcal{O}_K \mid N(P) \le x\}|.$$
Then for $x \to \infty$,
$$\pi_K(x) \asymp \frac{x}{\log x}.$$

For our purposes it is sufficient to realize that the Dedekind zeta function over $K$ behaves a lot like the Riemann zeta function, and that the same language can be used to talk about the Dedekind zeta function as what we use for the Riemann zeta function. In fact, there is also a Riemann Hypothesis for the Dedekind zeta functions:

Proposition 4.23. [Extended Riemann Hypothesis]

Let $K$ be a number field. For $\operatorname{Re}(s) \in (0, 1)$, if $\zeta_K(s) = 0$, then $\operatorname{Re}(s) = \frac{1}{2}$.

Remark.

As the statements are analogous for the Dirichlet $L$-functions and the Dedekind zeta functions it is not uncommon to see the Extended Riemann Hypothesis be called the Generalized Riemann Hypothesis for number fields. We will in this paper refer to both as the Generalized Riemann Hypothesis, GRH, and let context imply which version we are referring to.

One of the main problems that we encounter when we consider both Dirichlet $L$-functions and Dedekind zeta functions is that we can not generalize the zero-free region. Especially, for the Riemann zeta function we were able to extend the zero-free region just past the $\operatorname{Re}(s) = 1$ line, and this may not be true for Dedekind zeta functions and Dirichlet $L$-functions. In fact:

Theorem 4.24.

There is a constant $c > 0$ such that if $L(\sigma + it, \chi) = 0$ for some primitive complex Dirichlet character $\chi$ mod $q$ then
$$\sigma < 1 - \frac{c}{\log\left(q(|t| + 2)\right)}.$$
If $\chi$ is a real primitive character then this holds for all zeroes of $L(s, \chi)$ with at most one exception. The exceptional zero, if it exists, is real and simple.

Later we will see a result by Landau, 5.29, which develops on this idea. It is clear to see that this theorem is formulated for Dirichlet $L$-functions, but a similar result exists for Dedekind zeta functions. We therefore introduce the following concept:

Definition 4.25.

Let L ( s, χ ) be a Dirichlet L -function, then the hypothetical value 1 − ν for ν ∈ (0 , ), such that L (1 − ν, χ ) = 0 , for a Dirichlet character modulo q , χ , is called a Siegel-zero of L ( s, χ ).Equivalently, for ζ K ( s ), if there exists a ν ∈ (cid:0) , (cid:1) such that ζ K (1 − ν ) = 0 , Page 62 of 114or a number ﬁeld K , is called a Siegel-zero of ζ K ( s ).The culminating theorem from Siegel on ‘his’ zeroes was Theorem 4.26. Siegel’s theorem

For any $\epsilon > 0$ there exists a positive number $c(\epsilon)$ such that, if $\chi$ is a real primitive character mod $q$, then
$$L(1, \chi) > c(\epsilon)\,q^{-\epsilon}.$$

Even in the definition we allude to these zeroes as hypothetical, as the assumption of the (Generalized) Riemann Hypothesis has as an immediate consequence that such zeroes do not exist. For this paper it matters that [14] does not assume the Riemann Hypothesis in any way, hence we must always concern ourselves with possible Siegel zeroes.
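Theorem 4.18 and Siegel's theorem both concern the size of $L(1, \chi)$. As a purely numerical illustration, which is not part of [14], the following Python sketch approximates $L(1, \chi)$ for the real character given by the Legendre symbol modulo a few small odd primes. The names `legendre` and `l_one` and the truncation parameter `periods` are choices made for this example only, and the partial sums say nothing about the ineffective constant $c(\epsilon)$.

```python
import math


def legendre(n: int, q: int) -> int:
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n % q, (q - 1) // 2, q)  # r is 0, 1 or q - 1
    return -1 if r == q - 1 else r


def l_one(q: int, periods: int = 20000) -> float:
    """Partial sum of L(1, chi) over `periods` complete periods of chi."""
    chi = [legendre(a, q) for a in range(q)]  # one full period of chi
    total = 0.0
    for n in range(1, periods * q + 1):
        total += chi[n % q] / n
    return total


if __name__ == "__main__":
    for q in (3, 5, 7, 11, 13):
        print(q, l_one(q))
    # For q = 3 the value is known in closed form: pi / (3 * sqrt(3)).
    print(abs(l_one(3) - math.pi / (3 * math.sqrt(3))))
```

Summing over complete periods keeps the truncation error small, because the character sums to zero over each period.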

Now we turn to the next topic, which is a main factor in the Randomized Number Field Sieve: Smooth numbers on arithmetic progressions. To be able to do this we first need to know a little bit about prime numbers on arithmetic progressions and their distribution. First we note that it would be beneficial for us to have a prime counting function for arithmetic progressions. This turns out to be a simple modification:

Deﬁnition 4.27.

For $\gcd(a, q) = 1$ the prime counting function for arithmetic progressions is defined as
$$\pi_{(a,q)}(x) = \pi(x, a, q) = \sum_{\substack{p \le x \\ p \equiv a \bmod q}} 1.$$

The first question that comes to mind is one that was answered by Dirichlet: Are there infinitely many primes of that form? This is the case, as we stated in theorem 4.19:

Theorem 4.28.

Let $x \in \mathbb{R}$ and $a, q \in \mathbb{Z}$ such that $\gcd(a, q) = 1$. Then there are infinitely many primes of the form $p \equiv a \bmod q$.

This immediately shows that just like for $\pi(x)$ we have that $\pi(x, a, q)$ diverges. Therefore, we infer, there must be something interesting to say about the asymptotic behaviour of the prime counting function. Not only is there something interesting to say, there is a whole lot to say as well:

Theorem 4.29.

Let $x \in \mathbb{R}$, $a, q \in \mathbb{Z}$ such that $\gcd(a, q) = 1$, and let $\phi(q)$ be Euler's totient function. Then
$$\pi(x, a, q) \sim \frac{x}{\log x}\,\frac{1}{\phi(q)}.$$
This immediately shows that the density is independent of $a$ as long as $\gcd(a, q) = 1$.

Before we start to talk about smooth numbers we want to consider if anything better than Dirichlet's theorem can be said, and we can if we allow for an ineffective bound. For this we define
$$\psi(x, a, q) = \sum_{\substack{n \le x \\ n \equiv a \bmod q}} \Lambda(n),$$
where $\Lambda(n)$ is the von Mangoldt function:
$$\Lambda(n) = \begin{cases} \log p, & \exists k \in \mathbb{N} : n = p^k \\ 0, & \text{otherwise.} \end{cases}$$

Theorem 4.30. Siegel-Walfisz theorem

Let $\psi(x, a, q)$ and $\Lambda(n)$ be as defined above. Given $q \in \mathbb{N}$ there exists a positive constant $c(q)$ such that
$$\psi(x, a, q) = \frac{x}{\phi(q)} + O\left(x\exp\left(-c(q)\sqrt{\log x}\right)\right).$$

The reason this is ineffective is because it is based on Siegel's theorem, 4.26, and this theorem gives no way to compute $c(q)$. Another way to go about this is looking at average cases. This probabilistic approach will be explored later and will lead to effective constants and a far stronger bound.

It may not come as a surprise that the idea of prime numbers along arithmetic progressions can be expanded to number fields, as we have done with the Prime Ideal Theorem. For this we first consider what an 'arithmetic progression' is on a number field. This does not come intuitively, but the answer lies in Galois Theory. Let $L/K$ be a Galois extension with Galois group $G = \operatorname{Gal}(L/K)$. As $L/K$ is a Galois extension the Frobenius element, $\operatorname{Frob}_P$, defines a conjugacy class
$$C = \left\{\operatorname{Frob}_Q \mid Q \subset L \text{ s.t. } Q \text{ is a prime ideal and } Q \mid P\right\}.$$
It can be shown that this abides by the rules of modular arithmetic and therefore can be used as an extension of the idea of arithmetic progressions. Using this we get the following result:

Theorem 4.31. Chebotarev Density Theorem

Let $L/K$ be Galois and let $P \subseteq K$ be a prime ideal. Moreover let $C \subseteq G$ be the conjugacy class defined above. Then
$$\{P \mid P \nmid \Delta_{L/K},\ \operatorname{Frob}_P \in C\}$$
has density $\frac{|C|}{|G|}$.

This has long been the best result available, but over the years it has been strengthened both under the GRH assumption and without. Later we will see a version which will be of particular interest to us as it is both unconditional and has effectively defined constants.

Lastly, before we move onto a different topic, we discuss how smooth numbers behave on arithmetic progressions. First we need some definitions:

Definition 4.32.

For x, y, a, q ∈ N and a Dirichlet character χ , we deﬁne:Ψ( x, y ) = |{ z ∈ N | z < x, z is y -smooth }| , Ψ r ( x, y ) = |{ z ∈ N | z < x, z is y -smooth , gcd( z, r ) = 1 }| , Ψ( x, y, a, r ) = |{ z ∈ N | z < x, z is y -smooth , z ≡ a mod r }| , Ψ( x, y, χ ) = X z

Conjecture 4.33.

Let A be a given positive real number. Let y and q be large with q ≤ y A .Then as log x log y → ∞ we have Ψ( x, y, a, q ) ∼ φ ( q ) Ψ q ( x, y ) . Soundararajan proved in [29] that the above holds for A ≥ √ e − ǫ given a certain bound on y . The latter restriction was later removed by Harper, [30]. We will see more of Harper’s worklater as we use these results.We now introduce some bounds based on Hildebrand and Tenenbaum’s work, which will bepresented proof-less here. For those especially interested in this we suggest [25],[26],[22]. Wewill avidly be working with these sets later on, but to do so we will ﬁrst need a few resultsstarting with [[14], fact 3.11]: Proposition 4.34.

Let ǫ > ≤ u ≤ (1 − ǫ ) log x log log x Then:Ψ (cid:16) x, x u (cid:17) = x exp (cid:18) − u (cid:18) log u + log log u − u − u + O ǫ (cid:18) log log u log u (cid:19)(cid:19)(cid:19) . A very rough result that we use is a direct result of this factPage 66 of 114 orollary 4.35.

Fix 0 < a < b ≤

1. Then uniformly in c, d > ρ ( L x ( b, d ) , L x ( a, c )) = L x (cid:18) b − a, d ( b − a ) c (cid:19) − o (1) . Proof.

Let u = log L x ( b, d )log L x ( a, c ) = dc (cid:18) log x log log x (cid:19) b − a . Then u → ∞ and u = log (cid:16) log x log log x (cid:17) . Hence ρ ( L x ( b, d ) , L x ( a, c )) = exp( − (1 + o (1)) u log u )= exp (cid:18) − (1 + o (1)) d ( b − a ) c log b − a ( x )(log log x ) b − a (cid:19) = L x (cid:18) b − a, d ( b − a ) c (cid:19) − o (1) . If we are substantially more careful however there are tighter results, whose proofs go beyondthe scope of this paper, by Hildebrand and Tenenbaum. These results allow short intervals tobe sieved for smooth numbers eﬀectively:

Theorem 4.36.

Fix any ǫ >

0. For any x ≥

3, log x ≥ log y ≥ (log log x ) + ǫ , z ≤ y , thefollowing estimate holds uniformly:Ψ (cid:0) x (cid:0) z − (cid:1) , y (cid:1) − Ψ( x, y ) = Ψ( x, y ) z (cid:18) O (cid:18) log( u + 1)log y (cid:19)(cid:19) . Theorem 4.37.

For any x, y , we set u = log x log y . There exists a saddle-point, α = α ( x, y ), suchthat for any 1 ≤ c ≤ y :Ψ( cx, y ) = Ψ( x, y ) c α ( x,y ) (cid:18) O (cid:18) u + log yy (cid:19)(cid:19) , withPage 67 of 114 ( x, y ) = log (cid:16) y log x + 1 (cid:17) log y (cid:18) O c (cid:18) log log( y + 1)log y (cid:19)(cid:19) . Theorem 4.38.

Let c > n ∈ N such that ω ( n ) is the number of primefactors of n (without multiplicity). Let n be a y -smooth number with 2 ≤ y ≤ x such that ω ( n ) ≤ y c (log(1+ u )) − . Then:Ψ n ( x, y ) = φ ( n ) n Ψ( x, y ) (cid:18) O (cid:18) log(1 + u ) log(1 + ω ( n ))log y (cid:19)(cid:19) . This ﬁnal theorem induces the following corollary which will prove crucial to us:

Corollary 4.39.

Take c > ω as in 4.38. Let 2 ≤ y ≤ x andwith ω ( n ) ≤ y c (log(1+ u )) − . Then:Ψ n ( x, y ) = φ ( n ) n Ψ( x, y ) (cid:18) O c (cid:18) log(1 + u ) log(1 + ω ( n ))log( y ) (cid:19)(cid:19) (cid:18) O (cid:18) ω ( n ) y (cid:19)(cid:19) . Proof.

Let n = sr for s y-smooth and r with no prime factor less than y . Then Ψ s ( x, y ) =Ψ r ( x, y ) , φ ( n ) = φ ( r ) φ ( s ) and, for p prime, φ ( r ) r − = Q p | r (1 − p − ) = 1 + O ( ω ( n ) y − ), whichimplies the given bound. The Randomized Number Field Sieve is a probabilistic algorithm and makes extensive use ofthe discrete uniform distribution. Deﬁne for x ∈ [ a, b ] and for y ∈ S , where S is a ﬁnite set, thediscrete uniform distribution as P ( x ) = 1 b − a + 1 , or P ( y ) = 1 | S | This is in many ways one of the simplest distributions to work with and overall we will onlyneed a limited amount of probability theory. There is however a need to understand how adistribution can be used to deﬁne a measure. For this recall the following deﬁnition:Page 68 of 114 eﬁnition 4.40.

A set function $\mu : F \to \mathbb{R}$, for a field $F$, is a probability measure if it satisfies these conditions:
1. $0 \le \mu(A) \le 1$ for $A \subseteq F$.
2. $\mu(\emptyset) = 0$, $\mu(F) = 1$.
3. If $A_1, A_2, \dots$ is a disjoint sequence of $F$-sets such that $\bigcup_{i=1}^{\infty} A_i \in F$, then
$$\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i).$$

Remark.

Note that if $\mu$ is a probability measure then the support of $\mu$ is any set $A \subset F$ for which $\mu(A) = 1$. It is clear that this always exists by the second condition.

We can confirm that for any finite set $S$ we can define a uniform measure. To see this let $\mu$ be the uniform distribution of some finite set $S = \{s_1, \dots, s_n\}$ such that $|S| = n$. Then for any $V \subset S$ with order $|V|$ we have that $\mu(V) = \frac{|V|}{|S|}$. Especially we have that
$$\mu(\emptyset) = 0, \quad \mu(S) = 1, \quad \text{and} \quad 0 \le \mu(V) \le 1.$$
Lastly for any collection of disjoint subsets $V_1, \dots, V_m \subset S$:
$$\mu\left(\bigcup_{i=1}^{\infty} V_i\right) = \mu\left(\bigcup_{i=1}^{m} V_i\right) = \frac{\sum_{i=1}^{m} |V_i|}{|S|} = \sum_{i=1}^{m} \frac{|V_i|}{|S|} = \sum_{i=1}^{m} \mu(V_i).$$
Similarly we can confirm that for any continuous interval $[a, b]$ we can define a uniform measure.
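The verification above can be mirrored in a few lines of Python. This is only a sketch under the assumption that the finite set is given as an iterable of hashable elements; the helper name `uniform_measure` is hypothetical, and exact rational arithmetic is used so that the three conditions can be checked without rounding.

```python
from fractions import Fraction


def uniform_measure(S):
    """Return mu with mu(V) = |V| / |S| for subsets V of the finite set S."""
    S = frozenset(S)
    if not S:
        raise ValueError("S must be non-empty")

    def mu(V):
        V = frozenset(V)
        if not V <= S:
            raise ValueError("V must be a subset of S")
        return Fraction(len(V), len(S))

    return mu


if __name__ == "__main__":
    S = {1, 2, 3, 4, 5, 6}
    mu = uniform_measure(S)
    V1, V2 = {1, 2}, {5}  # two disjoint subsets of S
    print(mu(set()), mu(S))              # normalisation
    print(mu(V1 | V2), mu(V1) + mu(V2))  # finite additivity
```

Using `Fraction` instead of floating point is a design choice for the example: it makes the additivity check an exact equality.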

Deﬁnition 4.41.

For any finite set $S$, we denote the uniform measure over $S$ by $U(S)$.

One of the things we are going to use our uniform measure on are the zero-centered half-open integer intervals of length $L$, which we will denote as follows:
$$I(L) = \left[-\tfrac{1}{2}L, \tfrac{1}{2}L\right) \cap \mathbb{Z}.$$
We note here, for good measure, that this is in fact a finite set. In the section above we already discussed the Dirichlet convolution of arithmetic functions, but there is also a convolution of measures, which is normally defined as an integral, but we are interested in the discrete uniform measures as defined above. This allows us to restate the convolution of measures as follows:

Deﬁnition 4.42.

For any two measures $\mu, \nu$ over an additive group $G$ we define the convolution of measures as
$$(\mu \star \nu)(x) = \sum_{y \in G} \mu(y)\,\nu(x - y).$$

Remark.

Note that it is convention to denote the convolution of measures by $*$, but to be distinct we will denote the convolution of measures by $\star$ and the Dirichlet convolution by $*$.

One of the big concepts in the RNFS is the avoidance of the Generalized Riemann Hypothesis, and for this we need to consider the moments of a probability distribution. We will recognize the first two moments, mean and variance, of the discrete uniform distribution:

Definition 4.43.

Let $U$ be a discrete uniform distribution with support $S$, then the first moment, or mean, is defined as
$$E(U) = \sum_{s \in S} s\,P_U(s).$$

Remark.

By convention, the fixed variables are written as a subscript to the probability $P$. In this case we fix the distribution, but this subscript is usually dropped when the distribution is clear from context.

Definition 4.44.

Let $U$ be a discrete uniform distribution with support $S$, then the second moment, or variance, of $x \in S$ is defined as
$$\operatorname{Var}(x) = E\left((x - E(x))^2\right).$$

The idea of the RNFS is to remove the dependence of the analysis of number field sieve-type algorithms on the second moment, for which certain bounds on the complexity exist. To do this we consider a probabilistic technique which Lee and Venkatesan, in [14], have dubbed stochastic deepening. The core idea is as follows:

Lemma 4.45.

Let x be a random variable with E ( x ) = µ . Given there exists a K ≥ ≤ x ≤ Kµ uniformly, then there exists i ∈ { , . . . , ⌈ log K ⌉} such that: P (cid:18) x ≥ i µ ⌈ log K ⌉ (cid:19) ≥ i +1 . This lemma states that for non-negative random variables which are not too erratic, thereis a substantial set where the value is large and whose contribution to the mean is large. Thisallows us to provide a search algorithm whose run times are shown to be near optimal withoutestablishing variance bounds.The explicit statement of such a search algorithm is not too important for us, as we are analysingthe theoretical complexity, so we will not deﬁne such an algorithm explicitly. It will howeverbecome important in the proof of one of the key theorems, 5.2.Page 71 of 114

The Randomized Number Field Sieve

Recalling the GNFS algorithm, algorithm 3, we will focus in this section on giving a descriptionof the diﬀerences between the sieve from chapter 3 and a randomized version with provable com-plexity. It is well known that the Number Field Sieve’s runtime is dominated by two problems:1. Finding a set S of suﬃcient ( a, b )-pairs such that a − bm is B -smooth and a − bα is B ′ -smooth.2. Collapsing these ( a, b )-pairs to ﬁnd a subset T ⊆ S such that a congruence of squaresmodulo the to-factor integer n arises.In this version of the Number Field Sieve these two steps are randomized cleverly to produce aprovable complexity. To do this there is one more change that needs to be made, and that is thechoice of polynomial. It will be shown that by randomizing the choice of polynomial a rigorousbound can be obtained for the search. To start we consider the constants δ, κ, σ, β, β ′ under theconditions that κ > δ − , σ > max ( β, β ′ ) + δ − β (1 + o (1)) + σδ + κ β ′ (1 + o (1)) , (5) δ − < σδ + κ , (6)and ﬁx the smoothness bounds B = L n (cid:18) , β (cid:19) , B ′ = L n (cid:18) , β ′ (cid:19) . (7)The culminating theorem of the randomized number ﬁeld sieve is as follows:Page 72 of 114 heorem 5.1. Let δ, κ, σ, β, β ′ be constants as deﬁned in (1) and (2). Then for any n , therandomized number ﬁeld sieve runs in expected time L n (cid:18) , r

649 + o (1) (cid:19) , and produces a pair x, y with x ≡ y mod n .It is important to stress that this is a NFS-style algorithm, so we can not be sure that thecongruence found does not give a trivial factorization. The algorithm is started in the same wayas the GNFS by ﬁnding a polynomial, however in this randomized version we will insist that thepolynomial is a bivariate homogeneous monic irreducible polynomial of bounded degree, that is: f ( x, y ) ∈ Z [ x, y ] : f ( x, y ) = ˆ f m,n ( x, y ) + R ( x, y ) , where ˆ f m,n ( x, y ) is the polynomial given by n and m , such that ˆ f m,n ( m,

1) is the base- m expansionof n and R ( x, y ) = d − X i =0 c i ( x − ym ) x d − i − y i , such that every c i is determined uniformly at random and deg( f ) = d = δ q log n log log n , d odd.Moreover we bound the choice of m by m d ≤ n < m d . Remark.

The harshness of these bounds is necessary to show a provable complexity, so we will often refer back to this introduction as “the defined parameters” without restating the exact parameters again.
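To make the polynomial construction above concrete, here is a minimal Python sketch. It builds the base-$m$ part $\hat{f}_{m,n}$ from the digits of $n$, adds a random perturbation of the shape $R(x, y) = \sum_i c_i (x - ym) x^{d-i-1} y^i$, and checks that $f(m, 1) = n$. The index range $i = 0, \dots, d-1$, the coefficient bound, and all function names are assumptions made for illustration; the checked property holds for any range because every term of $R$ carries the factor $(x - ym)$. The toy values of $n$ and $d$ are far below the sizes used in the analysis.

```python
import random


def integer_root(n: int, d: int) -> int:
    """Largest m with m**d <= n, found by binary search."""
    lo, hi = 1, 1
    while hi ** d <= n:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** d <= n:
            lo = mid
        else:
            hi = mid
    return lo


def base_m_digits(n: int, m: int, length: int) -> list:
    """Base-m digits of n, least significant first, padded to `length`."""
    digits = []
    while n:
        n, r = divmod(n, m)
        digits.append(r)
    return digits + [0] * (length - len(digits))


def random_perturbation(d: int, m: int, bound: int, rng: random.Random) -> list:
    """Coefficients of R, indexed by the power of x (assumed index range)."""
    coeffs = [0] * (d + 1)
    for i in range(d):
        c = rng.randint(-bound, bound)
        # c (x - m y) x^(d-i-1) y^i = c x^(d-i) y^i - c m x^(d-i-1) y^(i+1)
        coeffs[d - i] += c
        coeffs[d - i - 1] -= c * m
    return coeffs


def evaluate(coeffs: list, x: int, y: int) -> int:
    """Evaluate the homogeneous form sum_j coeffs[j] x^j y^(d-j)."""
    d = len(coeffs) - 1
    return sum(c * x ** j * y ** (d - j) for j, c in enumerate(coeffs))


if __name__ == "__main__":
    n, d = 1000003, 3                      # toy values only
    m = integer_root(n, d)                 # m**d <= n
    f_hat = base_m_digits(n, m, d + 1)     # f_hat(m, 1) = n
    R = random_perturbation(d, m, 10 ** 4, random.Random(0))
    f = [a + b for a, b in zip(f_hat, R)]
    print(m, f_hat, evaluate(f, m, 1) == n)
```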

Remark.

We will, like [14], freely switch between considering the polynomial $f(x, y)$ as a bivariate polynomial and its univariate equivalent $f(x) = f(x, 1)$.

Theorem 5.2.

Take δ, κ, σ, β, and β ′ with the deﬁned parameters. For any n , the randomizednumber ﬁeld sieve can almost surely ﬁnd an irreducible polynomial f of degree d and height atPage 73 of 114ost L n ( ), with α a root of f , n | f ( m ), and L n ( 13 , max( β, β ′ ) + o (1))distinct pairs a < | b | ≤ L n ( , σ ) such that ( a − bm ) is B -smooth and a − bα is B ′ smooth, inexpected time at most L n ( , λ ) for any λ > max( β, β ′ ) + δ − β (1 + o (1)) + σδ + κ β ′ (1 + o (1)) . In particular, the probability that the randomized number ﬁeld sieve fails to produce such a setis bounded above by L n ( , κ − δ − ) − o (1) .The key purpose for the randomization of the polynomial selection process has to do withthe sieving process over the number ﬁeld, as it will cause the norm of a pair, N ( a − bα ), tobecome a random variable in Z . This allows the consideration of smoothness over Z and Z [ α ] tobe completely independent. We will explain this idea in more detail when we look at the sievingprocess over the number ﬁeld.This also leads us to question what the quadratic characters will then look like, and this isthe second part where the RNFS diﬀers signiﬁcantly from the GNFS. Where in the GNFS wedeﬁned the quadratic characters as the Legendre symbols (cid:0) rp (cid:1) modulo a a prime p , we nowhave to account for our randomization and instead will have to choose maps from Z [ α ] into F p k stochastically and close to uniformly across all k log( p ) < L n ( ). This exponentially increasesthe sizes of the ﬁelds, but we will see they are necessary for the unconditional equidistributionof the characters. Once the sieving process is ﬁnished we are, with the exception of a set of bad f , guaranteed a reduction to a congruence of squares: Theorem 5.3.

Let

B, B ′ be with the deﬁned parameters. Let f be irreducible of degree d andheight at most L n ( , κ ), and let α be a root of f . Then for all but a L n ( , κ − δ − (1 + o (1))) − Page 74 of 114raction of the set of f , if we are given L n (cid:18) , max( β, β ′ ) (cid:19) Ω (log log n )pairs a < b ≤ L n ( ) such that a − mb is B -smooth and a − bα is B ′ -smooth, we can ﬁnd acongruence of squares modulo n in expected time at most L n (cid:18) , (cid:18) δ , β, β ′ (cid:19)(cid:19) o (1) . Let’s dive into the details.
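Before doing so, it may help to recall why a congruence of squares is the goal, and why Theorem 5.1 cannot promise a non-trivial factorization. The short Python sketch below, with the hypothetical helper name `split_from_congruence`, takes $x^2 \equiv y^2 \bmod n$ and returns a proper factor of $n$ whenever $x \not\equiv \pm y \bmod n$; the numbers are toy values chosen for illustration.

```python
from math import gcd


def split_from_congruence(x: int, y: int, n: int):
    """Given x^2 = y^2 (mod n), return a proper factor of n, or None."""
    if (x * x - y * y) % n != 0:
        raise ValueError("not a congruence of squares modulo n")
    g = gcd(x - y, n)
    return g if 1 < g < n else None


if __name__ == "__main__":
    n = 91
    print(split_from_congruence(10, 3, n))  # 10^2 = 100 = 9 (mod 91)
    print(split_from_congruence(88, 3, n))  # 88 = -3 (mod 91): trivial
```

The second call illustrates the trivial case $x \equiv -y \bmod n$, in which the gcd carries no information and another congruence has to be found.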

As mentioned, the algorithm is a GNFS-based algorithm with two major differences: how the polynomial is chosen and how the quadratic characters are chosen. To start we will look at how the polynomial is chosen, and the consequences of that choice, before proving theorem 5.2. To do this let $\beta, \beta', \kappa, \sigma, \delta, B$, and $B'$ be as defined in (5), (6), and (7).

To begin we make a few restrictions:

Definition 5.4.

Let X be the set of tuples ( f, m, n, a, b ) such that the following conditions hold:1. m ∈ Z and m ∈ h − d L n (cid:0) , δ − (cid:1) , L n (cid:0) , δ − (cid:1)i . f ∈ Z [ x, y ], deg( f ) = d = δ q log n log log n , 2 ∤ d , with coeﬃcients bounded by L (cid:0) , κ (cid:1) (1 + o (1)).Moreover let the coeﬃcients { c i } i ∈ I be drawn uniformly at random such that c i ∈ I (cid:18) L n (cid:18) , κ − δ − (cid:19)(cid:19) . a, b ∈ Z , ≤ a < | b | ∈ (cid:2) L n (cid:0) , σ (cid:1)(cid:3) , with a − bm B -smooth and f ( a, b ) B ′ -smooth.Moreover let X f,m,n = { ( a, b ) ∈ Z : ( f, m, n, a, b ) ∈ X } . The ﬁrst thing we note from thisPage 75 of 114eﬁnition and the deﬁnition of our parameters is that f ( x, y ) is not generated uniformly atrandom as it has a component solely determined by m and n , ˆ f m,n ( x, y ), which is a problem aswe try to obtain that f ( a, b ) is as likely to be B ′ -smooth as a uniformly random integer of thesame size. To mitigate this we shall use the observation that c i is much larger than m to provethat the randomized part of f ( x, y ) dominates and that f ( a, b ) therefore can be considered as auniformly distributed along any arithmetic progression of common diﬀerence ( a − mb ).Once we have achieved this we shall show that for most B -smooth moduli a − mb , the B ′ -smooth numbers are approximately uniformly distributed modulo ( a − mb ). It is only then thatwe can show that f ( a, b ) is as likely to be B ′ -smooth as a random integer of the same size.Once we have shown this we shall venture to prove that there are suﬃcient ( a, b )-pairs such that a − mb is B -smooth and f ( a, b ) is B ′ -smooth, which will lead to a proof of theorem 5.2. A keyin this is the observation that f ( a, b ) lies on the arithmetic progression given by (cid:26) ˆ f m,n ( a, b ) + ( a − mb ) z : | z | ≤ dL n (cid:18) , κ − δ − (cid:19) b d (cid:27) . 
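The congruence behind this arithmetic progression can be checked numerically. The sketch below assumes that $\hat f_{m,n}$ is the homogenised base-$m$ expansion of $n$ and that the random coefficients enter through $(x - my)\sum_i c_i x^{d-1-i} y^i$; both are assumptions made for illustration, and the helper names and toy sizes are hypothetical.

```python
import random

def base_m_digits(n, m):
    """Digits n_0, n_1, ... of n in base m, least significant first."""
    digits = []
    while n:
        n, r = divmod(n, m)
        digits.append(r)
    return digits

def f_hat(digits, x, y):
    """Homogenised base-m polynomial sum_i n_i x^i y^(d-i); f_hat(m, 1) = n."""
    d = len(digits) - 1
    return sum(c * x**i * y**(d - i) for i, c in enumerate(digits))

def random_part(coeffs, x, y):
    """R(x, y) = sum_i c_i x^(d-1-i) y^i, homogeneous of degree d - 1."""
    d = len(coeffs)
    return sum(c * x**(d - 1 - i) * y**i for i, c in enumerate(coeffs))

def f(digits, coeffs, m, x, y):
    """f = f_hat + (x - m*y) * R, so that f(m, 1) = n still holds."""
    return f_hat(digits, x, y) + (x - m * y) * random_part(coeffs, x, y)

random.seed(0)
n, m = 1000003, 101                      # toy values, far below cryptographic size
digits = base_m_digits(n, m)             # degree d = len(digits) - 1
coeffs = [random.randint(-50, 50) for _ in range(len(digits) - 1)]
a, b = 7, 12
r = a - m * b                            # common difference of the progression

assert f(digits, coeffs, m, m, 1) == n   # n divides f(m, 1)
assert (f(digits, coeffs, m, a, b) - f_hat(digits, a, b)) % r == 0
```

Varying `coeffs` moves $f(a, b)$ along the progression while leaving its residue modulo $a - mb$ fixed, which is the structure the argument exploits.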
Continuing our exposition using the definitions and parameters we have set up until now, we recall the following:
$$\forall i \in \{0, \ldots, d-1\} : (c_i) \sim \mu = U\left(I\left(L_n\left(\tfrac{2}{3}, \kappa - \delta^{-1}\right)\right)^d\right)$$
and denote $c = (c_i)$ to be the coefficient vector of $f$.

Note that for $f(a, b)$ to be $B'$-smooth it is necessary for $a - bm$ to be $B$-smooth, however this is not something we can simply assume. Therefore we consider the definition of $f(x, y)$ and note that
$$f(a, b) \equiv \hat f_{m,n}(a, b) \bmod a - bm.$$
As $\gcd(a, b)^d \mid f(a, b)\,\hat f_{m,n}(a, b)$ we have that $\gcd(a, b)^d (a - bm) \mid R(a, b)$. Now let $a$ and $b$ be uniform in their ranges, then we can give an explicit description of the probability that $a - mb$ is $B$-smooth:

Proposition 5.5.

Fix $b$ in its interval. If $a, m$ are uniformly random, then:
$$P_{a,m}(a - bm \text{ is } B\text{-smooth}) = L_n\left(\tfrac{1}{3}, \frac{\delta^{-1}}{3\beta}(1 + o(1))\right)^{-1}.$$

Proof.

We ﬁx b . Note that a is uniformly random on an interval of length b , and m is uniformlyrandom over an interval of length comparable to its largest value. In particular: a − bm ∼ U (cid:20) − bL n (cid:18) , δ − (cid:19) , − b (cid:18) − d L n (cid:18) , δ − (cid:19) + 1 (cid:19)(cid:19) = U (cid:2) − x (cid:0) z − (cid:1) , − x (cid:1) , for x = L n (cid:0) , δ − (1 + o (1)) (cid:1) and z ≈ d − O ( d − ). Note that d = log (cid:16) B (cid:17) , and thatlog log B = O (log log x ). Hence from theorem 4.36 the number of smooth values of a − mb is:Ψ( x, B ) z (cid:18) O (cid:18) log( u + 1)log B (cid:19)(cid:19) . Since the range of values is of length xz , P a,m ( a − bm is B -smooth) = ρ ( x, B ) (cid:18) O (cid:18) log( u + 1)log B (cid:19)(cid:19) . Recall that B = L n (cid:0) , β (cid:1) and x = L n (cid:0) , δ − (cid:1) − . Furthermore, note that log u < log log n = o (log B ). Hence by corollary 4.35 ρ L n (cid:18) , δ − (cid:19) o (1) , B ! = L n (cid:18) , δ − β (cid:19) − o (1) . Page 77 of 114bsorbing the multiplicative 1 + o (1) into the o (1) error term in the exponent to obtain P a,m ( a − bm is B -smooth) = L n (cid:18) , δ − β (1 + o (1)) (cid:19) − . Now we can ﬁx the residue of f ( a, b ) mod a − mb and consider only how the randomnessof our coeﬃcients c i aﬀect the polynomial. Therefore we will also want to be explicit aboutthe coeﬃcient vector, c = ( c i ) i ∈{ ,...,d − } , for f ( x, y ), hence we will, when when we need to beexplicit, denote f ( x, y ) = f c ( x, y ).Now we are set up to show that f ( a, b ) can be considered as uniformly distributed along pro-gressions of common diﬀerence ( a − mb ). Lemma 5.6.

Let $a < b$, with $\gcd(a, b) = 1$, and define $\varphi = \varphi_{a,b} : \mathbb{Z}^d \to \mathbb{Z}$ by
$$\varphi((v_0, \ldots, v_{d-1})) = \sum_{i=0}^{d-1} v_i a^{d-i-1} b^i.$$
There exists a set $S \subseteq I\left(L_n\left(\tfrac{1}{3}, \sigma\right)\right)^d$ such that $\varphi$ bijects $S$ and $I\left(b^{d-1}\right)$.

Proof.

For each $i \ge 0$, we claim that for any $|t| \le b^i + a^{i+1}$ there exists a representation
$$t = a^i x_0 + a^{i-1} b x_1 + \ldots + b^i x_i,$$
with $|x_0|, \ldots, |x_i| \le a + b$. We proceed by induction on the number of terms. If $i = 0$ then the conclusion is trivial, as $|t| \le b^0 + a = a + 1$ and so $t = a^0 x_0 = x_0$ with $|x_0| \le a + 1 \le a + b$. Now let $i > 0$; then we may choose $|y| < a$ such that $\left|t - y a^i\right| \le b^i$. Letting $z \in \{0, \ldots, b-1\}$ be such that $z a^i \equiv t - y a^i \bmod b$, we can set $x_0 = y + z$, and as $|y| < a$ and $|z| < b$ it follows that $|x_0| \le a + b$. By construction $b \mid t - x_0 a^i$ and
$$\left|\frac{t - x_0 a^i}{b}\right| = \left|\frac{(t - y a^i) - z a^i}{b}\right| \le b^{i-1} + a^i.$$
Hence by the induction hypothesis such a representation exists for every $t$ with $|t| \le b^i + a^{i+1}$.

Now we show the existence of $S$ directly. Let $t \in I(b^{d-1})$; then $|t| \le b^{d-1}$ and so the conditions of the above hold with $i = d - 1$. Hence there exists a sequence $(x_i)_{i \in \{0, \ldots, d-1\}}$ such that
$$\sum_{i=0}^{d-1} x_i a^{d-i-1} b^i = t,$$
with $|x_i| < a + b$ for all $i$. Hence the vector given by the sequence, $x_t = (x_i)$, fulfills $\varphi(x_t) = t$. Moreover, $x_t \in I(2(a + b))^d$ and by Defn 2.4.3 this means $x_t \in I\left(L_n\left(\tfrac{1}{3}, \sigma\right)\right)^d$. Hence, taking $S = \{x_t : t \in I\left(b^{d-1}\right)\}$ gives the construction of a single vector $x_t$ for each $t$, hence $\varphi$ is bijective on $S$.

Remark.

Note that we have presupposed that $\gcd(a, b) = 1$, but it is a simple exercise to show that this costs no generality. By the definition of $(a, b) \in X_{f,m,n}$ we know that $a - mb$ is $B$-smooth, hence $\gcd(a, b)$ is $B$-smooth. Since we will only be interested in looking at $B$-smoothness to show the uniform randomness of $f(x, y)$, we can assume without loss of generality that $\gcd(a, b) = 1$, as division of two $B$-smooth numbers preserves $B$-smoothness.

We can now reformulate $f(x, y)$ using the map $\varphi$. For this consider $m, n$ fixed and observe that:
$$f(a, b) = f_c(a, b) = f_{c'}(a, b) + (a - mb)\,\varphi_{(a,b)}(c - c').$$
This motivates the following:

Deﬁnition 5.7.

Let $S$ be the set given by lemma 5.6. For any $l \le b^{d-1}$, we define a set $S_l$ and a measure $\nu_l$ as follows:
$$S_l = \{v \in S \mid \varphi(v) \in I(l)\}, \qquad \nu_l = U(S_l).$$
In particular, $\nu_l$ gives a uniformly random element of $S$ whose image under $\varphi$ is in $I(l)$.

It is clear that if $v \sim \nu_l$, then we can write
$$f_v(a, b) \sim f(a, b) + (a - mb)\,U(I(l)).$$
This lets us view the measures $\nu_l$, with support $S_l$, as giving additive alterations to $c$ which change $f(a, b)$ additively by $a - mb$ times a uniformly random value. This means that if $f$ is indeed uniformly random, then we can replace the randomness of $c$ over cosets of $S_l$ with randomness of $f_c(a, b)$ over arithmetic progressions.

Definition 5.8.

For $\bar\mu : X \to \mathbb{R}^+$ a measure and $F : X \to Y$ a function, we define the measure $F\bar\mu : Y \to \mathbb{R}^+$ given by
$$F\bar\mu(y) = \bar\mu\left(\{F^{-1}(y)\}\right) = \sum_{x : F(x) = y} \bar\mu(x).$$

Remark.

It will be left to the reader to prove that this is in fact a measure. Moreover note that if $F$ is bijective then $F\bar\mu(y) = \bar\mu(x)$, for some unique $x \in X$.

Now consider $c \sim \nu_l$ such that we can write
$$f_c(a, b) \sim f(a, b) + (a - mb)\,U(I(l)).$$
As $\mu$ is a uniform distribution on a cube of side $2L_n\left(\tfrac{2}{3}, \kappa - \delta^{-1}\right)$ and we have a bijective function $\varphi : S \to I(b^{d-1})$, we can now show that $\varphi\mu$ is actually close to a convolution of uniform distributions on intervals. In fact, by convolving with several distributions, $\nu_{l_i}$, we can treat each coefficient in $f$ as if it were independent and uniformly random. This is made rigorous in the following proposition:

Proposition 5.9. Fix $a, b$. There is a distribution $\vartheta$ such that $\vartheta$ is the convolution of uniform distributions on intervals of lengths $L_n\left(\tfrac{2}{3}, \kappa - \delta^{-1}\right) a^i b^{d-1-i}$ for $i \in \{0, \ldots, d-1\}$ with
$$\|\varphi\mu - \vartheta\| = O\left(L_n\left(\tfrac{2}{3}, (\kappa - \delta^{-1})(1 + o(1))\right)^{-1}\right),$$
and
$$|E(\vartheta)| \le \sum_{i=0}^{d-1} a^i b^{d-i-1}.$$

Proof.

Recall that the convolution of distributions, given by ⋆ , is deﬁned by Deﬁnition 4.42 anddeﬁne ν = µ ⋆ (cid:2) ⋆ d − i =0 ν a i b d − i − (cid:3) . From lemma 5.6, the support of ν a i b d − i − is contained in a cube of side 4 L n (cid:0) , σ (cid:1) . Hencethe support of P of ⋆ d − i =0 ν a i b d − i − is contained in a cube of side 4 dL n (cid:0) , σ (cid:1) . When k x k ∞

Recall from Definition 4.32
$$\Psi(x, y, r, a) = |\{z \in \mathbb{N} : z < x,\ z \text{ is } y\text{-smooth},\ z \equiv a \bmod r\}|$$
and
$$\Psi_r(x, y) = |\{z \in \mathbb{N} : z < x,\ z \text{ is } y\text{-smooth},\ \gcd(z, r) = 1\}|.$$
We will often suppress the $\epsilon$ as we are only interested in an error exponent up to $o(1)$. Moreover, we will say $r$ is $B$-bad for $F$ if it is not $B$-good for $F$, and that $r$ is $B$-good (dropping the $F$) if $r$ is $B$-good for all $F \in \left[L_n(\ )\,\omega^{-1}, L_n(\ )\right]$ for $\omega = L_n(\ )$.

With this notion we can start constraining the behaviour of $f(a, b) \bmod a - mb$. For this we will need to characterise the moduli for which the smooth numbers are uniformly distributed across their residue classes. Once we have that, it doesn't matter which residue class $f_c(a, b)$ lies in, as it does not affect its probability of being smooth as $c$ varies.

From Section 4.3 we have a result of [30] proving that, under some conditions, $y$-smooth numbers $\le x$ are uniformly distributed over the $\phi(q)$ residue classes $a \bmod q$ with $\gcd(a, q) = 1$. However, the conditions prevent us from concluding uniform distribution directly; hence if we can show that the distribution of $B'$-good numbers is close enough to uniform, assuming $a - mb$ is $B$-smooth, to be considered as such, then we are done. To do this we consider a Bombieri–Vinogradov style theorem proposed by Harper, [23]:

Theorem 5.11. Let $c$ and $K$ be fixed and effective constants. Then for any $\log^K F < B < F$ with $u = \frac{\log F}{\log B}$, and $Q \le \sqrt{\Psi(F, B)}$:
$$\sum_{r \le Q} \max_{(s, r) = 1} \left|\Psi(F, B, r, s) - \frac{\Psi_r(F, B)}{\phi(r)}\right| \ll \Psi(F, B)\left(e^{-\frac{cu}{\log^2(u)}} + B^{-c}\right) + Q\sqrt{\Psi(F, B)}\,\log^{7/2} F,$$
with an implied effective constant $C = C(c, K)$.

Remark.

We remark that the definition of "effective" differs per application. For our purpose we may simply claim that $c < K$ are chosen fittingly for the to-factor $n$.

If we consider
$$Q_{\max} = \max_{b,m} |a + bm| = L_n\left(\tfrac{2}{3}, \delta^{-1}(1 + o(1))\right),$$
we can reduce our question to the equivalent one of bounding the probability that a $B$-smooth modulus less than $Q_{\max}$ is $B'$-bad. This would immediately follow from showing that the number of $B'$-bad moduli below the $Q$-bound is much smaller than $\Psi(Q_{\max}, B)$. We will, for our specific needs, bound the sum in theorem 5.11, as the common difference in the arithmetic progression is known to be $y$-smooth. For this let $q = a - mb$, then:

Proposition 5.12.

Let ǫ >

K, c such that forany log K x < y < x x with u = log x log y , x ǫ ≤ Q ≤ p Ψ( x, y ) and ω = ω (1) with ω = y O (1) : X r ∈ [ Qω − ,Q ] r is y -smooth max gcd( a,r )=1 (cid:12)(cid:12)(cid:12)(cid:12) Ψ( x, y, r, a ) − Ψ r ( x, y ) φ ( r ) (cid:12)(cid:12)(cid:12)(cid:12) ≪ Ψ( x, y ) ρ ( Q, y ) (cid:16) e − cu log2 u + y − c (cid:17) + Q p Ψ( x, y ) log / x, for some eﬀective implied constant. Remark.

The proof for this is mostly an exercise in restating Harper's [23] and is therefore omitted. The full proof can be found in [14].

Now we return to what we set out to prove: for most $B$-smooth moduli $a - mb$, the $B'$-smooth numbers are uniformly distributed. This follows immediately from the following two propositions:

Proposition 5.13.

Fix $a, b, m, n$ in their intervals and let $f$ be uniformly random as before. Then:
$$P_f\left(f(a, b) \text{ is } B'\text{-smooth} \mid (a - mb) \text{ is } B'\text{-good}\right) = L_n\left(\tfrac{1}{3}, \frac{\kappa + \sigma\delta}{3\beta'}(1 + o(1))\right)^{-1}.$$

Proof.

Let a − bm = r . By proposition 5.9: P n,f ( f n,m ( a, b ) is B ′ -smooth) = P n,c (cid:16) ˆ f m,n ( a, b ) + rϕ ( a,b ) ( c ) is B ′ -smooth (cid:17) = P n,ϑ (cid:16) ˆ f m,n ( a, b ) + rϑ is B ′ -smooth (cid:17) + O L n (cid:18) , ( κ − δ − )(1 + o (1)) (cid:19) − ! . By the deﬁnition of the parameters we have κ > δ − and therefore κ − δ − >

0. Now recall,from proposition 5.9, ϑ has | E ( ϑ ) | ≤ P d − i =0 a i b d − i − and is sampled according to the convolutionof uniform measures on intervals of length L n (cid:0) , κ − δ − (cid:1) a i b d − i − for i = 0 , . . . , d −

1. Hence ϑ is unimodal with mode at some M satisfying | M | ≤ d − X i =0 a i b d − i − < db d − i − = L n (cid:18) , σδ (1 + o (1)) (cid:19) . Now deﬁne F max = L n (cid:18) , κ − δ − (cid:19) ( a − mb ) d − X i =0 a i b d − i − , then the support of ϑ is contained in (cid:2) M − F max | r | − , M + F max | r | − (cid:3) . We choose an ω = L n (cid:0) , o (1) (cid:1) , such that ω → ∞ , and set Y = L n (cid:16) e , κ − δ − (cid:17) b d − ω − = L n (cid:18) , κ − δ − + σδ − o (1) (cid:19) . Page 85 of 114ow we deﬁne a measure ϑ ′ to be ϑ ′ ( x ) =  ϑ (max( x, Y )) x ≥ ϑ (min( x, − Y )) x < . Then k ϑ ′ − ϑ k ≤ O z ∼ ϑ ( | z | < Y ) ≤ Y (cid:18) L n (cid:18) , κ − δ − (cid:19) b d − (cid:19) − = 2 ω − , from the deﬁnition of Y . Note that Y is much larger than M and so ϑ ′ is monotone decreasingaway from 0; hence there are non-negative weights W y for y ∈ Z , with W y = 0 for | y | > F max | r | − such that: ϑ ′ = X y ≥ Y W y U ([0 , y ]) + W − y U ([ − y, , and (cid:12)(cid:12)(cid:12) − P y W y (cid:12)(cid:12)(cid:12) ≤ ω − . Hence we have: P f ( f m,n ( a, b ) is B ′ -smooth) = O ( ω ) − + F max | r | − X y = Y W y P (cid:16) ˆ f m,n ( a, b ) + r U ([0 , y ]) is B ′ -smooth (cid:17) + W − y P (cid:16) ˆ f m,n ( a, b ) + r U ([ − y, B ′ -smooth (cid:17) . We note that O ( ω − ) = L n (cid:0) , o (1) (cid:1) − terms can be absorbed into our o (1) terms, and so itsuﬃces to show that for any ﬁxed, B ′ -good r and any y ∈ (cid:2) Y, F max | r | − (cid:3) : P (cid:16) ˆ f m,n ( a, b ) + r U ([0 , y ]) is B ′ -smooth (cid:17) = L n (cid:18) , κ + σδ β ′ (cid:19) − o (1) , P (cid:16) ˆ f m,n ( a, b ) + r U ([ − y, B ′ -smooth (cid:17) = L n (cid:18) , κ + σδ β ′ (cid:19) − o (1) . Since (cid:12)(cid:12)(cid:12) ˆ f m,n ( a, b ) (cid:12)(cid:12)(cid:12) ≤ ˆ F max := Y L n (cid:0) (cid:1) − , we can absorb the probability that the value on the leftis negative or positive (respectively) in the above two equations. 
From the deﬁnition of B ′ -goodPage 86 of 114nd corollary 4.39 , for any x ∈ h | r | Y − ˆ F max , F max + ˆ F max i :Ψ( x, B ′ , r, s ) = Φ r ( x, B ′ ) φ ( r ) L n (cid:18) , o (1) (cid:19) = Ψ( x, B ′ ) r L n (cid:18) , o (1) (cid:19) , and so to ﬁnish the estimate we observe that for any x ∈ h | r | Y − ˆ F max , F max + ˆ F max i : ρ ( x, B ′ ) = ρ (cid:18) L n (cid:18) , κ + σδ (cid:19) , B ′ (cid:19) = L n (cid:18) , κ + σδ β ′ (cid:19) − o (1) . Now there is only one thing left to do: Show that a − bm is B ′ -good signiﬁcantly often when a − bm is B -smooth, which we have assumed from the start and have a signiﬁcant probabilityfor from proposition 5.5. Proposition 5.14.

Fix any $b$. Then
$$P_{a,m}(a - mb \text{ is } B'\text{-good} \mid a - mb \text{ is } B\text{-smooth}) = 1 - o(1).$$

Proof.

We begin by bounding the number of moduli which are F -bad for some F ∈ " F max L n (cid:18) (cid:19) − , F max . We ﬁx ω = B ′ for concreteness. Observe that Ψ( F, B ′ ) = F L n (cid:0) (cid:1) − . Since L n (cid:0) (cid:1) = ω (cid:0) L n (cid:0) (cid:1)(cid:1) .Q ≤ p Ψ (

F, B ′ ) L n (cid:18) , ǫ (cid:19) − . Furthermore, for any K ﬁxed, ω (cid:0) log K F (cid:1) = B ′ = o (cid:16) F F (cid:17) . Hence we can apply proposition5.12. Suppose that a modulus r is B -smooth and also B ′ -bad for F . Then for some residue a Page 87 of 114ith gcd ( a, r ) = 1, the contribution to the LHS of proposition 5.12 for this r is at leastΨ( F, B ′ ) φ ( r ) (1 + o (1)) = Ψ( F, B ′ ) r ≥ Ψ( F, B ′ ) Q , where for the ﬁrst equality we use corollary 4.39, noting that B ≤ B ′ so r is B ′ -smooth, u < log log n and the number of divisors of r sis bounded by log r so that the multiplicative error is1 + o (1). Now X r ∈ [ Q max ω − ,Q max ] r is y -smooth max gcd( a,r )=1 (cid:12)(cid:12)(cid:12)(cid:12) Ψ( F, B ′ , r, a ) − Ψ r ( F, B ′ ) φ ( r ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ C Ψ( F, B ′ ) ρ ( Q max , B ′ ) (cid:18) e − cu ′ log2 u + B ′− c (cid:19) + Q max p Ψ( F, B ′ ) log / F = CF ρ ( F, B ′ ) ρ ( Q max , B ′ ) (cid:18) e − cu ′ log2 u + B ′− c (cid:19) + Q max F ρ ( F c, B ′ ) log / F. First, we observe that F and Q max are L n (23), whilst B ′ is L n (cid:0) (cid:1) . Hence both densities ρ ( Q max , B ′ ) and ρ ( F, B ′ ) are L n (cid:0) (cid:1) − . From the deﬁnition of Q max and F max we have Q max = L n (cid:18) , δ − (1 + o (1) (cid:19) , F max = L n (cid:18) , ( κ + σδ ) (1 + o (1) (cid:19) , and from the parameters deﬁned in equation (2) we have 2 δ − < κ + δσ . Hence F Q − = L n (cid:0) (cid:1) .Since up to order L n (cid:0) (cid:1) o (1) , the ﬁrst term is F and the second is Q max F , we deduce that the ﬁrstterm dominates the second. If r is B ′ -bad for F it contributes at least Ψ( F, B ′ ) Q − (1 + o (1))to the sum on the left hand side. Hence the number of moduli which are in [ Q max ω − , Q max ],are B -smooth and B ′ -bad for F is at most Q max Ψ( F, B ′ ) Ψ( F, B ′ ) ρ ( Q max , B ′ ) ( C + o (1)) (cid:18) e − cu ′ log2 u ′ + B ′− c (cid:19) , = Ψ ( Q max , B ′ ) ( C + o (1)) (cid:18) e − cu ′ log2 u ′ + B ′− c (cid:19) . 
Page 88 of 114f a modulus is B ′ -bad near F max , it must be B ′ -bad for some F ∈ (cid:26) F max L n (cid:18) (cid:19) − , F max (cid:27) ∪ (cid:26) i : 2 i ∈ (cid:20) F max L n (cid:18) (cid:19) , F max (cid:21) (cid:27) , which is a set of logarithmic size. We can absorb a logarithmic factor into the constants c, C , sothe number of moduli which are in [ Q max ω − , Q max ], are B -smooth and B ′ -bad is at most:Ψ( Q max , B ′ ) ( C + o (1)) (cid:18) e − cu ′ log2 u ′ + B ′− c (cid:19) = o (Ψ ( Q max B ′ )) . Hence even assuming that every B -smooth number below Q max ω − is B ′ -bad gives: P a,m ( a − mb is B ′ -good | a − mb is B -smooth) ≥ − Ψ ( Q max ω − , B ′ ) + Ψ ( Q max , B ′ ))Ψ ( Q max , B ′ ) = 1 − o (1) . Hence the probability of a B -smooth modulus to be B ′ -good is nearing 1 and the probabilityfor f ( a, b ) to be B ′ -smooth given that a − mb is a B -smooth modulus is comparable to a randominteger of similar size to be B ′ -smooth under the same conditions. Hence we can now provetheorem 5.2. Now that we have shown that the changes that randomization causes are controllable we are ableto conclude that they are also suﬃcient to give a provable complexity. For this let us quicklyreﬂect on our choice compared to the general number ﬁeld sieve. Consider ( f, m, n, a, b ) ∈ X andlet α ∈ C be a root of the ﬁrst coordinate of f , f ( α,

1) = 0. Then there is a ring homomorphism
$$\mathbb{Z}[\alpha] \to \mathbb{Z}/n\mathbb{Z}, \qquad (a - b\alpha) \mapsto (a - bm) \bmod n.$$
From the field norm we also have a multiplicative map $\mathcal{O}_{\mathbb{Q}(\alpha)} \to \mathbb{Z}$, as we would expect from the general number field sieve. Moreover, denoting by $f_d$ the leading coefficient of $f$, the field norm is guaranteed to be in $\tfrac{1}{f_d}\mathbb{Z}$ for any element of $\mathbb{Z} + \alpha\mathbb{Z}$. Hence the only thing that we can't be sure about is that $f$ is irreducible, as $f$ is chosen uniformly at random. If $f$ is irreducible then we can apply the strategy from the general number field sieve to obtain a congruence of squares modulo $n$.

Lemma 5.15.
$$P(f \text{ is reducible}) \le L_n\left(\tfrac{2}{3}, \frac{\kappa - \delta^{-1} + o(1)}{3}\right)^{-1}.$$

Proof.

Fix m, n and let H = 2 L n (cid:0) , κ − δ − (cid:1) be the range of each coeﬃcient of the random partof our polynomial f . Note that if a polynomial over Z is reducible if it is reducible modulo everyprime. Hence if we bound the number of reducible polynomials modulo F p for each prime p andbound how often a polynomial is reducible modulo several primes p , we can get good bounds onthe number of irreducible polynomials f . We count the reducible polynomials f with the Tur´ansieve, [11]. Let A = { ( c d − , . . . , c ) ∈ Z d , | c i | < H } which we equate with the set of polynomi-als f ( x, y ) = ˆ f m,n ( x, y ) + R ( x, y ) with f and ˆ f m,n are both homogeneous of degree d with ( c i )the coeﬃcients of R . For any prime r , let A r correspond to the subset of A corresponding toirreducible polynomials mod r . Note that for any f to correspond to g mod r we must have( x − my ) | ˆ f m,n − g ∈ F r [ x, y ] or equivalently g ( m, q ) ≡ n mod r . We do not insist that G ismonic, although any irreducible g must be a scalar multiple of a monic irreducible. To estimatethe number of irreducibles we accept the following fact, [[7], Chapter 2]:For any 0 < i < r , the number of monic irreducible g of degree D such that g ( m ) ≡ i mod r is r D − D ( D − + O ( r D/ ).Note that for any g over F r , such that g ( m ) ≡ n with r ≪ √ H , there are (cid:18) Hr + O (1) (cid:19) d = (cid:18) Hr (cid:19) d + O (cid:18) Hr (cid:19) d − ! Page 90 of 114olynomials lying over g in A , and none if g ( m ) n mod r . Hence by a union bound over theirreducibles mod r : |A r | ≤ H d d ( d −

1) + O (cid:18) H d r d − (cid:19) + O (cid:0) H d − r (cid:1) , |A r ∩ A r ′ | ≤ H d d ( d −

1) + O (cid:18) H d r d − (cid:19) + O (cid:18) H d r ′ d − (cid:19) + O ( H d − rr ′ ) . From the Tur´an sieve, considering all primes r ≤ p for any p much smaller than √ H , the numberof reducible polynomials f is much smaller than H d p − log p + H d − p , which for p ∼ H / log / H is H d − log H . Then the number of potential f for ﬁxed n, m is H d , and so the probabilitythat f is reducible is at most H − log H = L n (cid:18) , κ − δ − + o (1)3 (cid:19) − . Remark. If f is reducible we immediately assume that the algorithm fails, hence by sampling atmost L n (cid:0) (cid:1) polynomials the probability that any of them are reducible is o (1).Now assume that f is irreducible. Then the following theorem gives us the last ingredientwe need to prove theorem 5.2: Theorem 5.16.

With $\beta = \beta'$, $\delta$, $\sigma$, $\kappa$ chosen subject to (5), (6), (7), and Definition 5.4:
$$E_{m,f}(|X_{f,m,n}|) \ge L_n\left(\tfrac{1}{3}, \tau\right),$$
with
$$\tau = 2\sigma - \frac{\delta^{-1}}{3\beta'}(1 + o(1)) - \frac{\sigma\delta + \kappa}{3\beta}(1 + o(1)).$$

Proof.

As proposition 5.5 and proposition 5.12 randomise over $a, m$ for any fixed $b$, and uniformly over $n, f$, for any $b, n, f$:
$$P_{a,m}(a - bm \text{ is } B\text{-smooth and } B'\text{-good}) = L_n\left(\tfrac{1}{3}, \frac{\delta^{-1}}{3\beta}(1 + o(1))\right)^{-1}.$$
Since proposition 5.11 randomizes over $f$ for any fixed $a, b, m$, we have for each fixed $b$ that
$$P_{a,m,f}(a - bm \text{ is } B\text{-smooth and } B'\text{-good},\ f(a, b) \text{ is } B'\text{-smooth}) = L_n\left(\tfrac{1}{3}, \frac{\delta^{-1}}{3\beta} + \frac{\kappa + \sigma\delta}{3\beta'}\right)^{-1 + o(1)},$$
as multiplicative factors of $1 + o(1)$ may be absorbed into the $o(1)$ in the exponent of the $L_n\left(\tfrac{1}{3}\right)$ terms. Summing over the $L_n\left(\tfrac{1}{3}, \sigma\right)$ choices for a fixed pair $(a, b)$:
$$\begin{aligned}
E_{m,f}(|X_{f,m,n}|) &= \sum_{a,b} P_{f,m,n}((f, m, n, a, b) \in X) \\
&= L_n\left(\tfrac{1}{3}, \sigma\right) \sum_b P_{f,m,n,a}\left[(a - bm) \text{ is } B\text{-smooth} \wedge f(a, b) \text{ is } B'\text{-smooth}\right] \\
&\ge L_n\left(\tfrac{1}{3}, \sigma\right) \sum_b P_{f,m,n,a}\left[(a - bm) \text{ is } B\text{-smooth} \wedge (a - bm) \text{ is } B'\text{-good} \wedge f(a, b) \text{ is } B'\text{-smooth}\right] \\
&\ge L_n\left(\tfrac{1}{3}, 2\sigma - \frac{\delta^{-1}}{3\beta}(1 + o(1)) - \frac{\sigma\delta + \kappa}{3\beta'}(1 + o(1))\right).
\end{aligned}$$

Proof of theorem 5.2. Let $\tau = 2\sigma - \frac{\delta^{-1}}{3\beta'} - \frac{\sigma\delta + \kappa}{3\beta}$, and note that:
$$\lambda \ge \max(\beta, \beta') + \frac{\delta^{-1}(1 + o(1))}{3\beta} + \frac{(\sigma\delta + \kappa)(1 + o(1))}{3\beta'} = \max(\beta, \beta') + 2\sigma - \tau + o(1).$$
For any fixed pair $(m, f)$, we can use the hyper-elliptic curve method to examine any pair $(a, b)$ for a suitable smoothness of $a - mb$ and $f(a, b)$ in $\max(B, B')^{o(1)}$ time. Hence we can determine whether a pair $(a, b)$ is in $X_{f,m,n}$ in time $L_n\left(\tfrac{1}{3}, o(1)\right)$.

By lemma 5.15 the probability that $f$ is reducible is $L_n(\ )$, and we have an unconditional uniform bound $|X_{f,m,n}| \le L_n\left(\tfrac{1}{3}, 2\sigma\right)$. Hence from theorem 5.16 we deduce that
$$E_{f,m}\left(|X_{f,m,n}| \mid f \text{ irreducible}\right) \ge L_n\left(\tfrac{1}{3}, \tau + o(1)\right).$$
By using lemma 4.45 we can use stochastic deepening to consider $|X_{f,m,n}|$ as a random variable of $(f, m)$, with $K \le L_n\left(\tfrac{1}{3}, 2\sigma - \tau\right)$.
Hence for some $j \le \lceil \log K \rceil = O\left(\log^{1/3} n\,(\log\log n)^{2/3}\right)$, we have
$$P_{f,m}\left(|X_{f,m,n}| \ge 2^j L_n\left(\tfrac{1}{3}, \tau + o(1)\right)\right) > 2^{-j},$$
absorbing logarithmic terms where appropriate.

Now consider the following: to find a collection of pairs the algorithm iterates through each $i \in \{0, \ldots, \lceil \log K \rceil\}$, and for each $i$ generates $2^i$ pairs $(f, m)$, and for each pair $(f, m)$ generates $2^{-i} L_n\left(\tfrac{1}{3}, \max(\beta, \beta') + 2\sigma - \tau + o(1)\right)$ pairs $(a, b)$ and tests for smoothness of $a - mb$ and $f(a, b)$. So if $|X_{f,m,n}| > 2^i L_n\left(\tfrac{1}{3}, \tau + o(1)\right)$ then the algorithm finds $L_n\left(\tfrac{1}{3}, \max(\beta, \beta') + o(1)\right)$ pairs as required. Furthermore if
$$P_{f,m}\left(|X_{f,m,n}| \ge 2^i L_n\left(\tfrac{1}{3}, \tau + o(1)\right)\right) > 2^{-i},$$
then with constant probability at least one of the pairs $(m, f)$ satisfies this condition.

Since the time taken for a single $i$ is $L_n\left(\tfrac{1}{3}, \max(\beta, \beta') + 2\sigma - \tau + o(1)\right)$ we can absorb the logarithmic number of iterations into the $o(1)$ term. Hence, iterating it at most a logarithmic number of times reduces the probability of failure to $L_n\left(\tfrac{2}{3}, \kappa - \delta^{-1}\right)^{-1 + o(1)}$. Hence the expected time taken to complete the algorithm is
$$L_n\left(\tfrac{1}{3}, \max(\beta, \beta') + 2\sigma - \tau + o(1)\right).$$

5.3 Algebraic obstructions Vol. 2: The congruence

Let $S$ be the set of $(a, b)$ pairs such that $a - mb$ is $B$-smooth and $f(a, b)$ is $B'$-smooth. The previous section provides a way to find a sufficient number of $(a, b)$ pairs in a sufficiently short time, which we will now assume to be fulfilled. Identically to the General Number Field Sieve algorithm we now start on the next procedure: sieving for the congruence. In Chapter 3 we were interested in finding a subset $S_i$ such that
$$f'(m) \prod_{(a,b) \in S_i} (a - bm) \text{ is a square in } \mathbb{Z}, \qquad f'(\alpha) \prod_{(a,b) \in S_i} (a - b\alpha) \text{ is a square in } \mathbb{Z}[\alpha].$$
In the RNFS there will be a similar approach, but recall that we used particularly broad observations to avoid having to work with the ring of algebraic integers $\mathcal{O}_{\mathbb{Q}(\alpha)}$, for which the structure might be unknown or especially difficult. In this section we will not only tackle the procedure to find these squares, but in particular how we can avoid algebraic obstructions similar to those we saw in section 3.3.

One of the first things to observe is that, given a set $S$ such that $a - bm$ is $B$-smooth for every $(a, b)$-pair, there is no reason for us not to use the standard method from the general number field sieve to find a square. Given an element $z \in \mathbb{Z}$ with prime factorization $\prod_{i=1}^{d} p_i^{\eta_i}$ it is easy to check if it is a square by ensuring that $\forall i \in \{1, \ldots, d\} : 2 \mid \eta_i$. So given the factorizations of $a - mb$ for $B + 1$ $(a, b)$-pairs we can find a dependent subset in the same way we did with the general number field sieve in Step 3.

Once again it would be optimal if we could approach the $\mathbb{Z}[\alpha]$ problem in the same way, but as we saw in Chapter 3 this requires some justification. This justification is only made more complex as we are now dealing with a uniformly random polynomial. We start in much the same fashion as we did with the general number field sieve by defining a set of pairs $(p, r)$ such that $f(r, 1) \equiv 0 \bmod p$ for prime $p$ and $r \in \{0, \ldots, p - 1\}$.
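The parity check and the search for a dependent subset on the rational side can be sketched as follows. This is a minimal illustration that uses trial division and Gaussian elimination over $\mathbb{F}_2$ on bitmask vectors; the function names and the toy inputs are hypothetical, signs are ignored, and a real implementation would rely on sieving and sparse linear algebra instead.

```python
from math import isqrt, prod

def exponent_parities(z, primes):
    """Bitmask of exponent parities of |z| over `primes`; None if |z| is not smooth."""
    z, mask = abs(z), 0
    for i, p in enumerate(primes):
        while z % p == 0:
            z //= p
            mask ^= 1 << i
    return mask if z == 1 else None

def find_square_subset(values, primes):
    """Indices of a non-empty subset of `values` whose product is a square, or None."""
    basis = {}  # pivot bit -> (reduced parity vector, bitmask of contributing indices)
    for idx, z in enumerate(values):
        vec = exponent_parities(z, primes)
        if vec is None:
            continue  # not smooth over the factor base, skip it
        combo = 1 << idx
        while vec:
            pivot = vec.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec
            combo ^= bcombo
        else:
            # vec reduced to zero: every exponent in the combined product is even
            return [i for i in range(len(values)) if combo >> i & 1]
    return None

primes = [2, 3, 5, 7]
values = [6, 10, 15, 14, 21]   # toy stand-ins for the smooth values a - mb
subset = find_square_subset(values, primes)
product = prod(values[i] for i in subset)
assert isqrt(product) ** 2 == product
```

With more smooth values than primes in the factor base, a dependency is guaranteed to exist, which is why $B + 1$ pairs suffice in the text above.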
Recall that the pairs ( p, r ) such that p is prime, 0 < r < p coprime to p , p | f ( r,

1) is in direct correspondence with the ﬁrst degreeprime ideals P ⊆ O Q ( α ) such that P | h p i , N ( P ) = p . Then from proposition 3.8 we get that thefollowing function is well deﬁned: e ( p,r ) ( a − bα ) =  ord r ( N ( a − bα )) a − br ≡ p otherwise . From the proof of theorem 3.7 we know that this map is well deﬁned from Q ( α ) × → Z . Hence,as we assumed that we have found suﬃcient ( a, b )-pairs we can use the same sieving techniquesas in Section 3 to ﬁnd a subset S such that P ( α ) = Q ( a,b ) ∈ S ( a − bα ) ∈ Z [ α ]. Moreover we canagain be sure that e r,s ( P ( α )) ≡ P ( α ) is a square in Z [ α ]. To give a three line summary of our technique: we will apply the pigeonhole principle toa set H , to show that for a randomized ﬁeld with a stochastic collection of characters with largeconductor the number of ways in which an element might appear square and not be square islimited. Now deﬁne the set we eluded to above: H = { z ∈ Q ( α ) × : ∀ s < r, e ( p,r ) ( z ) ≡ } / { z : z ∈ Q ( α ) × } . Because we attempt to use the pigeon hole principle we start by considering the size of this set,in fact a near-immediate result is the bounded dimension of H as a vector space:Page 95 of 114 roposition 5.17. H is an F vector space of dimension at most( δκ + o (1)) log ( n ) + δ κ n )) (log log( n )) . The proof of this follows, bar a few modiﬁcations, from the argument in [[17], theorem 6.7].The full proof can be found in [[14], p. 129-131], but we will accept this as a fact. To see that H is a vector space we only need to observe that Q ( α ) × is commutative, hence every elementof h ∈ H can be represented as a coset [ h ] = h · { z : z ∈ Q ( α ) × } . This means that for any h ∈ H , h is the identity since it is equivalent to h · { z : z ∈ ( Q ( α ) × } and h ∈ Q ( α ) × .The next step is to ﬁnd a signiﬁcant amount of quadratic characters. 
For this let $P \subset \mathcal{O}_{\mathbb{Q}(\alpha)}$ be a prime ideal. As $N(P) = p^k$ and $\mathcal{O}_{\mathbb{Q}(\alpha)}$ is a Dedekind domain we have that $\mathcal{O}_{\mathbb{Q}(\alpha)} / P = \mathbb{F}_{p^k}$. Hence we can identify $P$ with a degree $k$ monic irreducible polynomial $p_P$ over $\mathbb{F}_p$. We can say more: given an irreducible polynomial $g$ of degree $k$ over $\mathbb{F}_p$, if $\gcd(f, g) = 1$ then the quotient $\mathbb{Q}(\alpha) / \langle g(\alpha) \rangle$ sends everything to 0, hence $\langle g(\alpha) \rangle$ is not prime. So we may assume $\gcd(f, g) > 1$, and since $g$ is irreducible over $\mathbb{F}_p$ that means that $g$ is one of the irreducible factors of $f \bmod \langle p \rangle$.

In what follows we will regularly shift between the following three equivalent concepts:

1. Prime ideal $P \subseteq \mathcal{O}_{\mathbb{Q}(\alpha)}$.
2. The irreducible polynomial divisor $p_P$ of $f(x, 1) \bmod p$.
3. Pair $(p, r)$, $p$ prime and $r$ s.t. $p_P(r, 1) \equiv 0 \bmod p$ for $p_P$ defined over $\mathbb{F}_{p^k}$.

The following definition extends the Legendre symbol.

Definition 5.18.

Let $K$ be a field and let $f(x) \in K[x]$ be a polynomial. Let $p(x) \in K[x]$ be an irreducible polynomial, $p(x) \nmid f(x)$. Then the polynomial Legendre symbol is defined as follows:
$$\left(\frac{f(x)}{p(x)}\right)_L = \begin{cases} 1, & \exists\, g(x) \in K[x] \text{ s.t. } f(x) \equiv g(x)^2 \bmod p(x) \\ -1, & \nexists\, g(x) \in K[x] \text{ s.t. } f(x) \equiv g(x)^2 \bmod p(x). \end{cases}$$
This gives a very natural extension to a Dirichlet character, $\chi_P$, defined by the following:
$$\chi_P : \mathcal{O}_{\mathbb{Q}(\alpha)} \to \mathbb{F}_{p^k}, \qquad \langle a - b\alpha \rangle \mapsto \left(\frac{a - bx}{p_P}\right)_L \tag{8}$$
All that remains now is finding $P$, for which we will seek to factorize $f(x, 1) \bmod p$ and look at the irreducible divisors. Given a set $\mathcal{F}$ of these characters $\chi_P = \chi_{(p,r)}$, we can define the map $\Psi_{\mathcal{F}} : H \to \mathbb{F}_2^{|\mathcal{F}|}$ sending $x$ to the tuple $\left(\chi_{(p,r)}(x) \mid \chi_{(p,r)} \in \mathcal{F}\right)$. We will rely on the random production of such a set $\mathcal{F}$ to show that this almost surely makes $\ker(\Psi_{\mathcal{F}})$ small.

Lemma 5.19.

There is a sampleable distribution Υ for pairs ( p, r ) such that χ ( p,r ) is a characteras in 8, such that for all but log log n of the h ∈ H , considering map as a map from H to F : P Υ (cid:0) χ ( p,r ) = − (cid:1) ≥ o (1)2 . Sampling according to Υ takes at most L n (cid:0) , c (cid:1) time for a to-deﬁne c . Furthermore, eachcharacter, χ ( p,r ) , can be evaluated in time at most L n (cid:0) , c (cid:1) . To start assume that we have the ﬁnite-degree tower of number ﬁelds L ⊃ K ⊃ Q , where L/K is a Galois extension with Galois group Gal(

L/K). Recall that $\Delta_L$, $\Delta_K$ are the discriminants of $L$ and $K$ respectively, and let $d_{L/K} = [L : K]$, $d_L = [L : \mathbb{Q}]$ and $d_K = [K : \mathbb{Q}]$. Let $P \subseteq K$ be a prime ideal and let $\mathcal{Q} = \{ Q \subseteq L \mid Q \text{ lies over } P \}$. Then from proposition 2.30 we know that the Frobenius elements of all $Q \in \mathcal{Q}$ are conjugate, which allows us to give the following definition.

Footnote: Implicitly we are assuming $\mathbb{F}_2 \cong C_2$ by the map $b \mapsto (-1)^b$.

Definition 5.20. Let $L \supset K \supset \mathbb{Q}$ be a finite degree tower of number fields with $L/K$ Galois. Let $P \subset K$ be a prime ideal which is unramified in $L$ and let $\mathrm{Frob}_P$ be the Frobenius element of $P$ in $\mathrm{Gal}(L/K)$. Then the Artin symbol, $\left[\frac{L/K}{P}\right]$, is defined as the conjugacy class of the Frobenius automorphisms of $L/K$ corresponding to the prime ideals $Q \subset L$ dividing $P$:

$$\left[\frac{L/K}{P}\right] = \{ \mathrm{Frob}_Q \in \mathrm{Gal}(L/K) \mid Q \mid P \}.$$

Remark.

It is a common abuse of notation to write Frob P for any of the elements in h L/K P i andjust consider the element well deﬁned up to conjugacy.This allows us to deﬁne the set π C ( x ) = (cid:12)(cid:12)(cid:12)(cid:12)(cid:26) P | P ⊂ K prime , N K/ Q ( P ) < x, (cid:20) L/K P (cid:21) ∈ C (cid:27)(cid:12)(cid:12)(cid:12)(cid:12) , where C ⊆ Gal(

L/K ) s.t. for all g ∈ Gal(

L/K ): gCg − = C . Now let K = Q ( α ) and let h ∈ O K be an element of minimal norm representing a non-trivial element of H . Let L = K ( √ h ), thenby deﬁnition [ L : K ] = 2, d K = d , d L = 2 d and Gal( L/K ) = C . Since Gal( L/K ) is abelian, wehave that h L/K P i only contains one element. Hence the value h L/K P i corresponds exactly to theaction of χ P on h .As h has minimal norm it fulﬁlls the Minkowski bound, 2.41, hence: N K/ Q ( h ) ≤ c K/ Q = p ∆ K (cid:18) n (cid:19) d d ! d d = n δκ (1+ o (1) . By construction the diﬀerent is generated by 2 h , and so is an integral ideal. As the relativediscriminant is the norm of the diﬀerent we get that:∆ L/ Q ≤ N K/ Q (2 h )∆ K/ Q ≤ n δκ (5+ o (1)) . We now need an improved version of Chebotarev’s Density Theorem which we alluded to before.Page 98 of 114 heorem 5.21. Unconditional Eﬀective Chebotarev Density Th.

Let

L/K/ Q be a towerof number ﬁelds with L/K

Galois with G = Gal( L/K ). Let C ⊆ G be a union of conjugacyclasses such that gCg − = C for all g ∈ G . Let (cid:12)(cid:12) ¯ C (cid:12)(cid:12) be the total number of conjugacy classes in G . Lastly let 1 − ν be a Siegel zero of ζ L if it exists and 0 otherwise. Then there exists c > x ≥ d L log (∆ L ): (cid:12)(cid:12)(cid:12)(cid:12) π G ′ ( x ) − | C || G | Li( x ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ | C || G | Li( x − ν ) + O x (cid:12)(cid:12) ¯ C (cid:12)(cid:12) exp( − c r log xd L ! . Using the unconditional eﬀective Chebotarev Density Theorem on the degree 2 extension

L/K , then for P chosen uniformly at random with N ( P ) ≤ x : (cid:12)(cid:12)(cid:12)(cid:12) P ( χ P ( h ) = 1) − (cid:12)(cid:12)(cid:12)(cid:12) < x − ν (1+ o (1)) + O x exp − c r log xd L !! , where 1 − ν is a possible Siegel zero of the Dedekind zeta function over L , ζ L . Ensuring that P ( χ P ( h ) = 1) = + log(1) requires us to insist that log( x ) = ω ( d L (log log x ) ) and, if ζ L has aSiegel zero, log( x ) = ω ( ν − ).Note that, for each non-trivial coset, we can choose a representative h ∈ O K of minimal normand then run through the previous paragraph for each of these. Despite the element h beingdiﬀerent every time we can get a similar result each time, dependent on h . Hence we make thefollowing deﬁnition: Deﬁnition 5.22.

For a ﬁeld K and h a minimal norm representative of an element of H , wedeﬁne L h = K ( √ h ). For ǫ > E K,ǫ = (cid:26) h · { z | z ∈ K × } ∈ H s.t. ∃ ν : ζ L h (1 − ν ) = 0 , ν − > L n (cid:18) , ǫ (cid:19) (cid:27) . It is clear that if h ∈ K × and k ∈ K × represent the same coset in H , then L h = L k . Tojustify this we only have to note that representing the same coset means that h and k diﬀer byPage 99 of 114 square: h = kz , hence √ h = √ kz = √ kz , since z ∈ K × we get that irr K ( √ h ) = irr K ( √ k ) and so L h = L k .The exceptional set E K,ǫ is the subset of H which cannot be reliably distinguished from 0by characters induced by primes of size e L n ( ,ǫ ). This means that if a Siegel zero of the formspeciﬁed exists, then almost every prime ideal of this size induces a character which vanishes forsome element of H . However we can limit the size of the exceptional set. Lemma 5.23.

Suppose that K = Q ( α ) is a number ﬁeld where α is a root of an irreducible f = ˆ f + ( x − m ) R where R is uniformly random. Then for ǫ = (cid:0) + o (1) (cid:1) δ , P f (cid:18) | E K,ǫ | >

43 log log n (cid:19) ≤ L n (cid:18) , κ − δ −

13 (1 + o (1) (cid:19) − . For now we accept this lemma as fact, as this will allow us to prove lemma 5.19. The prooffor this lemma is based on the sparseness of Siegel zeros for Dedekind zeta functions deﬁnedover L h and will be shown in Section 5.3.3. Accepting the lemma, we know there exists an x such that log( x ) = ω ( d L (log log x ) ) and, if there is a Siegel zero for ζ L h , log( x ) = ω ( ν − ) forall but log log n of the h ∈ H . Moreover for all but at most L n (cid:0) (cid:1) − of our maximum L n (cid:0) (cid:1) polynomials f we have log x < L n (cid:0) , ǫ (cid:1) . Choosing to acccept failure on all these polynomialsstill guarantees that the algorithm fails with probability o (1).Now we turn to the sampleable distribution Υ. Recall that any prime for which N ( P ) < x is guaranteed to divide a prime p with p < x . Moreover, if P is of degree k , then p < k √ x .Equivalently, a k th degree prime ideal dividing p corresponds to a simple k ’th degree divisor of f mod p . The following algorithm samples Υ in such a way that it outputs ideals of which allbut a small fraction are prime. Page 100 of 114 lgorithm 4 Sampling Υ for (prime) ideals.

Input: A polynomial $f$ of degree $d$
Output: A pair $(p, r)$
  Uniformly at random sample $k \in \mathbb{Z}/d\mathbb{Z}$
  Uniformly at random sample $p \in \left(x^{(k+1)^{-1}}, x^{k^{-1}}\right)$
  Test for primality of $p$ with the Miller–Rabin test
  if $p$ is prime then
    Factor $f \bmod p$ and find the irreducible and unrepeated factors $r_i$ of degree $< k$
    if $|\{r_i\}| \geq j$ then
      Uniformly at random sample one $r_i$ and define $r = r_i$
    else
      Sample a new $k$ and repeat the algorithm
    end if
  end if

Remark.

We will now make a few remarks regarding the runtime of the algorithm. First notethat to obtain the required runtime of the sieve we need this algorithm to have runtime at most O (cid:0) log ( n ) (cid:1) . Let’s see how we obtain this: • First note that the runtime-bound immediately means log x = L (cid:0) (cid:1) . This means wecan not use deterministic primality tests, such as the AKS primality test we used for theGNFS. Especially we note that discarding a composite p using the Miller–Rabin primalitytest takes O (log ( x ) log log x ). • The Miller–Rabin primality test discards composite p with probability 1 − O (log − ( x ))and so p is prime with probability Ω ((log x ) − ) and so any p produced here is prime withprobability 1 − O ( d (log x ) − ). • Factoring f mod p is possible in time O (cid:0) ( d log x ) (cid:1) probabilistically. • If p is composite, but slips through the cracks of the Miller–Rabin primality test, then thefactorisation of f mod p may fail. If it does not fail, then this p still induced a quadraticPage 101 of 114haracter and therefore vanishes on the squares. Since we obtain at most one characterfrom each p and are guaranteed to ﬁnd a character if k = d and p is prime, the fraction ofcharacters which are not induced by primes is o (1).To ﬁnish the proof we need to show that this algorithm is suﬃciently fast and that characterscan be evaluated quickly. Moreover we, once again, have to show that these characters aresuﬃciently uniformly distributed that the bounds on P ( χ P = 1) hold. Proposition 5.24.

The expected time taken to sample ( p, r ) ∼ Υ as above is at most L n (cid:18) , (4 + o (1)) ǫ (cid:19) . Proof.

We noted in our remark that each attempt from the start of the algorithm takes time O (( d log x ) ) as the factorization of f mod r is slowest. We are guaranteed to ﬁnd a factorif our degree bound k is d , which happens with probability d , the integer r is prime withprobability Ω ( x ), and we successfully take an ideal in the ﬁnal step with probability Ω (cid:0) d (cid:1) if k = d . Hence the number of attempts needed to output a prime is bounded in expectation by O ( d log x ). Hence the time taken to ﬁnd an ideal is bounded in expectation by O ( d log ( x )) = L n (cid:0) , (4 + o (1)) ǫ (cid:1) . Proposition 5.25.

For any ﬁxed h , P Υ ( χ ( p,r ) ( h ) = − ≥ d (1 + o (1)). Proof.

The distribution of primes P generated is uniform over P mod h p i for p ∈ (cid:16) x ( k +1) − , x k − (cid:17) of degree at most k by the Prime Ideal Theorem, 4.22. This property also holds for a uniformdistribution over primes of norm ≤ x by an adaptation of Dirichlet’s theorem of primes in arith-metic progressions, 4.28. Thus the diﬀerence between Υ and a uniform distribution over primesof norm ≤ x is the distribution of the degree of these primes. The probability that Υ samples P with N ( P ) ≤ x and P | h p i for p is each of these intervals is d . Hence Υ pointwise dominates d − times the uniform distribution over all primes of norm below x . Therefore P Υ ( χ ( p,r ) ( h ) = − ≥ d P N ( P ) ≤ x ( χ P ( h ) = −

1) = 12 d (1 + o (1)) . Page 102 of 114he ﬁnal proposition immediately completes the proof of lemma 5.19, however it is quitetechnical as the logarithms we are interested in are quite large. We, therefore, have to track thearithmetic carefully.
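Before turning to the evaluation cost, the sampling procedure of Algorithm 4, analysed in Propositions 5.24 and 5.25, can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the procedure of [14]: it keeps only first degree primes, so the factorisation of $f \bmod p$ is replaced by a brute-force search for simple roots; it lets $k$ range over $1, \dots, d$; and the names `is_probable_prime`, `simple_roots_mod_p` and `sample_pair` are hypothetical.

```python
import random

def is_probable_prime(n, rounds=20, rng=random):
    """Miller-Rabin test: False means n is certainly composite."""
    if n < 2:
        return False
    for q in (2, 3, 5, 7, 11, 13):
        if n % q == 0:
            return n == q
    s, t = 0, n - 1
    while t % 2 == 0:
        s, t = s + 1, t // 2
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        y = pow(a, t, n)
        if y in (1, n - 1):
            continue
        for _ in range(s - 1):
            y = pow(y, 2, n)
            if y == n - 1:
                break
        else:
            return False  # a is a witness for compositeness
    return True

def simple_roots_mod_p(coeffs, p):
    """Roots r of f mod p with f'(r) != 0, i.e. unrepeated linear factors x - r.
    coeffs[i] is the coefficient of x**i. Brute force: toy-sized p only."""
    def evaluate(cs, r):
        acc = 0
        for c in reversed(cs):  # Horner's rule
            acc = (acc * r + c) % p
        return acc
    deriv = [i * c for i, c in enumerate(coeffs)][1:]
    return [r for r in range(p)
            if evaluate(coeffs, r) == 0 and evaluate(deriv, r) != 0]

def sample_pair(coeffs, x, rng=random, max_tries=10_000):
    """Toy variant of Algorithm 4 restricted to first degree primes (p, r)."""
    d = len(coeffs) - 1
    for _ in range(max_tries):
        k = rng.randrange(1, d + 1)        # assumption: k ranges over 1..d
        lo, hi = int(x ** (1 / (k + 1))), int(x ** (1 / k))
        if hi - lo < 2:
            continue
        p = rng.randrange(lo + 1, hi)      # integer strictly between lo and hi
        if not is_probable_prime(p, rng=rng):
            continue
        roots = simple_roots_mod_p(coeffs, p)
        if roots:
            return p, rng.choice(roots)
    raise RuntimeError("no pair found")
```

For example, `sample_pair([-2, 0, 0, 1], 10**4)` returns a probable prime $p$ and a simple root $r$ of $x^3 - 2$ modulo $p$. A faithful version would factor $f \bmod p$ completely and keep irreducible factors of higher degree as well.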

Proposition 5.26.

Evaluating the characters $\chi_{(p,r)}$ associated with the ideal $P$ sampled as above on a term $a - b\alpha$ takes time at most $L_n\left(\tfrac{1}{3}, (2 + o(1))\epsilon\right)$.

Proof. Let $p = 2$, then $\chi_{(p,r)} = 1$. Hence we may assume $p > 2$. For any polynomial $P \in \mathbb{F}_p[x]$ let $|P| = p^{\deg(P)}$. Recall

$$\chi_{(p,r)}(a - b\alpha) = \left(\frac{a - bx}{r(x)}\right)_L.$$

For any constant $c$ the computation reduces to finding a Legendre symbol mod $p$:

$$\left(\frac{c}{P}\right)_L = c^{\frac{|P| - 1}{2}} = \left(\frac{c}{p}\right)^{\frac{p^k - 1}{p - 1}} = \left(\frac{c}{p}\right)^k.$$

From [7] we call attention to the law of quadratic reciprocity for function fields by noting that for any two relatively prime monic irreducible polynomials over $\mathbb{F}_p$:

$$\left(\frac{P}{Q}\right)\left(\frac{Q}{P}\right) = (-1)^{\frac{|P| - 1}{2} \cdot \frac{|Q| - 1}{2}}.$$

Hence

$$\begin{aligned} \chi_{(p,r)}(a - b\alpha) &= \left(\frac{-b}{p}\right)^k \left(\frac{x - ab^{-1}}{r(x)}\right)_L \\ &= \left(\frac{-b}{p}\right)^k (-1)^{\frac{p - 1}{2} \cdot \frac{p^{\deg(r)} - 1}{2}} \left(\frac{r(x)}{x - ab^{-1}}\right) \\ &= \left(\frac{-b}{p}\right)^k (-1)^{\frac{p - 1}{2} \cdot \frac{p^{\deg(r)} - 1}{2}} \left(\frac{r(ab^{-1})}{x - ab^{-1}}\right) \\ &= \left(\frac{-b}{p}\right)^k (-1)^{\frac{p - 1}{2} \cdot \frac{p^{\deg(r)} - 1}{2}} \left(\frac{r(ab^{-1})}{p}\right). \end{aligned}$$

The parities of $\frac{p - 1}{2}$ and $\frac{p^k - 1}{2}$ are easily computed. Hence to compute $\chi_{(p,r)}(a - b\alpha)$ it suffices to compute $r(ab^{-1})$ and two Legendre symbols modulo $p$ in $O(\log p)$ additions or subtractions of numbers of size at most $p$.

- To compute $b^{-1} \bmod p$ requires the extended Euclidean algorithm to be run, which requires $O(\log p)$ additions of numbers of size at most $p$.
- To compute $ab^{-1} \bmod p$ requires one multiplication.
- To compute $r(ab^{-1}) \bmod p$ requires at most $O(d)$ additions and multiplications modulo $p$.

Addition or subtraction of numbers of size $p$ or modulo $p$ takes $O(\log p)$ steps, while multiplication modulo $p$ takes $O(\log^2(p))$ steps by iterative addition and doubling. Hence the computation in total requires time

$$O(d \log^2(p)) = O(d^{-1} \log^2(x)) = L_n\left(\tfrac{1}{3}, (2 + o(1))\epsilon\right).$$

To complete the proof of the lemma we now simply take $c = (4 + o(1))\epsilon$.

As we remarked earlier this proof is dependent on the sparseness of Siegel zeroes over Dedekind zeta functions.
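Returning briefly to the evaluation formula in the proof of Proposition 5.26, the special case of a first degree prime, $r(x) = x - r_0$ and $k = 1$, is small enough to check numerically. The Python sketch below is an illustration under that assumption only. It evaluates the character directly, as the Legendre symbol of $a - b r_0$ modulo $p$, and again through the reciprocity chain, where the sign $(-1)^{\frac{p-1}{2} \cdot \frac{p-1}{2}}$ has the same parity as $(-1)^{\frac{p-1}{2}}$. The function names are hypothetical, and both functions require $p$ to be an odd prime with $b \not\equiv 0 \bmod p$.

```python
def legendre(c, p):
    """Legendre symbol (c/p) for an odd prime p via Euler's criterion: 1, -1 or 0."""
    s = pow(c % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def chi_direct(a, b, p, r0):
    """chi_(p,r)(a - b*alpha) for r(x) = x - r0: reduce a - b*x modulo r(x)."""
    return legendre(a - b * r0, p)

def chi_reciprocity(a, b, p, r0):
    """Same value via the reciprocity chain, specialised to k = deg(r) = 1."""
    b_inv = pow(b, -1, p)                    # modular inverse (Python 3.8+)
    t = (a * b_inv) % p                      # a * b^{-1} mod p
    sign = -1 if ((p - 1) // 2) % 2 else 1   # (-1)^{((p-1)/2)^2}
    return legendre(-b, p) * sign * legendre(t - r0, p)   # r(t) = t - r0
```

For instance, `chi_direct(7, 3, 103, 5)` and `chi_reciprocity(7, 3, 103, 5)` agree. For $k > 1$ the same chain applies with $r(ab^{-1})$ evaluated by Horner's rule, which is where the $O(d)$ modular multiplications in the proof come from.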
To start the proof of lemma 5.23 we need the following proposition, [[14], fact 6.22 and corollary 6.23], which we state without proof.

Proposition 5.27.

Let $K$ be a number field and let $\overline{K}$ be the normal closure of $K$. Let $c(K) = 4$ if $K/\mathbb{Q}$ is normal and $c(K) = 4[\overline{K} : \mathbb{Q}]$ otherwise. Assume that there is a real Siegel zero of $\zeta_K$, denoted $1 - \nu$, such that

$$1 - (c(K) \log|\Delta_K|)^{-1} \leq 1 - \nu \leq 1.$$

Then there is a quadratic field $F \subset K$ such that $\zeta_F(1 - \nu) = 0$.

Now we start the proof of lemma 5.23. To do this we may assume that $f(x) \in \mathbb{Q}(\alpha)[x]$ is an irreducible polynomial, as the probability that $f$ is reducible can be absorbed in the error term by 5.15.

Proposition 5.28. If $\overline{L_h} \supset L_h$ is the normal closure of $L_h = K(\sqrt{h})$, then $[\overline{L_h} : \mathbb{Q}] \leq 2^d d!$.

Proof. Let $\overline{K}$ be the splitting field of $K$. By construction, $\overline{K}/\mathbb{Q}$ is normal, so in particular it is Galois, and $[\overline{K} : \mathbb{Q}] \leq d!$. Let $G = \mathrm{Gal}(\overline{K}/\mathbb{Q})$. Given $h \in \mathcal{O}_K$, let $O_h$ be the orbit of $h$ under the action of $G$. Then $|O_h| \leq d$. We adjoin square roots of each element of $O_h$ to $\overline{K}$ to obtain a field $L$. Then $[L : \mathbb{Q}] \leq 2^d d!$.

Degree 2 extensions are normal, the compositum of normal extensions is normal, and $L/\overline{K}$ is a compositum of at most $d$ degree 2 extensions. Hence the extensions $L/\overline{K}$ and $\overline{K}/\mathbb{Q}$ are normal. As any $\sigma \in \mathrm{Aut}_{\mathbb{Q}}(\overline{K})$ acts on $O_h$ as a permutation, we can extend $\sigma$ to an element of $\mathrm{Aut}_{\mathbb{Q}}(L)$. So $L/\mathbb{Q}$ is normal and $\overline{L_h} \subseteq L$, which proves the claim.

Hence from proposition 5.27, if $\nu^{-1} > 2^{d+2} d! \log \Delta_{L_h/\mathbb{Q}}$, then $1 - \nu$ must be a zero of some quadratic field $F_h = \mathbb{Q}(s_h) \subseteq L_h$. By assumption $2 \nmid [K : \mathbb{Q}]$, hence $K$ has no quadratic subfields, which means that $F_h \cap K = \mathbb{Q}$ and so $F_h$ is the only quadratic subfield of $L_h$. Moreover, $L_h$ is the smallest field containing both $F_h$ and $K$, and since the $h \in H$ are not related by squares of $K$ it holds that $L_h$ doesn't contain any $h' \in H$ from another class.
This assures that the L h produced as h varies are all distinct ﬁelds and so all s h must be distinct.By the transitivity of the discriminant, lemma 2.47, we have, for the towers of number ﬁeld L h /F h / Q and L h /K/ Q that∆ dF h / Q N F h / Q (∆ L h /F h ) = ∆ K/ Q N K/ Q (∆ L h /K ) . As ∆ L h /K is the relative discriminant, hence is an ideal, we can use the Minkowski Bound: N K/ Q (∆ L h /K ) ≤ p ∆ K/ Q (cid:18) π (cid:19) d (cid:18) d ! d d (cid:19) ≤ p ∆ K/ Q . Since ∆ K/ Q ≤ L n (cid:0) , δ κ (cid:1) and ∆ L h / Q ≤ L n (cid:0) , δ κ (cid:1) , we can conclude that∆ F h / Q = O (cid:18) L n (cid:18) , δκ (cid:19)(cid:19) . Page 105 of 114y the contribution of the Euler product of the Dedekind zeta function: ζ F h / Q ( s ) = ζ ( s ) L (cid:18) s, (cid:18) ∆ F h / Q · (cid:19)(cid:19) , and by reciprocity j (cid:16) ∆ Fh/ Q j (cid:17) is a Dirichlet character modulo ∆ F h / Q .Now we invoke [[14], fact 6.24]: Proposition 5.29.

There is an eﬀective constant c > χ r , χ r ′ of moduli r, r ′ respectively, with χ r χ r ′ non-principal, then the product of Dirichlet L -functions L ( s, χ r ) L ( s, χ r ′ ) has at most one real zero in (cid:16) − c log rr ′ , (cid:17) .So if there are two characters modulo q and q ′ respectively there is at most one L -functionwith a zero 1 − ν and ν − > c log qq ′ for some eﬀective c . It immediately follows that there is atmost one character with modulus in [ q, q e ] with a zero at 1 − ν and ν − > ( e + 1) c log q .Since ∆ F h / Q < L n (cid:0) , (cid:0) + o (1) (cid:1) δκ (cid:1) and ∆ F h / Q ∈ Z it follows that all possible discriminantscan be covered by log log n ranges of form [ x, x e ]. Hence there are at most log log n excep-tional characters with exceptional zeroes such that ν − > ( e + 1) c log (cid:18) L n (cid:18) , (cid:18)

54 + o (1) (cid:19) δκ (cid:19)(cid:19) = O (cid:16) δκ log ( n ) log log( n ) − (cid:17) , as required. This bound is far weaker than the required bound of ν − > d +2 d ! log ∆ L h / Q , and sothere are at most log log n extensions L h / Q with exceptional zeroes and ν − > d +2 d ! log ∆ L h / Q .Finally: 2 d +2 d ! log ∆ L h / Q ≤ d d (1+ o (1) log O (1) ( n ) = L n (cid:18) , δ o (1)) (cid:19) . This completes the proof of the lemma. Page 106 of 114 .3.4 Proof of theorem 5.3

We have now removed all obstructions that were needed. It is therefore that we can now provetheorem 5.3. Recall that for a set of characters F we deﬁne the mapΨ F : H → F | F | ,x (cid:0) χ ( p,r ) ( x ) : χ ( p,r ) ∈ F (cid:1) . With the work we have done up to now we are ready to produce our linear Ψ F with small kernel,and this to produce a congruence of squares. Due to the size of some of the numbers involved,having L n (cid:0) (cid:1) bits, we track the arithmetic very closely. First, using lemma 5.19, we sample4 d (cid:18) δκ log n + δ κ ( n )log log n (cid:19) pairs ( p i , r i ) from Υ independently.Note that this sample is of size o (log ( n )) = L n (cid:0) , o (1) (cid:1) and each individual sample takesat most L n (cid:0) , c (cid:1) , so we can produce the complete sample in L n (cid:0) , c + o (1) (cid:1) . After this processwe have M = 1+ B + dB ′ +4 d (cid:18) δκ log n + δ κ ( n )log log n (cid:19) pairs from Section 5.2.1 and the samplefrom Υ. Note that M = L n (cid:0) , max ( β, β ′ ) (cid:1) o (1) . For each of these, we need to evaluate eachof our characters and as each character evaluates in L n (cid:0) , c (cid:1) we get that this process takes L n (cid:0) , c + max( β, β ′ ) (cid:1) o (1) .Fix some h ∈ H \{ } such that h is not in the exceptional set of size log log n . Each map χ ( p i ,r i ) is independent and induces a map in Hom( H, F ) such that P ( h ker (cid:0) χ ( p i ,r i ) (cid:1) ) ≥ o (1)2 d . As a corollary it follows that: P ( h ∈ ker(Ψ F )) ≤ (cid:18) − o (1)2 d (cid:19) M − (1+ B + dB ′ ) ≤ | H | − o (1) . Page 107 of 114ence by the union bound over the non-trivial elements of H the probability that any of thesenon-exceptional and non-zero elements is in the kernel is o (1). Hence with high probability thekernel of Ψ F has size at most log log n .From here we proceed as normal. 
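The union bound above is easy to sanity-check numerically. The Python sketch below is a rough illustration, not part of the proof. It assumes that the number of sampled characters is $4d$ times a bound `dim_h` on the dimension of $H$, which is one reading of the sample size chosen above. It also assumes that each character detects a fixed non-trivial $h$ with probability at least $\frac{1}{2d}$, dropping the $o(1)$ terms and ignoring the exceptional set. The function names are hypothetical, and everything is computed in base-2 logarithms to avoid overflow.

```python
import math

def log2_miss_probability(d, dim_h):
    """log2 of (1 - 1/(2d))**(4*d*dim_h): a fixed non-trivial h survives all characters."""
    return 4 * d * dim_h * math.log2(1 - 1 / (2 * d))

def log2_union_bound(d, dim_h):
    """log2 of the union bound over the fewer than 2**dim_h non-trivial elements of H."""
    return dim_h + log2_miss_probability(d, dim_h)
```

Since $\log(1 - t) \leq -t$, the first quantity is at most $-2\,\mathrm{dim\_h}/\ln 2$, so the union bound decays exponentially in the dimension of $H$ whatever the degree $d$ is. This is the shape of the $o(1)$ failure probability claimed above.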
With the M pairs representing linear polynomials we usea fast kernel ﬁnding algorithm for sparse matrices, such as Block–Wiedemann, [21], to ﬁnd asuitable subset S i to construct a polynomial P i in time O ( M ) = L n (cid:18) , β, β ′ )(1 + o (1) (cid:19) , such that P i ( m ) is a square in Z and P i ( α ) is a square in O Q ( α ) multiplied by one of at most log log n elements of H . Repeating the algorithm l = log log n times to generate polynomials P , . . . , P l , we are able to guarantee that for some i < j , P i and P j lie over the same element h ,hence P i P j is a square in O Q ( α ) . In what follows we consider the (cid:0) l (cid:1) ∼ (log log n ) polynomialsseparately.Now if γ ∈ O Q ( α and γ ∈ Z [ α ], then γ · f ′ ( α ) ∈ Z [ α ]. Let S = S i ∆ S j and ﬁx the polyno-mial P = (cid:20) ∂f∂x ( x, (cid:21) Y ( a,b ) ∈S ( a − bx ) , and so u = (cid:20) ∂f∂x ( x, y ) (cid:21) ( m, Y a,b ∈S ( a − mb )is a square in Z . Hence u can be found by taking the product modulo n over all p < B of p raisedto half the total order of p in the terms ( a − mb ) for ( a, b ) ∈ S and multiplying by f ′ ( m, The reason for computing the square root in this fashion is to ensure we need only M log n additions and divisions and at most M log n modular multiplications to compute u mod n fromthe exponents. This ensures polynomial running time. This technique diﬀers minimally from the technique for Z in the GNFS Page 108 of 114imilarly, for at least one of the (cid:0) l (cid:1) polynomials considered, there exists v ∈ Z [ α ] such that v = (cid:20) ∂f∂x ( x, y ) (cid:21) ( α, Y ( a,b ∈S ( a − bα ) . Using [24] we can compute square roots in the number ﬁeld to ﬁnd v ( m,

1) mod n in time O ( M ). We abuse notation and write v ( m ) for the element of Z /n Z obtained by substituting m for α . By the deﬁnition of f we have f ( m,

1) = n and so: v ( m ) mod f ( m,

1) = (cid:20) ∂f∂x ( x, y ) (cid:21) ( α, Y ( a,b ) ∈S ( a − bα ) mod f ( α,  ( m )= (cid:20) ∂f∂x ( x, y ) (cid:21) ( m, Y ( a,b ) ∈S ( a − mb ) mod ( f ( m, u mod n, and so we have constructed a congruence of squares in time L n (cid:18) , max(2 max( β, β ′ ) , max( β, β ′ ) + c , c ) (cid:19) o (1) . As c ≤ (cid:0) + o (1) (cid:1) δ , we can insist that we have at most log log n exceptional values of h , andour f lies oﬀ a set of probability at most L n (cid:16) , κ − δ − (1 + o (1)) (cid:17) − the run time bound is asclaimed, ﬁnishing the proof. The computational eﬃciency boils down to proving theorem 5.1. For this we ﬁx n, β, β ′ , σ, δ ,and κ satisfying the conditions of equations 5 and 6. Then by theorem 5.3 we can extract acongruence of squares mod n from L n (cid:0) , max( β, β ′ ) + o (1) (cid:1) pairs ( a, b ) ∈ X f,m,n for a ﬁxed( m, f ) in expected time L n (cid:18) , (cid:18) δ , β, β ′ (cid:19) (1 + o (1)) (cid:19) . Page 109 of 114heorem 5.2 tells us that a ﬁxed ( m, f ) and this many pairs ( a, b ) ∈ X f,m,n will be found inexpected time L n (cid:18) , max ( β, β ′ ) + δ − β (1 + o (1)) + σδ + κ β ′ (1 + o (1)) (cid:19) . Hence we can run the RNFS to obtain a congruence of squares mod n with the expected timebounded by L n (cid:18) , λ (1 + o (1) (cid:19) , λ = max (cid:18) (cid:18) δ , β, β ′ (cid:19) , max ( β, β ′ ) + (cid:18) δ − β + κ + σδ β ′ (cid:19)(cid:19) . Having chosen β, β ′ , σ, δ, κ satisfying the conditions of 5 and 6 we can optimize the constants.Note that increasing the lesser of β and β ′ cannot increase λ or cause the conditions on theconstants to be violated, so we can assume β = β ′ . We then compute:2 σ ≥ λ ≥ min β,δ (cid:18) β + 2 δ − + σδ + o (1)3 β (cid:19) ≥ min β β + √ σ + o (1)3 β ! ≥ r σ o (1) . Fix any ǫ > ǫ = o (1). If we take β = β ′ = σ = δ = q + ǫ and κ = q + ǫ thenthe above are all equalities. Furthermore all the conditions are satisﬁed, giving λ = 2 q + o (1).This proves a runtime of L n , r

649 + o (1) ! . As a ﬁnal note we want to have a short discussion on some of the ways that the Generalized Rie-mann Hypothesis could impact the analysis performed on the Randomized number ﬁeld sieve.As discussed: Until now we have not made any assumption regarding the GRH and only limitedassumptions on the heuristics, but if we were to accept the GRH then much of our discussionbecomes a lot simpler.One of the main reasons it becomes simpler is because the Generalized Riemann HypothesisPage 110 of 114llows us to assume that there are no Siegel zeroes. This makes the discussion around algebraicobstructions a lot simpler. For example we obtain the following:

Proposition 5.30.

Conditional on GRH, for ǫ = log − ( n ) = o ( δ ), P f ( | E K,ǫ | >

0) = 0. This follows automatically from our discussion of the size of this set and the fact that there are no Siegel zeroes. This makes lemma 5.23 vacuous and allows us to sharpen the $L_n\left(\tfrac{1}{3}\right)$ bounds in lemma 5.19 to bounds polynomial in $\log n$.

The GRH also allows us to have far tighter effective bounds on the prime numbers in arithmetic progressions. Without going into details, it would allow us to avoid the adapted Bombieri–Vinogradov theorems and use an effective version of Chebotarev's Density Theorem which is completely dependent on the GRH holding.

In a discussion like this it could easily be assumed that the GRH holds, but that would make the whole discussion conditional, and the beauty of this result is that we make no grand assumptions whatsoever. To obtain a complexity that is equivalent to the heuristic complexity without relying on any conditions makes this the best possible outcome.

Does this mean we are done with the number field sieve after thirty years? No, definitely not. Over the years many different versions have found their way into mathematics and cryptography to solve problems other than factorization. For example the Tower Number Field Sieve, an adapted algorithm to solve Discrete Logarithm Problems ([27], [28]), has been worked on for many years, and it will be interesting to see if a similar randomization leads to a provable complexity for that as well.

References

Books

[1] S. Alaca, K.S. Williams, Introductory Algebraic Number Theory, Cambridge University Press, First edition, 2004.
[2] R. Crandall, C. Pomerance, Prime numbers, a computational perspective, Springer Verlag, First edition - corrected printing, 2002.
[3] A.K. Lenstra, H.W. Lenstra Jr., Development of the Number Field Sieve, Springer Verlag, 1993. Which is the source used for the papers [15], [16] cited below.
[4] I. Stewart, Galois Theory, CRC Press, Fourth edition, 2015.
[5] H. Cohen, A Course in Computational Algebraic Number Theory, Springer Verlag, Fourth edition, 2000.
[6] L.C. Washington, Introduction to Cyclotomic Fields, Springer Verlag, Second edition, 1997.
[7] M. Rosen, Number Theory in Function Fields, Springer Verlag, 2002.
[8] H.L. Montgomery, R.C. Vaughan, Multiplicative Number Theory 1: Classical Theory, Cambridge University Press, 2010.
[9] J. Milne, Class Field Theory.
[10] Multiplicative Number Theory, Springer Verlag, Second edition, 1980.
[11] G. Greaves, Sieves in Number Theory, Springer Verlag, First edition, 2001.

Papers

[12] M.A. Morrison, J. Brillhart, A Method of Factoring and the Factorization of $F_7$, Mathematics of Computation, American Mathematical Society, Volume 29, 1975.
[13] P. Stevenhagen, The number field sieve, Algorithmic Number Theory, MSRI Publications, Volume 44, 2008.
[14] J.D. Lee, R. Venkatesan, Rigorous analysis of a randomised number field sieve, Journal of Number Theory, Volume 187, 2018.
[15] J.M. Pollard, Factoring with cubic integers, as printed in [3], Springer Verlag, 1993.
[16] A.K. Lenstra, H.W. Lenstra Jr., M.S. Manasse, J.M. Pollard, The Number Field Sieve, as printed in [3], Springer Verlag, 1993.
[17] J.P. Buhler, H.W. Lenstra Jr., C. Pomerance, Factoring integers with the number field sieve, as printed in [3], Springer Verlag, 1993.
[18] A. Granville, It is easy to determine whether a given integer is prime, Bull. Amer. Math. Soc., Volume 42, 2005.
[19] D. Coppersmith, Solving linear equations over GF(2): Block–Lanczos Algorithm, Linear Algebra Applications, 1991.
[20] P. Montgomery, A Block–Lanczos Algorithm for Finding Dependencies over GF(2), Adv. in Cryptology - Eurocrypt '95, 1995.
[21] D. Wiedemann, Solving sparse linear equations over finite fields, IEEE Trans. Inform. Theory, Volume 32, 1986.
[22] A. Hildebrand, G. Tenenbaum, Integers without large prime factors, Jour. Th. des Nom. de Bordeaux, Volume 5, 1993.
[23] A.J. Harper, Bombieri–Vinogradov and Barban-Davenport-Halberstam type theorems for smooth numbers, arXiv:1208.5992 [math.NT].
[24] E. Thomé, Square root algorithms for the number field sieve, Proc. of the 4th Int. Workshop on Arith. in Finite Fields, Springer, 2012.
[25] A. Hildebrand, On the number of positive integers ≤ x and free of prime factors > y, J. Number Theory, 1986, Volume 22, p. 289–307.
[26] A. Hildebrand, G. Tenenbaum, On integers free of large prime factors, Trans. Amer. Math. Soc., 1986, Volume 296, p. 265–290.
[27] O. Schirokauer, The impact of the number field sieve on the Discrete Logarithm Problem, Algorithmic Number Theory, MSRI Publications, 2008, Volume 44.
[28] R. Barbulescu, et al., The Tower Number Field Sieve, Advances in Cryptology, ASIACRYPT 2015, 2015, Volume 9453.
[29] K. Soundararajan, The Distribution of Smooth Numbers in Arithmetic Progressions, CRM Proceedings and Lecture Notes, American Mathematical Society, 2008, Volume 46, p. 115-128.
[30] A. Harper,