Probabilistic Szpiro, Baby Szpiro, and Explicit Szpiro from Mochizuki's Corollary 3.12
TAYLOR DUPUY AND ANTON HILADO
Abstract.
In [DH20b] we gave some explicit formulas for the “indeterminacies” ind1, ind2, ind3 in Mochizuki’s Inequality as well as a new presentation of initial theta data. In the present paper we use these explicit formulas, together with our probabilistic formulation of [Moc15a, Corollary 3.12], to derive variants of Szpiro’s inequality (in the spirit of [Moc15b]). In particular, for an elliptic curve in initial theta data we show how to derive uniform Szpiro (with explicit numerical constants). The inequalities we get will be strictly weaker than [Moc15b, Theorem 1.10] but the proofs are more transparent, modifiable, and user friendly. All of these inequalities are derived from a probabilistic version of [Moc15a, Corollary 3.12] formulated in [DH20b] based on the notion of random measurable sets.

Contents
1. Introduction
2. Explicit Computations in Tensor Packets
3. Conductors, Minimal Discriminants, and Ramification
4. Estimates on p-adic Logarithms
5. Archimedean Logarithms
6. Upper Bounds on Hulls
7. Probabilistic Versions of the Mochizuki and Szpiro Inequalities
8. Deriving Explicit Constants For Szpiro’s Inequality From Mochizuki’s Inequality
References

1. Introduction
In [DH20b] we gave a probabilistic interpretation of Mochizuki’s Inequality (Corollary 3.12 of [Moc15a]). In the present paper we perform some explicit computations using this inequality to derive three inequalities which we will call “Probabilistic Szpiro”, “Baby Szpiro”, and “Explicit Szpiro”. All of these inequalities depend on the hypothesis of an elliptic curve being in “initial theta data built from the field of moduli”, on [Moc15a, Corollary 3.12], and on some assumed behavior at the archimedean place stated in Claim 5.0.1.

Date: April 30, 2020.

In order to state our results we need to talk about initial theta data. As formulated in [DH20b], initial theta data is a tuple
$$(\overline{F}/F,\ E_F,\ l,\ \underline{M},\ \underline{V},\ V^{\mathrm{bad}}_{\mathrm{mod}},\ \underline{\epsilon})$$
surrounding an elliptic curve $E = E_F$ over a field $F$ satisfying various conditions which are not important for the purposes of the introduction (the curious reader should consult [DH20b]). Here $l$ is a fixed prime and, in the present paper, the choices of $\underline{M}$ and $\underline{\epsilon}$ will be irrelevant. We will discuss the sets of places $\underline{V}$ and $V^{\mathrm{bad}}_{\mathrm{mod}}$ momentarily. This requires some setup.

In order to define the sets of places $V^{\mathrm{bad}}_{\mathrm{mod}}$ and $\underline{V}$ we need to introduce the fields $F_0$ and $K$ to which they belong. The field $F_0$ is the field of moduli of the elliptic curve, defined by
$$F_0 := \mathbb{Q}(j_E);$$
in Mochizuki’s notation this is $F_{\mathrm{mod}}$. The field $K$ is the $l$-division field of $F$ given by
$$K := F(E[l]), \tag{1.1}$$
obtained by adjoining the $l$-torsion of $E(\overline{F})$ to $F$. In this paper, for any field $L$ we will let $V(L)$ denote the collection of places of $L$, and for any non-archimedean place $v \in V(L)$ we will let $\kappa(v)$ denote the residue field and $L_v$ denote the completion of $L$ at $v$.

We now come to the definitions of $V^{\mathrm{bad}}_{\mathrm{mod}}$ and $\underline{V}$. First, $V^{\mathrm{bad}}_{\mathrm{mod}} \subset V(F_0)$ is a non-empty set of bad multiplicative places over the field of moduli: for every elliptic curve $E_0$ over $F_0$ such that $E \cong E_0 \otimes_{F_0} F$, if $v \in V^{\mathrm{bad}}_{\mathrm{mod}}$, then $E_0$ has multiplicative reduction at $v$. Next, the set $\underline{V} \subset V(K)$ is a set that maps bijectively to $V(F_0)$ under the natural map $V(K) \to V(F_0)$.

We will be using these quantities momentarily but first we need to describe a special type of initial theta data that will be used in the course of this manuscript. For computational purposes one can always take an elliptic curve over its field of moduli (satisfying some mild conditions) and base change this field to a larger field to obtain some curve that can be put in initial theta data. We call such theta data “built from the field of moduli”. The precise definition is below.

Definition 1.0.1.
Let $E/F$ be an elliptic curve inside initial theta data $(\overline{F}/F,\ E_F,\ l,\ \underline{M},\ \underline{V},\ V^{\mathrm{bad}}_{\mathrm{mod}},\ \underline{\epsilon})$. We will say that such a tuple is built from the field of moduli provided $E = E_0 \otimes_{F_0} F$ where $F_0 = \mathbb{Q}(j_E)$ is the field of moduli of $E$, $E_0$ is a model of $E$ over $F_0$,
$$F := F_0(\sqrt{-1},\ E_0[30]),$$
and $V^{\mathrm{bad}}_{\mathrm{mod}} \subset V(F_0)$ is the full set of places of multiplicative reduction.

We will often use the notation
$$d := [F : \mathbb{Q}].$$
The constants in our Szpiro-like inequalities will depend on this degree. In stating our results we recall from [DH20b] that, for a rational prime $p$, the set $V(F_0)_p$ of places of $F_0$ above $p$ is given the structure of a probability space where $\Pr : V(F_0)_p \to [0,1]$ is defined by
$$\Pr(v) := \frac{[F_{0,v} : \mathbb{Q}_p]}{[F_0 : \mathbb{Q}]}.$$
These values sum to $1$ because $\sum_{v \mid p} [F_{0,v} : \mathbb{Q}_p] = [F_0 : \mathbb{Q}]$. Since $\underline{V}$ maps bijectively to $V(F_0)$, the set $\underline{V}_p$ of elements of $\underline{V}$ above $p$ inherits this probability structure. Using $\underline{V}$ one can define some interesting probabilistic quantities which appear in our Probabilistic Szpiro inequality and give a good sense of the types of things that Mochizuki’s inequality “knows about”.

Definition 1.0.2.
(1) The probability that $v \in \underline{V}_p$ is unramified is
$$P_{\mathrm{unr},p} = \sum_{w \in \{ v \in V(F_0)_p \,:\, e(v/p) = 1 \}} \Pr(w). \tag{1.2}$$
(2) The average ramification degree of $v \in \underline{V}_p$ is defined to be
$$e_p = \mathbb{E}(e(v/p)). \tag{1.3}$$
(3) The average different of $\underline{V}/\mathbb{Q}$ is defined to be
$$\mathrm{Diff}(\underline{V}/\mathbb{Q}) = \prod_p p^{\mathrm{diff}_p}. \tag{1.4}$$
In (1.4) we have $\mathrm{diff}_p = \log_p\big(\mathbb{E}(p^{\mathrm{diff}(v/p)})\big)$ and $\mathrm{diff}(v/p) = \mathrm{ord}_p(\mathrm{Diff}(K_v/\mathbb{Q}_p))$, where $\mathrm{Diff}(K_v/\mathbb{Q}_p)$ is the different of $K_v$ over $\mathbb{Q}_p$.

Using these quantities we can now state the Probabilistic Szpiro.

Theorem 1.0.3 (Probabilistic Szpiro). Assume [Moc15a, Corollary 3.12] and Claim 5.0.1. For any elliptic curve $E/F$ in initial theta data $(\overline{F}/F,\ E_F,\ l,\ \underline{M},\ \underline{V},\ V^{\mathrm{bad}}_{\mathrm{mod}},\ \underline{\epsilon})$ built from the field of moduli we have
16 + ε l ln | ∆ min E/F | [ F : Q ] ≤ ln Diff( V / Q ) + X p ln( e p ) + A l,V (1.5) where A l,V = ln( π ) + X p (1 − P l +1 / ,p ) (cid:18) ln( b p ) + 5 l + 4 (cid:19) , and b p = 1 / exp(1) ln( p ) , and ε l = 24( l + 3) / ( l + l − . From the Probabilistic Szpiro Inequality we can derive a “Baby Szpiro” Inequality. Thisinequality only depends on discriminant and degree of the division field K . This inequalitycan be derived quickly dispensing with a discussion of ramification of the mod l Galoisrepresentation and its relation to the conductor (which is reviewed in § I in a ring of integers R we let | I | denote the absolute norm. Theorem 1.0.4 (Baby Szpiro) . Assume [Moc15a, Corollary 3.12] and Claim 5.0.1. Thenfor any elliptic curve
$E/F$ in initial theta data $(\overline{F}/F,\ E_F,\ l,\ \underline{M},\ \underline{V},\ V^{\mathrm{bad}}_{\mathrm{mod}},\ \underline{\epsilon})$ built from the field of moduli we have
16 + ε l ln | ∆ min E/F | [ F : Q ] ≤ ln([ K : Q ] / ) ln( | Disc( K/ Q ) | / ) + ln( π ) . (1.6) In the above formula ε l = (24 l + 72) / ( l + l − and K = Q ( j E , E [30 l ] , √− . n § Theorem 1.0.5 (Explicit Szpiro) . Assume [Moc15a, Corollary 3.12] and Claim 5.0.1. If
E/F is an elliptic curve in initial theta data ( F /F, E F , l, M , V , V badmod , ǫ ) built from the fieldof moduli then | ∆ min E/F | ≤ e A d l + B d ( | Cond(
E/F ) | · | Disc( F/ Q ) | ) ε l , (1.7) where A = 84372107405 ,B = 316495 ,ε l = 96 ( l + 3) / ( l + l − . All of these computations follow the same pattern. First, one computes an upper boundfor the so-called hull of the multiradial representation (c.f. [DH20b, § § § Acknowledgements.
This article is very much indebted to many previous expositions of IUT including (but not limited to) [Fes15, Hos18, Ked15, Hos15, Sti15, Mok15, Moc17, Yam17, Hos17, Tan18, SS17]. The first author also greatly benefitted from conversations with many other mathematicians and would especially like to thank:

Yuichiro Hoshi for helpful discussions regarding Kummer theory and his patience during discussions of the theta link and Mochizuki’s comparison;

Kirti Joshi for discussions on deformation theory in the context of IUT;

Kiran Kedlaya for productive discussions on Frobenioids, tempered fundamental groups, and global aspects of IUT;

Emmanuel Lepage for helpful discussions on the p-adic logarithm, initial theta data, aut holomorphic spaces, the log-Kummer correspondence, theta functions and their functional equations, tempered fundamental groups, log-structures, cyclotomic synchronization, reconstruction of fundamental groups, reconstruction of decomposition groups, the “multiradial representation of the theta pilot object”, the third indeterminacy, the second indeterminacy, discussions on Hodge theaters, labels, and kappa coric functions, and discussions on local class field theory;

Shinichi Mochizuki for his patience in clarifying many aspects of his theory, including discussions regarding the relationship between IUT and Hodge-Arakelov theory (especially the role of “global multiplicative subspaces” in IUT), discussions on technical hypotheses in initial theta data, discussions on Theorem 3.11 and “(abc)-modules”, discussions on mono-theta environments and the interior and exterior cyclotomes, discussions of the behavior of various objects with respect to automorphisms and providing comments on treatment of log-links and the use of polyisomorphisms, discussions on indeterminacies and the multiradial representation, discussions of the theta link, discussions on various incarnations of Arakelov divisors, and discussions on cyclotomic synchronization;

Chung Pang Mok for productive discussions on the p-adic logarithm, anabelian evaluation, indeterminacies, the theta link, and Hodge theaters;

Thomas Scanlon for discussions regarding interpretations and infinitary logic as applied to IUT and anabelian geometry.

We apologize if we have forgotten anybody.

The authors also benefitted from the existence of the following workshops: the 2015 Oxford workshop funded by the Clay Mathematics Institute and the EPSRC programme grant Symmetries and Correspondences; the 2017 Kyoto IUT Summit workshop funded by RIMS and EPSRC; the Vermont workshop in 2017 funded by the NSF DMS-1519977 and Symmetries and Correspondences entitled Kummer Classes and Anabelian Geometry; the 2018 Vermont Workshop on Witt Vectors, Deformations and Absolute Geometry funded by NSF DMS-1801012.

The first author was partially supported by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant agreement no. 291111/MODAG while working on this project.

The research discussed in the present paper profited enormously from the generous support of the International Joint Usage/Research Center (iJU/RC) located at Kyoto University’s Research Institute for Mathematical Sciences (RIMS) as well as the Preparatory Center for Research in Next-Generation Geometry located at RIMS.

2. Explicit Computations in Tensor Packets
The entire purpose of this section is Theorem 2.8.1, and the entire purpose of Theorem 2.8.1 is the hull computation in §6. At the end of the day the differents appearing in these sections are what give rise to the conductor term in Szpiro inequalities (in conjunction with the material on conductors and ramification).

Let $K_1, \dots, K_m$ be finite extensions of $\mathbb{Q}_p$. Let $L = K_1 \otimes \cdots \otimes K_m$. We are interested in the difference between the $\mathbb{Z}_p$-lattices $\mathcal{O}_{K_1} \otimes \cdots \otimes \mathcal{O}_{K_m} \subset \mathcal{O}_L$. (Footnote: a lattice of a $\mathbb{Q}_p$-vector space $V$ is a $\mathbb{Z}_p$-submodule $V_0 \subset V$ which is free of rank $\dim(V)$ and whose $\mathbb{Q}_p$-span is all of $V$.) It turns out that the index of $\mathcal{O}_{K_1} \otimes \cdots \otimes \mathcal{O}_{K_m}$ in $\mathcal{O}_L$ is related to the differents of the $K_i/\mathbb{Q}_p$, which we describe in the subsequent subsections. Again, this is needed for the hull computations.

2.1. Integral Closures.
For a reduced ring $T$, the total ring of fractions is defined by $\kappa(T) = S_{\max}^{-1} T$ where $S_{\max}$ is the multiplicatively closed set of non-zero divisors. We let $\mathrm{Int}_{\kappa(T)}(T)$ denote the integral closure of $T$ in $\kappa(T)$.

Let $K_1, \dots, K_m$ be finite extensions of $\mathbb{Q}_p$ and consider the special case $T = \mathcal{O}_{K_1} \otimes \cdots \otimes \mathcal{O}_{K_m}$ (where tensor products are taken over $\mathbb{Z}_p$). It turns out that $\kappa(T) = K_1 \otimes \cdots \otimes K_m$ (where tensor products are over $\mathbb{Q}_p$). Let $L = \kappa(T)$. We can use the Chinese Remainder Theorem to write $L \cong \bigoplus_{j=1}^{r} L_j$ where each $L_j$ is a finite extension of $\mathbb{Q}_p$. (Footnote: in explicit cases one can actually proceed differently, say, using conductors in the sense of commutative algebra.) Define $\mathcal{O}_L = \bigoplus_{j=1}^{r} \mathcal{O}_{L_j}$. It also turns out that $\mathrm{Int}_L(T) = \mathcal{O}_L$.

2.2. Differents and Discriminants.
For a discussion on differents and discriminants of fields we refer the reader to [Neu99, III.2] or [Sut15] or [Con]. A very comprehensive review of differents in great generality can be found in [Aut19, 0DW4] and the references therein.

For $A \supset B$ a finite extension of rings we define the different ideal to be
$$\mathrm{Diff}(A/B) := \mathrm{ann}_A(\Omega_{A/B}).$$
Here $\Omega_{A/B}$ is the module of Kähler differentials. When $L/L_0$ is an extension of fields we use the notation $\mathrm{Diff}(L/L_0) := \mathrm{Diff}(\mathcal{O}_L/\mathcal{O}_{L_0})$. For an extension of number fields $L/L_0$ the different ideal can be computed “as a product” of local differents (see [Neu99] for details). The different also behaves well in towers.

If $K$ is a finite extension of $\mathbb{Q}_p$ with residue field $k$ then $\mathcal{O}_K$ can be written as
$$\mathcal{O}_K = W(k)[x]/(f(x)),$$
where $f(x)$ is an Eisenstein polynomial of degree $e$ (also $e$ is the ramification degree of $K/\mathbb{Q}_p$) and $W(k)$ is the full ring of $p$-typical Witt vectors of $k$. As $\mathrm{Diff}(W(k)/\mathbb{Z}_p) = 1$ (this extension is unramified), to compute $\mathrm{Diff}(\mathcal{O}_K/\mathbb{Z}_p)$ it remains to compute $\mathrm{Diff}(\mathcal{O}_K/W(k))$. From the formula $\Omega_{\mathcal{O}_K/W(k)} = (\mathcal{O}_K \cdot dx)/(\mathcal{O}_K \cdot df)$ and $d(f(x)) = f'(x)\,dx$ we find that
$$\mathrm{Diff}(\mathcal{O}_K/\mathbb{Z}_p) = (f'(\pi)), \tag{2.1}$$
where $\pi$ denotes the image of $x$ in $\mathcal{O}_K$, a uniformizer.

We can do some more computations to get some useful information. We find $f'(\pi) = e\pi^{e-1} + \cdots$ where all of the terms have distinct valuation and the leading term of $f'(\pi)$ has minimal valuation (this is due to the Eisenstein-ness hypothesis). This gives the formula $\mathrm{Diff}(K/\mathbb{Q}_p) = (e\pi^{e-1})$ from which we compute
$$\mathrm{ord}_p(\mathrm{Diff}(K/\mathbb{Q}_p)) = \mathrm{ord}_p(e) + (e-1)\,\mathrm{ord}_p(\pi) = \mathrm{ord}_p(e) + (e-1)\frac{1}{e}.$$
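To make the last formula concrete, here is a minimal Python sketch that evaluates $\mathrm{ord}_p(e) + (e-1)/e$ exactly with rational arithmetic. The function names `ord_p` and `different_exponent` are our own (hypothetical, not taken from the paper), and the sketch only evaluates the displayed expression for a given ramification degree $e$ and prime $p$; it does not compute differents of actual fields.

```python
from fractions import Fraction


def ord_p(n: int, p: int) -> int:
    """p-adic valuation of a nonzero integer n."""
    if n == 0:
        raise ValueError("ord_p(0) is infinite")
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def different_exponent(e: int, p: int) -> Fraction:
    """Evaluate ord_p(e) + (e - 1)/e, the expression displayed above.

    e is the ramification degree of K/Q_p (assumed >= 1) and p is the
    residue characteristic; the result is an exact rational number.
    """
    return ord_p(e, p) + Fraction(e - 1, e)


if __name__ == "__main__":
    # Example inputs (illustrative only): e = 2 at p = 2 and at p = 3.
    print(different_exponent(2, 2))  # wild case: ord_2(2) = 1 contributes
    print(different_exponent(2, 3))  # tame case: only (e - 1)/e remains
```

For instance, with $e = 2$ and $p = 2$ (as for the Eisenstein polynomial $x^2 - 2$) the expression evaluates to $3/2$, while for $e = 2$ and $p = 3$ the term $\mathrm{ord}_p(e)$ vanishes and the value is $1/2$.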
The discriminant of an extension of fields $L/L_0$ is then defined to be the ideal-norm of the different:
$$\mathrm{Disc}(L/L_0) = N_{L/L_0}(\mathrm{Diff}(L/L_0)) \trianglelefteq \mathcal{O}_{L_0}.$$
We remark that $\prod_p p^{\mathrm{ord}_p \mathrm{Diff}(L/\mathbb{Q})} = |\mathrm{Disc}(L/\mathbb{Q})|^{1/[L:\mathbb{Q}]}$. This is helpful when thinking about (say) (1.5). In later sections we will make use of the notation $\mathrm{diff}(K/\mathbb{Q}_p) = \mathrm{ord}_p \mathrm{Diff}(K/\mathbb{Q}_p)$.

2.3. Explicit Chinese Remainder Formulas.
Fix a field $K_0$ and an algebraic closure $\overline{K}$. Let $K_1, \dots, K_m$ be finite extensions of $K_0$ sitting inside the common algebraic closure. The isomorphism of rings
$$K_1 \otimes \cdots \otimes K_m \xrightarrow{\ \varphi\ } \bigoplus_{\psi \in \Phi} L_\psi \tag{2.2}$$
will play an important role for us (see [Neu99]). We describe its ingredients:
• $\Phi \subset \prod_{i=1}^{m} \mathrm{Hom}(K_i, \overline{K})$ is a complete system of representatives under the equivalence relation
$$(\psi_1, \dots, \psi_m) \sim (\sigma\psi_1, \dots, \sigma\psi_m)$$
for $\sigma \in G(\overline{K}/K_0)$.
• For $\psi = (\psi_1, \dots, \psi_m) \in \Phi$ we let $L_\psi$ be the compositum
$$L_\psi = \psi_1(K_1) \cdots \psi_m(K_m) \subset \overline{K}.$$
• The isomorphism $\varphi$ is defined via extending linearly the map
$$\varphi(a_1 \otimes \cdots \otimes a_m) = (\varphi_\psi(a_1 \otimes \cdots \otimes a_m))_{\psi \in \Phi},$$
where $\varphi_\psi(a_1 \otimes \cdots \otimes a_m) = \psi_1(a_1) \cdots \psi_m(a_m)$.

We prove (2.2) is an isomorphism. To see this we first note that $\mathrm{Spec}(K_1 \otimes \cdots \otimes K_m) = \prod_{i=1}^{m} \mathrm{Spec}(K_i)$, so the scheme is zero dimensional (and the spectrum of a product of fields). Each maximal ideal in the tensor product is the kernel of some map $\varphi : K_1 \otimes \cdots \otimes K_m \to \overline{K}$. Two such maps have the same kernel if and only if they differ by an automorphism of $\overline{K}$. This explains the bijection between maximal ideals of the tensor product and $\big(\prod_{i} \mathrm{Hom}(K_i, \overline{K})\big)/\sim$. Also, using the composition $K_i \to K_1 \otimes \cdots \otimes K_m \to \overline{K}$ we see that any $\varphi$ induces $\psi_i : K_i \to \overline{K}$, and we witness $\varphi$ as having the special form $\varphi(a_1 \otimes \cdots \otimes a_m) = \psi_1(a_1) \cdots \psi_m(a_m)$.

2.4. Field Embeddings vs Choices of Roots.
Let $K/K_0$ be a finite field extension. Write this as a primitive extension with $K = K_0(\alpha)$ and let $f(x)$ be the minimal polynomial of $\alpha$. Using this notation we can write down a bijection
$$\mathrm{Hom}_{K_0}(K, \overline{K}) \xrightarrow{\ \sim\ } \{ \beta \in \overline{K} : f(\beta) = 0 \}, \qquad \psi \mapsto \psi(\alpha).$$
Now, let $\Phi \subset \prod_{i=1}^{m} \mathrm{Hom}_{K_0}(K_i, \overline{K})$ be a complete system of representatives for the equivalence relation $\sim$. Let $K_i = K_0(\alpha_i)$ where $\alpha_i$ has minimal polynomial $f_i(x)$. We can modify any $\psi = (\psi_1, \dots, \psi_m)$ by some $\sigma \in G(\overline{K}/K_0)$ with $\sigma\psi_1 = \mathrm{id}_{K_1}$ so that
$$(\psi_1, \psi_2, \dots, \psi_m) \sim (\mathrm{id}_{K_1}, \psi_2', \dots, \psi_m').$$
Such choices of $\psi$ will be called normalized (for $K_1$) and a collection of embeddings $\Phi$ will be called normalized if each element is normalized.

Note that normalized $\Phi$ are in bijection with tuples of roots of the corresponding minimal polynomials:
$$\Phi \xrightarrow{\ \sim\ } \{ \vec{\alpha}' = (\alpha_2', \dots, \alpha_m') : f_2(\alpha_2') = 0, \dots, f_m(\alpha_m') = 0 \},$$
$$(\mathrm{id}_{K_1}, \psi_2, \dots, \psi_m) = (\psi_1, \psi_2, \dots, \psi_m) \mapsto (\psi_2(\alpha_2), \dots, \psi_m(\alpha_m)).$$
We record that the inverse map is given by
$$\vec{\alpha}' \mapsto \psi_{\vec{\alpha}'},$$
where the components of $\psi_{\vec{\alpha}'}$ are the field embeddings uniquely determined by where they send the specified primitive element. We will make use of this correspondence frequently.

2.5. Notation for Quotients.
For a quotient $R[x_1, \dots, x_n]/I$ of a polynomial ring we will sometimes use the notation $\bar{x}_1, \dots, \bar{x}_n$ to denote the images of $x_1, \dots, x_n$ in the quotient.

2.6. Decomposition Comparisons.
Given fields $K_i = K_0(\alpha_i)$ with minimal polynomials $f_i(x)$ for $i = 1, \dots, m$ and $\Phi$ a $K_1$-normalized system of embeddings, we are interested in a description of the isomorphism (2.2) under the image of the base-change functor $\overline{K} \otimes_{K_1} -$. This description will be used later in relating the tensor product of rings of integers to the ring of integers of tensor products.

First, we observe that $K_1 \otimes \cdots \otimes K_m \cong K_1[x_2, \dots, x_m]/(f_2, \dots, f_m)$. This gives
$$\overline{K} \otimes_{K_1} (K_1 \otimes \cdots \otimes K_m) \cong \overline{K} \otimes_{K_1} K_1[x_2, \dots, x_m]/(f_2, \dots, f_m)$$
$$\cong \overline{K}[x_2, \dots, x_m]/(f_2, \dots, f_m)$$
$$\cong \bigoplus_{\vec{\alpha}'} \overline{K}[x_2, \dots, x_m]/(x_2 - \alpha_2', \dots, x_m - \alpha_m').$$
Hence the isomorphism
$$\overline{K} \otimes_{K_1} \varphi : \bigoplus_{\vec{\alpha}'} \overline{K}[x_2, \dots, x_m]/(x_2 - \alpha_2', \dots, x_m - \alpha_m') \to \overline{K} \otimes_{K_1} \bigoplus_{\psi \in \Phi} L_\psi$$
is now seen to be given by
$$(f(\bar{x}_2, \dots, \bar{x}_m))_{\vec{\alpha}'} \mapsto (f(\alpha_2', \dots, \alpha_m'))_{\psi_{\vec{\alpha}'}}.$$
The point here is that base changing to the algebraic closure splits fields and this allows us to work with roots of polynomials.

2.7. Idempotents and Differents.
Let $K_1, \dots, K_m$ be finite extensions of a field $K_0$ with $K_i = K_0(\alpha_i)$ and minimal polynomials $f_i$. If $K_1$ contains the Galois closures of $K_2, \dots, K_m$ then the idempotents of $K_1 \otimes \cdots \otimes K_m$ have the form
$$g_{j_2, \dots, j_m} = \prod_{i=2}^{m} \frac{f_i(\bar{x}_i)}{(\bar{x}_i - \alpha_{i,j_i})} \frac{1}{f_i'(\alpha_{i,j_i})}. \tag{2.3}$$
Here $K_1 \otimes \cdots \otimes K_m = K_1[x_2, \dots, x_m]/(f_2, \dots, f_m)$ and
$$f_i(x) = (x - \alpha_{i,1})(x - \alpha_{i,2}) \cdots (x - \alpha_{i,n_i}).$$
Alternatively, we can fix some tuple $\psi := (\psi_1, \dots, \psi_m)$ of embeddings $\psi_i : K_i \to \overline{K}$ and write
$$g_\psi = \prod_{i=2}^{m} \frac{f_i(\bar{x}_i)}{(\bar{x}_i - \psi_i(\alpha_i))} \frac{1}{f_i'(\psi_i(\alpha_i))}.$$

Proof. We know that
$$K_1[x_2, \dots, x_m]/(f_2, \dots, f_m) = \bigoplus_{\vec{\alpha}' = (\alpha_2', \dots, \alpha_m')} K_1[x_2, \dots, x_m]/(x_2 - \alpha_2', \dots, x_m - \alpha_m').$$
To find the idempotents in this decomposition is the same as solving for $g_{\vec{\alpha}'}$ such that
$$g_{\vec{\alpha}'} \equiv 1 \mod (x_2 - \alpha_2', \dots, x_m - \alpha_m'),$$
$$g_{\vec{\alpha}'} \equiv 0 \mod (x_2 - \beta_2', \dots, x_m - \beta_m'), \qquad \vec{\beta}' \neq \vec{\alpha}'.$$
Since $f_i(x)/(x - \alpha_i') \to f_i'(\alpha_i')$ as $x \to \alpha_i'$ by L’Hôpital’s rule (which by universality of the computation holds algebraically), the element
$$\tilde{g}_i(x) = \frac{f_i(x)}{(x - \alpha_i')} \frac{1}{f_i'(\alpha_i')}$$
has $\tilde{g}_i(\alpha_i') = 1$ and $\tilde{g}_i(\beta_i') = 0$ for $\beta_i' \neq \alpha_i'$. To obtain our result we just take the product of the $\tilde{g}_i$ as in the statement of the result. $\square$

The relation between idempotents and differents now appears clear via formula (2.1) in §2.2.

2.8. Rings of Integers of Tensor Products vs Tensor Products of Rings of Integers [Moc15b, Theorem 1.1].
We now give the comparison of $T = \bigotimes_{i=1}^{m} \mathcal{O}_{K_i}$ and $\mathcal{O}_L$. Here $\mathcal{O}_L = \bigoplus_{\psi \in \Phi} \mathcal{O}_{L_\psi}$ where $L = K_1 \otimes \cdots \otimes K_m = \bigoplus_{\psi \in \Phi} L_\psi$. We remind ourselves that $\mathcal{O}_L$ is a $T$-algebra. Here $\varphi : T \to \mathcal{O}_L$ is given by (extending linearly)
$$\varphi(a_1 \otimes \cdots \otimes a_m) = (\psi_1(a_1) \cdots \psi_m(a_m))_{\psi \in \Phi}.$$
For future reference we will let $\varphi_\psi$ denote the component of $\varphi$ in the $\psi$th factor. Explicitly, $\varphi_\psi(a_1 \otimes \cdots \otimes a_m) = \psi_1(a_1) \cdots \psi_m(a_m)$ if $\psi = (\psi_1, \dots, \psi_m)$.

Theorem 2.8.1. Let $K_1, \dots, K_m$ be finite extensions of $\mathbb{Q}_p$ sitting in a fixed algebraic closure. Let $T = \mathcal{O}_{K_1} \otimes \cdots \otimes \mathcal{O}_{K_m}$. Let $L = \kappa(T) = K_1 \otimes \cdots \otimes K_m$. Let $k_i$ for $i = 1, \dots, m$ denote the respective residue fields of the $K_i$. If $\beta = 1 \otimes f_2'(\alpha_2) \otimes \cdots \otimes f_m'(\alpha_m)$, where $\mathcal{O}_{K_i} = W(k_i)[\alpha_i]$ with Eisenstein polynomial $f_i(x) \in W(k_i)[x]$, then $\beta \in (T :_L \mathcal{O}_L)$. That is, $\beta \cdot \mathcal{O}_L \subset T$.

Proof. In what follows we let $\overline{\mathbb{Z}}_p$ denote the integral closure of $\mathbb{Z}_p$ in $\overline{\mathbb{Q}}_p$. In view of faithful flatness [AM69, Chapter 3, exercises 16, 17] it is enough to show
$$\overline{\mathbb{Z}}_p \otimes_{\mathcal{O}_{K_1}} (\beta \cdot \mathcal{O}_L) \subset \overline{\mathbb{Z}}_p \otimes_{\mathcal{O}_{K_1}} T.$$
We will use the notation $\overline{\mathcal{O}}_L := \overline{\mathbb{Z}}_p \otimes_{\mathcal{O}_{K_1}} \mathcal{O}_L$ and $\overline{T} := \overline{\mathbb{Z}}_p \otimes_{\mathcal{O}_{K_1}} T$. Using our embedding decomposition we have $\overline{\mathbb{Z}}_p \otimes_{\mathcal{O}_{K_1}} (\beta \mathcal{O}_L) = \overline{\beta} \cdot \overline{\mathcal{O}}_L$ where $\overline{\beta} = \sum_{\psi \in \Phi} \varphi_\psi(\beta) g_\psi$. Here we note that
$$\varphi_\psi(\beta) = \varphi_\psi(1 \otimes f_2'(\alpha_2) \otimes \cdots \otimes f_m'(\alpha_m)) = f_2'(\psi_2(\alpha_2)) \cdots f_m'(\psi_m(\alpha_m)),$$
for $\psi = (\psi_1, \dots, \psi_m) \in \Phi$ (here we take $\Phi$ to be $K_1$-normalized).

We now use that the idempotents are given by
$$g_\psi = \prod_{i=2}^{m} \frac{f_i(\bar{x}_i)}{(\bar{x}_i - \psi_i(\alpha_i))} \frac{1}{f_i'(\psi_i(\alpha_i))} \in \frac{1}{\varphi_\psi(\beta)} \overline{\mathbb{Z}}_p[\bar{x}_2, \dots, \bar{x}_m] = \frac{1}{\varphi_\psi(\beta)} \overline{T}.$$
Now we just check: if $x \in \overline{\mathcal{O}}_L$ it has the form $x = \sum_{\psi \in \Phi} x_\psi g_\psi$ for some $x_\psi \in \overline{\mathbb{Z}}_p$. We have
$$\overline{\beta} \cdot x = \Big( \sum_{\psi \in \Phi} \varphi_\psi(\beta) g_\psi \Big) \Big( \sum_{\xi \in \Phi} x_\xi g_\xi \Big) = \sum_{\psi} \varphi_\psi(\beta) x_\psi g_\psi \in \overline{T}.$$
The second equality follows from orthogonality of idempotents and the last membership statement follows from the fact that $\varphi_\psi(\beta) g_\psi \in \overline{T}$. $\square$

Remark. The proof of Theorem 2.8.1 has nothing to do with $K_1$. We can choose some $K_i$ which makes the inclusion tightest.

3. Conductors, Minimal Discriminants, and Ramification
This section contains definitions and facts about bad reduction, minimal discriminants, and Galois theory necessary for our applications. Readers just interested in Probabilistic Szpiro or Baby Szpiro (§7) may skip the last two subsections and proceed directly to §6. For a quick reading, readers may wish to skip ahead.

3.1. Inertia/Decomposition Sequences.
Recall that for a finite extension $K$ of $\mathbb{Q}_p$ we have an extension of topological groups
$$1 \to I_K \to G_K \to G_k \to 1,$$
where $I_K$ is the inertia group and $k$ is the residue field.

If $L$ is a global field, $v \in V(L)$ is non-archimedean, and $\bar{v} \mid v$ is a place of $\overline{L}$, we have
$$1 \to I(\bar{v}/v) \to D(\bar{v}/v) \to G(\kappa(\bar{v})/\kappa(v)) \to 1,$$
where $D(\bar{v}/v) = \mathrm{Stab}_{G_L}(\bar{v}) \cong G_{L_v}$ is the decomposition group.

3.2. Unramified and Ramified Representations.
Let $L$ be a finite extension of $\mathbb{Q}_p$. If $X$ is an object in a category, a representation $\rho : G_L \to \mathrm{Aut}(X)$ is unramified if and only if $\rho(I_L) = 1$. We may speak of $X$ being unramified, where the representation is understood (usually torsion points of an elliptic curve).

Let $L$ be a global field. Given $\rho : G_L \to \mathrm{Aut}(X)$ we say that $\rho$ is unramified at $v$ if and only if $\rho|_{G_v}$ is unramified. In either of these cases, if a representation is not unramified it is called ramified.

3.3. Good and Bad Reduction.
Let $K$ be a finite extension of $\mathbb{Q}_p$. Let $R = \mathcal{O}_K$ be its ring of integers and let $k$ be its residue field. Let $A_K$ be an abelian variety over $K$. We recall that $A_K$ has good reduction if and only if there exists an abelian scheme $A$ over $R$ whose generic fiber is isomorphic to $A_K$. This is equivalent to the special fiber of the Néron model being an abelian variety. Given an abelian variety $A_L$ over a global field $L$ we say that $A_L$ has good reduction at $v$ if and only if $A_{L_v}$ does.

3.4. Division Fields and Galois Representations.
Let $A$ be an abelian variety over a field $L$. Let $m$ be an integer. We will abuse notation and let $A[m]$ denote both the group scheme of $m$-torsion points and the $G_L$-module given by taking $\overline{L}$-points of this group scheme.

Assume now that $L$ is a number field. We will let $L_l = L(A[l])$. We remark that $L_l$ may be defined by literally adjoining the coordinates of torsion points in some model and that this field extension is independent of the model. If we fix an algebraic closure $\overline{L}$ we also have $L_l \cong \overline{L}^{\ker \rho_l}$. We also note that $G(L_l/L) \cong \mathrm{im}(\rho_l : G_L \to \mathrm{Aut}(A[l]))$.

Consider now the Tate module $T_l A = \varprojlim A[l^n]$ in the category of Galois modules. We let $\rho_{l^\infty} : G_L \to \mathrm{Aut}\, T_l A$ denote the action in the underlying representation. When it is necessary to specify the abelian variety we use $\rho_{l^\infty, A}$.

Serre’s surjectivity theorem says that for an elliptic curve without complex multiplication $\rho_l$ is surjective for all but finitely many $l$. This implies that for $l$ sufficiently large $\mathrm{im}(\rho_l) \cong \mathrm{GL}_2(\mathbb{F}_l)$. We make this remark because the initial theta data hypotheses of [DH20b, §5] require $\rho_l(G_F) \supset \mathrm{SL}_2(\mathbb{F}_l)$; Serre’s Conjecture says this is generically true.

3.5. Minimal Discriminants and Tate Parameters.
We suppose that $E$ is an elliptic curve over a number field $F$ sitting in initial theta data. We will assume that it is semi-stable (all bad places are places with multiplicative reduction). Note that if it is not semi-stable one can make a finite change of base such that all places of the new field above a place of additive reduction in the old field are places of good or multiplicative reduction. Under any base change, places in the new field over places of multiplicative reduction in the old field still have multiplicative reduction (hence the word semi-stable).

In the case that $E$ is an elliptic curve over $L$, a finite extension of $\mathbb{Q}_p$, by [Sil13, Ch V, Lemma 5.1] if $|j_E|_p > 1$ (as is the case for $E$ having multiplicative reduction) there exists a Tate parameter $q = q_E \in L$ and an isomorphism of elliptic curves $u : E \to E_q$ defined over $\overline{L}$. Here $E_q$ is the Tate curve which admits a Tate uniformization. Note that this implies that all elliptic curves without potential good reduction have a unique Tate parameter at bad places. In fact: $q_E \in \mathbb{Q}_p(j_E)$ if $L$ is a finite extension of $\mathbb{Q}_p$.

The following describes the relationship between the minimal discriminant and the Tate parameter.

Lemma 3.5.1. If $E$ is an elliptic curve over $L$, a complete discretely valued field with valuation $v$, then
(1) If $E$ has multiplicative reduction then $\mathrm{ord}_v(\Delta^{\min}) = \mathrm{ord}_v(q_E)$, where $\Delta^{\min}$ is the minimal discriminant of $E/L$.
(2) All Tate curves $E_q$ are minimal Weierstrass models.

Proof. The proof of the first assertion follows from Ogg’s Formula. This formula states
$$c = \mathrm{ord}_v(\Delta^{\min}) + 1 - m.$$
Here $c$ is the local conductor exponent, $\Delta^{\min}$ is the minimal discriminant and $m$ is the number of irreducible components of $\mathcal{E}_s$, the special fiber of the Néron model of $E$ for $R = \mathcal{O}_L$. Since our elliptic curve has multiplicative reduction this implies $c = 1$, which implies $m = \mathrm{ord}_v(\Delta^{\min})$. Now we have
$$\mathcal{E}_s/\mathcal{E}_s^0 \cong E_q(L)/(E_q)_0(L) \cong (L^\times/q^{\mathbb{Z}})/(R^\times q^{\mathbb{Z}}/q^{\mathbb{Z}}) \xrightarrow{\ \mathrm{ord}_v\ } \mathbb{Z}/\mathrm{ord}_v(q).$$
The first equality is the Kodaira-Néron Theorem, where $\mathcal{E}_s$ denotes the special fiber of the Néron model and the superscript zero denotes the connected component of the identity. Also $(E_q)_0(L)$ is the kernel of specialization. The second equality follows from Tate uniformization and the last equality follows from taking valuations. From this the equality follows (see [Sil09, Appendix C]).

We now prove $\Delta_{E_q}$ is minimal. We know that
$$\Delta_{E_q} = q \prod_{n \geq 1} (1 - q^n)^{24}.$$
This shows $\mathrm{ord}_v(\Delta_{E_q}) = \mathrm{ord}_v(q)$, and since $\mathrm{ord}_v(q) = \mathrm{ord}_v(\Delta^{\min}_{E_q})$ from the first assertion of the lemma we are done. $\square$

(Footnote on Serre’s Conjecture from §3.4: it is conjectured by Serre that for every number field $L$ there exists some $l_{\max}$ such that for every elliptic curve without complex multiplication and all $l \geq l_{\max}$ we have $\mathrm{im}(\rho_{E,l}) = \mathrm{GL}_2(\mathbb{F}_l)$. In the case $L = \mathbb{Q}$ it is further conjectured that $l_{\max} = 37$.)

3.6. Minimal Discriminants and Base Change.
The following describes how minimal discriminants behave under base change.

Lemma 3.6.1. Let $K/F$ be a finite extension of number fields. If $E/F$ is a semi-stable elliptic curve then
$$[K : F] \ln |\Delta^{\min}_{E/F}| = \ln |\Delta^{\min}_{E_K/K}|.$$

Proof. The proof is a computation:
$$\ln |\Delta^{\min}_{E_K/K}| = \sum_{w \in V(K)} \mathrm{ord}_w(\Delta^{\min}_{E_K/K})\, f(w/p_w) \ln(p_w)$$
$$= \sum_{w \in V(K)} \mathrm{ord}_w(q_w)\, f(w/p_w) \ln(p_w)$$
$$= \sum_{v \in V(F)} \sum_{w \in V(K)_v} e(w/v)\, \mathrm{ord}_v(q_v)\, f(w/v) f(v/p_v) \ln(p_v)$$
$$= \sum_{v \in V(F)} \sum_{w \mid v} [K_w : F_v]\, \mathrm{ord}_v(q_v)\, f(v/p_v) \ln(p_v)$$
$$= [K : F] \sum_{v \in V(F)} \mathrm{ord}_v(q_v)\, f(v/p_v) \ln(p_v) = [K : F] \ln |\Delta^{\min}_{E/F}|. \qquad \square$$

3.7. Normalized Arakelov Degrees.
For a number field L and an Arakelov divisors D ∈ d Div( L ) the normalized Arakelov degree is defined by d deg L ( D ) = d deg L ( D )[ L : Q ] . For v ∈ V ( L ) with [ v ] ∈ d Div( L ) degrees are normalized so that d deg([ v ]) = ln | N v | = f v ln( p v )where p v is the characteristic of κ ( v ) and f v is the inertia degree. We use the property thatthe normalized degree is invariant under pullback: if f : V ( L ) → V ( L ) is the natural mapassociated to an extension of number fields L ⊂ L and D ∈ d Div( L ) then d deg L ( D ) = d deg L ( f ∗ D ) . We record that f ∗ [ v ] = P v | v e ( v/v )[ v ].3.8. q and Theta pilots. Fix initial theta data (
F /F, E F , l, M , V , V badmod , ǫ ). Furthermore,suppose it is built from the field of moduli so that V badmod ⊂ V ( F ) contains all the semi-stableplaces of bad reduction. Definition 3.8.1.
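The $q$-pilot divisor of this data is then
$$P_q = \sum_{v \in V^{\mathrm{bad}}_{\mathrm{mod}}} \mathrm{ord}_v(q_v^{1/2l})[v] \in \mathrm{Div}(F)_{\mathbb{Q}}. \tag{3.1}$$
The $q$-pilot is related to the minimal discriminant of an elliptic curve by the following formula:
$$\widehat{\deg}_F(P_q) = \frac{1}{2l} \frac{\ln |\Delta^{\min}_{E/F}|}{[F : \mathbb{Q}]}. \tag{3.2}$$
To see this we perform a simple computation:
$$\widehat{\deg}_F\Big( \sum_{v \text{ bad}} \mathrm{ord}_v(q_v)[v] \Big) = \widehat{\deg}_K\Big( \sum_{v \in V^{\mathrm{bad}}_{\mathrm{mod}}} \mathrm{ord}_v(q_v) \sum_{w \in V(K)_v} e(w/v)[w] \Big)$$
$$= \widehat{\deg}_K\Big( \sum_{v \in V^{\mathrm{bad}}_{\mathrm{mod}}} \sum_{w \in V(K)_v} \mathrm{ord}_w(q_v)[w] \Big)$$
$$= \widehat{\deg}_K\Big( \sum_{w \text{ bad}} \mathrm{ord}_w(q_w)[w] \Big)$$
$$= \widehat{\deg}_K\Big( \sum_{w \text{ bad}} \mathrm{ord}_w(\Delta^{\min}_{E_K/K})[w] \Big)$$
$$= \frac{\ln |\Delta^{\min}_{E_K/K}|}{[K : \mathbb{Q}]} = \frac{\ln |\Delta^{\min}_{E/F}|}{[F : \mathbb{Q}]}.$$
Formula (3.2) follows by linearity, since $\mathrm{ord}_v(q_v^{1/2l}) = \frac{1}{2l}\mathrm{ord}_v(q_v)$.

We now discuss Theta pilots.

Definition 3.8.2. The theta pilot divisor is a tuple
$$P_\Theta = (P_{\Theta,j})_{j=1}^{(l-1)/2} \in \widehat{\mathrm{Div}}_{\mathrm{lgp}}(F)^{(l-1)/2}_{\mathbb{Q}}$$
where
$$P_{\Theta,j} = \sum_{v \in V(F)^{\mathrm{bad}}} \mathrm{ord}_v(q_v^{j^2/2l})[v] \in \widehat{\mathrm{Div}}(F)_{\mathbb{Q}}.$$
The relationship between the theta and $q$-pilots is given by
$$\widehat{\deg}_{\mathrm{lgp},F}(P_\Theta) = \frac{l(l+1)}{12}\, \widehat{\deg}_F(P_q). \tag{3.3}$$
This formula is derived by a simple computation, using $\sum_{j=1}^{(l-1)/2} j^2 = \frac{l(l-1)(l+1)}{24}$:
$$\widehat{\deg}_{\mathrm{lgp},F}(P_\Theta) = \frac{2}{l-1} \sum_{j=1}^{(l-1)/2} \widehat{\deg}_F(P_{\Theta,j})$$
$$= \frac{2}{l-1} \sum_{j=1}^{(l-1)/2} j^2\, \widehat{\deg}_F(P_q)$$
$$= \frac{l(l+1)}{12}\, \widehat{\deg}_F(P_q).$$

Remark.
(1) The assertion of [SS17, pg 10] is that (3.3) is the only relation between the $q$-pilot and $\Theta$-pilot degrees. The assertion of [Moc18, C14] is that [SS17, pg 10] is not what occurs in [Moc15a].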
The reasoning of [SS17, pg 10] is something like what follows:
(a) The $\Theta^{\times\mu}_{\mathrm{LGP}}$-link in [Moc15a] is a polyisomorphism between $\mathcal{F}^{\Vdash\blacktriangleright\times\mu}$-strips, ${}^{0,0}\mathcal{F}^{\Vdash\blacktriangleright\times\mu}_{\mathrm{LGP}}$ and ${}^{1,0}\mathcal{F}^{\Vdash\blacktriangleright\times\mu}_{\Delta}$.
(b) Within these objects there are two global realified Frobenioids, ${}^{0,0}\mathcal{C}^{\Vdash}_{\mathrm{LGP}}$ and ${}^{1,0}\mathcal{C}^{\Vdash}_{\Delta}$. Also there exist objects ${}^{0,0}P^{\Vdash}_{\Theta} \in {}^{0,0}\mathcal{C}^{\Vdash}_{\mathrm{LGP}}$ and ${}^{1,0}P^{\Vdash}_{q} \in {}^{1,0}\mathcal{C}^{\Vdash}_{\Delta}$, called the (0,0) theta pilot object and the (1,0) $q$-pilot object respectively, and the theta link $\Theta^{\times\mu}_{\mathrm{LGP}}$ is such that $\Theta^{\times\mu}_{\mathrm{LGP}}({}^{0,0}P^{\Vdash}_{\Theta}) = {}^{1,0}P^{\Vdash}_{q}$.
(c) To each such global realified Frobenioid $\mathcal{C}^{\Vdash}$ we can interpret a one dimensional real vector space $\mathrm{Pic}(\mathcal{C}^{\Vdash})$. Also, to any object $P^{\Vdash} \in \mathcal{C}^{\Vdash}$ there is an associated degree $\deg_{\mathcal{C}^{\Vdash}}(P^{\Vdash}) \in \mathrm{Pic}(\mathcal{C}^{\Vdash})$.
(d) Any isomorphism between ${}^{0,0}\mathcal{C}^{\Vdash}_{\mathrm{LGP}}$ and ${}^{1,0}\mathcal{C}^{\Vdash}_{\Delta}$ induces an isomorphism between $\mathrm{Pic}({}^{0,0}\mathcal{C}^{\Vdash}_{\mathrm{LGP}})$ and $\mathrm{Pic}({}^{1,0}\mathcal{C}^{\Vdash}_{\Delta})$.
(e) An identification one can make is to fix isomorphisms
\[ \alpha : \mathrm{Pic}({}^{0,0}\mathcal{C}^{\Vdash}_{\mathrm{LGP}}) \to \mathbf{R}, \qquad \beta : \mathrm{Pic}({}^{1,0}\mathcal{C}^{\Vdash}_{\Delta}) \to \mathbf{R}, \]
specified by extending linearly
\[ \alpha(\deg_{{}^{0,0}\mathcal{C}^{\Vdash}_{\mathrm{LGP}}}({}^{0,0}P^{\Vdash}_{\Theta})) = \widehat{\deg}_{\mathrm{lgp}}(P_\Theta), \qquad \beta(\deg_{{}^{1,0}\mathcal{C}^{\Vdash}_{\Delta}}({}^{1,0}P^{\Vdash}_{q})) = \widehat{\deg}(P_q), \]
where the degrees on the right hand side are as in the present subsection ([SS17, 2.1.6] calls this the canonical trivialization).
(f) The authors of the present article, Scholze-Stix, and Mochizuki all agree that the items above lead to a contradiction. Stripping away the abstraction, these assertions are tautologically equivalent to $\widehat{\deg}_{\mathrm{lgp},F}(P_\Theta) = \frac{l(l+1)}{12} \cdot \widehat{\deg}_F(P_q)$ and $\widehat{\deg}_{\mathrm{lgp},F}(P_\Theta) = \widehat{\deg}_F(P_q)$. This clearly gives a contradiction.
(g) It is our understanding that no such map $\alpha$ is specified in IUT, meaning that commutativity of the diagram consisting of the map induced by $\Theta^{\times\mu}_{\mathrm{LGP}}$, $\alpha$, and $\beta$ is not asserted.

(2) We would like to point out that the diagram on page 10 of [SS17] is very similar to the diagram on § .

We note that there is also the review [Rob 3] which some may find interesting.

3.9. Néron-Ogg-Shafarevich: Conductors and Good Reduction.
The following theorem of Serre and Tate, which they call the Néron-Ogg-Shafarevich Criterion, tells us how ramification of an $l$-power Tate module is related to the reduction geometry of the Néron model of the corresponding abelian variety.

Theorem 3.9.1 ([ST68]). Let $A$ be an abelian variety over a local field $L$ of residue characteristic $p$. The following are equivalent.
(1) For all $m \in \mathbf{N}$ with $(m,p) = 1$, $A[m]$ is unramified.
(2) There exists a rational prime $l$ such that $l \neq p$ and $T_l(A)$ is unramified.
(3) There exist infinitely many $m$ with $(m,p) = 1$ such that $A[m]$ is unramified.
(4) $A$ has good reduction.

We apply this in subsequent sections to get information about the behavior of ramification degrees in our computations. We can apply this theorem to get a criterion relating the conductor of an abelian variety to the discriminant of an associated $l$-division field.

Theorem 3.9.2.
Let $A$ be an abelian variety over a number field $L$. Let $l$ be a rational prime. Let $L_l = L(A[l])$. Let $w$ be a non-archimedean place of $L_l$ coprime to $l$ and $\mathrm{char}(\kappa(w)) = p$. The following holds:
\[ e(w/p) > 1 \iff w \mid l \ \text{ or } \ w \mid \mathrm{Cond}(A/L) \ \text{ or } \ w \mid \mathrm{Diff}(L/\mathbf{Q}). \]

Proof. Suppose that $e(w/p) > 1$. Since $e(w/p) = e(w/v)e(v/p)$, where $v \in V(L)$ is the image of $w \in V(L_l)$ under the natural map $V(L_l) \to V(L)$, we must have $e(w/v) > 1$ or $e(v/p) > 1$. If $e(v/p) > 1$ then $v \mid \mathrm{Diff}(L/\mathbf{Q})$, which implies $w \mid \mathrm{Diff}(L/\mathbf{Q})$. If $e(w/v) > 1$ then $I_{w/v} \neq 1$ since $\# I_{w/v} = e(w/v)$. Since $\rho_l : G(L_l/L) \to \mathrm{Aut}(A[l])$ is injective and $T_l A$ has $A[l]$ as a quotient, we know that $\rho_l(I_{w/v}) \neq 1$ and hence $T_l A$ is ramified. This implies $v \mid \mathrm{Cond}(A/L)$. The final option is $w \mid l$.

Conversely suppose $w \mid l$ or $w \mid \mathrm{Cond}(A/L)$ or $w \mid \mathrm{Diff}(L/\mathbf{Q})$. If $w \mid l$ then since $L_l \supset \mathbf{Q}(\zeta_l)$ we have $e(w/l) \geq l - 1$. If $w \mid \mathrm{Diff}(L/\mathbf{Q})$ then by definition $e(v/p) > 1$. If $w \mid \mathrm{Cond}(A/L)$ then $v \mid \mathrm{Cond}(A/L)$ since $L_l/L$ is Galois. We know that
\[ v \mid \mathrm{Cond}(A/L) \iff c_v \neq 0 \iff v \text{ is ramified} \iff I_{w/v} \neq 1. \]
This proves the result. Above, $c_v = \mathrm{ord}_v(\mathrm{Cond}(A/L))$. □

4. Estimates on $p$-adic Logarithms

The material in this section is applied in § . We let $\log$ denote the $p$-adic logarithm, $\ln$ denote the real valued natural logarithm, and $\log_p$ denote the real valued base $p$ logarithm. We refer the reader to [Rob00] for a quick review of elementary properties of the $p$-adic logarithm. See also [DH20a, § ].

4.1. Notation.

4.1.1. Let $\mathbf{C}_p$ be the $p$-adic completion of $\overline{\mathbf{Q}}_p$ and let $\mathrm{ord}_p$ be the unique extension of the valuation on $\mathbf{Q}_p$ to $\mathbf{C}_p$ with $\mathrm{ord}_p(p) = 1$. We normalize the $p$-adic absolute value by $|x|_p = p^{-\mathrm{ord}_p(x)}$. If $K$ is a local field with uniformizer $\pi_K$ we let $\mathrm{ord}_K$ denote the valuation normalized by $\mathrm{ord}_K(\pi_K) = 1$. In the case that $L$ is a global field and $v \in V(L)$ is a non-archimedean place, we let $\mathrm{ord}_v = \mathrm{ord}_{L_v}$ denote the normalized valuation on $L_v$.

4.1.2. Let $K/K_0$ be a finite extension of non-archimedean fields of residue characteristic $p$. We will let $e(K/K_0)$ denote the ramification degree of the extension. We will say $e(K/K_0)$ is small provided $e(K/K_0) < p - 1$. Note that small implies tame. If $L' \supset L$ is an extension of number fields and $v' \mid v$ are places of the respective number fields we let $e(v'/v) := e(L'_{v'}/L_v)$. If $L$ is a number field, we say that a non-archimedean place $v$ of $L$ is small if $L_v/\mathbf{Q}_p$ is small.

4.1.3. For a $p$-adic field $K$, $a \in K$ and $r \geq 0$ we will denote the disc of radius $r$ centered at $a$ by
\[ D_K(a,r) = \{ x \in K : |x - a|_p \leq r \}. \]
Similarly if $L = \bigoplus_{j=1}^m L_j$ is a finite direct sum of $p$-adic fields, $\vec a = (a_1, \dots, a_m) \in L$ and $\vec r = (r_1, \dots, r_m)$ is a vector of non-negative real numbers then we will denote the polydisc of polyradius $\vec r$ by
\[ D_L(\vec a, \vec r) = \{ (x_1, \dots, x_m) \in L : |x_1 - a_1|_p \leq r_1 \text{ and } \cdots \text{ and } |x_m - a_m|_p \leq r_m \}. \]
When writing $D_L(0,R)$ where $R \in \mathbf{R}$ we will understand this to mean $D_L(0,(R,R,\dots,R))$.

4.2. Estimates on the Size of the $p$-Adic Logarithm. We begin by estimating the size of the $p$-adic logarithm (cf. [Moc15b, Prop 1.2]).

Lemma 4.2.1 (Crude Estimate). Let $a \in \mathbf{C}_p$ with $\mathrm{ord}_p(a) > 0$. We have
\[ |\log(1+a)|_p < \frac{c_p}{\mathrm{ord}_p(a)}, \]
where $c_p = (\exp(1)\ln(p))^{-1}$ and $\exp(1) = 2.71\ldots$ is the base of the natural log.

Proof. To get an upper bound on $|-\log(1-a)|_p = |\sum_{n \geq 1} a^n/n|_p$ for $|a|_p < 1$ it suffices, by the ultrametric inequality, to bound the terms $|a^n/n|_p$ from above. Equivalently, we can bound $\mathrm{ord}_p(a^n/n)$ from below. We find these lower bounds by using
\[ \mathrm{ord}_p(a^n/n) = n\,\mathrm{ord}_p(a) - \mathrm{ord}_p(n) \geq n\,\mathrm{ord}_p(a) - \log_p(n), \]
and minimizing the function $f(x) = xc - \log_p(x)$, where $c = \mathrm{ord}_p(a)$. The function has a global minimum at $x_0 = 1/(c\ln(p))$, which gives
\[ f(x) \geq f(x_0) = \frac{1}{\ln(p)} + \log_p(c\ln(p)). \]
Converting this lower bound on the order to an upper bound on the $p$-adic absolute value gives our result, since $p^{-f(x_0)} = (\exp(1)\ln(p)\,c)^{-1} = c_p/\mathrm{ord}_p(a)$. □

Remark. One can also minimize the function $f(x) = p^x c - x$ giving | log(1 + a ) | p ≤ b p | a | p ord p ( a ) , where b p = p ) e ln( p )2 .
This is not of any use to us.

The application of Lemma 4.2.1 gives an upper bound on the smallest radius $r$ such that $\log(\mathcal{O}_K^\times) \subset D_K(0,r)$, where $K$ is a finite extension of $\mathbf{Q}_p$. With knowledge that $e(K/\mathbf{Q}_p)$ is small we can do much better. We state these results and omit the proofs.

Lemma 4.2.3. Let $K/\mathbf{Q}_p$ be a finite extension.
(1) With no assumptions on the ramification of $K/\mathbf{Q}_p$ we have $\log(\mathcal{O}_K^\times) \subset D_K\big(0, \frac{e(K/\mathbf{Q}_p)}{\ln(p)\exp(1)}\big)$.
(2) If $e(K/\mathbf{Q}_p) < p - 1$ then $\log(\mathcal{O}_K^\times) = \pi\mathcal{O}_K$, where $\pi$ is the uniformizer of $K$.

Remark. In [Moc15b, Prop 1.2] Mochizuki proves $\log(\mathcal{O}_K^\times) \subset p^{-b}\mathcal{O}_K$ where
\[ b = \left\lfloor \frac{\ln\big(\frac{p\,e(K/\mathbf{Q}_p)}{p-1}\big)}{\ln(p)} \right\rfloor - \frac{1}{e(K/\mathbf{Q}_p)}. \]
As far as usability goes, the formula in Lemma 4.2.3, while weaker, seems to be easier to understand.

4.3. $p$-Adic Log Shells. The present section collects and reformulates some of the material in [Moc15c].

Definition 4.3.1. Let $K/\mathbf{Q}_p$ be a finite extension. The log-shell of $K$ is the $\mathbf{Z}_p$-submodule of $K$ defined by
\[ I_K = \frac{1}{2p}\log(\mathcal{O}_K^\times). \]

Lemma 4.3.2 (Upper Semi-Compatibility). $I_K$ contains both $\mathcal{O}_K$ and $\log(\mathcal{O}_K^\times)$.

Proof. It is clear that $\log(\mathcal{O}_K^\times) \subset I_K$. For the other inclusion, since $|2p|_p < r_p$ we have $\log(1 + 2p\mathcal{O}_K) = 2p\mathcal{O}_K$, because $\mathrm{ord}_p(2p) > 1/(p-1)$. Hence
\[ I_K \supset \frac{1}{2p}\log(1 + 2p\mathcal{O}_K) = \frac{1}{2p}(2p\mathcal{O}_K) = \mathcal{O}_K. \]
□

Footnote (to the proof of Lemma 4.2.1): One could have also used $\mathrm{ord}_p(a^n/n) = p^m\,\mathrm{ord}_p(a) - m$ along the sequence $n = p^m$. This will give different, less useful bounds. See the remark following that lemma.

Remark (Module structure of $\log(\mathcal{O}_K^\times)$). For $K/\mathbf{Q}_p$ a finite extension we note that $\log(\mathcal{O}_K^\times)$ has the structure of an $\mathcal{O}_K$-module very rarely. In order for $\log(\mathcal{O}_K^\times)$ to be an $\mathcal{O}_K$-module we need
\[ a\log(b) = \log(b^a) \]
for $a \in \mathcal{O}_K$ and $b \in \mathcal{O}_K^\times$. This in turn depends on the convergence of
\[ b^a = \sum_{n=0}^{\infty} \frac{a(a-1)\cdots(a-n+1)}{n!}(b-1)^n. \]
We will not pursue this here, as estimates will not be needed. On the other hand we do observe that $\log(\mathcal{O}_K^\times)$ is always a $\mathbf{Z}_p$-module for exactly the same reason.

5. Archimedean Logarithms
In order for our estimates to be complete we require definitions and estimates for $\mathrm{hull}(U_\Theta)$ at the Archimedean factor $L_\infty$. For $\vec v = (v_0, \dots, v_j) \in V(F_0)^{j+1}_\infty$ we will let $H_{\vec v}$ denote the component of $\mathrm{hull}(U_\Theta)$ in $K_{v_0} \otimes \cdots \otimes K_{v_j}$ (since $\sqrt{-1} \in K$ we know that $K_v \cong \mathbf{C}$ for each $v \in \underline{V}$).

Claim 5.0.1. If $\vec v \in V(F_0)^{j+1}_\infty$ then $H_{\vec v} \subset D_{L_{\vec v}}(0; R_{\vec v})$ where $\ln(R_{\vec v}) = (j+1)\ln(\pi)$.

We do not develop the theory necessary to discuss this bound as this requires an Archimedean theory parallel to the $p$-adic theory in [ ? ]. A full anabelian treatment requires so-called aut-holomorphic spaces. The starting place is [Moc15a, Definition 1.1]. The claim above can be found in [Moc15b, Proposition 1.5, Proof of Theorem 1.10, step vii].

6. Upper Bounds on Hulls
We now come to the section of the paper which contains the first major computation. Fix initial theta data $(\overline{F}/F, E_F, l, \underline{M}, \underline{V}, V^{\mathrm{bad}}_{\mathrm{mod}}, \underline{\epsilon})$. In this section our goal is to find, for each prime $p$ and each $\vec v \in \coprod_{j=1}^{(l-1)/2} V(F_0)^{j+1}_p$, the smallest polydisc $D_{L_{\vec v}}(0, R_{\vec v})$ such that the component of the multiradial representation at $\vec v$ is contained in this polydisc. The smallest possible polydisc here is called the hull.

6.1. Hulls. If $L = \bigoplus_{j=1}^m L_j$ is a finite direct sum of $p$-adic fields and $\Omega \subset L$ then let us define
\[ R_i(\Omega) = \max\{ |x_i|_p : (x_1, \dots, x_m) \in \Omega \}, \]
and then define the polyradius of $\Omega$ to be
\[ \vec R(\Omega) = (R_1(\Omega), \dots, R_m(\Omega)). \]
Define the hull of $\Omega$ to be the smallest polydisc containing $\Omega$:
\[ \mathrm{hull}(\Omega) = D_L(0, \vec R(\Omega)). \]
It is easy to check that if $\alpha = (\alpha_1, \dots, \alpha_m) \in L$ then
\[ \vec R(\alpha \cdot \Omega) = (|\alpha_1|_p R_1(\Omega), \dots, |\alpha_m|_p R_m(\Omega)). \]
Also, given a collection of compact regions $\Omega_i \subset L$ where $i = 1, 2, \dots$, for each $j$ with $1 \leq j \leq m$ we have
\[ R_j\Big(\bigcup_{i=1}^{\infty} \Omega_i\Big) = \sup\{ R_j(\Omega_i) : i \geq 1 \}. \]
Note that the right hand side of the above equality is possibly infinite.

We now state some basic properties of hulls. For $A, B \subset L$ we will write
\[ A \underset{\sim}{\subset} B \iff \mathrm{hull}(A) \subset \mathrm{hull}(B). \]
Note that $A \underset{\sim}{\subset} B$ if there exists some $\mathbf{Q}_p$-linear transformation $T : L \to L$ with $|\det(T)|_p = 1$ and $T(A) \subset B$ (such a $T$ could be multiplication by a unit of $L$, for example). Also if $\Omega \subset L$ and $a \in K_m$ (which we view as acting on $L$ via multiplication on the $m$th tensor factor) then $a^{\mathbf{N}} \cdot \Omega \underset{\sim}{\subset} a \cdot \Omega$. To see that $\mathrm{hull}(a^{\mathbf{N}} \cdot \Omega) \subset \mathrm{hull}(a \cdot \Omega)$, we observe that $a \in K_m$ acts on each direct summand of $L$ by $\psi_j(a)$, where we have written $L = \bigoplus_j L_{\psi_j}$ using the Chinese Remainder formulas developed in § . We then have $R_j(a^{\mathbf{N}} \cdot \Omega) = \sup\{ R_j(a^n \cdot \Omega) : n \geq 1 \} = |a|_p R_j(\Omega)$. This implies $R_j(a^{\mathbf{N}} \cdot \Omega) \leq R_j(a \cdot \Omega)$ and hence $\mathrm{hull}(a^{\mathbf{N}} \cdot \Omega) \subset \mathrm{hull}(a \cdot \Omega)$.

6.2.
Worst Case Scenario.
We now give a toy-version of our the computation of the hullbound associated to a tuple ~v ∈ V ( F ) j +1 . Here we make assumptions on ramification of ourfields.Let K , . . . , K m be finite extensions of Q p (in our actual application m will be j + 1). Let a ∈ K m with | a | p <
1. Let L = N mi =1 K i ∼ = L ri =1 L j where the factors of the right hand sidecome from the Chinese Remainder Theorem as in § I = N mi =1 I K i be the tensor product of log-shells and Aut( L : I )denote the collection of Q p -vector space automorphisms of L obtained by extending Q p -linearly Z p -lattice automorphisms of I . These automorphisms are a stand-ins for ind1 andind2 in our actual applications (see [DH20b, §
4] for definitions). This subsection gives a bound on the hull of the “multiradial representation” U = hull Aut Q p ( L : m O i =1 log( O × K i )) · ( O ind3( a ) L ) ! . This region is a stand-in for the random measurable set U ( j ) ⊂ A ⊗ j +1 V ,p of the hull of the coarsemultiradial representation of the Theta pilot region (see [DH20b, §
4] and the next section).We prove hull (cid:16)
Aut( L : I ) · ( O ind3( a ) L ) (cid:17) ⊂ D L (0; R ) (6.1)where the radius R is given byln( R ) = −⌊ ord p ( a ) + k diff k ∞ − k diff k ⌋ ln( p ) + m ln( c p ) + m X i =1 ln( e ( K i / Q p )) . (6.2) The only reason this subsection can’t directly be applied is because the actual ind1 has some permutationsamong different tensor product factors of A ⊗ j +1 V ,p . The permutation of these factors does not appear in thisexample. In [DH20b] we used the notation U for what we are now denoting U . he constant c p ∈ R and the vector diff ∈ R m are given by c p = 1 / exp(1) ln( p ) , diff = (diff( K / Q p ) , . . . , diff( K m / Q p )) . To obtain this radius we compute. We have labeled each line in the computation belowand give the justification for each step in the itemized environment following the displayedequations. Aut( L : I ) a N · O L ∪ m O i =1 log( O × K i ) !! (6.3) ⊂ ∼ Aut( L : I ) a · β − m O i =1 O K i ∪ m O i =1 log( O × K i ) !! (6.4) ⊂ ∼ Aut( L : I ) (cid:0) aβ − I (cid:1) (6.5) ⊂ ∼ Aut( L : I ) (cid:0) p ⌊ ord p ( a ) − ord p ( β ) ⌋ I (cid:1) (6.6)= p ⌊ ord p ( a ) − ord p ( β ) ⌋ I (6.7) ⊂ ∼ p ⌊ ord p ( a ) − ord p ( β ) ⌋ D L (0 , (cid:18) p exp(1) ln( p ) (cid:19) m m Y i =1 e ( K i / Q p )) (6.8)Since hull( D L (0 , R )) = D L (0 , R ) for all radiuses R > U ) ⊂ D L (0 , e ( K / Q p ) · · · e ( K m / Q p ) | p | p p −⌊ ord p ( a )+ k diff k ∞ −k diff k ⌋ ) . Here are the justifications for each step: • (6.1) to (6.3): Uses the main result concerning ind3 in [ ? ] • (6.4): Uses the theory of §
2. In particular there exist some β = ( β , . . . , β r ) ∈ L rj =1 L j = L such that β O L = L rj =1 β j O L j ⊂ N ri =1 O K i where ord p ( β j ) = k diff k −k diff k ∞ for each j where 1 ≤ j ≤ r . • (6.5): First we are using the “upper semi-compatibility” of I K , namely that for afinite extension K of Q p we have log( O × K ) , O K ⊂ I K . We use this fact tensor factorby tensor factor. Also, since the factors of β all have large order, multiplication by β − will increase the size of the hull. • (6.6): We are using the general fact that if A is a region and | a | p < | a | p then a A ⊂ ∼ a A . • (6.7): This uses that Aut( L : I ) is by definition Q p -linear and fixes I as a set. • (6.8): We are applying the results of Lemma 4.2.31. Remark . One can break this inclusion down in some alternative ways. Here we highlightsome areas for improvement. We do not pursue these here.(1) Alternative to (6.3): For bounding Aut( L : I )( a · ( O L ∪ L mi =1 log( O K i )) one could writeour an explicit Z p -basis for a · O L and explicitly compute the action by Aut( L : I ).
2) Alternative to (6.4): One could attempt to compute the index of N mi =1 O K i in O L .This seems practical to do in specific toy cases but the size of the division fields maygive in actual applications. It seems conceivable that other invariants around thisinclusions can be used to write down more precise results.(3) (6.5): One could attempt to find a smaller region here containing the two sets. Arelog-shells optimal? Maybe, maybe not.(4) (6.8): We can go beyond the worst case scenario and make additional considerationsabout the ramification of the fields to improve bounds on I . This includes applyingthe second part of Lemma 4.2.3 (which is applicable most of the times). In fact, forall but finitely many places of v ∈ V ( F ) we have I v = O K v .6.3. Actual Scenario.
Fix initial theta data (
F /F, E F , l, M , V , V badmod , ǫ ) built from the fieldof moduli. In what follows A V = Q v ∈ V ( F ) K v denotes the “fake adeles” from [DH20b, § § U ( j ) p ⊂ A ⊗ j +1 V ,p =: L ( j ) p where U ( j ) p is of the form U ( j ) p = ind2(ind1( O ind3( ~a j ) L ( j ) p )) . Here we have made the following notational conventions: O L ( j ) p = M v | p O L ( j ) v , O L ( j ) v = Peel jv ( O L ( j ) v ) , O ind3( ~a j ) L ( j ) p = M v | p O ind3( a j,v ) L ( j ) v , and we have let ~a j = ( a j,v ) v ∈ V ( F ) where a j,v = ( q j / lv , v bad multiplicative1 , else . All of this of course depends on a choice of initial theta data. The peel decompositionPeel jv ( O L ( j ) v ) is described in [DH20b, § § ~v = ( v , . . . , v j ) ∈ V j +1 p . We say that ~v is small if every e ( v i /p ) is small for 0 ≤ i ≤ j . Similarly we say that ~v is unramified if v i isunramified for each i where 0 ≤ i ≤ j . We will also let L ~v = K v ⊗ · · · ⊗ K v j , where thetensor products are over Q p . Lemma 6.3.1.
In the notation of this subsection, we have hull( U ( j ) p ) ⊂ Y ~v ∈ V j +1 p D L ~v (0 , R ~v ) here ln( R ~v ) = , ~v unramified and p ∤ ∞−⌊ ord p ( a j,v ) − ord p ( β ~v ) ⌋ ln( p ) , ~v small and p ∤ ∞−⌊ ord p ( a j,v ) − ord p ( β ~v ) ⌋ ln( p ) + ( j + 1) ln( b p ) + P ji =0 ln( e ( v j /p )) , p | ∞ and ~v general ( j + 1) ln( π ) , p | ∞ Proof.
There are three points of departure from the computation in § β and a , improvement of log-bounds, and the inclusion of the archimedean place. In thecase that ~v is unramified we know that a j,v j = 1 by N´eron-Ogg-Shafarevich. In the casethat ~v is small, we apply the bounds from Lemma 4. In the archimedean case we applyLemma 5.0.1. (cid:3) Probabilistic Versions of the Mochizuki and Szpiro Inequalities
Throughout this section we fix initial theta data $(\overline{F}/F, E_F, l, \underline{M}, \underline{V}, V^{\mathrm{bad}}_{\mathrm{mod}}, \underline{\epsilon})$ built from the field of moduli $F_0 = \mathbf{Q}(j_E)$.

7.1. Probability Spaces. Fix a rational prime $p$. Recall that, as in the introduction, we give $\coprod_{j=1}^{(l-1)/2} V(F_0)^{j+1}_p$ the structure of a finite probability space where $(v_0, v_1, \dots, v_j) \in \coprod_{j=1}^{(l-1)/2} V(F_0)^{j+1}_p$ is assigned probability
\[ \Pr((v_0, v_1, \dots, v_j)) = \frac{2}{l-1} \cdot \frac{[K_{v_0} : \mathbf{Q}_p][K_{v_1} : \mathbf{Q}_p] \cdots [K_{v_j} : \mathbf{Q}_p]}{[F_0 : \mathbf{Q}]^{j+1}}. \]
The space $\coprod_{j=1}^{(l-1)/2} V(F_0)^{j+1}_p$ can be viewed as a uniform independent disjoint union of the probability spaces $V(F_0)^{j+1}_p$. For a random variable $X(\vec v)$ that depends on $\vec v = (v_0, v_1, \dots, v_j) \in \coprod_{j=1}^{(l-1)/2} V(F_0)^{j+1}_p$ we can view the expectation of $X$ as an "iterated expectation": first compute the expectation as we vary over $(v_0, \dots, v_j) \in V(F_0)^{j+1}_p$ for a fixed $j$, and then compute the expectation of these expectations as we vary uniformly over $j$. In what follows $E_p$ will denote this iterated expectation:
\[ E_p(X(\vec v)) = E\big( E(X(\vec v) : \vec v \in V(F_0)^{j+1}_p) : 1 \leq j \leq (l-1)/2 \big). \]
Note that the colons here do not denote conditional probabilities.

7.2. Jensen's Inequality. Jensen's inequality states that for a convex function $g(x)$ and a random variable $X$ we have
\[ g(E(X)) \leq E(g(X)). \]
The inequality goes the other way for concave functions, and one can test for convexity using the second derivative test: a twice differentiable function of a real variable $g(x)$ is convex if and only if $g''(x) \geq 0$. In particular $g(x) = \exp(x)$ is a convex function and $g(x) = \ln(x)$ is concave. This allows us to say that
\[ \exp(E(\ln(X))) \leq E(X) \leq \ln(E(\exp(X))). \tag{7.1} \]

7.3. Random Variables Pulled-back from a Projection. Let $S$ be a discrete probability space. Let $(X_1, \dots, X_n)$ be a random variable on $S^n$. If $f(X_1, \dots, X_n)$ only depends on $X_n$ (i.e. $f(X_1, \dots, X_n) = g(X_n)$ for some function $g$ of a single variable) then the expected value of $f(X_1, \dots, X_n)$ can be computed by just varying over what the function depends on. In symbols:
\[ E(f(X_1, \dots, X_n)) = E(g(X)), \]
where $X$ denotes a random variable on $S$ distributed as the coordinate $X_n$. It is also elementary to check that, for the product measure on $S^n$,
\[ E(g(X_1)g(X_2) \cdots g(X_n)) = E(g(X))^n. \]

7.4. Measures. For $L$ a direct sum of $p$-adic fields, we will often make use of the formula
\[ \ln\mu_L(D_L(0,R)) \leq \ln(R). \]
Here, for a finite dimensional vector space $V$ and a measurable set $A \subset V$, the normalized quantity $\ln\mu_V(A)$ is defined by
\[ \ln\mu_V(A) = \ln(\mu_V(A))/\dim(V). \]

7.5. Probabilistic Mochizuki.
Using the probabilistic formalism developed in [DH20b, § ], one obtains the following.

Theorem 7.5.1 (Tautological Probabilistic Inequality). For $\vec v \in \underline{V}^{j+1}_p$ let
\[ R^\circ_{\vec v} = \inf\{ R \in \mathbf{R} : U_{\vec v} \subset D_{L_{\vec v}}(0,R) \}, \tag{7.2} \]
where $U_{\vec v}$ is the component of the multiradial representation in $L_{\vec v}$. Assuming [Moc15a, Corollary 3.12] we have
\[ -\widehat{\deg}_F(P_q) \leq \sum_{p \in V(\mathbf{Q})} E_p(\ln R^\circ_{\vec v}). \tag{7.3} \]

The radius $R_{\vec v}$ in Lemma 6.3.1 gives an estimate on $R^\circ_{\vec v}$ in (7.3), giving
\[ -\widehat{\deg}_F(P_q) \leq \sum_{p \in V(\mathbf{Q})} E_p(\ln R_{\vec v}). \tag{7.4} \]
The rest of this subsection is devoted to estimating $\ln R_{\vec v}$ (so we will, in effect, be deriving estimates of estimates).

Remark. The computation of $R_{\vec v}$ is not optimal. It can be improved upon by readers in general or in special cases. It is unclear how far off $R^\circ_{\vec v}$ is from $R_{\vec v}$. It would be interesting to develop a table of $R^\circ_{\vec v}$ in some numerical examples (if the computations involving the division fields are not prohibitively hard).

The reader should compare what follows to [Moc15b, Proof of Theorem 1.10]. Fix $p \in V(\mathbf{Q})$. We have
\[ E_p(\ln R_{\vec v}) \leq \mathrm{I}_p + \mathrm{II}_p + \mathrm{III}_p + \mathrm{IV}_p + \mathrm{V}_p \tag{7.5} \]
where
\begin{align*}
\mathrm{I}_p &= -E_p\big(\mathrm{ord}_p(q_{v_j}^{j^2/2l})\big)\ln(p), \\
\mathrm{II}_p &= E_p\big(\|\mathrm{diff}_{\vec v}\|_1 - \|\mathrm{diff}_{\vec v}\|_\infty\big)\ln(p), \\
\mathrm{III}_p &= E_p\big(1_{\mathrm{ram}}(\vec v)\big), \\
\mathrm{IV}_p &= E_p\big((j+1)\ln(b_p)\,1_{\mathrm{ram}}(\vec v)\big), \\
\mathrm{V}_p &= E_p\Big(\sum_{i=0}^{j} \ln(e(v_i/p))\Big).
\end{align*}
In the above formulas for $\mathrm{III}_p$ and $\mathrm{IV}_p$ the function $1_{\mathrm{ram}}(\vec v)$ is the function which is 0 if $\vec v$ is unramified and 1 if $\vec v$ is ramified. We will denote the sums over $p$ of $\mathrm{I}_p$, $\mathrm{II}_p$, $\mathrm{III}_p$, $\mathrm{IV}_p$, and $\mathrm{V}_p$ by I, II, III, IV, and V respectively.
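The iterated expectation $E_p$ averages uniformly over $1 \leq j \leq (l-1)/2$, so bounding terms such as these repeatedly comes down to the uniform average of a simple function of $j$. The following Python sketch checks, in exact rational arithmetic, the three closed forms that appear in this paper: the averages of $j$, $j+1$ and $j^2$. The function name `average_over_j` is ours and purely illustrative, and we assume $l$ is an odd integer with $l \geq 3$.

```python
from fractions import Fraction


def average_over_j(l, f):
    """Uniform average of f(j) over j = 1, ..., (l - 1) / 2, as an exact fraction.

    Illustrative helper (not from the paper); l is assumed odd and >= 3.
    """
    if l < 3 or l % 2 == 0:
        raise ValueError("l is assumed to be an odd integer >= 3")
    n = (l - 1) // 2
    return sum(Fraction(f(j)) for j in range(1, n + 1)) / n


for l in (5, 7, 11, 13, 101):
    # average of j: the factor (l + 1) / 4 in the different term
    assert average_over_j(l, lambda j: j) == Fraction(l + 1, 4)
    # average of j + 1: the factor (l + 5) / 4 in the log-shell and ramification terms
    assert average_over_j(l, lambda j: j + 1) == Fraction(l + 5, 4)
    # average of j^2: the factor l (l + 1) / 12 relating theta and q pilot degrees
    assert average_over_j(l, lambda j: j * j) == Fraction(l * (l + 1), 12)
```

This is only a sanity check of the arithmetic over $j$; it says nothing about the inner expectations over places.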
Remark . At this stage we can already see Mochizuki’s inequality as stated in [Fes15, § − d deg( P q ) ≤ log ν L (hull( U )) and log ν L (hull( U )) ≤ a ( l ) − b ( l ) d deg( P q ) to get ( b ( l ) − d deg( P q ) ≤ a ( l ) which gives d deg( P q ) ≤ a ( l ) b ( l ) − . [SS17, Claim 5] follows this style. Further approximate computations can be found at [Hos17,slide 17] (adapted in [SS17, § Computation of I p . We have E (ord p ( q j v j ) : ~v ∈ V ( F ) j +1 ) = E (ord p ( q j v ) : v ∈ V ( F ))= X v | p [ F ,v : Q p ][ F : Q p ] ord p ( q j v )= 1[ F : Q ] X v ∈ V ( F ) p e ( v/p ) ord p ( q j v ) f ( v/p ) ln( p )= d deg( X v ∈ V ( F ) p bad ord v ( q j v )[ v ]) . HenceI p = X p E p (cid:18) ord p ( q j v j ) (cid:19) = X p E ( d deg( X v ∈ V ( F ) p bad ord v ( q j v )[ v ]) = d deg lgp ,F ( P Θ ) . (7.6) We say a tuple ( v , . . . , v j ) is ramified if there exists some i with 0 ≤ i ≤ j such that e ( v i /p ) >
1. If atuple is not ramified it is called unramified . .7. Computation of II p . In what follows we make use of the average different order of V over p is defined to be the quantitydiff p := log p ( E ( p diff( v/p ) )) . (7.7)We will prove II p ≤ ( l + 1)4 diff p ln( p ) . (7.8)Note that if we define the average different for V by Diff( V / Q ) = Q p p diff p we getII ≤ ln Diff( V / Q ) . (7.9)Before establishing (7.8) it is convenient to make the following Lemma. Lemma 7.7.1.
For $\vec v \in V(F_0)^n_p$ let $\mathrm{diff}_{\vec v} = \mathrm{diff}_{(v_1, \dots, v_n)} = (\mathrm{diff}(v_1/p), \dots, \mathrm{diff}(v_n/p))$. For $\vec v \in V(F_0)^n_p$ the following inequalities hold.
(1) $\|\mathrm{diff}_{\vec v}\|_1 - \|\mathrm{diff}_{\vec v}\|_\infty \leq \frac{n-1}{n}\|\mathrm{diff}_{\vec v}\|_1$.
(2) $E(\|\mathrm{diff}_{(v_1, \dots, v_n)}\|_1) \leq n\,\mathrm{diff}_p$.
The subscripts $1$ and $\infty$ denote the usual $l^1$ and $l^\infty$ norms for vectors in $\mathbf{R}^n$.

Proof. (1) The proof is a fortiori. For positive real numbers $a_1, \dots, a_n$ we have
\[ n\Big(\sum_{i=1}^n a_i - \max_{1 \leq i \leq n} a_i\Big) = n\sum_{i=1}^n a_i - n\max_{1 \leq i \leq n} a_i \leq n\sum_{i=1}^n a_i - \sum_{i=1}^n a_i = (n-1)\sum_{i=1}^n a_i. \]
This proves $\|\vec a\|_1 - \|\vec a\|_\infty \leq \frac{n-1}{n}\|\vec a\|_1$, if we let $\vec a = (a_1, \dots, a_n)$.
(2) We apply Jensen's inequality to turn an expectation of a sum $E(\mathrm{diff}(v_1/p) + \cdots + \mathrm{diff}(v_n/p))$ into (the log of) an expectation of a product $E(p^{\mathrm{diff}(v_1/p)} \cdots p^{\mathrm{diff}(v_n/p)})$. Now that this is a product of independent random variables the expectation factors, namely
\[ E(p^{\mathrm{diff}(v_1/p)} \cdots p^{\mathrm{diff}(v_n/p)}) = E(p^{\mathrm{diff}(v/p)})^n. \]
This shows
\[ E(\|\mathrm{diff}_{(v_1, \dots, v_n)}\|_1) \leq n\log_p E(p^{\mathrm{diff}(v/p)}), \]
which is our desired result. □

We now prove our desired formulas:
\begin{align*}
E(\|\mathrm{diff}_{(v_0, \dots, v_j)}\|_1 - \|\mathrm{diff}_{(v_0, \dots, v_j)}\|_\infty)
&\leq \frac{j}{j+1}E(\|\mathrm{diff}_{(v_0, \dots, v_j)}\|_1) \\
&\leq \frac{j}{j+1}\big((j+1)\mathrm{diff}_p\big) = j\,\mathrm{diff}_p.
\end{align*}
The first line follows from Lemma 7.7.1(1) and the second line follows from Lemma 7.7.1(2) (which is an application of Jensen's inequality together with the way expectations of products of random variables behave). It remains to compute the expectation of these over $j \in \{1, \dots, (l-1)/2\}$. We have
\[ E_p(\|\mathrm{diff}_{(v_0, \dots, v_j)}\|_1 - \|\mathrm{diff}_{(v_0, \dots, v_j)}\|_\infty) \leq E(j\,\mathrm{diff}_p) = \frac{2}{l-1}\sum_{j=1}^{(l-1)/2} j\,\mathrm{diff}_p = \frac{l+1}{4}\mathrm{diff}_p, \]
which gives our result.

7.8. Computation of $\mathrm{III}_p$. In what follows we will make use of the probability of a place $v \in \underline{V}_p$ to be unramified. In formulas this probability is defined by
\[ P_{\mathrm{unr},p} = 1 - E(1_{\mathrm{ram}}(v) : v \in V(F_0)_p). \tag{7.10} \]
Also recall that a tuple $\vec v = (v_0, \dots, v_j) \in \underline{V}^{j+1}_p$ is unramified if and only if each $v_i$ is unramified for $0 \leq i \leq j$; this means that
\[ E(1_{\mathrm{ram}}(v_0, \dots, v_j)) = 1 - P^{j+1}_{\mathrm{unr},p}. \]
This then gives
\[ \mathrm{III}_p = E_p(1_{\mathrm{ram}}(v_0, \dots, v_j)) = \frac{2}{l-1}\sum_{j=1}^{(l-1)/2}\big(1 - P^{j+1}_{\mathrm{unr},p}\big) = 1 - \frac{2}{l-1}\sum_{j=1}^{(l-1)/2} P^{j+1}_{\mathrm{unr},p}. \]
As the smallest of the $P^{j+1}_{\mathrm{unr},p}$ is $P^{(l+1)/2}_{\mathrm{unr},p}$ we get the following inequality:
\[ \mathrm{III}_p \leq 1 - P^{(l+1)/2}_{\mathrm{unr},p}. \tag{7.11} \]

7.9. Computation of $\mathrm{IV}_p$. We will prove
\[ \mathrm{IV}_p \leq \frac{l+5}{4}\ln(b_p)\big(1 - P^{(l+1)/2}_{\mathrm{unr},p}\big). \tag{7.12} \]
Using identical reasoning to § we have $E((j+1)\ln(b_p)1_{\mathrm{ram}}(\vec v)) = (j+1)\ln(b_p)(1 - P^{j+1}_{\mathrm{unr},p})$. This gives
\[ \mathrm{IV}_p = \ln(b_p)\frac{2}{l-1}\sum_{j=1}^{(l-1)/2}(j+1)(1 - P^{j+1}_{\mathrm{unr},p}) \leq \ln(b_p)\big(1 - P^{(l+1)/2}_{\mathrm{unr},p}\big)\Big(\frac{l+5}{4}\Big). \]

7.10. Computation of $\mathrm{V}_p$. It will be convenient to define $e_p$, the average ramification index of $\underline{V}$ over $p$. In notation it is defined by
\[ e_p := E(e(v/p) : v \in V(F_0)_p). \tag{7.13} \]
We now compute $\mathrm{V}_p$: we have
\begin{align*}
E\Big(\sum_{i=0}^{j}\ln(e(v_i/p))\Big) &= E\Big(\ln\Big(\prod_{i=0}^{j} e(v_i/p)\Big)\Big) \\
&\leq \ln\Big(E\Big(\prod_{i=0}^{j} e(v_i/p)\Big)\Big) \\
&\leq \ln\big(E(e(v/p))^{j+1}\big) = (j+1)\ln(e_p).
\end{align*}
The first to second line is an application of Jensen's inequality and the second to third line uses that, for independent random variables, the expectation of the product is the product of the expectations. We then can compute the second expectation by computing the uniform average of $j+1$ over $\{1, \dots, (l-1)/2\}$. This gives
\[ \mathrm{V}_p \leq \frac{l+5}{4}\ln(e_p). \tag{7.14} \]

7.11. Archimedean Contribution. Here we only have to deal with log-shells. Applying Claim 5.0.1 we have
\[ E_\infty = \frac{2}{l-1}\sum_{j=1}^{(l-1)/2}(j+1)\ln(\pi) = \frac{l+5}{4}\ln(\pi). \]

7.12. Probabilistic Szpiro.
We now combine the results of the previous subsections. Theverification of the following identities requires some careful bookkeeping.
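As an aid to that bookkeeping, the following Python sketch checks in exact rational arithmetic the algebraic identity behind the constant $6 + \varepsilon_l$ of this subsection, namely $\big(\tfrac{l(l+1)}{12} - 1\big) \cdot \tfrac{1}{2l} \cdot \tfrac{4}{l+5} = \tfrac{1}{6 + \varepsilon_l}$ with $\varepsilon_l = (24l + 72)/(l^2 + l - 12)$. The function names are ours and illustrative only; we assume $l \geq 5$ so that $l^2 + l - 12 = (l+4)(l-3)$ is nonzero.

```python
from fractions import Fraction


def epsilon(l):
    """epsilon_l = (24 l + 72) / (l^2 + l - 12); assumes l >= 5 (denominator vanishes at l = 3)."""
    return Fraction(24 * l + 72, l * l + l - 12)


def left_coefficient(l):
    """Coefficient of ln|Delta| / [F : Q] after dividing the inequality by (l + 5) / 4."""
    return (Fraction(l * (l + 1), 12) - 1) * Fraction(1, 2 * l) * Fraction(4, l + 5)


for l in (5, 7, 11, 13, 1009):
    assert left_coefficient(l) == 1 / (6 + epsilon(l))
    # l * epsilon_l stays bounded (it tends to 24), consistent with epsilon_l = O(1 / l)
    assert l * epsilon(l) < 60
```

The same check can be repeated for any other odd prime $l \geq 5$ of interest.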
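Theorem 7.12.1 (Probabilistic Szpiro). Assume [Moc15a, Corollary 3.12] and Claim 5.0.1. Then for any elliptic curve $E/F$ in initial theta data $(\overline{F}/F, E_F, l, \underline{M}, \underline{V}, V^{\mathrm{bad}}_{\mathrm{mod}}, \underline{\epsilon})$ built over the field of moduli we have
\[ \frac{1}{6 + \varepsilon_l}\,\frac{\ln|\Delta^{\min}_{E/F}|}{[F:\mathbf{Q}]} \leq \ln\mathrm{Diff}(\underline{V}) + \sum_p \ln(e_p) + A_{l,\underline{V}} \tag{7.15} \]
where
\[ A_{l,\underline{V}} = \ln(\pi) + \sum_p \big(1 - P^{(l+1)/2}_{\mathrm{unr},p}\big)\Big(\ln(b_p) + \frac{4}{l+5}\Big), \]
$b_p = 1/(\exp(1)\ln(p))$, and $\varepsilon_l = 24(l+3)/(l^2 + l - 12)$.

Proof. For the most part, this is just a combination of the bounds on I, II, III, IV and V given by equations (7.6), (7.8), (7.11), (7.12), and (7.14). The most interesting aspect of this computation is the appearance of the $6 + \varepsilon_l$. From the Tautological Probabilistic Inequality we get
\begin{align*}
-\widehat{\deg}_F(P_q) \leq{} & -\widehat{\deg}_{\mathrm{lgp},F}(P_\Theta) + \frac{l+1}{4}\ln\mathrm{Diff} \\
& + \sum_p \big(1 - P^{(l+1)/2}_{\mathrm{unr},p}\big)\Big(1 + \frac{l+5}{4}\ln(b_p)\Big) + \frac{l+5}{4}\sum_p \ln(e_p) + \frac{l+5}{4}\ln(\pi).
\end{align*}
Using that $\widehat{\deg}_{\mathrm{lgp},F}(P_\Theta) = \frac{(l+1)l}{12}\widehat{\deg}(P_q)$ and $\widehat{\deg}(P_q) = \ln|\Delta^{\min}_{E/F}|/(2l[F:\mathbf{Q}])$ we get
\begin{align*}
\Big(\frac{(l+1)l}{12} - 1\Big)\frac{1}{2l}\,\frac{\ln|\Delta^{\min}_{E/F}|}{[F:\mathbf{Q}]} \leq{} & \frac{l+1}{4}\ln\mathrm{Diff} + \sum_p \big(1 - P^{(l+1)/2}_{\mathrm{unr},p}\big)\Big(1 + \frac{l+5}{4}\ln(b_p)\Big) \\
& + \frac{l+5}{4}\sum_p \ln(e_p) + \frac{l+5}{4}\ln(\pi).
\end{align*}
We now divide both sides by $(l+5)/4$ and use
\[ \Big(\frac{l(l+1)}{12} - 1\Big)\Big(\frac{1}{2l}\Big)\Big(\frac{4}{l+5}\Big) = \frac{1}{6 + \varepsilon_l}, \quad \text{where } \varepsilon_l = \frac{24l + 72}{l^2 + l - 12}. \]
This proves the assertion that $\varepsilon_l = O(1/l)$ as $l \to \infty$. On the right hand side the coefficient of $\ln\mathrm{Diff}$ becomes $(l+1)/(l+5) \leq 1$. Finally, putting everything together we get
\[ \frac{1}{6 + \varepsilon_l}\,\frac{\ln|\Delta^{\min}_{E/F}|}{[F:\mathbf{Q}]} \leq \ln\mathrm{Diff} + \sum_p \ln(e_p) + A_{l,\underline{V}}. \tag{7.16} \]
Here $A_{l,\underline{V}}$ is as described in the statement of the theorem. □

7.13. Baby Szpiro.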
To demonstrate the utility of Probabilistic Szpiro we give a "Baby" Szpiro inequality.
Theorem 7.13.1 (Baby Szpiro) . Assume [Moc15a, Corollary 3.12] and Claim 5.0.1. Foran elliptic curve E over a field F sitting in initial theta data ( F /F, E F , l, M , V , V badmod , ǫ ) builtfrom the field of moduli we have
16 + ε l ln | ∆ min E/F | [ F : Q ] ≤ ln([ K : Q ] / ) ln( | Disc( K/ Q ) | / ) + ln( π ) , (7.17) here ε l = (24 l + 72) / ( l + l − . roof of Baby Szpiro. This is just a simple application of the Probabilistic Szpiro for thetadata (Theorem 7.12.1) using elementary bounds for the right hand side. We useln(Diff) ≤ ln(rad | Disc( K/ Q ) | · [ K : Q ]) , (7.18) X p ln e p ≤ ln([ K : Q ]) ω ( | Disc( K/ Q ) | ) , (7.19) X p (cid:16) − P l +14 unr ,p (cid:17) (cid:18) ln( b p ) + 4 l + 1 (cid:19) ≤ ln | Disc( K/ Q ) | , (7.20)where rad( N ) = Q p | N p and ω ( N ) = P d | n Thesetogether with the bounds ω ( N ) ≤ ln( N ) / ln ( N ) give that the right hand side of the secondprobabilistic Szpiro is less thanln( Dd ) + ln( D ) ln( d ) + ln( D )where D = | Disc( K/ Q ) | and d = [ K : Q ]. This simplifies toln( Dd ) + ln( D ) ln( d ) + ln( D ) ≤ (ln( D ) + 2)(ln( d ) + 2) . Since
D, d ≥ SL ( F l ) ≥ a ∈ Q such that ln( x a ) ≥ ln( x ) + 2. Solvingthe inequality gives a ≥ / ln( x ) + 1 and since2ln( x ) + 1 ≤ ≤ D ) + 2)(ln( d ) + 2) ≤ ln( D / ) ln( d / ) which proves the result. (cid:3) Deriving Explicit Constants For Szpiro’s Inequality From Mochizuki’sInequality
In order to get strong uniform versions of Szpiro’s inequality from Mochizuki’s inequalityone needs to do some careful ramification analysis based on the N´eron-Ogg-ShafarevichCriterion ( § | Disc( K/ Q ) | in terms of d , l, | Disc( F/ Q ) | .Here are the questions we needs to answer: What makes a place w ∈ V ( K ) p ramify? Whatis the maximum possible ramification index e ( w/p ) as we vary over w ∈ V ( K ) p ? Does thesize of p matter? We answer all of these questions in the subsequent section and apply theseresults to get our version of uniform Szpiro with exponent 24. The inequality (7.18) is an application of the bounds on the different order given in § p diff p = E ( p diff( v/p ) ) = X v | p [ F ,v : Q p ][ F : Q ] p diff( v/p ) ≤ X v | p [ F ,v : Q p ][ F : Q ] p − e ( v/p ) +ord p e ( v/p ) ≤ p · p ord p [ K : Q ] . We then have Diff(
V / Q ) ≤ Y p p · p ord p [ K : Q ] = rad( | Disc( K/ Q ) | )[ K : Q ] , which gives the result. .1. Ramification Analysis.
The following Lemma answers the question about the max-imal ramification index.
Lemma 8.1.1.
For every place $w \in V(K)$ we have $e(w/p) \leq B_{l,d}$ where $B_{l,d} = 276480\, l^4 d$.

Proof. Fix $w \in V(K)$. We consider the successive extensions
\[ K \supset F = F'(E[15]) \supset F' = F_0(E[2], \sqrt{-1}) \supset F_0 \supset \mathbf{Q}. \]
We label the various images of $w$ under the induced maps on places as follows:
\[ V(K) \to V(F) \to V(F') \to V(F_0) \to V(\mathbf{Q}), \qquad w \mapsto v \mapsto v' \mapsto v_0 \mapsto p. \]
In this notation we have
\[ e(w/p) = e(w/v)\,e(v/v')\,e(v'/v_0)\,e(v_0/p) \leq l^4 \cdot 23040 \cdot 12 \cdot [F_0 : \mathbf{Q}] = 276480\, l^4 d =: B_{l,d}. \]
We explain these inequalities: each of the extensions (other than $F_0 \supset \mathbf{Q}$) is Galois and we have $G(K/F) \subset \mathrm{GL}_2(\mathbf{F}_l)$, $G(F/F') \subset \mathrm{GL}_2(\mathbf{Z}/15)$, and $\# G(F'/F_0) \mid 12$. Knowing that $\#\mathrm{GL}_2(\mathbf{F}_q) = q(q-1)(q^2-1)$ and plugging in explicit values gives the result. □
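The numerical constant in $B_{l,d}$ can be sanity-checked directly. The sketch below assumes the reading of the proof in which the three Galois groups are bounded by $\#\mathrm{GL}_2(\mathbf{F}_l) \leq l^4$, $\#\mathrm{GL}_2(\mathbf{Z}/15) = 23040$ and $\#\mathrm{GL}_2(\mathbf{F}_2) \cdot 2 = 12$; this factorization of $276480$ is our assumption about the intended bookkeeping, and the function names are illustrative, not from the paper.

```python
from math import gcd


def order_gl2_fq(q):
    """Order of GL_2 over the finite field with q elements: q (q - 1) (q^2 - 1)."""
    return q * (q - 1) * (q * q - 1)


def brute_force_order_gl2(m):
    """Count 2x2 matrices over Z/m whose determinant is a unit mod m."""
    return sum(
        1
        for a in range(m)
        for b in range(m)
        for c in range(m)
        for d in range(m)
        if gcd((a * d - b * c) % m, m) == 1
    )


def ramification_bound(l, d):
    """B_{l,d} = 276480 * l^4 * d, the bound of Lemma 8.1.1 (d is the degree over Q)."""
    return 276480 * l**4 * d


# GL_2(Z/15) splits as GL_2(F_3) x GL_2(F_5) by the Chinese Remainder Theorem.
order_mod_15 = order_gl2_fq(3) * order_gl2_fq(5)
assert order_mod_15 == brute_force_order_gl2(15) == 23040

# E[2] contributes at most #GL_2(F_2) = 6 and sqrt(-1) at most a factor of 2.
assert order_gl2_fq(2) * 2 == 12
assert order_mod_15 * 12 == 276480

# #GL_2(F_l) is below l^4 for every prime power l.
assert all(order_gl2_fq(l) < l**4 for l in (5, 7, 11, 13))
```

For instance, `ramification_bound(5, 1)` evaluates $B_{l,d}$ for $l = 5$ over a field of degree $d = 1$.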
As a Corollary we get the following.
Lemma 8.1.2. If $p > B_{l,d}$ then $e(w/p) < p - 1$.

Note that this implies the ramification of $K/\mathbf{Q}$ is small for all but finitely many places. The upshot of most ramification being small (Lemma 8.1.2) is that it allows us to apply our "trivial bounds" on the $p$-adic logarithm (Lemma 4.2.3) at all but finitely many places. The sum over $p$ in the proof of explicit Szpiro can be broken down into three cases as shown in Figure 1.

[Figure 1: a number line in $p$ with marks at $l + 1$ and $B_{l,d} = 276480\, l^4 d$. For $p \leq l$, $e(w/p)$ is wild; for $l + 1 < p < B_{l,d}$, $e(w/p)$ is tame; for $p > B_{l,d}$, $e(w/p)$ is small.]

Figure 1. A breakdown of the ramification of a tuple $\vec v = (v_0, v_1, \dots, v_j) \in \underline{V}^{j+1}_p$.

Footnote: These come from the hypotheses of "initial theta data built from the field of moduli".

8.2. Explicit Szpiro.
In the remainder of the paper we derive the following version ofSzpiro’s inequality from (7.4).
Theorem 8.2.1.
Assume [Moc15a, Corollary 3.12] and Claim 5.0.1. If
E/F is an ellipticcurve in initial theta data ( F /F, E F , l, M , V , V badmod , ǫ ) built from the field of moduli then | ∆ min E/F | ≤ e A d l + B d ( | Cond(
E/F ) | · | Disc( F/ Q ) | ) ε l , (8.1) where A = 84372107405 , B = 316495 and ε l = (96 ( l + 3)) / ( l + l − . Let B = B l,d = 2 d ( Z / l ). In what follows let E p be the expected value ofln µ ~v ( I ~v ) + k diff ~v k − k diff ~v k ∞ + 1 ram ( ~v ) (8.2)over ~v = ( v , . . . , v j ) ∈ ` ( l − / j =1 V ( F ) j +1 p . Above, I ~v denotes the hull of the tensor productof log-shells for ~v ∈ ` ( l − / j =1 V j +1 p . We compute E p for a given by breaking p into the cases • infinite: p = ∞• large: p > B • small: p ≤ B Also, within each case we break (8.2) into three subcomputations:ln µ v ( I v ) | {z } I + k diff ~v k − k diff ~v k ∞ | {z } II + 1 ram ( ~v ) | {z } III . We then put these estimates together to get our results.8.3.
Computation at Infinite Places.
Over the infinite places we have E ∞ ≤ l + 54 ln( π ) . (8.3) Proof.
At the infinite prime II ∞ = III ∞ = 0. The number ln( π )( l + 5) / j + 1) ln( π ) over j which comes from Lemma 5.0.1. (cid:3) Computation at Large Places.
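Over the large places we have
\[
\sum_{p > B,\ p \ne \infty} E_p \le \frac{l+5}{2} \sum_{p > B,\ p \mid |\mathrm{Disc}(K/\mathbb{Q})|} \ln(p), \tag{8.4}
\]
which we can further estimate using
\[
\sum_{p > B,\ p \mid |\mathrm{Disc}(K/\mathbb{Q})|} \ln(p) \le 2 \left( \frac{\ln |\mathrm{Disc}(F/\mathbb{Q})|}{[F:\mathbb{Q}]} + \frac{\ln |\mathrm{Cond}(E/F)|}{[F:\mathbb{Q}]} \right).
\]
We give a proof of these two claims.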
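The coefficients $(l+5)/4$ and $(l+1)/4$ that recur in these estimates are averages over the tuple length. The following is a minimal worked check, assuming (as the phrase “average over $j$” in the infinite-place computation suggests) that $j$ is weighted uniformly on $\{1, \ldots, (l-1)/2\}$; the uniform weighting is an assumption made here for illustration.

```latex
% Uniform averages over j = 1, ..., (l-1)/2 (assumed weighting).
\[
\frac{2}{l-1} \sum_{j=1}^{(l-1)/2} j
  = \frac{1}{2}\left(\frac{l-1}{2} + 1\right)
  = \frac{l+1}{4},
\qquad
\frac{2}{l-1} \sum_{j=1}^{(l-1)/2} (j+1)
  = \frac{l+1}{4} + 1
  = \frac{l+5}{4}.
\]
% Example: l = 5 gives j in {1, 2}, so the two averages are 3/2 and 5/2.
```

Under this reading, a quantity bounded by $(j+1)\ln(p)$ for each $j$ has expectation at most $\frac{l+5}{4}\ln(p)$, and a quantity that scales with $j$ picks up the factor $\frac{l+1}{4}$.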
Proof.
By the results of §8.1, for every $p > B_{l,d}$ and every place $v \in V_p$ we have $e(v/p) < p - 1$. This leads to improvements in both the log-shell bounds $I_p$ and the different bounds $II_p$.

From the estimates on log-shells we know that for $\vec{v} \in V_p^{j+1}$,
\[
I_{\vec{v}} \subset D_{\vec{v}}\left(0;\ p^{\,j+1-\sum_{i=0}^{j} 1/e(v_i/p)}\right) \subset D_{\vec{v}}(0;\ p^{j+1}).
\]
This implies
\[
\ln \mu_{\vec{v}}(I_{\vec{v}}) \le (j+1)\ln(p)
\]
and hence
\[
E_p(\ln \mu_{\vec{v}}(I_{\vec{v}})) \le \frac{l+5}{4} \ln(p)\, 1_{\mathrm{ram}}(p).
\]

For the different term $II_p$ we have
\[
E_p(\|\mathrm{diff}_{\vec{v}}\|_1 - \|\mathrm{diff}_{\vec{v}}\|_\infty) \le \frac{l+1}{4}\, \mathrm{diff}_p \ln(p)
\]
where $\mathrm{diff}_p = \log_p(E(p^{\mathrm{diff}(v/p)} : v \in V_p))$. Using tameness, we have that $\mathrm{diff}(v/p) = 1 - 1/e(v/p) \le 1$, so $E(p^{\mathrm{diff}(v/p)}) \le E(p) = p$. This gives
\[
\mathrm{diff}_p \le 1_{\mathrm{ram}}(p) =
\begin{cases}
0, & \forall v \mid p,\ e(v/p) = 1, \\
1, & \exists v \mid p,\ e(v/p) > 1.
\end{cases}
\]
Hence
\[
E_p(\|\mathrm{diff}_{\vec{v}}\|_1 - \|\mathrm{diff}_{\vec{v}}\|_\infty) \le \frac{l+1}{4}\, 1_{\mathrm{ram}}(p) \ln(p).
\]

Finally we estimate the third term $III_p$:
\[
E_p(1_{\mathrm{ram}}(\vec{v})) \le \left(1 - P_{\mathrm{unr},p}^{(l+1)/4}\right) \le 1_{\mathrm{ram}}(p).
\]

Putting the estimates for $I_p$, $II_p$, and $III_p$ together in the case that $p > B$ we get
\[
E_p \le \frac{l+5}{4}\ln(p)\, 1_{\mathrm{ram}}(p) + \frac{l+1}{4}\ln(p)\, 1_{\mathrm{ram}}(p) + 1_{\mathrm{ram}}(p)
\le \left(\frac{l+5}{4} + \frac{l+5}{4}\right) \ln(p)\, 1_{\mathrm{ram}}(p)
= \frac{l+5}{2} \ln(p)\, 1_{\mathrm{ram}}(p).
\]
To finish, we apply Lemma 8.4.1, stated just after this proof. □

Lemma 8.4.1.
\[
\sum_{p \mid |\mathrm{Disc}(K/\mathbb{Q})|,\ p > B} \ln(p) \le 2 \left( \frac{\ln |\mathrm{Disc}(F/\mathbb{Q})|}{[F:\mathbb{Q}]} + \frac{\ln |\mathrm{Cond}(E/F)|}{[F:\mathbb{Q}]} \right) \tag{8.5}
\]

(Footnote: We have decided to label this statement because it is a critical juncture where discriminants for $K$ meet conductors using Néron-Ogg-Shafarevich. This seems to be the critical step in relating the two.)

Proof. The hard part of this formula is not getting too greedy, it seems. For $p > B$ we know that
\[
p \mid |\mathrm{Disc}(K/\mathbb{Q})| \iff p \mid |\mathrm{Disc}(F/\mathbb{Q})| \ \text{or}\ p \mid |\mathrm{Cond}(E/F)|.
\]
It is enough to show for each prime $p$ with $p > B$ and $p \mid |\mathrm{Disc}(F/\mathbb{Q})|$ we have
\[
\ln(p) \le 2 \left( \frac{-\ln |\mathrm{Disc}(F/\mathbb{Q})|_p - \ln |\mathrm{Cond}(E/F)|_p}{[F:\mathbb{Q}]} \right).
\]
Note that we are using $p$-adic absolute values to take the $p$-parts of these integers. We observe that
\[
-\ln |\mathrm{Disc}(F/\mathbb{Q})|_p = \sum_{w \in V(F)_p} f(w/p)\, d(w/p) \ln(p),
\qquad
-\ln |\mathrm{Cond}(E/F)|_p = \sum_{w \in V(F)_p} f(w/p)\, c_E(w) \ln(p),
\]
where we have used
\[
\mathrm{Cond}(E/F) = \prod_w P_w^{c_E(w)}, \qquad \mathrm{Disc}(F/\mathbb{Q}) = \prod_w P_w^{d(w/p_w)}.
\]
Recall that $d(w/p_w) = e(w/p_w) - 1$ when $p_w > B$, since the ramification is then tame. Hence, it is enough to show that for each $p \mid |\mathrm{Disc}(K/\mathbb{Q})|$,
\[
\frac{2 \sum_{w \mid p} \left( f(w/p)(e(w/p) - 1) + f(w/p)\, c_E(w) \right)}{[F:\mathbb{Q}]} \ge 1. \tag{8.6}
\]
Using that $p > B$ and $2(e(w/p) - 1) \ge e(w/p)$ together with the fact that $\sum_{w \in V(F)_p} f(w/p)\, e(w/p) = [F:\mathbb{Q}]$ we get
\[
\text{LHS of (8.6)} \ge \frac{[F:\mathbb{Q}] + 2\sum_{w \in V(F)_p} f(w/p)\, c_E(w)}{[F:\mathbb{Q}]}
= 1 + \frac{2 \sum_{w \in V(F)_p} f(w/p)\, c_E(w)}{[F:\mathbb{Q}]}.
\]
This proves the result. We note that it is strictly greater than one since the initial theta data hypothesis says that there is a non-empty set of primes in $V^{\mathrm{bad}}_{\mathrm{mod}}$ of bad reduction. □

8.5. Computation at Small Places.
Over the small places we have
\[
\sum_{p \le B} E_p \le (l+3) \ln(B)\, \pi(B) \asymp l^5 d. \tag{8.7}
\]

Proof. In the situation where $p \le B_{l,d}$ we have worse bounds for $E_p$. We will not care so much about these bounds as they turn into the constant which appears in Szpiro’s inequality.

(Footnote: For $f(x)$ and $g(x)$ positive functions of a single real variable we write $f(x) \asymp g(x)$ as $x \to \infty$ if and only if $f(x) = O(g(x))$ and $g(x) = O(f(x))$ as $x \to \infty$.)

(Footnote: On some level, of course, we do care because we would like better constants. This is secondary to achieving some form of Szpiro, though.)

In the first term $I_p$ we use
\[
I_{\vec{v}} \subset D_{\vec{v}}\left(0;\ p^{j+1} \prod_{i=0}^{j} \frac{e(v_i/p)}{\ln(p) \exp(1)}\right) \subset D_{\vec{v}}(0;\ p^{j+1} B^{j+1}).
\]
This gives
\[
\ln \mu_{\vec{v}}(I_{\vec{v}}) \le \log_p(p^{j+1} B^{j+1}) \ln(p) = (j+1) \ln(pB),
\]
which in turn gives (for $p$ ramified)
\[
E_p(\ln \mu_{\vec{v}}(I_{\vec{v}})) \le E((j+1)\ln(pB)) = \frac{l+5}{4} \ln(pB) \le \frac{l+5}{2} \ln(B),
\]
where the last inequality used $p \le B$.

For term $II_p$ involving differents, we have
\[
E_p(\|\mathrm{diff}_{\vec{v}}\|_1 - \|\mathrm{diff}_{\vec{v}}\|_\infty) = \frac{l+1}{4}\, \mathrm{diff}_p \ln(p).
\]
Since $\mathrm{diff}(v/p) \le 1 - 1/e(v/p) + \mathrm{ord}_p(e(v/p))$ we get $\mathrm{diff}(v/p) \le 1 + \mathrm{ord}_p [K:\mathbb{Q}]$, which proves
\[
\mathrm{diff}_p \le \log_p\left(E\left(p^{1 + \mathrm{ord}_p [K:\mathbb{Q}]}\right)\right) = \log_p\left(p^{1 + \mathrm{ord}_p [K:\mathbb{Q}]}\right) = 1 + \mathrm{ord}_p([K:\mathbb{Q}]).
\]
Hence we have
\[
E_p(\|\mathrm{diff}_{\vec{v}}\|_1 - \|\mathrm{diff}_{\vec{v}}\|_\infty) \le \frac{l+1}{4} \left(1 + \mathrm{ord}_p [K:\mathbb{Q}]\right) \ln(p) \le \frac{l+1}{2} \ln(B).
\]

Finally in term $III_p$ we have
\[
E_p(1_{\mathrm{ram}}(\vec{v})) \le \left(1 - P_{\mathrm{unr},p}^{(l+1)/4}\right) \le 1_{\mathrm{ram}}(p).
\]

Putting the estimates for $I_p$, $II_p$, and $III_p$ together we get
\[
\sum_{p \le B} E_p \le \sum_{p \le B} \left[ \frac{l+5}{2} \ln(B) + \frac{l+1}{2} \ln(B) + 1 \right]
\le \left((l+3)\ln(B) + 1\right) \pi(B)
\le (l+3) \ln(B)\, \pi(B).
\]
This gives our main result. The asymptotic is then derived by using bounds on the prime counting function $\pi(x)$. One such bound is Dusart’s bound [Dus18], which states that for $x > 1$,
\[
\pi(x) \le \frac{x}{\ln(x)} \left(1 + \frac{1.2762}{\ln(x)}\right). \tag{8.8}
\]
This then shows, using $B = 276480\, l^4 d$, that
\[
(l+3) \ln(B)\, \pi(B) \le (l+3)\, B \left(1 + \frac{1.2762}{\ln(B)}\right) \asymp l^5 d
\]
as $l \to \infty$. □

Remark. Using a slightly better form of Dusart’s bound gives a $1/\ln(B)$ correction term.

8.6. Proof of Explicit Szpiro.
We work from
\[
\widehat{\deg}_{\mathrm{lgp}}(P_\Theta) - \widehat{\deg}(P_q) \le \sum_p E_p. \tag{8.9}
\]
The left hand side of (8.9) becomes
\[
\left(\frac{l(l+1)}{12} - 1\right) \left(\frac{1}{2l}\right) \frac{\ln |\Delta^{\min}_{E/F}|}{[F:\mathbb{Q}]},
\]
and the right hand side of (8.9) becomes
\[
\sum_p E_p = \sum_{p \le B} E_p + \sum_{p > B} E_p + E_\infty
\le (l+3) \ln(B)\, \pi(B) + \frac{l+5}{2} \cdot 2 \left( \frac{\ln |\mathrm{Disc}(F/\mathbb{Q})|}{[F:\mathbb{Q}]} + \frac{\ln |\mathrm{Cond}(E/F)|}{[F:\mathbb{Q}]} \right) + \left(\frac{l+5}{4}\right) \ln(\pi).
\]
We now divide both sides by $(l+5)$. The coefficient of the left hand side becomes
\[
\left(\frac{l(l+1)}{12} - 1\right) \left(\frac{1}{2l}\right) \left(\frac{1}{l+5}\right) = \frac{l^2 + l - 12}{24\, l (l+5)} =: \frac{1}{24 + \varepsilon_l},
\]
where solving for $\varepsilon_l$ gives
\[
\varepsilon_l = \frac{96(l+3)}{l^2 + l - 12}.
\]
We now have
\[
\frac{1}{24 + \varepsilon_l} \ln |\Delta^{\min}_{E/F}| \le \left[\ln(B)\, \pi(B) + \ln(\pi)\right] [F:\mathbb{Q}] + \ln |\mathrm{Disc}(F/\mathbb{Q})| + \ln |\mathrm{Cond}(E/F)|. \tag{8.10}
\]
Finally, using $d_0 = 276480$ (the upper bound on $[F : F_0]$) so that $B = l^4 d\, d_0$, we get
\[
\left[\ln(B)\, \pi(B) + \ln(\pi)\right] [F:\mathbb{Q}] \le \left( l^4 d\, d_0 \left(1 + \frac{1.2762}{\ln(d_0)}\right) + \ln(\pi) \right) d\, d_0 \le A_0 d^2 l^4 + B_0 d,
\]
where $A_0 = 84372107405$ and $B_0 = 316495$. This gives our result after rewriting (8.10) multiplicatively with new bounds.

References

[AM69] Michael Atiyah and Ian Macdonald, Introduction to commutative algebra, Addison Wesley, 1969.
[Aut19] Stacks Project Authors, Stacks project, 2019.
[Con] Keith Conrad, Differents, Notes of course, available on-line.
[DH20a] Taylor Dupuy and Anton Hilado, Log-Kummer Correspondences and Mochizuki’s Third Indeterminacy, pre-print (2020).
[DH20b] Taylor Dupuy and Anton Hilado, The Statement of Mochizuki’s Corollary 3.12, Initial Theta Data, and the First Two Indeterminacies.
[Dus18] Pierre Dusart, Explicit estimates of some functions over primes, Ramanujan J. (2018), no. 1, 227–251. MR 3745073.
[Fes15] Ivan Fesenko, Arithmetic deformation theory via arithmetic fundamental groups and nonarchimedean theta-functions, notes on the work of Shinichi Mochizuki, Eur. J. Math. (2015), no. 3, 405–440. MR 3401899.
[Hos15] Yuichiro Hoshi, IUT Hodge-Arakelov-theoretic evaluation, 2015.
[Hos17] Yuichiro Hoshi, [IUTchIII-IV] from the point of view of mono-anabelian transport, 2017.
[Hos18] Yuichiro Hoshi, Introduction to mono-anabelian geometry, 2018.
[Ked15] Kiran Kedlaya, Etale theta function, 2015.
[Moc15a] Shinichi Mochizuki, Inter-universal Teichmüller theory III: Canonical splittings of the log-theta-lattice, RIMS preprint (2015).
[Moc15b] Shinichi Mochizuki, Inter-universal Teichmüller theory IV: log-volume computations and set-theoretic foundations, RIMS preprint (2015).
[Moc15c] Shinichi Mochizuki, Topics in absolute anabelian geometry III: global reconstruction algorithms, J. Math. Sci. Univ. Tokyo (2015), no. 4, 939–1156. MR 3445958.
[Moc17] Shinichi Mochizuki, The mathematics of mutually alien copies: From Gaussian integrals to inter-universal Teichmuller theory, 2017.
[Moc18] Shinichi Mochizuki, Comments on the manuscript (2018-08 version) by Scholze-Stix concerning inter-universal Teichmuller theory (iutch), 2018.
[Mok15] Chung Pang Mok, Notes on Hodge theaters (for the 2015 Oxford workshop), Handwritten Notes, 2015.
[Neu99] Jürgen Neukirch, Algebraic number theory, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 322, Springer-Verlag, Berlin, 1999, Translated from the 1992 German original and with a note by Norbert Schappacher, With a foreword by G. Harder. MR 1697859.
[Rob00] Alain M. Robert, A course in p-adic analysis, Graduate Texts in Mathematics, vol. 198, Springer-Verlag, New York, 2000. MR 1760253.
[Rob 3] David Roberts, A crisis of identification, Inference Review (2019 in Volume 4, Issue 3).
[Sil09] Joseph H Silverman, The arithmetic of elliptic curves, vol. 106, Springer Science & Business Media, 2009.
[Sil13] Joseph H Silverman, Advanced topics in the arithmetic of elliptic curves, vol. 151, Springer Science & Business Media, 2013.
[SS17] Peter Scholze and Jakob Stix, Why abc is still a conjecture, 2017.
[ST68] Jean-Pierre Serre and John Tate, Good reduction of abelian varieties, Ann. of Math. (2) (1968), 492–517. MR 0236190.
[Sti15] Jakob Stix, Reconstruction of fields using Belyi cuspidalization, 2015.
[Sut15] Drew Sutherland, Notes for 18.785 - number theory i, MIT course notes (2015).
[Tan18] Fucheng Tan, Note on IUT, 2018.
[Yam17] Go Yamashita, A proof of the ABC conjecture after Mochizuki, RIMS preprint (2017).