Heights on squares of modular curves

Abstract

We develop a strategy for bounding from above the height of rational points of modular curves with values in number fields, by functions which are polynomial in the curve's level. Our main technical tools come from effective Arakelov descriptions of modular curves and jacobians. We then fulfill this program in the following particular case: if $p$ is a not-too-small prime number, let $X_0(p)$ be the classical modular curve of level $p$ over $\mathbb{Q}$. Assume Brumer's conjecture on the dimension of winding quotients of $J_0(p)$. We prove that there is a function $b(p) = O(p^5 \log p)$ (depending only on $p$) such that, for any quadratic number field $K$, the $j$-height of points in $X_0(p)(K)$ which are not lifts of elements of $X_0^+(p)(\mathbb{Q})$ is less than or equal to $b(p)$.



Pierre Parent, with an appendix by Pascal Autissier

January 17, 2019

AMS 2000 Mathematics Subject Classification: 11G18 (primary), 14G40, 14G05 (secondary).

Contents

4 $j$-height and $\Theta$-height
5 Height of modular curves and the various $W_d$
6 Arithmetic Bézout theorem with cubist metric
7 Height bounds for quadratic points on $X_0(p)$
8 Appendix: An upper bound for the theta function, by P. Autissier

Let $N$ be an integer, $\Gamma_N$ a level-$N$ congruence subgroup of $\mathrm{GL}_2(\mathbb{Z})$, and $X_{\Gamma_N}$ the associated modular curve over some subfield of $\mathbb{Q}(\mu_N)$ which, to simplify the discussion, we assume from now on to be $\mathbb{Q}$. The genus $g_N$ of $X_{\Gamma_N}$ grows roughly as a polynomial function of $N$. So if $N$ is not too small, $X_{\Gamma_N}$ has only a finite number of rational points with values in any given number field, by Mordell-Faltings. If one is interested in explicitly determining the set of rational points, however, finiteness is of course not sufficient; a much more desirable control would be provided by upper bounds, for some handy height, on those points. Proving such an "effective Mordell" is known to be an extremely hard problem for arbitrary algebraic curves over number fields.

In the case of modular curves, however, the situation is much better. Indeed, whereas the jacobian of a random algebraic curve should be a somewhat equally random simple abelian variety, it is well-known that the jacobian $J_{\Gamma_N}$ of $X_{\Gamma_N}$ decomposes up to isogeny into a product of quotient abelian varieties defined by Galois orbits of newforms for $\Gamma_N$. Moreover, in many cases, a nontrivial part of those factors happens to have rank zero over $\mathbb{Q}$. Our rustic starting observation is therefore the following: if $J_{\Gamma_N,e}$ is the "winding quotient" of $J_{\Gamma_N}$, that is, the largest quotient of $J_{\Gamma_N}$ with trivial $\mathbb{Q}$-rank, and
\[ X_{\Gamma_N} \overset{\iota}{\hookrightarrow} J_{\Gamma_N} \overset{\pi_e}{\twoheadrightarrow} J_{\Gamma_N,e} \]
is some Albanese map from the curve to its jacobian followed by the projection to $J_{\Gamma_N,e}$, then any rational point on $X_{\Gamma_N}$ has an image which is a torsion point (because rational) on $J_{\Gamma_N,e}$, hence has $0$ normalized height. The pull-back of some invertible sheaf defining the (say) theta height on $J_{\Gamma_N,e}$ therefore defines a height on $X_{\Gamma_N}$ which is trivial on rational points. That height in turn necessarily compares to any other natural one, for instance the modular $j$-height. Therefore the $j$-height of any rational point on $X_{\Gamma_N}$ is also zero "up to error terms".
Making those error terms explicit would give us the desired upper bound for the height of rational points on $X_{\Gamma_N}$.

That approach can in principle be generalized to degree-$d$ number fields, by considering rational points on symmetric powers $X_{\Gamma_N}^{(d)}$ of $X_{\Gamma_N}$ (at least if $\dim J_{\Gamma_N,e} \geq d$). To be a little bit more precise in the present case of symmetric squares, let us associate to a quadratic point $P$ in $X_0(p)$ the $\mathbb{Q}$-point $Q := (P, {}^{\sigma}P)$ of $X_0(p)^{(2)}$. Its image $\iota(Q)$ via some appropriate Albanese embedding in $J_0(p)$ lies above a torsion point $a$ in $J_e$: assume for simplicity $a = 0$. We therefore know $\iota(Q)$ belongs to the intersection of $\iota(X_0(p)^{(2)})$ with the kernel $\tilde{J}_e^{\perp}$ of the projection $\pi_e : J_0(p) \twoheadrightarrow J_e$. To improve the situation we can further remark that $\iota(Q)$ actually lies at the intersection of $\iota(X_0(p)^{(2)})$ with the "projection", in some appropriate sense, of the latter surface on $\tilde{J}_e^{\perp}$. Then one can show that this intersection is $0$-dimensional (but here we need to assume Brumer's conjecture, see below) so that its theta height is controlled, via some arithmetic Bézout theorem, in terms of the degree and height of the two surfaces we intersect. Using an appropriate version of Mumford's repulsion principle one derives a bound for the height of $\iota(P)$ too (and not only for its sum $\iota(Q)$ with its Galois conjugate). Then one makes the translation again from theta height to $j$-height on $X_0(p)$.

Nontrivial technical work is of course necessary to give sense to the straightforward strategy sketched above. The aim of this article is thus to show the possibility of that approach, by making it work in what we feel to be the simplest non-trivial case: that of quadratic points of the classical modular curve $X_0(p)$ as above (or $X_0(p^2)$, for technical reasons), for $p$ a prime number.
In the course of the proof we are led to assume the already mentioned conjecture of Brumer, which asserts that the winding quotient of $J_0(p) := J_{\Gamma_0(p)}$ has dimension roughly half that of $J_0(p)$. That hypothesis is actually used in only one technical but crucial place, where we prove that a morphism between two curves is a generic isomorphism (see the last point of Lemma 7.2). Note that a lower bound of $1/4$ (against the conjectured $1/2$) for the asymptotic ratio $\dim J_e / \dim J_0(p)$ has been proven by Iwaniec-Sarnak and Kowalski-Michel-VanderKam. (Actually, $(1/3 + \varepsilon)$ would be sufficient for us, see Lemma 7.2 and the proof of Theorem 7.5 below.) In any case we cannot at the moment get rid of this assumption; note that it can in principle be numerically checked in all specific cases. In this setting, our main result is the following (see Theorem 7.5).

Theorem 1.1

For $w_p$ the Fricke involution, set $X_0^+(p) = X_0(p)/w_p$. Assume Brumer's conjecture (see Section 2, (21); the weak version of that conjecture we actually need is stated in (22)). Then the quadratic points of $X_0(p)$ which are not lifts of elements of $X_0^+(p)(\mathbb{Q})$ have $j$-height bounded from above by $O(p^5 \log p)$. The same holds true for quadratic points of $X_0(p^2)$, without the restriction about $X_0^+(p)$.

Larson and Vaintrob have proven, under the Generalized Riemann Hypothesis, the asymptotic triviality of rational points on $X_0(p)$ with values in any given number field which does not contain the Hilbert class field of some quadratic imaginary field (see [35], Corollary 6.5). Independently of any conjecture, Momose had already proven the same result in the case where $K$ is a given quadratic number field ([45]). Our method however provides bounds which do not depend on the field, and should generalize to some other congruence subgroups.

Needless to say, this result cries for both sharpening and generalization. Yet it should be possible to immediately use avatars of Theorem 1.1 to prove that rational points are only cusps and CM points, for some specific modular curves of arithmetic interest. If combined with lower bounds for heights furnished by isogeny theorems as in [5], the above theorem already has consequences on rational points (see Corollary 7.6).

Regarding past works about rational points on modular curves, one can notice that most of them use, at least in parts, some variants of Mazur's method, which can very roughly be divided into two steps: first, map modular curves to winding quotients as described above; then prove some quite delicate properties about completions of that map to $J_e$ (formal immersion criteria). The second step is probably the most difficult to carry over to great generality. The method we propose here therefore allows one to use only the first and crucial fact: the mere existence of nontrivial winding quotients.
In many cases, the existence of such quotients is known to be a deep result of Kolyvagin-Logachev-Kato, à la Birch-Swinnerton-Dyer Conjecture which, again, seems to reflect, from the arithmetic point of view, the quite special properties of the image locus (in the moduli space of principally polarized abelian varieties) of modular curves, among all algebraic curves, under Torelli's map.

The methods used in this paper are mainly explicit Arakelov techniques for modular curves and abelian varieties. Such techniques and results have been pioneered, as far as we know, by Abbes, Michel and Ullmo at the end of the 1990s (see in particular [2], [43] and [62], whose results we here eagerly use). They have subsequently been revisited and extended in the work developed by Edixhoven and his school, as mainly (but not exhaustively) presented in the orange book [13]. That work was motivated by algorithmic Galois-representations issues, but its tools are well suited to our rational points questions, as we wish to show here. We similarly hope that the effective Arakelov results about modular curves and jacobians we work out in the present article shall prove useful in other contexts.

The layout of this article is as follows. In Section 2 we start gathering classical instrumental facts on quotients of modular jacobians and regular models of $X_0(p)$ over rings of algebraic integers. In Section 3 we make a precise description of the arithmetic Chow group of $X_0(p)$. Section 4 provides an explicit comparison theorem between $j$-heights and pull-back of normalized theta height on the jacobian. Section 5 computes the degree and Faltings height of the image of symmetric products within modular jacobians. In Section 6 we prove our arithmetic Bézout theorem (in the sense of [8]) for cycles in $J_0(p)$, relative to cubist metrics (instead of the more usual Fubini-Study metrics).
This seems more natural, and has the advantage of being quantitatively much more efficient; that constitutes the technical heart of the present paper. Then we apply that arithmetic Bézout theorem to our modular jacobian after technical computations on metric comparisons. Section 7 concludes the computations of the height bounds for quadratic rational points on $X_0(p)$ by making various intersections, projections and manipulations, for which we refer to loc. cit.

Convention.

In order to avoid numerical troubles, we safely assume in all that follows that primes are by definition strictly larger than 17.

(For recent investigations related to more general questions of effective bounds of algebraic points on curves, one can check [11].)

Let $K$ be a field, $J$ an abelian variety of dimension $g$ over $K$ and $\mathcal{L}$ an ample invertible sheaf defining a polarization of $J$. Assume $J$ is $K$-isogenous to a product of two (nonzero) subvarieties,
\[ \iota_A : A \hookrightarrow J, \qquad \iota_B : B \hookrightarrow J \tag{1} \]
endowed with the polarizations $\iota_A^*(\mathcal{L})$ and $\iota_B^*(\mathcal{L})$ respectively, such that $\iota_A + \iota_B : A \times B \to J$ is an isogeny. (Recall that by convention here, all abelian (sub)varieties are assumed to be connected.) Then $\pi_A : J \to A' := J \bmod B$, and similarly $\pi_B : J \to B'$, are called optimal quotients of $J$. To simplify things we also assume from now on that $\mathrm{End}_K(A, B) = \{0\}$. The product isogeny $\pi := \pi_A \times \pi_B : J \to A' \times B'$ induces isogenies $A \to A'$ and $B \to B'$. We write
\[ \Phi : A \times B \to J \to A' \times B' \]
for the obvious composition. Taking for instance dual isogenies of $A \to A'$ and $B \to B'$, we also define an endomorphism
\[ \Psi : J \to A' \times B' \to A \times B \to J. \tag{2} \]

When $K = \mathbb{C}$, the above constructions are transparent. There is a $\mathbb{Z}$-lattice $\Lambda$ in $\mathbb{C}^g$, endowed with a symplectic pairing, such that $J(\mathbb{C}) \simeq \mathbb{C}^g/\Lambda$, and one can find a direct sum decomposition $\mathbb{C}^g = \mathbb{C}^{g_A} \oplus \mathbb{C}^{g_B}$ such that if $\Lambda_A = \Lambda \cap \mathbb{C}^{g_A}$ and $\Lambda_B = \Lambda \cap \mathbb{C}^{g_B}$, then $A(\mathbb{C}) \simeq \mathbb{C}^{g_A}/\Lambda_A$ and $B(\mathbb{C}) \simeq \mathbb{C}^{g_B}/\Lambda_B$. If $p_A : \mathbb{C}^g \to \mathbb{C}^{g_A}$ and $p_B : \mathbb{C}^g \to \mathbb{C}^{g_B}$ are the $\mathbb{C}$-linear projections relative to that decomposition, the analytic description of $\pi_{A,\mathbb{C}} : J(\mathbb{C}) \to A'(\mathbb{C})$ is then
\[ z \bmod \Lambda \mapsto z \bmod (\Lambda + \Lambda_B \otimes \mathbb{R}) = p_A(z) \bmod p_A(\Lambda). \]
Summing up, we have lattice inclusions $\Lambda_A \subseteq p_A(\Lambda)$, $\Lambda_B \subseteq p_B(\Lambda)$, with finite indices, in $\mathbb{C}^g$ such that our isogenies are induced by
\[ \Lambda_A \oplus \Lambda_B \subseteq \Lambda \subseteq p_A(\Lambda) \oplus p_B(\Lambda). \]
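As a minimal illustration of these lattice inclusions, the following Python sketch works in a real two-dimensional toy model. This is an assumption made purely for illustration: it ignores the complex structure and the symplectic pairing, takes a hypothetical rank-2 lattice $\Lambda \subset \mathbb{Q}^2$ spanned by two sample vectors, and lets the two coordinate axes play the roles of $\mathbb{C}^{g_A}$ and $\mathbb{C}^{g_B}$. The helper names `rational_gcd` and `axis_data` are introduced here and do not come from the text.

```python
from fractions import Fraction
from math import gcd


def rational_gcd(x: Fraction, y: Fraction) -> Fraction:
    """Non-negative generator of the subgroup Z*x + Z*y of Q."""
    common = x.denominator * y.denominator
    a = x.numerator * y.denominator
    b = y.numerator * x.denominator
    return Fraction(gcd(a, b), common)


def axis_data(v1, v2, axis):
    """For the rank-2 lattice Z*v1 + Z*v2 in Q^2, return positive generators
    of (Lambda intersected with the axis, projection of Lambda to the axis).
    Assumes the lattice really has rank 2."""
    other = 1 - axis
    projection = rational_gcd(v1[axis], v2[axis])
    g = rational_gcd(v1[other], v2[other])
    # Integer pairs (a, b) with a*v1 + b*v2 on the axis are exactly the
    # multiples of (v2[other]/g, -v1[other]/g).
    a, b = v2[other] / g, -v1[other] / g
    intersection = abs(a * v1[axis] + b * v2[axis])
    return intersection, projection


# Hypothetical sample lattice (toy data, not from the text).
v1 = (Fraction(1), Fraction(0))
v2 = (Fraction(1, 2), Fraction(1, 2))

lam_A, proj_A = axis_data(v1, v2, axis=0)
lam_B, proj_B = axis_data(v1, v2, axis=1)

covolume = abs(v1[0] * v2[1] - v1[1] * v2[0])
index_sub = lam_A * lam_B / covolume      # [Lambda : Lambda_A + Lambda_B]
index_sup = covolume / (proj_A * proj_B)  # [p_A(Lambda) + p_B(Lambda) : Lambda]
degree_IA = lam_A / proj_A                # card(p_A(Lambda) / Lambda_A)
```

In this toy case the three indices coincide, in line with the chain of inclusions displayed above: the glue group $\Lambda/(\Lambda_A \oplus \Lambda_B)$ projects isomorphically onto $p_A(\Lambda)/\Lambda_A$.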
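The isogeny $I'_A : A \to A'$ deduced from the inclusion $\Lambda_A \subseteq p_A(\Lambda)$ has degree $\mathrm{card}(p_A(\Lambda)/\Lambda_A)$. If $N_A$ is a multiple of the exponent of the quotient $p_A(\Lambda)/\Lambda_A$, there is an isogeny $I_{A,N_A} : A' \to A$ such that $I_{A,N_A} \circ I'_A$ and $I'_A \circ I_{A,N_A}$ both are multiplication by $N_A$. The analytic descriptions of the above clearly are:
\[ \begin{cases} A(\mathbb{C}) \simeq \mathbb{C}^{g_A}/\Lambda_A \xrightarrow{\ I'_A\ } A'(\mathbb{C}) \simeq \mathbb{C}^{g_A}/p_A(\Lambda) \\ z \mapsto z \end{cases} \quad\text{and}\quad \begin{cases} \mathbb{C}^{g_A}/p_A(\Lambda) \xrightarrow{\ I_{A,N_A}\ } \mathbb{C}^{g_A}/\Lambda_A \\ z \mapsto N_A z. \end{cases} \tag{3} \]

Remark 2.1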

Instead of considering two immersions as in (1), suppose only $A \hookrightarrow J$ is given, and $K$ is a number field. One might apply [21], Théorème 1.3, to deduce the existence of an abelian variety $B$ over $K$ such that, with our previous notations, the degree of $A \times B \xrightarrow{+} J$, namely $|A \cap B| = |\Lambda/(\Lambda_A \oplus \Lambda_B)|$, is bounded from above by an explicit function $\kappa(J)$ of the stable Faltings height $h_F(J)$:
\[ \kappa(J) = \Big( (14g)^{64g^2}\, [K:\mathbb{Q}]\, \max\big(h_F(J), \log[K:\mathbb{Q}], 1\big)^2 \Big)^{2^{10} g^3} \]
and this does not depend on the choice of the embedding $K \hookrightarrow \mathbb{C}$. Note that when $A$ and $(J \bmod A)$ are not isogenous (which will be the case for us), then there is actually no choice for that $B \hookrightarrow J$: it has to be the Poincaré complement to $A$. The isogeny $J \to A' \times B'$ given by the two projections has degree $|(p_A(\Lambda) \oplus p_B(\Lambda))/\Lambda|$, which also is $|A \cap B| := N$. One can therefore take the $N_A$ appearing in (3) as equal to $N$, and $N \leq \kappa(J)$. Doing the same for $B' \to B$, the above morphism $\Psi$ (see (2)) is then simply the multiplication $J \xrightarrow{[N\cdot]} J$ by the integer $N$. Although we will not need numerical estimates for those quantities in what follows, it is straightforward, using [62], to make them explicit in our setting of modular curves and jacobians.

2.1.2 Polarizations and heights

Keeping the above notations and hypotheses, consider in addition now an ample sheaf $\Theta$ on $J$, and let $I_A := I_{A,N} : A' \to A$ (respectively, $I_{B,N}$) be as in (3). We pull back $\Theta$ along the composed morphism
\[ \varphi_A : J \xrightarrow{\ \pi_A\ } A' \xrightarrow{\ I_A\ } A \xrightarrow{\ \iota_A\ } J \tag{4} \]
so that the immersion $\iota_A : A \hookrightarrow J$ defines a polarization $\Theta_A := \iota_A^*(\Theta)$ on $A$, whence a polarization $\Theta_{A'} := I_A^*(\Theta_A)$ on $A'$, and finally an invertible sheaf $\Theta_{J,A} := \pi_A^*(\Theta_{A'})$ on $J$. Composing the morphisms
\[ J \xrightarrow{\ \pi_A \times \pi_B\ } A' \times B' \xrightarrow{\ I_A \times I_B\ } A \times B \xrightarrow{\ \iota_A + \iota_B\ } J \tag{5} \]
gives the multiplication-by-$N$ map $J \xrightarrow{[\cdot N]} J$. Assuming for simplicity that $\Theta$ is symmetric, one therefore has
\[ [\cdot N]^* \Theta \simeq \Theta^{\otimes N^2} \simeq \Theta_{J,A} \otimes_{\mathcal{O}_J} \Theta_{J,B}. \tag{6} \]

If $K$ is a number field, the Néron-Tate normalization process associates with $\Theta$ a system of compatible Euclidean norms $h_\Theta = \|\cdot\|_\Theta^2$ on the finite-dimensional $\mathbb{Q}$-vector spaces $J(F) \otimes_{\mathbb{Z}} \mathbb{Q}$, for $F/K$ running through the number field extensions of $K$, and similarly Euclidean norms $h_{\Theta_A}$ on $A(F) \otimes_{\mathbb{Z}} \mathbb{Q}$ and $h_{\Theta_B}$ on $B(F) \otimes_{\mathbb{Z}} \mathbb{Q}$, normalized (with the appropriate power of $N$ coming from (6)) so that, under the isomorphisms $J(F) \otimes_{\mathbb{Z}} \mathbb{Q} \simeq (A(F) \otimes_{\mathbb{Z}} \mathbb{Q}) \oplus (B(F) \otimes_{\mathbb{Z}} \mathbb{Q})$, one has
\[ h_\Theta = h_{\Theta_A} + h_{\Theta_B}. \tag{7} \]

Recall from (3) the definition of $N_A$, and that of the maps $A' \xrightarrow{I_{A,N_A}} A$ and $A \overset{\iota_A}{\hookrightarrow} J$. Denote by $[N_A]_A$ the multiplication by $N_A$ restricted to $A$. If $V$ is a closed algebraic subvariety of $J$, define
\[ P_A(V) := \big( \iota_A\, [N_A]_A^{-1}\, I_{A,N_A}\, \pi_A \big)(V) \tag{8} \]
as the reduced closed subscheme with relevant support. That map $P_A$ would simply be the projection of $V$ on $A$ if $J$ were isomorphic to the product $A \times B$ of subvarieties, and is the best approximation to that projection in our case, when $J$ is only isogenous to $A \times B$.

Note that $P_A(V)$ is a priori highly non-connected. All its irreducible geometric components are however obtained from each other by translation by an $N_A$-torsion point of $A(\overline{\mathbb{Q}})$. For our later purposes (see the proof of Theorem 7.5), we will have the possibility to replace $P_A(V)$ by one of its components containing a specific point, say $P_0$: we shall denote that component by $P_A(V)_{P_0}$, and refer to it as the "pseudo-projection" of $V$ on $A$ containing $P_0$.

Suppose now $J \sim A \times B$ as above is the jacobian of an algebraic curve $X$ over $K$ with positive genus $g$. For $P_0$ a point of $X(K)$ (or more generally a $K$-divisor of degree $1$ on $X$) let
\[ \iota_{P_0} : \begin{cases} X \hookrightarrow J \\ P \mapsto (P) - (P_0) \end{cases} \tag{9} \]
be the Albanese embedding associated with $P_0$.
We define the classical Theta divisor $\theta$ on $J$, which is the image of $\iota_{P_0}^{g-1} : X^{g-1} \to J$, and its symmetric version
\[ \Theta := \big( \theta \otimes_{\mathcal{O}_J} [-1]^* \theta \big)^{\otimes 1/2} \tag{10} \]
(which is a translate of $\theta$ obtained as $\iota_\kappa^{g-1}(X^{g-1})$, where $\iota_\kappa = t_\kappa^* \iota_{P_0}$ for $t_\kappa$ the translation by some $\kappa$ with $(2g-2)\kappa = \kappa_0$, the canonical divisor on $X$. Of course $\Theta$ does not need to be defined over $K$). Our first aim will be to compare the height functions induced on $X(F)$ by $h_{\Theta_A}$ via $\iota_{P_0}$, when $X$ is a modular curve, with another natural height given by the modular $j$-function.

We will discuss in Section 3 an Arakelov description of the Néron-Tate height. We conclude this paragraph by a few remarks as a preparation. Let $\mathcal{B} := \{\omega_1, \dots, \omega_g\}$ be a basis of $H^0(X(\mathbb{C}), \Omega^1_{X/\mathbb{C}}) \simeq H^0(J(\mathbb{C}), \Omega^1_{J/\mathbb{C}})$ which is orthogonal with respect to the norm
\[ \|\omega\|^2 = \frac{i}{2} \int_{X(\mathbb{C})} \omega \wedge \overline{\omega}. \]
The transcendent writing-up of the Abel-Jacobi map $\iota_{P_0} : P \mapsto \big(\int_{P_0}^{P} \omega_i\big)_{1 \leq i \leq g}$ shows that the pull-back to $X(\mathbb{C})$ of the translation-invariant measure on $J(\mathbb{C})$, normalized to have total mass $1$, is
\[ \mu_0 = \frac{i}{2g} \sum_{\omega \in \mathcal{B}} \frac{\omega \wedge \overline{\omega}}{\|\omega\|^2}. \tag{11} \]
More generally, $\pi_A \circ \iota_{P_0}$ is, over $\mathbb{C}$, the map $P \mapsto \big(\int_{P_0}^{P} \omega\big)_{\omega \in \mathcal{B}_A}$, where $\mathcal{B}_A$ is some orthogonal basis of $H^0(A'(\mathbb{C}), \Omega^1_{A'/\mathbb{C}}) \simeq H^0(J(\mathbb{C}), \pi_A^*(\Omega^1_{A'/\mathbb{C}})) \subseteq H^0(J(\mathbb{C}), \Omega^1_{J/\mathbb{C}})$. Therefore, writing $g_A := \dim(A') = \dim(A)$ (we assume $A \neq 0$), the pull-back to $X(\mathbb{C})$ of the translation-invariant measure on $A'(\mathbb{C})$ (normalized so as to have total mass $1$ on the curve again) is
\[ \mu_A = \frac{i}{2g_A} \sum_{\omega \in \mathcal{B}_A} \frac{\omega \wedge \overline{\omega}}{\|\omega\|^2}. \tag{12} \]

Here we recall a few classical facts on the minimal regular model of the modular curve $X_0(p)$, for $p$ a prime number, over a ring of algebraic integers. The first general reference on this topic is [14]; see also [13] or [40], [41].

$j$-height

The quotient of the completed Poincaré upper half-plane $\mathcal{H} \cup \mathbb{P}^1(\mathbb{Q})$ by the classical congruence subgroup $\Gamma_0(p)$ defines a Riemann surface $X_0(p)(\mathbb{C})$ which is known to have a geometrically connected smooth and proper model over $\mathbb{Q}$. All through this paper, we denote its genus by $g$.

The first technical theme of this article is the explicit comparison of various heights on $X_0(p)(\overline{\mathbb{Q}})$. When $V$ is an algebraic variety over a number field $K$, any finite $K$-map $\varphi : V \to \mathbb{P}^N_K$ to some projective space defines a naive Weil height on $V(\overline{K})$. This applies in particular when $V$ is a curve and $\varphi$ is the finite morphism defined by an element of the function field of $V$, and in the case of a modular curve $X_\Gamma$ associated with some congruence subgroup $\Gamma$, say, a natural height to choose on $X_\Gamma(\overline{\mathbb{Q}})$ is precisely Weil's height $\mathrm{h}(P) = \mathrm{h}(j(P))$ relative to the classical $j$-function. The degree of the associated map $X_\Gamma \to X(1) \simeq \mathbb{P}^1$ is $[\mathrm{PSL}_2(\mathbb{Z}) : \Gamma]$, so that number is the class of our Weil height in the Néron-Severi group $\mathrm{NS}(X_\Gamma)$ identified with $\mathbb{Z}$. More explicitly, if $X = X_\Gamma$ is defined over the number field $K$, say, the $j$-morphism is
\[ \begin{cases} X \xrightarrow{\ \jmath\ } \mathbb{P}^1_K = \mathrm{Proj}(K[X_0, X_1]) \hookleftarrow \mathbb{A}^1_K = \mathrm{Spec}(K[X_1/X_0]) \\ P \mapsto (1, j(P)) = (1/j(P), 1) \leftarrow j(P) = \frac{X_1}{X_0}(P), \end{cases} \]
and the Weil height of a point $P \in X(K)$ is therefore the naive height of its $j$-invariant as an algebraic number:
\[ \mathrm{h}(P) = \mathrm{h}(j(P)) = \frac{1}{[K:\mathbb{Q}]} \sum_{v \in M_K} [K_v : \mathbb{Q}_v] \log\big(\max(1, |j(P)|_v)\big) \]
which is also Weil's projective height $\mathrm{h}(\jmath(P))$ with respect to the above basis $(X_0, X_1 = X_0 j)$ of global sections of $\mathcal{O}_{\mathbb{P}^1_K}(1)$. Our Weil height on $X$ is associated with the linear equivalence class of divisors $D$ corresponding to $\jmath^*(\mathcal{O}_{\mathbb{P}^1_K}(1))$, so that
\[ D \sim (\text{poles of } j \text{ on } X)\ \big({\sim}\ (\text{zeroes of } j)\big) \sim \sum_{c \in \{\text{cusps of } X\}} e_c \cdot c \]
where each $e_c$ is the ramification index of $c$ via $\jmath$.

Those considerations lead to explicit comparisons with other heights. Indeed, a more intrinsic way to define heights on algebraic varieties is provided by Arakelov theory.
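For $K = \mathbb{Q}$ the displayed sum over places collapses to $\log\max(|a|, |b|)$ when $j(P) = a/b$ is written in lowest terms. The following Python sketch checks this on a few sample rational values; it is only an illustration under that assumption ($K = \mathbb{Q}$), the sample values are arbitrary, and the function names are hypothetical.

```python
from fractions import Fraction
from math import isclose, log


def prime_factors(n: int) -> dict:
    """Trial-division factorization of a positive integer."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def weil_height_by_places(x: Fraction) -> float:
    """Sum over all places v of Q of log max(1, |x|_v)."""
    total = log(max(1.0, float(abs(x))))         # archimedean place
    for p, k in prime_factors(x.denominator).items():
        total += k * log(p)                      # |x|_p = p**k > 1 here
    return total


def weil_height_closed_form(x: Fraction) -> float:
    """log max(|a|, |b|) for x = a/b in lowest terms."""
    return log(max(abs(x.numerator), x.denominator))


# Arbitrary sample values standing in for j-invariants.
samples = [Fraction(1728), Fraction(-3375, 8), Fraction(1, 2), Fraction(0)]
heights_agree = all(
    isclose(weil_height_by_places(x), weil_height_closed_form(x), abs_tol=1e-12)
    for x in samples
)
```

Only primes dividing the denominator contribute at the finite places, which is why integral $j$-invariants have height $\log\max(1, |j|)$.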
Defining this properly in the case of our modular curves demands a precise description of regular models for them, which we now recall.

The normalization of the $j$-map $X_0(p) \to X(1)_{/\mathbb{Z}} \simeq \mathbb{P}^1_{/\mathbb{Z}}$ over $\mathbb{Z}$ defines a model for $X_0(p)$ that we call the modular model; it is smooth over $\mathbb{Z}[1/p]$. We fix a number field $K$, write $\mathcal{O}_K$ for its ring of integers, and deduce by base change a model for $X_0(p)$ over $\mathcal{O}_K$. We know its only singularities are normal crossings, so after a few blow-ups if necessary we obtain a regular model of $X_0(p)$ over $\mathcal{O}_K$: see Theorem 1.1.d) of the Appendix of [39]. We denote it from now on by $X_0(p)_{/\mathcal{O}_K}$, or simply $X_0(p)$ if the context prevents confusion. We stress here that for $F/K$ a field extension, $X_0(p)_{/\mathcal{O}_F}$ is not the base change to $\mathcal{O}_F$ of $X_0(p)_{/\mathcal{O}_K}$ if $F/K$ ramifies above $p$.

Let $v$ be a place of $\mathcal{O}_K$ above $p$, with residue field $k(v)$. The dual graph of $X_0(p)$ at $v$ is made of two extremal vertices, which we label $C_0$ and $C_\infty$, containing the cusps $0$ and $\infty$ respectively (see Figure 1). Those two vertices, which correspond to irreducible components of genus $0$, are linked by
\[ s := g + 1 \]
branches. Each branch corresponds to a singular point $S$ in $X_0(p)(\overline{\mathbb{F}}_p)$, which in turn parameterizes an isomorphism class of supersingular elliptic curve $E_S$ in characteristic $p$. The Fricke involution $w_p$ acts on the dual graph as the continuous isomorphism which exchanges $C_0$ and $C_\infty$ and acts on the branches as a generator of $\mathrm{Gal}(\mathbb{F}_{p^2}/\mathbb{F}_p)$.

We list the supersingular points as $S(1), \dots, S(s)$ and for each one define
\[ w_n := \#\mathrm{Aut}(S(n))/\langle \pm 1 \rangle := \#\mathrm{Aut}_{\overline{\mathbb{F}}_p}(E_{S(n)})/\langle \pm 1 \rangle \tag{13} \]
which is equal to $1$ except in the (at most two) cases when the underlying supersingular elliptic curve has $j$-invariant $1728$ or $0$, where it is equal to $2$ or $3$ respectively. Now each path, or branch, on our dual graph at $v$ passes through $(w_n e - 1)$ vertices (for $e$ the ramification index of $K$ at $v$), that is, again, equal to $e - 1$, except for a path of length $2e - 1$ (for $j = 1728 \bmod v$, if it exists), and a path of length $3e - 1$ (for $j = 0 \bmod v$). We enumerate the vertices $\{C_{n,m}\}_{1 \leq m \leq w_n e - 1}$ in the $n$-th path. We also denote by $w(\mathrm{Eis})$ the familiar quantity $\sum 1/w_n$, the sum being taken over the set of all supersingular points of $X_0(p)_{/\mathcal{O}_{K,v}}$. The well-known Eichler mass formula says that
\[ w(\mathrm{Eis}) = \sum_{1 \leq n \leq s} \frac{1}{w_n} = \frac{p-1}{12} \tag{14} \]
(see for instance [24], p. 117). Recall this implies that the genus $g$ of $X_0(p)$ is asymptotically equivalent to $p/12$ (the exact formula depending on the residue class of $p$ mod $12$) and in any case:
\[ \frac{p - 13}{12} \leq g \leq \frac{p + 1}{12} \tag{15} \]
(see for instance p. 117 of [24] again).

Abusing notations a bit, $C_\infty$ will sometimes also be denoted as $C_{n,0}$, and similarly $C_0$ might be written as $C_{n, w_n e}$. We choose as a basis for $\oplus_C\, \mathbb{Z} \cdot C$ the ordered set
\[ \mathcal{B} = \big( C_\infty, (C_{1,1}, C_{1,2}, \cdots, C_{1,e-1}), (C_{2,1}, \cdots, C_{2,e-1}), \dots, (C_{s,1}, \cdots, C_{s, w_s e - 1}), C_0 \big) \tag{16} \]
(that is, we enumerate the vertices by running through each branch successively, and put the possible branches of length twice or thrice the generic length at the end).

Figure 1: Dual graph of $X_0(p)_{/\mathcal{O}_K}$ at $v$.

At bad places $v$ the intersection matrix restricted to each submodule $\oplus_{m=1}^{w_n e - 1} \mathbb{Z} \cdot C_{n,m}$ (for some fixed branch of index $n$) is then $\log(\#k(v)) \cdot M_n$, where
\[ M_n = \begin{pmatrix} -2 & 1 & 0 & \cdots & 0 \\ 1 & -2 & 1 & \cdots & 0 \\ 0 & 1 & -2 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & -2 \end{pmatrix}, \tag{17} \]
whose only dependence on $n$ is that its type is $(w_n e - 1) \times (w_n e - 1)$, and its determinant is $(-1)^{w_n e - 1}\, w_n e$. Define the row vectors $L := (1\ 0\ 0\ \cdots\ 0)$, $L' := (0\ 0\ 0\ \cdots\ 1)$ and $V := L^t$, $V' := L'^t$. The intersection matrix on the whole space $\mathbb{Z}^{\mathcal{B}}$ is finally $\log(\#k(v)) \cdot M$ for
\[ M = \begin{pmatrix} -s & L & L & \cdots & L & 0 \\ V & M_1 & 0 & \cdots & 0 & V' \\ V & 0 & M_2 & \cdots & 0 & V' \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ V & 0 & 0 & \cdots & M_s & V' \\ 0 & L' & L' & \cdots & L' & -s \end{pmatrix}. \tag{18} \]
(This has to be modified in the obvious way when $e_v = 1$.)

We denote as usual the jacobian of $X_0(p)_{\mathbb{Q}}$ by $J_0(p)$. As follows from section 2.2.2, $X_0(p)$ is semistable over $\mathbb{Z}$, and the neutral component of the Néron model $\mathcal{J}_0(p)$ of $J_0(p)$ is a semi-abelian scheme over $\mathbb{Z}$ (and an abelian scheme over $\mathbb{Z}[1/p]$). Its neutral component represents the neutral component $\mathrm{Pic}^0_{\mathbb{Z}}(X_0(p))$ of the relative Picard functor of $X_0(p)$ over $\mathbb{Z}$.

We know from Shimura's theory that the natural decomposition of cotangent spaces into Hecke eigenspaces induces a corresponding decomposition over $\mathbb{Q}$ of abelian varieties up to isogenies:
\[ J_0(p) \sim \prod_{f \in \mathcal{B}/\mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})} J_f \tag{19} \]
indexed by Galois orbits in some set $\mathcal{B}$ of newforms. A first useful sorting of this decomposition comes from the sign of the functional equations for the $L$-functions of eigenforms $f$, that is, whether $w_p(f)$ equals $f$ or $-f$. One accordingly writes $J_0(p)^-$ for the optimal quotient abelian variety associated with $\prod_{f,\, w_p(f) = -f} J_f$ in (19), and similarly $J_0(p)^+$, so that $J_0(p)^- = J_0(p)/(1 + w_p)J_0(p)$ and $J_0(p)^+ = J_0(p)/(1 - w_p)J_0(p)$. One knows that
\[ \dim J_0(p)^- = \Big(\frac{1}{2} + o(1)\Big) \dim J_0(p) \]
(see e.g. [59], Lemme 3.2).

A more subtle object is the winding quotient $J_e$, defined as the optimal quotient of $J_0(p)$ corresponding to $\prod_{f,\, L(f,1) \neq 0} J_f$ in decomposition (19). One can write
\[ J_e = J_0(p)/I_e J_0(p) \tag{20} \]
for some ideal $I_e$ of the Hecke algebra $\mathbb{T}_{\Gamma_0(p)}$. Similarly, $J_e^\perp = J_0(p)/I_e^\perp J_0(p)$ will denote the optimal quotient corresponding to $\prod_{f,\, L(f,1) = 0} J_f$.
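The mass formula (14), the relation $s = g + 1$, the genus bounds (15) and the shape of the intersection matrix (18) lend themselves to a quick numerical sanity check. The Python sketch below is a minimal illustration under stated assumptions: it uses the standard facts that, for $p > 3$, $j = 1728$ (resp. $j = 0$) is supersingular exactly when $p \equiv 3 \bmod 4$ (resp. $p \equiv 2 \bmod 3$), it uses the classical genus formula for $X_0(p)$, and it takes $e \geq 2$ so that every branch has at least one interior vertex. All function names are hypothetical.

```python
from fractions import Fraction


def genus_X0(p: int) -> int:
    """Classical genus formula for X_0(p), p > 3 prime."""
    nu2 = 2 if p % 4 == 1 else 0    # elliptic points of order 2
    nu3 = 2 if p % 3 == 1 else 0    # elliptic points of order 3
    g = Fraction(p + 1, 12) - Fraction(nu2, 4) - Fraction(nu3, 3)
    assert g.denominator == 1
    return int(g)


def branch_weights(p: int) -> list:
    """Weights w_n of (13): weight 2 for j = 1728, weight 3 for j = 0 (when
    supersingular), weight 1 otherwise. The number of weight-1 points is
    read off from the mass formula (14); special branches come last."""
    special = ([2] if p % 4 == 3 else []) + ([3] if p % 3 == 2 else [])
    mass_left = Fraction(p - 1, 12) - sum(Fraction(1, w) for w in special)
    assert mass_left.denominator == 1
    return [1] * int(mass_left) + special


def intersection_matrix(weights, e):
    """Matrix M of (18) in the ordered basis (16), assuming e >= 2."""
    sizes = [w * e - 1 for w in weights]     # interior vertices per branch
    dim = sum(sizes) + 2
    first, last = 0, dim - 1                 # indices of C_infinity and C_0
    M = [[0] * dim for _ in range(dim)]
    M[first][first] = M[last][last] = -len(weights)
    start = 1
    for size in sizes:
        for m in range(size):                # tridiagonal block (17)
            M[start + m][start + m] = -2
            if m + 1 < size:
                M[start + m][start + m + 1] = 1
                M[start + m + 1][start + m] = 1
        M[first][start] = M[start][first] = 1                       # L and V
        M[last][start + size - 1] = M[start + size - 1][last] = 1   # L' and V'
        start += size
    return M


checked = []
for p in (19, 23, 29, 37, 101):
    g, weights = genus_X0(p), branch_weights(p)
    M = intersection_matrix(weights, e=2)
    ok = (
        len(weights) == g + 1                                      # s = g + 1
        and Fraction(p - 13, 12) <= g <= Fraction(p + 1, 12)       # bounds (15)
        and all(sum(row) == 0 for row in M)                        # full fiber
    )
    checked.append(ok)
```

The vanishing row sums express that the full vertical fiber, which is the sum of all components with multiplicity one, has zero intersection with every component.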
For obvious reasons regarding signs of functional equations, $J_e$ is contained in $J_0(p)^-$. But more is expected: in line with the principle that "the vanishing order of a (modular) $L$-function at the critical point should generically be as small as allowed by parity", Brumer ([10]) conjectured that, as $p$ tends to infinity,
\[ (?) \qquad \dim J_e = (1 - o(1)) \dim J_0(p)^-. \qquad \text{(Brumer)} \tag{21} \]
Equivalently, it is conjectured that $\dim J_e = (\frac{1}{2} + o(1)) \dim J_0(p)$, or that the dimensions of $J_e$ and $J_e^\perp$ should be, asymptotically in $p$, of equal size. Note that (21) above is also implied by the "Density Conjecture" of [28], p. 56 et seq., see also Remark F on p. 65. (Quoting Olga Balkanova (private communication), "Theorem 1.1 in [28] is proved for the test function $\phi$, whose Fourier transform is supported on the interval $[-2, 2]$. The density conjecture claims that the same results are true without restriction on Fourier transform of $\phi$, see formula 1.9 [of loc. cit.].") Actually, what we eventually need in this article (see Section 7) is a weaker form of (21), which is:
\[ (?) \qquad \dim J_e > \frac{\dim J_0(p)}{3} + \frac{2}{3} \tag{22} \]
for large enough $p$. An important theorem of Iwaniec-Sarnak and Kowalski, Michel and Vanderkam asserts something nearly as good, that is:
\[ \Big(\frac{1}{4} - o(1)\Big) \dim J_0(p) \leq \dim J_e \ \Big({\leq}\ \Big(\frac{1}{2} + o(1)\Big) \dim J_0(p)\Big) \tag{23} \]
as $p$ goes to infinity (so that $(\frac{1}{2} - o(1)) \dim J_0(p) \leq \dim J_e^\perp \leq (\frac{3}{4} + o(1)) \dim J_0(p)$, see [29], Corollary 13 and [34]). Breaking that $1/4$ is known to be closely linked to the Landau-Siegel zero problem. Assuming the Generalized Riemann Hypothesis for $L$-functions of modular forms, Iwaniec, Luo and Sarnak prove one can improve the constant $1/4$ ([28], Corollary 1.6, (1.54))... That seems to be all for the moment.

The central object of this paper will eventually be the maps
\[ X_0(p)^{(d)} \to J_0(p) \to J_e \]
from symmetric products of $X_0(p)$ (mainly the curve itself and its square) to the winding quotient.

We now give a description of the Arakelov geometry of $X_0(p)$, relying on the work of many people: that topic has been pioneered by Abbes, Ullmo and Michel ([2], [43], [62]) and notably developed by Edixhoven, Couveignes and their coauthors (see [13]). We shall also use the work of Bruin ([9]), Jorgenson-Kramer ([32]) and Menares ([40], [41]) among others. We refer to those articles and their bibliography for general facts on Arakelov theory (see [12], [16]).

Let $X$ be any regular and proper arithmetic surface over the integer ring $\mathcal{O}_K$ of a number field $K$. Fixing in general smooth hermitian metrics $\mu$ on the base changes of $X$ to $\mathbb{C}$, it follows from the basics of Arakelov theory that for any horizontal divisor $D$ on $X$ over $\mathcal{O}_K$ there are Green functions $g_{\mu,D}$ on each Archimedean completion $X(\mathbb{C})$ satisfying the differential equation
\[ \Delta g_{\mu,D} = -\delta_D + \deg(D)\mu \]
for $\Delta = \frac{1}{i\pi} \partial \overline{\partial}$ the Laplace operator and $\delta_D$ the Dirac distribution relative to $D_{\mathbb{C}}$ on $X(\mathbb{C})$. The function $g_{\mu,D}$ is integrable on the compact Riemann surface $X(\mathbb{C})$ endowed with its measure $\mu$, and uniquely determined up to an additive constant which is often fixed by imposing the normalizing condition that
\[ \int_{X(\mathbb{C})} g_{\mu,D}\, \mu = 0. \tag{24} \]
When the horizontal divisor $D$ is a section $P_0$ in $X(\mathcal{O}_K)$, one will sometimes also use the notation $g_\mu(P_0, z)$ for $g_{\mu,P_0}(z)$. The Green functions relative to fixed smooth $(1,1)$-forms $\mu$ allow one to define an Arakelov intersection product relative to the $\mu$, which will be denoted by $[\cdot, \cdot]_\mu$, or $[\cdot, \cdot]$ if there is no ambiguity about the implicit form. In particular the index will often be dropped for intersections of divisors at least one of which is vertical, where the choice of $\mu$ does not intervene.

We shall denote by $\mu_0$ the canonical Arakelov $(1,1)$-form on $X(\mathbb{C})$ (assumed to have positive genus), inducing the "flat metric". It corresponds to the pullback, by any Albanese morphism $X(\mathbb{C}) \to \mathrm{Jac}(X_K)(\mathbb{C})$, of the "cubist" metric in the sense of Moret-Bailly ([46], more about this shortly) on the jacobian $\mathrm{Jac}(X_K)$, associated with the Néron-Tate normalized height $h_\Theta$.

We now specialize to the case of $X_0(p)$ as in Section 2.2.
If $f$ is a modular form of weight $2$ for $\Gamma_0(p)$, let $\|f\|$ be its Petersson norm. Because newforms are orthogonal in prime level we have, as in (11):
\[ \mu_0 := \frac{i}{2 \dim(J_0(p))} \sum_{f \in \mathcal{B}} \frac{f \frac{dq}{q} \wedge \overline{f \frac{dq}{q}}}{\|f\|^2}. \tag{25} \]
We shall also need to consider Néron-Tate heights $h_A$ for subabelian varieties $A \hookrightarrow J_0(p)$ as in section 2.1.2 (recall $A \neq 0$). The associated $(1,1)$-form $\mu_A$ is given by (12). More specifically, we focus on $h_{\Theta_e}$ on $J_e$ (as in (7) and around, for $A' = J_e$), which induces a height $h_{\Theta_e} \circ \iota_{e,P_0}$ on $X_0(p)$ via the map $\iota_{e,P_0} : X_0(p) \hookrightarrow J_0(p) \twoheadrightarrow J_e$. The curvature form of the hermitian sheaf on $X_0(p)$ defining the Arakelov height associated with $h_{\Theta_e} \circ \iota_{e,P_0}$ is
\[ \mu_e := \frac{i}{2 \dim(J_e)} \sum_{f \in \mathcal{B}[I_e]} \frac{f \frac{dq}{q} \wedge \overline{f \frac{dq}{q}}}{\|f\|^2} \tag{26} \]
where $\mathcal{B}[I_e]$ stands for the set of newforms killed by the ideal $I_e$ defining $J_e$ as in (20).

Remark 3.1

Notice that both $\mu_0$ and $\mu_e$, or any $\mu_A$ above, are invariant by pull-back $w_p^*$ by the Fricke involution. In particular the Arakelov intersection products $[\cdot, \cdot]_{\mu_0}$ and $[\cdot, \cdot]_{\mu_e}$, relative to $\mu_0$ and $\mu_e$ respectively, are $w_p$-invariant. The latter was clear already from the fact that, more generally, $w_p$ is an orthogonal symmetry on $J_0(p)$ endowed with its quadratic form $h_\Theta$, which respects the orthogonal decomposition $\prod_f J_f$ of (19).

One can now specialize the Hodge index theorem to our modular setting (see [41], Theorem 4.16, [40], Theorem 3.26, or more generally [46], p. 85 et seq.):

Theorem 3.2 Let $K$ be a number field, $\mu$ be a smooth non-zero $(1,1)$-form on $X_0(p)(\mathbb{C})$ as given in (12), and $\widehat{\mathrm{CH}}(p)^{\mathrm{num}}_{\mathbb{R},\mu}$ be the arithmetic Chow group with real coefficients up to numerical equivalence of $X_0(p)$ over $\mathcal{O}_K$, relative to $\mu$. Denote by $\infty$ the horizontal divisor defined by the $\infty$-cusp on $X_0(p)$ over $\mathbb{Z}$ (which is the Zariski closure of the $\mathbb{Q}$-point $\infty$ in $X_0(p)(\mathbb{Q})$), compactified with the normalizing condition (24). Write $\mathbb{R} \cdot X_\infty$ for the line of divisors with real coefficients supported on some fixed full vertical fiber $X_\infty$. Define, for all $v \in \mathrm{Spec}(\mathcal{O}_K)$ above $p$, the $\mathbb{R}$-vector space
\[ G_v := \bigoplus_{C \neq C_\infty} \mathbb{R} \cdot C \]
where the sum runs through all the irreducible components of $X_0(p) \times_{\mathcal{O}_K} k(v)$ except $C_\infty$ (the one containing $\infty(k(v))$). Identify finally $J_0(p)(K)/\mathrm{torsion}$ with the subgroup of divisor classes $D$ which are compactified under the normalizing condition $g_D(\infty) = 0$ (which is therefore different from (24)). One has a decomposition:
\[ \widehat{\mathrm{CH}}(p)^{\mathrm{num}}_{\mathbb{R},\mu} = (\mathbb{R} \cdot \infty \oplus \mathbb{R} \cdot X_\infty) \oplus^{\perp}_{v | p} G_v \oplus^{\perp} (J_0(p)(K) \otimes \mathbb{R}) \tag{27} \]
where the "$\oplus^{\perp}$" mean that the direct factors are mutually orthogonal with respect to the Arakelov intersection product. Moreover, the restriction of the self-intersection product to $J_0(p)(K) \otimes \mathbb{R}$ coincides with twice the opposite of the Néron-Tate pairing.

Proof

The proof can be immediately adapted from that of [41], Theorem 4.16, for $L$-admissible measures (a setting allowing to define convenient actions of the Hecke algebra on the Chow group). For further computational use we recall how one decomposes divisors in practice. Take $D$ in $\widehat{\mathrm{CH}}(p)^{\mathrm{num}}_{\mathbb{R},\mu}$, with degree $d$ on the generic fiber. There is a vertical divisor $\Phi_D$, with support in fibres above places of bad reduction (that is, of characteristic $p$), such that $(D - d\infty - \Phi_D)$ has a real multiple which belongs to the neutral component $\mathrm{Pic}^0(J_0(p))_{/\mathcal{O}_K}$. That $\Phi_D$ is well-defined up to multiples of full vertical fibres, so we can assume $\Phi_D$ belongs to $\oplus^{\perp}_{v\mid p} G_v$ (and is then unambiguously defined). One associates to $(D - d\infty - \Phi_D) \in \mathbb{R}\cdot J_0(p)(\mathcal{O}_K)$ an element $\delta$ in $\widehat{\mathrm{CH}}(p)^{\mathrm{num}}_{\mathbb{R},\mu}$ by imposing a compactification such that $[\infty,\delta]_\mu = 0$. The general Hodge index theorem (see for instance [46]) then finally asserts that $(D - d\infty - \Phi_D - \delta)$ can be written as an element in $\mathbb{R}\cdot X_\infty$. □

In order to later on interpret the Néron-Tate height (associated with some given (symmetric) invertible sheaf) as an Arakelov height in a suitable sense (see [1] paragraph 3, or [47]), we will need to compute explicitly, given $P \in X_0(p)(K)$, the vertical divisor $\Phi_P = \oplus_{v\mid p}\Phi_{P,v}$ such that

$$[C, P - \infty - \Phi_P] = 0 \tag{28}$$

for any irreducible component $C$ of any fiber of $\mathcal{X}_0(p) \to \mathrm{Spec}(\mathcal{O}_K)$, as in the proof of Theorem 3.2.

Lemma 3.3

Consider a bad fiber $\mathcal{X}_0(p)_{k(v)}$, with $e_v$ the absolute ramification index of $v$, and write $\#k(v) = p^{f_v}$. Let $P \in X_0(p)(K)$ and let $C_{P,v}$ be the irreducible component of $\mathcal{X}_0(p)_{k(v)}$ which contains $P(k(v))$. As $\mathcal{X}_0(p)$ is assumed to be regular, the section $P$ hits each fiber on its smooth locus, so that the component $P$ belongs to is unambiguously defined in each bad fiber. Write $\Phi_{P,v} = \sum_{n,m} a_{n,m}[C_{n,m}]$ with notations as in (16). Recall that, by our convention, $a_{C_\infty} = a_{*,0} = 0$.

(a) If $C_{P,v} = C_0$ then for all $n$ and $m$,
$$a_{n,m} = -\frac{12}{(p-1)\cdot w_n}\cdot m.$$
(Recall (see (13)) that $w_n := \#\mathrm{Aut}(S(n))/\langle\pm 1\rangle \in \{1,2,3\}$, with $S(n)$ the supersingular point corresponding to the branch $\{C_{n,\cdot}\}$.) For further use we henceforth write $\Phi_{C_0}$ for the above vector $\Phi_{P,v}$.

(b) If $C_{P,v} = C_{n_0,m_0} \neq C_0, C_\infty$ then
• for $n = n_0$ and $m \in \{0,\dots,m_0\}$, one has $a_{n,m} = \left(\frac{m_0}{w_{n_0}e_v}\Big(1-\frac{12}{(p-1)w_{n_0}}\Big) - 1\right)\cdot m$;
• for $n = n_0$ and $m \in \{m_0,\dots,w_{n_0}e_v\}$, one has $a_{n,m} = \left(\frac{m_0}{w_{n_0}e_v}\Big(1-\frac{12}{(p-1)w_{n_0}}\Big)\right)\cdot m - m_0$;
• for $n \neq n_0$ and all $m \in \{0,\dots,w_n e_v\}$, one has $a_{n,m} = -\frac{12\,m_0}{(p-1)w_{n_0}e_v}\cdot\frac{m}{w_n}$.

(c) (Of course if $C_{P,v} = C_\infty$ then $\Phi_{P,v} = 0$.)

Remark 3.4

We have distinguished diﬀerent cases above because the proof naturally leads todoing so, and it will be of interest below to have the simpler case ( a ) explicitly displayed. Notehowever that all outputs are actually covered by the formulae of case ( b ). Notice also that, in case( a ), all coeﬃcients of Φ P,v satisfy0 ≥ a n,m ≥ a := a C = a n,w n m = − e v / ( p − . As for case ( b ), all coeﬃcients of Φ P,v satisfy0 ≥ a n,m ≥ a n ,m = (cid:18) m w n e v (1 − p − w n ) − (cid:19) · m (remember 0 ≤ m ≤ w n e v for all m ). Computing the minimum of the above right-hand as apolynomial in m gives0 ≥ a n,m ≥ − e v w n − p − w n ) ≥ − e v w n − w n ≥ − e v (29)(recalling we always assume p ≥ Proof

Given the intersection matrix (18) and condition (28): $[C, P - \infty - \Phi_{P,v}] = 0$ for all $C$ in the fiber at $v$ gives the matrix equation:

$$\log(\#k(v))\, M\cdot\Phi_{P,v} = \log(\#k(v))\,(-1, 0, \cdots, 0, 1, 0, \cdots, 0)^t \tag{30}$$

where the coefficient $1$ (respectively, $-1$) in the right-hand column vector is at the place corresponding to $C_{P,v} = C_{n,m}$ (respectively, to $C_\infty = C_{n,0}$) in the ordering of our component basis (16). That is however more easily solved by running through the dual graph of $\mathcal{X}_0(p)_{k(v)}$ "branch by branch" as follows. Suppose first that $C_{P,v} = C_0$, and recall $a_{C_\infty} = 0$ by convention. Equation (28) translates into:
• $\big(-1 - \sum_{n=1}^{s} a_{n,1} = 0\big)$ for $C = C_\infty$;
• $\big(1 + s\,a_0 - \sum_{n=1}^{s} a_{n,w_n e_v - 1} = 0\big)$ for $C = C_0$;
• $\big(a_{n,m-1} - 2a_{n,m} + a_{n,m+1} = 0\big)$ for all others $C = C_{n,m}$.

The equations of the third line in turn define, for each branch (that is, for fixed $n$), a sequence defined by linear double induction with solution $a_{n,m} = m\cdot\alpha_n$ for some $\alpha_n$ which is easily computed to be $-\frac{1}{w(\mathrm{Eis})\cdot w_n} = -\frac{12}{(p-1)w_n}$ (see (14)). (Note this is true even for $e_v = 1$.)

For case (b), the intersection equations become:
• $\big(-1 - \sum_{n=1}^{s} a_{n,1} = 0\big)$ for $C = C_\infty$;
• $\big(s\,a_0 - \sum_{n=1}^{s} a_{n,w_n e_v - 1} = 0\big)$ for $C = C_0$;
• $\big(1 - a_{n_0,m_0-1} + 2a_{n_0,m_0} - a_{n_0,m_0+1} = 0\big)$ if $C = C_{P,v} = C_{n_0,m_0}$;
• $\big(a_{n,m-1} - 2a_{n,m} + a_{n,m+1} = 0\big)$ for all others $C = C_{n,m}$.

As above, solving these equations in all branches not containing $C_{P,v}$ gives $a_{n,m} = m\beta_n$ and the same is true in the branch containing $C_{P,v}$ for $m \in \{0,\dots,m_0\}$. We also see that $a_{n_0,m_0+1} = (m_0+1)\beta_{n_0} + 1$, and then $a_{n_0,m} = m(\beta_{n_0}+1) - m_0$ for $m \in \{m_0+1,\dots,w_{n_0}e_v\}$. We have $a_0 = w_n e_v\beta_n$ for all $n \neq n_0$, so let $\beta$ be the common value of the $\beta_n$ for $n \neq n_0$ with $w_n = 1$. (There is always such an $n$ as we assumed $p > 13$. Note also those computations still cover the case $e_v = 1$.) From $\beta = a_0/e_v$ and $a_0 = w_{n_0}e_v(\beta_{n_0}+1) - m_0$ we derive

$$\beta_{n_0} = \frac{a_0 + m_0 - w_{n_0}e_v}{w_{n_0}e_v} = \frac{\beta}{w_{n_0}} + \frac{m_0}{w_{n_0}e_v} - 1.$$

Hence, because of the first equation $\big(-1 - \sum_{n=1}^{s} a_{n,1} = 0\big)$,

$$0 = -1 - \beta_{n_0} - \sum_{1\le n\le s,\ n\neq n_0}\frac{\beta}{w_n} = -\beta\, w(\mathrm{Eis}) - \frac{m_0}{w_{n_0}e_v}$$

so that $\beta = -\frac{m_0}{w(\mathrm{Eis})\,w_{n_0}e_v} = -\frac{12\,m_0}{(p-1)\,w_{n_0}e_v}$. □

Lemma 3.5

Let $\mu$ be some $(1,1)$-form on $X_0(p)(\mathbb{C})$ as in Theorem 3.2.

(a) The class in $\widehat{\mathrm{CH}}(p)^{\mathrm{num}}_{\mathbb{R},\mu}$ of the cuspidal divisor $(0)-(\infty)$ satisfies

$$(0)-(\infty) \equiv \tilde\Phi_{C_0} := \Phi_{C_0} + \sum_{v\mid p}\frac{6\,e_v}{p-1}\sum_{C}[C] = \sum_{v\mid p}\sum_{n,m}\frac{6}{p-1}\Big(e_v - \frac{2m}{w_n}\Big)[C_{n,m}] \tag{31}$$

with notations as in Lemma 3.3 (a). This is an eigenvector of the Fricke $\mathbb{Z}$-automorphism $w_p$ with eigenvalue $-1$.

(b) One has $[\infty,\infty]_\mu = [0,0]_\mu = [0,\infty]_\mu - \frac{6\log p}{p-1}$. If $\mu$ is the Green-Arakelov measure $\mu_0$ then $0 \ge [\infty,\infty]_{\mu_0} = O(\log p/p)$ and similarly $[0,\infty]_{\mu_0} = O(\log p/p)$ with $[0,\infty]_{\mu_0}$ non-positive too, at least for large enough $p$. If $\mu = \mu_e$ (see (26)), or more generally any sub-measure of $\mu_0$, then $[0,\infty]_{\mu_e} = O(p\log p)$.

Proof

By the Manin-Drinfeld theorem, (0) − ( ∞ ) is torsion as a divisor in the generic ﬁber X ( p ) × Z Q . One therefore has (0) − ( ∞ ) ≡ Φ + cX ∞ in the decomposition (27) of d CH ( p ) num R ,µ , for Φ some vertical divisor with support in the ﬁbersabove p . This divisor is determined by the same equations (28) as Φ C in Lemma 3.3 (a) . Foreach v | p the full v -ﬁber P C [ C ] is numerically equivalent to some real multiple of the archimedeanﬁber X ∞ ; there is therefore a real number a such thatΦ C := Φ C + X v | p e v p − X C [ C ]) ≡ Φ C + aX ∞ . Now w p switches the cusps 0 and ∞ so the divisor (0) − ( ∞ ) is anti-symmetric for w p : w ∗ p ((0) − ( ∞ )) = − ((0) − ( ∞ ))13nd clearly w ∗ p (Φ C ) = − Φ C . The fact that w p preserves the archimedean ﬁber concludes theproof of (a) .To prove (b) we compute0 = [0 − ∞ − Φ C , ∞ ] µ = [0 , ∞ ] µ − [ ∞ , ∞ ] µ − p − p and 0 = [0 − ∞ − Φ C , µ = [0 , µ − [0 , ∞ ] µ + 6 p − p so that [ ∞ , ∞ ] µ = [0 , µ = [0 , ∞ ] µ − pp − . The cusps 0 and ∞ are known not to intersect on X ( p ) / Z so that [0 , ∞ ] µ = − g µ (0 , ∞ ). When µ = µ , this special value of the Arakelov-Greenfunction has been computed by Michel and Ullmo: it satisﬁes g µ (0 , ∞ ) = 12 g log p (1 + O ( log log p log p )) = O ( log pp )by [43], formula (12) on p. 650. Finally, using [9], Theorem 7.1 (c) and paragraph 8, and plugginginto Bruin’s method the estimates of [43] regarding the comparison function F ( z ) = O ((log p ) /p )between Green-Arakelov and Poincar´e measures, we obtain a bound of shape O ( p log p ) for | g µ e (0 , ∞ ) | (see also Remark 4.5). This completes the proof of (b) . (cid:3) Instrumental in the sequel will be the explicit decomposition of the relative dualizing sheaf ω in the arithmetic Chow group. Proposition 3.6

The relative dualizing sheaf ω of the minimal regular model X ( p ) → O K canbe written, in the decomposition (27) of d CH ( p ) num R ,µ relative to the canonical Green-Arakelov (1 , -form µ , as: ω = (2 g − ∞ + X v | p Φ ω,v + ω + [ K : Q ] c ω X ∞ (32) where the above components satisfy the following properties. • The number c ω is equal to (1 − g )[ K : Q ] [ ∞ , ∞ ] µ , so that ≤ c ω ≤ O (log p ) . • Set H := 12 X P ∈H ( P −

12 (0 + ∞ )) , H := 23 X p ∈H ( P −

12 (0 + ∞ )) where the sums run over the sets H and H , whose number of elements can be or , ofHeegner points of X ( p ) with j -invariant and respectively. Deﬁne H := H + [ K : Q ] c X ∞ and H := H + [ K : Q ] c X ∞ for two numbers c and c with c = O (log p ) , and the same for c . (Recall this means the H ∗ are compactiﬁed with the normalizing condition (24), whereas the H ∗ are the orthogonalprojections on ( J ( p )( K ) ⊗ R ) ⊆ d CH ( p ) num R ,µ of the H ∗ , so that [ ∞ , H ∗ ] µ = 0 , for ∗ = 3 or .) One sets ω := − H − H , which can be chosen in J ( p ) ( Q ) . • Finally, the component Φ ω,v in each G v for v | p is Φ ω,v = −

12 ( g − p − X n,m mw n C n,m (33) with notations as in (16). We therefore have Φ ω,v = ( g − C using notations of Lemma 3.3.In particular, recalling e v is the ramiﬁcation index of K/ Q at v , the coeﬃcients ω n,m of Φ ω,v in (33) satisfy ≥ ω n,m ≥ − e v . (34)14 roof Many parts of those statements are deduced from [43], Section 6, and results of Edixhovenet al. from [18]. See also [41], Section 4.4.We start by estimating c ω . By Arakelov’s adjunction formula, − [ ∞ , ∞ ] µ = [ ∞ , ω ] µ = (2 g − ∞ , ∞ ] µ + [ K : Q ] c ω because of the orthogonality of the decomposition (27). Lemma 3.5 therefore implies0 ≤ c ω = (1 − g )[ K : Q ] [ ∞ , ∞ ] µ = O (log p ) . The computations of the J ( p )-part ω := − ( H + H ) follows from the Hurwitz formula, asexplained in [43], paragraph 6, p. 670. One indeed checks that, on the generic ﬁber X ( p ) / Q = X ( p ) × Z Q , the canonical divisor is linearly equivalent to(2 g − ∞ −  X j ( P )= e iπ/ ′ ( P − ∞ ) + 23 X j ( P )= e iπ/ ′ ( P − ∞ )  where the sums P ′ are here restricted to points P at which X ( p ) → X (1) is unramiﬁed (these arethe Heegner points alluded to in our statement). It follows from the modular interpretation thatin each of those sums there are two Heegner points (if any), which are then ordinary at p (recallwe assume p > > J ( p )( K ) ⊗ Z R -part of ω is indeed − ( H + H )with H = H + [ K : Q ] c X ∞ and H = H + [ K : Q ] c X ∞ for some real numbers c and c .(Note that, as Heegner points are preserved by the Atkin-Lehner involution ([23], paragraph 5,p. 90), their specializations above p share themselves between the two components C and C ∞ of X ( p ) / F p , so that 2 H = P j ( P )= e iπ/ ′ ( P − ∞ ) and H = P j ( P )= e iπ/ ′ ( P − ∞ ) belong to theneutral component J ( p ) ( O K ).) 
The estimates on $c_4$ and $c_3$ will be justified at the end of the proof.

The bad fibers divisors $\Phi_{\omega,v} := \sum_{n,m}\omega_{n,m}[C_{n,m}]$ can be computed with the "vertical" adjunction formula ([37] Chapter 9, Theorem 1.37) as in [41], Lemma 4.22. Indeed, for each irreducible component $C$ in the $v$-fiber having genus 0, one has

$$[C, C + \omega] = -2\log(\#k(v)).$$

If $M$ is the intersection matrix displayed in (18), and $\delta_{*,*}$ is Kronecker's symbol, we therefore have

$$C\cdot M\cdot\Phi_{\omega,v} = -2 - \frac{1}{\log(\#k(v))}[C,C] - (2g-2)\,\delta_{C,C_\infty} = \begin{cases} 0 & \text{if } C \neq C_\infty, C_0,\\ s - 2g & \text{if } C = C_\infty,\\ s - 2 & \text{if } C = C_0, \end{cases} \tag{35}$$

that is, as $s = g+1$:

$$M\cdot\Phi_{\omega,v} = (g-1)\,(-1, 0, \cdots, 0, 1, 0, \cdots, 0)^t.$$

That equation is (30) (up to a multiplicative scalar), which has been solved in the first case of Lemma 3.3. Therefore

$$\Phi_{\omega,v} = (g-1)\,\Phi_{C_0}, \quad\text{that is:}\quad \omega_{n,m} = \frac{12(1-g)}{(p-1)}\cdot\frac{m}{w_n}. \tag{36}$$

As noted in Remark 3.4 and using (15), this implies the coefficients $\omega_{n,m}$ of $\Phi_{\omega,v}$ satisfy

$$0 \ge \omega_{n,m} \ge \frac{12(1-g)}{p-1}\,e_v > -e_v.$$
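The closed form (36) can be sanity-checked numerically. The sketch below is only an illustration and not part of the original argument. It assumes the standard count of supersingular $j$-invariants in characteristic $p \ge 5$ (weight $w_n = 3$ for $j = 0$ when $p \equiv 2 \bmod 3$, weight $w_n = 2$ for $j = 1728$ when $p \equiv 3 \bmod 4$), uses the relation $s = g+1$ recalled above, and the names `supersingular_weights`, `phi_c0` and `check_case_a` are hypothetical. It checks that $a_{n,m} = -12m/((p-1)w_n)$ satisfies the three families of branch equations of case (a) in the proof of Lemma 3.3, and that $\omega_{n,m} = (g-1)\,a_{n,m}$ lies in $(-e_v, 0]$.

```python
from fractions import Fraction


def supersingular_weights(p):
    """Weights w_n in {1, 2, 3} of the supersingular points, for a prime p >= 5.

    Assumption (standard count): the number s of supersingular j-invariants is
    floor(p/12) plus a correction depending on p mod 12; j = 0 (weight 3)
    is supersingular iff p = 2 mod 3, j = 1728 (weight 2) iff p = 3 mod 4.
    """
    s = p // 12 + {1: 0, 5: 1, 7: 1, 11: 2}[p % 12]
    special = ([3] if p % 3 == 2 else []) + ([2] if p % 4 == 3 else [])
    return special + [1] * (s - len(special))


def phi_c0(p, e_v):
    """Coefficients a[(n, m)] = -12 m / ((p - 1) w_n) for 0 <= m <= w_n e_v."""
    ws = supersingular_weights(p)
    return {(n, m): Fraction(-12 * m, (p - 1) * w)
            for n, w in enumerate(ws) for m in range(w * e_v + 1)}


def check_case_a(p, e_v):
    """Check the branch equations of case (a) and the bound following (36)."""
    ws = supersingular_weights(p)
    s, g = len(ws), len(ws) - 1          # s = g + 1
    a = phi_c0(p, e_v)
    a0 = Fraction(-12 * e_v, p - 1)      # common coefficient on C_0
    checks = [
        # mass formula: sum of 1/w_n equals (p - 1)/12
        sum(Fraction(1, w) for w in ws) == Fraction(p - 1, 12),
        # every branch starts at C_infty (coefficient 0) and ends at C_0
        all(a[(n, 0)] == 0 and a[(n, w * e_v)] == a0 for n, w in enumerate(ws)),
        # equation at C = C_infty
        -1 - sum(a[(n, 1)] for n in range(s)) == 0,
        # equation at C = C_0
        1 + s * a0 - sum(a[(n, w * e_v - 1)] for n, w in enumerate(ws)) == 0,
        # equations at the interior components of each branch
        all(a[(n, m - 1)] - 2 * a[(n, m)] + a[(n, m + 1)] == 0
            for n, w in enumerate(ws) for m in range(1, w * e_v)),
        # omega_{n,m} = (g - 1) a_{n,m} lies in (-e_v, 0]
        all(-e_v < (g - 1) * c <= 0 for c in a.values()),
    ]
    return all(checks)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equalities are tested without rounding; the last check is only meaningful for $g \ge 1$, that is for $p \ge 17$ in the range considered here.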

15e ﬁnally estimate the intersection products c = − K : Q ] [ ∞ , H ] µ and c = − K : Q ] [ ∞ , H ] µ . By the adjunction formula and Hriljac-Faltings’ theorem ([12], Theorem 5.1 (ii)) we compute thatfor any P ∈ X ( p )( K ), − K : Q ]h Θ ( P − g − ω ) = [ P − g − ω − Φ ω ( P ) , P − g − ω − Φ ω ( P )] µ = 1(2 g − [ ω, ω ] µ + gg − P, P ] µ − Φ ω ( P ) where here Φ ω ( P ) is a vertical divisor supported at bad ﬁbers such that[ C, P − g − ω − Φ ω ( P )] = 0 (37)for any irreducible component C of any bad ﬁber of X ( p ) / O K . Hence1(2 g − ω + gg − P, P ] µ − Φ ω ( P ) = − K : Q ]h Θ (( P − ∞ ) + 12 g − H + H )) . (38)We specialize to the case when P = P ∗∗ (where the upper star is 1 or 2 and the lower star is 4or 3) is one of the Heegner points occurring in H or H , respectively. We replace for now the baseﬁeld K by F := Q ( P ∗∗ ) = Q ( √−

1) (respectively, Q ( √− − p )(1 + o (1)) or −

12 log( p )(1 + o (1)) , respectively , (39)by [43], p. 673. If those Heegner points occur we know that p splits in F , so there are two badprimes v and v ′ on O F (therefore two bad ﬁbers on X ( p ) / O F and two G v , G v ′ ) to take intoaccount. We compute Φ ω ( P ∗∗ ) and Φ ω ( P ∗∗ ) . As mentioned at the beginning of the proof, P ∗∗ specializes to the component C at a place, say v , of F above p , and to C ∞ at the conjugateplace v ′ . Conditions (37) therefore give that, for any irreducible component C of the ﬁber at v ,0 = [ C, P ∗∗ − g − ω − Φ ω ( P ∗∗ ) v ] = [ C, − ∞ − g − ω,v − Φ ω ( P ∗∗ ) v ]and using Lemma 3.3, Lemma 3.5 and (36) one obtainsΦ ω ( P ∗∗ ) v = − g − ω,v + Φ C ,v = 12 Φ C ,v whereas, at v ′ : Φ ω ( P ∗∗ ) v ′ = − g − ω,v ′ = −

12 Φ C ,v ′ . Using Lemmas 3.3 and 3.5 again we therefore haveΦ ω ( P ∗∗ ) = X w | p

14 Φ C ,w = X w | p

14 [Φ C ,w , − ∞ ] = 12 a log p = − p ) p − . (40)As for the self-intersection of ω one knows from [62], Introduction, that ω X ( p ) / Z = 3 g log( p )(1 + o (1)) .

16s the quantity F : K ] [ ω ] is known to be independent from the number ﬁeld extension F/K ,the dualizing sheaf ω X ( p ) / O F of X ( p ) over O F (instead of Z ) satisﬁes ω = 6 g log( p )(1 + o (1)).Summing-up, equation (38) implies that[ P ∗∗ , P ∗∗ ] µ = O (log( p )) (41)for each Heegner point P ∗∗ . Now, on the other hand, the vertical divisor Φ P ∗∗ in the sense of (28)and Lemma 3.3 is Φ P ∗∗ = Φ C ,v for the place v of F where P ∗∗ specializes on C and not C ∞ .Therefore − Θ ( P ∗∗ − ∞ ) = [ P ∗∗ − ∞ − Φ P ∗∗ , P ∗∗ − ∞ − Φ P ∗∗ ] µ = − P ∗∗ , ∞ ] µ + [ P ∗∗ , P ∗∗ ] µ + [ ∞ , ∞ ] µ − (Φ P ∗∗ ) (42)whence, using (39), (40), (41) and Lemma 3.5(b):[ P ∗∗ , ∞ ] µ = 12 (cid:0) [ P ∗∗ , P ∗∗ ] µ + [ ∞ , ∞ ] µ − (Φ C ,v ) + 4h Θ ( P ∗∗ − ∞ ) (cid:1) = O (log p ) . Putting everything together and using Lemma 3.5 once more we conclude that c = − K : Q ] [ ∞ , H ] µ = 12[ K : Q ] (cid:0) − [ ∞ , P + P ] µ + [ ∞ , ∞ ] µ (cid:1) = O (log p ) (43)and similarly for c . (Note that the Arakelov intersection products, in the computations around (42),were performed over F = Q ( P ∗∗ ) and not K , although we did not indicate this in the nota-tions in order to keep it from becoming too heavy. We however want quantities over K forthe statement of the theorem, so we need considering Arakelov products over K in (43) above.) (cid:3) Remark 3.7

It may be convenient to write, with notations as in (32), a more symmetric ω as ω = ( g − ∞ + 0) + ( − H − H ) + [ K : Q ] c ω X ∞ (44)which yields an element with no vertical component at bad ﬁbers. j -height and Θ -height In this section we compare two natural heights on X ( p )( Q ), namely the j -height and the oneinduced from the N´eron-Tate Θ-height on J ( p )( Q ). We start with an explicit description of thelatter, for which it is actually convenient to use a bit of Zhang’s language about “adelic metrics”(see [64]) which, in our modular setting, has a very concrete form.Using notations and results from Section 2.2.2 we therefore consider the limit, as e v goes to ∞ ,of the dual graph of the special ﬁber of X ( p ) at a place v of a p -adic local ﬁeld with ramiﬁcationindex e v at p (see Figure 2.2.2). Here we normalize the length of the s = g + 1 edges from C ∞ to C to be 1, so that the vertex C n,m corresponds to the point of the n th edge with distance me v w n from the origin C ∞ . Now associate to any edge n ∈ { , · · · , s } the quadratic polynomial function g n ( x ) : [0 , → R , x x (cid:18) ( w n − p −

1) ) x − w n −

12 ( g − p − (cid:19) . (45)For K any number ﬁeld, P in X ( p )( K ), and v a place of K whose ramiﬁcation degree and residualdegree are still denoted by e v and f v respectively, let G ( P ( K v )) = e v f v log( p ) · g n ( C P ( k ( v )) ) (46)where C P ( k ( v )) is the component to which the specialization of P belongs at v , identiﬁed to a pointof the n th edge where it lives. 17 heorem 4.1 For any number ﬁeld K , there is an element ˜ ω Θ ,K = ( g · ∞ + Φ Θ ,K + c Θ ,K X ∞ ) (47) of d CH ( p ) num R ,µ such that, for any P ∈ X ( p )( K ) one has, with notations as in Proposition 3.6, h Θ ( P − ∞ + 12 ω ) = 1[ K : Q ] [ P, ˜ ω Θ ,K ] µ (48) and the terms of (47) satisfy: ≥ [ P, Φ Θ ,K ] ≥ − K : Q ] log( p ) and c Θ ,K = [ K : Q ] O (log p ) . (49) Passing to the limit on all number ﬁelds, the height induced on X ( p )( Q ) by pulling-back N´eron-Tate’s Θ -height on J ( p )( Q ) via the embedding P P − ∞ + ω can be written as: h Θ ( P − ∞ + 12 ω ) = 1[ K : Q ]  g [ P, ∞ ] µ + X v ∈ M K ,v | p G ( P ( K v )) + c Θ ,K  (50) where Zhang’s Green function G at bad ﬁbers is deﬁned in (45) and (46).In any case one has that the height satisﬁes h Θ ( P − ∞ + ω K : Q ] [ P, g · ∞ ] µ + O (log p ) . (51) Proof

We prove (48) and (49); from there reformulation (50) and (51) are straightforward. Recall $\mathcal{X}_0(p)$ denotes the minimal regular model of $X_0(p)$ on $\mathrm{Spec}(\mathcal{O}_K)$, that $\mathcal{J}_0(p)$ is the Néron model of $J_0(p)$ on the same base, and $\mathcal{J}_0(p)^0$ stands for its neutral component. Let $\delta$ be an element of $J_0(p)(K)$, seen as a degree 0 divisor on $X_0(p)$. Up to making a base extension we can assume $\delta$ is linearly equivalent to a sum of points in $X_0(p)(K)$. We shall denote by $\tilde\delta = \delta + \Phi_\delta$ (for $\Phi_\delta$ some vertical divisor on $\mathcal{X}_0(p)$, with multiplicity 0 on the component containing $\infty$, following our running conventions) the associated element of the neutral component $\mathcal{J}_0(p)^0(\mathcal{O}_K)$ (that is, the one whose associated divisor has degree zero on each irreducible component, in any fiber, of $\mathcal{X}_0(p)$, and therefore defines a point of $\mathcal{J}_0(p)^0(\mathcal{O}_K)$). For any point $P$ in $X_0(p)(K) \hookrightarrow \mathcal{X}_0(p)(\mathcal{O}_K)$ let similarly $\Phi_P$ be the vertical divisor on $\mathcal{X}_0(p)$, with support on the bad fibers, such that $(P - \infty - \Phi_P)$ has divisor class belonging to the neutral component $\mathcal{J}_0(p)^0(\mathcal{O}_K)$ and, again, $\Phi_P$ has everywhere trivial $\infty$-component, see (28). Recall we can compute $\Phi_P$ explicitly by Lemma 3.3. We write $\Phi_P = \sum_{v\in M_K,\,v\mid p}\sum_{C_v} a_{C_v}[C_v]$ where the sum is taken on irreducible components $C_v$ of vertical bad fibers of $\mathcal{X}_0(p)$. Using notations of Lemma 3.3 (b) we also define the following new vertical divisor at bad fibers:

$$\Phi_{\vartheta,K} := \sum_{v\in M_K,\,v\mid p}\ \sum_{Q_v} a_{C_{Q_v}} C_{Q_v} = \sum_{v\mid p}\ \sum_{(n_0,m_0)} a^v_{n_0,m_0}\, C_{n_0,m_0} \tag{52}$$

so that

$$a^v_{n_0,m_0} = \left(\frac{m_0}{w_{n_0}e_v}\Big(1-\frac{12}{(p-1)w_{n_0}}\Big) - 1\right)\cdot m_0 .$$

Our very definitions imply

$$\Phi_P^2 = [P,\Phi_P] = [P,\Phi_{\vartheta,K}] \tag{53}$$

for any $P \in X_0(p)(K)$.
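The coefficients entering (52) come from case (b) of Lemma 3.3, whose closed formulas can be checked against the intersection equations listed in its proof. The following sketch is illustrative only: it relies on the same assumed count of supersingular weights as any standard reference (weight 3 for $j = 0$ iff $p \equiv 2 \bmod 3$, weight 2 for $j = 1728$ iff $p \equiv 3 \bmod 4$), branches are indexed from 0 rather than 1, and the names `phi_case_b` and `check_case_b` are hypothetical.

```python
from fractions import Fraction


def supersingular_weights(p):
    """Weights w_n of the supersingular points for a prime p >= 5 (assumed count)."""
    s = p // 12 + {1: 0, 5: 1, 7: 1, 11: 2}[p % 12]
    special = ([3] if p % 3 == 2 else []) + ([2] if p % 4 == 3 else [])
    return special + [1] * (s - len(special))


def phi_case_b(p, e_v, n0, m0):
    """Coefficients a[(n, m)] of Phi_{P,v} when P specializes to C_{n0, m0}."""
    ws = supersingular_weights(p)
    w0 = ws[n0]
    assert 0 < m0 < w0 * e_v, "C_P must differ from C_infty and C_0"
    slope = Fraction(m0, w0 * e_v) * (1 - Fraction(12, (p - 1) * w0))
    beta = Fraction(-12 * m0, (p - 1) * w0 * e_v)
    a = {}
    for n, w in enumerate(ws):
        for m in range(w * e_v + 1):
            if n != n0:
                a[(n, m)] = beta * Fraction(m, w)   # branches not containing C_P
            elif m <= m0:
                a[(n, m)] = (slope - 1) * m         # between C_infty and C_P
            else:
                a[(n, m)] = slope * m - m0          # between C_P and C_0
    return a


def check_case_b(p, e_v, n0, m0):
    """Check the four families of equations of case (b)."""
    ws = supersingular_weights(p)
    s = len(ws)
    a = phi_case_b(p, e_v, n0, m0)
    a0 = a[(n0, ws[n0] * e_v)]                      # coefficient on C_0
    ok = all(a[(n, 0)] == 0 and a[(n, w * e_v)] == a0 for n, w in enumerate(ws))
    ok &= -1 - sum(a[(n, 1)] for n in range(s)) == 0                       # C_infty
    ok &= s * a0 - sum(a[(n, w * e_v - 1)] for n, w in enumerate(ws)) == 0  # C_0
    for n, w in enumerate(ws):
        for m in range(1, w * e_v):
            hit = 1 if (n, m) == (n0, m0) else 0    # P meets only C_{n0, m0}
            ok &= hit - a[(n, m - 1)] + 2 * a[(n, m)] - a[(n, m + 1)] == 0
    return ok
```

The value `phi_case_b(p, e_v, n0, m0)[(n0, m0)]` is then the coefficient $a^v_{n_0,m_0}$ of (52), which is the only coefficient of $\Phi_{P,v}$ that contributes to $[P,\Phi_P]$ in (53).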
Using Faltings’ Hodge index theorem we can write the N´eron-Tate18eight h Θ ( P − ∞ + δ ) as:h Θ ( P − ∞ + δ ) = − K : Q ] [ P − ∞ + ˜ δ − Φ P , P − ∞ + ˜ δ − Φ P ] µ = 12[ K : Q ] ([ P, ω + 2 ∞ − δ ] µ + 2[ P, Φ P ] µ − [Φ P , Φ P ] µ +[˜ δ, ∞ − ˜ δ ] µ − [ ∞ , ∞ ] µ )= 12[ K : Q ] ([ P, ω + 2 ∞ − δ + Φ ϑ,K ] µ + [˜ δ, ∞ − ˜ δ ] µ − [ ∞ , ∞ ] µ )= 1[ K : Q ] [ P, ˜ ω δ ] µ (54)with ˜ ω δ := (cid:18)

12 ( ω + Φ ϑ,K ) + ∞ − ˜ δ (cid:19) + c δ X ∞ (55)for X ∞ some ﬁxed archimedean ﬁber of X ( p ) and c δ is the real number c δ = 12 (cid:16) − [ ∞ , ∞ ] µ + [˜ δ, ∞ − ˜ δ ] µ (cid:17) . (56)Note that ˜ ω δ does not depend on P (as Φ ϑ,K was introduced to that aim).Let us now take δ = ω / − ( H + H ) / ∈ · J ( p ) ( Q ), as deﬁned in Proposition 3.6.(This is Riemann’s characteristic (the “ κ ” of [26], p. 138 for instance, that is the generic ﬁber ofthe J ( p )( Q ) ⊗ R -part of ω in the decomposition (32).) Set Φ Θ ,K := (Φ ω + Φ ϑ,K ). Then˜ ω Θ := ˜ ω δ = ( g · ∞ + Φ Θ ,K + c Θ ,K X ∞ ) (57)for c Θ ,K which, still using notations of Proposition 3.6 and its proof, is explicitly given by:1[ K : Q ] c Θ ,K = 12 (cid:18) c ω − c − c + 12 h Θ ( H + H ) − K : Q ] ([ ∞ ] µ + [ ∞ , H + H ] µ ) (cid:19) = 12 (cid:18) c ω − K : Q ] [ ∞ ] µ + 12 h Θ ( H + H ) (cid:19) . As in the proof of Proposition 3.6 we invoke p. 673 of [43] to assert h Θ ( H + H ) = O (log( p )).We moreover know from the same Proposition and from Lemma 3.5 that both | c ω | = O (log p ) and[ ∞ , ∞ ] µ = [ K : Q ] O (log p/p ), so that c Θ ,K = [ K : Q ] O (log p ) . (58)The contribution of Φ Θ ,K is controlled by Lemma 3 . ≥ [ P, Φ ϑ,K ] = [ P, Φ P ] = X v ∈ M K ,v | p a C P ,v log( k v ) ≥ X v ∈ M K ,v | p − e v log( p f v ) ≥ − K : Q ] log( p ) (59)On the other hand, by (34), the coeﬃcients of the vertical components Φ ω,v satisfy 0 ≥ ω n,m ≥ − e v ,so writing ω n P ,m P ,v for the coeﬃcient in Φ ω,v of the component containing P ( k ( v )) we have:0 ≥ [ P, Φ ω ] = X v | p ω n P ,m P ,v log( k ( v )) ≥ X v | p − e v log( p f v ) = − [ K : Q ] log( p ) . (60)Putting (58), (59) and (60) together completes the proof of (48) and (49) and the proof. (cid:3) emark 4.2 Estimates on the Green-Zhang function on X ( p ) as in the above theorem will beextended below to the N´eron model over Z of the whole jacobian J ( p ), see Proposition 5.8. Remark 4.3

As already noticed, the involution w p acts as an isometry (actually, an orthogonalsymmetry) with respect to the quadratic form h Θ on J ( p )( K ) ⊗ Z R . Indeed w p acts as multipli-cation by ± J ( p ) ∼ Y f ∈ G Q · S (Γ ( p )) new J f whose factors are h Θ -orthogonal subspaces. (See also [40], Corollaire 4.3, or [41], Theorem 4.5(3).) As w p ( ω ) = ω (see the proof of Proposition 3.6) this impliesh Θ ( P − ∞ + 12 ω ) = h Θ ( w p ( P − ∞ + 12 ω )) = h Θ ( w p ( P ) − ω ) = h Θ ( w p ( P ) − ∞ + 12 ω )using once more that (0) − ( ∞ ) is torsion, so that[ P, ˜ ω Θ ] µ = [ w p ( P ) , ˜ ω Θ ] µ = [ P, w ∗ p (˜ ω Θ )] w ∗ p ( µ ) = [ P, w ∗ p (˜ ω Θ )] µ (61)(see Remark 3.1). This suggests it could sometimes be convenient to write ˜ ω Θ in a w p -eigenbasisof d CH ( p ) num R ,µ instead of that of Theorem 3.2, for instance d CH ( p ) num R ,µ = R ·

12 (0 + ∞ ) ⊕ R · X ∞ ⊕ v | p Γ v ⊕ ( J ( p )( K ) ⊗ R ) (62)where now the Γ v decompose as the direct sum of eigenspaces Γ w p = − v and Γ w p =+1 v , with bases: { C − n,m := C n,m − w p ( C n,m ) } ≤ n ≤ s ≤ m ≤ ewn/ and { C + n,m := C n,m + w p ( C n,m ) − C − C ∞ } ≤ n ≤ s ≤ m ≤ ewn/ (63)respectively. Using Lemma 3.5 and Proposition 3.6, a lengthy but easy computation allows one tocheck that ˜ ω Θ = g ·

12 (0 + ∞ ) + Φ +Θ + γ Θ X ∞ where Φ +Θ is an explicit vertical divisor above p with w ∗ p (Φ +Θ ) = Φ +Θ , so that indeed w ∗ p (˜ ω Θ ) = ˜ ω Θ thus recovering (61).Consider for instance the case of X ( p ) over Z , for p ≡ X ( p ) / Z is regular, sothat there is no need to blow-up singular points of width larger than 1). Here Γ v = Γ − v = R · C − = R · ([ C ∞ ] − [ C ])) and one readily checks that˜ ω Θ = g ∞ ) + γ Θ X ∞ (64)that is, there is no Γ v -component at all in that case. Evaluating h Θ ( ω ) as in the proof ofProposition 3.6 and using Lemma 3.5, γ Θ = − g ∞ , ∞ ] µ + h Θ ( 12 ω ) = gO (log p/p ) + O (log p ) = O (log p ) . We then turn to the j -height, ﬁrst making a comparison of h j with the “degree component”(in the sense of Theorem 3.2) of the hermitian sheaf ω .20 roposition 4.4 Let h j be Weil’s j -height on X ( p ) as deﬁned in in Section 2.2, and let µ and µ e be the (1 , -forms deﬁned in (25) and (26). Recall sup X ( p )( C ) g µ stands for the upper boundfor all Green functions g µ,a relative to some point a of X ( p )( C ) and to the measure µ .If p is a prime number, K is a number ﬁeld, and P belongs to X ( p )( K ) , then h j ( P ) ≤ ( p + 1) K : Q ] [ P, ∞ ] µ + sup X ( p )( C ) g µ + O (1) ! ≤ ( p + 1)[ K : Q ] [ P, ∞ ] µ + O ( p log p ) (65) and similarly h j ( P ) ≤ ( p + 1) K : Q ] [ P, ∞ ] µ e + sup X ( p )( C ) g µ e + O (1) ! ≤ ( p + 1)[ K : Q ] [ P, ∞ ] µ e + O ( p ) . (66) Remark 4.5

As explained in the proof below, the function O ( p log p ) of (65) comes from [63],Corollary 1.5, together with [62], Corollaire 1.3 for the estimate of Faltings’ δ invariant for X ( p ),which imply the suprema of our functions verify:sup X ( p )( C ) g µ ≤ O ( p log p ) . (67)The function O ( p ) of (66) in turns follows from the main result of [9]. Indeed this states explicitlythat sup X ( p )( C ) g µ ≤ . · p + 7 . · p + 1 . · , see [9], Theorem 1.2. It follows from measurescomparison (see (74) below) and the method of P. Bruin that this holds for sup X ( p )( C ) g µ e too, sothat sup X ( p )( C ) g µ e ≤ O ( p ) . (68)It seems that, at least in the case of X ( p ), if we plug into Bruin’s method the estimates of [43]regarding the comparison function F ( z ) between Green-Arakelov and Poincar´e measures, we re-cover bounds of shape O ( p log p ) instead of O ( p ) (see [9], p. 263, and Paragraph 8 (Theorem 7.1in particular)), and the same again holds true for the Green function g µ e . One should thereforebe able to obtain the same error term O ( p log p ) for (66) as for (65).Note that the main theorems of [32] and [3] might even yield that the above functions O ( p )or O ( p log p ) could be replaced by a uniform bound O (1). Proof

This is essentially a question of measure comparisons on X ( p )( C ), between j ∗ ( µ F S ) onone hand (where µ F S is the Fubini-Study (1 , X (1)( C ) ≃ P ( C )) and the Green-Arakelovform µ (respectively, µ e ) on the other hand. We adapt the main result of [17].We deﬁne ﬁrst a somewhat canonical Arakelov intersection product [ · , · ] µ FS on the projectiveline using µ F S . Write P / O K = Proj( O K [ x , x ]) = Spec Zar ( O K [ j ]) (with j = x /x ), so that thehorizontal divisor ∞ ( O K ) is V ( x ) and, for any P = [ x : x ], let the associated Green functionbe g µ FS , ∞ ( P ) = g µ FS , ∞ ( j ( P )) = 12 log (cid:18) | x | | x | + | x | (cid:19) = −

12 log(1 + | j ( P ) | )at any point diﬀerent from ∞ = [0 : 1]. (We note in passing this ad hoc Green function does notneed to fulﬁll the normalization condition (24).) Then for any P in X (1)( K ) one easily checksthat (cid:12)(cid:12)(cid:12)(cid:12) h j ( P ) − K : Q ] [ j ( P ) , ∞ ] µ FS (cid:12)(cid:12)(cid:12)(cid:12) ≤

12 log(2) . (69)21pplying [17], Theorem 9.1.3 and its proof to the setting described above gives, for any P in X ( p )( K ),[ j ( P ) , ∞ ] µ FS ≤ [ P, j ∗ ( ∞ )] µ + ( p + 1) X σ sup X ( p ) σ g µ + 12 X σ Z X ( p ) σ log( | j | + 1) µ (70)where σ runs through the inﬁnite places of K and X ( p ) σ := X ( p ) × O K ,σ C .We estimate the right-hand terms of (70). As for the last integrals we recall that, on the unionof disks of ray | q | < r around the cusps (that is, on the image in X ( p )( C ) of the open subset D r := { z ∈ H , ℑ ( z ) > − (log r ) / π } in Poincar´e upper-half plane H ) for some ﬁxed r in ]0 , (cid:12)(cid:12)(cid:12)(cid:12) f ( q ) q (cid:12)(cid:12)(cid:12)(cid:12) ≤ − r ) for any newform f in S (Γ ( p )). (See for instance [18], Lemma 11.3.7 and its proof.) We alsoknow that the Petersson norm of such an f satisﬁes k f k ≥ πe − π ([18], Lemma 11.1.2). Choose r = 1 / D / , we have (see (25)): µ = i J ) X f ∈ B f dqq ∧ f dqq k f k ≤ e π π i dq ∧ dq. (Sharper bounds should be achievable, but the one above is good enough for our present purpose.)It follows that there exists some real A such that, in the decomposition Z X ( p )( C ) log( | j | + 1) µ = Z X ( p )( C ) ∩ D / log( | j | + 1) µ + Z X ( p )( C ) \ D / log( | j | + 1) µ (71)the ﬁrst term of the right-hand side satisﬁes Z X ( p )( C ) ∩ D / log( | j | +1) µ ≤ e π π [SL ( Z ) : Γ ( p )] Z X (1)( C ) ∩ D / log( | j | +1) i dq ∧ dq ≤ ( p +1) A. As for the second term, remembering that µ has total mass 1 on X ( p )( C ) we check that Z X ( p )( C ) \ D / log( | j | + 1) µ ≤ M / := max X (1)( C ) \ D / (log( | j | + 1))whence the existence of some absolute real number A such that Z X ( p )( C ) log( | j | + 1) µ ≤ ( p + 1) A . (72)Putting this together with (70) we obtain a constant C for which (69) readsh j ( P ) ≤ K : Q ] [ P, j ∗ ( ∞ )] µ + ( p + 1)( sup X ( p )( C ) g µ + A ) . 
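Over $K = \mathbb{Q}$ the comparison (69) between the Weil $j$-height and the Fubini-Study intersection number can be made completely explicit. The sketch below is illustrative and rests on stated assumptions: for $j = a/b$ in lowest terms the Weil height is taken to be $\log\max(|a|,|b|)$, and $[j,\infty]_{\mu_{FS}}$ is taken to be the finite contribution $\log|b|$ plus the archimedean term $\tfrac12\log(1+|j|^2)$ (minus the Green function displayed above), that is $\tfrac12\log(a^2+b^2)$. The function names are hypothetical.

```python
import math
from fractions import Fraction


def weil_height(j):
    """Logarithmic Weil height of a rational number j = a/b in lowest terms."""
    j = Fraction(j)
    return math.log(max(abs(j.numerator), abs(j.denominator)))


def fubini_study_height(j):
    """Assumed value of [j, infinity] for the Fubini-Study metric over Q:
    log|b| + (1/2) log(1 + |a/b|^2) = (1/2) log(a^2 + b^2)."""
    j = Fraction(j)
    return 0.5 * math.log(j.numerator ** 2 + j.denominator ** 2)


def gap(j):
    """Difference between the two heights; it lies in [0, (1/2) log 2]."""
    return fubini_study_height(j) - weil_height(j)
```

Since $\max(|a|,|b|)^2 \le a^2+b^2 \le 2\max(|a|,|b|)^2$, the gap always lies between $0$ and $\tfrac12\log 2$, with the upper bound attained at $j = \pm 1$; this is the rational case of (69).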
With notations of Lemma 3.5, one further has j ∗ ( ∞ ) = p (0) + ( ∞ ) ≡ ( p + 1) ∞ + p · Φ C (73)as elements of d CH ( p ) num R ,µ . Using Lemma 3.5 (a) we get | [ P, Φ C ] | ≤ [ K : Q ] 6 log pp − j ( P ) ≤ K : Q ] [ P, ( p + 1) ∞ ] µ + ( p + 1)( sup X ( p )( C ) g µ + A ) + O (log p ) ≤ K : Q ] [ P, ( p + 1) ∞ ] µ + C · p log p which is (65).The proof of (66) proceeds along the same lines, with one more ingredient. Applying The-orem 9.1.3 of [17] with the measure µ e instead of µ gives the corresponding version of (70).To obtain an upper bound for sup X ( p )( C ) g µ e we recall that the theorem of Kowalski, Micheland Vanderkam asserts that dim( J e ) ≥ dim( J ( p )) / p . Our measure µ e := J e ) P S e i f dqq ∧ f dqq k f k (see (26)) therefore satisﬁes0 ≤ µ e ≤ g dim( J e ) µ ≤ µ . (74)This shows that as in (68), Bruin’s theorem ([9], Theorem 7.1) provides a universal c e such thatsup X ( p )( C ) g µ e ≤ c e p . (75)Using (72) we obtain: Z X ( p )( C ) log( | j | + 1) µ e ≤ ( p + 1) A e . (76)Finally, equivalence (73) remains naturally true in the Chow group d CH ( p ) num R ,µ e relative to themeasure µ e instead of µ , as remarked in Lemma 3.5 (a) . This completes the proof of (66). (cid:3) We can ﬁnally relate h j and the N´eron-Tate height h Θ relative to the Θ-divisor (see (10)): Theorem 4.6

There are real numbers γ, γ such that the following holds. Let K be a numberﬁeld and p a prime number. Let ω := − ( H + H ) be the -component of the canonical sheaf ω on X ( p ) over K (as in Proposition 3.6 and Theorem 4.1). If P is a point of X ( p )( K ) then h j ( P ) ≤ (12 + o (1)) · h Θ ( P − ∞ + 12 ω ) + γ · p log p (77) and h j ( P ) ≤ (24 + o (1)) · h Θ ( P − ∞ ) + γ · p log p. (78) Remark 4.7

Theorem 4.6 oﬀers only one direction of inequality between j -height and Θ-height:with our method of proof, it is harder to give an eﬀective form to the reverse inequality, becauseof the metrics comparisons we use (see below).Notice also that going through the above proofs using the estimate sup X ( p )( C ) g µ = O (1)of [32] and [3] (see Remark 4.5) would even give an error term of shape O ( p ) instead of O ( p log p )in (78).Those results are in some sense (hopefully sharp) special cases of the main results of [54], afterrewriting the j -function in terms of classical Θ. Proof

Using Theorem 4.1, (51), Proposition 4.4 and (15) we obtainh j ( P ) ≤ p + 1 p −

13 h Θ ( P − ∞ + 12 ω ) + O ( p log p ) . The last estimate (78) of the theorem comes from the fact that h Θ is a quadratic form and thath Θ ( ω ) = O (log p ) (79)by the results of [43] now many times mentioned. (cid:3) Height of modular curves and the various W d We prove in this section a certain number of technical results about heights of cycles in themodular jacobian, which will be useful in the sequel. For applications of the explicit arithmeticB´ezout theorem displayed in next section (Proposition 6.1), we indeed ﬁrst need estimates forthe degree and height of the image of X ( p ), together with its various d th -symmetric products(usually called “ W d ”), within either J ( p ) or its quotient J e , relative to the Θ-polarization. (Formore general considerations on this topic, we also refer to [30].) We estimate those heights bothin the normalized N´eron-Tate sense and for some good (“Moret-Bailly”) projective models, to bedeﬁned shortly.Let us ﬁrst deﬁne the height of cycles relative to some hermitian bundle. For further detailson this we refer to [65], or to [1], Section 2 for a more informal introduction. Deﬁnition 5.1

Let $K$ be a number field and $\mathcal{O}_K$ its ring of integers. Let $\mathcal{X}$ be an arithmetic scheme over $\mathcal{O}_K$, that is an integral scheme which is projective and flat over $\mathcal{O}_K$, having smooth generic fiber $X$ over $K$. Let $\overline{\mathcal{F}}$ be a generically ample and relatively semiample hermitian sheaf with smooth metric, see [65], Section 5. We denote by $\hat{c}_1(\overline{\mathcal{F}})$ the first arithmetic Chern class of $\overline{\mathcal{F}}$, and similarly by $c_1(F)$ the first Chern class of $F$. Such a pair $(\mathcal{X},\overline{\mathcal{F}})$ will be called a model, in the sense of Zhang, of its pull-back $(X,F) = (\mathcal{X}_K,\mathcal{F}_K)$ to the generic fiber.

Consider a model $(\mathcal{X},\overline{\mathcal{F}})$ as in Definition 5.1, and let $Y$ be a $d$-dimensional subvariety of $X$. The degree of $Y$ with respect to $F$ is as usual the non-negative integer given by the $d$-th power self-intersection of $c_1(F)$ with $Y$, that is

$$\deg_F(Y) = \big(c_1(F)^d \,|\, Y\big).$$

We shall sometimes also write that quantity as $\deg_{\overline{\mathcal{F}}}(Y)$.

Now let $\mathcal{Y} \to \mathcal{X}$ be some "generic resolution of singularities" of $Y$ (that is, some good integral model for some desingularization of $Y$, see Section 1 of [65]). The height of $Y$ with respect to $\overline{\mathcal{F}}$ will similarly be the real number obtained by taking the $(\dim\mathcal{Y})$-th power self-intersection of $\hat{c}_1(\overline{\mathcal{F}})$ with $\mathcal{Y}$, divided by the degree of $Y$ and normalized so that:

$$h_{\overline{\mathcal{F}}}(Y) = \frac{\big(\hat{c}_1(\overline{\mathcal{F}})^{d+1} \,|\, \mathcal{Y}\big)}{[K:\mathbb{Q}]\,(d+1)\,\deg_F(Y)}. \tag{80}$$

One can check that this definition does not depend on the desingularization $\mathcal{Y} \to \mathcal{X}$.

Instrumental to us will here be Zhang's control of heights in terms of essential minima. Recall that the (first) essential minimum $\mu^{\mathrm{ess}}_{\overline{\mathcal{F}}}(Y)$ of $Y$ is the minimum of the set of real numbers $\mu$ such that there is a sequence of points $(x_n)$ in $Y(\overline{\mathbb{Q}})$ which is Zariski dense in $Y$ and $h_{\overline{\mathcal{F}}}(x_n) \le \mu$ for all $n$. Zhang's Theorem (5.2) of [65] then asserts that

$$h_{\overline{\mathcal{F}}}(Y) \le \mu^{\mathrm{ess}}_{\overline{\mathcal{F}}}(Y). \tag{81}$$

Note that if $h_{\overline{\mathcal{F}}} \ge 0$ on $Y(\overline{\mathbb{Q}})$ one also knows from [65], Theorem 5.2 the reverse inequality

$$h_{\overline{\mathcal{F}}}(Y) \ge \frac{\mu^{\mathrm{ess}}_{\overline{\mathcal{F}}}(Y)}{d+1}. \tag{82}$$

If $(\mathcal{X},\overline{\mathcal{F}})$ is a model over $\mathcal{O}_K$, in the sense of Definition 5.1, of a polarized abelian variety $(X,F)$ over $K = \mathrm{Frac}(\mathcal{O}_K)$, and $Y$ again is a $d$-dimensional subvariety of the generic fiber $X$, we still define its normalized Néron-Tate height relative to $F$ as the limit

$$\mathrm{h}_F(Y) := \lim_{n\to\infty}\frac{1}{N^{2n}}\, h_{\overline{\mathcal{F}}}\big([N^n]Y\big)$$

where $N$ is any fixed integer larger than 1 and $[N^n]Y$ is the image of $Y$ under multiplication by $N^n$ in $X$. (Footnote: It could have been simpler to systematically use the definition of height of [8], Section 3.1, which does not demand desingularization, as we do in the proof of Proposition 6.1 at the end of Section 6. We could not find references however for Zhang's inequality (see (81)) in that setting, so we stick to the above definitions.) This normalized height, which is a direct generalization of the classical notion of Néron-Tate height for points, is known to depend neither on the model $\mathcal{X}$ of $X$, nor on the extension $\overline{\mathcal{F}}$ of $F$, nor on its hermitian structure (nor on $N$), so that the notation $\mathrm{h}_F(\cdot)$ is finally unambiguous. We refer to [1], Proposition-Définition 3.2 of Section 3 for more details. We will actually use the extension of the two inequalities (81) and (82) to the case where the heights and essential minima are those given by the limit process defining Néron-Tate height (which is known to be non-negative on points), that is, with obvious notations,

$$\frac{\mu^{\mathrm{ess}}_F(Y)}{d+1} \le \mathrm{h}_F(Y) \le \mu^{\mathrm{ess}}_F(Y), \tag{83}$$

see Théorème 3.4 of [1]. As we will see in Section 5.3 and below, Moret-Bailly theory allows, under certain conditions, to interpret Néron-Tate heights as Arakelov projective heights (that is, without going through a limit process).

We shall apply the above to cycles in modular abelian varieties endowed with their symmetric theta divisor: the notation $\mathrm{h}_\Theta$ will always stand for normalized Néron-Tate height of cycles.

Proposition 5.2

Let $X$ be the image via $\pi_A \circ \iota_\infty : X_0(p) \to A$ of the modular curve $X_0(p)$ mapped to a non-zero quotient $\pi_A : J_0(p) \to A$ of its jacobian, endowed with the polarization $\Theta_A$ induced by the $\Theta$-divisor (see (4), (9) and around). The degree and normalized Néron-Tate height of $X$ satisfy:
$$\deg_{\Theta_A}(X) = \dim(A) = O(p) \quad\text{and}\quad h_{\Theta_A}(X) = O(\log p).$$

Proof

If $(A, \Theta_A) = (\mathrm{Jac}(X_0(p)), \Theta)$, it is well-known that the $\Theta$-degree of $X_0(p)$ (or in fact any curve) embedded in its jacobian via some Albanese embedding equals its genus. That can be seen in many ways, among which one can invoke Wirtinger's theorem ([22], p. 171), which yields in fact the desired result for any quotient $(A, \Theta_A)$: using the notation before (12) we have
$$\deg_{\Theta_A}(X) = \int_{X_0(p)} \sum_{f \in B_A} \frac{i}{2}\, \frac{f\frac{dq}{q} \wedge \overline{f\frac{dq}{q}}}{\|f\|^2} = \dim A \le g(X_0(p)).$$
We then apply once more the fact (15) that the genus $g(X_0(p))$ is roughly $p/12$. (We could also have more simply said that the degree is decreasing by projection, as in the argument below.)

As for the height, the main result of [43] gives that the essential minimum of the normalized Néron-Tate height $\mu^{\mathrm{ess}}_{\Theta}(X_0(p))$ is $O(\log p)$. As the height of points decreases by projection (see Section 2.1.2, and in particular (7)) the same is true for $\mu^{\mathrm{ess}}_{\Theta_A}(X)$ and we conclude with Zhang's (83). □

Now for the Néron-Tate normalized height of symmetric squares and variants:

Proposition 5.3

Assume $X := X_0(p)$ has gonality strictly larger than 2 (which is true as soon as $p > 71$, see [52]). Let $\iota := \iota_\infty : X_0(p) \hookrightarrow J_0(p)$ be the Albanese embedding as in Proposition 5.2. Let $X^{(2)}$ be the symmetric square $X_0(p)^{(2)}$ embedded in $J_0(p)$ via $(P_1, P_2) \mapsto \iota(P_1) + \iota(P_2)$, and similarly let $X^{(2),-}$ be the image of $(P_1, P_2) \mapsto \iota(P_1) - \iota(P_2)$. Let $X^{(2)}_{e^\perp}$ and $X^{(2),-}_{e^\perp}$ be the projections of $X^{(2)}$ and $X^{(2),-}$, respectively, to $J_e^\perp$ (the "orthogonal complement" to the winding quotient $J_e$, see paragraph 2.2.3). Then with notations as in Proposition 5.2, taking $A = J_0(p)$ and $A = J_e^\perp$ respectively, one has
$$\deg_\Theta(X^{(2)}) = O(p^2) = \deg_\Theta(X^{(2),-}), \qquad h_\Theta(X^{(2)}) = O(\log p) = h_\Theta(X^{(2),-})$$
and the same holds for the quotient objects:
$$\deg_{\Theta_e^\perp}(X^{(2)}_{e^\perp}) = O(p^2) = \deg_{\Theta_e^\perp}(X^{(2),-}_{e^\perp}); \qquad h_{\Theta_e^\perp}(X^{(2)}_{e^\perp}) = O(\log p) = h_{\Theta_e^\perp}(X^{(2),-}_{e^\perp}).$$

Proof

Denoting by $p_1$ and $p_2$ the obvious projections below, we factor in the common way (see [49], paragraph 3, Proposition 1 on p. 320) our maps over $\mathbb{Q}$ as follows:
$$X_0(p) \times X_0(p) \xrightarrow{\ \pi_A\iota \times \pi_A\iota\ } A \times A \xrightarrow{\ M\ } A \times A \ \underset{p_2}{\overset{p_1}{\rightrightarrows}}\ A, \qquad M : (x, y) \mapsto (x + y,\, x - y), \qquad (84)$$
so $X^{(2)} = p_1 \circ M \circ (\pi_A\iota \times \pi_A\iota)(X_0(p) \times X_0(p))$ and $X^{(2),-} = p_2 \circ M \circ (\pi_A\iota \times \pi_A\iota)(X_0(p) \times X_0(p))$ when $A = J_0(p)$, and the same with $X^{(2)}_{e^\perp}$ and $X^{(2),-}_{e^\perp}$ with $A = J_e^\perp$. We endow $A \times A$ with the hermitian sheaf $\Theta_A^{\boxtimes 2} := p_1^*\Theta_A \otimes p_2^*\Theta_A$. Then $M^*(\Theta_A^{\boxtimes 2}) \simeq (\Theta_A^{\boxtimes 2})^{\otimes 2}$ ([49], p. 320). Therefore, writing $X$ for $\pi_A\iota(X_0(p))$ in short and using Proposition 5.2,
$$\deg_{\Theta_A^{\boxtimes 2}}(M(X \times X)) = 4 \deg_{\Theta_A^{\boxtimes 2}}(X \times X) = 8\,(\deg_{\Theta_A}(X))^2 = O(g^2).$$
As degree decreases by our projections and $O(g^2) = O(p^2)$, $\deg_{\Theta_A}(X^{(2)})$ and $\deg_{\Theta_A}(X^{(2),-})$ are $O(p^2)$.

By definition of essential minima,
$$\mu^{\mathrm{ess}}_{\Theta_A^{\boxtimes 2}}(X \times X) \le 2\,\mu^{\mathrm{ess}}_{\Theta_A}(X).$$
This implies that $\mu^{\mathrm{ess}}_{\Theta_A^{\boxtimes 2}}(M(X \times X)) \le 4\,\mu^{\mathrm{ess}}_{\Theta_A}(X)$. Invoking (83) again and Proposition 5.2, together with the fact that the height of points also decreases by projection,
$$\mu^{\mathrm{ess}}_{\Theta_A}(X^{(2)}) \le \mu^{\mathrm{ess}}_{\Theta_A^{\boxtimes 2}}(M(X \times X)) \le 4\,\mu^{\mathrm{ess}}_{\Theta_A}(X) \le 8\,h_{\Theta_A}(X) \le O(\log p).$$
Therefore $h_{\Theta_A}(X^{(2)}) = O(\log p)$. □

Note that this proof applies more generally to any sub-quotient of $J_0(p)$.

To build up the projective models of the jacobian (over $\mathbb{Z}$, or finite extensions), and associated heights, that we shall need for our arithmetic Bézout, we use Moret-Bailly theory, in the sense of [47], as follows. For more about similar constructions in the general setting of abelian varieties we refer to [7], 2.4 and 4.3; see also [54].

Let therefore $(J, L(\Theta))$ stand for the principally polarized abelian variety $J_0(p)$ endowed with the invertible sheaf associated with its symmetric theta divisor, defined over some small extension of $\mathbb{Q}$ (see (89) below and around for more details). Endow the complex base-changes of the associated invertible sheaf $L(\Theta)$ with its cubist hermitian metric.
If N J, O K is the N´eron model of J over the ring of integers O K of a number ﬁeld K , we know it is a semistable scheme over O K ,whose only non-proper ﬁbers are above primes P of characteristic p , where it then is purely toric.At any such P , with ramiﬁcation index e P , the group scheme N J, O K has components groupΦ P ≃ ( Z /N e P Z ) × ( Z /e P Z ) g − (85)for g := dim J and N := num( p − ) (see e.g. [36], Proposition 2.11).We choose and ﬁx an integer N > K ⊇ Q ( J [2 N ]), for all this paragraph,so that all the 2 N -torsion points in J have values in K . One then observes from (85) that 2 N e P , and Proposition II.1.2.2 on p. 45 of [47] asserts that L (Θ)has a cubist extension, let us denote it by L (Θ), to the open subgroup scheme N J,N of the N´eronmodel N J, O K over O K whose ﬁbers have component group killed by N .Such an extension L (Θ) is actually symmetric ([47], Remarque II.1.2.6.2) and unique (seeTh´eor`eme II.1.1.i) on p. 40 of loc. cit.). Moreover L (Θ) is ample on N J,N ([47], Proposition VI.2.1on p. 134). Its powers L (Θ) ⊗ r are even very ample on N J,N × O K O K [1 / p ] as soon as r ≥

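3, as follows from the general theory of theta functions. Provided $N > 1$, the sheaf $\mathcal{L}(\Theta)^{\otimes N}$ is spanned by its global sections on the whole of $\mathcal{N}_{J,N}$ ([47], Proposition VI.2.2), although we shall not use that last fact as such.

Picking up a basis of generic global sections in $H^0(J_0(p)_K, L(\Theta)^{\otimes N})$, with $N \ge 3$, we thus define a map $J_0(p)_K \xrightarrow{\ \jmath_N\ } \mathbb{P}^n_K$, for $n = N^g - 1$. Assume our generic global sections extend to a set $\mathcal{S}$ in $H^0(\mathcal{N}_{J,N}, \mathcal{L}(\Theta)^{\otimes N})$. Let $\mathcal{J} \stackrel{\jmath}{\hookrightarrow} \mathbb{P}^n_{\mathcal{O}_K}$ be the schematic closure in $\mathbb{P}^n_{\mathcal{O}_K}$ of the generic fiber $(\mathcal{N}_{J,N})_K = J_K$ via the associated composed embedding $J_K \hookrightarrow \mathbb{P}^n_K \hookrightarrow \mathbb{P}^n_{\mathcal{O}_K}$. Define $\mathcal{M} = \jmath^*\mathcal{O}_{\mathbb{P}^n_{\mathcal{O}_K}}(1)$ on $\mathcal{J}$. Let on the other hand $\mathcal{M}_{\mathcal{N}_{J,N}} := \big(\sum_{s \in \mathcal{S}} \mathcal{O}_K \cdot s\big)$ be the subsheaf of $\mathcal{L}(\Theta)^{\otimes N}$ on $\mathcal{N}_{J,N}$ spanned by $\mathcal{S}$. Write $\nu : \widetilde{\mathcal{N}}_{J,N} \to \mathcal{N}_{J,N}$ for the blowup at base points for $\mathcal{M}_{\mathcal{N}_{J,N}}$ on $\mathcal{N}_{J,N}$, that is, the blowup along the closed subscheme of $\mathcal{N}_{J,N}$ defined by the sheaf $\mathcal{L}(\Theta)^{\otimes N} / \mathcal{M}_{\mathcal{N}_{J,N}}$. We have a commutative diagram
$$\begin{array}{ccccc}
 & & \widetilde{\mathcal{N}}_{J,N} & & \\
 & \nearrow & \downarrow \imath_N & \searrow \jmath_N & \\
J_K & \hookrightarrow & \mathcal{J} & \stackrel{\jmath}{\hookrightarrow} & \mathbb{P}^n_{\mathcal{O}_K}
\end{array} \qquad (86)$$
where the only non-trivial map $\jmath_N$ (whence $\imath_N$) is deduced from the fundamental properties of blowups. Considering the complex base-changes of the generic fiber we note that $\mathcal{M}$ is automatically endowed with a cubist hermitian structure induced by that of $L(\Theta)_{\mathbb{C}}$ (see [7], (4.3.3) and following lines).

Definition 5.4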

Given an integer $N \ge 3$, and a number field $K$ containing $\mathbb{Q}(J_0(p)[2N])$, we define the "good model" for $(J_0(p), L(\Theta)^{\otimes N})$ relative to some finite set $\mathcal{S}$ in $H^0(\mathcal{N}_{J,N}, \mathcal{L}(\Theta)^{\otimes N})$, which spans $H^0(J_0(p), L(\Theta)^{\otimes N})$, as the projective scheme $\mathcal{J}$ over $\mathrm{Spec}(\mathcal{O}_K)$ enhanced with the hermitian sheaf $\mathcal{M}$ constructed above, and $h_{\mathcal{M}}$ the associated height.

Outside base points for $\mathcal{M}_{\mathcal{N}_{J,N}}$ on $\mathcal{N}_{J,N}$ the blowup $\nu : \widetilde{\mathcal{N}}_{J,N} \to \mathcal{N}_{J,N}$ is an isomorphism and on that open locus we have
$$\mathcal{L}(\Theta)^{\otimes N} \simeq \mathcal{M}_{\mathcal{N}_{J,N}} \simeq \imath_N^*\mathcal{M} = \jmath_N^*\mathcal{O}_{\mathbb{P}^n_{\mathcal{O}_K}}(1) \qquad (87)$$
so we stress that the height $h_{\mathcal{M}}$ of our "good models" for $(J_0(p), L(\Theta)^{\otimes N})$ will indeed compute ($N$ times) the Néron-Tate height of certain $\overline{\mathbb{Q}}$-points (those whose closure factorizes through $\mathcal{N}_{J,N}$ deprived of the base points for $\mathcal{S}$), but definitely not all. For arbitrary points, still, one can deduce from the work of Bost ([7], 4.3) the following inequality.

Proposition 5.5

For any point $P$ in $J_0(p)(\overline{\mathbb{Q}})$, the height $h_{\mathcal{M}}(P)$ of Definition 5.4 satisfies
$$h_{\mathcal{M}}(P) \le N\, h_\Theta(P).$$

Proof

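We briefly adapt [7], 2.4 and 4.3, using our above notations. Of course this statement has nothing to do with modular jacobians, and holds for any abelian variety over a number field. Let $N'$ be some integer such that $P$ defines a section of $\mathcal{N}_{J,N'}(\mathcal{O}_F)$ for some ring of integers $\mathcal{O}_F$. Up to replacing $\mathcal{O}_F$ by a sufficiently ramified finite extension, we can assume $L(\Theta)^{\otimes N}$ has a cubist extension $\mathcal{L}(\Theta)^{\otimes N}$ to all of $\mathcal{N}_{J,N'}$ over $\mathcal{O}_F$ ([47], Proposition II.1.2.2). One has
$$h_\Theta(P) = \frac{1}{N\,[F:\mathbb{Q}]}\, \widehat{\deg}\big(P^*(\mathcal{L}(\Theta)^{\otimes N})\big).$$

As in (86) however we see that there is no well-defined map from $\mathcal{N}_{J,N'}$ to $\mathbb{P}^n_{\mathcal{O}_F}$ because $\mathcal{L}(\Theta)^{\otimes N}$ need not be spanned by elements of $\mathcal{S}$ on all of $\mathcal{N}_{J,N'}$ (even though it is, by hypothesis, on the generic fiber). To remedy this we adapt the construction (86).

If $\pi' : \mathcal{N}_{J,N'} \to \mathrm{Spec}(\mathcal{O}_F)$ is the structural morphism, we now define $\mathcal{M}'_{\mathcal{N}} := \big(\sum_{s \in \mathcal{S}} \mathcal{O}_F \cdot s\big)$ as the subsheaf of $\mathcal{L}(\Theta)^{\otimes N}$ on $\mathcal{N}_{J,N'}$ spanned by $\mathcal{S}$, still endowed with the metric induced by that of $\mathcal{L}(\Theta)^{\otimes N}$. One checks (see [7], (4.3.8)) that the projective model $\mathcal{J}_{\mathcal{O}_F}$ of $(\mathcal{N}_{J,N'})_F \simeq J_F$ in $\mathbb{P}^n_{\mathcal{O}_F}$ defined as in (86) yields a sheaf $\mathcal{M}'$ on $\mathcal{J}_{\mathcal{O}_F}$, whence a height $h_{\mathcal{M}'}$, which coincides with the height $h_{\mathcal{M}}$ on the base change of the good model $\mathcal{J}_{\mathcal{O}_K}$.

Replacing $\mathcal{N}_{J,N'}$ by its blowup $\nu' : \widetilde{\mathcal{N}}_{J,N'} \to \mathcal{N}_{J,N'}$ at base points for $\mathcal{M}'_{\mathcal{N}}$ in $\mathcal{L}(\Theta)^{\otimes N}$ on $\mathcal{N}_{J,N'}$, we keep on following construction (86) to obtain maps $\imath'_N : \widetilde{\mathcal{N}}_{J,N'} \to \mathcal{J}_{\mathcal{O}_F}$ and $\jmath'_N : \widetilde{\mathcal{N}}_{J,N'} \to \mathbb{P}^n_{\mathcal{O}_F}$ such that the Zariski closure of $\jmath'_N(\widetilde{\mathcal{N}}_{J,N'})$ identifies with $\mathcal{J}_{\mathcal{O}_F}$. We moreover have
$$\imath'^{*}_N(\mathcal{M}') = \nu'^{*}(\mathcal{L}(\Theta)^{\otimes N}) \otimes \mathcal{O}(-E)$$
where $E$ is the exceptional divisor of the blowup, which is by definition effective. The section $P$ of $\mathcal{N}_{J,N'}(\mathcal{O}_F)$ lifts to some $\tilde{P}$ of $\widetilde{\mathcal{N}}_{J,N'}(\mathcal{O}_F)$. Let $\varepsilon_P$ be the section of $\mathcal{J}(\mathcal{O}_F)$ defined by the Zariski closure of $P(F)$ in $\mathcal{J}$. One can finally compute
$$\begin{aligned}
h_{\mathcal{M}}(P) = h_{\mathcal{M}'}(P) &= \frac{1}{[F:\mathbb{Q}]}\,\widehat{\deg}\big(\varepsilon_P^*(\mathcal{M}')\big) = \frac{1}{[F:\mathbb{Q}]}\,\widehat{\deg}\big(\tilde{P}^*(\imath'^{*}_N(\mathcal{M}'))\big) \\
&\le \frac{1}{[F:\mathbb{Q}]}\,\widehat{\deg}\big(\tilde{P}^*(\nu'^{*}(\mathcal{L}(\Theta)^{\otimes N}))\big) = \frac{1}{[F:\mathbb{Q}]}\,\widehat{\deg}\big(P^*(\mathcal{L}(\Theta)^{\otimes N})\big) = N\, h_\Theta(P).
\end{aligned}$$
□

The following straightforward generalization to higher dimension will be useful in the next section.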

Corollary 5.6 If $Y$ is a $d$-dimensional irreducible subvariety of $J_0(p)$ then
$$h_{\mathcal{M}}(Y) \le (d+1)\, N\, h_\Theta(Y).$$

Proof

Combine Zhang's formulas (81) and (83) with Proposition 5.5. □
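The combination can be spelled out as a three-step chain. This is only a sketch: it assumes that Zhang's inequality (81) applies to the good model $(\mathcal{J}, \mathcal{M})$ of Definition 5.4, and the symbols $\mu^{\mathrm{ess}}_{\mathcal{M}}$ and $\mu^{\mathrm{ess}}_{\Theta}$ (essential minima attached to $h_{\mathcal{M}}$ and $h_\Theta$) are notation introduced here for illustration.

```latex
\begin{align*}
h_{\mathcal{M}}(Y)
  &\le \mu^{\mathrm{ess}}_{\mathcal{M}}(Y)
     && \text{by (81), applied to the model } (\mathcal{J},\mathcal{M}),\\
  &\le N\,\mu^{\mathrm{ess}}_{\Theta}(Y)
     && \text{since } h_{\mathcal{M}}(x)\le N\,h_\Theta(x)
        \text{ for every point } x \text{ (Proposition 5.5)},\\
  &\le (d+1)\,N\,h_\Theta(Y)
     && \text{by the left-hand inequality of (83).}
\end{align*}
```

The middle step only uses that a Zariski dense sequence $(x_n)$ in $Y$ with $h_\Theta(x_n) \le \mu$ also satisfies $h_{\mathcal{M}}(x_n) \le N\mu$.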

Recall from (8) that one can deﬁne the “pseudo-projection” P ˜ J e ⊥ ( ι ∞ ( X ( p ))) of the image of X ( p ) ι ∞ ֒ → J ( p ) on the subabelian variety ˜ J e ⊥ ⊆ J ( p ). Let X e ⊥ be any of its irreducible com-ponents. Deﬁne similarly X (2) , X (2) , − , X (2) e ⊥ and X (2) , − e ⊥ as in Proposition 5.3. Note that, byconstruction, the degree and normalized N´eron-Tate height of X e ⊥ (and other similar pseudo-projections: X (2) e ⊥ etc.), as an irreducible subvariety of J ( p ) endowed with h Θ , are those of π J ⊥ e ( X ( p )) = X (2) , − e ⊥ relative to the only natural hermitian sheaf of J ⊥ e , that is, the Θ ⊥ e = Θ J ⊥ e described in paragraph 2.1.2 and estimated in Proposition 5.2. Corollary 5.7

For any ﬁxed integer N ≥ , and any number ﬁeld K containing Q ( J ( p )[2 N ]) ,let ( J , M ) be the good model for ( J ( p ) , L (Θ) ⊗ N ) , and h M the associated projective height, givenin Deﬁnition 5.4. Let X be the image of X ( p ) ι ∞ ֒ → J ( p ) , and more generally X (2) , X (2) , − , X (2) e ⊥ and X (2) , − e ⊥ be the objects X (2) , . . . deﬁned in Proposition 5.3 (or their pseudo-projections).Then their M ⊗ N -heights are bounded from above by similar functions as their N´eron-Tate height(Proposition 5.3). Explicitly, h M ⊗ N ( X ( p )) is less than O (log p ) , and h M ⊗ N X (2) , etc., are allless than O (log p ) . Similarly the M ⊗ N -degree of X ( p ) is O ( p ) , and the M ⊗ N -degrees of X (2) ,etc., are all O ( p ) . Proof

Combine Zhang’s formulas (81) and (83) with Propositions 5.2, 5.3 and 5.5. (cid:3) .3 Estimates on Green-Zhang functions for J ( p ) We shall later on need some control on the p -adic N´eron-Tate metric of Θ as alluded to in Re-mark 4.3. (Those statements can probably be best formulated in the setting of Berkovich theory,for which one might check in particular [15], Proposition 2.12, and [61]. A useful point of viewis also proposed by that of “tropical jacobians”, see [44] and [31]. We will content ourselves herewith our down-to-earth point of view). We therefore deﬁneˆΦ p := lim −→ K P ⊇ Q p Φ P as the direct limit, on a tower of totally ramiﬁed extensions K P / Q p , of the component groups Φ P of the N´eron models of J ( p ) at P , see (85). The compatible embeddings Z := h C − C ∞ i ≃ h (0) − ( ∞ ) i ≃ Z /N Z ֒ → Φ P for each P induce an exact sequence 0 → Z → ˆΦ p → lim −→ e P ( Z /e P Z ) g ≃ ( Q / Z ) g →

0. Passing tothe real completion yields a presentation:0 → Z ≃ Z /N Z → ˆΦ p, R → ( R / Z ) g → p, R must be the “skeleton”, in the sense of Berkovich, of the N´eron model over Z p of J ( p ), and the tropical jacobian, see [31], of the curve X ( p ) above p ). The right-hand side of (88)is more canonically written ( R / Z ) g ≃ ( R / Z ) s / ∆( R ), for ∆ the almost diagonal map∆( z ) ( 1 w i z ) ≤ i ≤ g +1 (see [36], Proposition 2.11.(c)).We then sum-up useful properties about theta divisors and theta functions “over Z ”.As J ( p ) is principally polarized over Q , the complex extension of scalars J ( p )( C ) can be givena classical complex uniformization C g / ( Z g + τ Z g ) for some τ in Siegel’s upper half plane. Theassociated Riemann theta function: θ ( z ) = X m ∈ Z g exp( iπ t m · τ · m + 2 iπ t m · z ) (89)deﬁnes the tautological global section 1 of a trivialization of O J ( p ) (Θ C )(= M ⊗ /N C ) for Θ C theimage W g − of some ( g − X ( p ) in J ( p ). More precisely, Riemann’s classicalresults (e.g. [22], Theorem on p. 338) assert that div( θ ( z )) = Θ C is the divisor with support { κ P + P g − i =1 ι P ( P i ) , P i ∈ X ( p )( C ) } , where for any P ∈ X ( p )( C ) we write ι P : X ( p ) ֒ → J ( p )for the Albanese morphism with base point P , and κ = κ P = “ ι P ( K X p ) )2 ” for the image ofRiemann’s characteristic, which is some pre-image under duplication in J ( p ) of the image of somecanonical divisor: ω = ι P ( K X ( p ) ) (see Theorem 4.6 above).Among the translates Θ D = t ∗ D Θ, for D ∈ J ( p )( C ), of the above symmetric Θ, the divisorΘ κ = t ∗ κ Θ = P g − i =1 ι ∞ ( X ( p ) Q ) deﬁnes an invertible sheaf L (Θ κ ) on J ( p ) over Q . If N J, denotesthe neutral component of the N´eron model of J over Z and L (Θ κ ) is the cubist extension of L (Θ κ )to N J, (compare [47], Proposition II.1.2.2, as in Section 5.2 above), we know that H ( N J, , L (Θ κ ))is a (locally...) 
free Z -module of rank 1, so that the complex base-change H ( J ( p )( C ) , L (Θ κ, C ))is similarly a complex line. This means that if s θ is a generator of the former space, whose imagein the later we denote by s θ, C , there is a nonzero complex number C ϑ such that s θ, C ( z ) = C ϑ · θ ( z + κ ) . (90)Up to making some base-change from Z to some O K we can now forget about κ and come backto the symmetric Θ: we deﬁne a global section s J := ( t ∗− κ ) s θ ∈ H ( N J, , L (Θ) O K ) so that s J , C ( z ) = C ϑ · θ ( z ) . (91)29f one replaces N J, by the N´eron model, say N O K , of J ( p ) over any extension K of K ,then [47], Proposition II.1.2.2 insures that up to making some further ﬁeld extension K /K thesheaf L (Θ) K has a cubist extension L (Θ) O K to N O K × O K O K . Therefore s J extends to a rational section (we shall sometimes write meromorphic section) of L (Θ) O K on N O K × O K O K .Abusing notations we still denote that extended section by s J , and write accordingly Θ for itsdivisor div( s J ) on N O K × O K O K . Because s J is well-deﬁned (and non-zero) on the neutralcomponent of the N´eron model, its poles on N O K × O K O K can only show-up at places of badreduction. Proposition 5.8

The multiplicity of the $\Theta$-divisor at any component of the Néron model of $J_0(p)$ over $\mathbb{Z}$, normalized to be $0$ along the neutral component, is $O(p)$.

Proof

We start by the following observations. Let us write s J , C ( z ) = C ϑ · θ ( z ) as in (91). Take D in J ( p )( C ) which can written as the linear equivalence class of some divisor D = g X i =1 − ( Q i − ∞ )for points Q i in X ( p )( C ). We associate to D the embedding: ι κ + D : (cid:26) X ( p ) ֒ → J ( p ) P cl( P − ∞ + κ + D )where κ is Riemann’s characteristic (see just before (57)). For such a D whose Q i are assumed tobelong to X ( p )( Q ), we know from the proof of Theorem 4.1 (see (54)) thath Θ ( ι κ + D ( P )) = 1[ K ( P, D ) : Q ] [ P, ˜ ω D ] µ (92)with ˜ ω D = X i Q i + Φ D + c D X ∞ (93)and Φ D is the explicit vertical divisorΦ D = 12 (Φ ω + Φ ϑ ) − g X i =1 Φ Q i (94)at each bad place, with notations as those of the proof of Theorem 4.1, see (55).Moreover, it is well-known that there is a subset of J ( p )( C ) which is open for the complextopology, and even the Zariski topology, in which all points D = P g − ( Q i − ∞ ) as above are suchthat dim C H ( X ( p )( C ) , L ( − D + g · ∞ ) C ) = dim C H ( X ( p )( C ) , ι ∗ κ + D L (Θ C )) = 1 (95)so that ι ∗ κ + D (Θ C ) = P i Q i, C , the latter being an equality between eﬀective divisors, not justa linear equivalence ([22], pp. 336–340). As the height h Θ , in the N´eron model of J ( p ), canbe understood as the Arakelov intersection with Θ = div( s J ) it follows that, on the curve X ( p ), div( s J , C ) ∩ ι κ + D ( X ( p ))( C ) = ∪ i ι κ + D ( Q i, C ), or div( ι ∗ κ + D ( s J , C )) = P i Q i over C . 
Moreprecisely, extending base to some ring of integers O K so that the Q i deﬁne sections of the minimalregular model X ( p ) O K of X ( p ) over O K , and making if necessary a further base extension suchthat L (Θ) has a cubist extension on the whole N´eron model of J ( p ) over O K (as after (91)),one sees that s J deﬁnes a meromorphic section of L (Θ) O K and the restriction to the genericﬁber X ( p ) K of div( ι ∗ κ + D ( s J )) has to be equal (and not merely linearly equivalent) to P i Q i .Now in such a situation, the multiplicity of div( s J ) on a component of the N´eron model to30hich X ( p ) smooth O K is mapped via ι κ + D , can be read on the multiplicity of ι ∗ κ + D ( s J ) along thatcomponent of X ( p ) smooth O K . In turn, because of decompositions of the arithmetic Chow groupsimilar to that of Theorem 3.2, multiplicities of div( s J ) are determined by the Φ D of (93), upto constant addition of vertical ﬁbers. The property that div( s J ) has multiplicity 0 along theneutral component of the N´eron model (see (91)) ﬁxes that last indetermination. Now if P is aplace of bad reduction for X ( p ) O K , and if the Q i move sligthly in the P -adic topology (withoutmodifying their specialization component at P ), the vertical divisor Φ D does not change either at P , and the above reasoning regarding the components values of Θ is actually independent from thefact that condition (95) holds true or not (provided, we insist, that the specialization componentsof the Q i at P do not vary).We shall gain some ﬂexibility with a last preliminary remark. If k is any integer between 0and N − N is the order of the Eisenstein element (0 − ∞ )), the divisor ˜ ω D of (93) canstill be written as˜ ω D = (cid:18) k · g − k ) · ∞ − k Φ C + 12 (Φ ω + Φ ϑ ) − ˜ D (cid:19) + c D X ∞ so that if D = g X i =1 − ( Q i − ∞ ) ! + k (0 − ∞ ) = k X i =1 − ( Q i −

0) + g X i = k +1 − ( Q i − ∞ )then ˜ ω D = P gi =1 Q i + Φ D + c D X ∞ where Φ D is stillΦ D = 12 (Φ ω + Φ ϑ ) − g X i =1 Φ Q i . (96)Coming back to the proof of the present Proposition 5.8, and assuming ﬁrst D = 0, it followsfrom what we have just discussed that the multiplicity of the Θ-divisor on the components of thejacobian to which the components of X ( p ) smooth O K map under ι κ is given by the functions g n and G of (45) and (46), see Theorem 4.1. To obtain the multiplicity of the Θ-divisor on all componentsof the jacobian we shall shift our Albanese embeddings ι κ + D in order to explore all of J ( p ) /J ( p ) with successive translations of X ( p ) smooth O K inside J ( p ).To be more explicit, let C be an element of the component group J ( p ) /J ( p ) at P , and D = P gi =1 ( P i − ∞ ) be a divisor, with all P i in X ( p )( K ), which reduces to C at P . For all r in { , . . . , g } , set D r = P ri =1 ( P i − ∞ ) and let also k r in { , . . . , N − } and Q i,r be g associatedpoints on the curve such that one can write both D r = r X i =1 ( P i − ∞ ) and D r = g X i =1 − ( Q i,r − ∞ ) + k r (0 − ∞ ) . As always in this proof, up to making a ﬁnite base-ﬁeld extension one can assume all points havevalues in K . Recall also from the discussion above that one can move slightly the Q i in the P -adictopology, as all that interests us here is the component C r , 1 ≤ r ≤ g , of ( J ( p ) /J ( p ) ) P to which D r maps. One can therefore assume if one wishes that ι ∗ κ + D r (Θ C ) = P i Q i, C (equality, not justlinear equivalence). The presentation of Φ P given in (88) and above also shows one can assumethat the specialization components at P of the Q i,r , in X ( p ) smooth O K , which are not C ∞ , are alldiﬀerent (see Figure 2.2.2).Taking ﬁrst D = 0, that is, using the map ι κ , we already remarked that (94) implies the value V of div( s J ) on C is V = [ (Φ ϑ + Φ ω ) , P ] = (cid:0) [Φ ω , P ] + [Φ P ] (cid:1) (see (53)). 
By Remark 3.4and (34), | V | ≤ C by considering the Albanese image ι κ + D ( X ( p ) smooth O K ) andlooking at the image of P . Here we need not to forget that the ∞ -cusp in X ( p ) now maps to31 , so the normalization of components-divisor on the curve X ( p ) smooth O K at P cannot be ﬁxed tobe 0 along the ∞ -component any longer: it needs to take the value V found above, in order tomatch with the normalization of the theta divisor on the jacobian. Applying the same reasoningas before with formula (96) gives that the value of Θ on C is V = [ P ,

12 (Φ ω + Φ ϑ ) − g X i =1 Φ Q i, + V ] = 12 (cid:0) [Φ ω , P ] + [Φ P ] (cid:1) − g X i =1 [Φ Q i, , P ] + V so that | V | ≤ Q i, specialize to diﬀerent branchesof Figure 2.2.2.From there the inductive process is clear which yields that the value of Θ on C r has absolutevalue less or equal to 7 r , whence the proof of Proposition 5.8. (cid:3) We conclude this section by writing-down, for later use, an explicit version of Mumford’s well-known “repulsion principle” for points, in the case of modular curves.

Proposition 5.9

For P and Q two diﬀerent points of X ( p )( Q ) one has h Θ ( P − Q ) ≥ g − g (h Θ ( P − ∞ ) + h Θ ( Q − ∞ )) − O ( p log p ) . (97) Proof

Let K be a number ﬁeld such that both P and Q have values in K . Using notations ofSection 3, the adjunction formula and Hodge index theorem give2[ K : Q ]h Θ ( P − Q ) = − [ P − Q − Φ P + Φ Q , P − Q − Φ P + Φ Q ] µ = [ P + Q, ω ] µ + 2[ P, Q ] µ + [Φ P − Φ Q ] ≥ [ P + Q, ω ] µ − K : Q ] sup g µ + [Φ P − Φ Q ] . In the same way,[

P, ω ] µ = 2[ K : Q ]h Θ ( P − ∞ ) − P, ∞ ] µ + [ ∞ ] µ − [Φ P ] ≥ [ K : Q ]h Θ ( P − ∞ + 12 ω ) − P, ∞ ] µ + [ ∞ ] µ − [Φ P ] where the last inequality comes from the quadratic nature of h Θ , plus the fact that the error termof (97) allows us to assume h Θ ( P − ∞ ) ≥ − √ h Θ ( ω ) = O (log p ) (see (79) and the end of proofof Theorem 4.6). Now by (51),h Θ ( P − ∞ + 12 ω ) = 1[ K : Q ] [ P, g · ∞ ] µ + O (log p )and using Remark 3.4 and Lemma 3.5 gives[ P, ω ] µ ≥ g − g [ K : Q ]h Θ ( P − ∞ + 12 ω ) + [ K : Q ] O (log p ) . As [Φ P , Φ Q ] = [ P, Φ Q ] = [ Q, Φ P ], we have | [Φ P , Φ Q ] | ≤ K : Q ] log p using Remark 3.4 again.Putting everything together with Remark 4.5 about sup g µ we obtainh Θ ( P − Q ) ≥ g − g (cid:18) h Θ ( P − ∞ + 12 ω ) + h Θ ( Q − ∞ + 12 ω ) (cid:19) − O ( p log p )which, by our previous remarks, can again be written ash Θ ( P − Q ) ≥ g − g (h Θ ( P − ∞ ) + h Θ ( Q − ∞ )) − O ( p log p ) . (cid:3) (For large p , the angle between two points of equal large enough height is here therefore at leastarccos(3 / − ε > π/

6. Of course the natural value is π/2, to which one tends when sharpening the computations.)
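The passage from a repulsion inequality to an angle bound is elementary Euclidean geometry, and can be sketched numerically. The function below is a minimal illustration, not part of the paper's argument: it assumes two vectors of equal squared norm `height` (playing the role of $h_\Theta(P - \infty) = h_\Theta(Q - \infty)$) satisfying $|x - y|^2 \ge c\,(|x|^2 + |y|^2) - \text{error}$. The names `min_angle`, `c`, `error` and `height` are hypothetical, and the sample values of `c` are illustrative inputs rather than the coefficient of (97).

```python
import math


def min_angle(c: float, error: float = 0.0, height: float = 1.0) -> float:
    """Lower bound (in radians) for the angle between two vectors x, y with
    |x|^2 = |y|^2 = height, assuming |x - y|^2 >= c*(|x|^2 + |y|^2) - error.
    """
    if height <= 0:
        raise ValueError("height must be positive")
    # For equal norms, |x - y|^2 = 2*height*(1 - cos(theta)), so the assumed
    # inequality gives cos(theta) <= 1 - c + error / (2*height).
    cos_upper = 1.0 - c + error / (2.0 * height)
    # Clamp to [-1, 1]: an upper bound >= 1 carries no information (angle >= 0).
    cos_upper = max(-1.0, min(1.0, cos_upper))
    return math.acos(cos_upper)


if __name__ == "__main__":
    # Illustrative values only: the error term becomes negligible as the
    # common height grows, and c = 1 corresponds to orthogonality.
    for c in (0.25, 0.5, 1.0):
        print(c, min_angle(c, error=1.0, height=1e6))
```

With `c = 1` and no error term the bound is exactly a right angle, which matches the "natural value" mentioned above. For a fixed error term the bound improves as the common height grows.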

6 Arithmetic Bézout theorem with cubist metric

We display in this section an explicit version of the arithmetic Bézout theorem, in the sense of Philippon or Bost-Gillet-Soulé ([56], [8]), for intersections of cycles in our modular abelian varieties over number fields, with the following variants: we use Arakelov heights (as in Section 5 above, see (80)) on higher-dimensional cycles, and we endow the implicit hermitian sheaf for this height with its cubist metric (instead of Fubini-Study).

It indeed seems that one generally uses Fubini-Study metrics for arithmetic Bézout because they are the only natural explicit ones available on a general projective space (a necessary frame for the approach we follow for Bézout-like statements). They moreover have the pleasant feature that the relevant projective embeddings have tautological basis of global sections with sup-norm less than 1 which, for instance, allows for proving that the induced Faltings height is non-negative on effective cycles (see [19], Proposition 2.6). For our present purposes however, we need bounds for the Néron-Tate heights of points, that is, Arakelov heights induced by cubist metrics. One could in principle have tried working with Fubini-Study metrics as in [8] and then directly compare with Néron-Tate heights, but comparison terms tend to be huge. In the case of rational points, for instance (that is, horizontal cycles of relative dimension 0), within jacobians, those error terms are bounded by Manin and Zarhin ([38]) linearly in the ambient projective dimension, that is, exponentially in the dimension of the abelian variety. In other words, for our modular curves, the error terms would be exponential in the level $p$. It is therefore much preferable to stick to cubist metrics. This implies we avoid the use of joins as in [8], as those need a sheaf metrization on the whole of the ambient projective spaces, and we instead use plain Segre embeddings.
The extranumerical cost essentially consists of the appearance of modest binomial coeﬃcients, which do notsigniﬁcantly alter the quantitative bounds we eventually obtain.We also need to work with projective models which are “almost” compactiﬁcations of relevantN´eron models of our jacobians. This we do with the help of Moret-Bailly theory as introduced inSection 5.Let us also recall that there still is another approach for such arithmetic B´ezout theoremswhich uses Chow forms ([56], [57]). That is however known to amount to working again withFaltings height relative to the Fubini-Study metrics ([56]-I, [60]) that we said we cannot aﬀord.Finally, regarding generality: it would of course be desirable to have a proof available forarbitrary abelian varieties. Many of the present arguments are however quite particular to ourapplication to J ( p ). We therefore prefer working in our concrete setting from the beginning,instead of considering a somewhat artiﬁcial generality. Proposition 6.1 (Arithmetic B´ezout theorem for J ( p ) ). Let ( J ( p ) , Θ) be deﬁned oversome number ﬁeld K , endowed with the principal and symmetric polarization Θ . Let V and W be two irreducible K -subvarieties of J ( p ) , of dimension d V := dim K V and d W := dim K W respectively, such that d V + d W ≤ g = dim J ( p ) and assume V ∩ W has dimension .If P is an element of ( V ∩ W )( K ) then its N´eron-Tate Θ -height satisﬁes h Θ ( P ) ≤ d V + d W d V + d W + 1)! d V ! d W ! deg Θ ( V ) deg Θ ( W ) h ( d W + 1)h Θ ( W ) + ( d V + 1)h Θ ( V )+ O ( p log p ) i . (98) Remark 6.2

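The general aspect of the above version of arithmetic Bézout might look a bit different from the original ones, as can be found in [8]: this is due to the fact that our definition of the height of some cycle $Y$ (see Section 5, (80)) amounts to dividing its height in the sense of [8] by the product of the degree and absolute dimension of $Y$.

Let us first sketch the strategy of proof, which occupies the rest of this Section 6. We henceforth fix a prime number $p$ and some perfect square integer $N := r^2$. (We shall eventually take $r = 2$.) We write $(\mathcal{J}, \mathcal{M})$ for the Moret-Bailly projective model of $(J_0(p), L(\Theta)^{\otimes N})$ given by Definition 5.4, relative to some given set of global sections $\mathcal{S}$ in $H^0(\mathcal{N}_{J,N}, \mathcal{L}(\Theta)^{\otimes N})$, of size $N^g$, to be described later (Lemma 6.5). That model is defined over some ring of integers $\mathcal{O}_K$. Consider the morphisms:
$$\begin{array}{ccccc}
\mathcal{J} & \stackrel{\Delta}{\longrightarrow} & \mathcal{J} \times \mathcal{J} & & \\
 & & \downarrow P & \searrow \iota & \\
 & & \mathbb{P}^n_{\mathcal{O}_K} \times \mathbb{P}^n_{\mathcal{O}_K} & \stackrel{S}{\longrightarrow} & \mathbb{P}^{n^2+2n}_{\mathcal{O}_K}
\end{array} \qquad (99)$$
where $\Delta$ is the diagonal map, $n = N^g - 1$, $P$ is the product of two $\mathcal{S}$-embeddings $\mathcal{J} \stackrel{\jmath}{\hookrightarrow} \mathbb{P}^n = \mathbb{P}^n_{\mathcal{O}_K}$, and the map $\iota : \mathcal{J} \times \mathcal{J} \to \mathbb{P}^{n^2+2n}$ is the composition of the Segre embedding $S$ with $P$. As sheaves,
$$S^*\big(\mathcal{O}_{\mathbb{P}^{n^2+2n}}(1)\big) = \mathcal{O}_{\mathbb{P}^n}(1) \otimes_{\mathcal{O}_K} \mathcal{O}_{\mathbb{P}^n}(1)$$
and
$$P^*\big(\mathcal{O}_{\mathbb{P}^n}(1) \otimes_{\mathcal{O}_K} \mathcal{O}_{\mathbb{P}^n}(1)\big) = \mathcal{M} \otimes_{\mathcal{O}_K} \mathcal{M} =: \mathcal{M}^{\boxtimes 2}$$
so that $\iota^*(\mathcal{O}_{\mathbb{P}^{n^2+2n}}(1)) = \mathcal{M}^{\boxtimes 2}$ and
$$\Delta^*\iota^*\mathcal{O}_{\mathbb{P}^{n^2+2n}}(1) = \mathcal{M} \otimes_{\mathcal{O}_{\mathcal{J}}} \mathcal{M} = \mathcal{M}^{\otimes 2}. \qquad (100)$$
We naturally endow the sheaves $\mathcal{M}^{\boxtimes 2}$, $\mathcal{M}^{\otimes 2}$, and so on, with the hermitian structures induced by the cubist metric on the various $\mathcal{M}_\sigma$ for $\sigma : K \hookrightarrow \mathbb{C}$, denoted by $\|\cdot\|_{\mathrm{cub}}$.

We then pick two copies $(x_i)_{0 \le i \le n}$ and $(y_j)_{0 \le j \le n}$ of the canonical basis of global sections for each $\mathcal{O}_{\mathbb{P}^n}(1)$ on the two factors of $\mathbb{P}^n_{\mathcal{O}_K} \times \mathbb{P}^n_{\mathcal{O}_K}$ of (99), which give our basis $\mathcal{S}$ by restriction to $\mathcal{J}$. Then we provide the sheaf $\mathcal{O}_{\mathbb{P}^{n^2+2n}}(1)$ on $\mathbb{P}^{n^2+2n}_{\mathcal{O}_K}$ with the basis of global sections $(z_{i,j})_{0 \le i,j \le n}$, each of which is mapped to $x_i \otimes_{\mathcal{O}_K} y_j$ under $S^*$.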
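In coordinates, the Segre map sends a pair of homogeneous coordinate vectors $(x, y)$ to the matrix $z_{i,j} = x_i y_j$, which has $(n+1)^2 = n^2 + 2n + 1$ entries. The minimal sketch below, with hypothetical helper names `segre`, `on_diagonal_subspace` and `proportional`, uses plain integer coordinates to illustrate that the linear conditions $z_{i,j} = z_{j,i}$ hold exactly when $x$ and $y$ define the same projective point. It is a toy model of the set-theoretic statement only and ignores heights and metrics.

```python
from itertools import product


def segre(x, y):
    """Segre map on homogeneous coordinates: z[i][j] = x[i] * y[j]."""
    return [[xi * yj for yj in y] for xi in x]


def on_diagonal_subspace(z):
    """True when z satisfies the linear equations z[i][j] == z[j][i]."""
    size = len(z)
    return all(z[i][j] == z[j][i] for i, j in product(range(size), repeat=2))


def proportional(x, y):
    """True when all 2x2 minors of the pair (x, y) vanish, i.e. when two
    non-zero vectors x and y define the same projective point."""
    return all(x[i] * y[j] == x[j] * y[i]
               for i, j in product(range(len(x)), repeat=2))


if __name__ == "__main__":
    x = [1, 2, 3]
    for y in ([2, 4, 6], [1, 0, 0]):
        z = segre(x, y)
        print(y, on_diagonal_subspace(z), proportional(x, y))
```

The two predicates test literally the same equations $x_i y_j = x_j y_i$, which is why intersecting the Segre image with the symmetric locus recovers the diagonal.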
Define $\mathcal{D}$ as the diagonal linear subspace of $\mathbb{P}^{n^2+2n}_{\mathcal{O}_K}$ defined by the linear equations $z_{i,j} = z_{j,i}$ for all $i$ and $j$.

Let $V, W \subseteq J = \mathcal{J}_K$ be two closed subvarieties over $K$. The support of $V \cap W$ is the same as that of $(\iota \circ \Delta)^{-1}(\mathcal{D} \cap \iota(V \times W))$. To bound from above the height of points in $V \cap W$ it is therefore sufficient to estimate Faltings' height of $\mathcal{D} \cap \iota(V \times W)$, relative to the hermitian line bundle $\mathcal{O}_{\mathbb{P}^{n^2+2n}}(1)|_{\iota(\mathcal{J} \times \mathcal{J})}$ endowed with the cubist metric. As $\mathcal{D}$ is a linear subspace, that height is essentially the same as that of $(V \times W)$, up to an explicit error term which depends on the degree. In turn this error term is a priori linear in the number of (relevant) equations for $\mathcal{D}$, and this is far too large. But if one knows $V \cap W$ has dimension 0, it is enough to choose $(\dim V + \dim W)$ equations (up to perhaps increasing a bit the size of the set whose height we estimate), which makes the error term much smaller.

That is the basic strategy of proof for Proposition 6.1. To make it effective however we must control the "error terms" alluded to in the preceding lines, and those crucially depend on the supremum, on the set $\mathcal{S}$, of values for the cubist metric of global sections defining the projective embedding $\mathcal{J} \hookrightarrow \mathbb{P}^n_{\mathcal{O}_K}$. We shall build that $\mathcal{S}$ using theta functions as follows.

Recall Riemann's theta function on $J_0(p)$ introduced in Section 5.3, see (89). Its usual analytic norm is
$$\|\theta(z)\|_{\mathrm{an}} := \det(\Im(\tau))^{1/4} \exp\big(-\pi\, y\, \Im(\tau)^{-1}\, y\big)\, |\theta(z)| \qquad (101)$$
for $z = x + iy \in \mathbb{C}^g$ (see [48], (3.2.2)). That analytic metric will have to be compared to the cubist one, about which we recall the following basic facts.

Let $A$ be an abelian variety over a number field $K$, which extends to a semiabelian scheme $\mathcal{A}$ over the ring of integers $\mathcal{O}_K$. We endow $\mathcal{A}$ with a symmetric ample invertible sheaf $\mathcal{L}$. Define, for $I \subseteq \{1, 2, 3\}$, the projection $p_I : \mathcal{A}^3 \to \mathcal{A}$, $p_I(x_1, x_2, x_3) = \sum_{i \in I} x_i$.
It is known to follow fromthe theorem of the cube ([47]) that the sheaf D ( L ) := N I ⊆{ , , } p ∗ I L ⊗ ( − | I | is trivial on A .Let us therefore ﬁx an isomorphism φ : O A → D ( L ). For every complex place σ of O K one canendow L σ with some cubist metric k · k σ such that one obtains through φ the trivial metric on O A . Each cubist metric k · k σ is determined only up to multiplication by some constant factor sowe perform the following rigidiﬁcation to remove that ambiguity. If 0 A : Spec( O K ) → A denotesthe zero section, we replace L by L ⊗ O K ( π ∗ ∗A L ⊗− ) on A . Then0 ∗A ( L ) ≃ O K and we demand that the k · k σ be adjusted so that the above sheaf isomorphism is an isometry ateach σ , where O K is endowed with the trivial metric so that k k = 1. This uniquely determinesour cubist metrics k · k σ . Now by construction the hermitian sheaf L on A deﬁnes a height hverifying the expected normalization condition h(0) = 0.Having the same curvature form, the analytic and cubist metrics are known to diﬀer by constantfactors, at each complex place, on the Theta sheaf, as we shall use in the proof of Lemma 6.4 below.Recall we also deﬁned in (91) a “meromorphic theta function s J over Z ”, which can begeneralized: we have [ r ] ∗ L (Θ) |N J, ≃ L (Θ) ⊗ r on N J,r ([54], Proposition 5.1) so we deﬁne a globalsection s M := ([ r ] ∗ t ∗− κ ) s J ∈ H ( N J,r , [ r ] ∗ L (Θ) O K ) . (102)We will shortly show how to control the supremum of k s J k cub , therefore of k s M k cub , on J ( p )( C )(see Lemma 6.4). 
Writing N = r , we shall moreover ﬁx the morphism  M : e N J,N → J ֒ → P n O K of (86) by mapping the canonical coordinates ( x i ) ≤ i ≤ n to sections ( s i ) which will be translatesby r -torsion points of a multiple of the above s M by some constant, as explained in Lemma 6.5and its proof.This will allow us to control as well the supremum of those s i , relative to the cubist metrics, onthe complex base change of our abelian varieties, as is required by the proof of arithmetic B´ezouttheorems.We now start the technical preparation for the proof of Proposition 6.1, for which we need someLemmas on the behavior of heights and degree under Segre maps, comparison between cubist andanalytic metrics on theta functions, and estimates for all. Lemma 6.3

There is an infinite sequence $(P_i)_{i \in \mathbb{N}}$ of points in $X_0(p)(\overline{\mathbb{Q}})$ which are ordinary at all places dividing $p$ and have everywhere integral $j$-invariant. Moreover their normalized theta height satisfies $\mathrm{h}_\Theta(P_i - \infty + \tfrac{1}{2}\omega) = O(p)$, with notations of Theorem 4.1.

Proof

Let ( ζ i ) N be a inﬁnite sequence of roots of unity. One can assume none are congruentto some supersingular j -invariant in characteristic p , modulo any place of Q above p . (Indeed, asthe supersingular j -invariants are quadratic over F p , it is enough for instance to choose for the ζ i some primitive ℓ i -roots of unity, with ℓ i running through the set of primes larger than p − j -invariant equal to ζ i to some point P i in X ( p )( Q ). By construction, this makes a sequenceof points with j -height h j ( P i ) equal to 0. As for their (normalized) theta height one sees fromTheorem 4.1 thath Θ ( P i − ∞ + 12 ω ) = 1[ K ( P i ) : Q ] [ P i , ˜ ω Θ ] µ = − K ( P i ) : Q ] X σ : K ( P i ) ֒ → C g · g µ ( ∞ , σ ( P i )) + O (log p )as the contribution at ﬁnite places of [ P i , ∞ ] is 0. It is therefore enough to bound the | g µ ( ∞ , σ ( P i )) | .Now | j ( P i ) | σ = 1 for all σ : K ( P i ) ֒ → C , so the corresponding elements τ in the usual funda-mental domain in Poincar´e upper half-plane for X ( p ) or X ( p ) are absolutely bounded, and thesame for the absolute values of q τ = e iπτ . (For a useless explicit estimate of this bound, one can35heck Corollary 2.2 of [4] which proposes | q τ | ≥ e − .) From this, running through the proof ofTheorem 11.3.1 of [18], and adapting it to the case of X ( p ) instead of X ( pl ), we deduce that the σ ( P i ) do not belong to the open neighborhood, in the atlas of loc. cit., of the cusp ∞ in X ( p )( C ).Therefore Proposition 10.13 of [42] applies and gives, with notations of that work, | g µ ( ∞ , σ ( P i )) | = | g µ ( ∞ , σ ( P i )) − h ∞ ( σ ( P i )) | = O ( p ) (103)(see Theorem 11.3.1 of [18] and its proof). (cid:3) Lemma 6.4

Let $s_\theta$ be the "theta function over $\mathbb{Z}$", that is, the global section introduced just before (90). One has:

$$\sup_{J_0(p)(\mathbb{C})} (\log \|s_\theta\|_{\mathrm{cub}}) \le O(p \log p). \tag{104}$$

Proof

Writing s θ, C ( z ) = C ϑ · θ ( z + κ ) as in (90), we shall bound from above both | C ϑ | and thecontribution of the diﬀerence between cubist and analytic metrics. Then we will use upper boundsfor the analytic norm of the theta function due to P. Autissier and proven in the Appendix of thepresent paper.We invoke again some key arguments of the proof of Proposition 5.8. For D in J ( p )( C ),written as the linear equivalence class of some divisor P gi =1 ( P i − ∞ ) on X ( p )( C ), we indeed oncemore consider the embedding ι κ − D : (cid:26) X ( p ) ֒ → J ( p ) P cl( P − ∞ + κ − D )as in Proposition 5.8. For such a D whose P i are assumed to belong to X ( p )( Q ), we recall (92)that h Θ ( ι κ − D ( P )) = 1[ K ( P, D ) : Q ] [ P, X i P i + Φ D + c D X ∞ ] µ . If the P i all have everywhere ordinary reduction, as will be the case in (105) below, the verticaldivisor Φ D will contribute at most O (log p ) to the height of points (see Remark 3.4).Note that we can fulﬁll condition (95) considering only points P i of same type as occurringin Lemma 6.3 (which, in particular, are ordinary and have integral j -invariants), because those P i make a Zariski-dense subset of X ( p )( Q ) (and the onto-ness of the map X ( p ) ( g ) ι g ∞ ։ J ( p )).We therefore conclude as in the proof of Proposition 5.8 that div( ι ∗ κ − D ( s θ )) has indeed to be( P i P i + Φ D ) on X ( p ) smooth O K . On the other hand, for some of those choices of ( P i ) ≤ i ≤ g , our Z -theta function s θ does notvanish at ι κ − D ( ∞ )( C ), so h Θ ( ι κ − D ( ∞ )) can also be computed as the Arakelov degree:h Θ ( ι κ − D ( ∞ )) = d deg( ∞ ∗ ι ∗ κ − D ( L (Θ))) . Integrality of the P i shows the intersection numbers [ ∞ , P i ] have trivial non-archimedean contri-bution. 
The only ﬁnite contribution to our Arakelov degree therefore comes from intersection withvertical components, that is, if K D is a suﬃciently large ﬁeld, over which D is deﬁned, then for aset of elements ( z σ ) σ : K D ֒ → Q which lift σ ( − D ) in the complex tangent space of J ( p ) to 0 one has:h Θ ( ι κ − D ( ∞ )) = d deg(0 ∗J ( p ) ( t ∗ κ − D L (Θ))) = d deg(0 ∗J ( p ) ( t ∗− D L (Θ κ )))= − K D : Q ] X K D σ ֒ → C log k s θ ( z σ ) k cub + O (log p ) Although we shall not use this, one can check that h Θ ( ι κ − D ( ∞ )) = k − ( P i P i − ∞ ) + ω k = O ( p ) byLemma 6.3 and (79). s θ, C ( z ) = C ϑ · θ ( z + κ ):log | C ϑ | = − h Θ ( ι κ − D ( ∞ )) − K D ( κ ) : Q ] X K D ( κ ) σ ֒ → C log k θ (( z + κ ) σ ) k cub + O (log p ) . (105)Following [20], paragraph 8, we now write J ( p )( C ) = C g / ( Z g + τ Z g ) for τ in Siegel’s fundamentaldomain, write z ∈ C g as z = τ · p + q for p, q ∈ R g , and introduce the function F : C g → C deﬁnedas F ( z ) = det(2 ℑ ( z )) / X n ∈ Z g exp( iπ t ( n + p ) τ ( n + p ) + 2 iπ t nq ) . One then has | F ( z ) | = 2 g/ k θ ( z ) k an . Indeed there is a constant A ∈ R ∗ + such that | F ( z ) | = A · k θ ( z ) k an (see the end of proof of Lemma 8.3 of [20]), R J ( p )( C ) | F | dν = 1 (where dν is theprobability Haar measure on J ( p )( C ); see [20], Lemma 8.2 (1)), and R J ( p )( C ) k θ ( z ) k dν = 2 − g/ (see e.g [48], (3.2.1) and (3.2.2)). Therefore Lemme 8.3 of [20] gives, using deﬁnitions of loc. cit.,Th´eor`eme 8.1, − K D ( κ ) : Q ] X K D ( κ ) σ ֒ → C (cid:16) log k θ (( z + κ ) σ ) k an + g (cid:17) ≤ h Θ ( ι κ − D ( ∞ )) + 12 h F ( J ( p )) + g π. Remember Faltings height of J ( p ) is known to satisfy h F ( J ( p )) = O ( p log p ) by [62], Th´eor`eme 1.2.(We remark that Ullmo’s normalization of Faltings’ height diﬀers from that of Gaudron-R´emond,but the diﬀerence term is linear in g = O ( p ) so the bound O ( p log p ) remains valid for the aboveh F ( J ( p ))). 
Writing k · k cub = e ϕ k · k an we therefore see that (105) implieslog | C ϑ | + ϕ ≤

12 h F ( J ( p )) + O ( p ) ≤ O ( p log p ) . Given this upper bound for e ϕ | C ϑ | we can now go the other way round to derive an upperbound for k s θ k cub = C ϑ · k θ ( z + κ ) k cub , by using estimates for analytic theta functions. For anyprincipally polarized complex abelian variety whose complex invariant τ is chosen within Siegel’sfundamental domain F g , Autissier’s result in the Appendix (Proposition 8.1 below) indeed gives,with notations as in (101), that:1det( ℑ ( τ )) / k θ ( z ) k an = exp( − πy ℑ ( τ ) − y ) | θ ( z ) | ≤ g g/ . (106)We refer to the Appendix for a bound which is even slightly sharper. As for the factor det( ℑ ( τ )) / , Lemma 11.2.2 of [18] gives the general result:det( ℑ ( z )) / ≤ (2 g )! V g g V g Y g +1 ≤ i ≤ g λ i where for any k we write V k for the volume of the unit ball in R k endowed with its standardEuclidean structure, and the λ r are the successive minima, relative to the Riemann form, of thelattice Λ = Z g + τ · Z g . To bound the λ i we need to invoke an avatar of loc. cit., Lemma 11.2.3.But the very same proof shows that for any integer N , the group Γ ( N ) has a set of generatorshaving entries of absolute value less or equal to the very same bound N /

4. (That term couldbe improved, but this would have an invisible impact on the ﬁnal bounds so we here contentourselves with it.) We can therefore rewrite the proof of Lemma 11.2.4 verbatim. This gives thatΛ is generated by elements having naive hermitian norm k x k E less or equal to gp . Finally, inour case the Gram matrix is diagonal (no 2 × Works of Igusa and Edixhoven-de Jong ([18], pp. 231-232) give ℑ ( τ )) / k θ ( z ) k an ≤ g +5 g . k · k P denotes the hermitian product on C g induced bythe polarization, k · k P ≤ e π π k · k E . This allows to conclude as in p. 228 of [18]:( g Y i = g +1 λ i ) ≤ ( e π π gp ) g so that log(det( ℑ ( τ ))) ≤ O ( p log p )and combining with (106), log k θ ( z ) k an ≤ O ( p log p ) . Putting everything together ﬁnally yields:sup z ∈ J ( p )( C ) log k s θ, C ( z ) k cub = sup z ∈ J ( p )( C ) log k C ϑ · θ ( z + κ ) k cub = (log | C ϑ | + ϕ ) + sup z ∈ J ( p )( C ) log k θ ( z + κ ) k an ≤ O ( p log p ) . (cid:3) Lemma 6.5

Assume the same hypothesis and notations as in Deﬁnition 5.4. After possibly mak-ing some ﬁnite base extension one can pick a set S in H ( N J, , L (Θ) ⊗ ) of g global sections ( s i ) ≤ i ≤ g , which span L (Θ) ⊗ on N J, [1 / p ] , and verify sup J ( p ) (log k s i k cub ) ≤ O ( p log p ) . (107) Proof

We ﬁx N = r = 4 for the construction of a good model as in Deﬁnition 5.4. Up tomaking a base extension, we can assume L (Θ) ⊗ and [2] ∗ L (Θ) have cubist extensions L (Θ) ⊗ and [2] ∗ L (Θ)) on N J, , respectively. As Θ is symmetric one knows there is an isomorphism[2] ∗ L (Θ) → L (Θ) ⊗ which actually is an isometry ([54], Proposition 5.1), by which we identifythose two objects from now on. On the other hand, every element x of J ( p )[4]( Q ) = J ( p )[4]( K )deﬁnes a section ˜ x in N J, (Spec( O K )). Letting t ˜ x denote the translation by ˜ x on N J, we have t ∗ ˜ x L (Θ) ⊗ ≃ L (Θ) ⊗ . (108)(This is indeed true over C by Lemma 2.4.7.c) of [6], hence over K , then over Spec( O K ) byuniqueness of cubist extensions.) The interpretation as N´eron-Tate heights shows that as L (Θ) isendowed with its cubist metric, this isomorphism even is an isometry. Recall the section s M deﬁnedin (102), belonging to H ( N J, , [2] ∗ L (Θ)). Up to making an extension to some larger base ring ofinteger, we may assume s M extends as a meromorphic section on N J, and Proposition 5.8, whichgives estimates on the poles of s J at bad components, implies that s M is actually holomorphic(has no pole on the new components) after multiplication by some power C of p with log C = O ( p log p ). We can therefore deﬁne a set ( s i ) ≤ i ≤ g in H ( N J, , [2] ∗ L (Θ)) made of 4 g elements ofshape s i := t ∗ ˜ x i C · s M (109)for ˜ x i running through a set of representatives, in J ( p )[4]( K ), of J ( p )[4] /J ( p )[2]. Note that onecan explicitly lift s M on the complex tangent space at 0 of J ( p )( C ) as s M , C ( z ) = C ϑ · θ (2 · z ) (110)where C ϑ is deﬁned in the proof of Lemma 6.4 and the s i, C are constant multiple of the basisdenoted by h ~a,~b ( ~z ) in [51], Proposition II.1.3.iii) on p. 124 . From here, Lemma 6.4 and Proposi-tion 5.8 give (107). 
where it seems by the way that the expression “h ~a,~b ( ~z ) = ϑ [ ~a/k~b/k ]( ℓ · ~z, Ω)” should read “ · · · = ϑ [ ~a/k~b/k ]( k · ~z, Ω)”(notations of loc. cit.).

By the theory of theta functions ([54], Proposition 2.5 and its proof, [49] and [47], Chapitre VI) the $s_i$ make a generic basis of global sections, which span $\mathcal{L}(\Theta)^{\otimes 4}$ over $\mathrm{Spec}(\mathcal{O}_K[1/2p])$. $\square$

Lemma 6.6

Let $V$ and $W$ be two closed $K$-subvarieties, with dimension $d_V$ and $d_W$ respectively, of a smooth projective variety $A$ over a number field $K$, endowed with an ample sheaf $M$. Assume the flat projective scheme $(\mathcal{A}, \mathcal{M})$ over $\mathrm{Spec}(\mathcal{O}_K)$, with $\mathcal{M}$ a hermitian sheaf on $\mathcal{A}$, is a model for $(A, M)$. Let $\mathcal{V}$ and $\mathcal{W}$ be the Zariski closure in $\mathcal{A}$ of $V$ and $W$ respectively. Then, with definitions as in [8], Section 3.1,

$$(c_1(M^{\boxtimes 2})^{d_V + d_W} \mid (V \times W)) = \binom{d_V + d_W}{d_V} (c_1(M)^{d_V} \mid V)\,(c_1(M)^{d_W} \mid W) \tag{111}$$

and

$$(\hat{c}_1(\mathcal{M}^{\boxtimes 2})^{d_V + d_W + 1} \mid \mathcal{V} \times \mathcal{W}) = \binom{d_V + d_W + 1}{d_V} (c_1(M)^{d_V} \mid V)\,(\hat{c}_1(\mathcal{M})^{d_W + 1} \mid \mathcal{W}) + \binom{d_V + d_W + 1}{d_W} (\hat{c}_1(\mathcal{M})^{d_V + 1} \mid \mathcal{V})\,(c_1(M)^{d_W} \mid W). \tag{112}$$

Remark 6.7

Equation (111) can be read as

$$\deg_{M^{\boxtimes 2}}(V \times W) = \binom{d_V + d_W}{d_V} \deg_M(V) \deg_M(W).$$

Equation (112) in turn fits with Zhang's interpretation (83) in terms of essential minima; compare the proof of Proposition 6.1 below.

Proof (of Lemma 6.6).

For (111), one can realize it is elementary, or refer to Lemme 2.2 of [58], or proceed as follows. Using (2.3.18), (2.3.19), and Proposition 3.2.1 (iii) of [8], and noticing

$$c_1(M^{\boxtimes 2}) = c_1(M) \times 1 + 1 \times c_1(M)$$

(and the same with $\hat{c}_1(\mathcal{M})$ and $\hat{c}_1(\mathcal{M}^{\boxtimes 2})$ instead) one computes

$$
\begin{aligned}
(c_1(M^{\boxtimes 2})^{d_V + d_W} \mid (V \times W))
&= \Big( \sum_{k=0}^{d_V + d_W} \binom{d_V + d_W}{k} c_1(M)^k \times c_1(M)^{d_V + d_W - k} \;\Big|\; V \times W \Big) \\
&= \sum_{k=0}^{d_V + d_W} \binom{d_V + d_W}{k} (c_1(M)^k \times c_1(M)^{d_V + d_W - k} \mid V \times W) \\
&= \sum_{k=0}^{d_V + d_W} \binom{d_V + d_W}{k} (c_1(M)^k \mid V)\,(c_1(M)^{d_V + d_W - k} \mid W) \\
&= \binom{d_V + d_W}{d_V} (c_1(M)^{d_V} \mid V)\,(c_1(M)^{d_W} \mid W)
\end{aligned}
$$

where the last equality comes from the fact that the only nonzero term in the line before occurs for $k = d_V$.

An analogous computation, using [8], (2.3.19), can be used for the arithmetic degree:

$$
\begin{aligned}
(\hat{c}_1(\mathcal{M}^{\boxtimes 2})^{d_V + d_W + 1} \mid \mathcal{V} \times \mathcal{W})
&= \sum_{k=0}^{d_V + d_W + 1} \binom{d_V + d_W + 1}{k} (\hat{c}_1(\mathcal{M})^k \times \hat{c}_1(\mathcal{M})^{d_V + d_W + 1 - k} \mid \mathcal{V} \times \mathcal{W}) \\
&= \binom{d_V + d_W + 1}{d_V} (c_1(M)^{d_V} \mid V)\,(\hat{c}_1(\mathcal{M})^{d_W + 1} \mid \mathcal{W}) + \binom{d_V + d_W + 1}{d_W} (\hat{c}_1(\mathcal{M})^{d_V + 1} \mid \mathcal{V})\,(c_1(M)^{d_W} \mid W).
\end{aligned}
$$

$\square$

We now fix the good model $(\mathcal{J}, \mathcal{M})$ for $(J_0(p), \Theta)$ (see (99)) as the one built with the set $S$ of $N^g = 4^g$ sections provided by Lemma 6.5. Before settling the proof of the arithmetic Bézout theorem, we need a last lemma on comparison between the projective height on $(\mathcal{J}, \mathcal{M})$ and its normalized Néron-Tate avatar.

Lemma 6.8

Up to translation by torsion points, the projective height $\mathrm{h}_{\mathcal{M}}$ on points in $J_0(p)(\overline{\mathbb{Q}})$ (associated with the good model $(\mathcal{J}, \mathcal{M})$) differs from the Néron-Tate theta-height Θ by an error term of shape $O(p \log p)$.

Proof

Lemma 6.5 implies that the elements of $S$ extend as holomorphic sections to any component of the Néron model $\mathcal{N}$ of $J_0(p)$ over $\mathbb{Z}$ (see (109)). As remarked in the proof of Lemma 6.5, Mumford's algebraic theory of theta-functions implies that the sections in $S$ do define a projective embedding of $\mathcal{N}$ over $\mathbb{Z}[1/2p]$: the only fibers of $\mathcal{N}$ over $\mathbb{Z}$ where base points for $S$ can show up are above 2 and $p$. If one seeks to approximate the Néron-Tate height of a given point $P$ in $J_0(p)(\overline{\mathbb{Q}})$ by the projective height of our good model $(\mathcal{J}, \mathcal{M})$, one needs the section of the Néron model $\mathcal{N}$ defined by $P$ to avoid those base points, or at least to control their length.

Given $P$ in $J_0(p)(\overline{\mathbb{Q}})$, we claim one can translate $P$ by some torsion point in $J_0(p)(\overline{\mathbb{Q}})$ so that the translated new point $P + t$ does avoid base points in characteristic 2. Indeed, choose a Galois extension $F/\mathbb{Q}$ such that the base locus is defined over $\mathrm{Spec}(\mathcal{O}_F \otimes \mathbb{F}_2)$. Summing up, as divisors, all the Galois conjugates of that base locus in each fiber of characteristic 2, one obtains a constant cycle $C_\kappa$, in each fiber at $\kappa$, which is defined over $\mathbb{F}_2$. (In our case one actually could have taken $F = \mathbb{Q}$.) Density of torsion points then shows that one can replace our point $P$ by $P + t$, for some torsion point $t$, such that $P + t$ does not belong to $C_\kappa$ for some $\kappa$, then for all $\kappa$ of characteristic 2 because $C_\kappa$ is constant. This proves our claim. Now in characteristic $p$, we know from Proposition 5.8 again that possible base points have length at most $O(p)$, which gives an estimate of size $O(p \log p)$ for the difference error term between projective height on $\mathcal{J}$ and Néron-Tate height ([54], Proposition 4.1). $\square$

Proof of Proposition 6.1. Before proceeding we will allow ourselves, for this proof only, and in the hope of not weighing down the computations too much, to work with heights defined as in [8], Section 3.1.
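Namely, for $\mathcal{Y}$ a cycle of dimension $(d + 1)$ in a regular arithmetic variety endowed with a hermitian sheaf $\mathcal{F}$, we multiply our definition (80) of its height by degree and absolute dimension and we set:

$$\mathrm{h}'_{\mathcal{F}}(\mathcal{Y}) = \frac{(\hat{c}_1(\mathcal{F})^{d+1} \mid \mathcal{Y})}{[K : \mathbb{Q}]}.$$

Note that $\mathrm{h}$ and $\mathrm{h}'$ coincide on $K$-rational points, in which case we might use either notation.

Construction (99) gives a $\mathbb{Q}$-embedding $V \times W \overset{\iota}{\hookrightarrow} \mathbb{P}^{n^2+2n}$ via a Segre map. We set

$$s_{i,j} := \iota^*(z_{i,j} - z_{j,i})$$

for all $(i, j)$, and denote by $\mathcal{O}_N$ the ambient line bundle $\iota^*(\mathcal{O}_{\mathbb{P}^{n^2+2n}}(1)) = \mathcal{M}^{\boxtimes 2}$ as before (100). (Recall we will eventually specialize to $N = 4$.) Set also $O_N := \mathcal{O}_N \otimes \mathbb{Q}$. We intersect $\iota(V \times W)$ with one of the $\mathrm{div}(z_{i_1,j_1} - z_{j_1,i_1})_{\mathbb{Q}}$ such that the two cycles meet properly: define

$$J_1 = \mathrm{div}(s_{i_1,j_1,\mathbb{Q}}) \cap (V \times W)$$

in the generic fiber $(J_0(p) \times J_0(p))_{\mathbb{Q}}$. As $\mathrm{div}(z_{i_1,j_1} - z_{j_1,i_1})$ is a projective hyperplane we have by definition

$$\deg_{O_N}(J_1) = \deg_{O_N}(V \times W).$$

For the same linearity reason, a similar statement is true for heights. Indeed, let $\mathcal{V}$ and $\mathcal{W}$ denote the schematic closure in $\mathcal{J}$ of $V$ and $W$ respectively, and $\mathcal{J}_1$ the schematic closure of $J_1$ in $\mathcal{J} \times \mathcal{J}$, which satisfies

$$\mathrm{h}'_{\mathcal{O}_N}(\mathcal{J}_1) \le \mathrm{h}'_{\mathcal{O}_N}(\mathrm{div}(s_{i_1,j_1}) \cap (\mathcal{V} \times \mathcal{W}))$$

(as there might be vertical components in the intersection of the right-hand side which do not intervene in the left, and contribute positively to the height).

Proposition 3.2.1 (iv) of [8] gives, with notations of loc. cit., that:

$$\mathrm{h}'_{\mathcal{O}_N}(\mathrm{div}(s_{i_1,j_1}) \cap (\mathcal{V} \times \mathcal{W})) = \mathrm{h}'_{\mathcal{O}_N}(\mathcal{V} \times \mathcal{W}) + \frac{1}{[K : \mathbb{Q}]} \sum_{\sigma : K \hookrightarrow \mathbb{C}} \int_{(V \times W)_\sigma(\mathbb{C})} \log \|s_{i_1,j_1,\mathbb{C}}\| \; c_1(\mathcal{O}_N)^{d_V + d_W} \tag{113}$$

where $\|\cdot\| = \|\cdot\|_{\mathrm{cub}}$ shall denote the cubist metric, or the metric induced by the cubist metric on products or powers of relevant sheaves. To estimate the last integral we note that at any point of $(V \times W)_\sigma(\mathbb{C})$ and for any $(i, j)$,

$$
\begin{aligned}
\|s_{i,j}\| = \|z_{i,j} - z_{j,i}\|_{\mathcal{M}^{\boxtimes 2}}
&\le \|z_{i,j}\|_{\mathcal{M}^{\boxtimes 2}} + \|z_{j,i}\|_{\mathcal{M}^{\boxtimes 2}} \\
&\le \|x_i\|_{\mathcal{M}} \|y_j\|_{\mathcal{M}} + \|x_j\|_{\mathcal{M}} \|y_i\|_{\mathcal{M}} \\
&\le 2 \big( \sup_i \|x_i\|_{\mathcal{M}} \big)^2 \\
&\le \exp\big( 2 \log(\sup \|s_i\|_{\mathrm{cub}}) + \log 2 \big)
\end{aligned}
$$

with notations of Lemma 6.5.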
Setting $M_{\mathcal{J},\mathcal{M}} = \log(\sup \|s_i\|_{\mathrm{cub}})$ we obtain

$$\mathrm{h}'_{\mathcal{O}_N}(\mathcal{J}_1) \le \mathrm{h}'_{\mathcal{O}_N}(\mathcal{V} \times \mathcal{W}) + (2 M_{\mathcal{J},\mathcal{M}} + \log 2) \deg_{O_N}(V \times W).$$

Call $I_1$ one of the reduced irreducible components of $J_1$ containing the point $\iota(\Delta(P))$ of $V \cap W$ considered in the statement of Proposition 6.1, and let $\mathcal{I}_1$ denote its Zariski closure in $\mathcal{J}$. It has $\mathcal{O}_N$-height (and degree) less than or equal to those of $\mathcal{J}_1$, so that again

$$\mathrm{h}'_{\mathcal{O}_N}(\mathcal{I}_1) \le \mathrm{h}'_{\mathcal{O}_N}(\mathcal{V} \times \mathcal{W}) + (2 M_{\mathcal{J},\mathcal{M}} + \log 2) \deg_{O_N}(V \times W)$$

and we can iterate the process with $I_1$ in place of $V \times W$: we obtain some $J_2$, $\mathcal{J}_2$, $I_2$, $\mathcal{I}_2$ such that

$$\mathrm{h}'_{\mathcal{O}_N}(\mathcal{I}_2) \le \mathrm{h}'_{\mathcal{O}_N}(\mathcal{I}_1) + (2 M_{\mathcal{J},\mathcal{M}} + \log 2) \deg_{O_N}(I_1) \le \mathrm{h}'_{\mathcal{O}_N}(\mathcal{V} \times \mathcal{W}) + 2 (2 M_{\mathcal{J},\mathcal{M}} + \log 2) \deg_{O_N}(V \times W).$$

(The only obstruction to this step is if all the $s_{k,l}$ vanish on $I_1$, which implies it is contained in the diagonal of $J_0(p) \times J_0(p)$, so that $I_1 = \iota(\Delta(P))$ by construction and that means we are already done.) Proceeding, one builds a sequence $(\mathcal{I}_k)$ of integral closed subschemes of $\mathcal{J} \times \mathcal{J}$, with decreasing dimension, such that the last step gives

$$\mathrm{h}'_{\mathcal{O}_N}(\mathcal{I}_{d_V + d_W}) \le \mathrm{h}'_{\mathcal{O}_N}(\mathcal{V} \times \mathcal{W}) + (d_V + d_W)(2 M_{\mathcal{J},\mathcal{M}} + \log 2) \deg_{O_N}(V \times W).$$

Now

$$\mathrm{h}'_{\mathcal{O}_N}(\mathcal{I}_{d_V + d_W}) \ge \mathrm{h}'_{\mathcal{O}_N}(\Delta(P, P)) = \mathrm{h}'_{\mathcal{M}^{\otimes 2}}(P) = \mathrm{h}_{L^{\otimes N}}(P) = 2N\, \mathrm{h}_\Theta(P) + O(p \log p),$$

for $\mathrm{h}_\Theta(P)$ the Néron-Tate normalized theta height. Indeed the statement of the present Proposition 6.1 is invariant by translation of every object by some fixed torsion point, so that one can apply Lemma 6.8.

Using Lemma 6.6 and Corollary 5.6 and writing $\mathrm{h}'_\Theta(Y) = (\dim(Y) + 1) \deg_\Theta(Y)\, \mathrm{h}_\Theta(Y)$ we therefore obtain

$$
\begin{aligned}
2N\, \mathrm{h}_\Theta(P) \le{}& N^{d_V + d_W + 1} \left[ (d_W + 1) \binom{d_V + d_W + 1}{d_V} \mathrm{h}'_\Theta(W) \deg_\Theta(V) + (d_V + 1) \binom{d_V + d_W + 1}{d_W} \mathrm{h}'_\Theta(V) \deg_\Theta(W) \right] \\
&+ N^{d_V + d_W} (d_V + d_W)(2 M_{\mathcal{J},\mathcal{M}} + \log 2) \binom{d_V + d_W}{d_V} \deg_\Theta(V) \deg_\Theta(W) + O(p \log p).
\end{aligned}
$$
From here, fixing $N = 4$, the bound $M_{\mathcal{J},\mathcal{M}} \le O(p \log p)$ (Lemma 6.5) concludes the proof, after expressing quantities $\mathrm{h}'_\Theta$ back into $\mathrm{h}_\Theta$. $\square$

That arithmetic Bézout theorem will be our principal tool in the sequel.
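The bookkeeping behind the strategy of this section can be illustrated by a small numerical sketch. It is not part of the proof: the function names are hypothetical, and the toy values $N = 4$, $g = 3$, $d_V = d_W = 2$ are assumptions chosen only for illustration. The sketch counts the coordinates of the Segre target $\mathbb{P}^{n^2+2n}$, counts the independent diagonal relations $z_{i,j} = z_{j,i}$, and recovers the binomial factor of equation (111) by expanding $(x + y)^{d_V + d_W}$ and keeping the single surviving term.

```python
from math import comb


def segre_target_dim(n: int) -> int:
    """Dimension of the Segre target of P^n x P^n: (n+1)^2 - 1 = n^2 + 2n."""
    return (n + 1) ** 2 - 1


def diagonal_equation_count(n: int) -> int:
    """Number of independent relations z_{i,j} = z_{j,i} with 0 <= i < j <= n."""
    return n * (n + 1) // 2


def box_power_coefficient(d_v: int, d_w: int) -> int:
    """Coefficient of x^d_v * y^d_w in (x + y)^(d_v + d_w), by repeated multiplication.

    This mirrors the proof of (111): only the term k = d_v contributes.
    """
    coeffs = [1]  # coeffs[k] is the coefficient of x^k y^(deg - k)
    for _ in range(d_v + d_w):
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += c      # multiply by y
            new[k + 1] += c  # multiply by x
        coeffs = new
    return coeffs[d_v]


def product_degree(d_v: int, d_w: int, deg_v: int, deg_w: int) -> int:
    """Degree of V x W for the box-product sheaf, following equation (111)."""
    return comb(d_v + d_w, d_v) * deg_v * deg_w


if __name__ == "__main__":
    N, g = 4, 3          # toy values, assumed for illustration only
    n = N ** g - 1       # n = N^g - 1 as in (99)
    print(segre_target_dim(n), diagonal_equation_count(n))
    print(box_power_coefficient(2, 2), product_degree(2, 2, 1, 1))
```

With these toy values there are far more diagonal relations than the $d_V + d_W = 4$ equations actually needed when $V \cap W$ is zero-dimensional, which is the reason the error term can be kept small.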

7 Height bounds for quadratic points on $X_0(p)$

Proposition 7.1

Let $\iota : X \hookrightarrow J$ be some Albanese map from a curve (of positive genus) over some field $K$ to its jacobian $J$. Let $\pi : J \to A$ be some quotient of $J$, with $\dim(A) > 1$, and $X'$ be the normalization of the image $\pi \circ \iota(X)$ of $X$ in $A$. Then the map $\pi' : X \to X'$ induced by $\pi \circ \iota$ verifies

$$\deg(\pi') \le \frac{\dim(J) - 1}{\dim(A) - 1}.$$

Proof

The map $\pi \circ \iota$ induces an inclusion of function fields which defines the map $\pi' : X \to X'$. If $J'$ is the jacobian of $X'$, Albanese functoriality says that $\pi$ factorizes through surjective morphisms $J \to J' \to A$. The Hurwitz formula reads:

$$\deg(\pi') = \frac{\dim(J) - 1 - \frac{1}{2} \deg R}{\dim(J') - 1}$$

with $R$ the ramification divisor of $\pi'$, whence the result, since $\dim(J') \ge \dim(A)$. $\square$

Lemma 7.2

For all large enough prime $p$, let $X := X_0(p)$ and $\pi_e : J_0(p) \twoheadrightarrow J_e$ be the projection. Let

$$\iota_{P_0} : \begin{cases} X_0(p) \hookrightarrow J_0(p) \\ P \mapsto \mathrm{cl}(P - P_0) \end{cases}$$

for some $P_0$ in $X_0(p)(\overline{\mathbb{Q}})$ such that $w_p(P_0) = P_0$ (there are roughly $\sqrt{p}$ such points, see Proposition 3.1 of [25]), and set $\varphi_e := \pi_e \circ \iota_{P_0}$. Then:

• if $a \in J_e(\mathbb{Q})$ is some (necessarily torsion) point, the equality $\varphi_e(X_0(p)) = a - \varphi_e(X_0(p))$ implies
$$\varphi_e(X_0(p)) = a + \varphi_e(X_0(p)) \tag{114}$$
and $a = 0$;

• if $d$ is the degree of the map $X_0(p) \to \widetilde{\varphi_e(X_0(p))}$ to the normalization of $\varphi_e(X_0(p))$, then $d$ is either 1, 3 or 4;

• assuming moreover Brumer's conjecture (see (21) and (22)), equality (114) implies $d = 1$ for large enough $p$.

Proof

Notice first that, by our choice of $P_0$ (whence $\iota_{P_0}$), and because $J_e$ belongs to the $w_p$-minus part of $J_0(p)$, one has:

$$\varphi_e(w_p(P)) = w_p(\varphi_e(P)) = -\varphi_e(P)$$

for all $P \in X_0(p)(\mathbb{C})$, whence equality (114). So let $n$ be the order of $a$, which also is that of the automorphism "translation by $a$ restricted to $\varphi_e(X_0(p))$". We remark that the degree $d$ cannot be equal to 2, as otherwise the extension of fraction fields $K(X_0(p))/K(\varphi_e(X_0(p)))$ would be Galois and $X_0(p)$ would possess an involution different from $w_p$, which it does not by Ogg's theorem ([53], or [33]). If $d = 1$, the same reason that $\mathrm{Aut}(X_0(p)) = \langle w_p \rangle$ implies that $n = 1$. Let now $X'$ be the normalization of the quotient of $\varphi_e(X_0(p))$ by the automorphism $P \mapsto P + a$, that is, the image of $\varphi_e(X_0(p))$ by the quotient morphism $J_e \twoheadrightarrow J_e/\langle a \rangle$. Let $\pi$ be the composed map $J_0(p) \xrightarrow{\pi_e} J_e \to J_e/\langle a \rangle$. The degree of $X_0(p) \to X'$ is $d \cdot n$ and Proposition 7.1 together with the left part of inequalities (23) implies:

$$d \cdot n \le \frac{g - 1}{(\frac{1}{4} - o(1))\, g - 1} \le 4 + o(1)$$

for large enough $p$. This shows that if $d = 3$ or 4 one still has $a = 0$, whence the Lemma's first two statements. Assuming (22) we have $d \cdot n < 3$, so that $d = 1$ and $a = 0$ by previous arguments. $\square$

Remark 7.3 Replace, in Lemma 7.2, the map $X_0(p) \to J_e$ by $X_0(p) \xrightarrow{\varphi} J_0(p)^-$ (by which the former factorizes, by the way). The above proof shows that the map $X_0(p) \to \varphi(X_0(p))$ is of generic degree 1 (independently of any conjecture), but of course it need not be injective on points: a finite number of points can be mapped together to singular points on $\varphi(X_0(p))$. In our case one checks those are among the Heegner points $P$ such that $P = w_p(P)$ (for which we again refer to Proposition 3.1 of [25]). Indeed, the endomorphism of $J_0(p)$ defined by multiplication by $(1 - w_p)$ factorizes through $\varphi$, and $\cdot(1 - w_p)$ is the map considered in (4) and what follows, inducing multiplication by 2 on tangent spaces. Therefore, if $P$ maps to a multiple point of $\varphi(X_0(p))$, it also maps to a multiple point of $(1 - w_p) \circ \iota(X_0(p))$. Now assuming $X_0(p)$ has gonality larger than 2 (which is true as soon as $p > 71$, [52], Theorem 2), the equality $\mathrm{cl}((1 - w_p)P) = \mathrm{cl}((1 - w_p)P')$ in $J_0(p)$, for some $P'$ on $X_0(p)$ different from $P$, implies $P = w_p P$ and $P' = w_p P'$. That is, $P$ and $P'$ are Heegner points.

Lemma 7.4

Suppose $P$ belongs to $X_0(p^2)(K)$ for some quadratic number field $K$, and $P$ is not a complex multiplication point. Then for one of the two natural degeneracy morphisms $\pi$ from $X_0(p^2)$ to $X_0(p)$, the point $Q := \pi(P)$ in $X_0(p)(K)$ does not define a $\mathbb{Q}$-valued point of the quotient curve $X_0^+(p) := X_0(p)/w_p$.

Proof

Using the modular interpretation, we write $P = (E, C_{p^2})$ for $E$ an elliptic curve over $K$ and $C_{p^2}$ a cyclic $K$-isogeny of degree $p^2$, from which we obtain the two points $Q_1 := (E, p \cdot C_{p^2})$ and $Q_2 := (E/p \cdot C_{p^2}, C_{p^2} \bmod p \cdot C_{p^2})$ in $X_0(p)(K)$. Assume both $Q_1$ and $Q_2$ do define elements of $X_0^+(p)(\mathbb{Q})$. If $\sigma$ denotes a generator of $\mathrm{Gal}(K/\mathbb{Q})$ we then have

$$w_p(Q_1) = (E/p \cdot C_{p^2},\; E[p] \bmod p \cdot C_{p^2}) \simeq {}^\sigma(Q_1)$$

and

$$w_p(Q_2) = (E/C_{p^2},\; E[p] + C_{p^2} \bmod C_{p^2}) \simeq {}^\sigma(Q_2).$$

Therefore $E \simeq {}^\sigma(E/p \cdot C_{p^2}) \simeq E/C_{p^2}$, which means $E$ has complex multiplication. $\square$

We can now conclude with the main result of this paper.

Theorem 7.5

There is an integer $C$ such that the following holds. If $p$ is a prime number such that (22), the weak form of Brumer's conjecture, holds, and $P$ is a quadratic point of $X_0(p)$ (that is: $P$ is an element of $X_0(p)(K)$ for some quadratic number field $K$) which does not come from $X_0^+(p)(\mathbb{Q})$, then its $j$-height satisfies

$$\mathrm{h}_j(P) < C \cdot p^5 \log p. \tag{115}$$

If $P$ is a quadratic point of $X_0(p^2)$ then the same conclusion holds without further assumption apart from (22).

Proof

In the case P is a quadratic point of X ( p ), by Lemma 7.4 one can deduce from P apoint P ′ in X ( p )( K ) which does not induce an element of X +0 ( p )( Q ), and whose j -height, say, isequal to h j ( P ) + O (log p ) for an explicit function O (log p ) (see e.g. [55], inequality (51) on p. 240and [5], Proposition 4.4 (i)). Replace P by P ′ if necessary. By Theorem 4.6 it is now suﬃcient toprove that h Θ ( P − ∞ ) = O ( p log p ).Keep the notation of Lemma 7.2. By construction, the point: a := ϕ e ( P ) + ϕ e ( σ P ) = ϕ e ( P ) − ϕ e ( w p ( σ P )) = ϕ e ( P − w p ( σ P ))is torsion. First assume a = 0. Set X (2) , − := (cid:8) ι ∞ ( x ) − ι ∞ ( y ) , ( x, y ) ∈ X ( p ) (cid:9) as in Proposi-tion 5.3. Recall from Section 2 that ˜ I J ⊥ e ,N ⊥ e : J ⊥ e → ˜ J ⊥ e is the map deﬁned as in (3), that ι ˜ J ⊥ e ,N ⊥ e

43s the embedding ˜ J ⊥ e ֒ → J ( p ), and denote by [ N ˜ J ⊥ e ] ˜ J ⊥ e the multiplication by N ˜ J ⊥ e restricted to˜ J ⊥ e . As in (8) and before Corollary 5.7 we use our pseudo-projections and deﬁne e X (2) , − := ι ˜ J ⊥ e ,N ⊥ e [ N ˜ J ⊥ e ] − J ⊥ e ˜ I J ⊥ e ,N ⊥ e π J ⊥ e ( X (2) , − ) . Then P − w p ( σ P ) belongs to X (2) , − ∩ ˜ J ⊥ e , and even to the intersection of surfaces (in the genericﬁber): X (2) , − ∩ e X (2) , − . Recall (see (8)) that e X (2) , − is a priori highly non-connected, being the inverse image of multipli-cation by N ˜ J ⊥ e in ˜ J ⊥ e of the (irreducible) surface ˜ I J ⊥ e ,N ⊥ e π J ⊥ e ( X (2) , − ). However, in what followswe can replace e X (2) , − by one of its connected components containing P − w p ( σ P ). Denote thatcomponent by e X (2) , − P .By construction, the theta degree and height of e X (2) , − P , as an irreducible subvariety of J ( p )endowed with Θ, are those of π J ⊥ e ( X (2) , − ) = X (2) , − e ⊥ relative to the only natural hermitian sheaf of J ⊥ e , that is, the Θ ⊥ e = Θ J ⊥ e described in paragraph 2.1.2. One can therefore apply Proposition 5.3to obtain that all theta degrees are O ( p ), all N´eron-Tate theta heights are O (log p ). We claim thedimension of ( X (2) , − ∩ e X (2) , − P ) is zero. That intersection indeed corresponds to pairs of distinctpoints on X ( p ) having same image (0) under ϕ e . On the other hand, Brumer’s conjecture implies X ( p ) → ϕ e ( X ( p )) has generic degree one (see Lemma 7.2), so our intersection points correspondto singular points on ϕ e ( X ( p )), which of course make a ﬁnite set.We therefore are in position to apply our arithmetic B´ezout theorem (Proposition 6.1), whichyields h Θ ( P − w p ( σ P )) ≤ O ( p log p ). 
The two points ( P − ∞ ) and ( w p ( σ P ) − ∞ ) have sameΘ-height (recall w p is an isometry on J ( p ) for h Θ , compare the end of Remark 4.3), and are byhypothesis diﬀerent, so one can apply them Mumford’s repulsion principle (Proposition 5.9) toobtain h Θ ( P − ∞ ) ≤ O ( p log p ) . (116)Let us ﬁnally deal with the case when the torsion point a = ϕ e ( P ) + ϕ e ( σ P ) is nonzero. Weadapt the previous argument: pick a lift ˜ a ∈ J ( p )( Q ) of a by π ⊥ e which also is torsion, and let t ˜ a be the translation by ˜ a in J ( p ). Replace ( P − w p ( σ P )) by t ∗ ˜ a ( P − w p ( σ P )), X (2) , − by t ∗ ˜ a X (2) , − and e X (2) , − by g t ∗ ˜ a X (2) , − = ι ˜ J ⊥ e ,N ⊥ e [ N ˜ J ⊥ e ] − J ⊥ e ˜ I J ⊥ e ,N ⊥ e π J ⊥ e ( t ∗ ˜ a X (2) , − ) . Now t ∗ ˜ a ( P − w p ( σ P )) belongs to ( t ∗ ˜ a X (2) , − ∩ g t ∗ ˜ a X (2) , − ). The theta degree and height of t ∗ ˜ a X (2) , − and g t ∗ ˜ a X (2) , − (or rather, as above, some connected component g t ∗ ˜ a X (2) , − P of it containing t ∗ ˜ a ( P − w p ( σ P )))are the same as for the former objects in the case a = 0. The fact that the intersection t ∗ ˜ a X (2) , − ∩ g t ∗ ˜ a X (2) , − P is zero-dimensional comes from the fact that otherwise, we would have ϕ e ( X ( p )) = a − ϕ e ( X ( p )),a contradiction with our present hypothesis a = 0 by Proposition 7.2. The height bound for P istherefore the same as (116). (cid:3) Corollary 7.6

Under the assumptions of Theorem 7.5, if p is a large enough prime number and P is a quadratic point of X ( p γ ) for some integer γ , such that P is not a cusp nor a complexmultiplication point, then γ ≤ . roof Let P be a point in X ( p γ )( K ), which is not a cusp nor a CM point, for some quadraticnumber ﬁeld K . Then the isogeny bounds of [20], Theorem 1.4 imply there is some real κ with p γ < κ (h j ( P )) . Now Theorem 7.5 gives that there is some absolute real constant B such that, if p ≥ B then γ ≤ (cid:3) Remark 7.7

A similar (but technically simpler) approach for the morphism $X_0(p) \to J_e$ over $\mathbb{Q}$ should give (independently of any conjecture) a bound of shape $O(p \log p)$ for the $j$-height of $\mathbb{Q}$-rational (non-cuspidal) points of $X_0(p)$ (which are known not to exist for $p > 163$ by Mazur's theorem). The same should apply for $\mathbb{Q}$-points of $X_{\mathrm{split}}(p)$ (and here again, we obtain a weak version of known results).

Actually, sharpening results directly coming from Section 4 (that is, avoiding the use of Bézout) might even yield the full strength of the above results about $X_0(p)(\mathbb{Q})$ and $X_{\mathrm{split}}(p)(\mathbb{Q})$, with more straightforward (unconditional) proofs.

In this appendix, I give a new upper bound for the norm of the classical theta function on any complex abelian variety. This result, apart from its role in the present paper (see Section 6), has been used by Wilms [63] to bound the Green-Arakelov function on curves.

Let $g$ be a positive integer. Write $\mathbb{H}_g$ for the Siegel space of symmetric matrices $Z \in M_g(\mathbb{C})$ such that $\operatorname{Im} Z$ is positive definite. To every $Z \in \mathbb{H}_g$ is associated the theta function defined by
\[ \theta_Z(z) = \sum_{m \in \mathbb{Z}^g} \exp\bigl(i\pi\, {}^t m Z m + 2 i \pi\, {}^t m z\bigr), \quad \forall z \in \mathbb{C}^g, \]
and its norm defined by
\[ \|\theta_Z(z)\| = \sqrt[4]{\det Y}\, \exp\bigl(-\pi\, {}^t y Y^{-1} y\bigr)\, |\theta_Z(z)|, \quad \forall z = x + iy \in \mathbb{C}^g, \]
where $Y = \operatorname{Im} Z$.

My contribution here is the following:

Proposition 8.1

Let $Z \in \mathbb{H}_g$ and assume that $Z$ is Siegel-reduced. Put $c_g = \frac{g+2}{2}$ if $g \le 3$ and $c_g = \frac{g+2}{2} \left( \frac{g+2}{\pi\sqrt{3}} \right)^{g/2}$ if $g \ge 4$. The upper bound
\[ \|\theta_Z(z)\| \le c_g\, (\det \operatorname{Im} Z)^{1/4} \]
holds for every $z \in \mathbb{C}^g$.

Let us remark that $c_g \le g^{g/2}$ for every $g \ge 2$. In comparison, Edixhoven and de Jong ([18] page 231) obtained the statement of Proposition 8.1 with $c_g$ replaced by $2^{3g^3+5g}$.

8.2 Proof

Fix a positive integer $g$. Denote by $S_g$ the set of symmetric matrices $Y \in M_g(\mathbb{R})$ that are positive definite. Let us recall a special case of the functional equation for the theta function (see equation (5.6) of [50] page 195): for every $Y \in S_g$ and every $z \in \mathbb{C}^g$, one has
\[ \theta_{iY^{-1}}(-iY^{-1}z) = \sqrt{\det Y}\, \exp\bigl(\pi\, {}^t z Y^{-1} z\bigr)\, \theta_{iY}(z). \tag{117} \]

Lemma 8.2

Let $Z \in \mathbb{H}_g$ and $z \in \mathbb{C}^g$. Putting $Y = \operatorname{Im} Z$, one has the inequality
\[ \|\theta_Z(z)\| \le \|\theta_{iY}(0)\| = \sqrt[4]{\det Y}\; \theta_{iY}(0). \]

Proof. Put $y = \operatorname{Im} z$. One has
\[ |\theta_Z(z)| = \Bigl| \sum_{m \in \mathbb{Z}^g} \exp\bigl(i\pi\, {}^t m Z m + 2 i \pi\, {}^t m z\bigr) \Bigr| \le \sum_{m \in \mathbb{Z}^g} \Bigl| \exp\bigl(i\pi\, {}^t m Z m + 2 i \pi\, {}^t m z\bigr) \Bigr| = \theta_{iY}(iy), \]
that is, $\|\theta_Z(z)\| \le \|\theta_{iY}(iy)\|$. The functional equation (117) gives $\|\theta_{iY^{-1}}(Y^{-1}y)\| = \|\theta_{iY}(iy)\|$, and one deduces
\[ \|\theta_Z(z)\| \le \|\theta_{iY^{-1}}(Y^{-1}y)\|. \tag{118} \]
Applying again (118) with $Z$ replaced by $iY^{-1}$ and $z$ by $Y^{-1}y$, one gets $\|\theta_{iY^{-1}}(Y^{-1}y)\| \le \|\theta_{iY}(0)\|$. Whence the result. □
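As a numerical sanity check of Lemma 8.2 in the simplest case $g = 1$, the following Python sketch evaluates a truncated theta series at random points and compares its norm with the value at $(iY, 0)$. It assumes the normalisation $\|\theta_Z(z)\| = (\det Y)^{1/4} \exp(-\pi\, {}^t y Y^{-1} y)\,|\theta_Z(z)|$, the exponent for which the identity $\|\theta_{iY^{-1}}(Y^{-1}y)\| = \|\theta_{iY}(iy)\|$ used in the proof is compatible with (117). The function names, the truncation level `M` and the sampling ranges are illustrative choices, not part of the paper.

```python
import cmath
import math
import random


def theta(Z, z, M=60):
    """Genus-1 theta series truncated to |m| <= M (illustrative cut-off)."""
    return sum(
        cmath.exp(1j * math.pi * m * m * Z + 2j * math.pi * m * z)
        for m in range(-M, M + 1)
    )


def theta_norm(Z, z, M=60):
    """Normalised norm (Im Z)^(1/4) * exp(-pi y^2 / Im Z) * |theta_Z(z)|."""
    Y, y = Z.imag, z.imag
    return Y ** 0.25 * math.exp(-math.pi * y * y / Y) * abs(theta(Z, z, M))


def worst_ratio(samples=2000, seed=0):
    """Largest observed value of ||theta_Z(z)|| / ||theta_{iY}(0)||."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # Sampling ranges are arbitrary; Im Z stays away from 0 so that
        # the truncated series is an accurate approximation.
        Z = complex(rng.uniform(-1, 1), rng.uniform(0.3, 3.0))
        z = complex(rng.uniform(-1, 1), rng.uniform(-2, 2))
        bound = theta_norm(complex(0.0, Z.imag), 0j)
        worst = max(worst, theta_norm(Z, z) / bound)
    return worst


if __name__ == "__main__":
    print("largest ratio found:", worst_ratio())
```

If the lemma holds, `worst_ratio()` should not exceed 1 beyond rounding error; the bound is attained at $Z = iY$, $z = 0$.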

Let $Y \in S_g$. Define $\lambda(Y) = \min_{m \in \mathbb{Z}^g - \{0\}} {}^t m Y m$. For every $t \in \mathbb{R}_+^*$, put
\[ f_Y(t) = \theta_{itY}(0) = \sum_{m \in \mathbb{Z}^g} \exp\bigl(-\pi t\, {}^t m Y m\bigr). \]

Lemma 8.3

Let $Y \in S_g$ and put $\lambda = \lambda(Y)$. The following properties hold.

(a) The function $\mathbb{R}_+^* \to \mathbb{R}$ that maps $t$ to $t^{g/2} f_Y(t)$ is increasing.

(b) One has the estimate $f_Y\!\left( \frac{g+2}{2\pi\lambda} \right) \le \frac{g+2}{2}$.

Proof. (a) The functional equation (117) implies $\sqrt{\det Y}\; t^{g/2} f_Y(t) = f_{Y^{-1}}(1/t)$ for every $t \in \mathbb{R}_+^*$; conclude by remarking that $f_{Y^{-1}}$ is decreasing.

(b) Part (a) gives $\frac{d}{dt}\bigl[ t^{g/2} f_Y(t) \bigr] \ge 0$, that is, $\frac{g}{2t} f_Y(t) \ge -f_Y'(t)$ for every $t > 0$. On the other hand,
\[ -\frac{1}{\pi} f_Y'(t) = \sum_{m \in \mathbb{Z}^g} {}^t m Y m\, \exp\bigl(-\pi t\, {}^t m Y m\bigr) \ge \sum_{m \in \mathbb{Z}^g - \{0\}} \lambda\, \exp\bigl(-\pi t\, {}^t m Y m\bigr) = \lambda\, [f_Y(t) - 1]. \]
One infers $\frac{g}{2t} f_Y(t) \ge \pi\lambda\, [f_Y(t) - 1]$. Taking $t = \frac{g+2}{2\pi\lambda}$, one obtains the result. □

Proposition 8.4

Let $Y \in S_g$. Putting $\lambda = \lambda(Y)$, one has the upper bound
\[ \theta_{iY}(0) \le \frac{g+2}{2} \max\left[ \left( \frac{g+2}{2\pi\lambda} \right)^{g/2},\ 1 \right]. \]

Proof. Put $t = \frac{g+2}{2\pi\lambda}$. If $t \ge 1$, then Lemma 8.3 (a) implies the inequality $f_Y(1) \le t^{g/2} f_Y(t)$. If $t \le 1$, then $f_Y(1) \le f_Y(t)$ since $f_Y$ is decreasing. In any case, one obtains $\theta_{iY}(0) = f_Y(1) \le \max(t^{g/2}, 1)\, f_Y(t)$. Conclude by applying Lemma 8.3 (b). □

Now, to prove Proposition 8.1 from Lemma 8.2 and Proposition 8.4, it suffices to observe that if $Z \in \mathbb{H}_g$ is Siegel-reduced, then $\lambda(\operatorname{Im} Z) \ge \frac{\sqrt{3}}{2}$ (see lemma 15 of [27] page 195).
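The estimates of Lemma 8.3 and Proposition 8.4 can be tested numerically. The sketch below does so for $g = 2$, representing a positive definite matrix $Y$ by the triple `(a, b, c)` of its entries and computing $\lambda(Y)$ by brute force. It also evaluates the constant of Proposition 8.1 in the form $\frac{g+2}{2} \max\bigl[ (\frac{g+2}{\pi\sqrt{3}})^{g/2}, 1 \bigr]$, which is what Proposition 8.4 gives for $\lambda = \sqrt{3}/2$. All function names, the sampling ranges, the grid in $t$ and the truncation rule are assumptions made for illustration only.

```python
import math
import random

G = 2  # dimension handled by this sketch


def quad(Y, m1, m2):
    """Value of the quadratic form of Y = (a, b, c) ~ [[a, b], [b, c]] at m."""
    a, b, c = Y
    return a * m1 * m1 + 2 * b * m1 * m2 + c * m2 * m2


def min_eigenvalue(Y):
    a, b, c = Y
    return ((a + c) - math.hypot(a - c, 2 * b)) / 2


def lam(Y):
    """lambda(Y): minimum of the form over nonzero integer vectors."""
    # The form is >= mu*|m|^2 and equals a at (1, 0), so a minimiser
    # satisfies |m|^2 <= a/mu and lies in the box searched below.
    r = int(math.sqrt(Y[0] / min_eigenvalue(Y))) + 1
    return min(
        quad(Y, m1, m2)
        for m1 in range(-r, r + 1)
        for m2 in range(-r, r + 1)
        if (m1, m2) != (0, 0)
    )


def f(Y, t):
    """Truncated f_Y(t); each dropped term is smaller than exp(-40)."""
    r = int(math.sqrt(40 / (math.pi * t * min_eigenvalue(Y)))) + 1
    return sum(
        math.exp(-math.pi * t * quad(Y, m1, m2))
        for m1 in range(-r, r + 1)
        for m2 in range(-r, r + 1)
    )


def check(Y, tol=1e-9):
    """Return (Lemma 8.3 (a) on a grid, Lemma 8.3 (b), Proposition 8.4)."""
    t0 = (G + 2) / (2 * math.pi * lam(Y))
    grid = [0.3 + 0.3 * k for k in range(10)]
    vals = [t ** (G / 2) * f(Y, t) for t in grid]
    part_a = all(u <= v + tol for u, v in zip(vals, vals[1:]))
    part_b = f(Y, t0) <= (G + 2) / 2 + tol
    prop = f(Y, 1.0) <= (G + 2) / 2 * max(t0 ** (G / 2), 1.0) + tol
    return part_a, part_b, prop


def random_matrix(rng):
    """Random positive definite (a, b, c) with determinant at least 0.5."""
    while True:
        a, c = rng.uniform(0.5, 3), rng.uniform(0.5, 3)
        b = rng.uniform(-1.5, 1.5)
        if a * c - b * b >= 0.5:
            return (a, b, c)


def c_const(g):
    """Proposition 8.4 specialised to lambda = sqrt(3)/2."""
    return (g + 2) / 2 * max(((g + 2) / (math.pi * math.sqrt(3))) ** (g / 2), 1.0)


if __name__ == "__main__":
    rng = random.Random(1)
    print(all(check(random_matrix(rng)) == (True, True, True) for _ in range(40)))
    print([round(c_const(g), 3) for g in range(1, 7)])
```

The maximum in `c_const` switches branches where $(g+2)/(\pi\sqrt{3})$ crosses 1, that is, between $g = 3$ and $g = 4$.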

Acknowledgments

The main body of this work (by P.P.) benefited from hours of discussions with the author of the Appendix (P.A.), who shared with great generosity his expertise in Arakelov geometry, provided extremely valuable advice, references, explanations, criticism, insights, and even read large parts of preliminary releases of the present paper (although, as goes without saying, he bears no responsibility for the mistakes which remain). Pascal actually ended up writing the present Appendix, and the bounds it displays for theta functions should definitely be useful in a much wider context than the present work (they have already been used by R. Wilms in [63], see the introduction to Autissier's Appendix).

Many thanks are also due to Qing Liu for clarifying some points of algebraic geometry, Fabien Pazuki for explaining general diophantine geometry issues, and to Gaël Rémond for describing to us his own approach to Vojta's method, which under some guise plays a crucial role here.

As already stressed, the influence of the orange book [13] should be obvious all over this text. We have used many results of the deep effective Arakelov study of modular curves led there by Bas Edixhoven, Jean-Marc Couveignes and their coauthors. We also benefited from a visit to Leiden University in June of 2015, where we had very enlightening discussions with Bas, Peter Bruin, Robin de Jong and David Holmes.

Olga Balkanova, Samuel Le Fourn and Guillaume Ricotta helped a lot with references and explanations about some results of analytic number theory, and Jean-Benoît Bost kindly answered some questions about his own arithmetic Bézout theorem.

Finally, many thanks are due to the referee for her or his substantial and helpful work.

References

[1] A. Abbes, Hauteurs et discrétude, Séminaire Bourbaki 1996-1997, exposé , Astérisque (1997), 141–166.
[2] A. Abbes, E. Ullmo, Comparaison des métriques d'Arakelov et de Poincaré sur $X_0(N)$, Duke Math. J. (1995), 295–307.
[3] A. Aryasomayajula, Bounds for Green's functions on hyperbolic Riemann surfaces of finite volume. PhD thesis, Humboldt-Universität zu Berlin, 2013.
[4] Yu. Bilu, P. Parent, Runge's method and modular curves, Int. Math. Res. Not. 2011, no. 9, 1997–2027.
[5] Yu. Bilu, P. Parent, M. Rebolledo, Rational points on $X_0^+(p^r)$, Ann. Inst. Fourier , n. 3 (2013), 957–984.
[6] C. Birkenhake, H. Lange, Complex abelian varieties, Grund. der Mat. Wiss , 2004.
[7] J.-B. Bost, Intrinsic heights on stable varieties and abelian varieties, Duke Math. J. n. 1 (1996), 21–70.
[8] J.-B. Bost, H. Gillet, C. Soulé, Heights of projective varieties and positive Green forms. J. Amer. Math. Soc. (1994), 903–1027.
[9] P. Bruin, Explicit bounds on automorphic and canonical Green functions of Fuchsian groups, Mathematika (2014), 257–306.
[10] A. Brumer, The rank of $J_0(N)$, Astérisque (1995), 41–68.
[11] S. Checcoli, F. Veneziano, E. Viada, The explicit Mordell conjecture for families of curves, Preprint (2016). arxiv.org/abs/1602.04097
[12] T. Chinburg, An introduction to Arakelov intersection theory, in Arithmetic Geometry, ed. by G. Cornell and J. Silverman, Springer-Verlag (1986).
[13] J.-M. Couveignes, B. Edixhoven et al., Computational aspects of modular forms and Galois representations, Ann. of Math. Stud. , Princeton Univ. Press, Princeton, NJ, 2011.
[14] P. Deligne, M. Rapoport, Les schémas de modules de courbes elliptiques, in "Modular functions of one variable, II (Proc. Internat. Summer School, Univ. Antwerp, Antwerp, 1972)", pp. 143–316; Lecture Notes in Math. , Springer, Berlin, 1973.
[15] A. Ducros, Espaces analytiques p-adiques au sens de Berkovich, Séminaire Bourbaki n. , 2006.
[16] B. Edixhoven, R. de Jong, Short introduction to heights and Arakelov theory, in Computational aspects of modular forms and Galois representations, Ann. of Math. Stud. , Princeton Univ. Press, Princeton, NJ, 2011, pp. 79–94.
[17] B. Edixhoven, R. de Jong, Applying Arakelov theory, in Computational aspects of modular forms and Galois representations, Ann. of Math. Stud. , Princeton Univ. Press, Princeton, NJ, 2011, pp. 187–202.
[18] B. Edixhoven, R. de Jong, Bounds for Arakelov invariants of modular curves, in Computational aspects of modular forms and Galois representations, Ann. of Math. Stud. , Princeton Univ. Press, Princeton, NJ, 2011, pp. 217–256.
[19] G. Faltings, Diophantine approximation on Abelian varieties, Ann. of Math. (2) (1991), 549–576.
[20] É. Gaudron, G. Rémond, Théorème des périodes et degrés minimaux d'isogénies, Comment. Math. Helv. (2014), 343–403.
[21] É. Gaudron, G. Rémond, Polarisations et isogénies, Duke Math. J. (2014), 2057–2108.
[22] Ph. Griffiths, J. Harris, Principles of algebraic geometry. New York: Wiley Interscience, 1978.
[23] B. H. Gross, Heegner points on $X_0(N)$, in Modular forms (ed. R.A. Rankin), Chichester: Ellis Horwood, 87–106 (1984).
[24] B. H. Gross, Heights and the special values of L-series. In Number theory (Montreal, Que., 1985), volume of CMS Conf. Proc., pp. 115–187. Amer. Math. Soc., Providence, RI, 1987.
[25] B. H. Gross, Heegner points and the modular curve of prime level, J. Math. Soc. Japan, n. (1987), 345–362.
[26] M. Hindry, J. H. Silverman, Diophantine Geometry. An Introduction, GTM 201, Springer, 2000.
[27] J. Igusa, Theta functions. Grundlehren der math. Wissenschaften (1972).
[28] H. Iwaniec, W. Luo, P. Sarnak, Low lying zeros of families of L-functions, Pub. Math. I.H.É.S. (2000), 55–131.
[29] H. Iwaniec, P. Sarnak, The non-vanishing of central values of automorphic L-functions and Landau-Siegel zeros, Israel J. Math. (2000), 155–177.
[30] R. de Jong, Néron-Tate heights of cycles on jacobians. J. Alg. Geom. (2018), 339–381.
[31] R. de Jong, F. Shokrieh, Tropical moments of tropical Jacobians, preprint available at https://arxiv.org/abs/1810.02639.
[32] J. Jorgenson, J. Kramer, Bounds on canonical Green's functions. Compositio Math. (3) (2006), 679–700.
[33] M. A. Kenku, F. Momose, Automorphism groups of the modular curves $X_0(N)$, Compositio Math. , n. 1 (1988), 51–80.
[34] E. Kowalski, Ph. Michel, J. Vanderkam, Non-vanishing of high derivatives of automorphic L-functions at the center of the critical strip, J. reine angew. Math. (2000), 1–34.
[35] E. Larson, D. Vaintrob, Determinants of subquotients of Galois representations associated with abelian varieties. With an appendix by Brian Conrad. J. Inst. Math. Jussieu (2014), no. 3, 517–559.
[36] S. Le Fourn, Surjectivity of Galois representations associated with quadratic $\mathbb{Q}$-curves, Mathematische Ann. (2016), 173–214. arxiv.org/abs/1212.4713
[37] Q. Liu, Algebraic geometry and arithmetic curves. (Translated from the French by Reinie Erné). Oxford University Press, Oxford, 2002.
[38] Yu. I. Manin, Yu. G. Zarhin, Height on families of abelian varieties, Mat. Sb. (N.S.) (131) (1972), 171–181.
[39] B. Mazur, Modular curves and the Eisenstein ideal, Publications mathématiques de l'I.H.E.S. (1977), 33–186.
[40] R. Menares, Nombres d'intersection arithmétiques et opérateurs de Hecke sur les courbes modulaires $X_0(N)$, Thèse de l'université de Paris-Sud-Orsay, 2008. http://tel.archives-ouvertes.fr/tel-00360171
[41] R. Menares, Correspondences in Arakelov geometry and applications to the case of Hecke operators on modular curves, Manuscripta Math. (2011), 501–543. arXiv:0911.0546
[42] F. Merkl, An upper bound for Green functions on Riemann surfaces, in Computational aspects of modular forms and Galois representations, Ann. of Math. Stud. , Princeton Univ. Press, Princeton, NJ, 2011, pp. 203–216.
[43] Ph. Michel, E. Ullmo, Points de petite hauteur sur les courbes $X_0(N)$, Invent. Math. (1998), 645–674.
[44] G. Mikhalkin, I. Zharkov, Tropical curves, their Jacobians and theta functions. In Curves and abelian varieties, 203–230, Contemp. Math. , AMS, Providence, RI, 2008.
[45] F. Momose, Isogenies of prime degree over number fields. Compositio Math., n. 3, (1995), 329–348.
[46] L. Moret-Bailly, Métriques permises, in Séminaire sur les pinceaux arithmétiques : la conjecture de Mordell. Astérisque , 29–87, 1985.
[47] L. Moret-Bailly, Pinceaux de variétés abéliennes, Astérisque , 1985.
[48] L. Moret-Bailly, Sur l'équation fonctionnelle de la fonction thêta de Riemann, Compositio Math. n. (1990), 203–217.
[49] D. Mumford, On equations defining abelian varieties I, Invent. Math. (1966), 287–354.
[50] D. Mumford, Tata lectures on theta I. Progress in math. (1983).
[51] D. Mumford, Tata lectures on Theta Functions I, Progr. in Math., No. , Birkhäuser, 1984.
[52] A. Ogg, Hyperelliptic modular curves, Bull. Soc. Math. France (1974).
[53] A. Ogg, Über die Automorphismengruppe von $X_0(N)$, Mathematische Ann. , 279–292.
[54] F. Pazuki, Theta height and Faltings height, Bull. Soc. Math. Fr. (2012), 19–49.
[55] F. Pellarin, Sur une majoration explicite pour un degré d'isogénie liant deux courbes elliptiques, Acta Arith. (2001), 203–243.
[56] P. Philippon, Sur des hauteurs alternatives, I [Math. Annalen (1991), 255–283], II [Ann. Fourier, (1994), 1043–1065], III [J. Math. Pures Appl. (1995), 345–365].
[57] G. Rémond, Décompte dans une conjecture de Lang, Invent. Math. (2000), 513–545.
[58] G. Rémond, Nombre de points rationnels des courbes, Proc. London Math. Soc. (3) (2010), 759–794.
[59] E. Royer, Petits zéros de fonctions L de formes modulaires, Acta Arith. (2001), no 2, 147–172.
[60] Ch. Soulé, Géométrie d'Arakelov et théorie des nombres transcendants, Astérisque (1991), 355–371.
[61] A. Thuillier, Théorie du potentiel sur les courbes en géométrie non archimédienne. Applications à la théorie d'Arakelov. Thèse de doctorat, université de Rennes 1, 2005. https://tel.archives-ouvertes.fr/file/index/docid/48750/filename/tel-00010990.pdf
[62] E. Ullmo, Hauteur de Faltings de quotients de $J_0(N)$, discriminants d'algèbres de Hecke et congruences entre formes modulaires, American Journal of Math. (2000), 83–115.
[63] R. Wilms, New explicit formulas for Faltings' delta-invariant, arXiv:1605.00847v2, Invent. Math. (2017), 481–539.
[64] Sh.-W. Zhang, Admissible pairing on a curve, Invent. Math. (1993), 171–193.
[65]