COUNTING POINTS OF BOUNDED HEIGHT IN MONOID ORBITS
WADE HINDES (with appendix by UMBERTO ZANNIER)
Abstract.
Given a set of endomorphisms on $\mathbb{P}^N$, we establish an upper bound on the number of points of bounded height in the associated monoid orbits. Moreover, we give a more refined estimate with an associated lower bound when the monoid is free. Finally, we show that most sets of rational functions in one variable satisfy these more refined bounds.

1. Introduction
Let $H$ be the absolute multiplicative Weil height on $\mathbb{P}^N(\overline{\mathbb{Q}})$ and let $K$ be a number field. Then given a subset $X \subseteq \mathbb{P}^N(\overline{\mathbb{Q}})$ of interest in some context, the growth rate of the number of $K$-points in $X$ of bounded height,
\[ X(K,B) := \#\{Q \in X \cap \mathbb{P}^N(K) : H(Q) \le B\}, \]
is known to encode interesting invariants of $X$ and $K$. For instance, if $X = \mathbb{P}^N(K)$, then $X(K,B) \sim C_{K,N}\, B^{(N+1)[K:\mathbb{Q}]}$, where $C_{K,N}$ depends on the regulator, class group, etc. of $K$. If $X$ is an abelian variety, then $X(K,B) \sim C_{K,X} \log(B)^{r/2}$, where $r$ is the rank of the Mordell-Weil group $X(K)$. If $X$ is a smooth curve of genus at least $2$, then $X(K,B) \sim C_{K,X}$. More generally, if $X$ is a thin set, i.e., a proper Zariski closed subset or the image of some generically finite morphism of degree at least two, then Theorem 3 in [22, §13.1] implies that
\[ X(K,B) \ll B^{(N+1/2)[K:\mathbb{Q}]} \log(B). \tag{1} \]

Likewise, there are a few height-counting results in arithmetic dynamics, where orbits play the role of $X$; see [2, 13, 16, 25, 26] for examples on Markoff varieties, K3 surfaces, and projective space. For instance, suppose that $\phi$ is a dominant rational self-map of $\mathbb{P}^N$ with dynamical degree $\delta_\phi > 1$. Then, if $P \in \mathbb{P}^N(K)$ is a point such that the orbit $\mathrm{Orb}_\phi(P) = \{\phi^n(P)\}_{n \ge 0}$ is Zariski dense, the Kawaguchi-Silverman Conjecture predicts that
\[ \#\{Q \in \mathrm{Orb}_\phi(P) : H(Q) \le B\} \sim \log(\delta_\phi)^{-1} \log\log(B); \tag{2} \]
see [16] for the relevant definitions and background. Of course, this asymptotic is known in the case of morphisms, when $\deg(\phi) = \delta_\phi > 1$ and $P$ is not preperiodic. Similarly, if $S = \{\phi_1, \dots, \phi_s\}$ is a set of endomorphisms of degree at least two equipped with a probability measure $\nu$, then for almost every sequence $\gamma$ of elements of $S$, we have the analogous asymptotic to (2) for random orbits:
\[ \#\{Q \in \mathrm{Orb}_\gamma(P) : H(Q) \le B\} \sim \log(\delta_{S,\nu})^{-1} \log\log(B), \qquad \delta_{S,\nu} = \prod_{\phi \in S} \deg(\phi)^{\nu(\phi)}. \tag{3} \]
Here, the bound holds for all $P$ with large enough height; see [13, Corollary 1.3] for details.

In this paper, we study the problem of counting points of bounded height in monoid (or semigroup) orbits in $\mathbb{P}^N$, that is, counting all of the points of bounded height obtained by applying all possible compositions of maps within a fixed set $S$ to a given initial point $P$; compare to [2, 26]. Intuitively, one expects that if the maps in $S$ are related in some way (for instance, if they commute), then this should cut down the number of possible points in the associated orbits. However, for most $S$ we expect to see no relations (free monoids), and with this in mind, we have the following result; here and throughout, $M_S$ denotes the monoid generated under composition by a set $S$ of endomorphisms of $\mathbb{P}^N$ defined over $\overline{\mathbb{Q}}$.

Mathematics Subject Classification: Primary: 37P15, 37P05. Secondary: 11G50, 11D45.
Theorem 1.1.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of endomorphisms on $\mathbb{P}^N(\overline{\mathbb{Q}})$ with distinct degrees all at least two. If $M_S$ is free, then for all $\epsilon > 0$ there exists an effectively computable positive constant $b = b(S,\epsilon)$ and a constant $B_S$ depending only on $S$ such that
\[ (\log B)^{b} \ll \#\{f \in M_S : H(f(P)) \le B\} \ll (\log B)^{b+\epsilon} \]
holds for all $P \in \mathbb{P}^N(\overline{\mathbb{Q}})$ with $H(P) > B_S$. Moreover, the implicit constants and error terms depend on $P$ and are effectively computable if $B_S$ is.

Remark 1. When $S = \{\phi_1, \phi_2\}$ generates a free monoid with $\deg(\phi_1) = 2$ and $\deg(\phi_2) = 3$, we give explicit computations for the bounds in Theorem 1.1 in Example 1 below.

In particular, we can use the upper bound in Theorem 1.1 on the number of functions in the free case to give an upper bound on the number of points of bounded height in arbitrary dynamical orbits; compare to (1), to [2, Theorem 4.15], and to the asymptotic for abelian varieties above. In what follows, $\mathrm{Orb}_S(P) = \{f(P) : f \in M_S\}$ denotes the total orbit of $P$ under the monoid $M_S$.

Corollary 1.2.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of endomorphisms on $\mathbb{P}^N(\overline{\mathbb{Q}})$ all of degree at least two (and distinct if $s \ge 2$). Then there exists an effectively computable positive constant $b$ and a constant $B_S$ depending only on $S$ such that
\[ \#\{Q \in \mathrm{Orb}_S(P) : H(Q) \le B\} \ll (\log B)^{b} \]
holds for all $P \in \mathbb{P}^N(\overline{\mathbb{Q}})$ with $H(P) > B_S$.

Remark 2. Although we expect that $\log(B)^b$ is also a lower bound for some choice of $b$ and most $S$ (see Conjecture 1.3 and Theorem 1.6 below), we note that it is only an upper bound in general, even for $s \ge 2$. For instance, if $M_S$ is a free commutative monoid (e.g., if $S$ is a certain set of monic power maps), then the asymptotic height growth rate in orbits will be a constant times $\log\log(B)$; see [13, §5] for details. This matches the case of a single map (also a commutative monoid); see also (2) and (3) above.

Motivated by the upper and lower bounds in Theorem 1.1, we conjecture the following exact asymptotic for the number of points (not functions) of bounded height in total orbits associated to free monoids:

Conjecture 1.3.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of endomorphisms on $\mathbb{P}^N(\overline{\mathbb{Q}})$ with distinct degrees all at least two. If $M_S$ is free, then there exist constants $a_P = a(S,P)$ and $b = b(S)$ such that
\[ \lim_{B \to \infty} \frac{\#\{Q \in \mathrm{Orb}_S(P) : H(Q) \le B\}}{(\log B)^{b}} = a_P \]
holds for all sufficiently generic $P \in \mathbb{P}^N(\overline{\mathbb{Q}})$ (i.e., all $P \in \mathbb{P}^N(\overline{\mathbb{Q}})$ outside of the union of a proper Zariski closed subset and a set of points of bounded height).

Remark 3. Hence, we expect most monoid orbits in $\mathbb{P}^N$ to exhibit height growth similar to that of orbits on Markoff varieties [26], orbits on K3 surfaces in $\mathbb{P}^1 \times \mathbb{P}^1 \times \mathbb{P}^1$ given by $(2,2,2)$-forms [2, Theorem 4.5], and Mordell-Weil groups of abelian varieties. However, in these cases the relevant monoids (or the underlying varieties themselves) form groups, and there is less need to distinguish between counting functions and points. For instance, if there are inverses in $M_S$, distinct functions that agree at a point determine a non-trivial fixed point, and these fixed points can typically be controlled. On the other hand, in the case of abelian varieties (where one considers the monoid generated by multiplication maps), distinct functions that agree at a point determine a torsion point. Thus this situation may be avoided by throwing away a set of bounded height.

To motivate our conjecture, we restrict our attention to morphisms of $\mathbb{P}^1$. To state our results in this setting, recall that $w \in \mathbb{P}^1(\mathbb{C})$ is called a critical value of $\phi \in \mathbb{C}(x)$ if $\phi^{-1}(w)$ contains fewer than $\deg(\phi)$ elements. Likewise, we call a critical value $w$ of $\phi$ simple if $\phi^{-1}(w)$ contains exactly $\deg(\phi) - 1$ points. In particular, we have the corresponding notions for sets:
Definition 1.4.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of rational maps on $\mathbb{P}^1$ and let $C_{\phi_i}$ denote the set of critical values of $\phi_i$. Then $S$ is called critically separate if $C_{\phi_i} \cap C_{\phi_j} = \emptyset$ for all $i \ne j$. Moreover, $S$ is called critically simple if every critical value of every $\phi \in S$ is simple.

As evidence for Conjecture 1.3 above, we establish the following weak version for generic sets of rational maps on $\mathbb{P}^1$. In particular, we are able to count points instead of just functions.

Theorem 1.5.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of rational maps on $\mathbb{P}^1(\overline{\mathbb{Q}})$ with distinct degrees all at least four. If $S$ is critically separate and critically simple, then $M_S$ is a free monoid and for all $\epsilon > 0$ there exists an effectively computable positive constant $b = b(S,\epsilon)$ and a constant $B_S$ depending only on $S$ such that
\[ (\log B)^{b} \ll \#\{Q \in \mathrm{Orb}_S(P) : H(Q) \le B\} \ll (\log B)^{b+\epsilon} \]
holds for all $P \in \mathbb{P}^1(\overline{\mathbb{Q}})$ with $H(P) > B_S$.

Finally, since the theorem above does not directly apply to sets of polynomials, we give a different proof in this case which works quite generically. In what follows, for $m \ge 2$ the polynomials $Z_m = x^m$ are called cyclic polynomials and the polynomials $T_m$ satisfying $T_m(x + x^{-1}) = x^m + x^{-m}$ are called Chebychev polynomials (of the first kind).
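The defining identity $T_m(x + x^{-1}) = x^m + x^{-m}$ can be checked directly. The sketch below is only an illustration: it assumes the recurrence $T_{m+1}(y) = y\,T_m(y) - T_{m-1}(y)$ with $T_0 = 2$ and $T_1 = y$, which follows from multiplying $x^m + x^{-m}$ by $x + x^{-1}$, and the function names are our own. Note that this normalization differs from the trigonometric one ($\cos(m\theta)$ as a polynomial in $\cos\theta$) by a rescaling of the variable.

```python
from fractions import Fraction

def chebychev(m):
    """Coefficients (lowest degree first) of T_m, normalized so that
    T_m(x + 1/x) = x^m + x^(-m). Uses T_0 = 2, T_1 = y and
    T_{k+1} = y*T_k - T_{k-1} (an assumption derived from the identity)."""
    previous, current = [2], [0, 1]
    if m == 0:
        return previous
    for _ in range(m - 1):
        shifted = [0] + current                                  # y * T_k
        padded = previous + [0] * (len(shifted) - len(previous))  # T_{k-1}
        previous, current = current, [a - b for a, b in zip(shifted, padded)]
    return current

def evaluate(coefficients, y):
    """Horner evaluation of a polynomial given lowest-degree-first coefficients."""
    result = 0
    for coefficient in reversed(coefficients):
        result = result * y + coefficient
    return result

# Exact check of the identity at a sample rational point.
x = Fraction(3, 2)
identity_holds = all(
    evaluate(chebychev(m), x + 1 / x) == x ** m + x ** (-m) for m in range(2, 8)
)
print(chebychev(2), chebychev(3), identity_holds)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point approximation.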
Theorem 1.6.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of polynomials defined over $\overline{\mathbb{Q}}$, and let $a_i x^{d_i}$ denote the leading term of $\phi_i$. Suppose that $S$ satisfies the following conditions:
(1) The set of degrees $\{d_1, \dots, d_s\}$ is a multiplicatively independent set in $\mathbb{Z}$.
(2) The set of leading coefficients $\{a_1, \dots, a_s\}$ is a multiplicatively independent set in $\overline{\mathbb{Q}}^{*}$.
(3) Each $\phi \in S$ is not of the form $F \circ E \circ L$ for some polynomial $F \in \overline{\mathbb{Q}}[x]$, some cyclic or Chebychev polynomial $E$, and some linear $L \in \overline{\mathbb{Q}}[x]$.
Then $M_S$ is a free monoid and for all $\epsilon > 0$ there exists an effectively computable positive constant $b = b(S,\epsilon)$ and a constant $B_S$ depending only on $S$ such that
\[ (\log B)^{b} \ll \#\{Q \in \mathrm{Orb}_S(P) : H(Q) \le B\} \ll (\log B)^{b+\epsilon} \]
holds for all $P \in \mathbb{P}^1(\overline{\mathbb{Q}})$ with $H(P) > B_S$.

We briefly outline the proofs of our results in dimension one above. The first step is to show that $M_S$ is free. For rational functions, this follows from the genus calculations in [20] and Picard's theorem. For polynomials, conditions (1) and (2) of Theorem 1.6 imply that $M_S$ is free; see Theorem 4.2 below. In particular, Theorem 1.1 implies the desired growth rate on the number of functions $f \in M_S$ with $H(f(P)) \le B$ in both cases. On the other hand, for rational functions the genus calculations in [20], Faltings' Theorem, and Tate's telescoping Lemma 2.2 imply that
\[ \#\{f \in M_S : f(P) = Q\} \tag{4} \]
is uniformly bounded for all $Q \in \mathrm{Orb}_S(P)$ of sufficiently large height; see Lemma 4.10 below. Likewise, the same property holds for polynomials by condition (3) of Theorem 1.6 and the integral point classification theorems in [3] and Appendix 5. From here, the desired estimate for orbits (for both rational and polynomial functions) follows from Theorem 1.1 and the uniform bounds on (4). We note that it is possible that the full classification theorems in [1, 3] can be used to strengthen the statement of Theorem 1.6, without reference to leading terms and degrees.
However, we have endeavored to give as self-contained and broadly applicable a statement as possible.

Acknowledgements:
We thank Yuri Bilu, Andrew Bridy, Alexander Evetts, Joseph Silverman, and Umberto Zannier for discussions related to this paper. We also thank the authors of [14]; Lemma 3.2 in their paper inspired the proof of Theorem 4.2. Finally, we are especially grateful to Umberto Zannier (again) for including the appendix to this paper.
2. Auxiliary results
To count points of bounded height in orbits, we recall some basic facts about heights and generating functions. However, as motivation for what is to come, we begin with a brief sketch of the proof of Theorem 1.1, an important ingredient for all other results in this paper. The basic idea, consistent with our earlier work on orbits attached to sequences in [10, 11], is that the logarithmic height of a point $f(P) \in \mathrm{Orb}_S(P)$ is roughly determined by the size of $\deg(f)$, as long as the initial point $P$ is sufficiently generic; see Lemma 2.2 below. With this in mind, to count the number of functions $f \in M_S$ with $\log H(f(P)) \le B$, we should in some sense simply be counting the number of $f$'s of bounded degree. In particular, when $M_S$ is a free monoid, we can relate the number of $f \in M_S$ with bounded degree to the number of restricted integer compositions of bounded size, once we approximate $\log\deg(\phi)$ for all $\phi \in S$ by rational numbers. Finally, we use generating functions (and the location of their poles via Lemma 2.6 and Lemma 2.5 below) to estimate the number of restricted integer compositions of bounded size. These facts together imply Theorem 1.1. With this sketch in place, we move on and review some basic facts about heights.

Remark 4. Since multiplicative heights tend to grow exponentially when evaluating functions, it is convenient to use the logarithmic height $h = \log \circ H$ (instead of $H$) to state certain height estimates in dynamics. However, since height-counting on varieties is usually done with multiplicative heights, we convert back to $H$ at the end of the proof of Theorem 1.1, to be consistent with similar results in the literature.

Suppose that $\phi : \mathbb{P}^N(\overline{\mathbb{Q}}) \to \mathbb{P}^N(\overline{\mathbb{Q}})$ is a morphism defined over $\overline{\mathbb{Q}}$ of degree $d_\phi$. Then it is well known that
\[ h(\phi(P)) = d_\phi\, h(P) + O_\phi(1) \quad \text{for all } P \in \mathbb{P}^N(\overline{\mathbb{Q}}); \tag{5} \]
see, for instance, [24, Theorem 3.11].
With this in mind, we let
\[ C(\phi) := \sup_{P \in \mathbb{P}^N(\overline{\mathbb{Q}})} \bigl| h(\phi(P)) - d_\phi\, h(P) \bigr| \tag{6} \]
be the smallest constant needed for the bound in (5). Then, in order to control height growth rates when composing arbitrary elements of a set of endomorphisms, we define the following fundamental notion; compare to [10, 11, 15].

Definition 2.1.
A set $S$ of endomorphisms of $\mathbb{P}^N(\overline{\mathbb{Q}})$ is called height controlled if the following properties hold:
(1) $d_S := \inf\{d_\phi : \phi \in S\}$ is at least $2$.
(2) $C_S := \sup\{C(\phi) : \phi \in S\}$ is finite.

Remark 5. We note first that any finite set of morphisms of degree at least $2$ is height controlled. To construct infinite collections, let $T$ be any set of non-constant maps on $\mathbb{P}^1$ and let $S_T = \{\phi \circ x^d : \phi \in T,\ d \ge 2\}$. Then $S_T$ is height controlled and infinite; a similar construction works for $\mathbb{P}^N$ in any dimension.

Remark 6. Although the results in this paper are for finite $S$, we include the notion of height controlled sets to motivate future work. For instance, many of the tools used below (canonical heights, generating functions, etc.) work perfectly well for infinite sets. However, the generating functions that appear in this case are not rational, which adds some subtlety.

As in the case of iterating a single function, it is Tate's telescoping Lemma (generalized below) that allows us to transfer information back and forth between heights and degrees; for a proof, see [10, Lemma 2.1].

Lemma 2.2.
Let $S$ be a height controlled set of endomorphisms of $\mathbb{P}^N(\overline{\mathbb{Q}})$, and let $d_S$ and $C_S$ be the corresponding height controlling constants. Then for all $f \in M_S$,
\[ \left| \frac{h(f(Q))}{\deg(f)} - h(Q) \right| \le \frac{C_S}{d_S - 1} \quad \text{for all } Q \in \mathbb{P}^N(\overline{\mathbb{Q}}). \]
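The inequality in Lemma 2.2 can be tested numerically on a small example. The sketch below is a hedged illustration rather than part of the argument: the two maps $x^2 + 1$ and $x^3$ on $\mathbb{P}^1$, the starting point $3/2$, and all function names are our own choices. It also assumes the constant $\log 2$ in place of $C_S$; this is justified by hand only for rational points, since for $a/b$ in lowest terms $x^2 + 1$ sends $a/b$ to $(a^2 + b^2)/b^2$, while $x^3$ multiplies the logarithmic height by exactly $3$.

```python
import math
from fractions import Fraction
from itertools import product

# Hypothetical generating set on P^1(Q): (map, degree) pairs chosen for illustration.
GENERATORS = [
    (lambda x: x * x + 1, 2),  # phi_1(x) = x^2 + 1
    (lambda x: x ** 3, 3),     # phi_2(x) = x^3
]
D_S = 2            # smallest degree occurring in the set
C_S = math.log(2)  # assumed bound for |h(phi(q)) - deg(phi) * h(q)| on rational q

def log_height(q):
    """Logarithmic Weil height of a rational number (Fraction keeps lowest terms)."""
    return math.log(max(abs(q.numerator), abs(q.denominator)))

def max_deviation(start, max_length):
    """Largest |h(f(Q))/deg(f) - h(Q)| over all words f of length <= max_length."""
    worst = 0.0
    for length in range(1, max_length + 1):
        for word in product(GENERATORS, repeat=length):
            value, degree = start, 1
            # f = theta_1 o ... o theta_n, so theta_n is applied first.
            for phi, d in reversed(word):
                value, degree = phi(value), degree * d
            worst = max(worst, abs(log_height(value) / degree - log_height(start)))
    return worst

deviation = max_deviation(Fraction(3, 2), 4)
telescoping_bound = C_S / (D_S - 1)
print(deviation, telescoping_bound)
```

The point of the check is that the deviation stays below $C_S/(d_S - 1)$ no matter how long the composition is, which is what allows heights to be traded for degrees.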
Now that we have a tool to pass from functions yielding a bounded height relation to functions of bounded degree (via Lemma 2.2), we next relate counting functions of bounded degree to counting restricted integer compositions; this is essentially achieved by the fact that $\log\deg(F \circ G) = \log\deg(F) + \log\deg(G)$ for all endomorphisms $F$ and $G$. However, to make this idea precise, we briefly discuss integer compositions, a classical object of study in combinatorics. For more details, see [6, §I.3.1].

Let $T \subseteq \mathbb{N}_{>0}$ be a collection of positive integers (not necessarily finite). Then a restricted composition of an integer $n$ with summands in $T$ (or a $T$-restricted composition of $n$) is an ordered collection of elements in $T$ whose sum is $n$. In particular, two compositions with the same summands listed in a different order are counted as different. Given $n$, let $f^T_n$ be the number of distinct ways of writing $n$ as a composition with summands (parts) in $T$. Then to give an asymptotic for $f^T_n$, one can try to understand the ordinary generating function $f^T(z) = \sum_n f^T_n z^n$. In particular, if in addition $f^T(z)$ is a rational or meromorphic function, then the radius of convergence of the generating function, determined by the poles of $f^T(z)$, can be used to deduce an asymptotic for $f^T_n$. Luckily, the generating functions for restricted compositions are particularly simple rational functions; see Proposition I.1 in [6].

Proposition 2.3.
The ordinary generating function of the number of compositions having summands restricted to a set $T \subseteq \mathbb{N}_{>0}$ is given by
\[ f^T(z) = \frac{1}{1 - \sum_{n \in T} z^n}. \]

As mentioned above, once we have an expression for $f^T(z)$ as a rational function, we can use the poles of $f^T(z)$ to estimate the $f^T_n$. Specifically, we have the following theorem, a simple consequence of partial fractions and Newton expansion. In what follows, if $F(z) = \sum_n a_n z^n$ is a power series expansion about $z = 0$ for a meromorphic function $F$, then we use the notation $[z^n]F(z) = a_n$ to extract coefficients.

Theorem 2.4 (Expansion of rational functions). If $F(z)$ is a rational function that is analytic at zero and has poles at points $\alpha_1, \alpha_2, \dots, \alpha_m$, then its coefficients (as a power series about $0$) are a sum of exponential-polynomials: there exist $m$ polynomials $\{\Pi_j(x)\}_{j=1}^{m}$ such that for $n$ larger than some fixed $n_0$,
\[ [z^n]F(z) = \sum_{j=1}^{m} \Pi_j(n)\, \alpha_j^{-n}. \]
Furthermore, the degree of $\Pi_j$ is equal to the order of the pole of $F$ at $\alpha_j$ minus one.

In particular, after combining Proposition 2.3 and Theorem 2.4, we see that to obtain an asymptotic formula for the number of integer compositions whose parts are restricted to the set $\{n_1, \dots, n_s\}$, we must control the roots of smallest modulus of $g(z) = 1 - (z^{n_1} + \dots + z^{n_s})$. With this in mind, we have the following elementary result.

Lemma 2.5.
Let $n_1, n_2, \dots, n_s$ be positive integers satisfying $\gcd(n_1, n_2, \dots, n_s) = 1$. Then the polynomial $g(z) = 1 - (z^{n_1} + z^{n_2} + \dots + z^{n_s})$ has a unique complex root $\alpha$ of smallest modulus. Moreover, $\alpha$ is the unique positive real root of $g$, and $\alpha$ has multiplicity one.

Proof. We first show that any positive real root $\alpha$ of $g$ is a root of smallest modulus for $g$ (clearly $g$ has a positive root by the Intermediate Value Theorem). This is a simple consequence of Rouché's Theorem: let $r < \alpha$, let $p(z) = -1 - z^{n_1} - \dots - z^{n_s}$, and let $q(z) = 2$. Then for all $|z| = r$, we have that
\[ |p(z)| = |-1 - z^{n_1} - \dots - z^{n_s}| \le 1 + |z|^{n_1} + \dots + |z|^{n_s} = 1 + r^{n_1} + \dots + r^{n_s} < 1 + \alpha^{n_1} + \dots + \alpha^{n_s} = 2 - (1 - \alpha^{n_1} - \dots - \alpha^{n_s}) = 2 = |q(z)| \]
by the triangle inequality and since $\alpha$ is a root of $g$. In particular, $p$ and $q$ are holomorphic functions on the disc $D_r$ of radius $r$ such that $|p(z)| < |q(z)|$ on the boundary $\partial D_r$. Hence, Rouché's Theorem implies that $q$ and $q + p = g$ have the same number of roots inside $D_r$. Therefore, $g$ has no complex roots in $D_r$, and $\alpha$ is a root of smallest modulus for $g$. On the other hand, it is clear that $g$ restricted to the positive real numbers is strictly decreasing.
Hence, $g$ has only one positive real root. Likewise, it is easy to see that $g'(\alpha) < 0$ (since $\alpha$ is positive). Hence, $\alpha$ must be a root of multiplicity one for $g$.

We next show that $\alpha$ is the unique complex root of $g$ of smallest modulus. This portion of the proof of Lemma 2.5 follows from results and arguments in [6, IV.6], namely the "Daffodil Lemma" [6, IV.1] and the proof of [6, Proposition IV.3] on the commensurability of dominant directions for rational generating functions arising from regular languages. To see this, suppose that $\zeta = \alpha e^{i\theta}$ is another root of smallest modulus of $g$. Let $f(z) = z^{n_1} + \dots + z^{n_s}$, so that $\zeta$ satisfies $|f(\zeta)| = |1| = 1 = f(\alpha) = f(|\zeta|)$. In particular, [6, Lemma IV.1] implies that $\theta = 2\pi r/p$ for some integers $0 \le r < p$ with $\gcd(r,p) = 1$ (when $r \ne 0$). Moreover, $f$ admits $p$ as a span; see [6, Definition IV.5]. In particular (since $f$ admits $p$ as a span), $f(z) = z^{a} h(z^{p})$ for some polynomial $h$ and some non-negative integer $a$. Note also that $\gcd(a,p) = 1$, since $\gcd(n_1, \dots, n_s) = 1$ by assumption. On the other hand,
\[ f(\zeta) = \zeta^{a} h(\zeta^{p}) = (\alpha e^{2\pi i r/p})^{a}\, h\bigl((\alpha e^{2\pi i r/p})^{p}\bigr) = e^{2\pi i a r/p}\, \alpha^{a} h(\alpha^{p}) = e^{2\pi i a r/p} f(\alpha) = e^{2\pi i a r/p}. \]
Since $f(\zeta) = 1$, it follows that $ar/p \in \mathbb{Z}$. But this is impossible unless $r = 0$, since $\gcd(ar, p) = 1$ otherwise. In particular, $\zeta = \alpha$ and $\alpha$ is the unique complex root of $g$ of smallest modulus as claimed. □

Lastly, we include a technical result that allows us to approximate the number of bounded compositions whose parts are restricted to the set of non-integers $\{\log\deg(\phi_1), \dots, \log\deg(\phi_s)\}$, a task that is equivalent to counting the number of functions in $M_S$ of bounded degree, by integer compositions whose parts satisfy the gcd condition needed to apply Lemma 2.5.

Lemma 2.6.
Let $c_1 < c_2 < \dots < c_s$ be distinct positive real numbers. Then for all $\delta > 0$ there exist positive integers $n_1, \dots, n_s$, $m_1, \dots, m_s$ and $u$ such that the following conditions hold:
(1) $c_i - \delta \le \dfrac{n_i}{u} < c_i < \dfrac{m_i}{u} \le c_i + \delta$.
(2) $\gcd(n_1, \dots, n_s) = 1 = \gcd(m_1, \dots, m_s)$.

Remark 7. In particular, we may assume that $n_1 < \dots < n_s$ and $m_1 < \dots < m_s$ by choosing $\delta$ sufficiently small.

Proof.
Clearly integers $n_1, \dots, n_s$, $m_1, \dots, m_s$ and $u$ satisfying condition (1) of Lemma 2.6 exist. Therefore, to find integers satisfying both (1) and (2), we choose integers satisfying (1) and deform them to ensure that both conditions hold. Specifically, fix an integer $r > 0$, let $v = (n_1 \cdots n_s\, m_1 \cdots m_s\, u)^{r}$, and define a new list as follows:
\[ n_1' = n_1 v + 1, \quad n_i' = n_i v, \quad m_1' = m_1 v + 1, \quad m_i' = m_i v, \quad u' = u v \quad \text{for all } i \ne 1. \tag{7} \]
In particular, we note that $\frac{n_i'}{u'} = \frac{n_i}{u}$ and $\frac{m_i'}{u'} = \frac{m_i}{u}$ for all $i \ge 2$ and that
\[ \frac{n_1'}{u'} = \frac{n_1}{u} + \frac{1}{uv} \quad \text{and} \quad \frac{m_1'}{u'} = \frac{m_1}{u} + \frac{1}{uv}. \]
Therefore, we may certainly choose $r$ sufficiently large so that $n_1', \dots, n_s'$, $m_1', \dots, m_s'$ and $u'$ satisfy condition (1), since the original sequence does. On the other hand, it is easy to see that $\gcd(n_1', n_i') = 1$ and $\gcd(m_1', m_i') = 1$ for all $i \ge 2$ by construction. For instance, suppose that $p$ is a prime such that $p \mid n_1'$ and $p \mid n_i'$ for some $i \ge 2$. Then since $p \mid n_i' = n_i v$, we see that $p \mid v$ or $p \mid n_i$. But if $p \mid v$, then $p \mid n_1 v$ and $p \mid n_1'$. In particular, $p \mid (n_1' - n_1 v) = 1$ by (7), a contradiction. Likewise, if $p \mid n_i$, then $p \mid v$ by definition of $v$. Therefore, we may repeat the argument above to reach a contradiction. Similarly, the fact that $\gcd(m_1', m_i') = 1$ holds for all $i \ge 2$ follows mutatis mutandis. In particular, we see that both conditions (1) and (2) of Lemma 2.6 hold for the new list $n_1', \dots, n_s'$, $m_1', \dots, m_s'$ and $u'$, which completes the proof. □

3. Height-counting in orbits

With the necessary background in place, we are ready to prove the bounds on the number of functions $f \in M_S$ yielding a bounded height relation from the Introduction.

(Proof of Theorem 1.1). Let $S = \{\phi_1, \dots, \phi_s\}$ be a finite set of endomorphisms on $\mathbb{P}^N$ all of degree at least $2$, and suppose that the monoid $M_S$ generated by $S$ under composition is free. We begin by defining some lengths on $M_S$, which we then relate to integer compositions.
Given any vector $\mathbf{v} = (v_1, \dots, v_s) \in \mathbb{R}^{s}_{>0}$ of positive real weights, we define $l_{S,\mathbf{v}}(\phi_i) = v_i$ for $\phi_i \in S$ and extend $l_{S,\mathbf{v}}$ to all $f \in M_S$ by:
\[ l_{S,\mathbf{v}}(f) = \sum_{j=1}^{n} l_{S,\mathbf{v}}(\theta_j), \quad \text{where } f = \theta_1 \circ \theta_2 \circ \dots \circ \theta_n \text{ for some } \theta_j \in S. \tag{8} \]

Remark 8. Note that since $S$ is a free basis of $M_S$ there is a unique way to write $f$ as a composition of elements of $S$. In particular, $l_{S,\mathbf{v}}$ is a well-defined function. Alternatively, in the non-free case one can define $l_{S,\mathbf{v}}(f)$ by taking an inf over the possible expressions in (8).

On the other hand, since $M_S$ is a set of functions there is a natural choice of weighting given by $\mathbf{c} = (c_1, \dots, c_s)$ where $c_i = \log\deg(\phi_i)$; moreover, we assume $c_1 < c_2 < \dots < c_s$. In particular, it follows from the fact that $\deg(F \circ G) = \deg(F) \cdot \deg(G)$ for morphisms that
\[ l_{S,\mathbf{c}}(f) = \log\deg(f) \quad \text{for all } f \in M_S, \tag{9} \]
independent of the generating set. However, non-integer weights (like logs of integers) appear sparingly in the literature, and so we approximate the growth rate of $l_{S,\mathbf{c}}$ (which relates to the growth rate of heights in orbits via Tate's telescoping argument) using integer weights.

To wit, choose positive integers $n_1, \dots, n_s$, $m_1, \dots, m_s$ and $u$ depending on $\delta$ as in Lemma 2.6 and Remark 7. Then it follows by construction that $u^{-1}\, l_{S,\mathbf{n}}(f) \le l_{S,\mathbf{c}}(f) \le u^{-1}\, l_{S,\mathbf{m}}(f)$ for all $f \in M_S$. Hence,
\[ \{f \in M_S : l_{S,\mathbf{m}}(f) \le uB\} \subseteq \{f \in M_S : l_{S,\mathbf{c}}(f) \le B\} \subseteq \{f \in M_S : l_{S,\mathbf{n}}(f) \le uB\} \tag{10} \]
holds for all positive $B$; here $\mathbf{n} = (n_1, \dots, n_s)$ and $\mathbf{m} = (m_1, \dots, m_s)$. Now given a positive integer $n$ we define
\[ L_n := \#\{f \in M_S : l_{S,\mathbf{m}}(f) = n\} \quad \text{and} \quad U_n := \#\{f \in M_S : l_{S,\mathbf{n}}(f) = n\}. \tag{11} \]
In particular, since $\mathbf{n}$ and $\mathbf{m}$ are integer weight vectors, it follows from (9), (10) and (11) that
\[ \sum_{n=0}^{[uB]} L_n \le \#\{f \in M_S : \log\deg(f) \le B\} \le \sum_{n=0}^{[uB]} U_n. \tag{12} \]
Here $[uB]$ denotes the nearest integer to $uB$.
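The sandwich in (12) can be seen concretely for a free monoid on two generators of degrees $2$ and $3$. The sketch below is only an illustration under stated assumptions: the integer weights $\mathbf{n} = (4,7)$, $\mathbf{m} = (5,8)$ and $u = 7$ are our own choice (they satisfy $n_i/u < \log d_i < m_i/u$ and both gcd conditions), $B$ is taken to be an integer so that $uB$ needs no rounding, and the function names are hypothetical. Elements of the free monoid are counted as words, and words of integer weight $n$ are counted as restricted compositions of $n$.

```python
import math
from math import comb, gcd

def compositions_up_to(parts, total):
    """Number of compositions with parts in `parts` and sum <= total,
    using f_0 = 1 and f_n = sum of f_{n-t} over the allowed parts t."""
    f = [0] * (total + 1)
    f[0] = 1  # the empty composition corresponds to the identity map
    for n in range(1, total + 1):
        f[n] = sum(f[n - t] for t in parts if t <= n)
    return sum(f)

def words_of_bounded_log_degree(degrees, bound):
    """Number of words in two free generators whose degree product has log <= bound.
    A word with a copies of the first letter and b of the second is counted
    once for each of its C(a+b, a) arrangements."""
    c1, c2 = (math.log(d) for d in degrees)
    total, a = 0, 0
    while a * c1 <= bound:
        b = 0
        while a * c1 + b * c2 <= bound:
            total += comb(a + b, a)
            b += 1
        a += 1
    return total

degrees = (2, 3)
n_weights, m_weights, u = (4, 7), (5, 8), 7  # illustrative choice, not from the paper
B = 5

# Conditions (1) and (2) of Lemma 2.6 for this choice of weights.
weights_ok = all(
    n / u < math.log(d) < m / u for n, d, m in zip(n_weights, degrees, m_weights)
) and gcd(*n_weights) == 1 and gcd(*m_weights) == 1

lower = compositions_up_to(m_weights, u * B)      # left side of (12)
exact = words_of_bounded_log_degree(degrees, B)   # middle of (12)
upper = compositions_up_to(n_weights, u * B)      # right side of (12)
print(weights_ok, lower, exact, upper)
```

Tightening the rational approximations (a smaller $\delta$ in Lemma 2.6) narrows the gap between the two composition counts, at the cost of larger integer weights.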
On the other hand, since $S$ generates $M_S$ as a free monoid, we can identify $M_S$ with the set of finite sequences of elements of $S$. In particular, $L_n$ (respectively $U_n$) represents the number of ways of writing $n$ as the sum of a sequence of elements in $\{m_1, \dots, m_s\}$ (respectively in $\{n_1, \dots, n_s\}$). Such sequences have been extensively studied in combinatorics [6, §I.3.1] and are called restricted integer compositions. Specifically, generating functions for these compositions are known; see Proposition 2.3 above. In particular,
\[ L_n = [z^n]\, \frac{1}{1 - (z^{m_1} + \dots + z^{m_s})} \quad \text{and} \quad U_n = [z^n]\, \frac{1}{1 - (z^{n_1} + \dots + z^{n_s})}. \tag{13} \]
As a reminder, $[z^n]F(z)$ denotes the operation of extracting the coefficient of $z^n$ in the formal power series $F(z) = \sum f_n z^n$; see [6, p.19]. On the other hand, since $\gcd(n_1, \dots, n_s) = 1$ and $\gcd(m_1, \dots, m_s) = 1$ by construction, Lemma 2.5 implies that both of the rational functions in (13) have unique poles of smallest modulus (and these poles are positive real numbers of multiplicity one). Let $\alpha_1, \dots, \alpha_r$ be the roots of $g_{\mathbf{n}}(z) = 1 - (z^{n_1} + \dots + z^{n_s})$ arranged in increasing order of modulus and let $\beta_1, \dots, \beta_r$ be the roots of $g_{\mathbf{m}}(z) = 1 - (z^{m_1} + \dots + z^{m_s})$ arranged in increasing order of modulus. Then Theorem 2.4 and (13) together imply that
\[ L_n = \kappa\, \beta_1^{-n} + p_2(n)\, \beta_2^{-n} + \dots + p_r(n)\, \beta_r^{-n} \quad \text{and} \quad U_n = \tau\, \alpha_1^{-n} + q_2(n)\, \alpha_2^{-n} + \dots + q_r(n)\, \alpha_r^{-n} \tag{14} \]
for some constants $\kappa$ and $\tau$ and some polynomials $p_i, q_j \in \mathbb{C}[z]$. Explicitly,
\[ \kappa = \frac{-1}{\beta_1\, g_{\mathbf{m}}'(\beta_1)} \quad \text{and} \quad \tau = \frac{-1}{\alpha_1\, g_{\mathbf{n}}'(\alpha_1)}. \tag{15} \]
Here we use the residue method for extracting partial fraction coefficients and Newton's expansion; see the proof of [6, Theorem IV.9]. Moreover, the expressions in (14) and (15) hold
simultaneously for all $n > n_0$ for some constant $n_0 \in \mathbb{N}$. In particular, by summing (14) and using the triangle inequality (for both sums and differences) we see that
\[ \kappa_1 \beta_1^{-m} - \kappa_2\, m^{r_1} |\beta_2|^{-m} - \kappa_3 \le \sum_{n=0}^{m} L_n \quad \text{and} \quad \sum_{n=0}^{m} U_n \le \tau_1 \alpha_1^{-m} + \tau_2\, m^{r_2} |\alpha_2|^{-m} + \tau_3 \tag{16} \]
holds for all $m$ sufficiently large. Again, in the interest of being as explicit as possible (at least for the main terms), we have that
\[ \kappa_1 = \frac{\kappa\, (\beta_1)^{-1}}{(\beta_1)^{-1} - 1} = \frac{-1}{\beta_1 (1 - \beta_1)\, g_{\mathbf{m}}'(\beta_1)} \quad \text{and} \quad \tau_1 = \frac{\tau\, (\alpha_1)^{-1}}{(\alpha_1)^{-1} - 1} = \frac{-1}{\alpha_1 (1 - \alpha_1)\, g_{\mathbf{n}}'(\alpha_1)}, \tag{17} \]
obtained by summing the corresponding geometric series. Moreover, $r_1$ (respectively $r_2$) is the maximum of the multiplicities of the roots of $g_{\mathbf{m}}$ (respectively $g_{\mathbf{n}}$) minus one. Hence, after taking $m = [Bu]$, combining (12) and (16), and absorbing $u$ into the relevant constants, we see that
\[ \kappa_4 C_1^{B} - \kappa_5 B^{r_1} C_2^{B} - \kappa_6 \le \#\{f \in M_S : \log\deg(f) \le B\} \le \tau_4 C_3^{B} + \tau_5 B^{r_2} C_4^{B} + \tau_6 \tag{18} \]
holds for all $B$ sufficiently large; here we use also that $Bu - 1 \le [Bu] \le Bu + 1$, so that some of the relevant constants are given explicitly by
\[ \begin{aligned} C_1 &= \frac{1}{\beta_1^{u}}, & \kappa_4 &= \kappa_1 \beta_1 = \frac{1}{(\beta_1 - 1)\, g_{\mathbf{m}}'(\beta_1)}, & C_2 &= \frac{1}{|\beta_2|^{u}}, \\ C_3 &= \frac{1}{\alpha_1^{u}}, & \tau_4 &= \frac{\tau_1}{\alpha_1} = \frac{1}{\alpha_1^{2} (\alpha_1 - 1)\, g_{\mathbf{n}}'(\alpha_1)}, & C_4 &= \frac{1}{|\alpha_2|^{u}}. \end{aligned} \tag{19} \]
We note in particular that $C_1 > C_2$ and $C_3 > C_4$, since $\beta_1 < |\beta_2|$ and $\alpha_1 < |\alpha_2|$ by construction. Now suppose that $P \in \mathbb{P}^N(\overline{\mathbb{Q}})$ is such that $h(P) > b_S := C_S/(d_S - 1)$, where $C_S$ and $d_S$ are the constants from Definition 2.1 above. Then Tate's telescoping Lemma 2.2 implies that
\[ \deg(f)\,(h(P) - b_S) \le h(f(P)) \le \deg(f)\,(h(P) + b_S). \]
Therefore, for all $B$ we have the subset relations:
\[ \Bigl\{ f \in M_S : \log\deg(f) \le \log\Bigl( \tfrac{B}{h(P) + b_S} \Bigr) \Bigr\} \subseteq \bigl\{ f \in M_S : h(f(P)) \le B \bigr\} \subseteq \Bigl\{ f \in M_S : \log\deg(f) \le \log\Bigl( \tfrac{B}{h(P) - b_S} \Bigr) \Bigr\}. \tag{20} \]
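Before the degree count is converted into a height count, the leading term in (14) and (15) and the growth rates $C_1 = \beta_1^{-u}$ and $C_3 = \alpha_1^{-u}$ from (19) can be checked numerically. The sketch below is a hedged illustration: the weights $\mathbf{n} = (4,7)$, $\mathbf{m} = (5,8)$ and $u = 7$ are our own choice for degrees $2$ and $3$, the positive roots are located by bisection (any root finder would do), and the function names are hypothetical.

```python
import math

def composition_counts(parts, n_max):
    """f_n = number of compositions of n with parts in `parts`, for 0 <= n <= n_max."""
    f = [0] * (n_max + 1)
    f[0] = 1
    for n in range(1, n_max + 1):
        f[n] = sum(f[n - t] for t in parts if t <= n)
    return f

def positive_root(parts):
    """Positive real root of g(z) = 1 - sum(z^t), found by bisection on (0, 1).
    g is strictly decreasing there, with g(0) = 1 > 0 and g(1) <= -1 < 0
    when there are at least two parts."""
    low, high = 0.0, 1.0
    for _ in range(200):
        mid = (low + high) / 2
        if 1 - sum(mid ** t for t in parts) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2

def leading_constant(parts, root):
    """The constant -1 / (root * g'(root)) from (15)."""
    g_prime = -sum(t * root ** (t - 1) for t in parts)
    return -1.0 / (root * g_prime)

n_weights, m_weights, u = (4, 7), (5, 8), 7  # illustrative choice, not from the paper
alpha = positive_root(n_weights)
beta = positive_root(m_weights)
kappa = leading_constant(m_weights, beta)

N = 800
L = composition_counts(m_weights, N)
ratio = L[N] / (kappa * beta ** (-N))  # should be close to 1 for large N

log_C1 = -u * math.log(beta)   # exponent attached to the lower bound
log_C3 = -u * math.log(alpha)  # exponent attached to the upper bound
print(alpha, beta, ratio, log_C1, log_C3)
```

For this choice the two exponents differ noticeably; Lemma 2.6 with a smaller $\delta$ produces weights for which they are closer together.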
In particular, if we replace $B$ with $\log(B/(h(P) + b_S))$ on the left side of (18), replace $B$ with $\log(B/(h(P) - b_S))$ on the right side of (18), and apply the change of base formulas for logarithms, then we deduce from (18) and (20) that
\[ \begin{aligned} &\kappa_4 \Bigl( \tfrac{B}{h(P) + b_S} \Bigr)^{\log(C_1)} - \kappa_5 \log\Bigl( \tfrac{B}{h(P) + b_S} \Bigr)^{r_1} \Bigl( \tfrac{B}{h(P) + b_S} \Bigr)^{\log(C_2)} - \kappa_6 \le \#\bigl\{ f \in M_S : h(f(P)) \le B \bigr\} \\ &\qquad \le \tau_4 \Bigl( \tfrac{B}{h(P) - b_S} \Bigr)^{\log(C_3)} + \tau_5 \log\Bigl( \tfrac{B}{h(P) - b_S} \Bigr)^{r_2} \Bigl( \tfrac{B}{h(P) - b_S} \Bigr)^{\log(C_4)} + \tau_6 \end{aligned} \tag{21} \]
holds for all $B$ sufficiently large and all initial points $P$ such that $h(P) > b_S$. Moreover, since most height counting problems on varieties are stated in terms of multiplicative heights, we replace $B$ with $\log B$ in (21) to obtain
\[ \begin{aligned} &\Bigl( \tfrac{\kappa_4}{(h(P) + b_S)^{\log(C_1)}} \Bigr) \log(B)^{\log(C_1)} - \Bigl( \tfrac{\kappa_5}{(h(P) + b_S)^{\log(C_2)}} \Bigr) \log\Bigl( \tfrac{\log B}{h(P) + b_S} \Bigr)^{r_1} \log(B)^{\log(C_2)} - \kappa_6 \le \#\bigl\{ f \in M_S : H(f(P)) \le B \bigr\} \\ &\qquad \le \Bigl( \tfrac{\tau_4}{(h(P) - b_S)^{\log(C_3)}} \Bigr) \log(B)^{\log(C_3)} + \Bigl( \tfrac{\tau_5}{(h(P) - b_S)^{\log(C_4)}} \Bigr) \log\Bigl( \tfrac{\log B}{h(P) - b_S} \Bigr)^{r_2} \log(B)^{\log(C_4)} + \tau_6. \end{aligned} \tag{22} \]
Hence, after renaming the constants above, we see that there exist positive constants $a_1(S,P,\delta)$, $a_2(S,P,\delta)$, $b_1(S,\delta)$, $b_2(S,\delta)$ and $B_S := e^{b_S}$ such that
\[ a_1 (\log B)^{b_1} + o\bigl( (\log B)^{b_1} \bigr) \le \#\{f \in M_S : H(f(P)) \le B\} \le a_2 (\log B)^{b_2} + o\bigl( (\log B)^{b_2} \bigr) \tag{23} \]
holds for all $P \in \mathbb{P}^N(\overline{\mathbb{Q}})$ with $H(P) > B_S$. Moreover, $b_1$ and $b_2$ depend only on the set $S$ and $\delta$, and $a_1$ and $a_2$ (and the lower order terms) depend on $S$, $\delta$ and $P$. Specifically, (15), (17) and (19) together imply
\[ \begin{aligned} a_1 &= \frac{1}{(\beta_1 - 1)\, g_{\mathbf{m}}'(\beta_1)\, \log\bigl( B_S H(P) \bigr)^{\log(\beta_1^{-u})}}, & b_1 &= \log(\beta_1^{-u}), \\ a_2 &= \frac{1}{\alpha_1^{2} (\alpha_1 - 1)\, g_{\mathbf{n}}'(\alpha_1)\, \log\bigl( H(P)/B_S \bigr)^{\log(\alpha_1^{-u})}}, & b_2 &= \log(\alpha_1^{-u}). \end{aligned} \tag{24} \]
Moreover, since roots of polynomials can be approximated to any accuracy effectively, $b_1$ and $b_2$ can be computed effectively (also, integers as in Lemma 2.6 can be produced effectively for all $\delta$). Therefore, to complete the proof of Theorem 1.1, we need only show that the difference $b_2 - b_1 > 0$ can be made arbitrarily small (by letting $\delta$ go to zero); see (31) below. Then setting $b = b_1$, we have $b_2 \le b + \epsilon$ for all $\delta$ sufficiently small, which gives the claim in Theorem 1.1.

To do this, we use the Mean Value Theorem applied to the functions $f(x) = -g_{\mathbf{m}}(x)$ and $h(x) = u\log(x)$ on the interval $[\alpha_1, \beta_1]$. With this in mind, we begin with a few estimates, all of which follow easily from part (1) of Lemma 2.6:
\[ \frac{\delta}{c_1} < \frac{\delta u}{n_1} \le \frac{\delta}{c_1 - \delta}, \qquad 1 < \frac{m_1}{n_1} \le \frac{c_1 + \delta}{c_1 - \delta}, \qquad \frac{u}{m_1} < \frac{1}{c_1}. \tag{25} \]
To simplify the expressions that follow, let $\alpha = \alpha_1$ and $\beta = \beta_1$. Then since $n_1 \le n_i$ and $0 < \alpha < 1$, we see that $1 = \sum_{i=1}^{s} \alpha^{n_i} \le s\,\alpha^{n_1}$. Therefore,
\[ \Bigl( \frac{1}{s} \Bigr)^{\frac{1}{n_1}} \le \alpha. \tag{26} \]
In particular, (25) and (26) together imply the following lower bound on the derivative:
\[ f'(\alpha) = m_s \alpha^{m_s - 1} + \dots + m_1 \alpha^{m_1 - 1} \ge m_1 \alpha^{m_1 - 1} \ge m_1 \alpha^{m_1} \ge m_1 \Bigl( \frac{1}{s} \Bigr)^{\frac{m_1}{n_1}} \ge m_1 \Bigl( \frac{1}{s} \Bigr)^{\frac{c_1 + \delta}{c_1 - \delta}}. \tag{27} \]
Similarly, (25) and (26) together imply that:
\[ \begin{aligned} f(\alpha) &= \alpha^{(\frac{m_s}{u} - \frac{n_s}{u})u} \cdot \alpha^{n_s} + \dots + \alpha^{(\frac{m_1}{u} - \frac{n_1}{u})u} \cdot \alpha^{n_1} - 1 \ge \alpha^{2\delta u} \cdot \alpha^{n_s} + \dots + \alpha^{2\delta u} \cdot \alpha^{n_1} - 1 \\ &= \alpha^{2\delta u} (\alpha^{n_s} + \dots + \alpha^{n_1}) - 1 = \alpha^{2\delta u} - 1 \ge \Bigl( \frac{1}{s} \Bigr)^{\frac{2\delta u}{n_1}} - 1 \ge \Bigl( \frac{1}{s} \Bigr)^{\frac{2\delta}{c_1 - \delta}} - 1. \end{aligned} \]
Here, we use also that $0 \le \frac{m_i}{u} - \frac{n_i}{u} \le 2\delta$ by construction; see Lemma 2.6 part (1). In particular, we deduce the following key upper bound:
\[ -f(\alpha) \le 1 - \Bigl( \frac{1}{s} \Bigr)^{\frac{2\delta}{c_1 - \delta}}. \tag{28} \]
We are now ready to apply the Mean Value Theorem to $f(x)$ on $[\alpha, \beta]$. Specifically,
\[ m_1 \Bigl( \frac{1}{s} \Bigr)^{\frac{c_1 + \delta}{c_1 - \delta}} \le f'(\alpha) = \min_{\alpha \le x \le \beta} f'(x) \le \frac{f(\beta) - f(\alpha)}{\beta - \alpha} = \frac{-f(\alpha)}{\beta - \alpha} \le \frac{1 - (\frac{1}{s})^{\frac{2\delta}{c_1 - \delta}}}{\beta - \alpha} \]
follows from (27), (28), and the Mean Value Theorem; here $f(\beta) = 0$ since $\beta$ is a root of $g_{\mathbf{m}}$. Therefore, we have the estimate:
\[ 0 \le \beta - \alpha \le \frac{1 - (\frac{1}{s})^{\frac{2\delta}{c_1 - \delta}}}{m_1 (\frac{1}{s})^{\frac{c_1 + \delta}{c_1 - \delta}}}. \tag{29} \]
Likewise, the Mean Value Theorem for h ( x ) = u log( x ) on [ α, β ] , (26), and the fact that n > together yield(30) ≤ h ( β ) − h ( α ) β − α ≤ max α ≤ x ≤ β h ′ ( x ) = h ′ ( α ) = uα − ≤ su. Hence, after combining (24),(25), (29) and (30), we deduce that ≤ b − b = h ( β ) − h ( α ) ≤ su · − ( s ) δc − δ m ( s ) c δc − δ = s · um · − ( s ) δc − δ ( s ) c δc − δ ≤ sc · − ( s ) δc − δ ( s ) c δc − δ (31)However, the upper bound in (31) goes to zero as δ goes to zero. Therefore, the exponents b and b in (23) can be made arbitrarily close. (cid:3) Remark . If S has only two maps ( s = 2 ), then the trinomials g n ( z ) = 1 − z n − z n and g m ( z ) = 1 − z m − z m must have non-zero discriminant (in fact, here we need only that n = n and m = m , making no assumptions on gcd’s); this fact follows easily from thediscriminant formula in [9, Theorem 4]. In particular, r and r from (18) and (22) must bezero. Hence, we obtain simpler bounds for the number of functions of bounded degree (hence,also for the number of points of bounded height in orbits). For instance, κ C B − κ C B − κ ≤ { f ∈ M S : log deg( f ) ≤ B } ≤ τ C B + τ C B + τ holds for all B sufficiently large. Example . In particular, if S = { φ , φ } with deg( φ ) = 2 and deg( φ ) = 3 , then we use thecrude approximations < log(2) < and < log(3) < as inputs to Lemma 2.6 to obtain some explicit bounds for Theorem 1.1. Specifically, . B S H ( P )) . ! log( B ) . + o (cid:0) log( B ) . (cid:1) ≤ { f ∈ M S : H ( f ( P )) ≤ B }≤ . (cid:0) H ( P ) BS (cid:1) . ! log( B ) . + o (cid:0) log( B ) . (cid:1) holds for all P ∈ P N ( Q ) of sufficiently large height; here we use (19), (22) and Magma [5] toapproximate roots of polynomials.Lastly, we can use the bounds in Theorem 1.1 on the number of functions in free monoidssatisfying a bounded height relation to give an upper bound on the number of points ofbounded height in arbitrary monoid orbits. (Proof of Corollary 1.2).
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of endomorphisms all of degree at least $2$. If $s = 1$ (i.e., $S = \{\phi\}$ contains just one map), then one may use the canonical height [24, §3.4] associated to $\phi$ to reach the desired bound. Namely, the fact that $|\hat{h}_\phi - h| \le c_\phi$ and that $\hat{h}_\phi(\phi^n(P)) = d_\phi^{\,n}\, \hat{h}_\phi(P)$ together imply that
$$\Big\{ n : n \le \log_{d_\phi}\Big(\frac{\log(B) - c_\phi}{\hat{h}_\phi(P)}\Big) \Big\} \subseteq \big\{ Q \in \mathrm{Orb}_\phi(P) : H(Q) \le B \big\} \subseteq \Big\{ n : n \le \log_{d_\phi}\Big(\frac{\log(B) + c_\phi}{\hat{h}_\phi(P)}\Big) \Big\}$$
for all non-preperiodic $P$ (here an index $n$ is identified with the orbit point $\phi^n(P)$). On the other hand, if $P$ is preperiodic, then $\mathrm{Orb}_\phi(P)$ is finite. In particular, the number of points with (multiplicative) height at most $B$ is certainly bounded above by a constant times $\log\log(B) \ll \log(B)$ as claimed; hence, $b = 1$ in this case.
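The $s = 1$ case above can be illustrated with a small experiment. The following is only a sketch under stated assumptions: the map $\phi(x) = x^2$, the base point $P = 2$, and the helper names `orbit_count` and `predicted_count` are all hypothetical choices made for illustration and do not come from the text. For a positive integer $x$ the multiplicative height is $H(x) = x$, so the orbit of $2$ is $2^{2^n}$ and the count should grow like $\log_2 \log_2 B$.

```python
def orbit_count(B, start=2):
    """Count n >= 0 with H(phi^n(start)) <= B for the assumed map phi(x) = x^2.

    For a positive integer x the multiplicative height is H(x) = x.
    """
    count, x = 0, start
    while x <= B:
        count += 1
        x = x * x  # apply phi
    return count


def predicted_count(B):
    """Closed form for start = 2: the largest n with 2^(2^n) <= B, plus one.

    Uses integer bit lengths instead of floating-point logarithms.
    """
    if B < 2:
        return 0
    log2_B = B.bit_length() - 1        # floor(log2(B))
    n = log2_B.bit_length() - 1        # floor(log2(floor(log2(B))))
    return n + 1


if __name__ == "__main__":
    for B in (10, 10**3, 10**6, 10**12):
        print(B, orbit_count(B), predicted_count(B))
```

In this toy case the two functions agree, which matches the doubly logarithmic growth coming from the canonical height estimate.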
Now assume that s ≥ , and let F S be the free monoid generated by S under concatenation.Then, given a word w = θ . . . θ n ∈ F S , we can define an action of w on P N ( Q ) via w · P = θ ◦ · · · ◦ θ n ( P ) . Likewise, we define the degree of w to be deg( θ ◦ · · · ◦ θ n ) . In particular, (bycounting words of bounded degree) it is straightforward to see that we can replace M S with F S in the proof of Theorem 1.1 and deduce that a log( B ) b + o (cid:0) log( B ) b (cid:1) ≤ { w ∈ F S : H ( w · P ) ≤ B } ≤ a log( B ) b + o (cid:0) log( B ) b (cid:1) for some constants a ( P ) , a ( P ) , b and b (whenever H ( P ) > B S , as before); here we canchoose δ = 0 . , small enough to separate logs of distinct integers (see Remark 7). In particular,since every point Q ∈ Orb S ( P ) is of the form Q = w · P for some w ∈ F S , we have that { Q ∈ Orb S ( P ) : H ( Q ) ≤ B } ≤ { w ∈ F S : H ( w · P ) ≤ B } ≤ a log( B ) b + o (cid:0) log( B ) b (cid:1) . Therefore, the number of points in
Orb S ( P ) with height at most B is ≪ log( B ) b . Concretely,by choosing δ = 0 . we get the crude bound b ≤ c − . log( s ) from (24) and (26). (cid:3) Remark . It is likely that the statement and proof of Theorem 1.1 hold for height controlledsets of simultaneously polarizable maps on any projective variety. The main arithmetic in-gredient, Tate’s telescoping Lemma 2.2, works perfectly well with this level of generality; see[10, Lemma 2.1]. Moreover, the other components of the proof (generating functions anddiophantine approximation of degrees) don’t depend on P N .4. Monoid orbits in dimension one
In this section, we prove Theorems 1.5 and 1.6 on monoid orbits over $\mathbb{P}^1$. To do this, we first show that the relevant sets of maps generate free monoids under composition. For critically separate and simple sets of rational maps, this follows directly from the main results of [20]. Theorem 4.1.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of rational maps on $\mathbb{P}^1(\mathbb{C})$ all of degree at least four. If $S$ is critically separate and critically simple, then $M_S$ is free.
Proof. Suppose that $f = \theta_1 \circ \cdots \circ \theta_n = \tau_1 \circ \cdots \circ \tau_m = g$ for some $\theta_i, \tau_j \in S$. Without loss of generality, we may assume that $n \ge m$. Clearly if $n = m = 1$, then $\theta_1 = \tau_1$ and there is nothing to prove. Therefore, we may assume that $n \ge m > 1$. Write $f_2 = \theta_2 \circ \cdots \circ \theta_n$ and $g_2 = \tau_2 \circ \cdots \circ \tau_m$ so that $\theta_1(f_2) = \tau_1(g_2)$. However, since $f_2$ and $g_2$ are non-constant and $S$ is critically separate, [20, Theorem 1.1] implies that $\theta_1 = \tau_1$. Likewise, since $S$ is critically simple and $\deg(\theta_1) \ge 4$, we see that $f_2 = g_2$ by [20, Theorem 1.3]. Repeating the argument above now for $f_2$ and $g_2$ (instead of $f$ and $g$), we see that $\theta_2 = \tau_2$ and $\theta_3 \circ \cdots \circ \theta_n = \tau_3 \circ \cdots \circ \tau_m$. We can clearly keep going to deduce that $\theta_i = \tau_i$ for all $1 \le i \le m$. Finally, by equating degrees in the original relation $f = g$, we see that $\deg(\theta_{m+1}) \cdots \deg(\theta_n) = 1$, a contradiction unless $n = m$. This completes the proof that $M_S$ is free. □
Next we show that polynomial sets with multiplicatively independent degrees and leading coefficients generate free monoids under composition. This is perhaps known to the experts. However, without a reference, we include a proof for completeness. Our argument is inspired by the proof of [14, Lemma 3.2].
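Before the formal statement, the claim about polynomial sets can be tested by brute force on monomials. The sketch below is an illustration only: the generating set $\{2x^2, 3x^3\}$, the tuple encoding `(a, d)` for $a x^d$, and the function names are assumptions introduced here, and checking words up to a fixed length is of course not a proof of freeness.

```python
from itertools import product


def compose(f, g):
    """Compose monomials: (a x^d) o (b x^e) = a * b^d * x^(d*e)."""
    a, d = f
    b, e = g
    return (a * b**d, d * e)


def evaluate_word(word):
    """Return w1 o w2 o ... o wn for a nonempty tuple of monomials."""
    result = word[0]
    for g in word[1:]:
        result = compose(result, g)
    return result


def is_free_up_to(generators, max_len):
    """Check that distinct words of length <= max_len give distinct monomials."""
    seen = {}
    for n in range(1, max_len + 1):
        for word in product(generators, repeat=n):
            m = evaluate_word(word)
            if m in seen and seen[m] != word:
                return False  # two different words define the same map
            seen[m] = word
    return True


if __name__ == "__main__":
    # Degrees 2, 3 and coefficients 2, 3 are multiplicatively independent.
    print(is_free_up_to([(2, 2), (3, 3)], 7))
    # x^2 and x^4 violate the hypotheses: x^2 o x^2 = x^4.
    print(is_free_up_to([(1, 2), (1, 4)], 2))
```

The second call shows why some independence hypothesis is needed: with generators $x^2$ and $x^4$ the relation $x^2 \circ x^2 = x^4$ already appears at length two.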
Theorem 4.2.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of polynomials defined over a field $K$ of characteristic zero, and let $a_i x^{d_i}$ denote the leading term of $\phi_i$. If $\{d_1, \dots, d_s\}$ is a multiplicatively independent set in $\mathbb{Z}$ and $\{a_1, \dots, a_s\}$ is a multiplicatively independent set in $K^*$, then $M_S$ is a free monoid.
Proof. As the statement of the theorem suggests, it suffices to study the monoid generated by the leading terms in $S$. To make this statement precise, we note the following lemma: Lemma 4.3.
Let $S = \{\phi_1, \dots, \phi_s\}$ be a set of polynomials defined over a field $K$, let $a_i x^{d_i}$ denote the leading term of $\phi_i$, and let $S' = \{a_1 x^{d_1}, \dots, a_s x^{d_s}\}$. If $M_{S'}$ is a free monoid, then $M_S$ is a free monoid. Proof.
This statement is a simple consequence of the fact that $\mathrm{lt}(f \circ g) = \mathrm{lt}(f) \circ \mathrm{lt}(g)$ for all $f, g \in K[x]$; here $\mathrm{lt}(\cdot)$ denotes the leading term of a polynomial. To see this, suppose that $M_{S'}$ is a free monoid and that there is some relation
(32) $\theta_1 \circ \theta_2 \circ \cdots \circ \theta_n = \tau_1 \circ \tau_2 \circ \cdots \circ \tau_m$
for some $\theta_i, \tau_j \in S$. Then, in particular, we have an equality of leading terms,
$$\mathrm{lt}(\theta_1) \circ \mathrm{lt}(\theta_2) \circ \cdots \circ \mathrm{lt}(\theta_n) = \mathrm{lt}(\tau_1) \circ \mathrm{lt}(\tau_2) \circ \cdots \circ \mathrm{lt}(\tau_m).$$
But this is a relation in $M_{S'}$, which is free on the letters in $S'$. Therefore, $n = m$ and $\mathrm{lt}(\theta_i) = \mathrm{lt}(\tau_i)$. However, again since $M_{S'}$ is free, $\mathrm{lt}(\theta_i) = \mathrm{lt}(\tau_i)$ implies that $\theta_i = \tau_i$. Hence the relation in (32) is a trivial one. □
Now back to the proof of Theorem 4.2. In light of Lemma 4.3, we may assume that $S = \{\phi_1, \dots, \phi_s\}$ is a set of monomials with $\phi_i = a_i x^{d_i}$, that $\{d_1, \dots, d_s\}$ is a multiplicatively independent set in $\mathbb{Z}$, and that $\{a_1, \dots, a_s\}$ is a multiplicatively independent set in $K^*$. Now, given $F = \theta_1 \circ \cdots \circ \theta_n \in M_S$ and $\phi \in S$, we define
$$e_\phi(F) = \#\{ j \mid \theta_j = \phi \}$$
to be the number of $\phi$'s appearing in the string defining $F$ (strictly speaking this is an abuse of notation; $e_\phi$ is a function on words). In particular, if there is a relation $F = G$ for some $F, G \in M_S$, then we see that
(33) $d_1^{e_{\phi_1}(F)} \cdots d_s^{e_{\phi_s}(F)} = \deg(F) = \deg(G) = d_1^{e_{\phi_1}(G)} \cdots d_s^{e_{\phi_s}(G)}$.
However, the $d_i$'s are multiplicatively independent by assumption, so that $e_{\phi_i}(F) = e_{\phi_i}(G)$ for all $i$. In particular, the strings defining $F$ and $G$ have the same length (i.e., the total number of letters from $S$), equal to $n = \sum_i e_{\phi_i}(F)$. Hence,
(34) $F = \theta_1 \circ \cdots \circ \theta_n = \tau_1 \circ \cdots \circ \tau_n = G$ for some $\theta_i, \tau_i \in S$.
Moreover, $e_{\phi_i}(F) = e_{\phi_i}(G)$ for all $i$. From here, we will show that $\theta_i = \tau_i$ by induction on the length $n$. The $n = 1$ case is clear.
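The counting step behind (33) and (34) can be made concrete: when the generator degrees are multiplicatively independent, the degree of a word determines how many times each generator occurs, and hence the word length. The sketch below is a hedged illustration that assumes the stronger (and simpler) condition that the generator degrees are pairwise coprime; the function name `letter_counts` and the sample degrees are hypothetical.

```python
def letter_counts(degree, generator_degrees):
    """Recover (e_1, ..., e_s) with prod(d_i ** e_i) == degree.

    Assumes the generator degrees are > 1 and pairwise coprime, a special
    case of multiplicative independence. Returns None if the degree is
    not a product of powers of the generator degrees.
    """
    if degree < 1:
        return None
    counts = []
    remaining = degree
    for d in generator_degrees:
        e = 0
        while remaining % d == 0:
            remaining //= d
            e += 1
        counts.append(e)
    return tuple(counts) if remaining == 1 else None


if __name__ == "__main__":
    # A word with three maps of degree 2 and two maps of degree 3 has degree 72.
    counts = letter_counts(72, (2, 3))
    print(counts, "word length:", sum(counts))
```

Any two words defining the same map have the same degree, so under this assumption they share the same tuple of counts and the same length, which is exactly what the induction on $n$ needs.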
For $n > 1$, if (34) holds then
(35) $F' \circ \theta = F = G = G' \circ \tau$
for some $\theta, \tau \in S$ and some monomials $F'$ and $G'$ given by strings of length $n - 1$ of elements of $S$. We proceed in cases. Case (1):
Suppose that $\theta = \tau$, and write $\theta = a x^d$, $F' = a_{F'} x^{\deg(F')}$ and $G' = a_{G'} x^{\deg(F')}$. Here we use that $\deg(F) = \deg(G)$ and $\theta = \tau$, so that $\deg(F') = \deg(G')$. Therefore, (35) becomes
$$a_{F'}\, a^{\deg(F')} x^{d \deg(F')} = a_{G'}\, a^{\deg(F')} x^{d \deg(F')},$$
and we deduce that $a_{F'} = a_{G'}$. However, then $F' = a_{F'} x^{\deg(F')} = a_{G'} x^{\deg(F')} = G'$ and $F', G' \in M_S$ are polynomials obtained by composing strings of elements of $S$ of length $n - 1$. In particular, we may deduce that $\theta_i = \tau_i$ for all $i < n$ by induction. On the other hand, $\theta_n = \theta = \tau = \tau_n$ by construction. Therefore, $\theta_i = \tau_i$ for all $i \le n$ as claimed. Case (2):
Suppose that $\theta \ne \tau$. We fix some notation. Given a string $\theta_1 \dots \theta_m$ of elements of $S$, write
$$f = \theta_1 \circ \cdots \circ \theta_m = a_f x^{\deg(f)} = (a_1^{n_1} \cdots a_s^{n_s})\, x^{\deg(f)}.$$
Then define the $a_i$-degree of $f$ (or more accurately, the $a_i$-degree of the corresponding string) to be $\deg_{a_i}(f) = n_i$. Note that this construction is well-defined since the leading coefficient $a_f$ is in the (multiplicative) semigroup generated by the $a_i$'s and the $a_i$'s are multiplicatively independent by assumption. Now write $\theta = a x^d$. Then we will show that $\deg_a(F) \ne \deg_a(G)$, a contradiction, using (35), the fact that $\theta \ne \tau$, and the following elementary observations about $a$-degrees: Lemma 4.4.
Let $S$ be as in Theorem 4.2 and let $\theta = a x^d \in S$. Then the following statements hold:
(1) If $f_1, f_2, g \in M_S$, $\deg_a(f_1) \le \deg_a(f_2)$, and $\deg(f_1) \le \deg(f_2)$, then $\deg_a(f_1 \circ g) \le \deg_a(f_2 \circ g)$.
(2) Let $f \in M_S$, and suppose that $e_\theta(f) = e \ge 1$. Then
$$\deg_a(f) \le \frac{d^e - 1}{d - 1} \cdot \frac{\deg(f)}{d^e}.$$
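Part (2) of Lemma 4.4 can be sanity-checked on small words. The following sketch uses the composition rule $\deg_a(f \circ g) = \deg_a(f) + \deg(f)\deg_a(g)$ for monomials; the two generators $\theta = a x^2$ and $b x^3$, the string labels, and the function names are hypothetical choices for illustration, and an exhaustive check over short words is not a substitute for the proof.

```python
from itertools import product

# Hypothetical generators, stored as (label, degree): theta = a*x^2, other = b*x^3.
THETA = ("a", 2)
OTHER = ("b", 3)


def a_degree_and_degree(word):
    """Return (deg_a, deg) of w1 o ... o wn, starting from the identity map.

    Uses deg_a(f o g) = deg_a(f) + deg(f) * deg_a(g) and deg(f o g) = deg(f) * deg(g).
    """
    deg_a, deg = 0, 1
    for label, d in word:
        deg_a += deg * (1 if label == "a" else 0)
        deg *= d
    return deg_a, deg


def lemma_bound_holds(max_len):
    """Check deg_a(f) <= (d^e - 1)/(d - 1) * deg(f)/d^e for all words with e >= 1."""
    d = THETA[1]
    for n in range(1, max_len + 1):
        for word in product((THETA, OTHER), repeat=n):
            e = sum(1 for letter in word if letter == THETA)
            if e == 0:
                continue
            deg_a, deg = a_degree_and_degree(word)
            # Cross-multiplied to stay in exact integer arithmetic.
            if deg_a * (d - 1) * d**e > (d**e - 1) * deg:
                return False
    return True


if __name__ == "__main__":
    print(a_degree_and_degree((THETA, THETA)))  # 2*(2x^2)^2 = 8x^4 = a^3 x^4
    print(lemma_bound_holds(10))
```

Equality in the bound occurs when all copies of $\theta$ sit innermost, for instance for $\theta \circ \theta$, which is the extremal configuration used in the proof.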
We grant Lemma 4.4 for now and return to the proof later. To see that $\deg_a(F) \ne \deg_a(G)$ in Case 2, let $e = e_\theta(F) = e_\theta(G)$ be the number of $\theta$'s appearing in the strings defining $F$ and $G$. Then, writing $F = F' \circ \theta$ as in (35), we see that
(36) $\deg_a(F) = \deg_a(F') + \deg(F') \deg_a(\theta) = \deg_a(F') + \deg(F') \ge \deg(F')$.
On the other hand, Lemma 4.4 part (2) applied to $f = G'$ implies that
(37) $\deg_a(G) = \deg_a(G') + \deg(G') \deg_a(\tau) = \deg_a(G') \le \dfrac{d^e - 1}{d - 1} \cdot \dfrac{\deg(G')}{d^e}$.
Here we use that $G = G' \circ \tau$ and that $\deg_a(\tau) = 0$, since $\theta \ne \tau$ and the leading coefficients of the elements in $S$ are multiplicatively independent. Therefore, if $\deg_a(F) = \deg_a(G)$, then (35), (36), (37) together imply that
(38) $\deg(\tau)\deg(F) = \deg(\tau)\, d \deg(F') \le \deg(\tau)\, d \deg_a(F) = \deg(\tau)\, d \deg_a(G) \le \dfrac{d^e - 1}{d - 1} \cdot \dfrac{\deg(\tau)\deg(G')}{d^{e-1}} \le \dfrac{d^e - 1}{d - 1} \cdot \dfrac{\deg(G)}{d^{e-1}} < \dfrac{d^e}{d - 1} \cdot \dfrac{\deg(G)}{d^{e-1}} = \dfrac{d}{d - 1} \deg(G)$.
However, $F = G$ so that $\deg(F) = \deg(G)$. In particular, (38) implies that $2 \le \deg(\tau) < \frac{d}{d - 1} \le 2$, a contradiction. Therefore, $\deg_a(F) \ne \deg_a(G)$ and Case 2 is incompatible with (35). Therefore, any relation in $M_S$ must be of the form in Case 1. However, since we have settled Theorem 4.2 in this case by induction, $M_S$ is a free monoid as claimed. □
We now include a proof of Lemma 4.4 regarding $a$-degrees.
Proof (Lemma 4.4). The first statement is a simple consequence of the definition of $a$-degrees. Suppose that $f_1, f_2, g \in M_S$, that $\deg_a(f_1) \le \deg_a(f_2)$, and that $\deg(f_1) \le \deg(f_2)$. Then
$$\deg_a(f_1 \circ g) = \deg_a(f_1) + \deg_a(g) \cdot \deg(f_1) \le \deg_a(f_2) + \deg_a(g) \cdot \deg(f_2) = \deg_a(f_2 \circ g)$$
as claimed. For the second statement, let $f \in M_S$ and suppose that $e_\theta(f) = e \ge 1$. Then, we may write
(39) $f = g_{t+1} \circ \theta^{r_t} \circ g_t \circ \cdots \circ g_2 \circ \theta^{r_1} \circ g_1$
for some $g_i \in M_S$ with $\deg_a(g_i) = 0$, some $t \ge 1$, and some $r_i \ge 1$ with $\sum_{i=1}^{t} r_i = e$.
We will show by induction on $t$ that
(40) $\deg_a(f) \le \deg_a(g_{t+1} \circ g_t \circ \cdots \circ g_1 \circ \theta^e)$,
from which statement (2) of the Lemma easily follows. If $t = 1$, then
$$\deg_a(g_2 \circ \theta^{r_1} \circ g_1) = \deg(g_2)\deg_a(\theta^{r_1}) \le \deg(g_2)\deg(g_1)\deg_a(\theta^{r_1}) \le \deg_a(g_2 \circ g_1 \circ \theta^{r_1}).$$
Here we use that $\deg_a(g_i) = 0$. On the other hand, assume that $t > 1$ and that (40) is true for polynomials of the form in (39) with $t - 1$ appearances of substrings of the form $\theta^{r_i}$. Then given $f$ as in (39), let $f_1 = g_{t+1} \circ \theta^{r_t} \circ g_t \circ \theta^{r_{t-1}} \circ \cdots \circ g_2$, let $f_2 = g_{t+1} \circ \cdots \circ g_2 \circ \theta^{r}$ where $r = \sum_{i=2}^{t} r_i$, and let $g = \theta^{r_1} \circ g_1$. Then $f = f_1 \circ g$ and $\deg(f_1) = \deg(f_2)$. Hence, part (1) of Lemma 4.4 and the induction hypothesis together imply that
(41) $\deg_a(f) = \deg_a(f_1 \circ g) \le \deg_a(f_2 \circ g) = \deg_a\big((g_{t+1} \circ \cdots \circ g_2) \circ \theta^e \circ g_1\big)$.
On the other hand, letting $g' = g_{t+1} \circ \cdots \circ g_2$, we see that the $t = 1$ case above applied to $g' \circ \theta^e \circ g_1$ in place of $f$ implies that
(42) $\deg_a\big((g_{t+1} \circ \cdots \circ g_2) \circ \theta^e \circ g_1\big) \le \deg_a\big((g_{t+1} \circ \cdots \circ g_1) \circ \theta^e\big)$.
Therefore, after combining (41) and (42), we establish (40) as claimed. Finally, the bound in part (2) of Lemma 4.4 follows easily from (40), the fact that $\deg_a(\theta^e) = d^{e-1} + \cdots + d + 1 = (d^e - 1)/(d - 1)$, and that $\deg(g_{t+1} \circ \cdots \circ g_1) = \deg(f)/d^e$. □
We are nearly ready to prove Theorems 1.5 and 1.6, versions of Conjecture 1.3 in dimension one, for some fairly general sets of maps. However, to complete the main remaining step (i.e., to pass from counting functions to counting points), we need to show that $f(P) = g(P)$ occurs rarely for $f, g \in M_S$ and $P$ of large enough height. This is largely achieved by ensuring that the rational (or integral) points on the curves
(43) $C_i : \dfrac{\phi_i(x) - \phi_i(y)}{x - y} = 0$ and $C_{j,k} : \phi_j(x) = \phi_k(y)$ for $j \ne k$
are finite. For critically separate and simple sets of rational maps this follows from the genus calculations in [20] and Faltings' theorem: Proposition 4.5.
Let S = { φ , . . . , φ s } be a set of rational maps on P ( Q ) all of degree atleast . If S is critically separate and critically simple, then the curves in (43) have at mostfinitely many rational points over any number field.Proof. Since S is critically separate, [20, Proposition 3.1] implies that each C j,k is an irre-ducible curve for all j = k . Likewise, it is shown on [20, p208] that the genus of C j,k isgiven by (deg( φ j ) − φ k ) − ≥ . Hence, the C j,k have at most finitely many rationalpoints over any number field by Faltings’ theorem. Likewise, [20, Corollary 3.6] implies thateach C i is an irreducible curve. Moreover, it is shown on [20, p210] that the genus of C i is (deg( φ i ) − ≥ . Hence, the C i also have at most finitely many rational points over anynumber field by Faltings’ theorem. (cid:3) For the sets of polynomials in Theorem 1.6, it suffices for our purposes to show that thecurves in (43) have only finitely many integral points (as opposed to rational points). To dothis, we need the integral point classification theorems in [3] and the Appendix 5. To putthese results in context, we first recall the definition of Siegel factors and Siegel’s integralpoint theorem.
Definition 4.6. A Siegel polynomial over a field $K$ is an absolutely irreducible polynomial $\Phi(x, y) \in K[x, y]$ for which the curve $\Phi(x, y) = 0$ has genus zero and has at most two points at infinity. A Siegel factor of a polynomial $\Psi(x, y) \in K[x, y]$ is a factor of $\Psi$ which is a Siegel polynomial over $K$.
The following result explains the relevance of Siegel factors in this context and is one of the most important results in arithmetic geometry; see Theorems 8.2.4 and 8.5.1 in [18].
Theorem 4.7 (Siegel). Let $R$ be a finitely generated integral domain of characteristic zero, let $K$ be the field of fractions of $R$, and let $\Phi(x, y) \in K[x, y]$. Then there are only finitely many pairs $(x, y) \in R \times R$ for which $\Phi(x, y) = 0$ unless $\Phi(x, y)$ has a Siegel factor over $K$.
Remark 11. Clearly if $K$ is a number field (viewed inside the complex numbers) and $\Phi(x, y)$ has no Siegel factors over $\mathbb{C}$, then $\Phi(x, y)$ has no Siegel factors over $K$. Therefore, to prove that the equation $\Phi(x, y) = 0$ has only finitely many solutions $(x, y)$ in some ring of $S$-integers $R \subset K$, it suffices to show that $\Phi(x, y)$ has no Siegel factors over $\mathbb{C}$.
To use Siegel's integral point theorem to show that $f(P) = g(P)$ occurs infrequently for $f, g \in M_S$ and $P$ of sufficiently large height (see Lemma 4.10 for a precise statement), we need the following theorem of Bilu and Tichy [3, Theorem 10.1], which classifies the polynomials $\Phi(x, y) = F(x) - G(y)$ having a Siegel factor.
Theorem 4.8.
For non-constant
F, G ∈ C [ x ] , if F ( x ) − G ( y ) has a Siegel Factor in C [ x, y ] then F = E ◦ F ◦ µ and G = E ◦ G ◦ ν , where E, µ, ν ∈ C [ x ] with deg( µ ) = deg( ν ) = 1 andeither ( F , G ) or ( G , F ) is one of the following pairs (here m, n ≥ and p ∈ C [ x ] K { } ): (a) (cid:0) x m , x r p ( x ) m (cid:1) , where r ∈ N is coprime to m; (b) (cid:0) x , ( x + 1) p ( x ) (cid:1) ; (c) (cid:0) T m , T n (cid:1) with gcd( m, n ) = 1 ; (d) (cid:0) T m , − T n (cid:1) with gcd( m, n ) > ; (e) (cid:0) ( x − , x − x (cid:1) .Remark . Technically, the statement above is a simplified version of [3, Theorem 10.1] takenfrom [8, Corollary 2.7]. For a more detailed description of the classification of pairs ( F, G ) such that F ( x ) − G ( y ) has a Siegel factor (with the relevant fields of definition taken intoaccount), see [3].In particular, condition (3) of Theorem 1.6 implies that the affine curves C i,j : φ i ( x ) = φ j ( y ) for i = j have only finitely many integral points. Here we use Theorem 4.7, Remark 11, and Theorem4.8: the pairs (a)-(d) in Theorem 4.8 are ruled out by condition (3) by examining first coor-dinates only (all cyclic or Chebychev polynomials). Likewise, ( x − = F ◦ E ◦ L , where F ( x ) = ( x − , E ( x ) = x is cyclic, and L ( x ) = x . Hence, the pair in (e) is also ruled outby condition (3). Similarly, condition (3) implies that the affine curves C i : φ i ( x ) − φ i ( y ) x − y = 0 have only finitely many integral points. Here we use Theorem 4.7 and Theorem 2 in theAppendix; Zannier has shown that such curves have at least points at infinity over C andthus cannot have a Siegel factor over any number field. In particular, we are now ready toprove our orbit counts for P from the Introduction. (Proof of Theorems 1.5 and 1.6). Suppose that S is a critically separate and critically simpleset of rational functions or that S is a set of polynomials satisfying conditions (1)-(3) ofTheorem 1.6. 
Then in particular, M S is free by Proposition 4.5 in the rational function case,and M S is free by Theorem 4.2 in the polynomial case. Hence, Theorem 1.1 implies that thenumber of functions f ∈ M S satisfying H ( f ( P )) ≤ B , has the desired growth rate (in eithercase), whenever P has large enough height.To pass from functions to points, we need to control when f ( P ) = g ( P ) is possible for f, g ∈ M S . With this in mind, let R P ⊂ K be a ring of S -integers in some number field K (not the same S as the set of functions) containing P and the coefficients of the maps in S .Then define the quantities κ P := max n h ( x ) : ( x, y ) ∈ C i ( K ) or ( x, y ) ∈ C j,k ( K ) for some y ∈ K and some i, j, k o . in the rational function case and κ P := max n h ( x ) : ( x, y ) ∈ C i ( R P ) or ( x, y ) ∈ C j,k ( R P ) for some y ∈ R P and some i, j, k o . in the polynomial case. Then in either case, κ P is finite by Proposition 4.5, Theorem 4.7,Remark 11, Theorem 4.8, and Theorem 1 in the Appendix. Now given f = θ ◦ θ ◦· · ·◦ θ n ∈ M S ,define the length of f to be ℓ ( f ) = n ; note that this quantity is well-defined since M S is free.Moreover letting v = (1 , . . . , , we see that ℓ = ℓ S, v in our earlier notation. Next, recall theconstant b S given by b S = C S / ( d S − , where C S and d S are the height constants in Definition2.1 above. Then, Tate’s telescoping Lemma 2.2 implies that if h ( ρ ( P )) ≤ κ P for some P with h ( P ) > b S and some ρ ∈ M S , then(44) ℓ ( ρ ) b S ≤ deg( ρ )( h ( P ) − b S ) ≤ h ( ρ ( P )) ≤ κ P . Hence, the length of such ρ is bounded; specifically, ℓ ( ρ ) ≤ max (cid:8) , ⌈ log ( κ P /b S ) ⌉ (cid:9) := r P ,from which we deduce the following fact. Lemma 4.9.
Suppose that S satisfies the conditions of Theorems 1.5 or 1.6 and let ρ ∈ M S .If ℓ ( ρ ) > r P , h ( P ) > b S , and θ ( ρ ( P )) = τ ( P ′ ) for some P ′ ∈ R P and some θ, τ ∈ S , then θ = τ and ρ ( P ) = P ′ . In particular, this allows us to control the number of functions in M S that can agree at P . Lemma 4.10.
Suppose that $S$ satisfies the conditions of Theorems 1.5 or 1.6 and $h(P) > b_S$. Then there is a constant $t_{P,S}$ depending only on $P$ and $S$ such that
$$\#\{ f \in M_S : f(P) = Q \} \le t_{P,S}$$
holds for all but finitely many $Q \in \mathrm{Orb}_S(P)$.
Proof. Let $d_S = \max\{\deg(\phi) : \phi \in S\}$ and suppose that $Q \in \mathrm{Orb}_S(P)$ satisfies
$$h(Q) > d_S^{\,r_P + 1}\,(h(P) + b_S),$$
which is true of all but finitely many $Q$ by Northcott's Theorem; each $Q \in \mathrm{Orb}_S(P) \subseteq \mathbb{P}^1(K)$ by construction of $K$. Then, it follows from Tate's telescoping Lemma 2.2 that $\ell(f) > r_P + 1$ for all $f \in M_S$ with $f(P) = Q$: otherwise,
$$h(Q) = h(f(P)) \le \deg(f)(h(P) + b_S) \le d_S^{\,\ell(f)}(h(P) + b_S) \le d_S^{\,r_P + 1}(h(P) + b_S),$$
a contradiction. In particular, each function taking the value $Q$ at $P$ has length strictly larger than $r_P + 1$. Now, let $f_Q \in M_S$ be a function of smallest length taking the value $Q$ at $P$. Then $\ell(f_Q) > r_P + 1$ and we may write $f_Q = \tau_1 \circ \cdots \circ \tau_m \circ \rho_Q$ for some $\tau_i \in S$, some $m \ge 1$, and some $\rho_Q \in M_S$ of length $r_P + 1$. Likewise, for any other $f \in M_S$ with $f(P) = Q$, we may write $f = \theta_{1,f} \circ \cdots \circ \theta_{m,f} \circ q_f \circ \rho_f$ for some $\theta_{i,f} \in S$, some $q_f \in M_S$, and some $\rho_f \in M_S$ of length $r_P + 1$; here we use the minimality of the length of $f_Q$. Then $f(P) = f_Q(P)$ implies:
(45) $\theta_{1,f} \circ \cdots \circ \theta_{m,f} \circ q_f \circ \rho_f(P) = \tau_1 \circ \cdots \circ \tau_m \circ \rho_Q(P)$.
Now for all $1 \le i \le m$, let $\rho_i = \theta_{i+1,f} \circ \cdots \circ \theta_{m,f} \circ q_f \circ \rho_f$ and $P'_i = \tau_{i+1} \circ \cdots \circ \tau_m \circ \rho_Q(P)$. In particular, (45) becomes
$$\theta_{1,f}(\rho_1(P)) = \tau_1(P'_1).$$
On the other hand, $P'_i \in R_P$ by definition of $R_P$ and $\ell(\rho_i) \ge \ell(\rho_f) = r_P + 1 > r_P$ for all $i$. Hence, Lemma 4.9 applied to $\rho = \rho_1$, $P' = P'_1$, $\theta = \theta_{1,f}$, and $\tau = \tau_1$ implies that $\theta_{1,f} = \tau_1$ and $\rho_1(P) = P'_1$. Therefore,
$$\theta_{2,f} \circ \cdots \circ \theta_{m,f} \circ q_f \circ \rho_f(P) = \tau_2 \circ \cdots \circ \tau_m \circ \rho_Q(P).$$
Repeating the same argument, this time with $\rho = \rho_2$, $P' = P'_2$, etc., we see that Lemma 4.9 implies that $\theta_{2,f} = \tau_2$ and $\rho_2(P) = P'_2$.
We can clearly continue this argument ( m -times) andobtain that(46) q f ◦ ρ f ( P ) = ρ Q ( P ) and θ i,f = τ i for all ≤ i ≤ m. On the other hand, Tate’s Telescoping Lemma 2.2 and the fact that h ( P ) > b S imply thelower bound(47) r P +1 b S ≤ deg( ρ f )( h ( P ) − b s ) ≤ h ( ρ f ( P )) . Likewise, we have the upper bound(48) h ( ρ Q ( P )) ≤ deg( ρ Q )( h ( P ) + b S ) ≤ d Sr P +1 ( h ( P ) + b S ) . Hence, after combining (46), (47) and (48) with Lemma 2.2 applied to the map q f , we seethat deg( q f )(2 r P +1 − b S ≤ deg( q f )( h ( ρ f ( P )) − b S ) ≤ h ( q f ◦ ρ f ( P )) = h ( ρ Q ( P )) ≤ d Sr P +1 ( h ( P ) + b S ) . In particular, dividing both sides of the inequality above by (2 r P +1 − b S , we deduce that(49) ℓ ( q f ) ≤ deg( q f ) ≤ d Sr P +1 ( h ( P ) + b S )(2 r P +1 − b S . Hence the length of q f is bounded. But S is a finite set of maps, so the number of possible q f ’sis finite. Likewise, the length of ρ f is r P + 1 is bounded, and so there are only finitely manypossible ρ f ’s. In summation, we have shown that if f ∈ M S is any function with f ( P ) = Q , OUNTING POINTS OF BOUNDED HEIGHT IN MONOID ORBITS 17 then f = τ ◦ · · · ◦ τ m ◦ q f ◦ ρ f such that: the τ i are fixed, and the number of possible q f ’s and ρ f ’s are bounded independently of Q . Specifically, we have that (cid:8) f ∈ M S : f ( P ) = Q (cid:9) ≤ s log l d SrP +1( h ( P )+ bS )(2 rP +1 − bS m + r P +1 holds for all Q ∈ Orb S ( P ) with h ( Q ) > d Sr P +1 ( h ( P ) + b S ) , which proves the claim. (cid:3) We now finish the proof of Theorems 1.5 and 1.6. Note that Lemma 4.10 implies that: t − P,S · (cid:8) f ∈ M S : H ( f ( P )) ≤ B (cid:9) + O (1) ≤ (cid:8) Q ∈ Orb S ( P ) : H ( Q ) ≤ B (cid:9) ≤ (cid:8) f ∈ M S : H ( f ( P )) ≤ B (cid:9) holds for all B sufficiently large and all P such that H ( P ) > e b S . 
Moreover, combining thebounds above with Theorem 1.1, we see that for all ǫ > there exists an effectively computablepositive constant b = b ( S, ǫ ) such that (log B ) b ≪ { Q ∈ Orb S ( P ) : H ( Q ) ≤ B } ≪ (log B ) b + ǫ as desired. (cid:3) In higher dimensions, it is possible that one can attack Conjecture 1.3 in a similar mannerto that above, provided that one can give a reasonable condition ensuring that the set ofrational/integral points on the variety V f,g := { ( P, Q ) ∈ P N × P N : f ( P ) = g ( Q ) } is not Zariski dense (for all distinct f, g ∈ M S of some fixed length). To do this, it is likelynecessary to assume the Bombieri-Lang Conjecture.Likewise (although most sets generate free monoids), it would be interesting to study theheight growth rates in monoid orbits which are not free (or free commutative). As a test case,one might consider the following example from [14, Remark 1.5]: let ω be a primitive cuberoot of unity and let F ( x ) = x and G ( x ) = ωx . Then the monoid generated by S = { F, G } has three independent relations: F = G , F ◦ G = G ◦ F , and G ◦ F ◦ G = F ◦ G ◦ F .5. Appendix: integral points on curves f ( X ) − f ( Y ) X − Y (by Umberto Zannier) Let f ∈ C [ X ] be a polynomial of degree d ≥ and let O be a finitely generated subring of C . For the sequel we put(50) F ( X, Y ) = f ( X ) − f ( Y ) X − Y .
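For readers who want to experiment with the polynomial $F(X, Y)$ of (50), the sketch below expands it explicitly using $(X^k - Y^k)/(X - Y) = \sum_{i=0}^{k-1} X^i Y^{k-1-i}$. The coefficient-list encoding and the function names are assumptions made here for illustration only.

```python
import cmath


def divided_difference(coeffs):
    """Expand F(X, Y) = (f(X) - f(Y)) / (X - Y) for f = sum coeffs[k] * X^k.

    Returns a dict mapping (i, j) to the coefficient of X^i Y^j.
    """
    F = {}
    for k, c in enumerate(coeffs):
        if k == 0 or c == 0:
            continue  # constants cancel in f(X) - f(Y)
        for i in range(k):
            key = (i, k - 1 - i)
            F[key] = F.get(key, 0) + c
    return F


def evaluate(F, x, y):
    """Evaluate a polynomial given as {(i, j): coefficient} at (x, y)."""
    return sum(c * x**i * y**j for (i, j), c in F.items())


def poly_value(coeffs, x):
    """Evaluate f at x from its coefficient list."""
    return sum(c * x**k for k, c in enumerate(coeffs))


if __name__ == "__main__":
    cubic = [0, 0, 0, 1]                      # f(X) = X^3
    print(divided_difference(cubic))          # X^2 + XY + Y^2
    zeta = cmath.exp(2j * cmath.pi / 3)       # primitive cube root of unity
    print(abs(evaluate(divided_difference(cubic), zeta * 5, 5)))  # close to 0
```

For the cyclic polynomial $f = X^3$ the output reflects that $F$ vanishes on the line $X = \zeta Y$ for a primitive cube root of unity $\zeta$, up to floating-point error.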
Recall also that the cyclic polynomial of degree $n$ is simply $X^n$, and the Chebyshev polynomial of degree $n$ is the unique polynomial $T_n$ satisfying the identity $T_n(Z + Z^{-1}) = Z^n + Z^{-n}$. The purpose of the present Appendix is to prove the following: Theorem 1.
Assume that the plane curve defined by $F(X, Y)$ has infinitely many points in $\mathcal{O}^2$. Then there are an integer $n > 1$ and polynomials $g, l \in \mathbb{C}[X]$, with $\deg l = 1$, such that $f = g \circ S_n \circ l$, where $S_n$ is either the cyclic or the Chebyshev polynomial of degree $n$.
Remark. Note that the result has an easy converse, as soon as we allow some freedom on $\mathcal{O}$, as we now illustrate:
(i) If $S_n(X) = X^n$ (after applying $l^{-1}$) we obtain factors $X - \zeta Y$ ($\zeta^n = 1$, $\zeta \ne 1$) for our polynomial $F(X, Y)$, i.e. components of the curve which are lines defined over $\mathbb{Q}(\zeta)$. Therefore we obtain infinitely many points in $\mathcal{O}^2$ as soon as $\mathcal{O}$ contains $\zeta$ (and the coefficients of $l$).
(ii) In the case $S_n = T_n$, from the defining property of $T_n$ we easily obtain (well-known) factors of $T_n(X) - T_n(Y)$ given by $X^2 - (\zeta + \zeta^{-1})XY + Y^2 + (\zeta - \zeta^{-1})^2$, for $\zeta \ne \pm 1$ an $n$-th root of unity. On setting $Y = W + W^{-1}$, this quadratic in turn factors as $(X - \zeta W - \zeta^{-1}W^{-1})(X - \zeta^{-1}W - \zeta W^{-1})$. Hence, if we let $w$ take values in $\mathcal{O}^*$ (which may well be infinite) and set $X = x = \zeta w + \zeta^{-1}w^{-1}$, we obtain again an infinity of points in $\mathcal{O}^2$. We also obtain similarly quadratic factors of $T_n(X) + T_n(Y)$, which are relevant when $g(X) = h(X^2)$ is even. These factors divide also $T_{2n}(X) - T_{2n}(Y)$, since $T_{2n} = T_2 \circ T_n = T_n^2 - 2$.
In the next version of the result, i.e. Theorem 2 below, we shall add a further conclusion which implies that all but finitely many integral points arise in this way.
As to the theorem, we recall at once that in virtue of Siegel's Theorem (extended suitably to finitely generated subrings) an irreducible affine curve can have infinitely many (integral) points defined over $\mathcal{O}$ only if
(i) it has genus $0$ and
(ii) it has at most two points at infinity.
See [4], or [17], or [23].
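The Chebyshev construction in part (ii) of the Remark above can be checked numerically. The sketch below is illustrative only: it assumes the normalization $T_0 = 2$, $T_1 = X$, $T_{k+1} = X T_k - T_{k-1}$ (which matches the identity $T_n(Z + Z^{-1}) = Z^n + Z^{-n}$), and the sample values $n = 5$ and $w = 1.3 + 0.4i$ are arbitrary choices, not taken from the text.

```python
import cmath


def chebyshev_T(n, x):
    """T_n normalized by T_n(z + 1/z) = z^n + z^(-n).

    Recurrence: T_0 = 2, T_1 = x, T_{k+1} = x * T_k - T_{k-1}.
    """
    prev, cur = 2, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur


def remark_point(n, w, j=1):
    """Return (X, Y) = (zeta*w + 1/(zeta*w), w + 1/w) with zeta = exp(2*pi*i*j/n)."""
    zeta = cmath.exp(2j * cmath.pi * j / n)
    return zeta * w + 1 / (zeta * w), w + 1 / w, zeta


if __name__ == "__main__":
    n, w = 5, 1.3 + 0.4j
    X, Y, zeta = remark_point(n, w)
    # The point lies on T_n(X) = T_n(Y) ...
    print(abs(chebyshev_T(n, X) - chebyshev_T(n, Y)))
    # ... and on the quadratic factor from the Remark.
    quad = X**2 - (zeta + 1 / zeta) * X * Y + Y**2 + (zeta - 1 / zeta) ** 2
    print(abs(quad))
```

Both printed values should be zero up to rounding, reflecting that $(\zeta w)^n + (\zeta w)^{-n} = w^n + w^{-n}$ whenever $\zeta^n = 1$.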
The crucial case is Siegel's original 1929 version over $\mathbb{Z}$, as extended later by Mahler to the rings of $S$-integers in a number field.
Thus the problem is to investigate when the (possibly reducible) curve defined by $F(X, Y)$ has a component satisfying these 'Siegel conditions' (which cannot generally be improved). This leads in the first place to the need to establish when the defining polynomial $F$ can be reducible. If $f$ is indecomposable (i.e. not of the shape $g \circ h$ for polynomials $g, h$ of degree $> 1$) then the correct condition was found by Fried [7]: namely, $F$ is irreducible unless $f(X)$ is either a cyclic or a Chebyshev polynomial up to a linear change of variable, which of course corresponds to our conclusion. (See also Schinzel's book [21], especially 1.5, where fields of definition are considered as well, which instead we disregard here.) An application of Fried's result would then directly yield the present theorem in the indecomposable cases.
However, if $f$ is decomposable then certainly $F(X, Y)$ is anyway reducible, and the issue leads to more delicate problems concerning the nature of the irreducible factors. In the paper [1] a laborious classification is obtained for all the cases when there is a factor defining a curve of genus $0$. The results of [1] depend on some finite-group theory, which is used to an even much heavier extent in Mueller's paper [19], which again obtains certain complete laborious classifications relevant for suitable applications of Siegel's theorem.
An application of [1] would suffice for the present purposes of proving Theorem 1, even forgetting about Siegel's condition (ii).
But in fact it turns out that adding this condition not only makes the former condition (i) automatic, but also leads to a much simpler and self-contained elementary proof, which can hopefully be useful for some readers and for other applications. Moreover this proof yields with little effort a slightly more precise conclusion, as in the last phrase of the statement below (which, as in the Remark above, allows one to describe all but finitely many integral points).
To present such a proof is the scope of this Appendix. By the remarks above, for Theorem 1 it will suffice to prove the following result (even disregarding the last conclusion): Theorem 2.
Assume that the polynomial F ( X, Y ) has an irreducible factor Φ defining a curvewith at most two points at infinity (in a closure in P ). Then deg Φ ≤ and there are aninteger n > and polynomials g, l ∈ C [ X ] , with deg l = 1 , such that f = g ◦ S n ◦ l , where S n is the cyclic (if deg Φ = 1 ) or the Chebyshev (if deg Φ = 2 ) polynomial of degree n .If deg Φ = 1 , then Φ divides l ( X ) n − l ( Y ) n . If deg Φ = 2 , then Φ is symmetric and eitherit divides S n ( l ( X )) − S n ( l ( Y )) , or g is even and Φ divides S n ( l ( X )) + S n ( l ( Y )) .Proof. To start with, we normalize f by assuming it is monic and with vanishing secondcoefficient: f ( X ) = X d + f X d − + . . . + f d , f i ∈ C . This does not affect the results on takinginto account the linear polynomial l ( X ) in the statement.Our affine (possibly reducible) curve C F : F ( X, Y ) = 0 has degree d − . Note that thepoints at infinity in P of (the closure of) this curve are given in homogenous coordinates ( x : y : z ) by z = 0 , x d = y d , x = y , so they form a set of d − pairwise distinct points. Let Φ( X, Y ) ∈ C [ X, Y ] be an irreducible factor of F ( X, Y ) , defining an irreducible curve C Φ with at most two points at infinity. The homogeneous part of Φ of highest degree must bea factor of ( X d − Y d ) / ( X − Y ) , and the points at infinity correspond to linear factors of this By points at infinity we mean the missing points with respect to a projective closure of the curve. Thisnumber may increase by passing to a smooth model, but the theorem applies to any model. They are smooth points, which simplifies things as we do not need to refer to smooth models.
homogeneous part. Since this has no multiple factors, we deduce that $C_\Phi$ has $\deg \Phi$ points at infinity. Hence, if $C_\Phi$ satisfies Siegel's condition (ii), we must have $\deg \Phi \le 2$.

From these considerations it also follows that we may assume that $\Phi$ is monic in $Y$.

Suppose first that $\deg \Phi = 1$, so $\Phi(X,Y) = Y - aX - b$; hence we must have $f(aX + b) = f(X)$ identically. Since however $f$ has vanishing second coefficient, this entails $b = 0$, hence $f(aX) = f(X)$. We already know that $a$ is a $d$-th root of unity, $a \ne 1$. If $n$ is the exact order of $a$, then $n > 1$ divides $d$ and $f$ must be a polynomial in $X^n$, i.e. $f(X) = g(X^n)$, and we fall into one of the cases of the conclusion.

Note that $Y - aX$ indeed divides $X^n - Y^n$, so the last assertion holds as well.

Suppose now that $\deg \Phi = 2$. The two points at infinity of $C_\Phi$ correspond to two Puiseux expansions
$$Y = P_\pm(X) := a_\pm X + b_{0\pm} + b_{1\pm} X^{-1} + \dots$$
in descending powers of $X$, where $b_{i\pm}$ are complex numbers and $a_\pm$ are two distinct $d$-th roots of $1$, both different from $1$.

We have $\Phi(X, P_\pm(X)) = 0$ hence $F(X, P_\pm(X)) = 0$, so $f(X) = f(P_\pm(X))$ identically. As before, since $f$ has vanishing second coefficient this yields $b_{0\pm} = 0$. We may write
$$\Phi(X,Y) = (Y - a_+ X)(Y - a_- X) + L(X,Y) - k,$$
where $L$ is linear homogeneous and $k \in \mathbb{C}$. We have that $P_\pm(X) - a_\pm X = O(X^{-1})$, in the sense that it is a Puiseux series where no non-negative power of $X$ appears. Since $\Phi(X, P_\pm(X)) = 0$ we get that $L(X, P_\pm(X)) = O(1)$ for both choices of the sign. But then, since $a_\pm$ are distinct, this implies $L = 0$, and since $\Phi$ is irreducible we have $k \ne 0$. Hence, setting $s := a_+ + a_-$, $p := a_+ a_-$, we have $pk \ne 0$ and
$$\Phi(X,Y) = (Y - a_+ X)(Y - a_- X) - k = Y^2 - sXY + pX^2 - k.$$

Let now $x$ be a variable over $\mathbb{C}$ and let $y$ be a solution of $\Phi(x,y) = 0$ in an extension of $\mathbb{C}(x)$, so $\mathcal{F} := \mathbb{C}(x,y)$ is the function field of $C_\Phi$.
Note that $\mathcal{F}$ is a quadratic extension of both $\mathbb{C}(x)$ and $\mathbb{C}(y)$; looking at the equation we find that the Galois groups are generated respectively by the automorphisms $\sigma, \tau$ of $\mathcal{F}$ (of order $2$) given by
$$\sigma(x) = x, \quad \sigma(y) = sx - y, \qquad \tau(x) = \Big(\frac{s}{p}\Big) y - x, \quad \tau(y) = y.$$

It will be notationally convenient to have another expression for $\mathcal{F}$. Define the linear forms $Z_\pm := Y - a_\pm X$, so $\Phi = Z_+ Z_- - k$. Letting $z_\pm = y - a_\pm x$ we thus have $z_+ z_- = k$ and
$$x = \frac{z_+ - z_-}{a_- - a_+} = \gamma (z_+ - z_-), \qquad y = \gamma (a_- z_+ - a_+ z_-),$$
where we have put $\gamma := (a_- - a_+)^{-1}$. So in particular we have $\mathcal{F} = \mathbb{C}(z_+)$ and by an easy computation one finds that the above automorphisms are expressed by
$$\sigma(z_+) = -z_- = \alpha z_+^{-1}, \qquad \tau(z_+) = -\frac{a_+}{a_-} z_- = \beta z_+^{-1}, \tag{51}$$
where $\alpha = -k$, $\beta = -k a_+/a_-$.

Now, since $\Phi(x,y) = 0$ we have $F(x,y) = 0$ whence $f(x) = f(y)$, so the field $K := \mathbb{C}(x) \cap \mathbb{C}(y)$ contains $\mathbb{C}(f(x))$ and thus the degree $[\mathcal{F} : K]$ is finite. The field $K$ is left fixed by both $\sigma, \tau$, and thus by the group $G$ that they generate inside $\mathrm{Aut}(\mathcal{F}/\mathbb{C}) = \mathrm{PGL}_2(\mathbb{C})$. By basic Galois theory actually the fixed field of $G$ is precisely the intersection $\mathbb{C}(x) \cap \mathbb{C}(y) = K$.

We have $\sigma(\tau(z_+)) = (\beta/\alpha) z_+$, hence $\beta/\alpha = a_+/a_-$ is a root of unity of a certain order $n$: actually, we already knew that $a_+, a_-$ are $d$-th roots of unity, and they are distinct, so $n > 1$ is a divisor of $d$.

The group $G$ is generated by $\sigma$ and $\xi := \sigma\tau$. On looking at the action on $z_+$ it is now easily seen that $\sigma^{-1} \xi \sigma = \xi^{-1}$, so $G$ is a dihedral group of order $2n$.

Now, the rational function of $z_+$ given by $w := z_+^n + \alpha^n z_+^{-n}$, of degree $2n$, is plainly invariant by both $\sigma$ and $\xi$, hence by $G$. Again by simple Galois theory, we have $\mathbb{C}(w) = K$. Therefore $f(x)$, which lies in $K$, is a rational function of $w$, $f(x) = g(w)$. (On comparing degrees we find $\deg g = d/n$.) Recall that $x = \gamma(z_+ - z_-) = \gamma(z_+ + (-k) z_+^{-1}) = \gamma(z_+ + \alpha z_+^{-1})$.
Hence $x$ has only the poles $z_+ = 0, \infty$, and the same holds for $f(x)$ (as functions of $z_+$). It follows at once that $g$ must be a polynomial, of degree $d/n$.

The proof is now easily completed by a simple change of variables. We have $w \in K \subset \mathbb{C}(x)$, so we may write $w = S(x)$ with $S$ a rational function of degree $n$, which as above must be a polynomial.

Set $z = \delta z_+$ where $\delta^2 \alpha = 1$. Hence $x = \gamma\delta^{-1}(z + z^{-1})$. Also, $w = \delta^{-n}(z^n + z^{-n})$. Hence $\delta^n S(\gamma\delta^{-1}(z + z^{-1})) = z^n + z^{-n}$, and by uniqueness it follows that $\delta^n S(\gamma\delta^{-1} X) = T_n(X)$ is the Chebyshev polynomial of degree $n$. Hence in conclusion we find
$$f(X) = g(\delta^{-n} T_n(\gamma^{-1}\delta X)),$$
as required.

To check the last assertion, for notational simplification we slightly change conventions and replace $g(\delta^{-n} X)$ with $g(X)$ and $f(X)$ with $f(\gamma\delta^{-1} X)$, so as to suppose $f(X) = g(T_n(X))$. In the above notation, $x$ becomes $z + z^{-1}$ and $y = a_- z + a_+ z^{-1}$. (Note that these substitutions leave unchanged the set $\{a_+, a_-\}$.)

Also, let $\mu^2 = a_+/a_-$, so $\mu^n =: \epsilon \in \{\pm 1\}$. We have $y = \mu a_-((z/\mu) + (z/\mu)^{-1})$, so $T_n((\mu a_-)^{-1} y) = \epsilon(z^n + z^{-n}) = \epsilon T_n(x)$. Hence, setting $\nu := (\mu a_-)^{-1}$, we have
$$T_n(\nu y) = \epsilon T_n(x), \qquad g(T_n(y)) = f(y) = f(x) = g(T_n(x)) = g(\epsilon T_n(\nu y)).$$

Denoting $b := \deg g = d/n$, we then deduce that $\deg\big(T_n(y)^b - (\epsilon T_n(\nu y))^b\big) \le (b-1)n$. But on factoring the left side and noting that all factors but at most one have degree $\ge n$, this implies that in fact one of the factors is constant, hence
$$T_n(y) = \theta\epsilon T_n(\nu y) + c, \qquad g(\theta X + c) = g(X), \tag{52}$$
for some $b$-th root of unity $\theta$. Note that all of these equalities hold identically.

Now, the Chebyshev polynomial $T_n(X)$ starts with $X^n - nX^{n-2} + \dots$, whence the first of the equations gives $\theta\epsilon\nu^n = \nu^2 = 1$. Also, if $n$ is odd then $T_n(0) = 0$ whence $c = 0$; if $n$ is even then $\nu^n = 1$ so $\theta\epsilon = 1$ and again setting $y = 0$ we find $c = 0$ anyway.
Conversely, if these equalities hold it is easy to check that the equation holds, since $T_n$ has the same parity as $n$. So we may suppose in the sequel that $\theta\epsilon\nu^n = \nu^2 = 1$ and that $g(\theta X) = g(X)$.

Now, consider again the equation $T_n(\nu y) = \epsilon T_n(x)$, i.e. $\nu^n T_n(y) = \epsilon T_n(x)$.

If $\nu^n = \epsilon$ we have $T_n(x) = T_n(y)$, so $\Phi(X,Y)$ divides $(T_n(X) - T_n(Y))/(X - Y)$, and we are in the first case of the conclusion.

If $\nu^n \ne \epsilon$, then $T_n(x) = -T_n(y)$, hence $\Phi(X,Y)$ divides $T_n(X) + T_n(Y)$. Also, we have already observed that $\theta = \epsilon\nu^n$, which in this case equals $-1$, so $g$ is an even polynomial by the second equation in (52) (since $c = 0$), again as in the sought conclusion.

Finally, from the above equations we derive
$$p = a_+ a_- = (a_+/a_-)(a_-)^2 = (\mu a_-)^2 = \nu^{-2} = 1,$$
hence $\Phi(X,Y)$ is symmetric.

This concludes the proof of Theorem 2. □

Remark. Actually, the proof yields some small supplementary information on the structure of the factors (which however can be deduced independently a posteriori).

We also note that the last conclusion could have been stated as follows: if $n$ is maximal such that the decomposition holds, then the quadratic factor anyway divides $S_n(l(X)) - S_n(l(Y))$. Indeed, if $g$ is even, then since $T_2(X) = X^2 - 2$, $g$ can be written as $h \circ T_2$, and now we use that $T_2 \circ T_n = T_{2n}$ (well known and easy to deduce). This argument is fairly standard.
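The Chebyshev identities used in the proof and in the Remark can be checked mechanically. The following is a minimal Python sketch, not part of the original appendix: it assumes the normalization $T_n(z + z^{-1}) = z^n + z^{-n}$ used above (so $T_0 = 2$, $T_1 = X$, $T_{m+1} = X T_m - T_{m-1}$), and all function names are hypothetical helpers chosen for illustration. It verifies, for small $n$, that $T_n$ starts with $X^n - nX^{n-2}$, that $T_n$ has the parity of $n$, and that $T_2 \circ T_n = T_{2n}$. It also checks one illustrative instance of the second case of Theorem 2, chosen here as an assumption: $f = g \circ T_2$ with the even polynomial $g(X) = X^2$, where $\Phi(X,Y) = T_2(X) + T_2(Y) = X^2 + Y^2 - 4$.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = degree)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Add two coefficient lists, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + v for u, v in zip(a, b)]

def poly_eval(p, x):
    """Evaluate a coefficient list at x by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc

def poly_compose(p, q):
    """Return the coefficient list of p(q(X)) (Horner's rule on polynomials)."""
    out = [p[-1]]
    for c in reversed(p[:-1]):
        out = poly_add(poly_mul(out, q), [c])
    return out

def chebyshev(n):
    """T_n with T_n(z + 1/z) = z^n + z^(-n): T_0 = 2, T_1 = X,
    T_{m+1} = X*T_m - T_{m-1}."""
    prev, cur = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = poly_add(poly_mul([0, 1], cur), [-c for c in prev])
        prev, cur = cur, nxt
    return cur

def check_identities(max_n=8):
    """Check the identities quoted in the proof for 2 <= n <= max_n."""
    z = Fraction(3, 2)  # any nonzero rational works; arithmetic is exact
    t2 = chebyshev(2)
    for n in range(2, max_n + 1):
        t = chebyshev(n)
        # T_n starts with X^n - n*X^(n-2) + ...
        assert t[n] == 1 and t[n - 1] == 0 and t[n - 2] == -n
        # T_n has the parity of n: coefficients of the wrong parity vanish
        assert all(c == 0 for i, c in enumerate(t) if (n - i) % 2 == 1)
        # defining identity T_n(z + 1/z) = z^n + z^(-n)
        assert poly_eval(t, z + 1 / z) == z**n + z**(-n)
        # T_2 o T_n = T_{2n}, as used in the Remark
        assert poly_compose(t2, t) == chebyshev(2 * n)
    # Illustrative example (an assumption of this sketch): f = g o T_2 with g = X^2.
    f = poly_mul(t2, t2)
    assert f == poly_add(chebyshev(4), [2])  # f = T_4 + 2, so n = 4 is maximal
    for x in range(-5, 6):
        for y in range(-5, 6):
            phi = x * x + y * y - 4  # Phi(x, y) = T_2(x) + T_2(y)
            diff = poly_eval(t2, x) - poly_eval(t2, y)
            # f(X) - f(Y) = (T_2(X) - T_2(Y)) * Phi(X, Y)
            assert poly_eval(f, x) - poly_eval(f, y) == diff * phi
    return True

if __name__ == "__main__":
    print(check_identities())
```

In this example $\Phi$ divides $T_2(X) + T_2(Y)$ with $g$ even, and since $f = T_4 + 2$ it also divides $T_4(X) - T_4(Y) = f(X) - f(Y)$, in line with the reformulation in the Remark. The grid check is a numerical sanity test on integer points, not a proof of the polynomial identity.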
References

[1] R. Avanzi and U. Zannier, The equation f(X) = f(Y) in rational functions X = X(t), Y = Y(t), Compositio Mathematica
K3 surfaces in P × P × P, Math. Ann.
f(x) = g(y), Acta Arithmetica
Cambridge University Press, 2006.
[5] W. Bosma, J. Cannon, and C. Playoust, The Magma algebra system I: The user language, Journal of Symbolic Computation
Cambridge University Press, 2009.
[7] M. Fried, On a conjecture of Schur, Michigan Math. J., 17 (1970): 41-50.
[8] D. Ghioca, T. J. Tucker, and M. E. Zieve, Linear relations between polynomial orbits, Duke Mathematical Journal
Linear algebra and its applications
J. Number Theory, 201 (2019): 228-256.
[11] W. Hindes, Dynamical and arithmetic degrees for random iterations of maps on projective space, preprint, arXiv:1904.04709.
[12] W. Hindes, Finite orbit points for sets of quadratic polynomials, Int. J. Number Theory, 15.8 (2019): 1693-1719.
[13] W. Hindes, Dynamical height growth: left, right, and total orbits, submitted, arXiv:2002.09798.
[14] Z. Jiang and M. Zieve, Functional equations in polynomials, REU project.
[15] S. Kawaguchi, Canonical heights for random iterations in certain varieties, Int. Math. Res. Not., Article ID rnm023, 2007.
[16] S. Kawaguchi and Joseph H. Silverman, On the dynamical and arithmetic degrees of rational self-maps of algebraic varieties, Journal für die reine und angewandte Mathematik (Crelles Journal) 2016.713 (2016): 21-48.
[17] S. Lang, Diophantine Geometry, Springer-Verlag, 1982.
[18] S. Lang, Fundamentals of Diophantine geometry, Springer Science & Business Media, 2013.
[19] P. Mueller, Permutation groups with a cyclic two-orbits subgroup and monodromy groups of Laurent polynomials, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 12 (2013), no. 2, 369-398.
[20] F. Pakovich, Algebraic curves P(x) − Q(y) = 0 and functional equations, Complex Variables and Elliptic Equations 56.1-4 (2011): 199-213.
[21] A. Schinzel, Polynomials with special regard to reducibility, Cambridge Univ. Press, 2000.
[22] J-P. Serre, Lectures on the Mordell-Weil theorem, Aspects of Mathematics, Friedr. Vieweg & Sohn, Braunschweig, third edition, 1997. Translated from the French and edited by Martin Brown from notes by Michel Waldschmidt, with a foreword by Brown and Serre.
[23] J-P. Serre, Lectures on the Mordell-Weil Theorem, 2nd Ed., Vieweg, 1990.
[24] J. Silverman, The Arithmetic of Dynamical Systems, Vol. 241, Springer GTM, 2007.
[25] J. Silverman, Rational points on K3 surfaces: a new canonical height, Inventiones mathematicae
Mathematics of Computation 39.160 (1982): 709-723.