Entropy and drift in word hyperbolic groups
arXiv [math.PR]
SÉBASTIEN GOUËZEL, FRÉDÉRIC MATHÉUS, FRANÇOIS MAUCOURANT
Abstract.
The fundamental inequality of Guivarc’h relates the entropy and the drift of random walks on groups. It is strict if and only if the random walk does not behave like the uniform measure on balls. We prove that, in any nonelementary hyperbolic group which is not virtually free, endowed with a word distance, the fundamental inequality is strict for symmetric measures with finite support, uniformly for measures with a given support. This answers a conjecture of S. Lalley. For admissible measures, this is proved using previous results of Ancona and Blachère-Haïssinsky-Mathieu. For non-admissible measures, this follows from a counting result, interesting in its own right: we show that, in any infinite index subgroup, the number of non-distorted points is exponentially small. The uniformity is obtained by studying the behavior of measures that degenerate towards a measure supported on an elementary subgroup.

1. Main results
Let $\Gamma$ be a finitely generated infinite group. Although the following discussion makes sense in a much broader context, we will assume that $\Gamma$ is hyperbolic since all results of this article are devoted to this setting. There are two natural ways to construct random elements in $\Gamma$:
• Let $d$ be a proper left-invariant distance on $\Gamma$ (for instance a word distance). For large $n$, one can pick an element at random with respect to the uniform measure $\rho_n$ on the ball $B_n = B(e, n)$ (where $e$ denotes the identity of $\Gamma$).
• Let $\mu$ be a probability measure on $\Gamma$. For large $n$, one can pick an element at random with respect to the measure $\mu^{*n}$ (the $n$-th convolution of the measure $\mu$). Equivalently, let $g_1, g_2, \dots$ be a sequence of random elements of $\Gamma$ that are distributed independently according to $\mu$. Form the random walk $X_n = g_1 \cdots g_n$. Then the distribution of $X_n$ is $\mu^{*n}$.

From a theoretical point of view, these methods share a lot of properties. From a computational point of view, the second method is much easier to implement in general groups since it does not require the computation of the ball $B_n$ (note however that, in hyperbolic groups, simulating the uniform measure is very easy thanks to the automatic structure of the group). It is therefore of interest to find probability measures $\mu$ such that these two methods give equivalent results, in a sense that will be made precise below. This is the main question of Vershik in [Ver00]. In free groups (with the word distance coming from the usual set of generators), everything can be computed: if $\mu$ is the uniform measure on the generators, then $\mu^{*n}$ and $\rho_n$ behave essentially in the same way. The situation is the same in free products of finite groups, again thanks to the underlying tree structure. However, in more complicated groups, explicit computations are essentially impossible, and it is expected that the methods always differ.

Our main result confirms this intuition in a special class of groups: In hyperbolic groups which are not virtually free (i.e., there is no finite index free subgroup), if $d$ is a word distance, the two methods are always different, in a precise quantitative way.

Date: January 20, 2015.

Remark 1.1.
We emphasize that the question really depends on the choice of the distance $d$, since the shape of the balls $B_n$ depends on $d$. For instance, for any symmetric probability measure $\mu$ on $\Gamma$ whose support is finite and generates $\Gamma$, there exists a distance $d$ (called the Green distance, see [BHM11]) for which the measures $\rho_n$ and $\mu^{*n}$ behave in the same way. A famous open problem (to which our methods do not apply) is to understand what happens when $\Gamma$ acts cocompactly on the hyperbolic space $\mathbb{H}^k$, and the distance $d$ is given by $d(e, \gamma) = d_{\mathbb{H}^k}(O, \gamma \cdot O)$ where $O$ is a base point in $\mathbb{H}^k$. In this case, it is also expected that the two methods are always different. Here are the main partial results in this context:
(1) The two methods are different for some symmetric measures with finite support ([LP07], see also Theorem 5.9 below).
(2) If, instead of a cocompact lattice, one considers a lattice with cusps, the two methods are always different [GLJ93].
(3) If, instead of a lattice, one considers a nice dense subgroup, there exist symmetric measures with finite support for which the two methods are equivalent [Bou12].

This question also makes sense in continuous time, for negatively curved manifolds. A conjecture of Sullivan asserts that, in this setting, the two methods coincide if and only if the manifold is locally symmetric, see [Led95].

One can give several meanings to the question “are the two methods equivalent?” Let us first discuss an interpretation in terms of behavior at infinity. The measures $\mu^{*n}$ converge in the geometric compactification $\Gamma \cup \partial\Gamma$ to a measure $\mu_\infty$, supported on the boundary, called the exit measure of the random walk, or its stationary measure. Geometrically, the random walk $(X_n)_{n \geq 0}$ converges almost surely to a random point on the boundary $\partial\Gamma$, and the measure $\mu_\infty$ is its distribution. On the other hand, let $\rho_\infty$ be the Patterson-Sullivan measure on $\partial\Gamma$ associated to the distance $d$, constructed in [Coo93] in this context. One should think of it as the uniform measure on the boundary (it is equivalent to the Hausdorff measure of maximal dimension on the boundary, for any visual distance coming from $d$). The measures $\rho_n$ do not always converge to $\rho_\infty$, but all their limit points are equivalent to $\rho_\infty$, with a density bounded from above and from below (this follows from the arguments of [Coo93], see Lemma 2.13 below). A version of the question is then to ask if the measures $\mu_\infty$ and $\rho_\infty$ are mutually singular: in this case, the random walk mainly visits parts of the group that are not important from the point of view of the uniform measure.

Another version of the same question is quantitative: Does the random walk visit parts of the group that are exponentially negligible from the point of view of the uniform measure? This is made precise through the notions of drift and entropy. Define

$$L(\mu) = \sum_{g \in \Gamma} \mu(g)\,|g|, \qquad H(\mu) = \sum_{g \in \Gamma} \mu(g)\,(-\log \mu(g)), \tag{1.1}$$

where $|g| = d(e, g)$. The quantity $L(\mu)$ is the average distance of an element to the identity. The quantity $H(\mu)$, called the time one entropy of $\mu$, is the average logarithmic weight of the points. They can both be finite or infinite. The functions $L$ and $H$ both behave in a subadditive way with respect to convolution: $L(\mu_1 * \mu_2) \leq L(\mu_1) + L(\mu_2)$ and $H(\mu_1 * \mu_2) \leq H(\mu_1) + H(\mu_2)$. It follows that the sequences $L(\mu^{*n})$ and $H(\mu^{*n})$ are subadditive. Hence, the following quantities are well defined:

$$\ell(\mu) = \lim L(\mu^{*n})/n, \qquad h(\mu) = \lim H(\mu^{*n})/n. \tag{1.2}$$

They are called respectively the drift and the asymptotic entropy of the random walk. They also admit characterizations along typical trajectories. If $L(\mu)$ is finite, then almost surely $\ell(\mu) = \lim |X_n|/n$. In the same way, if $H(\mu)$ is finite, then almost surely $h(\mu) = \lim (-\log \mu^{*n}(X_n))/n$.
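The free group case mentioned earlier can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the simple random walk on the free group with k generators (uniform measure on the 2k generators and their inverses), for which the law of X_n is uniform on each sphere, so that |X_n| is a nearest-neighbour Markov chain on the integers. All function names are hypothetical. It computes L(µ^{*n})/n and H(µ^{*n})/n exactly and compares them with log(2k − 1), the growth rate of balls in this group (the quantity v of (1.3)).

```python
import math

def length_distribution(k, n):
    """Exact law of |X_n| for the simple random walk on the free group F_k."""
    p_out = (2 * k - 1) / (2 * k)   # a step that lengthens the reduced word
    p_in = 1 / (2 * k)              # a step that cancels the last letter
    dist = [0.0] * (n + 2)
    dist[0] = 1.0
    for _ in range(n):
        new = [0.0] * (n + 2)
        new[1] += dist[0]           # from the identity, every step has length 1
        for m in range(1, n + 1):
            if dist[m] == 0.0:
                continue
            new[m + 1] += dist[m] * p_out
            new[m - 1] += dist[m] * p_in
        dist = new
    return dist

def sphere_size(k, m):
    """Number of reduced words of length m in F_k."""
    return 1 if m == 0 else 2 * k * (2 * k - 1) ** (m - 1)

def drift_and_entropy(k, n):
    """Return (L(mu^{*n}) / n, H(mu^{*n}) / n) for the simple random walk."""
    dist = length_distribution(k, n)
    mean_len = sum(m * p for m, p in enumerate(dist))
    ent = 0.0
    for m, p in enumerate(dist):
        if p > 0:
            # each element of length m carries mass p / sphere_size(k, m)
            ent += p * (math.log(sphere_size(k, m)) - math.log(p))
    return mean_len / n, ent / n

if __name__ == "__main__":
    k, n = 2, 400
    l_n, h_n = drift_and_entropy(k, n)
    v = math.log(2 * k - 1)
    print(f"L/n = {l_n:.4f}, H/n = {h_n:.4f}, (L/n) * v = {l_n * v:.4f}")
```

In this sketch the ratio of H(µ^{*n})/n to (L(µ^{*n})/n) · log(2k − 1) approaches 1 as n grows, which is the equality case h = ℓv that the main theorem rules out for groups that are not virtually free.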
The most intuitive characterization of the entropy is probably the following one: at time $n$, the random walk is essentially supported by $e^{h(\mu)n}$ points (see Lemma 2.4 for a precise statement). Let us also define the exponential growth rate of the group with respect to $d$, i.e.,

$$v = \liminf_{n \to \infty} \frac{\log |B_n|}{n}, \tag{1.3}$$

where $B_n$ is the ball of radius $n$ around $e$. In hyperbolic groups, it satisfies the apparently stronger inequality $C^{-1} e^{nv} \leq |B_n| \leq C e^{nv}$, by [Coo93]. For large $n$, most points for $\mu^{*n}$ are contained in a ball $B_{(1+\varepsilon)\ell n}$, which has cardinality at most $e^{(1+2\varepsilon)\ell n v}$. Since the random walk at time $n$ essentially visits $e^{hn}$ points, we deduce the fundamental inequality of Guivarc’h [Gui80]

$$h \leq \ell v.$$

If this inequality is an equality, this means that the walk visits most parts of the group. Otherwise, it is concentrated in an exponentially small subset. Another version of our main question is therefore: Is the inequality $h \leq \ell v$ strict?

In hyperbolic groups, it turns out that the two versions of the question are equivalent, at least for finitely supported measures, and that they also have a geometric interpretation in terms of Hausdorff dimension. If $\mu$ is a probability measure on a group, we write $\Gamma_\mu^+$ for the semigroup generated by the support of $\mu$, and $\Gamma_\mu$ for the group it generates. When $\mu$ is symmetric, they coincide. We say that $\mu$ is admissible if $\Gamma_\mu^+ = \Gamma$. The following result is Corollary 1.4 and Theorem 1.5 in [BHM11] (see also [Haï13]) when the measure is symmetric, and is proved in [Tan14] when $\mu$ is not necessarily symmetric and $d$ is a word distance.

Theorem 1.2.
Let $\Gamma$ be a non-elementary hyperbolic group, endowed with a left-invariant distance $d$ which is hyperbolic and quasi-isometric to a word distance. Let $v$ be the exponential growth rate of $(\Gamma, d)$. Let $d_{\partial\Gamma}$ be a visual distance on $\partial\Gamma$ associated to $d$. Consider an admissible probability measure $\mu$ on $\Gamma$, with finite support. Assume additionally either that the measure $\mu$ is symmetric, or that the distance $d$ is a word distance. The following conditions are equivalent:
(1) The equality $h = \ell v$ holds.
(2) The Hausdorff dimension of the exit measure $\mu_\infty$ on $(\partial\Gamma, d_{\partial\Gamma})$ is equal to the Hausdorff dimension of this space.
(3) The measure $\mu_\infty$ is equivalent to the Patterson-Sullivan measure $\rho_\infty$.
(4) The measure $\mu_\infty$ is equivalent to the Patterson-Sullivan measure $\rho_\infty$, with density bounded from above and from below.
(5) There exists $C > 0$ such that, for any $g \in \Gamma$, $|v\,d(e, g) - d_\mu(e, g)| \leq C$, where $d_\mu$ is the “Green distance” associated to $\mu$, i.e., $d_\mu(e, g) = -\log \mathbb{P}(\exists n,\ X_n = g)$ where $X_n$ is the random walk given by $\mu$ starting from the identity (it is an asymmetric distance in general, and a genuine distance if $\mu$ is symmetric).

The different statements in this theorem go from the weakest to the strongest: since entropy is an asymptotic quantity, an assumption on $h$ seems to allow subexponential fluctuations, so the assumption (1) is rather weak. On the other hand, (3) says that two measures are equivalent, so most points are controlled. Finally, in (5), all points are uniformly controlled. The equivalence between these statements is a strong rigidity theorem.

The equivalence between (1) and (2) follows from a formula for the respective dimensions. The definition of a visual distance at infinity $d_{\partial\Gamma}$ involves a small parameter $\varepsilon$. In terms of this parameter, one has $HD(\mu_\infty) = h/(\varepsilon\ell)$ and $HD(\rho_\infty) = HD(\partial\Gamma) = v/\varepsilon$, so that these dimensions coincide if and only if $h = \ell v$.

In this theorem, the finite support assumption can be weakened to an assumption of superexponential moment (i.e., for all $M > 0$, $\sum_{g \in \Gamma} \mu(g) e^{M|g|} < \infty$), thanks to [Gou13]. The assumption that $\mu$ is symmetric or that $d$ is a word distance is probably not necessary. However, the most important assumption in Theorem 1.2 is admissibility: it ensures that the random walk can see the geometry of the whole group (which is hyperbolic). For a random walk living in a strict (maybe distorted) subgroup, one would not expect the same nice behavior.

Our main theorem follows. It states that, in hyperbolic groups which are not virtually free, endowed with a word distance, the different equivalent conditions of Theorem 1.2 are never satisfied, uniformly on measures with a fixed support.

Theorem 1.3.
Let $\Gamma$ be a hyperbolic group which is not virtually free, endowed with a word distance $d$. Let $\Sigma$ be a finite subset of $\Gamma$. There exists $c < 1$ such that, for any symmetric probability measure $\mu$ supported in $\Sigma$,

$$h(\mu) \leq c\,\ell(\mu)\,v,$$

where $v$ is the exponential growth rate of balls in $(\Gamma, d)$.

This theorem gives a positive answer to a conjecture of S. Lalley [Lal14, slide 16]. In the language of Vershik [Ver00], this theorem says that no finite subset of $\Gamma$ is extremal. On the other hand, if one lets $\Sigma$ grow, $h/\ell$ can converge to $v$:

Theorem 1.4.
Let $\Gamma$ be a hyperbolic group, endowed with a left-invariant distance $d$ which is hyperbolic and quasi-isometric to a word distance. Let $\rho_i$ be the uniform measure on the ball of radius $i$. Then $h(\rho_i)/\ell(\rho_i) \to v$, where $v$ is the exponential growth rate of balls in $(\Gamma, d)$.

More precisely, we prove that $\ell(\rho_i) \sim i$ and $h(\rho_i) \sim iv$. The only difficulty is to prove the lower bound on $h(\rho_i)$: since $h$ is defined in (1.2) using a subadditive sequence, upper bounds are automatic, but to get lower bounds one should show that additional cancellations do not happen later on. This difficulty already appears in [EK13], where the authors prove that the entropy depends continuously on the measure. Our proof of Theorem 1.4, given in Paragraph 2.5, also applies to this situation and gives a new proof of their result, under slightly weaker assumptions. There is nothing special about the uniform measure on balls: our proof also gives the same conclusion for the uniform measure on spheres, or for the measures $\sum e^{-s|g|}\delta_g / \sum e^{-s|g|}$ when $s \searrow v$.

Our main result is Theorem 1.3. It is a consequence of the three following results. Since their main aim is Theorem 1.3, they are designed to handle finitely supported symmetric measures. However, these theorems are all valid under weaker assumptions, which we specify in the statements as they carry along implicit information on the techniques used in the proofs.

The first result deals with admissible (or virtually admissible) measures.

Theorem 1.5.
Let $\Gamma$ be a hyperbolic group which is not virtually free, endowed with a word distance. Let $\mu$ be a probability measure with a superexponential moment, such that $\Gamma_\mu^+$ is a finite index subgroup of $\Gamma$. Then $h(\mu) < \ell(\mu)v$.

The second result deals with non-admissible measures.

Theorem 1.6.
Let $\Gamma$ be a hyperbolic group endowed with a word distance. Let $\mu$ be a probability measure with a moment of order 1 (i.e., $L(\mu) < \infty$). Assume that $\ell(\mu) > 0$ and that $\Gamma_\mu$ has infinite index in $\Gamma$. Then $h(\mu) < \ell(\mu)v$.

Finally, the third result is a kind of continuity statement, to get the uniformity.

Theorem 1.7.
Let $\Gamma$ be a hyperbolic group, endowed with a left-invariant distance which is hyperbolic and quasi-isometric to a word distance. Let $\Sigma$ be a subset of $\Gamma$ which does not generate an elementary subgroup. There exists a probability measure $\mu_\Sigma$ with finite support such that $\ell(\mu_\Sigma) > 0$ and

$$\sup\{h(\mu)/\ell(\mu) : \mu \text{ probability},\ \operatorname{Supp}(\mu) \subset \Sigma,\ \ell(\mu) > 0\} = h(\mu_\Sigma)/\ell(\mu_\Sigma).$$

The same statement holds if the maximum is taken over symmetric probability measures, the resulting maximizing measure being symmetric.
Theorem 1.3 is a consequence of these three statements.
Proof of Theorem 1.3 using the three auxiliary theorems.
As in the statement of the theorem, consider a finite subset $\Sigma$ of $\Gamma$. If $\Sigma$ generates an elementary subgroup of $\Gamma$, all measures supported on $\Sigma$ have zero entropy. Hence, one can take $c = 0$ in the statement of the theorem. Otherwise, by Theorem 1.7, there exists a symmetric measure $\mu_\Sigma$ with finite support that maximizes the quantity $h(\mu)/\ell(\mu)$ over symmetric measures $\mu$ supported by $\Sigma$. If $\Gamma_{\mu_\Sigma} = \Gamma^+_{\mu_\Sigma}$ has finite index, $h(\mu_\Sigma)/\ell(\mu_\Sigma) < v$ by Theorem 1.5. If it has infinite index, the same conclusion follows from Theorem 1.6. In both cases, one can take $c = h(\mu_\Sigma)/(\ell(\mu_\Sigma)v) < 1$. □

The three auxiliary theorems are non-trivial. Their proofs are independent, and use completely different tools. Here are some comments about them.
• At first sight, Theorem 1.5 seems to be the most delicate (this is the only one with the assumption that $\Gamma$ is not virtually free). However, this is also the setting that has been mostly studied in the literature. Hence, we may use several known results, including most notably results of Ancona [Anc87], of Blachère, Haïssinsky and Mathieu [BHM11] and Tanaka [Tan14] (Theorem 1.2 above) and of Izumi, Neshveyev and Okayasu [INO08] on rigidity results for cocycles. The proof relies mainly on the fact that the word distance is integer valued, contrary to the Green distance (more precisely, we use the fact that the stable translation length of hyperbolic elements is rational with bounded denominator).
• In Theorem 1.6, the difficulty comes from the lack of information on the subgroup $\Gamma_\mu$. If it has good geometric properties (for instance if it is quasi-convex), one may use the same kind of techniques as for Theorem 1.5. Otherwise, the random walk does not really see the hyperbolicity of the ambient group. The fundamental inequality always gives $h \leq \ell v_{\Gamma_\mu}$, where $v_{\Gamma_\mu}$ is the growth rate of the subgroup $\Gamma_\mu$ (for the initial word distance on $\Gamma$). If $v_{\Gamma_\mu} < v$, the result follows. Unfortunately, there exist non-quasi-convex subgroups of some hyperbolic groups with the same growth as the ambient group. However, a random walk does not typically visit all points of $\Gamma_\mu$: it concentrates on those points that are not distorted (i.e., their distances to the identity in $\Gamma$ and $\Gamma_\mu$ are comparable). To prove Theorem 1.6, we will show that in any infinite index subgroup of a hyperbolic group, the number of non-distorted points is exponentially smaller than $e^{nv}$.
• Theorem 1.7 is less simple than it may seem at first sight: it does not claim that $\mu_\Sigma$ is supported by $\Sigma$, and indeed this is not the case in general (see Example 5.4). Hence, the proof is not a simple continuity argument: We need to understand precisely the behavior of sequences of measures that degenerate towards a measure supported on an elementary subgroup. The proof will show that $\mu_\Sigma$ is supported by $K \cdot (\Sigma \cup \{e\}) \cdot K$, where $K$ is a finite subgroup generated by some elements in $\Sigma$.

A natural question is whether Theorem 1.3 holds for non-symmetric measures. For admissible measures (i.e., $\Gamma_\mu^+ = \Gamma$), Theorem 1.5 holds. For non-symmetric measures such that $\Gamma_\mu$ has infinite index, Theorem 1.6 applies directly. However, since $\Gamma_\mu \neq \Gamma_\mu^+$ for general non-symmetric measures, there is another case to consider: the case of measures $\mu$ such that $\Gamma_\mu = \Gamma$ (or $\Gamma_\mu$ has finite index in $\Gamma$), but $\Gamma_\mu^+$ is much smaller than $\Gamma$. In this case, it seems that our arguments do not suffice. We give in Section 6 two examples illustrating the new difficulties:
(1) One cannot rely on growth arguments, as for Theorem 1.6. Indeed, there are subsemigroups $\Lambda^+$ with bad asymptotic behavior, for instance such that $\liminf |B_n \cap \Lambda^+|/|B_n| = 0$ and $\limsup |B_n \cap \Lambda^+|/|B_n| > 0$.
(2) The arguments of Theorem 1.5 work for finitely supported measures, or for measures with a superexponential moment, but also more generally for measures with a nice geometric behavior (they should satisfy so-called Ancona inequalities). In the non-symmetric situation, we give in Proposition 6.2 explicit examples of (non-admissible) measures with an exponential moment and a very nice geometric behavior, and such that nevertheless $h = \ell v$. So, arguments similar to those of Theorem 1.5 cannot suffice: one needs a new argument that distinguishes in a finer way between measures with finite support and measures with infinite support.

This article is organized as follows. In Section 2, we give more details on the notions of hyperbolic group, drift and entropy.
We also prove Theorem 1.4 on the asymptotic entropy and drift of the uniform measure on large balls. The following three sections are then devoted to the proofs of the three auxiliary theorems. Finally, we describe in Section 6 what can happen in the non-symmetric setting. In particular, we show that in any torsion-free group with infinitely many ends, there exist (non-admissible, non-symmetric) measures with an exponential moment satisfying $h = \ell v$.

2. General properties of entropy and drift in hyperbolic groups

2.1. Hyperbolic spaces.
In this paragraph, we recall classical properties of hyperbolic spaces. See for instance [GdlH90] or [BH99].

Consider a metric space $(X, d)$. The Gromov product of two points $y, y' \in X$, based at $x_0 \in X$, is by definition

$$(y|y')_{x_0} = \tfrac{1}{2}\,[d(x_0, y) + d(x_0, y') - d(y, y')]. \tag{2.1}$$

The space $(X, d)$ is hyperbolic if there exists $\delta \geq 0$ such that, for any $x_0, y_1, y_2, y_3$, the following inequality holds:

$$(y_1|y_3)_{x_0} \geq \min((y_1|y_2)_{x_0}, (y_2|y_3)_{x_0}) - \delta.$$

The main intuition to have is that, in hyperbolic spaces, configurations of finitely many points look like configurations in trees: for any $k$, for any subset $F$ of $X$ with cardinality at most $2^k$, there exists a map $\Phi$ from $F$ to a tree such that, for all $x, y \in F$,

$$d(x, y) - 2k\delta \leq d(\Phi(x), \Phi(y)) \leq d(x, y).$$

Hence, a lot of distance computations can be reduced to equivalent computations in trees (which are essentially combinatorial), up to a bounded error. Up to $\delta$, the Gromov product $(y|y')_{x_0}$ is, in the approximating tree, the length of the part that is common to the geodesics from $x_0$ to $y$ and from $x_0$ to $y'$.

A space $(X, d)$ is geodesic if there exists a geodesic between any pair of points. For such spaces, there is a convenient characterization of hyperbolicity. A geodesic space $(X, d)$ is hyperbolic if and only if there exists $\delta \geq 0$ such that its geodesic triangles are $\delta$-thin, i.e., each side is included in the $\delta$-neighborhood of the union of the two other sides.

Assume that $(X, d_X)$ and $(Y, d_Y)$ are two geodesic metric spaces, and that they are quasi-isometric. If $(X, d_X)$ is hyperbolic, then so is $(Y, d_Y)$. Note however that this equivalence only holds for geodesic spaces.

Let $(X, d)$ be a geodesic hyperbolic metric space. A subset $Y$ of $X$ is quasi-convex if there exists a constant $C$ such that, for any $y, y' \in Y$, the geodesics from $y$ to $y'$ stay in the $C$-neighborhood of $Y$.

We will sometimes encounter hyperbolic spaces which are not geodesic, but only quasi-geodesic: there exist constants $C \geq 0$ and $\lambda$ such that any two points can be joined by a $(\lambda, C)$-quasi-geodesic, i.e., a map $f$ from a real interval to $X$ such that $\lambda^{-1}|t' - t| - C \leq d(f(t), f(t')) \leq \lambda|t' - t| + C$. When the space is geodesic, a quasi-geodesic stays a bounded distance away from a true geodesic. Most properties that hold or can be defined using geodesics (for instance the notion of quasi-convexity) can be extended to this setting, simply replacing geodesics with quasi-geodesics in the statements.

Let $(X, d)$ be a proper geodesic hyperbolic space. Its boundary at infinity $\partial X$ is by definition the set of geodesics originating from a base point $x_0$, where two such geodesics are identified if they remain a bounded distance away from each other. It is a compact space, which does not depend on $x_0$. The space $X \cup \partial X$ is also compact. If $X$ is only quasi-geodesic, all these definitions extend using quasi-geodesics instead of geodesics.

Any isometry (or, more generally, quasi-isometry) of a hyperbolic space extends continuously to its boundary, giving a homeomorphism of $\partial X$.

The Gromov product may be extended to $X \cup \partial X$: we define $(\xi|\eta)_{x_0}$ as the infimum limit of $(x_n|y_n)_{x_0}$ for $x_n$ and $y_n$ converging respectively to $\xi$ and $\eta$. The choice to take the infimum is arbitrary: one could also take the supremum or any accumulation point, as those quantities differ by at most a constant only depending on $\delta$. Hence, one should think of the Gromov product at infinity as being canonically defined only up to an additive constant. Heuristically, $(\xi|\eta)_{x_0}$ is the time after which two geodesics from $x_0$ to $\xi$ and to $\eta$ start diverging.

Let $(X, d)$ be a proper geodesic (or quasi-geodesic) hyperbolic space. For any small enough $\varepsilon > 0$, one may define a visual distance $d_{\partial X, \varepsilon}$ on $\partial X$ such that $d_{\partial X, \varepsilon}(\xi, \eta) \asymp e^{-\varepsilon(\xi|\eta)_{x_0}}$ (meaning that the ratio between these quantities is uniformly bounded from above and from below).

Let $(X, d)$ be a proper hyperbolic metric space. One can define another boundary of $X$, the Busemann boundary (or horoboundary), as follows. Let $x_0$ be a fixed basepoint in $X$. To $x \in X$, one associates its horofunction $h_x(y) = d(y, x) - d(x_0, x)$, normalized so that $h_x(x_0) = 0$. The map $\Phi : x \mapsto h_x$ is an embedding of $X$ into the space of 1-Lipschitz functions on $X$, with the topology of uniform convergence on compact sets. The horoboundary is obtained by taking the closure of $\Phi(X)$. In other words, a sequence $x_n \in X$ converges to a boundary point if $h_{x_n}(y)$ converges, uniformly on compact sets. Its limit is the horofunction $h_\xi$ associated to the corresponding boundary point $\xi$ (it is also called the Busemann function associated to $\xi$). We denote by $\partial_B X$ the Busemann boundary of $X$. There is a continuous projection $\pi_B : \partial_B X \to \partial X$, which is onto but not injective in general. The boundary $\partial_B X$ is rather sensitive to fine scale details of the distance $d$, while $\partial X$ only depends on its quasi-isometry class.

Any isometry $\varphi$ of $X$ acts on horofunctions, by the formula $h_{\varphi(x)}(y) = h_x(\varphi^{-1}y) - h_x(\varphi^{-1}x_0)$. This implies that $\varphi$ extends to a homeomorphism on $\partial_B X$, given by the same formula $h_{\varphi(\xi)}(y) = h_\xi(\varphi^{-1}y) - h_\xi(\varphi^{-1}x_0)$. Note that, contrary to the action on the geometric boundary, this only works for isometries of $X$, not quasi-isometries.

2.2. Hyperbolic groups.
Let $\Gamma$ be a finitely generated group, with a finite symmetric generating set $S$. Denote by $d = d_S$ the corresponding word distance. The group $\Gamma$ is hyperbolic if the metric space $(\Gamma, d_S)$ is hyperbolic. Since hyperbolicity is invariant under quasi-isometry for geodesic spaces, this notion does not depend on the choice of the generating set $S$. However, if one considers another left-invariant distance on $\Gamma$ which is equivalent to $d_S$ but not geodesic, its hyperbolicity is not automatic. Hence, one should postulate its hyperbolicity if it is needed, as in the statement of Theorem 1.2. We say that the pair $(\Gamma, d)$ is a metric hyperbolic group if the group $\Gamma$ is hyperbolic for one (or, equivalently, for any) word distance, and if the distance $d$ is left-invariant, hyperbolic, and quasi-isometric to one (or equivalently, any) word distance. Such a distance $d$ does not have to be geodesic, but it is quasi-geodesic since geodesics for a given word distance form a system of quasi-geodesics for $d$, going from any point to any point.

Let $(\Gamma, d)$ be a metric hyperbolic group. The left-multiplication by elements of $\Gamma$ is isometric. Hence, $\Gamma$ acts by homeomorphisms on its compactifications $\Gamma \cup \partial\Gamma$ and $\Gamma \cup \partial_B\Gamma$. Moreover, any infinite order element $g \in \Gamma$ acts hyperbolically on $\Gamma \cup \partial\Gamma$: it has two fixed points at infinity $g^-$ and $g^+$, the points in $(\Gamma \cup \partial\Gamma) \setminus \{g^-\}$ are attracted to $g^+$ by forward iteration of $g$, and the points in $(\Gamma \cup \partial\Gamma) \setminus \{g^+\}$ are attracted to $g^-$ by backward iteration of $g$.

Definition 2.1.
Consider an action of a group $\Gamma$ on a space $Z$. A function $c : \Gamma \times Z \to \mathbb{R}$ is a cocycle if, for any $g, h \in \Gamma$ and any $\xi \in Z$,

$$c(gh, \xi) = c(g, h\xi) + c(h, \xi). \tag{2.2}$$

The cocycle is Hölder-continuous if $Z$ is a metric space and each function $\xi \mapsto c(g, \xi)$ is Hölder-continuous.

There is a choice to be made in the definition of cocycles, since one may compose with $g$ or $g^{-1}$. Our definition is the most customary. With this definition, the map $c_B : \Gamma \times \partial_B\Gamma \to \mathbb{R}$ given by $c_B(g, \xi) = h_\xi(g^{-1})$ is a cocycle, called the Busemann cocycle.

A subgroup $H$ of $\Gamma$ is nonelementary if its action on $\partial\Gamma$ does not fix a finite set. Equivalently, $H$ is not virtually the trivial group or $\mathbb{Z}$. We say that a probability measure $\mu$ on $\Gamma$ is nonelementary if the subgroup $\Gamma_\mu$ generated by its support is itself nonelementary.

Let $\mu$ be a probability measure on $\Gamma$. Since $\Gamma$ acts by homeomorphisms on the compact space $\partial\Gamma$, it admits a stationary measure: there exists a probability measure $\nu$ on $\partial\Gamma$ such that $\mu * \nu = \nu$, i.e., $\sum_{g \in \Gamma} \mu(g)\, g_*\nu = \nu$. If $\mu$ is nonelementary, this measure is unique, and has no atom (see [Kai00]). It is also the exit measure of the corresponding random walk $X_n = g_1 \cdots g_n$: almost every trajectory $X_n(\omega)$ converges to a point $X_\infty(\omega) \in \partial\Gamma$, and moreover the distribution of $X_\infty$ is precisely $\nu$.

In the same way, since $\Gamma$ acts on $\partial_B\Gamma$, it admits a stationary measure $\nu_B$ there. This measure is not unique in general, even if $\mu$ is nonelementary. However, all such measures project under $\pi_B$ to the unique stationary measure on $\partial\Gamma$.

2.3. The drift.
Let $(\Gamma, d)$ be a metric hyperbolic group. Consider a probability measure $\mu$ on $\Gamma$, with finite first moment $L(\mu)$ (defined in (1.1)). The drift of the random walk has been defined in (1.2) as $\ell(\mu) = \lim L(\mu^{*n})/n$. Let $X_n = g_1 \cdots g_n$ be the position at time $n$ of the random walk generated by $\mu$ (where the $g_i$ are independent and distributed according to $\mu$). Then, almost surely, $\ell(\mu) = \lim |X_n|/n$.

The drift also admits a description in terms of the Busemann boundary. The following result is well-known (compare with [KL11, Theorem 18]).

Proposition 2.2.
Let $(\Gamma, d)$ be a metric hyperbolic group. Let $\mu$ be a nonelementary probability measure on $\Gamma$ with finite first moment. Let $\nu_B$ be a $\mu$-stationary measure on $\partial_B\Gamma$. Then

$$\ell(\mu) = \int_{\Gamma \times \partial_B\Gamma} c_B(g, \xi)\, d\mu(g)\, d\nu_B(\xi). \tag{2.3}$$
Proof.
Let $X_n$ be the position of the random walk at time $n$. Using the cocycle property of the Busemann cocycle, we have

$$\int c_B(X_n(\omega), \xi)\, d\mathbb{P}(\omega)\, d\nu_B(\xi) = \int c_B(g_1 \cdots g_n, \xi)\, d\mu(g_1) \cdots d\mu(g_n)\, d\nu_B(\xi) = \sum_{k=1}^{n} \int c_B(g_k, g_{k+1} \cdots g_n \xi)\, d\mu(g_k) \cdots d\mu(g_n)\, d\nu_B(\xi).$$

Since the measure $\nu_B$ is stationary, the point $g_{k+1} \cdots g_n \xi$ is distributed according to $\nu_B$. Hence, the terms in the above sum do not depend on $k$. We get

$$\int_{\Gamma \times \partial_B\Gamma} c_B(g, \xi)\, d\mu(g)\, d\nu_B(\xi) = \frac{1}{n} \int c_B(X_n(\omega), \xi)\, d\mathbb{P}(\omega)\, d\nu_B(\xi). \tag{2.4}$$

We have $|c_B(X_n, \xi)|/n \leq |X_n|/n$, which converges in $L^1$ and almost surely to $\ell$. Hence, the sequence of functions $c_B(X_n(\omega), \xi)/n$ is uniformly integrable on $\Omega \times \partial_B\Gamma$. Moreover, $X_n$ converges almost surely to a point on the boundary $\partial\Gamma$, distributed according to the exit measure, which has no atom. It follows that, for all $\xi$, the trajectory $X_n(\omega)$ converges almost surely to a point different from $\pi_B(\xi)$. This implies that, almost surely, one has $c_B(X_n, \xi) = |X_n| + O(1)$, giving in particular $c_B(X_n, \xi)/n \to \ell$. The result follows by taking the limit in $n$ in the equality (2.4). □

This formula easily implies that the drift depends continuously on the measure, as explained in [EK13].
Proposition 2.3.
Let $(\Gamma, d)$ be a metric hyperbolic group. Consider a sequence of probability measures $\mu_i$ with finite first moment, converging simply to a nonelementary probability measure $\mu$ (i.e., $\mu_i(g) \to \mu(g)$ for all $g \in \Gamma$). Assume moreover that $L(\mu_i) \to L(\mu)$. Then $\ell(\mu_i) \to \ell(\mu)$.

Proof. Let $\nu_i$ be stationary measures for $\mu_i$ on $\partial_B\Gamma$. Taking a subsequence if necessary, we may assume that $\nu_i$ converges to a limiting measure $\nu$. By continuity of the action on the boundary, it is stationary for $\mu$.

For each $g \in \Gamma$, the quantity $\int_{\partial_B\Gamma} c_B(g, \xi)\, d\nu_i(\xi)$ converges to $\int_{\partial_B\Gamma} c_B(g, \xi)\, d\nu(\xi)$ since $\xi \mapsto c_B(g, \xi)$ is continuous. Averaging over $g$ (and using the assumption $L(\mu_i) \to L(\mu)$ to get a uniform domination), we deduce that

$$\sum_{g \in \Gamma} \mu_i(g) \int_{\partial_B\Gamma} c_B(g, \xi)\, d\nu_i(\xi) \to \sum_{g \in \Gamma} \mu(g) \int_{\partial_B\Gamma} c_B(g, \xi)\, d\nu(\xi).$$

Together with the formula (2.3) for the drift, this completes the proof. □
In this proposition, it is important that $\mu$ is nonelementary: the result is wrong otherwise. For instance, in the infinite dihedral group $\mathbb{Z} \rtimes \mathbb{Z}/2$, the measures $\mu_i = (1 - 2^{-i})\delta_{(1,0)} + 2^{-i}\delta_{(0,1)}$ have zero drift since the $\mathbb{Z}/2$ element symmetrizes everything in $\mathbb{Z}$, while the limiting measure $\mu = \delta_{(1,0)}$ has drift 1. The reason is the non-uniqueness of the stationary measure for $\mu$ on the boundary.
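The dihedral counterexample can be explored numerically. The sketch below is illustrative only and makes several assumptions that are not in the text: elements are encoded as pairs (a, s) with product (a, s)(b, t) = (a + (−1)^s b, s + t mod 2), the word length is taken with respect to the generating set {(1,0), (−1,0), (0,1)}, for which |(a, s)| = |a| + s, and the function name is hypothetical. It computes E|X_n| exactly by dynamic programming for the measure (1 − p)δ_(1,0) + pδ_(0,1).

```python
def mean_word_length(flip_prob, n):
    """Exact E|X_n| for mu = (1 - flip_prob) * delta_(1,0) + flip_prob * delta_(0,1)
    on the infinite dihedral group, with assumed word length |(a, s)| = |a| + s."""
    size = 2 * n + 1                       # positions a in [-n, n], stored at index a + n
    dist = [[0.0] * size, [0.0] * size]    # dist[s][a + n] = P(X = (a, s))
    dist[0][n] = 1.0
    for _ in range(n):
        new = [[0.0] * size, [0.0] * size]
        for s in (0, 1):
            step = 1 if s == 0 else -1     # (a, s) * (1, 0) = (a + (-1)^s, s)
            for idx, p in enumerate(dist[s]):
                if p == 0.0:
                    continue
                new[s][idx + step] += (1 - flip_prob) * p
                new[1 - s][idx] += flip_prob * p   # (a, s) * (0, 1) = (a, s + 1 mod 2)
        dist = new
    return sum(p * (abs(idx - n) + s)
               for s in (0, 1) for idx, p in enumerate(dist[s]))

if __name__ == "__main__":
    for n in (100, 400):
        print(n, mean_word_length(0.25, n) / n)   # mu_i with i = 2
    print(mean_word_length(0.0, 100) / 100)       # limiting measure delta_(1,0)
```

For a fixed positive flip probability the ratio E|X_n|/n decreases as n grows, while for the limiting measure it stays equal to 1, which is the discontinuity described above.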
2.4. The entropy.
Let Γ be a countable group. Consider a probability measure µ on Γ ,with finite time one entropy H ( µ ) (defined in (1.1)). The entropy of the random walk hasbeen defined in (1.2) as h ( µ ) = lim H ( µ ∗ n ) /n . Let X n = g · · · g n be the position at time n of the random walk generated by µ (where the g i are independent and distributed accordingto µ ). Then, almost surely, h ( µ ) = lim( − log µ ∗ n ( X n )) /n . The fundamental inequality (1.3)shows that if h > then ℓ > .The entropy has several equivalent characterizations. The first one is in terms of the sizeof the typical support of the random walk: This support has size roughly e hn . The followinglemma follows from [Haï13, Proposition 1.13]. Lemma 2.4.
Consider a probability measure $\mu$ with $H(\mu) < \infty$ on a countable group. Let $h = h(\mu)$ be its asymptotic entropy. Let $\eta > 0$ and $\varepsilon > 0$.
(1) For large enough $n$, there exists a subset $K_n$ of $\Gamma$ with $\mu^{*n}(K_n) \geq 1 - \eta$ and $|K_n| \leq e^{(h+\varepsilon)n}$.
(2) For large enough $n$, there exists no subset $K_n$ of $\Gamma$ with $\mu^{*n}(K_n) \geq \eta$ and $|K_n| \leq e^{(h-\varepsilon)n}$.

Another description is in terms of the Poisson boundary of the walk. To avoid general definitions, let us only state this description for measures on hyperbolic groups. The following proposition is a consequence of [Kai00].
Proposition 2.5.
Let $\Gamma$ be a hyperbolic group. Let $\mu$ be a nonelementary probability measure on $\Gamma$ with $H(\mu) < \infty$. Let $\nu$ be its unique stationary measure on $\partial\Gamma$. Define the Martin cocycle on $\Gamma \times \partial\Gamma$ by $c_M(g, \xi) = -\log(dg^{-1}_*\nu / d\nu)(\xi)$. Then
\[
h(\mu) \geq \int_{\Gamma \times \partial\Gamma} c_M(g, \xi) \, d\mu(g) \, d\nu(\xi), \tag{2.5}
\]
with equality if $\mu$ has a logarithmic moment.

When $\mu$ has a logarithmic moment, this proposition has a very similar flavor to Proposition 2.2 expressing the drift of a random walk. Indeed, for symmetric measures, [BHM11] interprets Proposition 2.5 as a special case of Proposition 2.2, for a distance $d = d_\mu$ related to the random walk, the Green distance, which we defined in Theorem 1.2. This distance is hyperbolic if $\mu$ is admissible and has a superexponential moment, by [Anc87, Gou13]. It is not geodesic in general, but this is not an issue since we were careful enough to state Proposition 2.2 without this assumption. The Busemann cocycle for the Green distance is precisely the Martin cocycle.

An important difference between the formulas (2.3) for the drift and (2.5) for the entropy is that, in the latter situation, the cocycle $c_M$ depends on the measure $\nu$ (and, therefore, on $\mu$). This makes it more complicated to prove continuity statements such as Proposition 2.3 for the entropy. Nevertheless, Erschler and Kaimanovich proved in [EK13] that, in hyperbolic groups, the entropy also depends continuously on the measure. As $h(\mu) = \inf H(\mu^{*n})/n$ by subadditivity, it is easy to prove that when $\mu_i \to \mu$ one has $\limsup h(\mu_i) \leq h(\mu)$. The main difficulty to prove the continuity is to get lower bounds. We will need a slightly stronger (and more pedestrian) version of the results of [EK13] to prove Theorem 1.4.
Although our argument may seem very different at first sight from the arguments in [EK13], the techniques are in fact closely related (an illustration is that we can recover with our techniques the result of Kaimanovich that, for measures with finite logarithmic moment, equality holds in (2.5), i.e., the Poisson boundary coincides with the geometric boundary, see Remark 2.11). Our main criterion to get lower bounds on the entropy is the following. We write $S_k = \{ g \in \Gamma : |g| \in (k-1, k] \}$ for the thickened sphere, so that the union of these spheres covers the whole group.

Theorem 2.6.
Let $(\Gamma, d)$ be a metric hyperbolic group. Let $\mu_i$ be a sequence of nonelementary probability measures on $\Gamma$ with $H(\mu_i) < \infty$. Let $\nu_i$ be the unique stationary measure for $\mu_i$ on $\partial\Gamma$. Assume that:
(1) The limit points of $\nu_i$ have no atom.
(2) The sequence
\[
h_i = \sum_k \sum_{g \in S_k} \mu_i(g) \bigl( -\log(\mu_i(g)/\mu_i(S_k)) \bigr) \tag{2.6}
\]
tends to infinity.
Then $\liminf h(\mu_i)/h_i \geq 1$.

The quantity $h_i$ can be written
\[
h_i = \sum_{g \in \Gamma} \mu_i(g)(-\log \mu_i(g)) - \sum_k \mu_i(S_k)(-\log \mu_i(S_k)).
\]
The first term is the time one entropy $H(\mu_i)$ of the measure $\mu_i$. In most reasonable cases, the second term is negligible. The theorem then states that the asymptotic entropy $h(\mu_i)$ is comparable to the time one entropy $H(\mu_i)$. In other words, if the measure is supported close to infinity, and sufficiently spread out in the group (this is the meaning of the assumption that the limit points of $\nu_i$ have no atom), then there are few coincidences and the entropy does not decrease significantly with time.

To prove this theorem, we will use the following technical lemma.

Lemma 2.7.
On a probability space $(X, \mu)$, consider a nonnegative function $f$ with average $1$. For any subset $A$ of $X$,
\[
\int_X (-\log f) \, d\mu \geq \mu(A) \Bigl( -\log \int_A f \, d\mu \Bigr) - 2e^{-1}.
\]

Proof.
As the function $x \mapsto -\log x$ is convex, Jensen's inequality gives $\int (-\log f) \geq -\log(\int f)$. The last quantity vanishes when $\int f = 1$.

Let $B \subset X$. Write $a = \int_B f \, d\mu / \mu(B)$. The measure $d\mu/\mu(B)$ is a probability measure on $B$, and the function $f/a$ has integral $1$ for this measure. The previous inequality gives $\int_B (-\log(f/a)) \, d\mu/\mu(B) \geq 0$, that is,
\[
\int_B (-\log f) \, d\mu \geq -\mu(B) \log a = -\mu(B) \log \Bigl( \int_B f \Bigr) + \mu(B) \log \mu(B).
\]
The quantity $\mu(B) \log \mu(B)$ is bounded from below by $\inf_{[0,1]} x \log x = -e^{-1}$. Therefore,
\[
\int_B (-\log f) \, d\mu \geq -\mu(B) \log \Bigl( \int_B f \Bigr) - e^{-1}.
\]
We apply this inequality to the complement $A^c$ of $A$. As $-\log \bigl( \int_{A^c} f \bigr) \geq 0$, we get a lower bound $-e^{-1}$. Let us also apply this inequality to $A$, and add the results. We obtain
\[
\int_X (-\log f) \, d\mu \geq -\mu(A) \log \Bigl( \int_A f \Bigr) - 2e^{-1}. \qquad \square
\]

We will use the notion of shadow, due to Sullivan and considered in this context by Coornaert [Coo93]. Let
$C > 0$ be large enough. The shadow $\mathcal{O}(g, C)$ of $g \in \Gamma$ is $\{ \xi \in \partial\Gamma : (g|\xi)_e \geq |g| - C \}$. In geometric terms (and assuming the space is geodesic), this is essentially the trace at infinity of geodesics originating from $e$ and going through the ball $B(g, C)$. We will use the following properties of shadows [Coo93]:
(1) Their covering number is finite. More precisely, there exists $D > 0$ (depending on $C$) such that, for any integer $k$, for any $\xi \in \partial\Gamma$,
\[
|\{ g \in S_k : \xi \in \mathcal{O}(g, C) \}| \leq D.
\]
(2) The preimages of shadows are large. More precisely, for any $\eta > 0$, there exists $C > 0$ such that, for all $g \in \Gamma$, the complement of $g^{-1} \mathcal{O}(g, C)$ has diameter at most $\eta$ (for a fixed visual distance on the boundary).

Proof of Theorem 2.6.
Fix $\varepsilon > 0$. As the limit points of $\nu_i$ have no atom, there exists $\eta > 0$ such that any ball of radius $\eta$ in $\partial\Gamma$ has measure at most $\varepsilon$ for $\nu_i$, for $i$ large enough. We can then choose a shadow size $C$ so that $g^{-1} \mathcal{O}(g, C)$ has for all $g$ a complement with diameter at most $\eta$. This yields $\nu_i(g^{-1} \mathcal{O}(g, C)) \geq 1 - \varepsilon$.

By (2.5), the entropy of $\mu_i$ satisfies
\[
h(\mu_i) \geq \sum_{g \in \Gamma} \mu_i(g) \int_{\partial\Gamma} \Bigl( -\log \frac{dg^{-1}_* \nu_i}{d\nu_i}(\xi) \Bigr) d\nu_i(\xi).
\]
The function $f_{i,g} = \frac{dg^{-1}_* \nu_i}{d\nu_i}$ is nonnegative and has integral $1$. For any $A \subset \partial\Gamma$, Lemma 2.7 gives
\[
\begin{aligned}
\int_{\partial\Gamma} \Bigl( -\log \frac{dg^{-1}_* \nu_i}{d\nu_i}(\xi) \Bigr) d\nu_i(\xi)
&\geq -\nu_i(A) \log \Bigl( \int_A \frac{dg^{-1}_* \nu_i}{d\nu_i}(\xi) \, d\nu_i(\xi) \Bigr) - 2e^{-1} \\
&= -\nu_i(A) \log(g^{-1}_* \nu_i(A)) - 2e^{-1} = -\nu_i(A) \log(\nu_i(gA)) - 2e^{-1}.
\end{aligned}
\]
Let us take $A = g^{-1} \mathcal{O}(g, C)$, so that $\nu_i(A) \geq 1 - \varepsilon$. Summing over $g$, we get
\[
h(\mu_i) \geq (1 - \varepsilon) \sum_{g \in \Gamma} \mu_i(g) (-\log \nu_i(\mathcal{O}(g, C))) - 2e^{-1}. \tag{2.7}
\]
We split the sum according to the spheres $S_k$. Let $\Sigma_k = \sum_{g \in S_k} \nu_i(\mathcal{O}(g, C))$; it is at most $D$ since the shadows have a covering number bounded by $D$. We have
\[
\sum_{g \in S_k} \mu_i(g)(-\log \nu_i(\mathcal{O}(g, C)))
= -\mu_i(S_k) \sum_{g \in S_k} \frac{\mu_i(g)}{\mu_i(S_k)} \Bigl[ \log \Bigl( \frac{\nu_i(\mathcal{O}(g, C))}{\Sigma_k \, \mu_i(g)/\mu_i(S_k)} \Bigr) + \log \Sigma_k + \log(\mu_i(g)/\mu_i(S_k)) \Bigr].
\]
The point of this decomposition is that the function on $S_k$ given by $\varphi : g \mapsto \frac{\nu_i(\mathcal{O}(g, C))}{\Sigma_k \, \mu_i(g)/\mu_i(S_k)}$ has integral $1$ for the probability measure $\mu_i(g)/\mu_i(S_k)$. By Jensen's inequality, the integral of $-\log \varphi$ is nonnegative. This yields
\[
\sum_{g \in S_k} \mu_i(g)(-\log \nu_i(\mathcal{O}(g, C))) \geq -\mu_i(S_k) \log D + \sum_{g \in S_k} \mu_i(g) \bigl( -\log(\mu_i(g)/\mu_i(S_k)) \bigr).
\]
Summing over $k$, we deduce from (2.7) the inequality
\[
h(\mu_i) \geq (1 - \varepsilon) h_i - 2e^{-1} - \log D.
\]
As $h_i$ tends to infinity, this gives $h(\mu_i) \geq (1 - 2\varepsilon) h_i$ for large enough $i$, completing the proof. $\square$

To apply the previous theorem, we need to estimate $h_i$. In this respect, the following lemma is often useful.

Lemma 2.8.
Let $R_i > 0$. The quantity $h_i$ defined in (2.6) satisfies
\[
h_i \geq \sum_{|g| \leq R_i} \mu_i(g)(-\log \mu_i(g)) - \log(2 + R_i).
\]

Proof.
In the definition of $h_i$, all the terms are nonnegative. Restricting the sum to those $g$ with $|g| \leq R_i$, we get
\[
h_i \geq \sum_{k \leq R_i} \sum_{g \in S_k} \mu_i(g) \bigl( -\log(\mu_i(g)/\mu_i(S_k)) \bigr)
= \sum_{|g| \leq R_i} \mu_i(g)(-\log \mu_i(g)) - \sum_{k \leq R_i} \mu_i(S_k)(-\log \mu_i(S_k)).
\]
A probability measure supported on a set with $N$ elements has entropy at most $\log N$. The numbers $\mu_i(S_k)$ for $k \leq R_i$ do not form a probability measure in general, so let us add a last atom with mass $m = \mu_i \bigl( \bigcup_{k > R_i} S_k \bigr)$. We are considering a space of cardinality at most $R_i + 2$, hence
\[
m(-\log m) + \sum_{k \leq R_i} \mu_i(S_k)(-\log \mu_i(S_k)) \leq \log(2 + R_i),
\]
completing the proof. $\square$

Let us see how Theorem 2.6 implies a slightly stronger version of the continuity result for the entropy of Erschler and Kaimanovich [EK13].
Theorem 2.9.
Let $\Gamma$ be a hyperbolic group. Consider a probability measure $\mu$ with finite time one entropy and finite logarithmic moment. Let $\mu_i$ be a sequence of probability measures converging simply to $\mu$ with $H(\mu_i) \to H(\mu)$. Then $h(\mu_i) \to h(\mu)$.

The assumption $H(\mu_i) \to H(\mu)$ ensures that there is no additional entropy in $\mu_i$ coming from neighborhoods of infinity that would disappear in the limit. It is automatic if the support of $\mu_i$ is uniformly bounded or if $\mu_i$ satisfies a uniform $L^1$ domination, but it is much weaker. For instance, it is allowed that the $\mu_i$ have no finite logarithmic moment.

The main lemma for the proof is a lower bound on the entropy, following from Theorem 2.6.

Lemma 2.10.
Let $\Gamma$ be a hyperbolic group. Consider a probability measure $\mu$ with finite time one entropy and finite logarithmic moment. Let $\mu_i$ be a sequence of measures converging simply to $\mu$. Then $\liminf h(\mu_i) \geq h(\mu)$.

Proof. Since the result is trivial if $h(\mu) = 0$, we can assume that $h(\mu) > 0$.

Let $\varepsilon > 0$. For large $n$, most atoms for $\mu^{*n}$ have a probability at most $e^{-(1-\varepsilon) n h(\mu)}$. Moreover, since $\mu$ has a finite logarithmic moment, $\log |X_n| / n$ tends almost surely to $0$ by [Aar97, Proposition 2.3.1]. Therefore, the set
\[
K_n = \{ g : \mu^{*n}(g) \leq e^{-(1-\varepsilon) n h(\mu)}, \ |g| \leq e^{\varepsilon n} \}
\]
has measure tending to $1$. In particular $\mu^{*n}(K_n) \geq 1 - \varepsilon$ for large $n$. We get
\[
\begin{aligned}
\sum_{|g| \leq e^{\varepsilon n}} \mu^{*n}(g)(-\log \mu^{*n}(g))
&\geq \sum_{g \in K_n} \mu^{*n}(g)(-\log \mu^{*n}(g))
\geq \sum_{g \in K_n} \mu^{*n}(g)(1 - \varepsilon) n h(\mu) \\
&= \mu^{*n}(K_n)(1 - \varepsilon) n h(\mu) \geq (1 - \varepsilon)^2 n h(\mu).
\end{aligned}
\]
For each fixed $n$, the measures $\mu_i^{*n}$ converge to $\mu^{*n}$ when $i$ tends to infinity. Hence, we get for large enough $i$ the inequality
\[
\sum_{|g| \leq e^{\varepsilon n}} \mu_i^{*n}(g)(-\log \mu_i^{*n}(g)) \geq (1 - \varepsilon)^3 n h(\mu).
\]
Letting $\varepsilon$ tend to $0$ (and, therefore, $n$ to infinity), we deduce the existence of sequences $n_i \to \infty$ and $\varepsilon_i \to 0$ such that, for any $i$,
\[
\sum_{|g| \leq e^{\varepsilon_i n_i}} \mu_i^{*n_i}(g)(-\log \mu_i^{*n_i}(g)) \geq (1 - \varepsilon_i) n_i h(\mu).
\]
Let $\tilde\mu_i = \mu_i^{*n_i}$. Its stationary measure $\nu_i$ is also the stationary measure of $\mu_i$, by uniqueness. Any limit point of $\nu_i$ is stationary for $\mu$, and is therefore atomless since $\mu$ is nonelementary as $h(\mu) > 0$. The assumptions of Theorem 2.6 are satisfied by the sequence $\tilde\mu_i$. Moreover, Lemma 2.8 applied with $R_i = e^{\varepsilon_i n_i}$ yields
\[
h_i \geq (1 - \varepsilon_i) n_i h(\mu) - \log(2 + e^{\varepsilon_i n_i}) \geq (1 - C \varepsilon_i) n_i h(\mu),
\]
where the last inequality holds since we may assume that $\varepsilon_i n_i$ is bounded away from $0$. Theorem 2.6 ensures that $\liminf h(\tilde\mu_i)/h_i \geq 1$. As $h(\tilde\mu_i) = n_i h(\mu_i)$, this gives $\liminf h(\mu_i) \geq h(\mu)$ as desired. $\square$
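The quantities appearing in Theorem 2.6 and in the proof above can be computed exactly in a simple case. The sketch below assumes an example that is not treated in the text: the simple random walk on the free group $F_2$ with its standard generators, for which $v = \log 3$, $\ell = 1/2$ and $h = \ell v$ are classical. Since $\mu^{*n}$ is uniform on each sphere, it is determined by the law of $|X_n|$, which is a birth and death chain. The function names `norm_distribution`, `sphere_size` and `entropies` are hypothetical. The code returns $H(\mu^{*n})$ and the quantity of (2.6) for the measure $\mu^{*n}$.

```python
import math


def norm_distribution(n):
    """Law of |X_n| for the simple random walk on F_2 (assumed example)."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for k, p in dist.items():
            if k == 0:
                # From the identity, every generator increases the length.
                new[1] = new.get(1, 0.0) + p
            else:
                # Three generators lengthen the reduced word, one shortens it.
                new[k + 1] = new.get(k + 1, 0.0) + 0.75 * p
                new[k - 1] = new.get(k - 1, 0.0) + 0.25 * p
        dist = new
    return dist


def sphere_size(k):
    """Number of reduced words of length k in F_2."""
    return 1 if k == 0 else 4 * 3 ** (k - 1)


def entropies(n):
    """Return (H(mu^{*n}), h_n), where h_n is (2.6) computed for mu^{*n}."""
    dist = norm_distribution(n)
    # Entropy of the sequence (mu^{*n}(S_k))_k, the second term in h_n.
    radial = sum(-p * math.log(p) for p in dist.values() if p > 0)
    # mu^{*n} is uniform on each sphere, so -log(mu(g)/mu(S_k)) = log |S_k|.
    h_n = sum(p * math.log(sphere_size(k)) for k, p in dist.items())
    return radial + h_n, h_n


if __name__ == "__main__":
    h = 0.5 * math.log(3)  # classical value of the entropy for this walk
    for n in (1, 10, 100):
        H_n, h_n = entropies(n)
        print(n, H_n / n, h_n / n, h)
```

In this example $H(\mu^{*n})/n$ stays above $h$, in agreement with $h(\mu) = \inf H(\mu^{*n})/n$, and the radial term $H(\mu^{*n}) - h_n$ is at most logarithmic in $n$, so that $h(\mu^{*n}) = n h(\mu)$ and $h_n$ are comparable for large $n$.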
Proof of Theorem 2.9.
For fixed $n$, the sequence $\mu_i^{*n}$ converges simply to $\mu^{*n}$. Moreover, $H(\mu_i^{*n}) \to H(\mu^{*n})$ since there is no loss of entropy at infinity by assumption. Choose $n$ such that $H(\mu^{*n}) \leq n(1 + \varepsilon) h(\mu)$. We get $H(\mu_i^{*n})/n \leq (1 + 2\varepsilon) h(\mu)$ for large enough $i$. As $h(\mu_i) \leq H(\mu_i^{*n})/n$, this shows that $\limsup h(\mu_i) \leq h(\mu)$ (this is the classical semi-continuity property of entropy, valid in any group).

For the reverse inequality $\liminf h(\mu_i) \geq h(\mu)$, we apply Lemma 2.10. $\square$

Remark 2.11.
Let $h(\mu, \partial\Gamma) = \int_{\Gamma \times \partial\Gamma} (-\log dg^{-1}_*\nu / d\nu)(\xi) \, d\mu(g) \, d\nu(\xi)$ where $\nu$ is the stationary measure for $\mu$ on $\partial\Gamma$. In general, $h(\mu) \geq h(\mu, \partial\Gamma)$ with equality if and only if $(\partial\Gamma, \nu)$ is the Poisson boundary of $(\Gamma, \mu)$. A theorem of Kaimanovich [Kai00] asserts that, when $\mu$ has finite entropy and finite logarithmic moment, $h(\mu, \partial\Gamma) = h(\mu)$. We can recover this theorem using the previous arguments. Indeed, what the proof of Theorem 2.6 really shows is that $\liminf h(\mu_i, \partial\Gamma)/h_i \geq 1$. Hence, Lemma 2.10 proves that $\liminf h(\mu_i, \partial\Gamma) \geq h(\mu)$ if $\mu_i$ converges simply to a measure $\mu$ with a logarithmic moment. Taking $\mu_i = \mu$ for all $i$, we obtain in particular $h(\mu, \partial\Gamma) \geq h(\mu)$, as desired.

2.5. A criterion to bound the entropy from below.
In order to prove Theorem 1.4 on the entropy of the uniform measure on balls, we want to apply Theorem 2.6. Thus, we need a criterion to check that limit points of stationary measures have no atom.
Lemma 2.12.
Let $\Gamma$ be a hyperbolic group. Let $\mu_i$ be a sequence of probability measures on $\Gamma$. Assume that, on the space $\Gamma \cup \partial\Gamma$, the sequence $\mu_i$ converges to a limit $\nu$ which is supported on $\partial\Gamma$. Assume moreover that the limit points of $\check\mu_i$ (defined by $\check\mu_i(g) = \mu_i(g^{-1})$) have no atom. Then the stationary measures $\nu_i$ associated to $\mu_i$ also converge to $\nu$.

Proof. We fix a word distance $d$ on $\Gamma$. Let $f$ be a continuous function on $\Gamma \cup \partial\Gamma$. Let us show that, uniformly in $\xi \in \partial\Gamma$, the integral $\int f(g\xi) \, d\mu_i(g)$ is close to $\int f(g) \, d\mu_i(g)$. We estimate the difference as
\[
\Bigl| \int (f(g\xi) - f(g)) \, d\mu_i(g) \Bigr|
\leq \int |f(g\xi) - f(g)| \, 1\bigl( (g\xi|g)_e \geq C \bigr) \, d\mu_i(g)
+ 2 \|f\|_\infty \int 1\bigl( (g\xi|g)_e < C \bigr) \, d\mu_i(g),
\]
where $C$ is a fixed constant. If $C$ is large enough, $|f(x) - f(y)| \leq \varepsilon$ when $(x|y)_e \geq C$, by uniform continuity of $f$. Hence, the first integral is bounded by $\varepsilon$. For the second integral, we use the formula $(gx|g)_e = |g| - (x|g^{-1})_e$, valid for any $x \in \Gamma$ (it follows readily from the definition (2.1) of the Gromov product). This equality does not extend to the boundary since the Gromov product there is only well defined up to an additive constant $D$. Nevertheless, we get $(g\xi|g)_e \geq |g| - (\xi|g^{-1})_e - D$. Hence, the second integral is bounded by
\[
\mu_i \{ g : |g| - C - D \leq (\xi|g^{-1})_e \}. \tag{2.8}
\]
If $|g|$ is large, the points $g$ with $(\xi|g^{-1})_e \geq |g| - C - D$ are such that $g^{-1}$ belongs to a small neighborhood of $\xi$ in $\Gamma \cup \partial\Gamma$. As the limit points of $\check\mu_i$ are supported on $\partial\Gamma$ and have no atom, it follows that (2.8) converges to $0$ when $i$ tends to infinity, uniformly in $\xi$.

We have proved that
\[
\sup_{\xi \in \partial\Gamma} \Bigl| \int f(g\xi) \, d\mu_i(g) - \int f(g) \, d\mu_i(g) \Bigr| \to 0.
\]
By stationarity,
\[
\int_{\xi \in \partial\Gamma} f(\xi) \, d\nu_i(\xi) = \int_{\xi \in \partial\Gamma} \Bigl( \int f(g\xi) \, d\mu_i(g) \Bigr) d\nu_i(\xi).
\]
Combining these equations, we get $\int f(\xi) \, d\nu_i(\xi) - \int f(g) \, d\mu_i(g) \to 0$. This shows that the limit points of $\nu_i$ and $\mu_i$ are the same. $\square$

Let us now consider the uniform measure $\mu_i$ on the ball of radius $i$, as in Theorem 1.4. The next lemma follows from the techniques of [Coo93].

Lemma 2.13.
Let $(\Gamma, d)$ be a metric hyperbolic group. Let $\rho_i$ be the uniform measure on the ball of radius $i$. Let $\rho_\infty$ be the Patterson-Sullivan measure of $(\Gamma, d)$ constructed in [Coo93] (it is supported on $\partial\Gamma$ and atomless). Then the limit points of $\rho_i$ are equivalent to $\rho_\infty$, with a density bounded from above and from below.

Proof. Let $C$ be large enough. We will use the shadows $\mathcal{O}(g, C)$ as defined before the proof of Theorem 2.6. The main property of $\rho_\infty$ is that it satisfies
\[
K_0^{-1} e^{-v|g|} \leq \rho_\infty(\mathcal{O}(g, C)) \leq K_0 e^{-v|g|}, \tag{2.9}
\]
where $K_0$ is a constant only depending on $C$ and $v$ is the growth of $(\Gamma, d)$ (Proposition 6.1 in [Coo93]).

Let $\mu_i$ be the uniform measure on thickened spheres $S_i = \{ g : i \leq |g| \leq i + L \}$, where $L$ is large enough so that the cardinality of $S_i$ grows like $e^{iv}$, see the proof of Theorem 7.2 in [Coo93]. Let us push $\mu_i$ to a measure $\tilde\mu_i$ on $\partial\Gamma$, by choosing for each $g \in S_i$ a corresponding point in its shadow. It is clear that $\mu_i$ and $\tilde\mu_i$ have the same limit points, since the diameter of the shadows tends uniformly to $0$ when $i \to \infty$. We will prove that the limit points of $\tilde\mu_i$ are equivalent to $\rho_\infty$. The same result follows for $\mu_i$ and then $\rho_i$.

The shadows of $g \in S_i$ have a covering number which is bounded from above by a constant $D$, and from below by $1$ if $C$ is large enough. Hence, the measures $\tilde\mu_i$ satisfy
\[
K_1^{-1} e^{-iv} \leq \tilde\mu_i(\mathcal{O}(g, C)) \leq K_1 e^{-iv},
\]
for any $g \in S_i$. This is comparable to $\rho_\infty(\mathcal{O}(g, C))$ by (2.9), up to a multiplicative constant $K_2$. Consider a limit $\tilde\mu$ of a sequence $\tilde\mu_{i_n}$, and let us prove that it is uniformly equivalent to $\rho_\infty$. We will only prove that $\tilde\mu \leq D K_2 \rho_\infty$; the other inequality is proved in the same way. By regularity of the measures, it suffices to check this inequality on compact sets.

Let $A$ be a compact subset of $\partial\Gamma$, and $\varepsilon > 0$. By regularity of the measure $\rho_\infty$, there is an open neighborhood $U$ of $A$ with $\rho_\infty(U) \leq \rho_\infty(A) + \varepsilon$. Consider a compact neighborhood $B$ of $A$, included in $U$, with $\tilde\mu(\partial B) = 0$ (such a set exists, since among the sets $B_r = \{ \xi : d(\xi, A) \leq r \}$, at most countably many have a boundary with nonzero measure). For large enough $i$, the shadows $\mathcal{O}(g, C)$ with $g \in S_i$ which intersect $B$ are contained in $U$. Therefore,
\[
\tilde\mu_i(B) \leq \sum_{g \in S_i, \ \mathcal{O}(g,C) \cap B \neq \emptyset} \tilde\mu_i(\mathcal{O}(g, C))
\leq K_2 \sum_{g \in S_i, \ \mathcal{O}(g,C) \cap B \neq \emptyset} \rho_\infty(\mathcal{O}(g, C))
\leq D K_2 \rho_\infty(U).
\]
As $\tilde\mu(\partial B) = 0$, the sequence $\tilde\mu_{i_n}(B)$ tends to $\tilde\mu(B)$. We obtain $\tilde\mu(B) \leq D K_2 \rho_\infty(U)$. As $A$ is included in $B$, we get $\tilde\mu(A) \leq D K_2 (\rho_\infty(A) + \varepsilon)$. Letting $\varepsilon$ tend to $0$, this gives $\tilde\mu(A) \leq D K_2 \rho_\infty(A)$, as desired. $\square$
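In a free group, the statement of Lemma 2.13 can be checked by direct enumeration. The sketch below assumes an example that is not part of the text: $F_2 = \langle a, b \rangle$ with its standard word distance, where elements are reduced words over `aAbB` (capital letters denote inverses) and the set of elements beginning with a given word $g$ plays the role of the shadow of $g$. The helper names `ball` and `cylinder_mass` are hypothetical. For this group the uniform measure on $\partial F_2$ gives mass $1/(4 \cdot 3^{|g|-1})$ to such a cylinder, which is comparable to $e^{-v|g|}$ with $v = \log 3$, as in (2.9).

```python
def ball(radius):
    """All reduced words of length at most radius in F_2 = <a, b>."""
    gens = "aAbB"  # capital letters stand for inverse generators
    words = [""]
    frontier = [""]
    for _ in range(radius):
        # Extend each word by any letter that does not cancel its last letter.
        frontier = [
            w + x
            for w in frontier
            for x in gens
            if not w or w[-1] != x.swapcase()
        ]
        words.extend(frontier)
    return words


def cylinder_mass(prefix, radius):
    """Mass given by the uniform measure on the ball to words starting with prefix."""
    elements = ball(radius)
    return sum(1 for w in elements if w.startswith(prefix)) / len(elements)


if __name__ == "__main__":
    for i in (3, 6, 9):
        print(i, len(ball(i)), cylinder_mass("ab", i), 1 / 12)
```

As the radius grows, the mass of the cylinder of `ab` under $\rho_i$ approaches $1/12$, the mass of the same cylinder for the uniform boundary measure. In this tree-like case the limit has density $1$; in a general hyperbolic group Lemma 2.13 only gives a density bounded from above and below.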
Proof of Theorem 1.4.
Let $\rho_i$ be the uniform measure on the ball of radius $i$ (which has cardinality in $[C^{-1} e^{iv}, C e^{iv}]$). We wish to apply Theorem 2.6 to this sequence of measures. First, by Lemmas 2.12 and 2.13, the limit points of the stationary measures $\nu_i$ are equivalent to the Patterson-Sullivan measure. Therefore, they have no atom. Second, Lemma 2.8 shows that the quantity $h_i$ in (2.6) satisfies $h_i \geq iv - \log C - \log(2 + i)$. This tends to infinity. Hence, Theorem 2.6 applies, and gives $h(\rho_i) \geq (1 - \varepsilon) iv$ for large $i$.

Using the fundamental inequality $h \leq \ell v$ and the trivial bound $\ell(\rho_i) \leq L(\rho_i) \leq i$, we get
\[
(1 - \varepsilon) iv \leq h(\rho_i) \leq \ell(\rho_i) v \leq iv.
\]
It follows that $h(\rho_i) \sim iv$ and $\ell(\rho_i) \sim i$. $\square$

Remark 2.14.
Our technique also applies to estimate the entropy of other measures, for instance the measure $\mu_s = \sum e^{-s|g|} \delta_g / \sum e^{-s|g|}$ classically used in the construction of the Patterson-Sullivan measure. Indeed, $\mu_s$ converges when $s \searrow v$ to $\rho_\infty$, which has no atom. Moreover, writing $Z_s = \sum e^{-s|g|}$, we have $H(\mu_s) = s L(\mu_s) + \log Z_s$. One checks that $\log Z_s$ is negligible with respect to $H(\mu_s)$, and that the quantity $h_s$ from (2.6) is also equivalent to $H(\mu_s)$. Hence, Theorem 2.6 gives
\[
H(\mu_s)(1 + o(1)) \leq h_s (1 + o(1)) \leq h(\mu_s) \leq \ell(\mu_s) v \leq L(\mu_s) v \leq H(\mu_s)(1 + o(1)).
\]
These inequalities show that $h(\mu_s)/\ell(\mu_s) \to v$.

Remark 2.15.
One could imagine another strategy to find finitely supported measures $\mu_i$ for which $h(\mu_i)/\ell(\mu_i) \to v$. First, find a nice measure $\mu$ for which the stationary measure $\nu$ at infinity is precisely the Patterson-Sullivan measure (which implies that $h(\mu) = \ell(\mu) v$ since the Martin cocycle and the Busemann cocycle coincide). Let $\mu_i$ be a truncation of $\mu$. Since it converges to $\mu$, the continuity results for the drift and the entropy imply that $h(\mu_i)/\ell(\mu_i) \to h(\mu)/\ell(\mu) = v$.

We were not able to implement this strategy successfully. Given a measure $\nu$, there is a general technique due to Connell and Muchnik [CM07] to get a measure $\mu$ on $\Gamma$ with $\mu * \nu = \nu$. This technique requires a continuity assumption on $\xi \mapsto (dg_*\nu/d\nu)(\xi)$, which is not satisfied in our setting for $\nu = \rho_\infty$. However, in nice groups such as surface groups, this function is, for every $g$, continuous at all but finitely many points. The technique of [CM07] can be adapted to such a situation (in the proof of their Theorem 6.2, one should just take sets $Y_n$ that avoid the discontinuities of the spikes we have already used). Unfortunately, the resulting measure $\mu$ (which satisfies $\mu * \nu = \nu$) has infinite moment and infinite entropy, and is therefore useless for our purposes.

3. Rigidity for admissible measures
In this section, we prove Theorem 1.5. Assume that $(\Gamma, d)$ is a hyperbolic group endowed with a word distance, which is not virtually free. Let $\mu$ be a probability measure on $\Gamma$, with a superexponential moment, such that $\Gamma_\mu^+$ is a finite index subgroup of $\Gamma$. We want to prove that $h(\mu) < \ell(\mu) v$. We argue by contradiction, assuming that $h(\mu) = \ell(\mu) v$. Assume first that $\Gamma_\mu^+ = \Gamma$.

Since we are assuming the equality $h(\mu) = \ell(\mu) v$, Theorem 1.2 implies that
\[
|d_\mu(e, g) - v d(e, g)| \leq C.
\]
As a warm-up, let us first deal with the baby case $C = 0$. Then the distances $d_\mu$ and $d$ are proportional, hence they define the same Busemann boundary. The Busemann boundary $\partial_B \Gamma$ corresponding to $d$ is totally disconnected since the distance $d$ takes integer values (it is a word distance). On the other hand, the Busemann boundary associated to the Green metric $d_\mu$ is known as the Martin boundary of the random walk $(\Gamma, \mu)$. By [Anc87] and [Gou13], it is homeomorphic to the boundary $\partial\Gamma$ of $\Gamma$. Since the group $\Gamma$ is not virtually free, its boundary $\partial\Gamma$ is not totally disconnected (see [KB02, Theorem 8.1]), hence a contradiction.

Let us now go back to the general situation, when $C$ is nonzero (but still assuming $\Gamma_\mu^+ = \Gamma$). The argument is more complicated, but it still relies on the same facts: the boundary is not totally disconnected, while the word distance is integer valued (we will not use this fact directly, but rather the fact that stable translation lengths are rational, see Lemma 3.4). These two opposite features will give rise to a contradiction.

In order to get rid of the constant $C$, we will need a homogenized version of the inequality $|d_\mu(e, g) - v d(e, g)| \leq C$. This is Lemma 3.1 below. The homogenized quantity associated to the distance $d$ is called the stable translation length. For an element $g$ of $\Gamma$, it is defined by $l(g) = \lim |g^n|/n$ (it exists by subadditivity).

Recall that we write $c_M(g, \xi)$ for the Martin cocycle associated to the random walk, defined in Proposition 2.5. It satisfies the cocycle relation of Definition 2.1. We will not use its probabilistic definition, but rather the fact that the Martin cocycle is the Busemann cocycle associated to the Green distance $d_\mu$ of Theorem 1.2. In other words, $c_M(g, \xi) = \lim_{x \to \xi} d_\mu(g^{-1}, x) - d_\mu(e, x)$ (and this limit exists).

Lemma 3.1.
For $g \in \Gamma$ with infinite order, $c_M(g, g^+) = v l(g)$.

Proof. Recall that we are assuming that the equality $h(\mu) = \ell(\mu) v$ holds, therefore we have $|d_\mu(e, g) - v d(e, g)| \leq C$. It follows that the cocycle $c_M$ corresponding to $d_\mu$ and the cocycle $c_B$ corresponding to the distance $d$ satisfy $|c_M - v c_B| \leq 2C$. Note that $c_B$ is not defined on the geometric boundary, but on the horoboundary, so the proper way to write this inequality is $|c_M(g, \pi_B(\xi)) - v c_B(g, \xi)| \leq 2C$ for any $g \in \Gamma$ and any $\xi \in \partial_B \Gamma$.

Let $\xi \in \partial_B \Gamma$ with $\pi_B(\xi) \neq g^-$. Then $\lim c_B(g^n, \xi)/n = \lim h_\xi(g^{-n})/n = l(g)$. We choose $\xi$ with $\pi_B(\xi) = g^+$, to get
\[
\lim c_M(g^n, g^+)/n = \lim \bigl( v c_B(g^n, \xi)/n \pm 2C/n \bigr) = v l(g).
\]
As $g^+$ is $g$-invariant, the cocycle equation for $c_M$ on $\partial\Gamma$ gives $c_M(g, g^+) = c_M(g^n, g^+)/n$. This converges to $v l(g)$ when $n \to \infty$ by the previous equation. $\square$

The proof of Theorem 1.5 uses the following general result on cocycles.
Proposition 3.2.
Let $\Gamma$ be a hyperbolic group which is not virtually free. Let $c : \Gamma \times \partial\Gamma \to \mathbb{R}$ be a Hölder cocycle, such that any hyperbolic element $g$ satisfies $c(g, g^+) \in \mathbb{Z}$. Then there exists a hyperbolic element $g \in \Gamma$ with $c(g, g^-) = c(g, g^+)$.

Applied to the Busemann cocycle, this proposition implies that if a convex cocompact negatively curved manifold has a fundamental group which is not virtually free, then its length spectrum is not arithmetic, i.e., the lengths of its closed geodesics generate a dense subgroup of $\mathbb{R}$. This result is already known, see [Dal99, Page 205]. It is proved in that article using crossratios. This argument based on crossratios can be used to prove Proposition 3.2 in full generality. However, we will give a different, more direct, proof.

We will use the following topological lemma.
Lemma 3.3.
Let $g$ be a hyperbolic element in a hyperbolic group $\Lambda$ with connected boundary. There exists an arc $I$ (i.e., a subset of $\partial\Lambda$ homeomorphic to $[0, 1]$) joining $g^-$ and $g^+$, invariant under an iterate $g^i$ of $g$.

Proof. We will use nontrivial results on the topology of $\partial\Lambda$. When it is connected, it is also locally connected by [Swa96]. Hence, it is also path connected and locally path connected, see [HY61, Theorem 3-16]. Moreover, for any $\xi \in \partial\Lambda$, the space $\partial\Lambda \setminus \{\xi\}$ has finitely many ends by [Bow98b].

Consider $g$ as in the statement of the lemma. Its action permutes the ends of $\partial\Lambda \setminus \{g^-\}$. Taking an iterate of $g$, we can assume it stabilizes the ends. If $\xi$ is close to $g^-$, then so is $g\xi$. As they belong to the same end, one can join them by a small arc $J$ that avoids $g^-$ (and $g^+$). Then $\bigcup_{n \in \mathbb{Z}} g^n J$ joins $g^-$ to $g^+$, and it is invariant under $g$. However, it is not necessarily an arc if $g^i J$ intersects $J$ in a nontrivial way for $i \neq 0$. To get a real arc, we will shorten $J$ as follows.

As $g^n J$ converges to $g^\pm$ when $n$ tends to $\pm\infty$, the arc $J$ can only intersect finitely many $g^i J$. Let us fix a parametrization $u : [0, 1] \to J$. The quantity
\[
\inf \{ |t - s| : s, t \in [0, 1] \text{ and } \exists i \neq 0, \ u(t) = g^i u(s) \}
\]
is realized by compactness (since $i$ remains bounded), for some parameters $s, t, i$. Replacing $s, t, i$ with $t, s, -i$ if necessary, we may assume $i > 0$. As $g^-$ and $g^+$ are the only fixed points of $g^i$, we have $s \neq t$. Let $K = u([s, t])$; this is an arc between $\eta = u(s)$ and $g^i \eta = u(t)$. Moreover, $g^j K$ does not intersect $K$, except maybe at its endpoints for $j = \pm i$: otherwise, there exists $x$ in the interior of $K$ such that $g^j x$ also belongs to $K$, contradicting the minimality of $|s - t|$.

It follows that $\bigcup_{n \in \mathbb{Z}} g^{ni} K$ is an arc from $g^-$ to $g^+$, invariant under $g^i$. $\square$

Proof of Proposition 3.2.
Let us consider the cocycle $\bar c = c \bmod \mathbb{Z}$. The assumption of the proposition ensures that $\bar c(g, g^+) = 0$ for all hyperbolic elements $g$. In geometric terms, this would correspond to an assumption that the cocycle has vanishing average on all closed orbits. Hence, we may apply a version of Livsic's theorem, due in this context to [INO08] (Theorem 5.1). It ensures that the cocycle $\bar c$ is a coboundary: there exists a Hölder continuous function $\bar b : \partial\Gamma \to \mathbb{R}/\mathbb{Z}$ such that, for all $\xi \in \partial\Gamma$, for all $g \in \Gamma$,
\[
\bar c(g, \xi) = \bar b(g\xi) - \bar b(\xi). \tag{3.1}
\]
Recall that, since the group $\Gamma$ is not virtually free, its boundary is not totally disconnected (see [KB02, Theorem 8.1]). The stabilizer of a nontrivial component $L$ of $\partial\Gamma$ is a subgroup $\Lambda$ of $\Gamma$, quasi-convex hence hyperbolic, whose boundary is $L$ (see the discussion on top of Page 55 in [Bow98a]).

Let us consider an infinite order element $g \in \Lambda$. Lemma 3.3 constructs an arc $I$ from $g^-$ to $g^+$ in $\partial\Lambda \subset \partial\Gamma$, invariant under an iterate $g^i$ of $g$. Replacing $g$ with $g^i$, we may assume $i = 1$.

The restriction of the function $\bar b$ to the arc $I$ admits a continuous lift $b : I \to \mathbb{R}$, as $I$ is simply connected. The function $F : \xi \mapsto c(g, \xi) - b(g\xi) + b(\xi)$ is well defined on $I$, continuous, and it vanishes modulo $\mathbb{Z}$ by (3.1). Hence, it is constant, since it takes integer values and $I$ is connected. In particular, $c(g, g^-) = F(g^-) = F(g^+) = c(g, g^+)$. $\square$

In order to apply Proposition 3.2, we will need the following result on stable translation lengths in hyperbolic groups ([BH99, Theorem III.Γ.3.17]).

Lemma 3.4.
Let $(\Gamma, d)$ be a hyperbolic group with a word distance. Then there exists an integer $N$ such that, for any $g \in \Gamma$, one has $N l(g) \in \mathbb{Z}$.

The combination of Lemma 3.1 and Lemma 3.4 shows that the cocycle $c' = N c_M / v$ satisfies $c'(g, g^+) \in \mathbb{Z}$ for any hyperbolic element $g$. Moreover, this cocycle is Hölder-continuous since the Martin cocycle $c_M$ is itself Hölder-continuous. This follows from [INO08] if $\mu$ has finite support, and from [Gou13] if it has a superexponential moment. Now, Proposition 3.2 applied to $c'$ implies the existence of a hyperbolic element $g$ such that $c_M(g, g^+) = c_M(g, g^-)$. This is a contradiction since $c_M(g, g^+) = v l(g) > 0$ and $c_M(g, g^-) = -c_M(g^{-1}, g^-) = -v l(g) < 0$, again by Lemma 3.1. This concludes the proof of Theorem 1.5 when $\Gamma_\mu^+ = \Gamma$.

If $\Gamma_\mu^+$ is a finite index subgroup of $\Gamma$, the same proof almost works in $\Gamma_\mu^+$ to conclude that $\Gamma_\mu^+$ is virtually free if $h = \ell v$, implying that $\Gamma$ is also virtually free. The only difficulty is that the distance we are considering on $\Gamma_\mu^+$ is not a word distance for a system of generators of $\Gamma_\mu^+$. However, the only properties of the distance we have really used are:
(1) It is hyperbolic and quasi-isometric to a word distance (to apply Theorem 1.2).
(2) The stable translation lengths are rational numbers with bounded denominators.
These two properties are clearly satisfied for the restriction of the distance $d$ to $\Gamma_\mu^+$. Hence, the above proof also works in this case. This completes the proof of Theorem 1.5. $\square$

Remark 3.5. If $\Lambda$ is a quasi-convex subgroup of a hyperbolic group $\Gamma$, then the restriction to $\Lambda$ of a word distance on $\Gamma$ also satisfies the above two properties. Hence, Theorem 1.5 also holds in $\Lambda$ for such a distance.

4. Growth of non-distorted points in subgroups
Our goal in this section is to prove Theorem 1.6 on the entropy of a random walk on an infinite index subgroup $\Lambda$ of a hyperbolic group $\Gamma$. Since the geometry of such random walks is complicated to describe in general, our argument is indirect: we will show that, in any infinite index subgroup, the number of points that the random walk effectively visits is exponentially small compared to the growth of $\Gamma$. This is trivial if the growth $v_\Lambda = \liminf_{n \to \infty} \frac{\log |B_n \cap \Lambda|}{n}$ is strictly smaller than $v = v_\Gamma$. When $v_\Lambda = v$, on the other hand, we will argue that the random walk does not typically visit all of $\Lambda$, but only a subset made of non-distorted points. To prove Theorem 1.6, the main step is to show that, even when $v_\Lambda = v$, the number of such non-distorted points is exponentially smaller than $e^{nv}$. We introduce the notion of non-distorted points in Paragraph 4.1, prove this main geometric estimate in Paragraph 4.2, and apply this to random walks in Paragraph 4.4. Paragraph 4.3 is devoted to the case $v_\Lambda < v$, where unexpected phenomena happen even in distorted subgroups.
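The growth $v_\Lambda$ can be explored numerically in a simple case. The sketch below assumes an example that does not appear in the text: $\Gamma = F_2 = \langle a, b \rangle$ and $\Lambda = [F_2, F_2]$, the kernel of the abelianization $F_2 \to \mathbb{Z}^2$, which has infinite index. A reduced word lies in $\Lambda$ exactly when its exponent sums in $a$ and $b$ vanish, so the elements of $\Lambda$ on each sphere can be counted by dynamic programming over the pair (image in $\mathbb{Z}^2$, last letter). The names `STEPS` and `sphere_counts_in_commutator_subgroup` are hypothetical. By Grigorchuk's cogrowth criterion (a classical fact, not taken from this text), $v_\Lambda = v = \log 3$ here since the quotient $\mathbb{Z}^2$ is amenable, so this is an instance of the case $v_\Lambda = v$ discussed above.

```python
import math
from collections import defaultdict

# Image in Z^2 of each generator; capital letters stand for inverses.
STEPS = {"a": (1, 0), "A": (-1, 0), "b": (0, 1), "B": (0, -1)}


def sphere_counts_in_commutator_subgroup(max_len):
    """counts[n] = number of reduced words of length n lying in [F_2, F_2]."""
    # State: (exponent sum of a, exponent sum of b, last letter) -> word count.
    states = {(0, 0, ""): 1}
    counts = [1]  # the identity is the only element of length 0
    for _ in range(max_len):
        new = defaultdict(int)
        for (x, y, last), c in states.items():
            for letter, (dx, dy) in STEPS.items():
                if last and letter == last.swapcase():
                    continue  # this extension would not be a reduced word
                new[(x + dx, y + dy, letter)] += c
        states = new
        counts.append(
            sum(c for (x, y, _), c in states.items() if x == 0 and y == 0)
        )
    return counts


if __name__ == "__main__":
    counts = sphere_counts_in_commutator_subgroup(40)
    for n in (10, 20, 40):
        in_ball = sum(counts[: n + 1])  # |B_n ∩ Λ|
        print(n, math.log(in_ball) / n, math.log(3))
```

The printed ratios $\log |B_n \cap \Lambda| / n$ approach $\log 3$ only slowly, which reflects a polynomial correction to the exponential growth.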
Non-distorted points.
There are at least two different ways to define a notion of non-distorted point.
Definition 4.1.
Let $\Gamma$ be a finitely generated group endowed with a word distance $d = d_\Gamma$, and let $\Lambda$ be a subgroup of $\Gamma$.
• For $\varepsilon > 0$ and $M > 0$, we say that $g \in \Lambda$ is $(\varepsilon, M)$-quasi-convex if any geodesic $\gamma$ from $e$ to $g$ spends at least a proportion $\varepsilon$ of its time in the $M$-neighborhood of $\Lambda$, i.e.,
\[
|\{ i \in [1, |g|] : d(\gamma(i), \Lambda) \leq M \}| \geq \varepsilon |g|.
\]
We write $\Lambda_{QC(\varepsilon, M)}$ for the set of points in $\Lambda$ which are $(\varepsilon, M)$-quasi-convex.
• Assume additionally that $\Lambda$ is finitely generated, and endowed with a word distance $d_\Lambda$. For $D > 0$, we say that $g \in \Lambda$ is $D$-undistorted if $d_\Lambda(e, g) \leq D d_\Gamma(e, g)$. We write $\Lambda_{UD(D)}$ for the set of $D$-undistorted points.

Up to a change in the constants, these notions do not depend on the choice of the distance $d$. The first definition has the advantage of working for infinitely generated subgroups, but it may seem less natural than the second one. If $\Lambda$ is a quasi-convex subgroup of a hyperbolic group $\Gamma$, then all its points are $(1, M)$-quasi-convex if $M$ is large enough, and all its points are also $D$-undistorted for large enough $D$. In the general case, a quasi-convex point does not have to be undistorted: it may happen that the times $i$ such that $d(\gamma(i), \Lambda) \leq M$ are all included in $[1, |g|/2]$, while between $|g|/2$ and $|g|$ one needs to make a huge detour to follow $\Lambda$, making $d_\Lambda(e, g)$ much larger than $d_\Gamma(e, g)$. On the other hand, an undistorted point is automatically quasi-convex, at least in hyperbolic groups:

Proposition 4.2.
Let $\Gamma$ be a hyperbolic group, let $\Lambda$ be a finitely generated subgroup of $\Gamma$, and let $D > 0$. There exist $\varepsilon > 0$ and $M > 0$ such that any $D$-undistorted point is also $(\varepsilon, M)$-quasi-convex, i.e., $\Lambda_{UD(D)} \subset \Lambda_{QC(\varepsilon,M)}$.

Proof. Consider $g \in \Lambda$ which is not $(\varepsilon, M)$-quasi-convex. We have to show that $d_\Lambda(e, g)$ is much bigger than $n = d_\Gamma(e, g)$. The intuition is that, away from a $\Gamma$-geodesic from $e$ to $g$, the progress towards $g$ is much slower by hyperbolicity.

Let us consider a geodesic from $e$ to $g$ in $\Lambda$, with length $d_\Lambda(e, g)$. Replacing each generator of $\Lambda$ by the product of a uniformly bounded number of generators of $\Gamma$, we obtain a path $\gamma_\Lambda$ in the Cayley graph of $\Gamma$, remaining in the $C_0$-neighborhood of $\Lambda$ (for some $C_0 > 0$) and with length $|\gamma_\Lambda| \le C_0\, d_\Lambda(e, g)$.

Let us consider a geodesic $\gamma_\Gamma$ from $e$ to $g$ for the distance $d_\Gamma$. For each $x \in \Gamma$, we can consider its projection $\pi(x)$ on $\gamma_\Gamma$, i.e., the point on $\gamma_\Gamma$ that is closest to $x$ (if several points correspond, we take the closest one to $e$). This projection is $1$-Lipschitz. In particular, the projection of $\gamma_\Lambda$ covers the whole geodesic $\gamma_\Gamma$. For each $x_i \in \gamma_\Gamma$, let us consider the first point $y_i \in \gamma_\Lambda$ projecting to $x_i$.

Let us fix an integer $L$, large enough with respect to the hyperbolicity constant of $\Gamma$. Along $\gamma_\Gamma$, let us consider the points at distance $kL$ from $e$, i.e., $x_0 = e, x_L, x_{2L}, \dots, x_{mL}$ with $m = \lfloor n/L \rfloor$. In particular, $|\gamma_\Lambda| \ge \sum_i d_\Gamma(y_{iL}, y_{(i+1)L})$. Moreover, a tree approximation shows that
$d_\Gamma(y_{iL}, y_{(i+1)L}) \ge d_\Gamma(y_{iL}, x_{iL}) + L + d_\Gamma(x_{(i+1)L}, y_{(i+1)L}) - C_1$
(where $C_1$ only depends on the hyperbolicity constant of $\Gamma$). Choosing $L \ge C_1$, we get
$|\gamma_\Lambda| \ge \sum_{i=0}^{m} d_\Gamma(x_{iL}, y_{iL}) \ge \sum_{i=0}^{m} \bigl( d_\Gamma(x_{iL}, \Lambda) - C_0 \bigr)$.
Since we assume that $g$ is not $(\varepsilon, M)$-quasi-convex, the set of indices $i$ with $d(x_i, \Lambda) \le M$ has cardinality at most $\varepsilon n$. Taking $M > C_0$, the previous expression is bounded from below by
$(m + 1 - \varepsilon n) M - (m + 1) C_0 \ge (n/L - \varepsilon n) M - n C_0 / L$.
Finally, we get
$d_\Lambda(e, g) \ge |\gamma_\Lambda| / C_0 \ge n (1/L - \varepsilon) M / C_0 - n/L$.
If $\varepsilon$ is small enough and $M$ is large enough so that $(1/L - \varepsilon) M / C_0 - 1/L > D$, we obtain $d_\Lambda(e, g) > Dn$, i.e., $g \notin \Lambda_{UD(D)}$, as desired. □

From this point on, we will mainly work with the notion of quasi-convex points, since counting results on such points imply results on undistorted points by the previous proposition.

4.2. Non-distorted points in subgroups with $v_\Lambda = v$.
In this section, we show that there are exponentially few quasi-convex points in infinite-index subgroups of hyperbolic groups.
Theorem 4.3.
Let $\Gamma$ be a nonelementary hyperbolic group endowed with a word distance. Let $\Lambda$ be an infinite index subgroup of $\Gamma$. Then
(4.1)  $|B_n \cap \Lambda| = o(|B_n|)$.
Moreover, for all $\varepsilon > 0$ and $M > 0$, there exists $\eta > 0$ such that, for all large enough $n$,
(4.2)  $|B_n \cap \Lambda_{QC(\varepsilon,M)}| \le e^{-\eta n} |B_n|$.

One may wonder why we put the estimate (4.1) in the statement of the theorem, while the main emphasis is on counting quasi-convex points. It turns out that this estimate is not trivial, and that its proof uses the same techniques as the proof of (4.2). To illustrate that it is not trivial, let us remark that this estimate is not true without the hyperbolicity assumption. For instance, in $\Gamma = \mathbb{F}_2 \times \mathbb{Z}$ (with its canonical generating system, and the corresponding word distance), the infinite index subgroup $\Lambda = \mathbb{F}_2$ satisfies $|\Lambda \cap B_n| / |B_n| \ge c > 0$.

Theorem 4.3 is trivial if the growth rate $v_\Lambda$ of $\Lambda$ is strictly smaller than the growth rate $v$ of $\Gamma$, since in this case $|B_n \cap \Lambda|$ itself is exponentially smaller than $|B_n|$. However, this is not always the case, even for finitely generated subgroups.

Consider for instance a compact hyperbolic $3$-manifold which fibers over the circle, obtained as a suspension of a hyperbolic surface with a pseudo-Anosov. Its fundamental group $\Gamma$ surjects onto $\mathbb{Z} = \pi_1(S^1)$. The kernel $\Lambda$ of this morphism $\varphi$ is the fundamental group of the fiber. It is finitely generated, with infinite index, and $|B_n \cap \Lambda| \sim c |B_n| / \sqrt{n}$, see [Sha98].

Heuristically, one can understand in this case why there are exponentially few quasi-convex points in $\Lambda$. Let us consider a geodesic of length $n$ in $\Gamma$. It projects under $\varphi$ to a path in $\mathbb{Z}$, which behaves roughly like a random walk. In particular, $e^{-nv} |S_n \cap \Lambda|$ behaves like the probability that a random walk on $\mathbb{Z}$ comes back to the identity at time $n$. This is of order $1/\sqrt{n}$, in accordance with the rigorous results of [Sha98]. Such an element is quasi-convex if the random walk in $\mathbb{Z}$ spends a big proportion of its time close to the origin. A large deviation estimate shows that this is exponentially unlikely.

The proof of the theorem consists in making this heuristic precise, in the general case where the subgroup $\Lambda$ is not normal (so that there is no morphism $\varphi$ at hand). An important point in the proof is that a hyperbolic group is automatic, i.e., there exists a finite state automaton that recognizes a system of geodesics parameterizing bijectively the points in the group. Counting points in the group then amounts to a random walk on the graph of this automaton, while counting points in $\Lambda$ amounts to a fibred random walk, on this graph times $\Lambda \backslash \Gamma$. As this space is infinite, the random walk spends most of its time outside of finite sets, i.e., far away from $\Lambda$.

To formalize this argument, we will reduce the question to Markov chains on graphs, where we will use the following probabilistic lemma.

Lemma 4.4.
Consider a Markov chain $(X_n)$ on a countable set $V$, with a stationary measure $m$ (i.e., $m(x) = \sum_y m(y) p(y, x)$ for all $x$). Let $\tilde V$ be the set of points $x \in V$ such that $\sum_{x \to y} m(y) = +\infty$, where we write $x \to y$ if there exists a positive probability path from $x$ to $y$. Then, for all $x \in V$ and $x' \in \tilde V$,
(4.3)  $P_x(X_n = x') \to 0$ when $n \to \infty$.
Take $x \in \tilde V$ and $\varepsilon > 0$. There exists $\eta > 0$ such that, for all large enough $n$,
(4.4)  $P_x(X_n = x \text{ and } X_i \text{ visits } x \text{ at least } \varepsilon n \text{ times in between}) \le e^{-\eta n}$.

Proof. In countable state Markov chains, a point $x$ can be either transient, or null recurrent, or positive recurrent. Let us first show, by contradiction, that points in $\tilde V$ are not positive recurrent. Otherwise, the points that can be reached from $x$ form an irreducible class $\mathcal{C}$, which admits a stationary probability measure $p$. The restriction of $m$ to $\mathcal{C}$ is an excessive measure. By uniqueness (see [Rev84, Theorem 3.1.9]), the measure $m$ is proportional on $\mathcal{C}$ to $p$. In particular, it has finite mass there. This contradicts the assumption $\sum_{x \to y} m(y) = +\infty$.

Let us now show that, for all $x \in V$ and $x' \in \tilde V$, the probability $P_x(X_n = x')$ tends to $0$. Otherwise, conditioning on the first visit to $x'$, we deduce that $P_{x'}(X_n = x')$ does not tend to $0$. This implies that $x'$ is positive recurrent, a contradiction.

Let us now prove (4.4). Consider $x \in \tilde V$; it is either transient or null recurrent. If it is transient, the probability $p$ to come back to $x$ is $< 1$. Hence, the probability to come back $\varepsilon n$ times is bounded by $p^{\varepsilon n}$, and is therefore exponentially small as desired.

Assume now that $x$ is null recurrent: almost surely, the Markov chain comes back to $x$, but the waiting time $\tau$ has infinite expectation. Let $\tau_1, \tau_2, \dots$ be the lengths of the successive excursions based at $x$. They are independent and distributed like $\tau$, by the Markov property. The probability in (4.4) is bounded by $P(\sum_{i=1}^{\varepsilon n} \tau_i \le n)$, which is bounded for any $M$ by $P(\sum_{i=1}^{\varepsilon n} \tau_i 1_{\tau_i \le M} \le n)$. The random variables $\tau_i 1_{\tau_i \le M}$ are bounded, independent and identically distributed. If $M$ is large enough, they have expectation $> 1/\varepsilon$. A standard large deviation result then shows that $P(\sum_{i=1}^{\varepsilon n} \tau_i 1_{\tau_i \le M} \le n)$ is exponentially small, as desired. □

We will also need the following technical lemma, which was explained to us by B. Bekka.
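Estimate (4.4) of Lemma 4.4 can be checked numerically on a toy example. The Python sketch below uses an example that is not taken from the text: the simple random walk on $\mathbb{Z}$, which is null recurrent and for which the counting measure is stationary with infinite mass, so that every point belongs to $\tilde V$. The function name `return_with_many_visits` is hypothetical. It computes exactly, by dynamic programming over pairs (position, number of visits to $0$), the probability that the walk is back at $0$ at time $n$ after visiting $0$ at least a prescribed number of times.

```python
from fractions import Fraction


def return_with_many_visits(n, min_visits):
    """Exact value of P_0(X_n = 0 and #{1 <= i <= n : X_i = 0} >= min_visits)
    for the simple random walk on Z (illustrative example, not from the paper)."""
    half = Fraction(1, 2)
    # states maps (position, visits to 0 so far) to its probability
    states = {(0, 0): Fraction(1)}
    for _ in range(n):
        next_states = {}
        for (pos, visits), prob in states.items():
            for step in (-1, 1):
                new_pos = pos + step
                new_visits = visits + (1 if new_pos == 0 else 0)
                key = (new_pos, new_visits)
                next_states[key] = next_states.get(key, 0) + prob * half
        states = next_states
    return sum(prob for (pos, visits), prob in states.items()
               if pos == 0 and visits >= min_visits)
```

Calling `return_with_many_visits(n, 1)` gives the plain return probability $P_0(X_n = 0)$, while taking `min_visits` proportional to `n` gives the quantity bounded in (4.4). Comparing the two for growing `n` shows how much smaller the second one is.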
Lemma 4.5.
Let $\Lambda$ be a subgroup of a group $\Gamma$. Assume that there exists a finite subset $B$ of $\Gamma$ such that $B \Lambda B = \Gamma$. Then $\Lambda$ has finite index in $\Gamma$.

Proof. We have by assumption $\Gamma = \bigcup_{i,j} b_i \Lambda b_j = \bigcup_{i,j} \Lambda_i b_i b_j$, where $\Lambda_i = b_i \Lambda b_i^{-1}$ is a conjugate of $\Lambda$ (and has therefore the same index). A theorem of Neumann [Neu54] ensures that a group is never a finite union of right cosets of infinite index subgroups. Hence, one of the $\Lambda_i$ has finite index in $\Gamma$, and so has $\Lambda$. □

Let $\Gamma$ be a hyperbolic group, with a finite generating set $S$. Consider a finite directed graph $\mathcal{A} = (V, E, x_*)$ with vertex set $V$, edges $E$, a distinguished vertex $x_*$, and a labeling $\alpha : E \to S$. We associate to any path $\gamma$ in the graph (i.e., a sequence of edges $\sigma_0, \sigma_1, \dots, \sigma_{m-1}$ where the endpoint of $\sigma_i$ is the beginning of $\sigma_{i+1}$) a path in the Cayley graph starting from the identity and following the edges labeled $\alpha(\sigma_0)$, then $\alpha(\sigma_1)$, and so on. The endpoint of this path is $\alpha_*(\gamma) := \alpha(\sigma_0) \cdots \alpha(\sigma_{m-1})$. We always assume that any point can be reached by a path starting at $x_*$.

A hyperbolic group is automatic (see, for instance, [Cal13]): there exists such a graph with the following properties.
(1) For any path $\gamma$ in the graph, the corresponding path $\alpha(\gamma)$ is geodesic in the Cayley graph.
(2) The map $\alpha_*$ induces a bijection between the set of paths in the graph starting from $x_*$ and the group $\Gamma$.
In particular, the paths of length $n$ in the graph originating from $x_*$ parameterize the sphere $S_n$ of radius $n$ in the group. The existence of such a structure makes it possible, for instance, to prove that the growth series of a hyperbolic group is rational. We will use such an automaton to count the points in the subgroup $\Lambda$, and in particular the quasi-convex points.

We define a transition matrix $A$, indexed by $V$. By definition, $A_{xy}$ is the number of edges from $x$ to $y$. Hence, $(A^n)_{xy}$ is the number of paths of length $n$ from $x$ to $y$. In particular, the number of paths of length $n$ starting from $x_*$ is $\sum_y (A^n)_{x_* y}$. Write $u$ for the row vector with $1$ at position $x_*$ and $0$ elsewhere, and $\tilde u$ for the column vector with $1$ everywhere.
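As a minimal illustration of this path count, the Python sketch below builds the transition matrix of a toy geodesic automaton and sums the row of $A^n$ indexed by $x_*$. The automaton is an assumption made for the example (it is the standard one for the free group on two generators $a, b$, where the state records the last letter written and a letter may not follow its inverse). It is not the automaton of a general hyperbolic group, and the names `free_group_automaton` and `sphere_size` are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(a, n):
    """n-th power of a square matrix by repeated multiplication (n >= 0)."""
    size = len(a)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, a)
    return result


# Toy automaton: index 0 is the start vertex x*, the other states are the
# last letter written; capital letters denote inverses.
STATES = ["x*", "a", "A", "b", "B"]
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def free_group_automaton():
    """Transition matrix A with A[x][y] = number of edges from x to y."""
    size = len(STATES)
    adj = [[0] * size for _ in range(size)]
    for j in range(1, size):
        adj[0][j] = 1  # any first letter is allowed
    for i in range(1, size):
        for j in range(1, size):
            if STATES[j] != INVERSE[STATES[i]]:
                adj[i][j] = 1  # a letter may not follow its inverse
    return adj


def sphere_size(adj, n, start=0):
    """Number of paths of length n from the start vertex: the row sum of A^n."""
    return sum(mat_pow(adj, n)[start])
```

In this toy case the count is $4 \cdot 3^{n-1}$ for $n \ge 1$, so the spectral radius of the matrix is $3$ and the growth rate is $\log 3$.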
The number of paths of length $n$ starting from $x_*$ thus reads $u A^n \tilde u$. Therefore, $|S_n| = u A^n \tilde u$, proving the rationality of the growth function of the group. Let $v$ be the growth rate of balls in $\Gamma$. It satisfies $|B_n| \le C e^{nv}$, by [Coo93]. Hence, the spectral radius of $A$ is $e^v$, and $A$ has no Jordan block for this maximal eigenvalue.

To understand the points of the infinite index subgroup $\Lambda$ of $\Gamma$, we consider an extension $\mathcal{A}_\Lambda$ of $\mathcal{A}$, with fibers $\Lambda \backslash \Gamma$. Its vertex set $V_\Lambda$ is made of the pairs $(x, \Lambda g) \in V \times \Lambda \backslash \Gamma$. For any edge $\sigma$ in $\mathcal{A}$, going from $x$ to $y$ and with label $\alpha(\sigma)$, we put for any $g \in \Gamma$ an edge in $\mathcal{A}_\Lambda$ from $(x, \Lambda g)$ to $(y, \Lambda g \alpha(\sigma))$. A path $\gamma$ in $\mathcal{A}$, from $x$ to $y$, lifts to a path $\tilde\gamma$ in $\mathcal{A}_\Lambda$ originating from $(x, \Lambda e)$. By construction, its endpoint is $(y, \Lambda \alpha_*(\gamma))$. This shows that the paths in the graph $\mathcal{A}_\Lambda$ remember the current right coset of $\Lambda$.

The next lemma proves that the relevant components of this fibred graph are infinite.

Lemma 4.6.
Let $\tilde x_0 = (x_0, \Lambda g_0)$ belong to $\mathcal{A}_\Lambda$. Let $\mathcal{C}$ be the component of $x_0$ in $\mathcal{A}$ (i.e., the set of points that can be reached from $x_0$ and from which one can go back to $x_0$). Let $A_{\mathcal{C}}$ be the restriction of the matrix $A$ to the points in $\mathcal{C}$. Assume that its spectral radius $\rho(A_{\mathcal{C}})$ is equal to $e^v$. Then, starting from $\tilde x_0$ in the graph $\mathcal{C}_\Lambda$ (the restriction of $\mathcal{A}_\Lambda$ to $\mathcal{C} \times \Lambda \backslash \Gamma$), one can reach infinitely many different points of $\mathcal{C}_\Lambda$.

Proof. It suffices to show that one can reach infinitely many points whose component in $\mathcal{C}$ is $x_0$. Assume by contradiction that one can only reach a finite number of classes $(x_0, \Lambda g_i)$.

Given $w \in \Gamma$ and $C > 0$, let $Y_{w,C}$ be the set of points in $\Gamma$ that have a geodesic expression in which, for any subword $\tilde w$ of this expression and for any $a, b$ with length at most $C$, one has $w \ne a \tilde w b$. In other words, the points in $Y_{w,C}$ are those that never see $w$ (nor even a thickening of $w$ of size $C$) in their geodesic expressions. Theorem 3 in [AL02] proves the existence of $C_0$ such that, for any $w$, the quantity $|B_n \cap Y_{w,C_0}| / |B_n|$ tends to $0$ (the important point is that $C_0$ does not depend on $w$).

The number of paths in $\mathcal{C}$ originating from $x_0$ grows at least like $c |B_n|$ since the spectral radius of $A_{\mathcal{C}}$ is $e^v$. These paths give rise to distinct points in $\Gamma$. Hence, there exists such a path $\gamma$ such that $\alpha_*(\gamma) \notin Y_{w,C_0}$. In particular, there exists a subpath $\gamma_0$ such that $\alpha_*(\gamma_0)$ can be written as $a_0 w b_0$ with $|a_0| \le C_0$ and $|b_0| \le C_0$. We can choose a path from $x_0$ to the starting point of $\gamma_0$, of bounded length (since $\mathcal{C}$ is finite), and another path from the endpoint of $\gamma_0$ to $x_0$. Concatenating them, we get a path $\gamma_1$ from $x_0$ to itself with $\alpha_*(\gamma_1) = a_1 w b_1$ and $|a_1|, |b_1| \le C_1 = C_0 + 2 \operatorname{diam}(\mathcal{C})$. By assumption, $\Lambda g_0 \alpha_*(\gamma_1)$ is one of the finitely many $\Lambda g_i$ since we are returning to $x_0$. Hence, there exists $\lambda \in \Lambda$ such that $g_0 a_1 w b_1 = \lambda g_i$. This shows that $w \in B \Lambda B$, where $B$ is the ball of radius $C_1 + \max_i d(e, g_i)$. As this holds for any $w$, we have proved that $B \Lambda B = \Gamma$. By Lemma 4.5, this shows that $\Lambda$ has finite index in $\Gamma$, a contradiction. □

Lemma 4.7.
Let $K(n, \tilde x_0, \varepsilon)$ denote the set of paths in $\mathcal{A}_\Lambda$ starting at a point $\tilde x_0$, of length $n$, coming back to $\tilde x_0$ at time $n$, and spending a proportion at least $\varepsilon$ of the time at $\tilde x_0$. Consider $\tilde x_0 \in \mathcal{A}_\Lambda$ and $\varepsilon > 0$. Then there exist $\eta > 0$ and $C > 0$ such that, for all $n \in \mathbb{N}$,
$|K(n, \tilde x_0, \varepsilon)| \le C e^{n(v - \eta)}$.

Proof. Write $\tilde x_0 = (x_0, \Lambda g_0)$, and let $\mathcal{C}$ be the component of $x_0$ in $\mathcal{A}$. If the spectral radius of the restricted transition matrix $A_{\mathcal{C}}$ is $< e^v$, we simply bound $|K(n, \tilde x_0, \varepsilon)|$ by the number of paths in $\mathcal{C}$ from $x_0$ to itself. This is at most $\|A_{\mathcal{C}}^n\|$, which is exponentially smaller than $e^{nv}$ as desired.

Assume now that $\rho(A_{\mathcal{C}}) = e^v$. We will understand the number of paths in $\mathcal{C}$ (and in its lift $\mathcal{C}_\Lambda$) in terms of a Markov chain. The matrix $A_{\mathcal{C}}$ has a unique eigenvector $q$ corresponding to the eigenvalue $e^v$; it is positive by the Perron-Frobenius theorem. By definition, $p(x, y) = e^{-v} A_{xy} q(y) / q(x)$ satisfies, for any $x \in \mathcal{C}$,
$\sum_{y \in \mathcal{C}} p(x, y) = \frac{e^{-v}}{q(x)} \sum_{y \in \mathcal{C}} A_{xy} q(y) = 1$.
This means that $p(x, y)$ is a transition kernel on $\mathcal{C}$. Denote by $(X_n)_{n \in \mathbb{N}}$ the corresponding Markov chain. By construction,
$P_x(X_n = y) = e^{-nv} (A^n)_{xy}\, q(y) / q(x)$.
Moreover, $(A^n)_{xy}$ is the number of paths of length $n$ in $\mathcal{A}$ from $x$ to $y$. Hence, up to a bounded multiplicative factor $q(y)/q(x)$, the transition probabilities of the Markov chain $X_n$ count the number of paths in the graph $\mathcal{C}$. Let $m$ denote the unique stationary probability for the Markov chain on $\mathcal{C}$.

We lift everything to $\mathcal{C}_\Lambda$, assigning to an edge the transition probability of its projection in $\mathcal{C}$. The stationary measure $m$ lifts to a stationary measure $m_\Lambda$, which is simply the product of $m$ and of the counting measure in the direction $\Lambda \backslash \Gamma$. Denoting by $X^\Lambda_n$ the Markov chain in $\mathcal{C}_\Lambda$, we have
$e^{-nv} |K(n, \tilde x_0, \varepsilon)| = P_{\tilde x_0}(X^\Lambda_n = \tilde x_0 \text{ and } X^\Lambda_i \text{ visits } \tilde x_0 \text{ at least } \varepsilon n \text{ times in between})$.
By Lemma 4.6, the Markov chain starting from $\tilde x_0$ can reach infinitely many points. Equivalently, since $m$ is bounded from below, it can reach a set of infinite $m_\Lambda$-measure. Therefore, Lemma 4.4 applies, and shows that the above quantity is exponentially small. □

Proof of Theorem 4.3.
Let us first prove (4.2). Counting the points in $S_n \cap \Lambda_{QC(\varepsilon,M)}$ amounts to counting the paths of length $n$ in $\mathcal{A}_\Lambda$, starting from $(x_*, \Lambda e)$ and spending a proportion at least $\varepsilon$ of their time in the finite subset $F = V \times \Lambda B_M \subset V_\Lambda$. Such a path spends a proportion at least $\varepsilon_0 = \varepsilon / |F|$ of its time at a given point $\tilde x \in F$. Let $k$ and $k + m$ denote the first and last visits to $\tilde x$ (with $m \ge \varepsilon_0 n$ since there are at least $\varepsilon_0 n$ visits). Such a path is the concatenation of a path from $(x_*, \Lambda e)$ to $\tilde x$ of length $k$ (their number is bounded by the corresponding number of paths in $\mathcal{A}$, at most $\|A^k\| \le C e^{kv}$), of a path in $K(m, \tilde x, \varepsilon_0)$, and of a path starting from $\tilde x$ of length $n - k - m$ (their number is again bounded by the number of corresponding paths in $\mathcal{A}$, at most $C e^{(n-k-m)v}$). Hence, their number is at most $C e^{(n-m)v} |K(m, \tilde x, \varepsilon_0)|$. Summing over the points $\tilde x \in F$, over the at most $n$ possible values of $k$, and over the values of $m$, we get the inequality
$|S_n \cap \Lambda_{QC(\varepsilon,M)}| \le C n e^{nv} \sum_{\tilde x \in F} \sum_{m = \varepsilon_0 n}^{n} e^{-mv} |K(m, \tilde x, \varepsilon_0)|$.
Lemma 4.7 shows that this is exponentially smaller than $e^{nv}$.

Let us now prove (4.1), using similar arguments. A point in $S_n \cap \Lambda$ corresponds to a path of length $n$ in $\mathcal{A}_\Lambda$, starting from $(x_*, \Lambda e)$ and ending at a point $(x, \Lambda e)$. We say that a component $\mathcal{C}$ in the graph $\mathcal{A}$ is maximal if the spectral radius of the corresponding restricted matrix $A_{\mathcal{C}}$ is $e^v$. Since the matrix $A$ has no Jordan block corresponding to the eigenvalue $e^v$, a path in the graph encounters at most one maximal component. The paths in $\mathcal{A}_\Lambda$ whose projection in $\mathcal{A}$ spends a time $k$ in non-maximal components give an overall contribution to $|S_n \cap \Lambda|$ bounded by $C e^{(n-k)v + k(v-\eta)} \le C e^{-\eta k} |B_n|$. Given $\varepsilon > 0$, their contribution for $k \ge k_0(\varepsilon)$ is bounded by $\varepsilon |B_n|$. Hence, it suffices to control the paths for fixed $k$. Let us fix the beginning of such a path, from $(x_*, \Lambda e)$ to a point $(x_0, \Lambda g_0)$ where $x_0$ is in a maximal component $\mathcal{C}$, and its end from $(x_1, \Lambda g_1)$ with $x_1 \in \mathcal{C}$ to a point $(x, \Lambda e)$. To conclude, one should show that the number of paths of length $n$ from $(x_0, \Lambda g_0)$ to $(x_1, \Lambda g_1)$ is $o(e^{nv})$. This follows from the probabilistic interpretation in the proof of Lemma 4.7 and from (4.3). □

4.3. Non-distorted points in subgroups with $v_\Lambda < v$.
Let $\Lambda$ be a subgroup of a hyperbolic group $\Gamma$. Let $v_\Lambda$ and $v_\Gamma$ be their respective growths, for a word distance on $\Gamma$. If $v_\Lambda = v_\Gamma$, Theorem 4.3 proves that there is a dichotomy:
(1) Either $\Lambda$ is quasi-convex (equivalently, $\Lambda$ has finite index in $\Gamma$). Then $|B_n \cap \Lambda| \ge c e^{n v_\Lambda}$, and all points in $\Lambda$ are quasi-convex.
(2) Or $\Lambda$ is not quasi-convex (equivalently, it has infinite index in $\Gamma$). Then $|B_n \cap \Lambda| = o(e^{n v_\Lambda})$, and there are exponentially few quasi-convex points in $\Lambda$.

Consider now a general subgroup $\Lambda$ with $v_\Lambda < v_\Gamma$. If it is quasi-convex, then (1) above is still satisfied: $|B_n \cap \Lambda| \ge c e^{n v_\Lambda}$ by [Coo93], and all points in $\Lambda$ are quasi-convex. One may ask if these properties are equivalent, and if they characterize quasi-convex subgroups. This question is reminiscent of a question of Sullivan in hyperbolic geometry: Are convex cocompact groups the only ones to have finite Patterson-Sullivan measure? Peigné showed in [Pei03] that the answer to this question is negative. His counterexamples adapt to our situation, giving also a negative answer to our question.

Proposition 4.8.
There exists a finitely generated subgroup $\Lambda$ of a hyperbolic group $\Gamma$ endowed with a word distance, which is not quasi-convex, but for which $C^{-1} e^{n v_\Lambda} \le |B_n \cap \Lambda| \le C e^{n v_\Lambda}$. Moreover, most points of $\Lambda$ are quasi-convex: there exist $\varepsilon$ and $\eta$ such that
(4.5)  $|B_n \cap \Lambda \setminus \Lambda_{QC(\varepsilon,0)}| \le C e^{n(v_\Lambda - \eta)}$.

Proof. The example is the same as in [Pei03], but his geometric proofs are replaced by combinatorial arguments based on generating series.

Let $G$ be a finitely generated non-quasi-convex subgroup of a hyperbolic group $\tilde G$ (take for instance for $\tilde G$ the fundamental group of a hyperbolic $3$-manifold which fibers over the circle, and for $G$ the fundamental group of the fiber of this fibration). Let $H = \mathbb{F}_k$, with $k$ large enough so that $v_H > v_G$. We take $\Lambda = G * H \subset \Gamma = \tilde G * H$. It is not quasi-convex, because of the factor $G$. Writing $v_\Lambda$ for its growth, we claim that, for some $c > 0$,
(4.6)  $|S_n \cap \Lambda| \sim c e^{n v_\Lambda}$.

We compute with generating series. Let $F_G(z)$ be the growth series for $G$, given by $F_G(z) = \sum_{n \ge 0} |S_n \cap G| z^n$. Likewise, we define $F_H$ and $F_\Lambda$. Since any word in $\Lambda$ has a canonical decomposition in terms of words in $G$ and $H$, a classical computation (see [dlH00, Prop. VI.A.4]) gives
(4.7)  $F_\Lambda = \dfrac{F_G F_H}{1 - (F_G - 1)(F_H - 1)}$.
Let $z_G = e^{-v_G} > z_H = e^{-v_H}$ be the convergence radii of $F_G$ and $F_H$. At $z_H$, we have $F_H(z_H) = +\infty$, since the cardinality of spheres in the free group is exactly of the order of $e^{n v_H}$. When $z$ increases to $z_H$, the function $(F_G(z) - 1)(F_H(z) - 1)$ takes the value $1$, at a number $z = z_\Lambda$. Since this is the first singularity of $F_\Lambda$, we have $z_\Lambda = e^{-v_\Lambda}$. Moreover, the function $F_\Lambda$ is meromorphic at $z_\Lambda$, with a pole of order $1$ (since the function $(F_G - 1)(F_H - 1)$ has positive derivative, being a power series with nonnegative coefficients). It follows from a simple tauberian theorem (see, for instance, [FS09, Theorem IV.10]) that the coefficients of $F_\Lambda$ behave like $c z_\Lambda^{-n}$, proving (4.6).

Let us estimate the number of non-quasi-convex points in $\Lambda$. Consider a word $w \in \Lambda$ of length $n$, for instance starting with a factor in $G$ and ending with a factor in $H$. It can be written as $g_1 h_1 g_2 h_2 \cdots h_s$. Along a geodesic from $e$ to $w$, all the words $g_1 h_1'$ (with $h_1'$ prefix of $h_1$) belong to $\Lambda$. So do all the words $g_1 h_1 g_2 h_2'$ with $h_2'$ prefix of $h_2$, and so on.
Therefore, the proportion of time that the geodesic spends outside of $\Lambda$ is at most $\sum |g_i| / n$. Such a point in $\Lambda \setminus \Lambda_{QC(\varepsilon,0)}$ satisfies $\sum |g_i| \ge (1 - \varepsilon) n$ and $\sum |h_i| \le \varepsilon n$. Assuming $\varepsilon \le 1/2$, we have $n \le 2 \sum |g_i|$, and this gives $\sum |h_i| \le 2\varepsilon \sum |g_i|$. In particular, for any $\alpha > 0$, we have $e^{\alpha(\sum |g_i| - (2\varepsilon)^{-1} \sum |h_i|)} \ge 1$. Let $u_n = |S_n \cap \Lambda \setminus \Lambda_{QC(\varepsilon,0)}|$. Its generating series satisfies the following inequality (where we only write in detail the words starting with $G$ and ending in $H$, the other ones being completely analogous):
$\sum_n u_n z^n \le \sum_n \sum_{\ell \ge 1} \sum_{a_1 + b_1 + a_2 + \dots + b_\ell = n} e^{\alpha(\sum a_i - (2\varepsilon)^{-1} \sum b_i)} |S_{a_1} \cap G| \, |S_{b_1} \cap H| \cdots |S_{b_\ell} \cap H| \, z^n + \dots$
$= \sum_{\ell \ge 1} \bigl[ (F_G(e^{\alpha} z) - 1)(F_H(e^{-\alpha (2\varepsilon)^{-1}} z) - 1) \bigr]^{\ell} + \dots$
$= \dfrac{F_G(e^{\alpha} z) \, F_H(e^{-\alpha (2\varepsilon)^{-1}} z)}{1 - (F_G(e^{\alpha} z) - 1)(F_H(e^{-\alpha (2\varepsilon)^{-1}} z) - 1)}$.
This is the same formula as in (4.7), but the factor $z$ has been shifted in $F_G$ and $F_H$. Choose $\alpha > 0$ such that $e^{\alpha} z_\Lambda < z_G$, and then $\varepsilon$ small enough so that $(F_G(e^{\alpha} z_\Lambda) - 1)(F_H(e^{-\alpha (2\varepsilon)^{-1}} z_\Lambda) - 1) < 1$. We deduce that the series $\sum u_n z^n$ converges for $z = z_\Lambda$, and even slightly to its right. It follows that $u_n$ is exponentially small compared to $z_\Lambda^{-n}$. This proves (4.5). □

4.4. Application to random walks in infinite index subgroups.
In this paragraph, we use Theorem 4.3 to prove Theorem 1.6 on random walks given by a measure $\mu$ on a hyperbolic group $\Gamma$, assuming that $\Gamma_\mu$ has infinite index in $\Gamma$.

Before proving Theorem 1.6, we give another easier result, pertaining to the case where $\mu$ has a finite moment for a word distance on $\Gamma_\mu$ (which should be finitely generated): in this case, the random walk typically visits undistorted points. This easy statement is not used later on, but it gives a heuristic explanation of Theorem 1.6.

Lemma 4.9.
Let $\Lambda$ be a finitely generated subgroup of a finitely generated group $\Gamma$. Let $d_\Lambda$ and $d_\Gamma$ be the two corresponding word distances. Consider a probability measure $\mu$ on $\Lambda$, with a moment of order $1$ for $d_\Lambda$ (and therefore for $d_\Gamma$), with nonzero drift for $d_\Gamma$. Let $X_n$ denote the corresponding random walk. There exists $D > 0$ such that $P(X_n \in \Lambda_{UD(D)}) \to 1$.

Proof. Almost surely, $d_\Gamma(e, X_n) \sim \ell_\Gamma n$, for some nonzero drift $\ell_\Gamma$. In the same way, $d_\Lambda(e, X_n) \sim \ell_\Lambda n$. For any $D > \ell_\Lambda / \ell_\Gamma$, we get almost surely $d_\Lambda(e, X_n) \le D\, d_\Gamma(e, X_n)$ for large enough $n$, i.e., $X_n \in \Lambda_{UD(D)}$. □

This lemma readily implies Theorem 1.6 under the additional assumption that $\Lambda$ is finitely generated and that $\mu$ has a moment of order $1$ for $d_\Lambda$. Indeed, for large $n$, with probability at least $1/2$, the point $X_n$ belongs to $B_{(\ell + \varepsilon) n} \cap \Lambda_{UD(D)}$, whose cardinality is bounded by $C e^{(\ell + \varepsilon) n (v - \eta)}$ according to Theorem 4.3. Lemma 2.4 yields $h \le (\ell + \varepsilon)(v - \eta)$, hence $h \le \ell (v - \eta) < \ell v$, completing the proof.

However, the assumptions of Theorem 1.6 are much weaker: even when $\Lambda$ is finitely generated, it is much more restrictive to require a moment of order $1$ on $\Lambda$ than on $\Gamma$, precisely because the $\Gamma$-distance is smaller than the $\Lambda$-distance on distorted points, which make up most of $\Lambda$. The general proof will not use undistorted points (which are not even defined when $\Lambda$ is not finitely generated), but rather quasi-convex points: we will show that, typically, the random walk concentrates on quasi-convex points. With the previous argument, Theorem 1.6 readily follows from the next lemma.
Lemma 4.10.
Let $\Lambda$ be a subgroup of a hyperbolic group $\Gamma$ endowed with a word distance $d = d_\Gamma$. Let us consider a probability measure $\mu$ on $\Lambda$, with a moment of order $1$ for $d_\Gamma$. There exist $\varepsilon > 0$ and $M > 0$ such that $P(X_n \in \Lambda_{QC(\varepsilon,M)}) \ge 1/2$ for large enough $n$.

Proof. The lemma is trivial if $\mu$ is elementary, since all the elements of $\Gamma_\mu \subset \Lambda$ are then quasi-convex. We may therefore assume that $\mu$ is non-elementary.

The random walk at time $n$ is given by $X_n = g_1 \cdots g_n$, where the $g_i$ are independent and distributed like $\mu$. We will show that most products $g_1 \cdots g_i$ (which belong to $\Lambda$) are within distance $M$ of a geodesic from $e$ to $X_n$ (this amounts to the classical fact that trajectories of the random walk follow geodesics in the group), and moreover that they approximate a proportion at least $\varepsilon$ of the points on this geodesic. This will give $X_n \in \Lambda_{QC(\varepsilon,M)}$ as desired. The second point is more delicate: we should for instance exclude the situation where, given a geodesic $\gamma$, one has $X_n = \gamma(a(n))$ where $a(n)$ is the smallest square larger than $n$. In this case, $X_n$ follows the geodesic $\gamma$ at linear speed, but nevertheless the proportion of $\gamma$ it visits tends to $0$. This behavior will be excluded thanks to the fact that, with high probability, the jumps of the random walk are bounded.

The argument is probabilistic and formulated in terms of the bilateral version of the random walk. On $\Omega = \Gamma^{\mathbb{Z}}$ with the product measure $P = \mu^{\otimes \mathbb{Z}}$, let $g_n$ be the $n$-th coordinate. The $g_n$ are independent, identically distributed, and correspond to the increments of a random walk $(X_n)_{n \in \mathbb{Z}}$ with $X_0 = e$ and $X_n^{-1} X_{n+1} = g_{n+1}$. Almost surely, $X_n$ converges when $n \to \pm\infty$ towards two random variables $\xi^\pm \in \partial\Gamma$, with $\xi^+ \ne \xi^-$ almost surely since these random variables are independent and atomless. Following Kaimanovich [Kai00], denote by $S(\xi^-, \xi^+)$ the union of all the geodesics from $\xi^-$ to $\xi^+$. Let $\pi$ be the projection on $S(\xi^-, \xi^+)$, i.e., $\pi(g)$ is the closest point to $g$ on $S(\xi^-, \xi^+)$.
It is not uniquely defined, but two possible choices are within distance $C$, for some $C$ only depending on $\Gamma$.

Let us choose $L > 0$ large enough (how large will only depend on the hyperbolicity constant of the space). Any measurable function is bounded on sets with arbitrarily large measure. Hence, there exists $K > 0$ such that, with probability at least $9/10$,
(1) For every $|k| \ge K$, the projections $\pi(X_k)$ are distant from $\pi(X_0)$ by at least $L$ (and they are closer to $\xi^+$ if $k > 0$, and to $\xi^-$ if $k < 0$).
(2) We have $d(e, S(\xi^-, \xi^+)) \le K$.
As everything is equivariant, we deduce that, for all $i \in \mathbb{Z}$, the point $X_i$ satisfies the same properties with probability at least $9/10$, i.e.,
(4.8)  $d(X_i, S(\xi^-, \xi^+)) \le K$ and, for all $|k| \ge K$, $d(\pi(X_i), \pi(X_{i+k})) \ge L$.

Let $n$ be a large integer. Write $m = \lfloor n/K \rfloor$. Among the integers $K, 2K, \dots, mK \le n$, we consider the set $I_n(\omega)$ of those $i$ such that $X_i$ satisfies (4.8). We have $E(|I_n|) \ge m \cdot 9/10$. As $|I_n| \le m$, we get
$\frac{9m}{10} \le E(|I_n|) \le \frac{m}{10} P(|I_n| < m/10) + m P(|I_n| \ge m/10) = \frac{m}{10} + \frac{9m}{10} P(|I_n| \ge m/10)$.
This gives $P(|I_n| \ge m/10) \ge 8/9$. Let $\eta = 1/(20K)$. Let $\Omega_n$ be the set of $\omega$ such that $|I_n(\omega)| \ge \eta n + 1$, and $X_0$ and $X_n$ satisfy (4.8), and $d(X_n, e) \le 2\ell n$ (where $\ell$ is the drift of $\mu$). It satisfies $P(\Omega_n) \ge 1/2$ if $n$ is large enough. This is the set of good trajectories for which we can control the position of many of the $X_i$.

Figure 1. The projections on $\gamma$ and $S$.

Let $\omega \in \Omega_n$. We write $Y_i$ for a projection of $X_i$ on a geodesic $\gamma$ from $e$ to $X_n$. Let $\tilde I_n = I_n \setminus \{mK\}$, so that the elements of $\tilde I_n$ are at distance at least $K$ of $0$ and $n$. As $X_0$ and $X_n$ satisfy (4.8), the projections $\pi(X_i)$ for $i \in \tilde I_n$ are located between $\pi(X_0)$ and $\pi(X_n)$, and are at a distance at least $L$ of these points (see Figure 1). If $L$ is large enough, we obtain $d(\pi(X_i), Y_i) \le C$ by hyperbolicity, where $C$ only depends on $\Gamma$. This gives
$d(Y_i, \Lambda) \le d(Y_i, \pi(X_i)) + d(\pi(X_i), X_i) \le C + K$,
thanks to (4.8) for $X_i$. When $i \ne j$ belong to $\tilde I_n$, we have $d(\pi(X_i), \pi(X_j)) \ge L$ again thanks to (4.8), hence $d(Y_i, Y_j) \ge L - 2C$. If $L$ was chosen larger than $2C + 1$, this shows that $Y_i \ne Y_j$. We have found along $\gamma$ at least $|I_n| - 1$ distinct points, within distance $C + K$ of $\Lambda$. Moreover, for large enough $n$,
$|I_n| - 1 \ge \eta n \ge 2\ell n \cdot (\eta / 2\ell) \ge d(e, X_n) \cdot (\eta / 2\ell)$.
Let $\varepsilon = \eta / 2\ell$ and $M = C + K$. We have shown that, for $\omega \in \Omega_n$ (whose probability is at least $1/2$), the point $X_n(\omega)$ belongs to $\Lambda_{QC(\varepsilon,M)}$. □

5. Construction of maximizing measures
In this section, we prove Theorem 1.7: Given any finite subset $\Sigma$ in a hyperbolic group $\Gamma$, there exists a measure $\mu_\Sigma$ maximizing the quantity $h(\mu)/\ell(\mu)$ over all measures $\mu$ supported on $\Sigma$ with $\ell(\mu) > 0$. To prove this result, we start with a sequence of measures $\mu_i$ supported on $\Sigma$ such that $h(\mu_i)/\ell(\mu_i)$ converges to the maximum $M$ of these quantities. We are looking for $\mu_\Sigma$ with $h(\mu_\Sigma)/\ell(\mu_\Sigma) = M$. Replacing $\mu_i$ with $(\mu_i + \delta_e)/2$ (this multiplies entropy and drift by $1/2$, and does not change their ratio) and adding $e$ to $\Sigma$, we can always assume $\mu_i(e) \ge 1/2$, to avoid periodicity problems.

Extracting a subsequence, we can ensure that $\mu_i$ converges to a limit probability measure $\mu$. We treat separately the two following cases:
(1) $\Gamma_\mu$ is non-elementary.
(2) $\Gamma_\mu$ is elementary.

Let us handle first the easy case, where $\Gamma_\mu$ is non-elementary. In this case, the entropy and the drift are continuous at $\mu$, by Proposition 2.3 and Theorem 2.9, both due to Erschler and Kaimanovich in [EK13]. Therefore, $h(\mu_i)/\ell(\mu_i)$ tends to $h(\mu)/\ell(\mu)$, since in this case $\ell(\mu) > 0$. One can thus take $\mu_\Sigma = \mu$.

The case where $\Gamma_\mu$ is elementary is much more interesting. Let us describe heuristically what should happen, in a simple case. We assume that $\mu_i = (1 - \varepsilon)\mu + \varepsilon\nu$ where $\nu$ is a fixed measure, and $\varepsilon$ tends to $0$. The random walk for $\mu_i$ can be described as follows. At each jump, one picks $\mu$ (with probability $1 - \varepsilon$) or $\nu$ (with probability $\varepsilon$), then one jumps according to the chosen measure. After time $N$, the measure $\nu$ is chosen roughly $\varepsilon N$ times, with intervals of length $1/\varepsilon$ in between, where $\mu$ is chosen. Thus, $\mu_i^{*N}$ behaves roughly like $(\mu^{*1/\varepsilon} * \nu)^{\varepsilon N}$.

When $\Gamma_\mu$ is finite, the measure $\mu^{*1/\varepsilon}$ is close, when $\varepsilon$ is small, to the uniform measure $\pi$ on $\Gamma_\mu$. Therefore, $\mu_i^{*N}$ is close to $(\pi * \nu)^{\varepsilon N}$. We deduce $h(\mu_i) \sim \varepsilon h(\pi * \nu)$ and $\ell(\mu_i) \sim \varepsilon \ell(\pi * \nu)$. In particular, $h(\mu_i)/\ell(\mu_i) \to h(\pi * \nu)/\ell(\pi * \nu)$. One can take $\mu_\Sigma = \pi * \nu$.

When $\Gamma_\mu$ is infinite, it is virtually cyclic. Assuming that $\mu$ is centered for simplicity, the walk given by $\mu^{*1/\varepsilon}$ arrives essentially at distance $1/\sqrt{\varepsilon}$ of the origin, by the central limit theorem. Then, one jumps according to $\nu$, in a direction transverse to $\Gamma_\mu$, preventing further cancellations. Hence, the walk given by $(\mu^{*1/\varepsilon} * \nu)^{\varepsilon N}$ is at distance roughly $\varepsilon N / \sqrt{\varepsilon}$ from the origin, yielding $\ell(\mu_i) \sim \sqrt{\varepsilon}$. On the other hand, each step $\mu^{*1/\varepsilon}$ only visits $1/\varepsilon$ points, hence the measure $(\mu^{*1/\varepsilon} * \nu)^{\varepsilon N}$ is supported by roughly $(1/\varepsilon)^{\varepsilon N}$ points, yielding $h(\mu_i) \sim \varepsilon |\log \varepsilon|$. In particular, $h(\mu_i) = o(\ell(\mu_i))$. This implies that $h(\mu_i)/\ell(\mu_i)$, which tends to $0$, cannot tend to the maximum $M$. Therefore, this case cannot happen.

The rigorous argument is considerably more delicate. One difficulty is that $\mu_i$ does not decompose in general as $(1 - \varepsilon)\mu + \varepsilon\nu$: there can be in $\mu_i$ points with a very small probability (which are not seen by $\mu$), but much larger than $\varepsilon$, the probability to visit a nonelementary subset of $\Gamma$. These points will play an important role on the relevant time scale, i.e., $1/\varepsilon$. Hence, we have to describe the different time scales that happen in $\mu_i$.

For each $a \in \Sigma$, we have a weight $\mu_i(a)$, which tends to $0$ if $a$ is not in the support of $\mu$. Reordering the $a_k$ and extracting a subsequence, we can assume that $\Sigma = \{a_1, \dots, a_p\}$ with $\mu_i(a_1) \ge \cdots \ge \mu_i(a_p)$ (and $a_1 = e$). Extracting a further subsequence, we may also assume that $\mu_i(a_k)/\mu_i(a_{k-1})$ converges for all $k$, towards a limit in $[0, 1]$.

Let $\Gamma_k$ be the subgroup generated by $a_1, \dots, a_k$. We consider the smallest $r$ such that $\Gamma_r$ is non-elementary. Then, we consider the biggest $s < r$ such that $\mu_i(a_r) = o(\mu_i(a_s))$. Roughly speaking, the random walk has enough time to spread on the elementary subgroup $\Gamma_s$, before seeing $a_r$.
It turns out that the asymptotic behavior will depend on the nature of $\Gamma_s$ (finite or virtually cyclic infinite).

We will decompose the measure $\mu_i$ as the sum of two components $(1-\varepsilon_i)\alpha_i + \varepsilon_i\beta_i$, where $\varepsilon_i$ tends to $0$, the measure $\alpha_i$ mainly lives on $\Gamma_s$, and the measure $\beta_i$ corresponds to the remaining part of $\mu_i$, on $\{a_{s+1}, \dots, a_p\}$. The precise construction depends on the nature of $\Gamma_s$:
• If $\Gamma_s$ is finite. Let $\beta_i^{(0)}$ be the normalized restriction of $\mu_i$ to $\{a_{s+1}, \dots, a_p\}$. To avoid periodicity problems, we rather consider $\beta_i = (\delta_e + \beta_i^{(0)})/2$. We decompose $\mu_i = (1-\varepsilon_i)\alpha_i + \varepsilon_i\beta_i$, where $\alpha_i$ is supported on $a_1, \dots, a_s$. By construction, the probability of any element in the support of $\alpha_i$ is much bigger than $\varepsilon_i$.
• If $\Gamma_s$ is virtually cyclic infinite. The group $\Gamma_s$ contains a hyperbolic element $g_0$, with repelling and attracting points at infinity denoted by $g_0^-$ and $g_0^+$. The elements of $\Gamma_s$ all fix the set $\{g_0^-, g_0^+\}$. We take for $\alpha_i$ the normalized restriction of $\mu_i$ to those elements in $\Sigma$ that fix $\{g_0^-, g_0^+\}$, and for $\beta_i$ the normalized restriction of $\mu_i$ to the other elements. Once again, we can write $\mu_i = (1-\varepsilon_i)\alpha_i + \varepsilon_i\beta_i$.

In both cases, $\varepsilon_i$ is comparable to the probability $\mu_i(a_r)$, and is therefore negligible with respect to $\mu_i(a_s)$. We will write $\mu_i = \mu_\varepsilon$ (and, in the same way, we will replace all indices $i$ with $\varepsilon$, since the main parameter is $\varepsilon = \varepsilon_i$). The measure $\mu_\varepsilon$ converges to $\mu$ when $\varepsilon$ tends to $0$, while $\beta_\varepsilon$ tends to a probability measure $\beta$, supported on $e, a_{s+1}, \dots, a_p$.
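The decomposition in the finite case can be made concrete. The following is a minimal sketch, assuming the measure is stored as a dictionary from group elements (plain strings here) to probabilities; the name `split_measure` and the toy weights are illustrative and do not come from the paper. Matching the two sides of $\mu = (1-\varepsilon)\alpha + \varepsilon\beta$ on the tail $\{a_{s+1},\dots,a_p\}$, with $\beta = (\delta_e + \beta^{(0)})/2$, forces $\varepsilon = 2\mu(\{a_{s+1},\dots,a_p\})$, which is what the code uses.

```python
def split_measure(mu, head, identity="e"):
    """Split mu as (1 - eps) * alpha + eps * beta (finite case).

    mu   : dict mapping elements to probabilities (summing to 1)
    head : set of elements a_1, ..., a_s (must contain the identity)
    """
    tail_mass = sum(p for a, p in mu.items() if a not in head)
    if not (0 < tail_mass <= mu[identity] and tail_mass < 0.5):
        raise ValueError("need 0 < mu(tail) <= mu(identity) and mu(tail) < 1/2")
    eps = 2 * tail_mass
    # beta = (delta_e + normalized restriction of mu to the tail) / 2
    beta = {identity: 0.5}
    for a, p in mu.items():
        if a not in head:
            beta[a] = p / (2 * tail_mass)
    # alpha = (mu - eps * beta) / (1 - eps), supported on the head
    alpha = {}
    for a in head:
        removed = eps * beta.get(a, 0.0)
        alpha[a] = (mu.get(a, 0.0) - removed) / (1 - eps)
    return eps, alpha, beta


# Toy example (hypothetical weights): head = {e, a}, tail = {b, B}.
mu = {"e": 0.6, "a": 0.38, "b": 0.01, "B": 0.01}
eps, alpha, beta = split_measure(mu, head={"e", "a"})
```

The requirement $\mu(e) \geq \mu(\text{tail})$ is automatic once $\mu(e) \geq 1/2$, which is the normalization made at the beginning of the section.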
If the measures $\mu_\varepsilon$ are symmetric to begin with, the measures $\alpha_\varepsilon$ and $\beta_\varepsilon$ are also symmetric by construction.

To generate the random walk given by $\mu_\varepsilon$, one can first independently choose random measures $\rho_n$: one takes $\rho_n = \alpha_\varepsilon$ with probability $1-\varepsilon$, and $\rho_n = \beta_\varepsilon$ with probability $\varepsilon$. Then, one chooses elements $g_n$ randomly according to $\rho_n$, and one multiplies them: the product $g_1 \cdots g_n$ is distributed like the random walk given by $\mu_\varepsilon$ at time $n$.

We will group together successive $g_k$, into blocks where the equidistribution on $\Gamma_s$ can be seen. More precisely, denote by $t_1, t_2, \dots$ the successive times where $\rho_n = \beta_\varepsilon$ (and $t_0 = 0$). They are stopping times, and the successive differences are independent and identically distributed, with a geometric distribution of parameter $\varepsilon$ (i.e., $P(t_1 = n) = (1-\varepsilon)^{n-1}\varepsilon$), with mean $1/\varepsilon$. Write $L_N = g_{t_{N-1}+1} \cdots g_{t_N}$. By construction, the $L_i$ are independent, identically distributed, and the random walk they define, i.e., $L_1 \cdots L_N$, is a subsequence of the original random walk $g_1 \cdots g_n$. Let $\lambda_\varepsilon$ be the distribution of $L_i$ on $\Gamma$, i.e.,
\[ \lambda_\varepsilon = \sum_{n=0}^{\infty} (1-\varepsilon)^n \varepsilon\, \alpha_\varepsilon^{*n} * \beta_\varepsilon. \]

Lemma 5.1.
The measure $\lambda_\varepsilon$ has finite first moment and finite time one entropy. Moreover, $\ell(\mu_\varepsilon) = \varepsilon\ell(\lambda_\varepsilon)$ and $h(\mu_\varepsilon) = \varepsilon h(\lambda_\varepsilon)$.

Proof. As the mean of $t_1$ is $1/\varepsilon$, the random walk generated by $\lambda_\varepsilon$ is essentially the random walk generated by $\mu_\varepsilon$, but on a time scale $1/\varepsilon$. This justifies heuristically the statement.

For the rigorous proof, let us first check that $\lambda_\varepsilon$ has finite first moment (and hence finite time one entropy). Since all the measures have finite support, we have $|L_1| \leq C t_1$. Since a geometric distribution has moments of all orders, the same is true for $|L_1|$.

The strong law of large numbers ensures that, almost surely, $t_N \sim N/\varepsilon$. Therefore, almost surely,
\[ \ell(\lambda_\varepsilon) = \lim \frac{|L_1 \cdots L_N|}{N} = \lim \frac{|g_1 \cdots g_{t_N}|}{N} = \lim \frac{|g_1 \cdots g_{t_N}|}{t_N} \cdot \frac{t_N}{N} = \ell(\mu_\varepsilon) \cdot 1/\varepsilon. \]
This proves the statement of the lemma for the drift.

For the entropy, we use the characterization of Lemma 2.4. We will show that $h(\mu_\varepsilon) \leq \varepsilon h(\lambda_\varepsilon)$ and $h(\mu_\varepsilon) \geq \varepsilon h(\lambda_\varepsilon)$. Let $K_n$ be a set of cardinality at most $e^{(h(\mu_\varepsilon)+\eta)n}$ which contains $g_1 \cdots g_n$ with probability at least / . Let $N = \varepsilon n$. With large probability, $t_N$ is close to $n$, up to $\eta' n$ (where $\eta'$ is arbitrarily small). Hence, with probability at least / , the point $L_1 \cdots L_N$ belongs to the $C\eta' n$-neighborhood of $K_n$, whose cardinality is at most
\[ |K_n| \cdot e^{C'\eta' n} \leq e^{(h(\mu_\varepsilon)+\eta+C'\eta')n} = e^{(h(\mu_\varepsilon)+\eta+C'\eta')N/\varepsilon}. \]
As $\eta$ and $\eta'$ are arbitrary, this shows that $h(\lambda_\varepsilon) \leq h(\mu_\varepsilon)/\varepsilon$. The converse inequality is proved in the same way. □

The previous lemma shows that we should understand $\lambda_\varepsilon$. We define an auxiliary probability measure $\tilde\alpha_\varepsilon$ so that $\lambda_\varepsilon = \tilde\alpha_\varepsilon * \beta_\varepsilon$, by
\[ \tilde\alpha_\varepsilon = \sum_{n=0}^{\infty} (1-\varepsilon)^n \varepsilon\, \alpha_\varepsilon^{*n}. \tag{5.1} \]
In this formula, most weight is concentrated around those $n$ of the order of $1/\varepsilon$. Hence, we have to understand the iterates of $\alpha_\varepsilon$ in time $1/\varepsilon$.
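To get a feeling for formula (5.1), one can evaluate the geometric average exactly on a small finite group. The sketch below is only an illustration under stated assumptions: the group is taken to be the cyclic group $\mathbb{Z}/5$ as a stand-in for a finite $\Gamma_s$, the measure `alpha` is a made-up example, and the helper names are hypothetical. It uses the Neumann series $\sum_n ((1-\varepsilon)M)^n = (I-(1-\varepsilon)M)^{-1}$, where $M$ is the matrix of convolution by $\alpha$.

```python
import numpy as np


def convolution_matrix(p):
    """Matrix of nu -> p * nu on Z/m, with p given as a length-m array."""
    m = len(p)
    return np.array([[p[(g - h) % m] for h in range(m)] for g in range(m)])


def geometric_average(alpha, eps):
    """Return sum_{n >= 0} (1 - eps)^n * eps * alpha^{*n} on Z/m."""
    m = len(alpha)
    delta_e = np.zeros(m)
    delta_e[0] = 1.0
    M = convolution_matrix(alpha)
    # Neumann series: sum_n ((1 - eps) M)^n = (I - (1 - eps) M)^{-1}
    return eps * np.linalg.solve(np.eye(m) - (1 - eps) * M, delta_e)


alpha = np.array([0.5, 0.3, 0.0, 0.0, 0.2])  # toy measure on Z/5
uniform = np.full(5, 0.2)
for eps in (0.5, 0.1, 0.001):
    tilde_alpha = geometric_average(alpha, eps)
    # Euclidean distance to the uniform measure shrinks as eps decreases
    print(eps, np.linalg.norm(tilde_alpha - uniform))
```

In this toy case the weights of `alpha` are fixed while `eps` decreases, so the averaged measure approaches the uniform measure; the text deals with the harder situation where the weights of $\alpha_\varepsilon$ themselves may tend to $0$, only more slowly than $\varepsilon$.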
When $\Gamma_s$ is finite, we will see that the walk driven by $\alpha_\varepsilon$ has enough time to equidistribute on $\Gamma_s$ (even though $\alpha_\varepsilon$ may give a very small weight to some elements, this weight is by construction much larger than $\varepsilon$, so that $1/\varepsilon$ iterates are enough to equidistribute). When $\Gamma_s$ is virtually cyclic, we will see that the random walk has enough time to drift away significantly from the identity.

In both cases, we will need quantitative results on basic groups, but in weakly elliptic cases (i.e., the transition probabilities are not bounded from below). There are techniques to get quantitative estimates in such settings, especially comparison techniques (due for instance to Varopoulos, Diaconis, Saloff-Coste): one can compare weakly elliptic walks to elliptic ones (which we understand well) thanks to Dirichlet form arguments, which make it possible to transfer results from the latter to the former (modulo some loss in the constants, due to the lack of ellipticity). We will rely on such results when $\Gamma_s$ is infinite. When it is finite, such techniques can also be used, but we will rather give a more elementary argument.

We start with the case where $\Gamma_s$ is finite. We need to quantify the speed of convergence to the stationary measure in finite groups, with the following lemma.

Lemma 5.2.
Let $\Lambda$ be a finite group. Let $\Sigma_\Lambda \subset \Lambda$ be a generating subset (it does not have to be symmetric). Let $\pi_\Lambda$ be the uniform measure on $\Lambda$, and let $d(\mu, \pi_\Lambda)$ be the euclidean distance between a measure $\mu$ and $\pi_\Lambda$ (i.e., $\bigl(\sum (\mu(g) - \pi_\Lambda(g))^2\bigr)^{1/2}$). For any $\delta > 0$, there exists $K > 0$ with the following property. Let $\eta > 0$. Consider a probability measure $\mu$ on $\Lambda$ with $\mu(\sigma) \geq \eta$ for any $\sigma \in \Sigma_\Lambda \cup \{e\}$. Then, for all $n \geq K/\eta$,
\[ d(\mu^{*n}, \pi_\Lambda) \leq \delta. \]

In other words, the time to see the equidistribution towards the stationary measure is bounded by $1/\eta$, where $\eta$ is the minimum of the transition probabilities on $\Sigma_\Lambda$.

Proof.
Endow the space $M(\Lambda)$ of signed measures on $\Lambda$ with the scalar product corresponding to the quadratic form $|\nu|^2 = \sum \nu(g)^2$. Denote by $H = \{\nu : \sum \nu(g) = 0\}$ the hyperplane $\pi_\Lambda^\perp$ of zero mass measures. For any probability $\rho$, denote by $M_\rho$ the left-convolution operator on $M(\Lambda)$, that is, $M_\rho(\nu) = \rho * \nu$. Since convolution preserves mass, $H$ is $M_\rho$-invariant. Let us prove that the operator norm of $M_\rho$ is bounded by $1$. Indeed, put $u_\rho(g) = \sum_{h \in \Lambda} \rho(h)\rho(hg)$; this is a probability on $\Lambda$. We have
\begin{align*}
|M_\rho \nu|^2 &= \sum_{g \in \Lambda} (M_\rho\nu(g))^2 = \sum_{(g,h_1,h_2) \in \Lambda^3} \rho(g h_1^{-1})\rho(g h_2^{-1})\nu(h_1)\nu(h_2) \\
&= \sum_{(h_1,h_2) \in \Lambda^2} \nu(h_1)\nu(h_2)\, u_\rho(h_1 h_2^{-1}) = \sum_{(g,h) \in \Lambda^2} \nu(h)\nu(g^{-1}h)\, u_\rho(g) \\
&\leq \sum_{g \in \Lambda} |\nu|^2 u_\rho(g) = |\nu|^2.
\end{align*}
This proves that $\|M_\rho\| \leq 1$. Now fix $\rho_o$ to be the uniform probability on the set $\Sigma_\Lambda \cup \{e\}$. Notice that $u_{\rho_o}(g) > 0$ for any $g \in \Sigma_\Lambda \cup \{e\}$, since $\rho_o(e) > 0$. We claim that $M_{\rho_o}$ restricted to $H$ has an operator norm $c < 1$. If this were not the case, there would exist $\nu \in H - \{0\}$ such that the previous inequalities are equalities. Thanks to the equality case in the Cauchy-Schwarz inequality, this implies that, for any $g \in \Sigma_\Lambda$, the two measures $h \mapsto \nu(h)$ and $h \mapsto \nu(g^{-1}h)$ are positively proportional. Since their norms are equal, they must be equal. Since $\Sigma_\Lambda$ generates $\Lambda$, $\nu$ is $\Lambda$-invariant and belongs to $H$, so it must be zero.

By assumption, the probability $\mu$ can be decomposed as
\[ \mu = \eta\rho_o + (1-\eta)\nu, \]
where $\nu$ is some probability. This implies that $M_\mu$ restricted to $H$ has operator norm at most $\eta c + (1-\eta)$. Therefore,
\[ d(\mu^{*n}, \pi_\Lambda) = |\mu^{*n} - \pi_\Lambda| = |M_\mu^n(\delta_e - \pi_\Lambda)| \leq (1 - (1-c)\eta)^n. \]
This inequality implies the result. □
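The contraction estimate at the end of this proof can be checked numerically on a small example. The sketch below makes several assumptions that are not in the text: the group is the cyclic group $\mathbb{Z}/7$, the generating set is $\{1\}$, the remaining part $\nu$ of the measure is random, and all function names are hypothetical. It computes the norm $c$ of convolution by $\rho_o$ on zero-mass measures and compares the euclidean distance of $\mu^{*n}$ to the uniform measure with the bound $(1-(1-c)\eta)^n$.

```python
import numpy as np


def convolution_matrix(p):
    """Matrix of nu -> p * nu on Z/m, with p given as a length-m array."""
    m = len(p)
    return np.array([[p[(g - h) % m] for h in range(m)] for g in range(m)])


def norm_on_zero_mass(p):
    """Operator norm of nu -> p * nu restricted to zero-mass measures."""
    m = len(p)
    projection = np.eye(m) - np.ones((m, m)) / m  # orthogonal projection onto H
    return np.linalg.norm(convolution_matrix(p) @ projection, 2)


m, eta = 7, 0.2
rho_o = np.zeros(m)
rho_o[[0, 1]] = 0.5                  # uniform on {e} and the generator 1
rng = np.random.default_rng(0)
nu = rng.random(m)
nu /= nu.sum()                       # arbitrary probability measure
mu = eta * rho_o + (1 - eta) * nu    # decomposition used in the proof

c = norm_on_zero_mass(rho_o)
uniform = np.full(m, 1 / m)
M_mu = convolution_matrix(mu)
power = np.zeros(m)
power[0] = 1.0                       # mu^{*0} = delta_e
distances, bounds = [], []
for n in range(1, 61):
    power = M_mu @ power             # now equal to mu^{*n}
    distances.append(np.linalg.norm(power - uniform))
    bounds.append((1 - (1 - c) * eta) ** n)
```

In such experiments the actual distance is typically far below the bound, which is not surprising since the bound only uses the part $\eta\rho_o$ of the measure.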
We can now describe the asymptotic behavior of $\mu_\varepsilon$ when the group $\Gamma_s$ is finite.

Lemma 5.3. Assume that $\Gamma_s$ is finite. Define a new probability measure $\lambda = \pi_{\Gamma_s} * \beta$ (it generates a non-elementary subgroup). When $\varepsilon$ tends to $0$, we have $h(\mu_\varepsilon) \sim \varepsilon h(\lambda)$ and $\ell(\mu_\varepsilon) \sim \varepsilon\ell(\lambda)$.

Proof. The random variable $t_1$, being geometric of parameter $\varepsilon$, is of the order of $1/\varepsilon$ with high probability (i.e., for any $\delta > 0$, there exists $u > 0$ such that $P(t_1 \geq u/\varepsilon) \geq 1-\delta$). Writing $\Sigma_s = \{a_1, \dots, a_s\}$ for the support of $\alpha_\varepsilon$, we have $\min_{\sigma \in \Sigma_s} \alpha_\varepsilon(\sigma) = (1-\varepsilon)^{-1}\mu_\varepsilon(a_s)$, which is much bigger than $\varepsilon$ by definition of $s$. Lemma 5.2 shows that the measures $\alpha_\varepsilon^{*n}$ are close to $\pi_{\Gamma_s}$ for $n \geq u/\varepsilon$. This implies that $\tilde\alpha_\varepsilon$ (defined in (5.1)) converges to $\pi_{\Gamma_s}$ when $\varepsilon \to 0$. As $\beta_\varepsilon$ converges to $\beta$, this shows that $\lambda_\varepsilon$ converges to $\lambda$.

The support of the measure $\lambda$ contains $\Gamma_s$ and $a_{s+1}, \dots, a_r$ (as the support of $\beta$ contains $\{e, a_{s+1}, \dots, a_r\}$ by construction). Hence, $\Gamma_\lambda$ contains the non-elementary subgroup $\Gamma_r$. It follows that the entropy and the drift are continuous at $\lambda$, by Proposition 2.3 and Theorem 2.9. We get $h(\lambda_\varepsilon) \to h(\lambda)$ and $\ell(\lambda_\varepsilon) \to \ell(\lambda)$. With Lemma 5.1, this completes the proof. □

We deduce from the lemma that $h(\mu_\varepsilon)/\ell(\mu_\varepsilon)$ tends to $h(\lambda)/\ell(\lambda)$. Hence, the measure $\mu_\Sigma = \lambda$ satisfies the conclusion of the theorem, at least in the non-symmetric case. In the symmetric case, where we are looking for a symmetric measure $\mu_\Sigma$, the measure $\lambda = \pi_{\Gamma_s} * \beta$ is not an answer to the problem. However, $\lambda' = \pi_{\Gamma_s} * \beta * \pi_{\Gamma_s}$ is symmetric, and it clearly has the same entropy and drift as $\lambda$ (since $\pi_{\Gamma_s} * \pi_{\Gamma_s} = \pi_{\Gamma_s}$). Hence, we can take $\mu_\Sigma = \lambda'$. This completes the proof of Theorem 1.7 when the group $\Gamma_s$ is finite.

Example 5.4.
Let
Γ = Z / ∗ Z / , with Σ = { a, b, b − } (where a is the generator of Z / and b the generator of Z / ), with the word distance coming from Σ . [MM07, Section 5.1]shows that the supremum over measures supported on Σ of h ( µ ) /ℓ ( µ ) is the growth v of thegroup (note that Γ is virtually free), and that it is not realized by a measure supported on Σ . This shows that, in Theorem 1.7, the fact that µ Σ may need a support larger than Σ isnot an artefact of the proof.In this example, any symmetric measure on Σ is of the form µ ε = (1 − ε ) δ a + εβ where β isuniform on { b, b − } . The above proof shows that, when ε tends to , h ( µ ε ) /ℓ ( µ ε ) convergesto h ( λ ) /ℓ ( λ ) where λ = π Γ s ∗ β = ( δ e + δ a ) ∗ ( δ b + δ b − ) is the uniform measure on { b, b − , ab, ab − } .It remains to treat the case where Γ s is virtually cyclic infinite. Such a group surjectsonto Z or Z ⋊ Z / (the infinite dihedral group), with finite kernel. From the point of view ofthe random walk, most things happen in the quotient. Hence, it would suffice to understandthese two groups (separating in the case of Z the centered and non-centered cases). We willrather give direct arguments which do not use this reduction and which avoid separatingcases. Let t s be the smallest index such that { a , · · · , a t } generates an infinite group.Let η = η ( ε ) = µ ε ( a t ) , this parameter governs the equidistribution speed on Γ s (or, at least,on Γ t , which has finite index in Γ s since these two groups are virtually cyclic infinite). Wewill find the asymptotics of the entropy and the drift in terms of η/ε (which tends to infinityby definition of s ). We start with the entropy (for which an upper bound suffices). Notethat the random walk directed by α ε does not live on Γ s , but on a possibly bigger groupsince we have put in α ε all the points that fix the set { g − , g +0 } (this will be important inthe control of the drift below). 
Let $\tilde\Gamma_s$ be the group they generate. It is still virtually cyclic (see, for instance, [GdlH90, Théorème 37 page 157]), and it contains $\Gamma_s$ as a finite index subgroup.

Lemma 5.5.
There exists a constant $C$ such that $h(\lambda_\varepsilon) \leq C\log(\eta/\varepsilon)$.

Proof. Let $K$ be the group generated by $\{a_1, \dots, a_{t-1}\}$. It is finite by definition of $t$. Let $\Sigma'$ be the set of points among $a_t, \dots, a_p$ which stabilize $\{g_0^-, g_0^+\}$. The group $\tilde\Gamma_s$ is generated by $K$ and $\Sigma'$. Let us consider the associated word pseudo-distance $d'$, where we decide that elements in $K$ have length $0$. This pseudo-distance is quasi-isometric to the usual distance, and it satisfies $d'(e, xk) = d'(e, x)$ for all $x \in \tilde\Gamma_s$ and all $k \in K$.

Let us first estimate the average distance to the origin for an element given by $\tilde\alpha_\varepsilon$. We decompose $\alpha_\varepsilon$ as the average of a measure supported on $\{a_1, \dots, a_{t-1}\} \subset K$, and of a measure supported on $\Sigma'$ (the contribution of the latter has a mass $m(\varepsilon)$ bounded by $(p-t+1)\eta \leq C\eta$). The measure $\alpha_\varepsilon^{*n}$ can be obtained by picking at each step one of these two measures (according to their respective weights), and then jumping according to a random element for this measure. When we use the first measure, the $d'$-distance to the origin does not change by definition. Hence, the distance to the origin is bounded by a constant times the number of choices of the second measure. We obtain
\begin{align*}
E_{\tilde\alpha_\varepsilon}(d'(e,g)) &\leq \sum_{n=0}^{\infty} (1-\varepsilon)^n \varepsilon \sum_{i=0}^{n} \binom{n}{i} m(\varepsilon)^i (1-m(\varepsilon))^{n-i} \cdot Ci \\
&= C m(\varepsilon) \sum_{n=0}^{\infty} (1-\varepsilon)^n \varepsilon \sum_{i=1}^{n} n \binom{n-1}{i-1} m(\varepsilon)^{i-1} (1-m(\varepsilon))^{n-i} \\
&= C m(\varepsilon) \sum_{n=0}^{\infty} (1-\varepsilon)^n \varepsilon n = C m(\varepsilon)(1-\varepsilon)/\varepsilon \leq C\eta/\varepsilon.
\end{align*}
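The chain of equalities above combines the mean $nm$ of a binomial distribution with the identity $\sum_{n \geq 0} n(1-\varepsilon)^n\varepsilon = (1-\varepsilon)/\varepsilon$. As a sanity check, here is a small numerical sketch; the values of `eps` and `m`, the truncation level `n_max` and the function name are arbitrary illustrative choices, not quantities from the paper.

```python
from math import comb


def expected_bound(eps, m, C=1.0, n_max=400):
    """Truncated double sum
    sum_n (1-eps)^n eps sum_i binom(n, i) m^i (1-m)^(n-i) C i."""
    total = 0.0
    for n in range(n_max + 1):
        # inner sum: C times the mean of a Binomial(n, m) variable
        inner = sum(comb(n, i) * m**i * (1 - m) ** (n - i) * C * i
                    for i in range(n + 1))
        total += (1 - eps) ** n * eps * inner
    return total


eps, m = 0.1, 0.3
lhs = expected_bound(eps, m)   # truncated double sum
rhs = m * (1 - eps) / eps      # closed form C m (1 - eps) / eps with C = 1
```

With these values the truncation error is negligible because $(1-\varepsilon)^{400}$ is of order $10^{-19}$, so `lhs` and `rhs` agree up to rounding.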
A measure supported on the integers with first moment $A$ has entropy bounded by $C\log A + C$ (see, for instance, [EK10, Lemma 2]). The proof also applies to virtually cyclic situations (the finite thickening does not change anything). Therefore, we get $H(\tilde\alpha_\varepsilon) \leq C\log(\eta/\varepsilon) + C$.

Finally,
\[ H(\lambda_\varepsilon) = H(\tilde\alpha_\varepsilon * \beta_\varepsilon) \leq H(\tilde\alpha_\varepsilon) + H(\beta_\varepsilon) \leq C\log(\eta/\varepsilon) + C, \]
since the support of $\beta_\varepsilon$ is uniformly bounded. As $\eta/\varepsilon \to \infty$, this gives $H(\lambda_\varepsilon) \leq C\log(\eta/\varepsilon)$. Finally, we estimate $h(\lambda_\varepsilon) = \inf_{n>0} H(\lambda_\varepsilon^{*n})/n \leq H(\lambda_\varepsilon)$ to get the conclusion of the lemma. □

For the drift, we need to be more precise since we need a lower bound to conclude. We will use a lemma giving lower bounds on the equidistribution speed in virtually cyclic infinite groups, using comparison techniques.
Lemma 5.6.
Let $\Lambda$ be a virtually cyclic infinite group. Let $\Sigma_\Lambda \subset \Lambda$ be a finite subset generating an infinite subgroup of $\Lambda$. There exists a constant $C$ with the following property. Let $\eta > 0$. Let $\mu$ be a probability measure on $\Lambda$ with $\mu(e) \geq 1/2$ and $\mu(\sigma) \geq \eta$ for any $\sigma \in \Sigma_\Lambda$. Then, for all $n \geq 1$,
\[ \sup_{g \in \Lambda} \mu^{*n}(g) \leq C(\eta n)^{-1/2}. \]

The interest of the lemma is that $C$ does not depend on the measure $\mu$, and that we obtain an explicit control on $\mu^{*n}$ just in terms of a lower bound on the transition probabilities of $\mu$.

Proof.
We use the comparison method. Let $\rho$ be the uniform measure on $e$, $\Sigma_\Lambda$ and $\Sigma_\Lambda^{-1}$. The random walk it generates does not have to be transitive (since $\Sigma_\Lambda$ does not necessarily generate the whole group $\Lambda$), but $\Lambda$ is partitioned into finitely many classes where it is transitive (and isomorphic to the random walk on the group generated by $\Sigma_\Lambda$). Moreover, it is symmetric, and therefore reversible for the counting measure $m$ on $\Lambda$. The Dirichlet form associated to $\rho$ is by definition
\[ \mathcal{E}_\rho(f,f) = \frac{1}{2} \sum_{x,y} |f(x) - f(y)|^2 \rho(x^{-1}y), \]
for any $f : \Lambda \to \mathbb{C}$. As $\Lambda$ has linear growth, the following Nash inequality holds (see, for instance, [Woe00, Proposition 14.1]):
\[ \|f\|_{L^2}^6 \leq C \|f\|_{L^1}^4\, \mathcal{E}_\rho(f,f), \]
where all norms are defined with respect to the measure $m$ on $\Lambda$. Let $P_\mu$ be the Markov operator associated to $\mu$. It satisfies
\[ \|f\|_{L^2}^2 - \|P_\mu f\|_{L^2}^2 = \langle f, f\rangle - \langle P_\mu f, P_\mu f\rangle = \langle (I - P_\mu^* P_\mu) f, f\rangle. \]
The operator $P_\mu^* P_\mu$ is the Markov operator associated to the symmetric probability measure $\nu = \check\mu * \mu$, which satisfies $\nu(\sigma) \geq \eta/2$ for $\sigma \in \Sigma_\Lambda \cup \Sigma_\Lambda^{-1}$ and $\nu(e) \geq 1/4$ (since $\mu(e) \geq 1/2$). Therefore, $\rho(g) \leq C\eta^{-1}\nu(g)$ for all $g$. We deduce
\begin{align*}
\|f\|_{L^2}^2 - \|P_\mu f\|_{L^2}^2 &= \sum f(x)(f(x) - f(y))\nu(x^{-1}y) = \frac{1}{2} \sum |f(x) - f(y)|^2 \nu(x^{-1}y) \\
&\geq \frac{\eta}{2C} \sum |f(x) - f(y)|^2 \rho(x^{-1}y) = \frac{\eta}{C}\, \mathcal{E}_\rho(f,f).
\end{align*}
Combining this inequality with the Nash inequality, we obtain
\[ \|f\|_{L^2}^6 \leq C\eta^{-1} \|f\|_{L^1}^4 \bigl(\|f\|_{L^2}^2 - \|P_\mu f\|_{L^2}^2\bigr). \]
The operator $P_\mu^*$ satisfies the same inequality, for the same reason. Composing these inequalities, we obtain an estimate for the norm of $P_\mu^n$ from $L^1$ to $L^\infty$ (this is [VSCC92, Lemma VII.2.6]), of the form
\[ \|P_\mu^n\|_{L^1 \to L^\infty} \leq (C'\eta^{-1}/n)^{1/2}. \]
Applying this inequality to the function $\delta_e$, we get the desired result. □

The previous lemma implies that, if $C'$ is large enough, a neighborhood of size $(\eta n)^{1/2}/C'$ of the identity has probability for $\mu^{*n}$ at most $1/2$.
Hence, the average distance to the origin is at least of the order of $(\eta n)^{1/2}$.

Now, we study the stationary measure for $\beta_\varepsilon * \tilde\alpha_\varepsilon$ on $\partial\Gamma$. We recall that $g_0$ is a hyperbolic element in $\Gamma_s$, fixed once and for all.

Lemma 5.7.
There exists a neighborhood $U$ of $\{g_0^-, g_0^+\}$ in $\partial\Gamma$ such that the stationary measure $\nu_\varepsilon$ of $\beta_\varepsilon * \tilde\alpha_\varepsilon$ satisfies $\nu_\varepsilon(U) \to 0$.

Proof. Let us first show that, for any neighborhood $U$ of $\{g_0^-, g_0^+\}$, the quantity $(\tilde\alpha_\varepsilon * \delta_z)(U^c)$ tends to $0$, uniformly in $z \in \partial\Gamma$. This is not surprising since a typical element for $\tilde\alpha_\varepsilon$ is large in the virtually cyclic group $\tilde\Gamma_s$, and sends most points into $U$. To make this argument rigorous, we will use Lemma 5.6. The definition (5.1) shows that it suffices to prove that $(\alpha_\varepsilon^{*n} * \delta_z)(U^c)$ is small for $n \geq u/\varepsilon$.

The subgroup generated by $g_0$ has finite index in $\tilde\Gamma_s$. Hence, any element in $\tilde\Gamma_s$ can be written as $g_0^k\gamma_i$, for $\gamma_i$ in a finite set. Thus, the measure $\alpha_\varepsilon^{*n}$ can be written as $\sum c_n(k,i)\delta_{g_0^k\gamma_i}$, for some coefficients $c_n(k,i)$. Lemma 5.6 (applied to $\Lambda = \tilde\Gamma_s$ with $\Sigma_\Lambda = \{a_1, \dots, a_t\}$) ensures that $\sup_{k,i} c_n(k,i) \leq C/(\eta n)^{1/2}$. When $n \geq u/\varepsilon$, this quantity tends to $0$ since $\varepsilon = o(\eta)$. We have
\[ (\alpha_\varepsilon^{*n} * \delta_z)(U^c) = \sum_{k,i} c_n(k,i)\, 1(g_0^k\gamma_i z \notin U). \]
As the element $g_0$ is hyperbolic, there exists $C$ such that, for any $w \in \partial\Gamma$,
\[ |\{k \in \mathbb{Z} : g_0^k w \notin U\}| \leq C. \]
The uniformity in $w$ follows from the compactness of $(\partial\Gamma \setminus \{g_0^-, g_0^+\})/\langle g_0\rangle$. We obtain
\[ (\alpha_\varepsilon^{*n} * \delta_z)(U^c) \leq \Bigl(\sup_{k,i} c_n(k,i)\Bigr) \sum_i |\{k \in \mathbb{Z} : g_0^k\gamma_i z \notin U\}| \leq C \sup_{k,i} c_n(k,i) \leq C/(\eta n)^{1/2}. \]
This shows that $(\alpha_\varepsilon^{*n} * \delta_z)(U^c)$ is small, as desired.

As $(\tilde\alpha_\varepsilon * \delta_z)(U^c)$ tends to $0$ uniformly in $z$, we deduce that $(\tilde\alpha_\varepsilon * \nu_\varepsilon)(U^c)$ also tends to $0$, and therefore that $(\tilde\alpha_\varepsilon * \nu_\varepsilon)(U)$ tends to $1$.

Let $A = \{g_0^-, g_0^+\}$. We claim that, for all $g$ such that $gA \cap A \neq \emptyset$, we have $gA = A$. Indeed, if $g(g_0^-) \in A$ for instance, then $g^{-1}g_0 g$ is a hyperbolic element stabilizing $g_0^-$. It also stabilizes $g_0^+$, by [GdlH90, Théorème 30 page 154], i.e., $g_0 g(g_0^+) = g(g_0^+)$. Hence, $g(g_0^+)$ is a fixed point of $g_0$, i.e., $g(g_0^+) \in A$.

By definition of $\beta_\varepsilon$, the finitely many elements of its support do not fix $A$. They even satisfy $gA \cap A = \emptyset$ for all $g$ in this support, by the previous argument. If $U$ is small enough, we get $gU \cap U = \emptyset$, i.e., $g(U) \subset U^c$.

Finally, $\nu_\varepsilon(U^c) = (\beta_\varepsilon * \tilde\alpha_\varepsilon * \nu_\varepsilon)(U^c) \geq (\tilde\alpha_\varepsilon * \nu_\varepsilon)(U)$, which tends to $1$ when $\varepsilon$ tends to $0$. □

Lemma 5.8.
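The drift $\ell(\lambda_\varepsilon)$ satisfies $\ell(\lambda_\varepsilon) \geq c \cdot (\eta/\varepsilon)^{1/2}$.

Proof. Let $\rho_\varepsilon$ be a stationary measure for $\lambda_\varepsilon$, on the Busemann boundary $\partial_B\Gamma$. By Proposition 2.2,
\[ \ell(\lambda_\varepsilon) = \int c_B(g,\xi)\, d\rho_\varepsilon(\xi)\, d\lambda_\varepsilon(g), \]
where $c_B(g,\xi) = h_\xi(g^{-1})$ is the Busemann cocycle. As $\lambda_\varepsilon = \tilde\alpha_\varepsilon * \beta_\varepsilon$, this gives
\[ \ell(\lambda_\varepsilon) = \int c_B(Lb,\xi)\, d\rho_\varepsilon(\xi)\, d\tilde\alpha_\varepsilon(L)\, d\beta_\varepsilon(b). \]
With the cocycle relation (2.2), this becomes
\[ \ell(\lambda_\varepsilon) = \int c_B(L,b\xi)\, d\rho_\varepsilon(\xi)\, d\tilde\alpha_\varepsilon(L)\, d\beta_\varepsilon(b) + \int c_B(b,\xi)\, d\rho_\varepsilon(\xi)\, d\tilde\alpha_\varepsilon(L)\, d\beta_\varepsilon(b). \]
The second integral is bounded independently of $\varepsilon$ since the support of $\beta_\varepsilon$ is finite. In the first integral, $\xi' = b\xi$ is distributed according to the measure $\tilde\rho_\varepsilon := \beta_\varepsilon * \rho_\varepsilon$, which is stationary for $\beta_\varepsilon * \tilde\alpha_\varepsilon$. Lemma 5.7 implies that its projection $(\pi_B)_*\tilde\rho_\varepsilon$ on the geometric boundary, which is again stationary for $\beta_\varepsilon * \tilde\alpha_\varepsilon$, gives a small measure to a neighborhood $U$ of $\{g_0^-, g_0^+\}$.

As the limit set of $\tilde\Gamma_s$ is $\{g_0^-, g_0^+\}$, there exists a constant $C$ such that, for all $\xi \notin \pi_B^{-1}U$ and $g \in \tilde\Gamma_s$, we have $|h_\xi(g^{-1}) - d(e,g)| \leq C$. For $\xi \in \pi_B^{-1}U$, we only use the trivial bound $h_\xi(g^{-1}) \geq -d(e,g)$, since horofunctions are $1$-Lipschitz and vanish at the origin. We get
\begin{align*}
\ell(\lambda_\varepsilon) &\geq \int_{(L,\xi) \in \Gamma \times \pi_B^{-1}U^c} d(e,L)\, d\tilde\alpha_\varepsilon(L)\, d\tilde\rho_\varepsilon(\xi) - \int_{(L,\xi) \in \Gamma \times \pi_B^{-1}U} d(e,L)\, d\tilde\alpha_\varepsilon(L)\, d\tilde\rho_\varepsilon(\xi) - C \\
&= \Bigl(\int d(e,L)\, d\tilde\alpha_\varepsilon(L)\Bigr) \bigl(\tilde\rho_\varepsilon(\pi_B^{-1}U^c) - \tilde\rho_\varepsilon(\pi_B^{-1}U)\bigr) - C.
\end{align*}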
For small enough $\varepsilon$, we have $\tilde\rho_\varepsilon(\pi_B^{-1}U) \leq$ / (and therefore $\tilde\rho_\varepsilon(\pi_B^{-1}U^c) \geq$ / ). Moreover, Lemma 5.6 ensures that the average distance to the origin for the measure $\tilde\alpha_\varepsilon$ is at least $c \cdot (\eta/\varepsilon)^{1/2}$. Hence, the previous formula completes the proof. □

Combining Lemmas 5.5 and 5.8, we get
\[ h(\lambda_\varepsilon)/\ell(\lambda_\varepsilon) \leq C\log(\eta/\varepsilon)/(\eta/\varepsilon)^{1/2}. \]
This tends to $0$ since $\eta/\varepsilon$ tends to infinity. We deduce from Lemma 5.1 that $h(\mu_\varepsilon)/\ell(\mu_\varepsilon)$ tends to $0$. This is a contradiction since we were assuming that it converges to the maximum $M$, which is positive. This concludes the proof of Theorem 1.7. □

The study of the case where $\Gamma_s$ is virtually cyclic infinite gives in particular the following result.

Theorem 5.9.
Let $(\Gamma, d)$ be a metric hyperbolic group. Let $\Sigma$ be a finite subset of $\Gamma$ which generates a non-elementary group. Let $\mu_i$ be a sequence of measures on $\Sigma$, with $h(\mu_i) > 0$, converging to a probability measure $\mu$ such that $\Gamma_\mu$ is infinite virtually cyclic. Then $h(\mu_i)/\ell(\mu_i) \to 0$.

Note that the precise value of $\ell(\mu_i)$ depends on the choice of the distance, but if two distances are equivalent then the associated drifts vary within the same constants. Hence, the convergence $h(\mu_i)/\ell(\mu_i) \to 0$ does not depend on the distance.

We recover results of Le Prince [LP07]: in any metric hyperbolic group, there exist admissible probability measures with $h/\ell < v$. The construction of Le Prince is rather similar to the examples given by Theorem 5.9.

Example 5.10.
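We can use the above proof to also find an example where $h(\mu_\varepsilon)/\ell(\mu_\varepsilon) \to 0$ although $\mu_\varepsilon$ tends to a measure $\mu$ for which $\Gamma_\mu$ is finite and nontrivial. Consider $\Gamma = \mathbb{Z}/2 \times F_2 = \{0,1\} \times \langle a, b\rangle$, endowed with the probability measure $\mu_\varepsilon$ given by
\[ \mu_\varepsilon(0,e) = \mu_\varepsilon(1,e) = 1/2 - \varepsilon - \varepsilon^2, \quad \mu_\varepsilon(0,a) = \mu_\varepsilon(0,a^{-1}) = \varepsilon, \quad \mu_\varepsilon(0,b) = \mu_\varepsilon(0,b^{-1}) = \varepsilon^2. \]
The measure $\mu_\varepsilon$ converges to $\mu = (\delta_{(0,e)} + \delta_{(1,e)})/2$. With the above notations, $\Gamma_\mu = \mathbb{Z}/2 \times \{e\}$ but $\Gamma_s = \mathbb{Z}/2 \times \langle a\rangle$ is virtually cyclic infinite (so that $h(\mu_\varepsilon)/\ell(\mu_\varepsilon) \to 0$) and $\Gamma_r = \Gamma$.

6. Examples for non-symmetric measures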
In this section, we describe the additional difficulties that arise if one tries to prove Theorem 1.3 for non-symmetric measures. The main problem is that the random walk lives on the subsemigroup $\Gamma_\mu^+$, which is not a subgroup any more. While many cases can be handled with the tools we have described in this article, one case cannot be treated in this way: when the subsemigroup $\Gamma_\mu^+$ has no nice geometric properties (it is not quasi-convex, it is not a subgroup), but $\Gamma_\mu = \Gamma$.

Let us first show that the growth properties of such a subsemigroup can be more complicated than what happens for subgroups. If $\Lambda$ is a subgroup of $\Gamma$, either $|B_n \cap \Lambda| \asymp e^{nv}$, or $|B_n \cap \Lambda| = o(e^{nv})$ (the first case happens if and only if $\Lambda$ has finite index in $\Gamma$, see the discussion at the beginning of Paragraph 4.3). Unfortunately, the behavior of semigroups can be more complicated.
Proposition 6.1. In $F_2$, there exists a subsemigroup $\Lambda^+$ such that $\liminf |B_n \cap \Lambda^+|/|B_n| = 0$ and $\limsup |B_n \cap \Lambda^+|/|B_n| > 0$.

Proof. Let $S^n_{a,a}$ denote the geodesic words in $F_2 = \langle a, b\rangle$ of length $n$ which start and end with $a$. Let $n_j$ be a sequence tending very quickly to infinity. Let $\Lambda^+$ be the subsemigroup generated by $\bigcup_j S^{n_j}_{a,a}$. Then $|B_{n_j} \cap \Lambda^+| \geq c|B_{n_j}|$. We claim that $|B_{n_j-1} \cap \Lambda^+|/|B_{n_j-1}| \to 0$. Indeed, the subsemigroup $\Lambda^+_{j-1}$ generated by $\bigcup_{k<j} S^{n_k}_{a,a}$ [...]

Proposition 6.2. Let $\Gamma_1$ and $\Gamma_2$ be two nontrivial groups, generated respectively by finite symmetric sets $S_1$ and $S_2$. Let $\Gamma = \Gamma_1 * \Gamma_2$ with the generating set $S = S_1 \cup S_2$ and the corresponding word distance. There exists on $\Gamma$ a (nonsymmetric, nonadmissible) probability measure $\mu$, with an exponential moment and nonzero entropy, satisfying $h(\mu) = \ell(\mu)v$.

Proof. For $i = 1, 2$, let $\Gamma_i^* = \Gamma_i \setminus \{e\}$. We claim that
\[ \sum_{g_1 \in \Gamma_1^*,\, g_2 \in \Gamma_2^*} e^{-v|g_1 g_2|} = 1, \tag{6.1} \]
where $v$ is the growth rate of $\Gamma$.

Let $F_i(z)$ be the growth series of $\Gamma_i$, i.e., $F_i(z) = \sum_{g \in \Gamma_i} z^{|g|}$. The spheres $S_i^n \subset \Gamma_i$ satisfy $S_i^{n+m} \subset S_i^n \cdot S_i^m$. Hence, the sequence $\log|S_i^n|$ is subadditive. This implies that $\log|S_i^n|/n$ converges to its infimum $v_i$, and moreover that $|S_i^n| \geq e^{n v_i}$. We deduce that the radius of convergence of $F_i$ is $e^{-v_i}$, and moreover $F_i(e^{-v_i}) = +\infty$.

Let $F(z)$ be the growth series of $\Gamma$. As in the proof of Proposition 4.8, it is given by
\[ F(z) = \frac{F_1(z)F_2(z)}{1 - (F_1(z)-1)(F_2(z)-1)}. \]
Assume for instance $v_1 \geq v_2$. As $F_1(e^{-v_1}) = +\infty$, the function $(F_1(z)-1)(F_2(z)-1)$ takes the value $1$ when $z$ increases to $e^{-v_1}$, at a point which is precisely the radius of convergence $e^{-v}$ of $F$. This shows that $(F_1(e^{-v})-1)(F_2(e^{-v})-1) = 1$. This is precisely the equality (6.1).

We define a probability measure $\mu$ on $\Gamma$ as follows: for $(g_1, g_2) \in \Gamma_1^* \times \Gamma_2^*$, let $\mu(g_1 g_2) = e^{-v|g_1 g_2|}$.
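The normalization (6.1) can be tested by hand on a small free product. The following sketch is an illustration under assumptions that go beyond the text: it takes the free product of two finite groups in which every nontrivial element is a generator (so every element of $\Gamma_i^*$ has length $1$), for instance $\mathbb{Z}/2 * \mathbb{Z}/3$ with $S = \{a\} \cup \{b, b^{-1}\}$, and the function name is hypothetical. Reduced words alternate between the two factors, which gives a simple recursion for the sphere sizes.

```python
from math import exp, log


def sphere_sizes(k1, k2, n_max):
    """Sphere sizes of G1 * G2 when every nontrivial element of G_i is a generator.

    k1, k2 : number of nontrivial elements of G1 and G2.
    """
    end1, end2 = k1, k2          # words of length 1 ending in G1 / in G2
    sizes = [1, k1 + k2]
    for _ in range(2, n_max + 1):
        # a reduced word alternates between the two factors
        end1, end2 = end2 * k1, end1 * k2
        sizes.append(end1 + end2)
    return sizes


k1, k2 = 1, 2                            # Z/2 * Z/3 with S = {a, b, b^-1}
sizes = sphere_sizes(k1, k2, 40)
v = 0.5 * log(sizes[40] / sizes[38])     # growth rate from |S^(n+2)| / |S^n|
total = k1 * k2 * exp(-2 * v)            # left-hand side of (6.1)
```

Here every product $g_1 g_2$ has length $2$, so the left-hand side of (6.1) is $k_1 k_2 e^{-2v}$, and the recursion gives $e^{2v} = k_1 k_2$, in agreement with the growth series formula $(F_1(z)-1)(F_2(z)-1) = k_1 k_2 z^2 = 1$ at $z = e^{-v}$.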
Since there is only one way to generate the word $g_1^1 g_2^1 \cdots g_1^n g_2^n$ using $\mu$, we have
\[ \mu^{*n}(g_1^1 g_2^1 \cdots g_1^n g_2^n) = e^{-v\sum_i |g_1^i g_2^i|}. \]
Denoting by $X_n$ the position of the random walk at time $n$, it follows that $-\log\mu^{*n}(X_n) = v|X_n|$. Dividing by $n$ and letting $n$ tend to infinity, this gives $h(\mu) = \ell(\mu)v$. □

If one is interested in measures with finite support, one can only get the following approximation result. It has the same flavor as Theorem 1.4, but it is both stronger since it also applies to some non-hyperbolic groups, and weaker since the measures it produces are neither admissible nor symmetric.

Proposition 6.3. Let $\Gamma_1$ and $\Gamma_2$ be two nontrivial groups, generated respectively by finite symmetric sets $S_1$ and $S_2$. Let $\Gamma = \Gamma_1 * \Gamma_2$ with the generating set $S = S_1 \cup S_2$ and the corresponding word distance. Then
\[ \sup\bigl\{ h(\mu)/\ell(\mu) : \mu \text{ finitely supported probability measure in } \Gamma,\ \ell(\mu) > 0 \bigr\} = v. \]

Proof. Any element in $\Gamma$ can be canonically decomposed as a word in elements of $\Gamma_1$ and $\Gamma_2$. Let $S^p_{i,j}$ be the set of elements of length $p$ that start with an element in $\Gamma_i$ and end with an element in $\Gamma_j$. We have the decomposition
\[ S^p = S^p_{1,1} \cup S^p_{1,2} \cup S^p_{2,1} \cup S^p_{2,2}. \]
One term in this decomposition has cardinality at least $|S^p|/4$. Hence, there exist $i, j$ such that $\limsup \log|S^p_{i,j}|/p = v$. Multiplying by fixed elements at the beginning and at the end to go from $\Gamma_1$ to $\Gamma_i$, and from $\Gamma_j$ to $\Gamma_2$, we get
\[ \limsup \log|S^p_{1,2}|/p = v. \tag{6.2} \]
Let $\mu_p$ be the uniform probability measure on $S^p_{1,2}$. By construction, there are no simplifications when one iterates $\mu_p$. Hence, $\mu_p^{*n}$ is the uniform probability measure on $(S^p_{1,2})^{*n}$, whose cardinality is $|S^p_{1,2}|^n$. We get $H(\mu_p^{*n}) = n\log|S^p_{1,2}|$ and $L(\mu_p^{*n}) = np$. Therefore, $h(\mu_p) = \log|S^p_{1,2}|$ and $\ell(\mu_p) = p$, giving
\[ h(\mu_p)/\ell(\mu_p) = \log|S^p_{1,2}|/p. \]
Together with (6.2), this proves the proposition.
(cid:3) References [Aar97] Jon Aaronson, An introduction to infinite ergodic theory , Mathematical Surveys and Monographs,vol. 50, American Mathematical Society, Providence, RI, 1997. MR1450400. Cited page 15.[AL02] Goulnara N. Arzhantseva and Igor G. Lysenok, Growth tightness for word hyperbolic groups , Math.Z. (2002), 597–611. MR1938706. Cited page 26.[Anc87] Alano Ancona, Negatively curved manifolds, elliptic operators, and the Martin boundary , Ann. ofMath. (2) (1987), 495–536. MR890161. Cited pages 6, 11, and 19. NTROPY AND DRIFT IN WORD HYPERBOLIC GROUPS 43 [BH99] Martin R. Bridson and André Haefliger, Metric spaces of non-positive curvature , Grundlehrender Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 319,Springer-Verlag, Berlin, 1999. MR1744486. Cited pages 7 and 21.[BHM11] Sébastien Blachère, Peter Haïssinsky, and Pierre Mathieu, Harmonic measures versus quasi-conformal measures for hyperbolic groups , Ann. Sci. Éc. Norm. Supér. (4) (2011), 683–721.MR2919980. Cited pages 2, 3, 6, and 11.[Bou12] Jean Bourgain, Finitely supported measures on SL ( R ) which are absolutely continuous at infinity ,Geometric aspects of functional analysis, Lecture Notes in Math., vol. 2050, Springer, Heidelberg,2012, pp. 133–141. MR2985129. Cited page 2.[Bow98a] Brian H. Bowditch, Boundaries of strongly accessible hyperbolic groups , The Epstein birthdayschrift, Geom. Topol. Monogr., vol. 1, Geom. Topol. Publ., Coventry, 1998, pp. 51–97. MR1668331.Cited page 20.[Bow98b] , Cut points and canonical splittings of hyperbolic groups , Acta Math. (1998), 145–186.MR1638764. Cited page 20.[Cal13] Danny Calegari, The ergodic theory of hyperbolic groups , Geometry and topology down under,Contemp. Math., vol. 597, Amer. Math. Soc., Providence, RI, 2013, pp. 15–52. MR3186668.Cited page 25.[CM07] Chris Connell and Roman Muchnik, Harmonicity of quasiconformal measures and Poisson bound-aries of hyperbolic spaces , Geom. Funct. Anal. (2007), 707–769. 
MR2346273. Cited page 18.[Coo93] Michel Coornaert, Mesures de Patterson-Sullivan sur le bord d’un espace hyperbolique au sens deGromov , Pacific J. Math. (1993), 241–270. MR1214072. Cited pages 2, 3, 13, 17, 25, and 28.[Dal99] Françoise Dal’bo, Remarques sur le spectre des longueurs d’une surface et comptages , Bol. Soc.Brasil. Mat. (N.S.) (1999), 199–221. MR1703039. Cited page 19.[dlH00] Pierre de la Harpe, Topics in geometric group theory , Chicago Lectures in Mathematics, Universityof Chicago Press, Chicago, IL, 2000. MR1786869. Cited page 28.[EK10] Anna Erschler and Anders Karlsson, Homomorphisms to R constructed from random walks , Ann.Inst. Fourier (Grenoble) (2010), 2095–2113. MR2791651. Cited page 37.[EK13] Anna Erschler and Vadim Kaimanovich, Continuity of asymptotic characteristics for randomwalks on hyperbolic groups , Funktsional. Anal. i Prilozhen. (2013), 84–89. MR3113872. Citedpages 5, 10, 11, 14, and 32.[FS09] Philippe Flajolet and Robert Sedgewick, Analytic combinatorics , Cambridge University Press,Cambridge, 2009. MR2483235. Cited page 28.[GdlH90] Étienne Ghys and Pierre de la Harpe (eds.), Sur les groupes hyperboliques d’après Mikhael Gromov ,Progress in Mathematics, vol. 83, Birkhäuser Boston Inc., Boston, MA, 1990, Papers from theSwiss Seminar on Hyperbolic Groups held in Bern, 1988. MR1086648. Cited pages 7, 36, and 39.[GLJ93] Yves Guivarc’h and Yves Le Jan, Asymptotic winding of the geodesic flow on modular surfacesand continued fractions , Ann. Sci. École Norm. Sup. (4) (1993), 23–50. MR1209912. Citedpage 2.[Gou13] Sébastien Gouëzel, Martin boundary of measures with infinite support in hyperbolic groups ,Preprint, 2013. Cited pages 4, 11, 19, and 21.[Gui80] Yves Guivarc’h, Sur la loi des grands nombres et le rayon spectral d’une marche aléatoire , Confer-ence on Random Walks (Kleebach, 1979) (French), Astérisque, vol. 74, Soc. Math. France, Paris,1980, pp. 47–98, 3. MR588157. 
Cited page 3.[Haï13] Peter Haïssinsky, Marches aléatoires sur les groupes hyperboliques , Géométrie ergodique, Monogr.Enseign. Math., vol. 43, Enseignement Math., Geneva, 2013, pp. 199–265. MR3220556. Citedpages 3 and 11.[HY61] John G. Hocking and Gail S. Young, Topology , Addison-Wesley Publishing Co., 1961. MR0125557.Cited page 20.[INO08] Masaki Izumi, Sergey Neshveyev, and Rui Okayasu, The ratio set of the harmonic measure ofa random walk on a hyperbolic group , Israel J. Math. (2008), 285–316. MR2391133. Citedpages 6, 20, and 21.[Kai00] Vadim A. Kaimanovich, The Poisson formula for groups with hyperbolic properties , Ann. of Math.(2) (2000), 659–692. MR1815698. Cited pages 9, 11, 16, and 30. NTROPY AND DRIFT IN WORD HYPERBOLIC GROUPS 44 [KB02] Ilya Kapovich and Nadia Benakli, Boundaries of hyperbolic groups , Combinatorial and geometricgroup theory (New York, 2000/Hoboken, NJ, 2001), Contemp. Math., vol. 296, Amer. Math. Soc.,Providence, RI, 2002, pp. 39–93. MR1921706. Cited pages 19 and 20.[KL11] Anders Karlsson and François Ledrappier, Noncommutative ergodic theorems , Geometry, rigidity,and group actions, Chicago Lectures in Math., Univ. Chicago Press, Chicago, IL, 2011, pp. 396–418. MR2807838. Cited page 9.[Lal14] Steve Lalley, Random walks on hyperbolic groups , http://galton.uchicago.edu/ ∼ lalley/Talks/paris-talkC.pdf, 2014. Cited page 4.[Led95] François Ledrappier, Applications of dynamics to compact manifolds of negative curvature , Pro-ceedings of the International Congress of Mathematicians, Vol. 1, 2 (Zürich, 1994), Birkhäuser,Basel, 1995, pp. 1195–1202. MR1404020. Cited page 2.[LP07] Vincent Le Prince, Dimensional properties of the harmonic measure for a random walk on ahyperbolic group , Trans. Amer. Math. Soc. (2007), 2881–2898 (electronic). MR2286061. Citedpages 2 and 40.[MM07] Jean Mairesse and Frédéric Mathéus, Random walks on free products of cyclic groups , J. Lond.Math. Soc. (2) (2007), 47–66. MR2302729. Cited page 36.[Neu54] Bernhard H. 
Neumann, Groups covered by finitely many cosets , Publ. Math. Debrecen (1954),227–242 (1955). MR0072138. Cited page 25.[Pei03] Marc Peigné, On the Patterson-Sullivan measure of some discrete group of isometries , Israel J.Math. (2003), 77–88. MR1968423. Cited page 28.[Rev84] Daniel Revuz, Markov chains , second ed., North-Holland Mathematical Library, vol. 11, North-Holland Publishing Co., Amsterdam, 1984. MR758799. Cited page 24.[Sha98] Richard Sharp, Relative growth series in some hyperbolic groups , Math. Ann. (1998), 125–132.MR1645953. Cited pages 23 and 24.[Swa96] Gadde A. Swarup, On the cut point conjecture , Electron. Res. Announc. Amer. Math. Soc. (1996), 98–100 (electronic). MR1412948. Cited page 20.[Tan14] Ryokichi Tanaka, Hausdorff spectrum of harmonic measure , arXiv:1411.2312 [math.PR], 2014.Cited pages 3 and 6.[Ver00] Anatoly Vershik, Dynamic theory of growth in groups: entropy, boundaries, examples , UspekhiMat. Nauk (2000), 59–128. MR1786730. Cited pages 1 and 4.[VSCC92] Nicholas Th. Varopoulos, Laurent Saloff-Coste, and Thierry Coulhon, Analysis and geometry ongroups , Cambridge Tracts in Mathematics, vol. 100, Cambridge University Press, Cambridge,1992. MR1218884. Cited page 38.[Woe00] Wolfgang Woess, Random walks on infinite graphs and groups , Cambridge Tracts in Mathematics,vol. 138, Cambridge University Press, Cambridge, 2000. MR1743100. Cited page 37. Sébastien Gouëzel, IRMAR, Univ. Rennes 1, 35042 Rennes Cedex, France E-mail address : [email protected] Frédéric Mathéus, Univ. Bret. Sud, L.M.B.A., UMR 6205, BP 573, 56017 Vannes, France E-mail address : [email protected] François Maucourant, IRMAR, Univ. Rennes 1, 35042 Rennes Cedex, France E-mail address ::