The ergodic theory of hyperbolic groups

Abstract

These notes are a self-contained introduction to the use of dynamical and probabilistic methods in the study of hyperbolic groups. Most of this material is standard; however some of the proofs given are new, and some results are proved in greater generality than have appeared in the literature. These notes originated in a minicourse given at a workshop in Melbourne, July 11-15 2011.


DANNY CALEGARI


Contents

1. Introduction
2. Hyperbolic groups
2.1. Coarse geometry
2.2. Hyperbolic spaces
2.3. Hyperbolic groups
2.4. The Gromov boundary
2.5. Patterson–Sullivan measure
3. Combings
3.1. Regular languages
3.2. Cannon's theorem
3.3. Combings and combable functions
3.4. Markov chains
3.5. Shift space
3.6. Limit theorems
3.7. Thermodynamic formalism
4. Random walks
4.1. Random walk
4.2. Poisson boundary
4.3. Harmonic functions
4.4. Green metric
4.5. Harnack inequality
4.6. Monotonicity
5. Acknowledgments
References

1. Introduction

These are notes from a minicourse given at a workshop in Melbourne, July 11–15, 2011. There is little pretension to originality; the main novelty is firstly that we give a new (and much shorter) proof of Coornaert's theorem on Patterson–Sullivan measures for hyperbolic groups (Theorem 2.5.4), and secondly that we explain how to combine the results of Calegari–Fujiwara in [8] with that of Pollicott–Sharp [35] to prove central limit theorems for quite general classes of functions on hyperbolic groups (Corollary 3.7.5 and Theorem 3.7.6), crucially without the hypothesis that the Markov graph encoding an automatic structure is ergodic.

A final section on random walks is much more cursory.

Date: October 24, 2018.

2. Hyperbolic groups

2.1. Coarse geometry.

The fundamental idea in geometric group theory is to study groups as automorphisms of geometric spaces, and as a special case, to study the group itself (with its canonical self-action) as a geometric space. This is accomplished most directly by means of the Cayley graph construction.

Definition 2.1.1 (Cayley graph). Let $G$ be a group and $S$ a (usually finite) generating set. Associated to $G$ and $S$ we can form the Cayley graph $C_S(G)$. This is a graph with vertex set $G$, and with an edge from $g$ to $gs$ for all $g \in G$ and $s \in S$. The action of $G$ on itself by (left) multiplication induces a properly discontinuous action of $G$ on $C_S(G)$ by simplicial automorphisms.

If $G$ has no 2-torsion, the action is free and properly discontinuous, and the quotient is a wedge of $|S|$ circles $X_S$. In this case, if $G$ has a presentation $G = \langle S \mid R \rangle$ we can think of $C_S(G)$ as the covering space of $X_S$ corresponding to the subgroup of the free group $F_S$ normally generated by $R$, and the action of $G$ on $C_S(G)$ is the deck group of the covering.

Figure 1. The Cayley graph of $F_2 = \langle a, b \mid \ \rangle$ with generating set $S = \{a, b\}$.

We assume the reader is familiar with the notion of a metric space, i.e. a space $X$ together with a symmetric non-negative real-valued function $d_X$ on $X \times X$ which vanishes precisely on the diagonal, and which satisfies the triangle inequality $d_X(x, y) + d_X(y, z) \ge d_X(x, z)$ for each triple $x, y, z \in X$. A metric space is a path metric space if for each $x, y \in X$, the distance $d_X(x, y)$ is equal to the infimum of the set of numbers $L$ for which there is a 1-Lipschitz map $\gamma : [0, L] \to X$ sending $0$ to $x$ and $L$ to $y$. It is a geodesic metric space if it is a path metric space and if the infimum is achieved on some $\gamma$ for each pair $x, y$; such a $\gamma$ is called a geodesic. Finally, a metric space is proper if closed metric balls of bounded radius are compact (equivalently, for each point $x$ the function $d(x, \cdot) : X \to \mathbb{R}$ is proper).

The graph $C_S(G)$ can be canonically equipped with the structure of a geodesic metric space. This is accomplished by making each edge isometric to the Euclidean unit interval. If $S$ is finite, $C_S(G)$ is proper. Note that $G$ itself inherits a subspace metric from $C_S(G)$, called the word metric. We denote the word metric by $d_S$, and define $|g|_S$ (or just $|g|$ if $S$ is understood) to be $d_S(\mathrm{id}, g)$. Observe that $d_S(g, h) = |g^{-1}h|_S = |h^{-1}g|_S$ and that $|g|_S$ is the length of the shortest word in elements of $S$ and their inverses representing the element $g$.

The most serious shortcoming of this construction is its dependence on the choice of a generating set $S$. Different choices of generating set $S$ give rise to different spaces $C_S(G)$ which are typically not even homeomorphic. The standard way to resolve this issue is to coarsen the geometric category in which one works.

Definition 2.1.2.

Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. A map $f : X \to Y$ (not assumed to be continuous) is a quasi-isometric map if there are constants $K \ge 1$, $\epsilon \ge 0$ so that
$$K^{-1} d_X(x_1, x_2) - \epsilon \le d_Y(f(x_1), f(x_2)) \le K d_X(x_1, x_2) + \epsilon$$
for all $x_1, x_2 \in X$. It is said to be a quasi-isometry if further $f(X)$ is a net in $Y$; that is, if there is some $R$ so that $Y$ is equal to the $R$-neighborhood of $f(X)$.

One also uses the terminology $K, \epsilon$ quasi-isometric map or $K, \epsilon$ quasi-isometry if the constants are specified. Note that a $K, 0$ quasi-isometric map is the same thing as a $K$ bilipschitz map. The best constant $K$ is called the multiplicative constant, and the best $\epsilon$ the additive constant of the map.

We denote the $R$-neighborhood of a set $\Sigma$ by $N_R(\Sigma)$. Hence a quasi-isometry is a quasi-isometric map for which $Y = N_R(f(X))$ for some $R$.

Remark 2.1.3. It is much more common to use the terminology quasi-isometric embedding instead of quasi-isometric map as above; we consider this terminology misleading, and therefore avoid it.
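The inequality in Definition 2.1.2 can be tested numerically on a finite sample. The following is a minimal sketch, not part of the original notes: it assumes the (discontinuous) floor map from $\mathbb{R}$ to $\mathbb{Z}$ as the example, and the helper name `is_quasi_isometric_on_sample` is hypothetical. A finite check of this kind can only refute a choice of constants, never prove it.

```python
import math
import random

def dist(a, b):
    """Standard metric on the real line (and on its subspace Z)."""
    return abs(a - b)

def is_quasi_isometric_on_sample(f, d_X, d_Y, points, K, eps):
    """Check K^{-1} d_X - eps <= d_Y(f, f) <= K d_X + eps on every pair of sample points."""
    for x1 in points:
        for x2 in points:
            dx = d_X(x1, x2)
            dy = d_Y(f(x1), f(x2))
            if not (dx / K - eps <= dy <= K * dx + eps):
                return False
    return True

random.seed(0)  # fixed seed so that the sample is reproducible
sample = [random.uniform(-50, 50) for _ in range(200)]

# floor is a 1, 1 quasi-isometric map; its image is all of Z, so it is a quasi-isometry.
floor_ok = is_quasi_isometric_on_sample(math.floor, dist, dist, sample, K=1, eps=1)

# It is not a 1, 0 quasi-isometric map (i.e. not an isometric embedding):
# the points 0.9 and 1.0 are 0.1 apart but their images are 1 apart.
floor_isometric = is_quasi_isometric_on_sample(math.floor, dist, dist, [0.9, 1.0], K=1, eps=0)
```

The additive constant is what allows a map that fails to be continuous to count as a quasi-isometric map.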

Lemma 2.1.4.

Quasi-isometry is an equivalence relation.

Proof. Reflexivity and transitivity are obvious, so we must show symmetry. Let $f : X \to Y$ be a $K, \epsilon$ quasi-isometry with $Y = N_R(f(X))$. For each $y \in Y$ choose $x \in X$ with $d_Y(y, f(x)) \le R$ (such an $x$ exists by definition) and define $g(y) = x$. Observe $d_Y(y, fg(y)) \le R$ by definition. Then
$$d_X(g(y_1), g(y_2)) \le K d_Y(fg(y_1), fg(y_2)) + K\epsilon \le K d_Y(y_1, y_2) + K(\epsilon + 2R)$$
Similarly,
$$d_X(g(y_1), g(y_2)) \ge K^{-1} d_Y(fg(y_1), fg(y_2)) - K^{-1}\epsilon \ge K^{-1} d_Y(y_1, y_2) - K^{-1}(\epsilon + 2R)$$
proving symmetry. □

Note that the compositions $fg$ and $gf$ as above move points a bounded distance. One can define a category in which objects are equivalence classes of metric spaces under the equivalence relation generated by thickening (i.e. isometric inclusion as a net in a bigger space), and morphisms are equivalence classes of quasi-isometric maps, where two maps are equivalent if their values on each point are a uniformly bounded distance apart. In this category, quasi-isometries are isomorphisms. In particular, the set of quasi-isometries of a metric space $X$, modulo maps that move points a bounded distance, is a group, denoted $\mathrm{QI}(X)$, which only depends on the quasi-isometry type of $X$. Determining $\mathrm{QI}(X)$, even for very simple spaces, is typically extraordinarily difficult.

Example 2.1.5. A metric space

$(X, d_X)$ is quasi-isometric to a point if and only if it has bounded diameter. A Cayley graph $C_S(G)$ (for $S$ finite) is quasi-isometric to a point if and only if $G$ is finite.

Example 2.1.6. If $S$ and $T$ are two finite generating sets for a group $G$ then the identity map from $G$ to itself is a quasi-isometry (in fact, a bilipschitz map) of $(G, d_S)$ to $(G, d_T)$. For, there are constants $C_1$ and $C_2$ so that $|s|_T \le C_1$ for all $s \in S$, and $|t|_S \le C_2$ for all $t \in T$, and therefore $C_1^{-1} d_T(g, h) \le d_S(g, h) \le C_2 d_T(g, h)$. Because of this, the quasi-isometry class of $(G, d_S)$ is independent of the choice of finite generating set, and we can speak unambiguously of the quasi-isometry class of $G$.

The Schwarz Lemma connects the geometry of groups to the geometry of spaces they act on.

Lemma 2.1.7 (Schwarz Lemma). Let $G$ act properly discontinuously and cocompactly by isometries on a proper geodesic metric space $X$. Then $G$ is finitely generated by some set $S$, and the orbit map $G \to X$ sending $g$ to $gx$ (for any $x \in X$) is a quasi-isometry from $(G, d_S)$ to $X$.

Proof. Since $X$ is proper and $G$ acts cocompactly there is an $R$ so that $G N_R(x) = X$. Note that $Gx$ is a net, since every point of $X$ is contained in some translate $g N_R(x)$ and is therefore within distance $R$ of $gx$.

Let $B = N_{R+1}(x)$. Since $G$ acts properly discontinuously, there are only finitely many $g$ in $G$ for which $gB \cap B$ is nonempty; let $S$ be the nontrivial elements of this set.

Now, if $g, h \in G$ are arbitrary, let $\gamma$ be a geodesic in $X$ from $gx$ to $hx$. Parameterize $\gamma$ by arclength, and for each integer $i \in (0, |\gamma|)$ let $g_i$ be such that $d_X(g_i x, \gamma(i)) \le R$. Then $g_i^{-1} g_{i+1} \in S$ and therefore
$$d_S(g, h) = |g^{-1}h| \le |\gamma| + 1 = d(gx, hx) + 1$$
which shows incidentally that $S$ generates $G$.

Conversely, if $L := d_S(g, h)$ and $g_i$ is a sequence of elements with $g_0 = g$ and $g_L = h$ and each $g_i^{-1} g_{i+1} \in S$, then there is a path $\gamma_i$ from $g_i x$ to $g_{i+1} x$ of length at most $4R + 2$, and the concatenation of these paths certifies that
$$d(gx, hx) \le (4R + 2)|g^{-1}h| = (4R + 2) d_S(g, h)$$
This completes the proof of the lemma. □

Example 2.1.8. If $G$ is a group and $H$ is a subgroup of finite index, then $G$ and $H$ are quasi-isometric (for, both act properly discontinuously and cocompactly on $C_S(G)$). Two groups are said to be commensurable if they have isomorphic subgroups of finite index; the same argument shows that commensurable groups are quasi-isometric.
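The bilipschitz comparison of word metrics in Example 2.1.6 can be checked by brute force in a small case. The sketch below is an illustration only, under assumptions that are not in the original notes: the group is $\mathbb{Z}$, the generating sets are $S = \{1\}$ and $T = \{2, 3\}$, and word lengths are computed by breadth-first search in a finite window of the Cayley graph (the function `word_lengths` and its parameters are hypothetical names).

```python
from collections import deque

def word_lengths(generators, radius, window):
    """Word lengths |n| in Z for |n| <= radius, by BFS in the Cayley graph.

    The search is confined to [-window, window]; window should exceed radius
    by at least the largest generator so that geodesics are not cut off.
    """
    steps = set(generators) | {-s for s in generators}
    length = {0: 0}
    queue = deque([0])
    while queue:
        n = queue.popleft()
        for s in steps:
            m = n + s
            if abs(m) <= window and m not in length:
                length[m] = length[n] + 1
                queue.append(m)
    return {n: d for n, d in length.items() if abs(n) <= radius}

S, T = [1], [2, 3]
len_S = word_lengths(S, radius=30, window=40)
len_T = word_lengths(T, radius=30, window=40)

C1 = max(len_T[s] for s in S)  # every element of S is a T-word of length <= C1
C2 = max(len_S[t] for t in T)  # every element of T is an S-word of length <= C2

# The word metric is left-invariant, so it suffices to compare word lengths.
bilipschitz = all(len_T[n] / C1 <= len_S[n] <= C2 * len_T[n] for n in len_S)
```

Here $1 = 3 - 2$ has $T$-length 2, so $C_1 = 2$ and $C_2 = 3$, and the comparison reads $\tfrac{1}{2}|n|_T \le |n|_S \le 3|n|_T$.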

Example 2.1.9. Any two regular trees of (finite) valence $\ge 3$ are quasi-isometric.

Example 2.1.10. The set of ends of a geodesic metric space is a quasi-isometry invariant. A famous theorem of Stallings [39] says that a finitely generated group with more than one end splits over a finite subgroup; it follows that the property of splitting over a finite subgroup is a quasi-isometry invariant.

Finiteness of the edge groups (in a splitting) is detected quasi-isometrically by the existence of separating compact subsets. Quasi-isometry can further detect the finiteness of the vertex groups, and in particular one observes that a group is quasi-isometric to a free group if and only if it is virtually free.

Example 2.1.11. Any two groups that act cocompactly and properly discontinuously on the same space $X$ are quasi-isometric. For example, if $M_1$, $M_2$ are closed Riemannian manifolds with isometric universal covers, then $\pi_1(M_1)$ and $\pi_1(M_2)$ are quasi-isometric. It is easy to produce examples for which the groups in question are not commensurable; for instance, a pair of closed hyperbolic 3-manifolds $M_1$, $M_2$ with different invariant trace fields (see [27]).

Remark 2.1.12. In the geometric group theory literature, Lemma 2.1.7 is often called the "Milnor–Švarc (or Švarc–Milnor) Lemma"; "Švarc" here is in fact the well-known mathematical physicist Albert Schwarz; it is our view that the orthography "Švarc" tends to obscure this. Actually, the content of this Lemma was first observed by Schwarz in the early 50's and only rediscovered 15 years later by Milnor at a time when the work of Soviet mathematicians was not widely disseminated in the west.

2.2. Hyperbolic spaces.

In a geodesic metric space a geodesic triangle is just a union of three geodesics joining three points in pairs. If the three points are $x, y, z$ we typically denote the (oriented) geodesics by $xy$, $yz$ and $zx$ respectively; this notation obscures the possibility that the geodesics in question are not uniquely determined by their endpoints.

Definition 2.2.1. A geodesic metric space $(X, d_X)$ is $\delta$-hyperbolic if for any geodesic triangle, each side of the triangle is contained in the $\delta$-neighborhood of the union of the other two sides. A metric space is hyperbolic if it is $\delta$-hyperbolic for some $\delta$. One sometimes says that geodesic triangles are $\delta$-thin.

Example 2.2.2. A tree is 0-hyperbolic.
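The thin triangle condition of Definition 2.2.1 can be verified directly on a small tree. The sketch below is an illustration rather than anything from the original notes; it assumes a finite binary tree whose vertices are labelled heap-style (vertex `v` has parent `(v - 1) // 2`), and it tests only the vertices on each side of a triangle, not the interior points of edges.

```python
from itertools import combinations

def parent(v):
    """Parent of vertex v in a binary tree rooted at 0 with heap-style labels."""
    return (v - 1) // 2

def path_to_root(v):
    """Vertices on the path from v up to the root, starting at v."""
    path = [v]
    while v != 0:
        v = parent(v)
        path.append(v)
    return path

def geodesic(u, v):
    """Vertex set of the unique geodesic from u to v in the tree."""
    up_u, up_v = path_to_root(u), path_to_root(v)
    ancestors_of_v = set(up_v)
    # The first common ancestor met while climbing from u is where the paths join.
    meet = next(w for w in up_u if w in ancestors_of_v)
    return set(up_u[: up_u.index(meet) + 1]) | set(up_v[: up_v.index(meet) + 1])

def triangles_are_0_thin(num_vertices):
    """Check that each side of every geodesic triangle lies in the union of the other two."""
    for x, y, z in combinations(range(num_vertices), 3):
        xy, yz, zx = geodesic(x, y), geodesic(y, z), geodesic(z, x)
        if not (xy <= (yz | zx) and yz <= (zx | xy) and zx <= (xy | yz)):
            return False
    return True

tree_is_0_hyperbolic = triangles_are_0_thin(31)  # complete binary tree of depth 4
```

Every geodesic triangle in a tree is a tripod, which is why each side is contained in the union of the other two with no thickening at all.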

Example 2.2.3. Hyperbolic space (of any dimension) is $\delta$-hyperbolic for a uniform $\delta$.

Example 2.2.4. If $X$ is a simply-connected complete Riemannian manifold with curvature bounded above by some $K < 0$ then $X$ is $\delta$-hyperbolic for some $\delta$ depending on $K$.

Definition 2.2.5. A geodesic metric space $X$ is CAT($K$) for some $K$ if triangles are thinner than comparison triangles in a space of constant curvature $K$. This means that if $xyz$ is a geodesic triangle in $X$, and $x'y'z'$ is a geodesic triangle in a complete simply connected Riemannian manifold $Y$ of constant curvature $K$ with edges of the same lengths, and $\phi : xyz \to x'y'z'$ is an isometry on each edge, then for any $w \in yz$ we have $d_X(x, w) \le d_Y(x', \phi(w))$.

Figure 2. A $\delta$-thin triangle; the gray tubes have thickness $\delta$.

The initials CAT stand for Cartan–Alexandrov–Toponogov, who made substantial contributions to the theory of comparison geometry.

Example 2.2.6. From the definition, a CAT($K$) space is $\delta$-hyperbolic whenever the complete simply connected Riemannian 2-manifold of constant curvature $K$ is $\delta$-hyperbolic. Hence a CAT($K$) space is hyperbolic if $K < 0$.

Example 2.2.7. Nearest point projection to a convex subset of a CAT($K$) space with $K \le 0$ is distance nonincreasing. Hence the subspace metric and the path metric on a convex subset of a CAT($K$) space agree, and such a subspace is itself CAT($K$).

Thinness of triangles implies thinness of arbitrary polygons.

Example 2.2.8. Let $X$ be $\delta$-hyperbolic and let $abcd$ be a geodesic quadrilateral. Then either there are points on $ab$ and $cd$ at distance $\le 2\delta$ or there are points on $ad$ and $bc$ at distance $\le 2\delta$, or possibly both.

Figure 3. Two ways that a quadrilateral can be thin.

The number of essentially distinct ways in which an $n$-gon can be thin is equal to the $n$th Catalan number. By cutting up a polygon into triangles and examining the implications of $\delta$-thinness for each triangle, one can reason about the geometry of complicated configurations in $\delta$-hyperbolic space.

Lemma 2.2.9. Let $X$ be $\delta$-hyperbolic, let $\gamma$ be a geodesic segment/ray/line in $X$, and let $p \in X$. Then there is a point $q$ on $\gamma$ realizing the infimum of distance from $p$ to points on $\gamma$, and moreover for any two such points $q, q'$ we have $d_X(q, q') \le 4\delta$.

Proof. The existence of some point realizing the infimum follows from the properness of $d(p, \cdot) : \gamma \to \mathbb{R}$, valid for any geodesic in any metric space.

Let $q, q'$ be two such points, and if $d(q, q') > 4\delta$ let $q''$ be the midpoint of the segment $qq'$, so $d(q, q'') = d(q'', q') > 2\delta$. Without loss of generality there is $r$ on $pq$ with $d(r, q'') \le \delta$, hence $d(r, q) > \delta$. But then
$$d(p, q'') \le d(p, r) + d(r, q'') \le d(p, r) + \delta < d(p, r) + d(r, q) = d(p, q)$$
contrary to the fact that $q$ minimizes the distance from $p$ to points on $\gamma$. □

Lemma 2.2.9 says that there is an approximate "nearest point projection" map $\pi$ from $X$ to any geodesic $\gamma$ (compare with Example 2.2.7). This map is not continuous, but nearby points must map to nearby points, in the sense that $d(\pi(x), \pi(y)) \le d(x, y) + 8\delta$.

We would now like to show that the property of being hyperbolic is preserved under quasi-isometry. The problem is that the property of $\delta$-hyperbolicity is expressed in terms of geodesics, and quasi-isometries do not take geodesics to geodesics.

A quasigeodesic segment/ray/line is the image of a segment/ray/line in $\mathbb{R}$ under a quasi-isometric map. For infinite or semi-infinite intervals this definition has content; for finite intervals this definition has no content without specifying the constants involved. Hence we can talk about a $K, \epsilon$ quasigeodesic segment/ray/line.
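Nearest point projection to a geodesic, as in Lemma 2.2.9 and Example 2.2.7, can be explored numerically in the hyperbolic plane. The sketch below is an illustration under assumptions that do not appear in the original notes: it uses the upper half-plane model with the standard distance formula $\cosh d(z, w) = 1 + |z - w|^2 / (2\, \mathrm{Im}\, z\, \mathrm{Im}\, w)$, takes the geodesic to be the positive imaginary axis, and uses the fact that the nearest point to $z$ on that axis is $i|z|$. The helper names are hypothetical, and a small tolerance absorbs floating point error.

```python
import math
import random
from itertools import combinations

def hyp_dist(z, w):
    """Hyperbolic distance between two points of the upper half-plane."""
    return math.acosh(1 + abs(z - w) ** 2 / (2 * z.imag * w.imag))

def project_to_axis(z):
    """Nearest point to z on the geodesic line {it : t > 0}."""
    return complex(0.0, abs(z))

random.seed(0)  # fixed seed so that the sample is reproducible
points = [complex(random.uniform(-5, 5), random.uniform(0.1, 5)) for _ in range(60)]
TOL = 1e-6  # tolerance for rounding error in acosh

# The claimed projection is at least as close as any point it on a grid of t values.
projection_is_nearest = all(
    hyp_dist(z, project_to_axis(z)) <= hyp_dist(z, complex(0.0, t / 10)) + TOL
    for z in points
    for t in range(1, 200)
)

# Projection to a geodesic does not increase distances on the sample.
projection_is_nonexpanding = all(
    hyp_dist(project_to_axis(z), project_to_axis(w)) <= hyp_dist(z, w) + TOL
    for z, w in combinations(points, 2)
)
```

In this model the projection is unique and distance nonincreasing, which is stronger than the coarse estimate $d(\pi(x), \pi(y)) \le d(x, y) + 8\delta$ available in a general $\delta$-hyperbolic space.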

Lemma 2.2.10 (Morse lemma). Let $(X, d_X)$ be a proper $\delta$-hyperbolic space. Then for any $K, \epsilon$ there is a constant $C$ (depending in an explicit way on $K, \epsilon, \delta$) so that any $K, \epsilon$ quasigeodesic $\gamma$ is within Hausdorff distance $C$ of a genuine geodesic $\gamma^g$. If $\gamma$ has one or two endpoints, $\gamma^g$ can be chosen to have the same endpoints.

Proof. If $\gamma$ is noncompact, it can be approximated on compact subsets by finite segments $\gamma_i$. If we prove the lemma for finite segments, then a subsequence of the $\gamma_i^g$, converging on compact subsets, will limit to $\gamma^g$ with the desired properties (here is where we use properness of $X$). So it suffices to prove the lemma for $\gamma$ a segment.

In this case choose any $\gamma^g$ with the same endpoints as $\gamma$. We need to estimate the Hausdorff distance from $\gamma$ to $\gamma^g$. Fix some constant $C$ and suppose there are points $p, p'$ on $\gamma$ that are both distance $C$ from $\gamma^g$, but $d(r, \gamma^g) \ge C$ for all $r$ on $\gamma$ between $p$ and $p'$. Choose $p_i$ a sequence of points on $\gamma$ and $q_i$ a sequence of points on $\gamma^g$ closest to the $p_i$ so that $d(q_i, q_{i+1}) = 11\delta$.

Consider the quadrilateral $p_i p_{i+1} q_{i+1} q_i$. By Example 2.2.8 either there are close points on $p_i p_{i+1}$ and $q_i q_{i+1}$, or close points on $p_i q_i$ and $p_{i+1} q_{i+1}$ (or possibly both). Suppose there are points $r_i$ on $p_i q_i$ and $r_{i+1}$ on $p_{i+1} q_{i+1}$ with $d(r_i, r_{i+1}) \le 2\delta$. Then any nearest point projections of $r_i$ and $r_{i+1}$ to $\gamma^g$ must be at most distance $10\delta$ apart. But $q_i$ and $q_{i+1}$ are such nearest point projections, by definition, and satisfy $d(q_i, q_{i+1}) = 11\delta$. So it must be instead that there are points $r_i$ on $p_i p_{i+1}$ and $s_i$ on $q_i q_{i+1}$ which are at most $2\delta$ apart. But this means that $d(p_i, p_{i+1}) \ge 2C - 4\delta$, so the length of $\gamma$ between $p$ and $p'$ is at least $(2C - 4\delta)\, d(q, q') / 11\delta$ where $q, q'$ are points on $\gamma^g$ closest to $p, p'$. On the other hand, $d(p, p') \le 2C + d(q, q')$. Since $\gamma$ is a $K, \epsilon$ quasigeodesic, if $d(q, q')$ is big enough, we get a uniform bound on $C$ in terms of $K, \epsilon, \delta$. The remaining case where $d(q, q')$ is itself uniformly bounded but $C$ is unbounded quickly leads to a contradiction. □

Corollary 2.2.11. Let $Y$ be $\delta$-hyperbolic and let $f : X \to Y$ be a $K, \epsilon$ quasi-isometry. Then $X$ is $\delta'$-hyperbolic for some $\delta'$. Hence the property of being hyperbolic is a quasi-isometry invariant.

Proof. Let $\Gamma$ be a geodesic triangle in $X$ with vertices $a, b, c$. Then the edges of $f(\Gamma)$ are $K, \epsilon$ quasigeodesics in $Y$, and are therefore within Hausdorff distance $C$ of geodesics with the same endpoints. It follows that every point on $f(ab)$ is within distance $2C + \delta$ of $f(ac) \cup f(bc)$ and therefore every point on $ab$ is within distance $K(2C + \delta + \epsilon)$ of $ac \cup bc$. □

The Morse Lemma lets us promote quasigeodesics to (nearby) geodesics. The next lemma says that quasigeodesity is a local condition.

Definition 2.2.12. A path $\gamma$ in $X$ is a $k$-local geodesic if the subsegments of length $\le k$ are geodesics. Similarly, $\gamma$ is a $k$-local $K, \epsilon$ quasigeodesic if the subsegments of length $\le k$ are $K, \epsilon$ quasigeodesics.

Lemma 2.2.13 ($k$-local geodesics). Let $X$ be a $\delta$-hyperbolic geodesic space, and let $k > 8\delta$. Then any $k$-local geodesic is $K, \epsilon$ quasigeodesic for $K, \epsilon$ depending explicitly on $\delta$.

More generally, for any $K, \epsilon$ there is a $k$ and constants $K', \epsilon'$ so that any $k$-local $K, \epsilon$ quasigeodesic is a $K', \epsilon'$ quasigeodesic.

Proof. Let $\gamma$ be a $k$-local geodesic segment from $p$ to $q$, and let $\gamma^g$ be any geodesic from $p$ to $q$. Let $r$ be a point on $\gamma$ furthest from $\gamma^g$, and let $r$ be the midpoint of an arc $r'r''$ of $\gamma$ of length $8\delta$. By hypothesis, $r'r''$ is actually a geodesic. Let $s'$ and $s''$ be points on $\gamma^g$ closest to $r'$ and $r''$. The point $r$ is within distance $2\delta$ either of $\gamma^g$ or of one of the sides $r's'$ or $r''s''$. If the latter, we would get a path from $r$ to $s'$ or $s''$ shorter than the distance from $r'$ or $r''$, contrary to the definition of $r$. Hence the distance from $r$ to $\gamma^g$ is at most $2\delta$, and therefore $\gamma$ is contained in the $2\delta$ neighborhood of $\gamma^g$.

Now let $\pi : \gamma \to \gamma^g$ take points on $\gamma$ to closest points on $\gamma^g$. Since $\pi$ moves points at most $2\delta$, it is approximately continuous. Since $\gamma$ is a $k$-local geodesic, the map $\pi$ is approximately monotone; i.e. if $p_i$ are points on $\gamma$ with $d(p_i, p_{i+1}) = k$ moving monotonely from one end of $\gamma$ to the other, then $d(\pi(p_i), \pi(p_{i+1})) \ge k - 4\delta$ and the projections also move monotonely along $\gamma^g$. In particular, $d(p_i, p_j) \ge (k - 4\delta)|i - j|$ and $\pi$ is a quasi-isometry. The constants involved evidently depend only on $\delta$ and $k$, and the multiplicative constant evidently goes to 1 as $k$ gets large.

The more general fact is proved similarly, by using Lemma 2.2.10 to promote local quasigeodesics to local geodesics, and then back to global quasigeodesics. □

2.3. Hyperbolic groups.

Corollary 2.2.11 justifies the following definition:

Definition 2.3.1. A group $G$ is hyperbolic if $C_S(G)$ is $\delta$-hyperbolic for some $\delta$ for some (and hence for any) finite generating set $S$.

Example 2.3.2. Free groups are hyperbolic, since their Cayley graphs (with respect to a free generating set) are trees, which are 0-hyperbolic.
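The claim that the Cayley graph of a free group on a free generating set is a tree can be checked on finite balls. The sketch below is an illustration, not taken from the original notes: it assumes the free group on $a, b$, writes $A, B$ for the inverses, and represents group elements as reduced words (all helper names are hypothetical). A ball around the identity is connected, since each reduced word is joined to its prefixes, so it is a tree exactly when it has one fewer edge than it has vertices.

```python
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}

def reduced_words(n):
    """All reduced words of length <= n in a, b and their inverses A, B."""
    words, frontier = [""], [""]
    for _ in range(n):
        # Extend each word by any letter that does not cancel its last letter.
        frontier = [w + s for w in frontier for s in "aAbB"
                    if not w or INVERSE[w[-1]] != s]
        words.extend(frontier)
    return words

def times(w, s):
    """Right-multiply the reduced word w by the letter s and reduce."""
    return w[:-1] if w and INVERSE[w[-1]] == s else w + s

def ball_is_tree(n):
    """Check |edges| = |vertices| - 1 for the ball of radius n in the Cayley graph."""
    ball = set(reduced_words(n))
    # One edge from g to gs for each generator s in {a, b}, kept if both ends lie in the ball.
    edges = {(w, times(w, s)) for w in ball for s in "ab" if times(w, s) in ball}
    return len(edges) == len(ball) - 1

ball_sizes = [len(reduced_words(n)) for n in range(4)]
radius_4_ball_is_tree = ball_is_tree(4)
```

The sphere of radius $n \ge 1$ has $4 \cdot 3^{n-1}$ elements, which gives the ball sizes 1, 5, 17, 53 for radii 0 through 3.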

Example 2.3.3. Virtually free groups, being precisely the groups quasi-isometric to trees, are hyperbolic. A group quasi-isometric to a point or to $\mathbb{R}$ is finite or virtually $\mathbb{Z}$ respectively; such groups are called elementary hyperbolic groups; all others are nonelementary.

Example 2.3.4. Fundamental groups of closed surfaces with negative Euler characteristic are hyperbolic. By the uniformization theorem, each such surface can be given a hyperbolic metric, exhibiting $\pi_1$ as a cocompact group of isometries of the hyperbolic plane.

Example 2.3.5. A Kleinian group is a finitely generated discrete subgroup of the group of isometries of hyperbolic 3-space. A Kleinian group $G$ is convex cocompact if it acts cocompactly on the convex hull of its limit set (in the sphere at infinity). Such a convex hull is CAT($-1$), so a convex cocompact Kleinian group is hyperbolic.

Lemma 2.3.6 (invariant quasiaxis). Let $G$ be hyperbolic. Then there are finitely many conjugacy classes of torsion elements (and therefore a bound on the order of the torsion) and there are constants $K, \epsilon$ so that for any nontorsion element $g$ there is a $K, \epsilon$ quasigeodesic $\gamma$ invariant under $g$ on which $g$ acts as translation.

Proof. Let $g \in G$ be given. Consider the action of $g$ on the Cayley graph $C_S(G)$. The action is simplicial, so $p \mapsto d(p, gp)$ has no strict local minima in the interior of edges, and takes integer values at the vertices (which correspond to elements of $G$). It follows that there is some $h$ for which $d(h, gh)$ is minimal, and we can take $h$ to be an element of $G$ (i.e. a vertex). If $d(h, gh) = k > 8\delta$ then we can join $h$ to $gh$ by a geodesic $\sigma$ and let $\gamma = \cup_i g^i \sigma$. Note that $g$ acts on $\gamma$ by translation through distance $k$; since this is the minimum distance that $g$ moves points of $G$, it follows that $\gamma$ is a $k$-local geodesic (and therefore a $K, \epsilon$ quasigeodesic by Lemma 2.2.13). Note in this case that $g$ has infinite order.

Otherwise there is $h$ moved a least distance by $g$ so that $d(h, gh) \le 8\delta$. Since $G$ acts cocompactly on itself, there are only finitely many conjugacy classes of elements that move some point any uniformly bounded distance, so if $g$ is torsion we are done. If $g$ is not torsion, its orbits are proper, so for any $T$ there is an $N$ so that $d(h, g^N h) > T$; choose $T$ (and $N$) much bigger than some fixed (but big) $n$. Let $\gamma$ be a geodesic from $h$ to $g^N h$. Then for any $0 \le i \le n$ the geodesic $g^i \gamma$ has endpoints within distance $8\delta n$ of the endpoints of $\gamma$. On the other hand, $|\gamma| > T \gg \delta n$ so $\gamma$ contains a segment $\sigma$ of length at least $T - O(\delta n)$ such that $g^i \sigma$ is contained in the $2\delta$ neighborhood of $\gamma$ for $0 \le i \le n$. To see this, consider the quadrilateral with successive vertices $h$, $g^N h$, $g^{i+N} h$ and $g^i h$. Two nonadjacent sides must contain points which are at most $2\delta$ apart. Since $N \gg i$, the sides must be $\gamma$ and $g^i \gamma$. We find $\sigma$ and $g^i \sigma$ in the region where these two geodesics are close.

Consequently, for any $p \in \sigma$ the sequence $p, gp, \cdots, g^n p$ is a $K, \epsilon$ quasigeodesic for some uniform $K, \epsilon$ independent of $n$. In particular there is a constant $C$ (independent of $n$) so that $d(p, g^i p) \ge iC$ for $0 \le i \le n$, and therefore the infinite sequence $g^i p$ for $i \in \mathbb{Z}$ is an $(nC)$-local $K, \epsilon$ quasigeodesic. Since $K, \epsilon$ is fixed, if $n$ is big enough, this infinite sequence is an honest $K', \epsilon'$ quasigeodesic invariant under $g$, by Lemma 2.2.13. Here $K', \epsilon'$ depends only on $\delta$ and $G$, and not on $g$. □

The hypotheses of Lemma 2.3.6 can be weakened considerably, and it is frequently important to study actions which are not necessarily cocompact on $\delta$-hyperbolic spaces which are not necessarily proper. The quasigeodesic $\gamma$ invariant under $g$ is called a quasiaxis. Quasiaxes in $\delta$-hyperbolic spaces are (approximately) unique:

Lemma 2.3.7.

Let $G$ be hyperbolic, and let $g$ have infinite order. Let $\gamma$ and $\gamma'$ be $g$-invariant $K, \epsilon$ quasigeodesics (i.e. quasiaxes for $g$). Then $\gamma$ and $\gamma'$ are a finite Hausdorff distance apart, and this finite distance depends only on $K, \epsilon$ and $\delta$. Consequently the centralizer $C(g)$ is virtually $\mathbb{Z}$.

Proof. Let $p \in \gamma$ and $p' \in \gamma'$ a closest point to $p$. Since $g$ acts on both $\gamma$ and $\gamma'$ cocompactly, there is a constant $C$ so that every point in $\gamma$ or $\gamma'$ is within $C$ from some point in the orbit of $p$ or $p'$. This implies that the Hausdorff distance from $\gamma$ to $\gamma'$ is at most $2C + d(p, p')$; in particular, this distance is finite.

Pick two points on $\gamma$ very far away from each other; each is distance at most $2C + d(p, p')$ from $\gamma'$, and therefore most of the geodesic between them is within distance $2\delta$ of the geodesic between corresponding points on $\gamma'$. But $\gamma$ and $\gamma'$ are themselves $K, \epsilon$ quasigeodesic, and therefore uniformly close to these geodesics. Hence some points on $\gamma$ are within a uniformly bounded distance of $\gamma'$, and therefore all points on $\gamma$ are.

If $h$ commutes with $g$, then $h$ must permute the quasiaxes of $g$. Therefore $h$ takes points on any quasiaxis $\gamma$ for $g$ to within a bounded distance of $\gamma$. Hence $C(g)$, thought of as a subset of $G$, is quasi-isometric to a quasiaxis (that is to say, to $\mathbb{R}$), and is therefore virtually $\mathbb{Z}$. □

This shows that a hyperbolic group cannot contain a copy of $\mathbb{Z} \oplus \mathbb{Z}$ (or, for that matter, the fundamental group of a Klein bottle). This is more subtle than it might seem; $\mathbb{Z} \oplus \mathbb{Z}$ can act freely and properly discontinuously by isometries on a proper $\delta$-hyperbolic space (for example, as a parabolic subgroup of the isometries of $\mathbb{H}^3$).

Example 2.3.8. If $M$ is a closed 3-manifold, then $\pi_1(M)$ is hyperbolic if and only if it does not contain any $\mathbb{Z} \oplus \mathbb{Z}$ subgroup. Note that this includes the possibility that $\pi_1(M)$ is elementary hyperbolic (for instance, finite). This follows from Perelman's Geometrization Theorem [31, 32].

If $g$ is an isometry of any metric space $X$, the translation length of $g$ is the limit $\tau(g) := \lim_{n \to \infty} d_X(p, g^n p)/n$ for some $p \in X$. The triangle inequality implies that the limit exists and is independent of the choice of $p$. Moreover, from the definition, $\tau(g^n) = |n| \tau(g)$ and $\tau(g)$ is a conjugacy invariant.

Lemma 2.3.6 implies that for $G$ acting on itself, $\tau(g) = 0$ if and only if $g$ has finite (and therefore bounded) order. Consequently a hyperbolic group cannot contain a copy of a Baumslag–Solitar group; i.e. a group of the form $BS(p, q) := \langle a, b \mid b a^p b^{-1} = a^q \rangle$. For, we have already shown hyperbolic groups do not contain $\mathbb{Z} \oplus \mathbb{Z}$, and this rules out the case $|p| = |q|$, and if $|p| \ne |q|$ then for any isometric action of $BS(p, q)$ on a metric space, $\tau(a) = 0$.

By properness of $C_S(G)$ and the Morse Lemma, there is a constant $N$ so that for any $g \in G$ the power $g^N$ has an invariant geodesic axis on which it acts by translation. It follows that $\tau(g) \in \mathbb{Q}$, and in fact $\tau(g) \in \frac{1}{N}\mathbb{Z}$; this cute observation is due to Gromov [20].

2.4. The Gromov boundary.

Two geodesic rays $\gamma, \gamma'$ in a metric space $X$ are asymptotic if they are a finite Hausdorff distance apart. The property of being asymptotic is an equivalence relation, and the set of equivalence classes is the Gromov boundary, denoted $\partial_\infty X$. If $X$ is proper and $\delta$-hyperbolic, and $x$ is any basepoint, then every equivalence class contains a ray starting at $x$. For, if $\gamma$ is a geodesic ray, and $g_i \in \gamma$ goes to infinity, then by properness, any collection of geodesics $xg_i$ contains a subsequence which converges on compact subsets to a ray $\gamma'$. By $\delta$-thinness of the triangles with vertices $x$, $\gamma(0)$ and $g_i$, each geodesic $xg_i$ is contained in a uniformly bounded neighborhood of $\gamma$, so the same is true of $\gamma'$; in particular, $\gamma'$ is asymptotic to $\gamma$.

We give $\partial_\infty X$ the topology of convergence on compact subsets of equivalence classes. That is, $\gamma_i \to \gamma$ if and only if every subsequence of the $\gamma_i$ contains a further subsequence whose equivalence classes have representatives that converge on compact subsets to some representative of the equivalence class of $\gamma$.

Lemma 2.4.1.

Let $X$ be a $\delta$-hyperbolic proper geodesic metric space. Then $\partial_\infty X$ is compact.

Proof. If $\gamma_i$ is any sequence of rays, and $\gamma_i'$ is an equivalent sequence starting at a basepoint $x$, then by properness $\gamma_i'$ has a subsequence which converges on compact subsets. □

In fact, we can define a (compact) topology on $\overline{X} := X \cup \partial_\infty X$ by saying that $x_i \to \gamma$ if and only if every subsequence of a sequence of geodesics $xx_i$ contains a further subsequence which converges on compact subsets to a representative of $\gamma$. With this topology, $\overline{X}$ is compact, $\partial_\infty X$ is closed in $\overline{X}$, and the inclusion of $X$ into $\overline{X}$ is a homeomorphism onto its image.

A bi-infinite geodesic $\gamma$ determines two (distinct) points in $\partial_\infty X$; we call these the endpoints of $\gamma$. Two geodesics with the same (finite or infinite) endpoints are Hausdorff distance at most $2\delta$ apart. Conversely, any two distinct points in $\partial_\infty X$ are spanned by an infinite geodesic $\gamma$. For, if $\gamma_1, \gamma_2$ are two infinite rays (starting at a basepoint $x$ for concreteness), and $g_i, h_i$ are points on $\gamma_1, \gamma_2$ respectively going to infinity, some point $p_i$ on any geodesic $g_i h_i$ is within $\delta$ of both $xg_i$ and $xh_i$, and if $p_i \to \infty$ then $\gamma_1$ and $\gamma_2$ would be a finite Hausdorff distance apart. Otherwise some subsequence of the $p_i$ converges to $p$, and the geodesics $g_i h_i$ converge on compact subsets to a (nonempty!) bi-infinite geodesic $\gamma$ through $p$ asymptotic to both $\gamma_1$ and $\gamma_2$.

Evidently, geodesic triangles with some or all endpoints at infinity are $\delta'$-thin for some $\delta'$ depending only on $\delta$ (one can take $\delta' = 20\delta$). By abuse of notation, in the sequel we will call a metric space $\delta$-hyperbolic if all geodesic triangles (even those with some endpoints at infinity) are $\delta$-thin.

Let $X, Y$ be hyperbolic geodesic metric spaces. Then any quasi-isometric map $\phi : X \to Y$ extends uniquely to a continuous map $\partial_\infty X \to \partial_\infty Y$. In particular, the Gromov boundary $\partial_\infty X$ depends (up to homeomorphism) only on the quasi-isometry type of $X$, and $\mathrm{QI}(X)$ acts on $\partial_\infty X$ by homeomorphisms.

If $G$ is a hyperbolic group, we define $\partial_\infty G$ to be the Gromov boundary of some (any) $C_S(G)$.

Example 2.4.2. If $G$ is free, $\partial_\infty G$ is a Cantor set. If $G$ is a $\pi_1$ of a closed surface with negative Euler characteristic, $\partial_\infty G$ is a circle. If $G$ is a convex cocompact Kleinian group, $\partial_\infty G$ is homeomorphic to the limit set. For example, if $G$ is the fundamental group of a hyperbolic 3-manifold with totally geodesic boundary, $\partial_\infty G$ is a Sierpinski carpet.

Figure 4. The Sierpinski carpet and the Menger sponge.

In fact, a theorem of Kapovich–Kleiner [25] says that if $G$ is a hyperbolic group which does not split over a finite or virtually cyclic subgroup, and if $\partial_\infty G$ is 1-dimensional (in the topological sense of dimension), then $\partial_\infty G$ is homeomorphic to the circle, the Sierpinski carpet, or the Menger sponge.

Evidently, $\partial_\infty G$ is empty if and only if $G$ is finite, and if $\partial_\infty G$ is nonempty, it has at least two points, and has exactly two points if and only if $G$ is itself quasi-isometric to the geodesic joining these two points, which holds if and only if $G$ is virtually $\mathbb{Z}$.

If $g \in G$ has infinite order, a quasiaxis $\gamma$ is asymptotic to two points $p^\pm \in \partial_\infty G$. Under (positive) powers of $g$, points stay a constant distance from $\gamma$, and move towards one of the endpoints, say $p^+$. As homeomorphisms from $\overline{X}$ to itself, the elements $g^n$ with $n \to \infty$ converge uniformly (in the compact-open topology) on $\overline{X} - p^-$ to the constant map to $p^+$. We call $p^+$ the attracting endpoint and $p^-$ the repelling endpoint of $g$; the action of $g$ on $\partial_\infty G$ is sometimes expressed by saying that it has source-sink dynamics.

Example 2.4.3. Let $g, h \in G$ be of infinite order, with quasiaxes $\gamma$ and $\gamma'$. If $\gamma$ and $\gamma'$ share an endpoint (without loss of generality the attracting endpoint of each) and $p$ is close to both $\gamma$ and $\gamma'$, then there are $n_i, m_i \to \infty$ for which $d(h^{-m_i} g^{n_i} p, p)$ is bounded. Since the action of $G$ on its Cayley graph is properly discontinuous, it follows that there are distinct $i, j$ with $h^{-m_i} g^{n_i} = h^{-m_j} g^{n_j}$ so that $h^m = g^n$ for some positive $n, m$. In particular, in this case $g$ and $h$ together generate a virtually $\mathbb{Z}$ subgroup, and their quasiaxes have the same endpoints. Otherwise the endpoints are disjoint, and because of the source-sink dynamics, Klein's pingpong argument implies that sufficiently large powers $g^n, h^m$ generate a (nonabelian) free subgroup of $G$.

Lemma 2.4.4.

Suppose G is nonelementary. Then the action of G on ∂ ∞ G isminimal; i.e. every orbit is dense. Consequently ∂ ∞ G is inﬁnite and perfect. HE ERGODIC THEORY OF HYPERBOLIC GROUPS 13

Proof. If G is nonelementary, there are g, h whose quasiaxes have distinct endpoints p ± and q ± respectively. If r ∈ ∂ ∞ G is arbitrary, then either g n r → p + or g n hr → p + ; it follows that every attracting/repelling point is in the closure of every orbit.Now let γ be a geodesic from p − to p + and let γ ′ be a geodesic ray asymptoticto r . Pick s on γ and let g i be a sequence of elements with g i ( s ) ∈ γ ′ convergingto r . At most one component of γ − s can come close to the basepoint x . Hencethere is some subsequence so that either g i p + → r or g i p − → r , and therefore everypoint is in the closure of the orbit of some attracting/repelling point. This provesthe lemma. (cid:3) Another way to see the compactiﬁcation ∂ ∞ X is in terms of (equivalence classesof) horofunctions . Deﬁnition 2.4.5 (horofunction) . Let γ be a geodesic ray parameterized by length.The horofunction (also called the Busemann function ) associated to γ is the limit b γ ( x ) := lim t →∞ d X ( x, γ ( t )) − t The level sets of horofunctions are called horospheres .This limit exists and is ﬁnite, by the triangle inequality. Moreover, it is 1-Lipschitz. If γ and γ ′ are asymptotic, then there is some constant C ( γ, γ ′ ) so that | ( b γ − b γ ′ ) − C | ≤ δ . If x is the endpoint of γ , we let b x denote any horofunctionof the form b γ , and say that b x is centered at x .Here is another way to deﬁne b γ without reference to γ . On any proper metricspace, the set of 1-Lipschitz functions mod constants is compact (in the topologyof convergence on compact subsets). For any x ∈ X the function d X ( x, · ) : X → R is 1-Lipschitz, and x → d X ( x, · ) embeds X in the space of 1-Lipschitz functionson X mod constants. The closure of this image deﬁnes a natural compactiﬁcationof X ; quotienting further by bounded functions gives X . For each x ∈ ∂ ∞ X thepreimage is the set of equivalence classes of functions b x . 
In this way we think of $b_x$ as a normalization of the function which measures “distance to $x$”.

The space $\partial_\infty X$ can be metrized following Gromov (see [20]).

Definition 2.4.6. Fix some basepoint $x$ and some constant $a > 1$. The $a$-length of a rectifiable path $\gamma$ in $X$ is the integral along $\gamma$ of $a^{-d_X(x,\cdot)}$, and the $a$-distance from $y$ to $z$, denoted $d^a_X(y,z)$, is the infimum of the $a$-lengths of paths between $y$ and $z$.

A straightforward calculation shows that there is an $a_0 > 1$ so that for $a < a_0$, the $a$-length defines a metric on $\overline{X}$. In fact, any $a_0$ with $\delta \log(a_0) \ll 1$ will work. If $a$ is too big, $a$-length still extends to a pseudo-metric on $\overline{X}$, but now distinct points of $\partial_\infty X$ might be joined by a sequence of paths with $a$-length going to 0. Increasing $a$ will decrease the Hausdorff dimension of $\partial_\infty X$; of course, the Hausdorff dimension must always be at least as big as the topological dimension. In any case, it follows that $\partial_\infty X$ is metrizable.

The following lemma is useful to compare length and $a$-length.

Lemma 2.4.7.

For $a < a_0$ there is a constant $\lambda$ so that for all points $y, z \in \partial_\infty X$ there is an inequality
$$\lambda^{-1} a^{-d_X(x,yz)} \le d^a_X(y,z) \le \lambda a^{-d_X(x,yz)}$$
where $d_X(x,yz)$ is the ordinary distance from the basepoint $x$ to the geodesic $yz$.

For a proof, see [13].

The quantity $d_X(x,yz)$ is sometimes abbreviated by $(y|z)$ (the basepoint $x$ is suppressed in this notation), and called the Gromov product. So we can also write $\lambda^{-1} a^{-(y|z)} \le d^a_X(y,z) \le \lambda a^{-(y|z)}$. Because of this inequality, different choices of $a$ give rise to Hölder equivalent metrics on $\partial_\infty X$. If $X$ is a group $G$, we take id as the basepoint, by convention.

Remark 2.4.8. With our notation, $(y|z) := d_X(x,yz)$ is ambiguous, since it depends on a choice of geodesic from $y$ to $z$. Since we only care about $(y|z)$ up to a uniform additive constant, we ignore this issue. One common normalization, adopted by Gromov, is to use the formula $(y|z) := \frac{1}{2}\left(d_X(x,y) + d_X(x,z) - d_X(y,z)\right)$. These definitions are interchangeable for our purposes, as the ambiguity can always be absorbed into some unspecified constant.

An action of a group $G$ by homeomorphisms on a compact metrizable space $M$ is said to be a convergence action if the induced action on the space $M^3 - \Delta$ of distinct ordered triples is properly discontinuous.

Lemma 2.4.9.

The action of $G$ on $\partial_\infty G$ is a convergence action. Moreover, the action on the space of distinct triples is cocompact.

Proof. If $x, y, z$ is a distinct triple of points in $\partial_\infty G$, there is a point $p$ within distance $\delta$ of all three geodesics $xy, yz, zx$; moreover, the set of such points has uniformly bounded diameter in $G$. This defines an approximate map from distinct triples to points in $G$. Since the action of $G$ on itself is cocompact, the same is true for the action on the space of distinct triples. Similarly, if the action of $G$ on the space of distinct triples were not properly discontinuous, we could find two bounded regions in $G$ and infinitely many $g_i$ in $G$ taking some point in one bounded region to some point in the other, which is absurd. □

The converse is a famous theorem of Bowditch:

Theorem 2.4.10 (Bowditch's convergence theorem [3] Thm. 0.1). Let $G$ act faithfully, properly discontinuously and cocompactly on the space of distinct triples of some perfect compact metrizable space $M$. Then $G$ is hyperbolic and $M$ is $G$-equivariantly homeomorphic to $\partial_\infty G$.

2.5. Patterson–Sullivan measure.

The results in this section are due to Coornaert [12], although because of our narrower focus we are able to give somewhat different and shorter proofs. However, by and large our proofs, like Coornaert's, are obtained by directly generalizing ideas of Sullivan [43] in the context of Kleinian groups.

Let $G$ be a hyperbolic group, and let $G_{\le n}$ denote the set of elements of (word) length $\le n$, with respect to some fixed generating set. The critical exponent $h(G)$ (also called the volume entropy of $G$) is the quantity
$$h(G) := \limsup_{n \to \infty} \frac{1}{n} \log |G_{\le n}|$$
in other words, the exponential growth rate of $G$. Since every nonelementary hyperbolic group contains many free groups (Example 2.4.3), $h(G) = 0$ if and only if $G$ is elementary.
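As a numerical sanity check on this definition, the following Python sketch estimates the critical exponent of a free group with respect to a free basis. It assumes (this is not taken from the text) the standard count of reduced words: the identity is the only element of length 0, there are $2r$ elements of length 1 in rank $r$, and each further letter can be chosen in $2r - 1$ ways. The function names are illustrative only.

```python
import math

def ball_sizes_free_group(rank, n_max):
    """Return [|G_{<=0}|, ..., |G_{<=n_max}|] for a free group on a free basis.

    Assumption: elements are counted as reduced words, so the sphere of
    radius n >= 1 has 2r * (2r - 1)**(n - 1) elements.
    """
    letters = 2 * rank
    sphere = 1                      # only the identity has length 0
    balls = [1]
    for n in range(1, n_max + 1):
        sphere = letters if n == 1 else sphere * (letters - 1)
        balls.append(balls[-1] + sphere)
    return balls

def growth_estimates(balls):
    """The quantities (1/n) log |G_{<=n}| for n = 1, 2, ..."""
    return [math.log(b) / n for n, b in enumerate(balls) if n > 0]

balls = ball_sizes_free_group(2, 200)
estimates = growth_estimates(balls)
print(estimates[0], estimates[-1], math.log(3))
```

In rank 2 the ball sizes are $2 \cdot 3^n - 1$, so the estimates decrease towards $\log 3$. In rank 1 the ball sizes are $2n + 1$ and the estimates tend to 0, matching the fact that $\mathbb{Z}$ is elementary.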

Define the (Poincaré) zeta function by the formula
$$\zeta_G(s) := \sum_{g \in G} e^{-s|g|}$$
Then $\zeta_G(s)$ diverges if $s < h(G)$ and converges if $s > h(G)$.

Lemma 2.5.1.

The zeta function diverges at $s = h(G)$.

Proof. We will show in §3 that for any hyperbolic group $G$ and any generating set $S$ there is a regular language $L \subset S^*$ consisting of geodesics, which evaluates bijectively to $G$. In particular, $|G_{\le n}| = |L_{\le n}|$ for any $n$. In any regular language $L$ the generating function $\sum |L_{\le n}| t^n$ is rational (Theorem 3.1.3); i.e. it is the power series expansion of $p(t)/q(t)$ for some integral polynomials $p, q$, and consequently $C^{-1}(e^{hn} n^k) \le |L_{\le n}| \le C(e^{hn} n^k)$ for some real $h$ and non-negative integer $k$, and constant $C$. Evidently, for $L$ as above, $h = h(G)$ and the zeta function diverges at $h$. □

For $s > h(G)$ construct a probability measure $\nu_s$ on $\overline{G}$ (i.e. on $G \cup \partial_\infty G$) supported in $G$, by putting a Dirac mass of size $e^{-s|g|}/\zeta_G(s)$ at each $g \in G$. As $s$ converges to $h$ from above, this sequence of probability measures contains a subsequence which converges to a limit $\nu$. Since the zeta function diverges at $h$, the limit $\nu$ is supported on $\partial_\infty G$. This measure is called a Patterson–Sullivan measure, by analogy with the work of Patterson and Sullivan [30, 43] on Kleinian groups.

For any $g$, the pushforward of measure $g_*\nu_s$ is defined by $g_*\nu_s(A) = \nu_s(g^{-1}A)$, and similarly for $g_*\nu$. For any $g, g'$ there is an inequality $|g'| - |g| \le |gg'| \le |g'| + |g|$. From the definition of $\nu_s$, this implies that $g_*\nu_s$ is absolutely continuous with respect to $\nu_s$, and its Radon–Nikodym derivative satisfies $e^{-s|g|} \le d(g_*\nu_s)/d\nu_s \le e^{s|g|}$. Passing to a limit we deduce that $e^{-h|g|} \le d(g_*\nu)/d\nu \le e^{h|g|}$.

The most important property of the measure $\nu$ is a refinement of this inequality, which can be expressed by saying that it is a so-called quasiconformal measure of dimension $h$. The “conformal” structure on $\partial_\infty X$ is defined using the $a$-distance for some fixed $a > 1$.

Definition 2.5.2 (Coornaert). For $g \in G$ define $j_g : \partial_\infty X \to \mathbb{R}$ by $j_g(y) = a^{b_y(\mathrm{id}) - b_y(g)}$ for some horofunction $b_y$ centered at $y$.
A probability measure $\nu$ on $\partial_\infty X$ is a quasiconformal measure of dimension $D$ if $g_*\nu$ is absolutely continuous with respect to $\nu$ for every $g \in G$, and there is some constant $C$ independent of $g$ so that
$$C^{-1} j_g(y)^D \le d(g_*\nu)/d\nu \le C j_g(y)^D$$

Notice that the ambiguity in the choice of horofunction $b_y$ is absorbed into the definition of $j_g$ (which only depends on $b_y$ mod constant functions) and the constant $C$. The support of any quasiconformal measure is evidently closed and $G$-invariant, so by Lemma 2.4.4, it is all of $\partial_\infty G$.

From the definition of the Radon–Nikodym derivative, $\nu$ is a quasiconformal measure of dimension $D$ if there is a constant $C$ so that for all $y$ we can find a neighborhood $V$ of $y$ in $\overline{X}$ for which
$$C^{-1} j_g(y)^D \nu(A) \le \nu(g^{-1}A) \le C j_g(y)^D \nu(A)$$
for all $A \subset V$.

Remark 2.5.3. For some reason, Coornaert chooses to work with pullbacks of measure $g^*\nu := g^{-1}_*\nu$ instead of pushforward. Therefore the roles of $g$ and $g^{-1}$ are generally interchanged between our discussion and Coornaert's.

Theorem 2.5.4 (Coornaert [12] Thm. 5.4). The measure $\nu$ is a quasiconformal measure of dimension $D$ where $D = h/\log a$.

Proof. Evidently the support of $\nu$ is $G$-invariant, and is therefore equal to all of $\partial_\infty G$. Let $y \in \partial_\infty X$, let $b_y$ be a horofunction centered at $y$, and let $g \in G$. By $\delta$-thinness and the definition of a horofunction, $d(g,z) - d(\mathrm{id},z)$ is close to $b_y(g) - b_y(\mathrm{id})$ for $z$ sufficiently close to $y$. In particular, there is a neighborhood $V$ of $y$ in $\overline{X}$ so that $|g^{-1}z| - |z| - C \le b_y(g) - b_y(\mathrm{id}) \le |g^{-1}z| - |z| + C$ for some $C$, and for all $z$ in $V$.

For each $s > h$ we have $g_*\nu_s(z)/\nu_s(z) = \nu_s(g^{-1}z)/\nu_s(z) = e^{-s(|g^{-1}z| - |z|)}$. Taking the limit as $s \to h$ and defining $D$ by $a^D = e^h$ proves the theorem. □

To make use of this observation, we introduce the idea of a shadow, following Sullivan.

Definition 2.5.5. For $g \in G$ and $R$ a positive real number, the shadow $S(g,R)$ is the set of $y \in \partial_\infty G$ such that every geodesic ray from id to $y$ comes within distance $R$ of $g$.

Said another way, $y$ is in $S(g,R)$ if $g$ comes within distance $R$ of any geodesic from id to $y$. Given $R > \delta$, for any fixed $n$ the shadows $S(g,R)$ with $|g| = n$ cover $\partial_\infty G$ efficiently:

Lemma 2.5.6. Fix $R$. Then there is a constant $N$ so that for any $y \in \partial_\infty G$ and any $n$ there is at least one and there are at most $N$ elements $g$ with $|g| = n$ and $y \in S(g,R)$.

Proof. If $R > \delta$, if $\gamma$ is any geodesic from id to $y$, and if $g$ is any point on $\gamma$, then $y \in S(g,R)$. Conversely, if $g$ and $h$ are any two elements with $|g| = |h|$ and $y \in S(g,R) \cap S(h,R)$ then $d(g,h) \le R$. □

Sullivan's fundamental observation is that the action of $g^{-1}$ on $S(g,R)$ is uniformly close to being linear, in the sense that the derivative $d(g_*\nu)/d\nu$ varies by a bounded multiplicative constant on $S(g,R)$:

Lemma 2.5.7. Fix $R$. Then there is a constant $C$ so that for any $y \in S(g,R)$ there is an inequality
$$C^{-1} a^{|g|} \le j_g(y) \le C a^{|g|}$$

Proof. Recall $j_g(y) = a^{b_y(\mathrm{id}) - b_y(g)}$ for some horofunction $b_y$. But by $\delta$-thinness and the definition of a shadow, there is a constant $C'$ so that $|g| - C' \le b_y(\mathrm{id}) - b_y(g) \le |g| + C'$ for any $y$ in $S(g,R)$. □

From this one readily obtains a uniform estimate on the measure of a shadow:


Lemma 2.5.8. Fix $R$. Then there is a constant $C$ so that for any $g \in G$ there is an inequality
$$C^{-1} a^{-|g|D} \le \nu(S(g,R)) \le C a^{-|g|D}$$

Proof.

Let m < ν , and ﬁx m < m < ∂ ∞ G there is some ǫ so that every ball in ∂ ∞ G of diameter ≤ ǫ has mass at most m . Now, g − S ( g, R ) is the set of y ∈ ∂ ∞ G for whichevery geodesic ray from g − to y comes within distance R of id. As R → ∞ , thediameter of ∂ ∞ G − g − S ( g, R ) goes to zero uniformly in g (this follows from thequasi-equivalence of d aX ( y, z ) and a − ( y | z ) ; see Lemma 2.4.7). Consequently there issome R so that for all R ≥ R the measure ν ( g − S ( g, R )) is between 1 − m and1, independent of g .Now, by Lemma 2.5.7 and the deﬁnition of a quasiconformal measure, there is aconstant C so that C − a | g | D ≤ ν ( g − S ( g, R )) /ν ( S ( g, R )) ≤ C a | g | D Taking reciprocals, and using the fact that 1 − m ≤ ν ( g − S ( g, R )) ≤ (cid:3) Note that the argument shows that ν has no atoms, since any y ∈ ∂ ∞ G iscontained in some shadow of measure ≤ Ca − Dn for any n . We deduce the followingcorollary. Corollary 2.5.9 (Coornaert [12] Thm. 7.2) . Let G be a hyperbolic group. Thenthere is a constant C so that C − e hn ≤ | G ≤ n | ≤ Ce hn for all n .Proof. The lower bound is proved in §

3, so we just need to prove the upper bound.For each g with | g | = n Lemma 2.5.8 says e − hn = a − Dn ≤ C ν ( S ( g, R )). Onthe other hand, Lemma 2.5.6 says that every point y ∈ ∂ ∞ G is in at most N sets S ( g, R ) with | g | = n , so | G n | e − hn C − ≤ X | g | = n ν ( S ( g, R )) ≤ N ν ( ∪ | g | = n S ( g, R )) = N (cid:3) Corollary 2.5.9 has important consequences that we will explore in § ∂ ∞ G .An action of a group G on a space X is said to be ergodic for some measure ν on X if for any two subsets A , B of X with ν ( A ) , ν ( B ) > g ∈ G with ν ( g ( A ) ∩ B ) > Corollary 2.5.10 (Coornaert [12] Cor. 7.5 and Thm. 7.7) . Let ν be a quasiconfor-mal measure on ∂ ∞ G of dimension D . Then ν is quasi-equivalent to D -dimensionalHausdorﬀ measure; i.e. there is a constant C so that C − H D ( A ) ≤ ν ( A ) ≤ CH D ( A ) for any A . In particular, the space ∂ ∞ G has Hausdorﬀ dimension D ,and its D -dimensional Hausdorﬀ measure is ﬁnite and positive. Moreover, the ac-tion of G on ∂G is ergodic for ν . Proof.

Evidently, the second and third claims follow from the first (if $A$ is a $G$-invariant subset of $\partial_\infty G$ of positive $\nu$-measure, the restriction of $\nu$ to $A$ is a quasiconformal measure of dimension $D$, and is therefore quasi-equivalent to $H^D$ and thence to $\nu$. In particular, $A$ has full measure). So it suffices to show that $\nu$ and $H^D$ are quasi-equivalent.

Since $C_1^{-1} a^{-(y|z)} \le d^a_X(y,z) \le C_1 a^{-(y|z)}$ it follows that every metric ball $B(y,r)$ in $\partial_\infty G$ can be sandwiched between two shadows $S(g_1,R) \subset B(y,r) \subset S(g_2,R)$ with $a^{-|g_1|} \ge r/C_2$ and $a^{-|g_2|} \le rC_2$. From Lemma 2.5.8 we obtain $C_3^{-1} r^D \le \nu(B(y,r)) \le C_3 r^D$. From this and the definition of Hausdorff measure, we will obtain the theorem.

If $A$ is any measurable set, cover $A$ by balls $U_i$ of radius $\epsilon_i \le \epsilon$. Then
$$\nu(A) \le \nu(\cup_i U_i) \le \sum_i \nu(U_i) \le C_3 \sum_i \epsilon_i^D$$
so letting $\epsilon \to 0$ we obtain $\nu(A) \le C_3 H^D(A)$.

The following proof of the reverse inequality was suggested to us by Curt McMullen. For any $\delta$ let $K$ be compact and $U$ open so that $K \subset A \subset U$ and both $\nu(U - K)$ and $H^D(U - K)$ are less than $\delta$. By compactness, there is an $\epsilon$ so that every ball of radius $\le \epsilon$ centered at a point in $K$ is contained in $U$. Now inductively cover $K$ by balls $U_1, U_2, \cdots$ of non-increasing radius $\epsilon_i \le \epsilon$ in such a way that the center of each $U_i$ is not in $U_j$ for any $j < i$. Then the balls with the same centers and half the radii are disjoint, so
$$\sum \epsilon_i^D = 2^D \sum (\epsilon_i/2)^D \le C_4 \nu(U)$$
and therefore $H^D(K) \le C_4 \nu(U)$. Taking $\delta \to 0$ we obtain $H^D(A) \le C_4 \nu(A)$ and we are done. □

Remark 2.5.11. Coornaert only gives the proof of the inequality $\nu(A) \le C H^D(A)$ in his paper, referring the reader to Sullivan [43] for the proof of $C^{-1} H^D(A) \le \nu(A)$. However, there is a gap in Sullivan's proof of the reverse inequality, of which the reader should be warned.

Remark 2.5.12. The approximate linearity of $g^{-1}$ on $S(g,R)$ has many other applications. For example, see the proof of Theorem 1 in [42].

3. Combings

On a Riemannian manifold, a “geodesic” is just a smooth path that locally minimizes length (really, energy). A sufficiently long geodesic is typically not globally length minimizing, and the entire subject of Morse theory is devoted to the difference. By contrast, one of the most important qualitative features of negative curvature is that (quasi)-geodesity is a local property (i.e. Lemma 2.2.13).

This localness translates into an important combinatorial property, known technically as finiteness of cone types. This is the basis of Cannon's theory of hyperbolic groups, and for the more general theory of automatic groups and structures (see [15] for more details).

3.1. Regular languages.

Let $S$ be a finite set, and let $S^*$ denote the set of finite words in the alphabet $S$. An automaton is a finite directed graph $\Gamma$ with a distinguished initial vertex, and edges labeled by elements of $S$ in such a way that each vertex has at most one outgoing edge with a given label. Some subset of the vertices of $\Gamma$ are called accept states. A word $w$ in $S^*$ determines a simplicial path in $\Gamma$, by starting at the initial vertex, by reading the letters of $w$ (from left to right) one by one, and by moving along the corresponding edge of $\Gamma$ if it exists, or halting if not. Associated to $\Gamma$ there is a subset $L \subset S^*$ consisting of precisely those words that can be read in their entirety without halting, and for which the terminal vertex of the associated path is an accept state. One says that $L$ is parameterized by (paths in) $\Gamma$.

Definition 3.1.1.

A subset $L \subset S^*$ is a regular language if there is a finite directed graph $\Gamma$ as above that parameterizes $L$.

Note that $\Gamma$ is not part of the data of a regular language, and for any given regular language there will be many graphs that parameterize it. A language is prefix-closed if, whenever $w \in L$, every prefix of $w$ is also in $L$ (the empty word is a prefix of every word).

Lemma 3.1.2. If $L$ is prefix-closed and regular, there is a $\Gamma$ parameterizing $L$ for which every vertex is an accept state.

Proof. If $\Gamma$ is any graph that parameterizes $L$, remove all non-accept vertices and the edges into and out of them. □

Theorem 3.1.3 (Generating function). Let $L$ be a regular language, and for each $n$, let $L_n$ denote the set of elements of length $n$, and $L_{\le n}$ the set of elements of length $\le n$. Let $s(t) := \sum |L_n| t^n$ and $b(t) := \sum |L_{\le n}| t^n$ be (formal) generating functions for $|L_n|$ and $|L_{\le n}|$ respectively. Then $s(t)$ and $b(t)$ are rational; i.e. they agree as power series expansions with some ratio of integral polynomials in $t$.

Proof. Note that $b(t) = s(t)/(1-t)$ so it suffices to prove the theorem for $s(t)$. Let $\Gamma$ parameterize $L$, and let $M$ be the adjacency matrix of $\Gamma$; i.e. $M_{ij}$ is equal to the number of edges from vertex $i$ to vertex $j$. Let $v$ be the vector with a 1 in the initial state, and 0 elsewhere, and let $v_a$ be the vector with a 1 in every accept state, and 0 elsewhere. Then $|L_n| = v^T M^n v_a$.

A formal power series $A(t) := \sum a_n t^n$ is rational if and only if its coefficients satisfy a linear recurrence; i.e. if there are constants $c_0, \cdots, c_d$ (not all zero) so that $c_0 a_n + c_1 a_{n-1} + \cdots + c_d a_{n-d} = 0$ for all $n \ge d$. For, $A(t)(c_0 + c_1 t + \cdots + c_d t^d)$ vanishes in degree $\ge d$, and is therefore a polynomial (reversing this argument proves the converse).

If $p(t) = \sum p_i t^{d-i}$ is the characteristic polynomial of $M$, then $p(M) = 0$, and
$$0 = v^T M^{n-d} p(M) v_a = p_0 |L_n| + p_1 |L_{n-1}| + \cdots + p_d |L_{n-d}|$$
proving the theorem. □

Another way of expressing $s(t)$, more useful in some ways, is as follows.

Proposition 3.1.4.

Let $L$ be a regular language. Then there is an integer $D$ so that for each value of $n$ mod $D$ either $|L_n|$ is eventually zero, or there are finitely many constants $\lambda_i$ and polynomials $p_i$ so that $|L_n| = p_1(n)\lambda_1^n + \cdots + p_k(n)\lambda_k^n$ for all sufficiently large $n$.

For a proof see e.g. [18] Thm. V.3. In particular, either $|L_{\le n}|$ has polynomial growth, or $C^{-1}(n^k \lambda^n) \le |L_{\le n}| \le C(n^k \lambda^n)$ for some real $\lambda$ and integer $k$, and constant $C$.

3.2. Cannon's theorem.

Let $S$ be a set. A total order $\prec$ on $S$ extends to a unique lexicographic (or dictionary) order on $S^*$ as follows:
(1) the empty word precedes everything;
(2) if $u$ and $v$ are both nonempty and start with different letters $s, t \in S$ then $u \prec v$ if and only if $s \prec t$; and
(3) if $u \prec v$ and $w$ is arbitrary, then $wu \prec wv$.

If $G$ is a group and $S$ is a generating set for $G$, there are finitely many geodesic words representing any given element; the lexicographically first geodesic is therefore a canonical representative for each element of $G$, and determines a language $L \subset S^*$ that bijects with $G$ under evaluation. We denote evaluation by overline, so if $u \in S^*$, we denote the corresponding element of $G$ by $\overline{u}$. We similarly denote length of an element of $S^*$ by $|\cdot|$. So we always have $|\overline{u}| \le |u|$ with equality if and only if $u$ is geodesic.

Given $g \in G$ the cone type of $g$, denoted $\mathrm{cone}(g)$, is the set of $h \in G$ for which some geodesic from id to $gh$ passes through $g$. For any $n$, the $n$-level of $g$ is the set of $h$ in the ball $B_n(\mathrm{id})$ such that $|gh| < |g|$. Cannon showed that the $n$-level (for $n$ sufficiently large) determines the cone type, and therefore that there are only finitely many cone types.

Lemma 3.2.1 (Cannon [10] Lem. 7.1 p. 139). The $2\delta + 1$ level of an element determines its cone type.

Proof. Let $g$ and $h$ have the same $2\delta + 1$ level, and let $u, v$ be geodesics with $\overline{u} = g$ and $\overline{v} = h$. Only id has an empty $2\delta + 1$ level, so we may assume $u, v$ both have length $\ge 1$. We prove the lemma by induction. Suppose $uw$, $vw$ and $uws$ are geodesics, where $s \in S$. We must show that $vws$ is a geodesic. Suppose to the contrary that there is some $w_1 w_2$ with $\overline{w_1 w_2} = \overline{vws}$, where $|w_1| = |v| - 1$ and $|w_2| \le |w| + 1$. Then $h^{-1}\overline{w_1}$ is in the $2\delta + 1$ level of $h$, which agrees with the $2\delta + 1$ level of $g$, and therefore $|g h^{-1} \overline{w_1}| < |g|$. But then concatenating a geodesic representative of $g h^{-1} \overline{w_1}$ with $w_2$ gives a shorter path to $\overline{uws}$, certifying that $uws$ is not geodesic, contrary to assumption. □

Figure 5. A shortcut $w_2$ from the $2\delta + 1$ level of $h$ to $\overline{vws}$ gives a shortcut from the $2\delta + 1$ level of $g$ to $\overline{uws}$. This figure is adapted from [15].
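To make cone types and levels concrete, here is a small Python sketch for the free group of rank 2 with respect to a free basis. Everything in it is an illustrative assumption rather than part of the text: elements are encoded as reduced words in the letters `a`, `A`, `b`, `B` (capitals denote inverses), the Cayley graph is a tree so one may take $\delta = 0$ and $2\delta + 1 = 1$, and cone types are only compared after truncating to a ball of radius 3. It is a finite check on a small ball, not a proof.

```python
GENS = "aAbB"  # assumed encoding: A = a^-1, B = b^-1
INV = {"a": "A", "A": "a", "b": "B", "B": "b"}

def reduce_word(word):
    """Freely reduce a word; in a free group this is the unique geodesic."""
    out = []
    for x in word:
        if out and out[-1] == INV[x]:
            out.pop()
        else:
            out.append(x)
    return "".join(out)

def ball(n):
    """All reduced words of length <= n, i.e. the ball B_n(id)."""
    words, frontier = [""], [""]
    for _ in range(n):
        frontier = [w + x for w in frontier for x in GENS
                    if not w or w[-1] != INV[x]]
        words.extend(frontier)
    return words

def truncated_cone(g, k):
    """Elements h with |h| <= k and |gh| = |g| + |h| (cone(g) cut off at radius k)."""
    return frozenset(h for h in ball(k)
                     if len(reduce_word(g + h)) == len(g) + len(h))

def level(g, n):
    """The n-level of g: elements h of B_n(id) with |gh| < |g|."""
    return frozenset(h for h in ball(n) if len(reduce_word(g + h)) < len(g))

elements = ball(4)
cone_types = {truncated_cone(g, 3) for g in elements}

# Group elements by their 1-level and record which truncated cones occur.
cones_by_level = {}
for g in elements:
    cones_by_level.setdefault(level(g, 1), set()).add(truncated_cone(g, 3))

print(len(cone_types))
print(all(len(cones) == 1 for cones in cones_by_level.values()))
```

On this sample there are five truncated cone types (one for the identity and one for each possible last letter), and elements with the same 1-level always have the same truncated cone, as the lemma predicts in this case.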

The following theorem is implicit in [10], though expressed there in somewhat different language.

Theorem 3.2.2 (Cannon [10]) . Let G be a hyperbolic group, and S a symmetricgenerating set. Fix a total order ≺ on S . Then the language of lexicographicallyﬁrst geodesics is preﬁx-closed and regular.Proof. That this language is preﬁx-closed is obvious. We show it is regular bydescribing an explicit parameterizing graph.As a warm up, we show ﬁrst that the language of all geodesics is regular. Aparameterizing graph can be taken as follows. The vertices (all accept states) areprecisely the set of cone types, and there is an edge labeled s from a cone type of theform cone( g ) to one of the form cone( gs ) whenever | gs | = | g | + 1. By the deﬁnitionof cone types, this is well-deﬁned. By Lemma 3.2.1, the number of cone types isﬁnite, so this is a ﬁnite graph. By construction, this graph exactly parameterizesthe language of all geodesics.Now ﬁx a total order ≺ on S . For each g ∈ G , let u g be the lexicographicallyﬁrst geodesic from id to g . For each g ∈ G a competitor of g is some h with | h | = | g | ,with u h ≺ u g , and for which d ( u h | ≤ i , u g | ≤ i ) ≤ δ for all i , where u g | ≤ i denotes thepreﬁx of u g of length i , and similarly for u h | ≤ i (this is described by saying that u h synchronously fellow-travels u g ).If there is some g ′ with | g ′ | = | g | + d ( g, g ′ ) and | g ′ | = | h | + d ( h, g ′ ) then by δ -thinness and the deﬁnition of geodesics, u h synchronously fellow-travels u g . Itfollows that for all g ∈ G and s ∈ S we have u gs = u g s if and only if u g s is ageodesic, and there is no competitor h of g and s ′ ∈ S so that hs ′ = gs .Given g ∈ G deﬁne C ( g ) ⊂ B δ (id) to be the set of h for which gh is a competitorof g . Associated to g is the list L ( g ) of pairs ( h ∈ C ( g ) , cone( gh )) together with thecone type of g itself. Note that the set of possible lists L ( g ) is ﬁnite . 
We can nowdeﬁne a parameterizing graph by taking the vertices (all accept states) to be thepossible lists L ( g ), and there is an edge labeled s from a list of the form L ( g ) to alist of the form L ( gs ) if and only if | gs | = | g | +1, and there is no h ∈ C ( g ) and s ′ ∈ S with ghs ′ = gs . This is evidently a ﬁnite directed graph, which parameterizes the u g ; we must show it is well-deﬁned.First of all, h ∈ C ( gs ) if and only if one of the two following possibilities occurs:(1) there is some h ′ ∈ C ( g ) and s ′ ∈ S ∩ cone( gh ′ ) with gh ′ s ′ = gsh ; or(2) there is some s ′ ≺ s in S ∩ cone( g ) with gs ′ = gsh .Both of these possibilities depend only on C ( g ), cone( g ) or cone( gh ′ ) for some h ′ ∈ C ( g ), and not on g itself. Second of all, if h ∈ C ( gs ), then cone( gsh ) dependsonly on cone( gh ′ ) and s ′ in the ﬁrst case, and on cone( g ) and s ′ in the second case.This shows the graph is well-deﬁned, and completes the proof of the theorem. (cid:3) Remark . This completes the proof of Lemma 2.5.1, and the subsequent resultsin § Combings and combable functions.Deﬁnition 3.3.1.

Let $G$ be a group, and $S$ a generating set. A combing for $G$ (with respect to $S$) is a prefix-closed regular language $L \subset S^*$ which bijects with $G$ under evaluation, and satisfies $|u| = |\overline{u}|$ for all $u \in L$ (i.e. $L$ is a language of geodesics).

Theorem 3.2.2 says that every hyperbolic group admits a combing. If $L$ is a combing with respect to $S$, the $L$-cone type of $g$, denoted $\mathrm{cone}_L(g)$, is the set of $h \in G$ for which the $L$-geodesic evaluating to $gh$ contains a prefix (which is also an $L$-geodesic) evaluating to $g$. There is a graph $\Gamma$ parameterizing $L$ with one vertex for each $L$-cone type, and an edge from $\mathrm{cone}_L(g)$ to $\mathrm{cone}_L(gs)$ labeled $s$ whenever $s \in \mathrm{cone}_L(g)$.

Remark 3.3.2. The reader should be warned that many competing definitions of combing exist in the literature.

Suppose $L$ is a combing of $G$, and $\Gamma$ is a graph parameterizing $L$, so that there is a (length-preserving) bijection between directed paths in $\Gamma$ starting at the initial vertex, and words of $L$, by reading the edge labels of the path. If $u \in L$, we let $\gamma(u)$ denote the corresponding path in $\Gamma$, and $\gamma(u)_i$ the successive vertices in $\Gamma$ visited by $\gamma(u)$.

Definition 3.3.3.

A function $\phi : G \to \mathbb{Z}$ is weakly combable with respect to a combing $L$ if there is a graph $\Gamma$ parameterizing $L$ and a function $d\phi$ from the vertices of $\Gamma$ to $\mathbb{Z}$ so that $\phi(\overline{u}) = \sum_i d\phi(\gamma(u)_i)$ for all $u \in L$.

A function $\phi$ is combable if it is weakly combable with respect to some combing $L$, and if there is a constant $C$ so that $|\phi(gs) - \phi(g)| \le C$ for all $g \in G$ and $s \in S$; and it is bicombable if it is combable, and further satisfies $|\phi(sg) - \phi(g)| \le C$.

Remark 3.3.4. It might be more natural to define a function $d\phi$ on the edges of $\Gamma$ instead of its vertices; however, associated to any directed graph $\Gamma$ there is another graph, the line graph of $\Gamma$, whose vertices are the edges of $\Gamma$, and whose edges are the composable pairs of edges of $\Gamma$, and the line graph of $\Gamma$ parameterizes $L$ if $\Gamma$ does.

Lemma 3.3.5 (Calegari–Fujiwara [8] Lem. 3.8). The property of being combable or bicombable does not depend on the choice of a generating set or a combing.

The proof proceeds along the same lines as Theorem 3.2.2. The key point is that words in $L$ are (uniformly) quasigeodesic with respect to $S'$, and therefore stay within a bounded distance of words in $L'$ with the same evaluation. Therefore an automaton reading the letters of an $L'$-word can keep track of the states of a collection of automata simultaneously reading nearby $L$-words, and keep track of how $\phi$ changes as one goes along. See [8] for details.

Example 3.3.6. Word length in any generating set is bicombable. In fact, if $S$ is a (possibly unsymmetric) set which generates $G$ as a semigroup, word length in $S$ is bicombable. One can generalize word length by giving different generators (and corresponding edges in the Cayley graph) different lengths; providing the lengths are all integral and positive, the resulting (geodesic) word length is bicombable.

Example 3.3.7. The sum or difference of two (bi)combable functions is (bi)combable.
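The two examples above can be illustrated with a toy computation. The Python sketch below takes the free group of rank 2 with the combing given by reduced words, and a parameterizing graph whose vertices are a start state together with one state per letter (recording the last letter read). These choices, the encoding of inverses by capital letters, the names `length_d` and `a_exp_d`, and the convention that $d\phi$ vanishes on the start vertex are all assumptions made for illustration. A function is evaluated by summing $d\phi$ over the vertices visited, and the sum of two such functions is obtained by adding the two vertex functions.

```python
INV = {"a": "A", "A": "a", "b": "B", "B": "b"}  # assumed encoding of inverses
START = "start"

def step(vertex, letter):
    """Follow the edge labeled `letter` out of `vertex`; None if there is none."""
    if vertex != START and INV[vertex] == letter:
        return None          # reading x then x^-1 is not allowed in a reduced word
    return letter            # vertices are named by the last letter read

def path(word):
    """Vertices visited while reading `word`, starting at the initial vertex."""
    vertices = [START]
    for x in word:
        nxt = step(vertices[-1], x)
        if nxt is None:
            raise ValueError(f"{word!r} is not in the language")
        vertices.append(nxt)
    return vertices

def combable(dphi):
    """The function word -> sum of dphi over the vertices of its path."""
    return lambda word: sum(dphi[v] for v in path(word))

# Word length: every non-initial vertex contributes 1.
length_d = {START: 0, "a": 1, "A": 1, "b": 1, "B": 1}
# Exponent sum of the generator a: +1 for a, -1 for its inverse.
a_exp_d = {START: 0, "a": 1, "A": -1, "b": 0, "B": 0}
# Sum of the two functions: add the vertex functions pointwise.
sum_d = {v: length_d[v] + a_exp_d[v] for v in length_d}

word_length = combable(length_d)
a_exponent = combable(a_exp_d)
both = combable(sum_d)

print(word_length("aabA"), a_exponent("aabA"), both("aabA"))
```

Both sample functions change by at most 1 under right multiplication by a generator, so the Lipschitz condition in the definition of combable holds for them with $C = 1$, and for their sum with $C = 2$.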

Example 3.3.8. The following definition is due to Epstein–Fujiwara [16] (also see [5]). Let $\sigma$ be a path in $C_S(G)$. A copy of $\sigma$ is a translate $g\sigma$ for some $g \in G$. Given a path $\gamma$ in $C_S(G)$, define $c_\sigma(\gamma)$ to be the maximal number of disjoint copies of $\sigma$ in $\gamma$, and for $g \in G$ define the small counting function $c_\sigma : G \to \mathbb{Z}$ by the formula
$$c_\sigma(g) = |g| - \inf_\gamma \left(|\gamma| - c_\sigma(\gamma)\right)$$

Counting functions are bicombable. In fact, we can add $\sigma$ to $S$ as a (semigroup) generator, but insist that the (directed) edges labeled $\sigma$ have length $|\sigma| - 1$. The resulting word length is a function $|\cdot|_\sigma$ which is bicombable (by Example 3.3.6), and therefore so is the difference $|\cdot| - |\cdot|_\sigma = c_\sigma$ (by Example 3.3.7).

Many variations on this idea are possible; for instance, the “big” counting functions $C_\sigma$ which count all copies of $\sigma$ in $\gamma$, not just the maximal number of disjoint copies.

3.4. Markov chains.

A directed graph $\Gamma$ is sometimes called a topological Markov chain. A topological Markov chain can be promoted to a genuine (stationary) Markov chain by assigning probabilities to each edge in such a way that the probabilities on the edges leaving each vertex sum to 1. Recall that we write the adjacency matrix as $M$; we think of this as an endomorphism of the vector space $V$ spanned by the states of $\Gamma$. Let $\mathbf{1}$ denote the vector with all components equal to 1, and let $\iota$ denote the vector corresponding to the initial state.

Two states in a topological Markov chain are said to be communicating if there is a directed path from each to the other. The property of being communicating is an equivalence relation. We write $C_1 \to C_2$ for equivalence classes $C_1, C_2$ if there is a directed path from some (any) vertex of $C_1$ to some (any) vertex of $C_2$; observe that $\to$ is a partial order. We call each equivalence class a component.

The induced (directed) subgraph associated to a component $C$ is itself a topological Markov chain. Its adjacency matrix $M_C$ has the property that for any $i$ and $j$ there is an $n$ (in fact, infinitely many $n$) so that $(M_C^n)_{ij}$ is positive; one says such a Markov chain is irreducible. If there is a fixed $n$ so that $(M_C^n)_{ij}$ is positive for all $i, j$ we say the Markov chain is aperiodic; this holds exactly when the gcd of the lengths of all loops in $C$ is 1. A Markov chain (on a finite state space) which is both irreducible and aperiodic is ergodic.

Lemma 3.4.1 (Perron–Frobenius). Let $M$ be a real matrix with positive entries. Then there is a unique eigenvalue $\lambda$ of biggest absolute value, and this eigenvalue is real and positive. Moreover, $\lambda$ is a simple root of the characteristic polynomial, and it has a right (left) eigenvector with all components positive, unique up to scale. Finally, any other non-negative right (left) eigenvector is a multiple of the $\lambda$ eigenvector.

Proof. Since the entries of $M$ are positive, $M$ takes the positive orthant strictly inside itself.
The projectivization of the positive orthant is a simplex, and therefore $M$ takes this simplex strictly into its interior. It follows that $M$ has a unique attracting fixed point in the interior of this simplex; this fixed point corresponds to the unique eigenvector $v$ (up to scale) with non-negative entries, and its entries are evidently all positive, and its associated eigenvalue $\lambda$ is real and positive.

If $\pi$ is any plane containing this unique positive eigenvector, the projectivization of $\pi$ is an $\mathbb{RP}^1$; since the eigenvector becomes an attracting fixed point in this $\mathbb{RP}^1$, it is not the only fixed point. This shows that $\lambda$ is a simple eigenvalue; a similar argument shows that $-\lambda$ is not an eigenvalue.

Let $\mu$ be any other eigenvalue. If $\mu$ is real, then $|\mu| < \lambda$. Suppose $\mu$ is complex, acting as composition of a dilation with a rotation on some plane $\pi$. If $|\mu| = \lambda$ then the restriction of $M$ to $\pi \oplus \langle v \rangle$ acts projectively like a rotation; but this contradicts the fact that $v$ is a projective attracting fixed point. This proves the theorem. □

If $M$ is non-negative, there is still a non-negative real eigenvector $v$ with a real positive eigenvalue $\lambda$, and every other eigenvalue $\mu$ satisfies $|\mu| \le \lambda$. In this generality, $\lambda$ might have multiplicity $> 1$, and the Jordan block associated to $\lambda$ might not be diagonal. However if $M$ is irreducible, then $\lambda$ has multiplicity 1, the eigenvector $v$ is strictly positive, and every other eigenvalue with absolute value $\lambda$ is simple and of the form $e^{2\pi i/k}\lambda$. These facts can be proved similarly to the proof of Lemma 3.4.1.

Now let $G$ be a hyperbolic group, $L$ a combing with respect to some generating set, and $\Gamma$ a graph parameterizing $L$. Let $\Gamma_C$ be the quotient directed graph whose vertices are the components of $\Gamma$. Note that $\Gamma_C$ contains no directed loops. Associated to each vertex of $\Gamma_C$ is an adjacency matrix $M_C$ which has a unique maximal real eigenvalue $\lambda(C)$ of multiplicity 1. We let $\lambda = \max_C \lambda(C)$, and we call a component maximal if $\lambda(C) = \lambda$.

The next lemma is crucial to what follows, and depends on Coornaert's estimate of the growth function (i.e. Corollary 2.5.9).

Lemma 3.4.2. The maximal components do not occur in parallel; that is, there is no directed path from any maximal component to a distinct maximal component.

Proof. Since there is a directed path from the initial vertex to every other vertex, the number of paths of length $n$ (starting at the initial vertex) is of the form $p(n)\lambda^n + O(q(n)\xi^n)$ for polynomials $p, q$ and $\xi < \lambda$, where $\lambda$ is as above. Moreover, the degree of $p$ is one less than the length of the biggest sequence of maximal components $C_0 \to C_1 \to \cdots \to C_{\deg(p)}$. The number of paths of length $n$ is equal to the number of elements of $G$ of length $n$, so Corollary 2.5.9 implies that the degree of $p$ is zero. □

It follows that all but exponentially few paths $\gamma$ of length $n$ in $\Gamma$ are entirely contained in one of the maximal components of $\Gamma$, except for a prefix and a suffix of length $O(\log(n))$. Consequently, the properties of a "typical" path in $\Gamma$ can be inferred from the properties of a "typical" path conditioned to lie in a single component.

For any vector $v$, the limits
\[ \rho(v) := \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \lambda^{-i} M^i v, \qquad \ell(v) := \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \lambda^{-i} (M^T)^i v \]
exist, and are the projections onto the right and left $\lambda$-eigenspaces respectively. Heuristically, $\ell(v)$ is the distribution of endpoints of long paths that start with distribution $v$, and $\rho(v)$ is the distribution of starting points of long paths that end with distribution $v$.

Recall that $\iota$ denotes the vector with a 1 in the coordinate corresponding to the initial vertex and 0s elsewhere, and $\mathbf{1}$ denotes the vector with all coordinates equal to 1. Define a measure $\mu'$ on the vertices of $\Gamma$ by $\mu'_i = \ell(\iota)_i\,\rho(\mathbf{1})_i$, and scale $\mu'$ to a probability measure $\mu$. Define a matrix $N$ by $N_{ij} = M_{ij}\,\rho(\mathbf{1})_j / (\lambda\,\rho(\mathbf{1})_i)$ if $\rho(\mathbf{1})_i \neq 0$, and define $N_{ij} = \delta_{ij}$ otherwise.

Lemma 3.4.3.

The matrix $N$ is a stochastic matrix (i.e. it is non-negative, and the rows sum to 1) and preserves the measure $\mu$.

Proof. If $\rho(\mathbf{1})_i = 0$ then $\sum_j N_{ij} = 1$ by fiat. Otherwise
\[ \sum_j N_{ij} = \sum_j \frac{M_{ij}\,\rho(\mathbf{1})_j}{\lambda\,\rho(\mathbf{1})_i} = \frac{(M\rho(\mathbf{1}))_i}{\lambda\,\rho(\mathbf{1})_i} = 1. \]
To see that $N$ preserves $\mu'$ (and therefore $\mu$), we calculate
\[ \sum_i \mu'_i N_{ij} = \sum_i \ell(\iota)_i\,\rho(\mathbf{1})_i\,\frac{M_{ij}\,\rho(\mathbf{1})_j}{\lambda\,\rho(\mathbf{1})_i} = \sum_i \frac{\ell(\iota)_i M_{ij}}{\lambda}\,\rho(\mathbf{1})_j = \ell(\iota)_j\,\rho(\mathbf{1})_j = \mu'_j. \]
□

In words, $\mu_i$ is the probability that a point on a path will be in state $i$, conditioned on having originated at the initial vertex in the distant past, and conditioned on having a distant future.

3.5. Shift space.

For each $n$ let $Y_n$ denote the set of paths in $\Gamma$ of length $n$ starting at the initial vertex, and let $X_n$ denote the set of all paths in $\Gamma$ of length $n$. We can naturally identify $X_0$ with the vertices of $\Gamma$.

Restricting to an initial subpath defines an inverse system $\cdots \to X_n \to \cdots \to X_1 \to X_0$, and the inverse limit $X_\infty$ is the space of (right) infinite paths. Similarly define $Y_\infty \subset X_\infty$. If we give each $X_n$ and $Y_n$ the discrete topology, then $X_\infty$ and $Y_\infty$ are Cantor sets.

If $x := x_0, x_1, \cdots$ and $x' := x'_0, x'_1, \cdots$ are two elements of $X_\infty$, we define $(x|x')$ to be the first index at which $x$ and $x'$ differ, and define a metric on $X_\infty$ by setting $d(x,x') = a^{-(x|x')}$ for some $a > 1$ (the notation $(\cdot|\cdot)$ is deliberately intended to suggest a resemblance to the Gromov product). If we like, we can define $X = \cup_i X_i \cup X_\infty$ and metrize it (as a compact space, in which each $X_n$ sits as a discrete subset) in the same way. Similarly, give $Y_\infty$ the induced metric, and define $Y = \cup_i Y_i \cup Y_\infty$ likewise.

The shift operator $T : X_\infty \to X_\infty$ is defined by $(Tx)_i = x_{i+1}$. We define a probability measure $\mu$ on each $X_n$ by
\[ \mu(x_0 \cdots x_n) = \mu_{x_0} N_{x_0 x_1} N_{x_1 x_2} \cdots N_{x_{n-1} x_n} \]
where $\mu$ and $N$ are the measure and stochastic matrix whose properties are given in Lemma 3.4.3. By the definition of an inverse limit, there is a map $X_\infty \to X_n$ for each $n$ which takes an infinite path to its initial subpath of length $n$; the preimages of subsets of the $X_n$ under such maps are a basis for the topology on $X_\infty$, called cylinder sets. The measures $\mu$ as above let us define a Borel probability measure $\mu$ on $X_\infty$ by first defining it on cylinder sets (note that the definitions of $\mu$ on different $X_n$ are compatible) and extending it to all Borel sets in the standard way; Lemma 3.4.3 implies that $\mu$ is $T$-invariant (i.e. $\mu(A) = \mu(T^{-1}(A))$ for all measurable $A \subset X_\infty$).

There is a bijection between $Y_n$ and $L_n$, and by evaluation with $G_n$. This map extends continuously to a map $E : Y \to \overline{G}$, by sending $Y_\infty \to \partial_\infty G$.

Lemma 3.5.1.

The map $E : Y \to \overline{G}$ is surjective, Lipschitz in the $a$-metric, and bounded-to-one.

Proof. That the map is Lipschitz follows immediately from the definition, and the observation that $(E(y)|E(y')) \ge (y|y') - \delta$ for $y, y' \in Y$. The restrictions $E : Y_n \to G_n$ are all bijections, so we just need to check that $Y_\infty \to \partial_\infty G$ is surjective and bounded-to-one. Since $E$ is continuous, $Y$ is compact and $\overline{G}$ is Hausdorff, the image is compact. Since the image is dense (because it contains $G_n$ for all $n$), it is surjective.

Finally, observe that if $y$ and $y'$ are any two points in $Y_\infty$, and $\gamma, \gamma'$ are the associated infinite geodesics in $G$, then $\gamma \cap \gamma'$ is a compact initial segment, since after they diverge they never meet again (by the definition of a combing). Fix $x \in \partial_\infty G$, let $y_i$ be a finite subset of $E^{-1}(x)$, and let $\gamma_i$ be the geodesic rays in $G$ corresponding to the $y_i$. For all but finitely many points $p$ on any $\gamma_i$, each $\gamma_j$ intersects the ball $B_\delta(p)$ disjointly from the others. In particular, the number of points in the preimage of any point in $\partial_\infty G$ is bounded by the cardinality of a ball (in $G$) of radius $\delta$. □

Recall that in §2.5 we defined a probability measure $\nu_s$ on $G$ for each $s > h(G)$. Note that $e^{h(G)} = \lambda$ where $\lambda$ is as above. For each $n$ we define a probability measure on $Y_n$ (which, by abuse of notation, we call $\nu_s$) by $\nu_s(y) = \nu_s(E(y)\,\mathrm{cone}_L(E(y)))$ for $y \in Y_n$, and observe that the limit as $s \to h(G)$ from above (which we denote $\nu(y)$) exists and depends only on the cone type $\mathrm{cone}_L(E(y))$. Since the Patterson–Sullivan measure $\nu$ is supported in $\partial_\infty G$, the measures $\nu$ on each $Y_n$ are compatible, thinking of each $Y_n$ as a collection of cylinder sets in $Y_\infty$, and define a unique probability measure $\nu$ on $Y_\infty$ which pushes forward under $E$ to $\nu$ on $\partial_\infty G$.

Lemma 3.5.2.

The measure $\mu$ on $X_\infty$ is the limit $\mu = \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} T^i_* \nu$.

Proof. We give the sketch of a proof. For any $y$, let $L^y$ be the (regular) language of suffixes of words in $L$ with $y$ as a prefix, and let $L^y_n$ be the subset of $L^y$ of length $n$. Then $\nu_s(y) = \zeta_G(s)^{-1}\sum_n e^{-s(|y|+n)}|L^y_n|$.

If there is no path from the final state $y_n$ to a maximal component, the growth rate of $L^y$ is strictly less than that of $L$, and $\nu_s(y) \to 0$. Otherwise both growth functions are eventually of the form $C\lambda^n$ plus something exponentially small compared to $\lambda^n$. Define measures $\nu_m$ on $Y_n$ by $\nu_m(y) = \frac{1}{m}\sum_{i=1}^m \lambda^{-(|y|+i)}|L^y_i|$. Then by considering the form of the growth functions of $L$ and $L^y$, we see that there is a constant $C$ (not depending on $y$ or $n$) so that $\lim_{m\to\infty}\nu_m(y) = C\nu(y)$. Scaling $\nu_m$ to be a probability measure, we can set $C = 1$.

The proof now follows from the definition of $\mu, N$; see [8] Lem. 4.19 for details. □

3.6. Limit theorems.

Let $\xi_0, \xi_1, \cdots$ be a (stationary) irreducible Markov chain on a finite state space, with stationary measure $\mu$, and let $f$ be a real-valued function on the state space (since this space is finite, there are no additional assumptions on $f$; in general we require $f$ to be integrable, and have finite variance). Define $F_n := \sum_{i=1}^n f(\xi_i)$, and $A = \int f\,d\mu$.

Theorem 3.6.1 (Markov's central limit theorem). With notation as above, there is some $\sigma \ge 0$ so that for any $r \le s$,
\[ \lim_{n\to\infty} P\left( r \le \frac{F_n - nA}{\sigma\sqrt{n}} \le s \right) = \frac{1}{\sqrt{2\pi}}\int_r^s e^{-x^2/2}\,dx. \]
Equivalently, there is convergence in probability $n^{-1/2}(F_n - nA) \to N(0,\sigma)$ where $N(0,\sigma)$ denotes the normal distribution with mean 0 and standard deviation $\sigma$ (in case $\sigma = 0$, we let $N(0,\sigma)$ denote a Dirac mass centered at 0).

Now, each maximal component $C$ as above is a stationary irreducible Markov chain, with stationary measure the conditional measure $\mu|C$. The measure $\mu$ on $X_\infty$ decomposes measurably into the union of (shift-invariant) subspaces $X_\infty(C)$, the subspace of (right) infinite sequences contained in the component $C$. Consequently, if $\phi$ is a combable function on $G$, then for each maximal component $C$, there are constants $A_C = \frac{1}{\mu(C)}\int_C d\phi\,d\mu$ and $\sigma_C$, so that for $\mu$-a.e. $x \in X_\infty(C)$, the random variable $n^{-1/2}\left(\sum_{i=0}^{n-1} d\phi(x_i) - nA_C\right)$ converges in probability to $N(0,\sigma_C)$.

By Lemma 3.5.2, for $\nu$-a.e. $y \in Y_\infty$ there is a unique $C$ so that $T^n y \in X_\infty(C)$ for sufficiently big $n$; we say that $y$ is associated to the component $C$. Let $Y_\infty(C)$ be the set of $y$ associated to a fixed $C$. For $\nu$-a.e. $y \in Y_\infty(C)$ we have convergence in probability $n^{-1/2}\left(\sum_{i=0}^{n-1} d\phi(y_i) - nA_C\right) \to N(0,\sigma_C)$ (one way to see this is to observe that this is a shift-invariant tail property of $y$, and use Lemma 3.5.2).

For combable functions, this is the end of the story. It is certainly possible for the constants $A_C, \sigma_C$ to vary from component to component.
But for bicombable $\phi$ we have the following key lemma:

Lemma 3.6.2. Let $\phi$ be bicombable. Then there are constants $A, \sigma$ so that $A_C = A$ and $\sigma_C = \sigma$ for all maximal components $C$.

Proof. Call $y \in Y_\infty$ typical if there are constants $A_y$ and $\sigma_y$ (necessarily unique) so that $n^{-1/2}\left(\sum_{i=0}^{n-1} d\phi(y_i) - nA_y\right) \to N(0,\sigma_y)$. For each $C$ we have seen that $\nu$-a.e. $y \in Y_\infty(C)$ is typical with $A_y = A_C$ and $\sigma_y = \sigma_C$.

The map $E : Y_\infty \to \partial_\infty G$ is finite-to-one, and takes the measure $\nu$ on $Y_\infty$ to the Patterson–Sullivan measure $\nu$ on $\partial_\infty G$. Hence $E(Y_\infty(C))$ has positive measure for each $C$. Let $y \in Y_\infty$ be typical, and let $\mathrm{id}, g_1, g_2, \cdots$ be the associated geodesic sequence of elements in $G$ converging to $E(y)$. Now let $g$ be arbitrary, let $y'$ be any element of $Y_\infty$ with $E(y') = gE(y)$, and let $\mathrm{id}, g'_1, g'_2, \cdots$ be the geodesic sequence of elements in $G$ associated to $y'$. By $\delta$-thinness, $d(g'_i, gg_i)$ is eventually approximately constant, and therefore bounded. Since $\phi$ is bicombable, $y'$ is typical, with $A_{y'} = A_y$ and $\sigma_{y'} = \sigma_y$. But the action of $G$ on $\partial_\infty G$ is ergodic for $\nu$, by Corollary 2.5.10, and therefore for any $C, C'$ there are typical $y \in Y_\infty(C)$, $y' \in Y_\infty(C')$ with $A_y = A_C$, $\sigma_y = \sigma_C$ and $A_{y'} = A_{C'}$, $\sigma_{y'} = \sigma_{C'}$, and with $E(y') = gE(y)$ for some $g$. This completes the proof. □

Corollary 3.6.3 (Calegari–Fujiwara [8]). Let $G$ be hyperbolic, and let $\phi$ be bicombable. Then there are constants $A, \sigma$ so that if $g_n$ denotes a random element of $G_n$ (in the $\nu$ measure), there is convergence in probability $n^{-1/2}(\phi(g_n) - nA) \to N(0,\sigma)$.
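To make the objects behind these limit theorems concrete, here is a small numerical sketch of the construction of $N$ and $\mu$ from Lemma 3.4.3, together with the mean $A = \int f\,d\mu$. It is an illustration only. The graph (the two-state "golden mean" graph), the function `f`, and the helper names `perron` and `parry_data` are assumptions made for this example. Power iteration is used in place of the Cesàro limits $\rho$ and $\ell$; this is legitimate here only because the chosen $M$ is irreducible and aperiodic, so that $\rho(\mathbf{1})$ and $\ell(\iota)$ are positive multiples of the right and left Perron eigenvectors.

```python
# Illustrative sketch (not from the notes): build N and mu of Lemma 3.4.3
# for a small irreducible, aperiodic adjacency matrix.

def perron(M, iters=300):
    """Power iteration; returns (lambda, right Perron eigenvector) of M.

    Assumes M is non-negative and some power of M is strictly positive.
    """
    k = len(M)
    v = [1.0 / k] * k
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(k)) for i in range(k)]
        lam = sum(w)              # sum(v) == 1, so this estimates lambda
        v = [x / lam for x in w]  # renormalize so that sum(v) == 1 again
    return lam, v


def parry_data(M):
    """Return (lambda, N, mu) with N_ij = M_ij rho_j / (lambda rho_i)."""
    k = len(M)
    lam, rho = perron(M)                              # right eigenvector
    _, ell = perron([list(col) for col in zip(*M)])   # left eigenvector
    N = [[M[i][j] * rho[j] / (lam * rho[i]) for j in range(k)]
         for i in range(k)]
    weights = [ell[i] * rho[i] for i in range(k)]     # mu'_i = ell_i rho_i
    total = sum(weights)
    mu = [w / total for w in weights]                 # scale to probability
    return lam, N, mu


M = [[1, 1], [1, 0]]      # hypothetical example: golden mean graph
lam, N, mu = parry_data(M)

f = [1.0, -1.0]           # hypothetical function on the two states
A = sum(mu[i] * f[i] for i in range(2))   # the mean A = integral of f d(mu)

# Quantities that Lemma 3.4.3 says should equal 1 and mu respectively.
row_sums = [sum(row) for row in N]
mu_N = [sum(mu[i] * N[i][j] for i in range(2)) for j in range(2)]
```

In this example `lam` approximates the golden ratio, each entry of `row_sums` is numerically 1, and `mu_N` agrees with `mu`, as the lemma predicts.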
Note that $A$ and $\sigma$ as above are algebraic, and one can estimate from above the degree of the field extension in which they lie from the complexity of $\Gamma$.

The uniform measure and the measure $\nu$ on $G_n$ are uniformly quasi-equivalent on a large scale, in the sense that there are constants $R$ and $C$ so that for any $g \in G_n$, there is an inequality
\[ C^{-1}\,|B_R(g)\cap G_n|/|G_n| \;\le\; \nu(B_R(g)\cap G_n) \;\le\; C\,|B_R(g)\cap G_n|/|G_n|. \]
It follows that if $g_n$ denotes a random element of $G_n$ (in the uniform measure), the distribution $n^{-1/2}(\phi(g_n) - nA)$ has a tail that decays like $C_1 e^{-C_2 t^2}$.

Since length with respect to one generating set is bicombable with respect to another, we obtain the following corollary:

Corollary 3.6.4. Let $G$ be hyperbolic, and let $S$ and $S'$ be two finite generating sets for $G$. There is an algebraic number $\lambda_{S,S'}$ so that if $g_n$ is a random element of $G$ of word length $n$ with respect to $S$, then the distribution $n^{-1/2}(|g_n|_{S'} - n\lambda_{S,S'})$ has a tail that decays like $C_1 e^{-C_2 t^2}$ when $n$ is sufficiently large.

It is a slightly subtle point that $\lambda_{S',S} \ge \lambda_{S,S'}^{-1}$, and the inequality is strict except for essentially trivial cases.

3.7. Thermodynamic formalism.

To push these techniques further, we must study classes of functions more general than combable functions, and invoke more sophisticated limit theorems. There is a well-known framework to carry out such analysis, pioneered by Ruelle, Sinai, Bowen, Ratner, Parry etc.; [36] is a standard reference.

The setup is as follows. For simplicity, let $M$ be a $k \times k$ matrix with 0–1 entries for which there is a constant $n$ so that all the entries of $M^n$ are positive (i.e. $M$ is the adjacency matrix of a topological Markov chain with $k$ states which is irreducible and aperiodic). Let $X_\infty$ be the space of (right) infinite sequences $x := x_0, x_1, \cdots$ satisfying $M(x_n, x_{n+1}) = 1$ for all $n$, and let $T$ be the shift operator on $X_\infty$. As before, we can metrize $X_\infty$ by $d(x,x') = a^{-(x|x')}$ for some fixed $a > 1$, and observe that the action of $T$ on $X_\infty$ is mixing. This means that for all nonempty open sets $U, V \subset X_\infty$ there is $N$ so that $T^{-n}(U)\cap V$ is nonempty for all $n \ge N$. Note that if $M$ is irreducible but not aperiodic, there is nevertheless a decomposition of $X_\infty$ into $D$ disjoint components which are cycled by $T$, and such that $T^D$ is mixing on each component, where $D$ is the gcd of the periods of $T$-periodic sequences.

Let $\mathcal{M}_T$ be the space of $T$-invariant probability measures on $X_\infty$. This is a convex, compact subset of the space of all measures in the weak-$*$ topology. It is not hard to show that the topological entropy $h$ of $T$ is equal to the supremum of the measure theoretic entropies $\sup_{\mu\in\mathcal{M}_T} h(\mu)$, and that $h = \log\lambda$ where $\lambda$ is the Perron–Frobenius eigenvalue of $M$; see e.g. [36].

The shift $T$ uniformly expands $X_\infty$ by a factor of $a$, and therefore if a function on $X_\infty$ is sufficiently regular, it tends to be smoothed out by $T$. Define $T^*f$ by $T^*f(x) = f(Tx)$. We would like the iterates $(T^n)^*f$ to have a uniform modulus of continuity; this is achieved precisely by insisting that $f$ be Hölder continuous, that is, that there is some $\alpha$ so that $|f(x) - f(x')| \le C\,d(x,x')^\alpha = C a^{-\alpha(x|x')}$. The set of functions $f$ on $X_\infty$, Hölder continuous of exponent $\alpha$, is a Banach space with respect to the norm $\|f\|_\infty + \|f\|_\alpha$ where $\|f\|_\alpha$ is the least such $C$ so that $|f(x) - f(x')| \le C\,d(x,x')^\alpha$. We denote this Banach space $C^\alpha(X_\infty)$.

Definition 3.7.1.

Let $f$ be Hölder continuous on $X_\infty$. The pressure of $f$, denoted $P(f)$, is $P(f) = \sup_{\mu\in\mathcal{M}_T}\left(h(\mu) + \int f\,d\mu\right)$.

It turns out that the supremum is realized on some invariant measure $\mu_f$ of full support, known as the equilibrium state (or Gibbs state) of $f$. That is, $P(f) = h(\mu_f) + \int f\,d\mu_f$. See e.g. [4] Ch. 1 for a proof of this theorem, and of Theorem 3.7.3 below.

Definition 3.7.2. The Ruelle transfer operator $L_f$ associated to $f$ is defined by the formula $L_f g(x) = \sum_{Tx'=x} e^{f(x')}g(x')$. Note that $L_f$ acts as a bounded linear operator on $C^\alpha(X_\infty)$.

Theorem 3.7.3 (Ruelle–Perron–Frobenius [36]). The operator $L_f$ has a simple positive eigenvalue $e^{P(f)}$ which is strictly maximal in modulus. The essential spectrum is contained in a ball whose radius is strictly less than $e^{P(f)}$, and the rest of the spectrum outside this ball is discrete and consists of genuine eigenvalues. There is a strictly positive eigenfunction $\psi_f$ satisfying $L_f\psi_f = e^{P(f)}\psi_f$, and an "eigen probability measure" $\nu_f$ satisfying $L_f^*\nu_f = e^{P(f)}\nu_f$, and if we scale $\psi_f$ so that $\int\psi_f\,d\nu_f = 1$, then the equilibrium state $\mu_f$ is equal to $\psi_f\nu_f$.

Remark 3.7.4. $\nu_f$ can be thought of as a left eigenvector for $L_f$, and $\psi_f$ as a right eigenvector. When $f$ is identically zero, $L_f$ is basically just the matrix $M$, and $\mu_f$ is basically just $\mu$ as constructed in §3.4.

Both $P(f)$ and $\psi_f$ are analytic (as functions of $f$) on an open subset of the complex Banach space $C^\alpha(X_\infty,\mathbb{C})$ which contains a neighborhood of $C^\alpha(X_\infty,\mathbb{R})$ (i.e. of $C^\alpha(X_\infty)$).

Because of the simplicity and analyticity of the maximal eigenvector/value, one can study the derivatives of pressure. For simplicity, let $P(t) := P(tf+g)$. Then we can compute
\[ P'(0) = \int f\,d\mu_g \]
and a further differentiation gives
\[ P''(0) = \int f^2 + 2fw'(0)\,d\mu_g \]
where $w(t) = \psi_{tf+g}$ (suitably normalized).

Now, let $F_n(x) = \sum_{i=0}^{n-1} f(T^ix)$. Then from the definition of the transfer operator, $L^n_{tf+g}(\cdot) = L^n_g(e^{tF_n}\,\cdot)$, and therefore one obtains
\[ nP''(0) = \int F_n^2 + 2F_nw'(0)\,d\mu_g. \]
If we set $g$ to be identically zero, then $\mu_g$ is just the equilibrium measure $\mu$ from before. If we change $f$ by a constant $f - \int f\,d\mu$ to have mean 0, then the ergodic theorem shows $(1/n)F_n \to 0$ $\mu$-a.e. and therefore
\[ P''(0) = \lim_{n\to\infty}\int \frac{1}{n}F_n^2\,d\mu. \]
It is usual to denote this limiting quantity by $\sigma^2$.

The analyticity of $P$ lets us control the higher moments of $F_n$ in a uniform manner, and therefore by applying Fourier transform, one obtains a central limit theorem $n^{-1/2}F_n \to N(0,\sigma)$. Better estimates of the rate of convergence can be obtained by studying $P'''(0)$; see [11].

This theorem can be combined with Lemma 3.6.2 to obtain a central limit theorem for certain functions on hyperbolic groups whose (discrete) derivatives along a combing satisfy a suitable Hölder continuity property.
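The formula $P'(0) = \int f\,d\mu_g$ can be checked numerically in the simplest case. The following sketch assumes $g = 0$ and a function $f$ depending only on the first coordinate $x_0$. Under that assumption the transfer operator preserves the finite-dimensional space of functions of $x_0$, and $e^{P(tf)}$ is the Perron–Frobenius eigenvalue of the weighted matrix with entries $e^{tf(i)}M_{ij}$. The matrix, the values of `f`, the step sizes and the function names are hypothetical choices for illustration, and derivatives are approximated by finite differences rather than computed analytically.

```python
import math

# Illustrative sketch (not from the notes): pressure of t*f on a subshift of
# finite type, for f depending only on the first coordinate.

def perron(A, iters=400):
    """Power iteration; returns (leading eigenvalue, right eigenvector)."""
    k = len(A)
    v = [1.0 / k] * k
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(k)) for i in range(k)]
        lam = sum(w)
        v = [x / lam for x in w]
    return lam, v


def pressure(M, f, t):
    """P(t f) = log of the leading eigenvalue of (exp(t f_i) M_ij)."""
    k = len(M)
    A = [[math.exp(t * f[i]) * M[i][j] for j in range(k)] for i in range(k)]
    return math.log(perron(A)[0])


M = [[1, 1], [1, 0]]   # hypothetical example: golden mean shift
f = [1.0, 0.0]         # hypothetical f: indicator of the first state

# Equilibrium state for g = 0: mu_i proportional to ell_i * rho_i.
lam, rho = perron(M)
_, ell = perron([list(col) for col in zip(*M)])
weights = [ell[i] * rho[i] for i in range(2)]
mu = [w / sum(weights) for w in weights]
mean_f = sum(mu[i] * f[i] for i in range(2))

# Central finite differences for P'(0) and P''(0).
h1 = 1e-4
P_prime = (pressure(M, f, h1) - pressure(M, f, -h1)) / (2 * h1)
h2 = 1e-3
P_second = (pressure(M, f, h2) - 2 * pressure(M, f, 0.0)
            + pressure(M, f, -h2)) / h2 ** 2
```

Here `P_prime` agrees with `mean_f` up to discretization error, and `P_second` is a positive number that plays the role of $\sigma^2$ for this $f$.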
Functions with this Hölder property arise naturally for groups acting cocompactly on CAT($K$) spaces with $K < 0$, where one wants to compare the intrinsic geometry of the space with the "coarse" geometry of the group.

Let $Z$ be a complete CAT($K$) geodesic metric space with $K < 0$, and let $G$ act cocompactly on $Z$ by isometries. Pick a basepoint $z \in Z$, and define a function $F$ on $G$ by $F(g) = d(z, gz)$. Since $G$ is hyperbolic, if we fix a finite generating set $S$ we can choose a geodesic combing $L$ with respect to $S$ as above. Now, for any $s \in S$ define $D_sF(g) = F(g) - F(sg)$. It is straightforward to see from the CAT($K$) property that there are constants $C$ and $\alpha$ (depending on $K$ and $G$) so that $|D_sF(g) - D_sF(h)| \le Ca^{-\alpha(g|h)}$ for all $s$ and all $g, h \in G$.

An element of $\cup X_n$ corresponds to a path in $\Gamma$. Reading the edge labels determines a word in the generators (a suffix of some word in $L$), and by evaluation, an element of $G$. Let $E : \cup X_n \to G$ denote this evaluation map (note that this is not injective). We can define a function $DF$ on $\cup X_n$ by $DF(x) = D_sF(E(x))$ where $s^{-1}$ is the label associated to the transition from $x_0$ to $x_1$ (we could suggestively write $s = x_1^{-1}x_0$). Evidently, $DF$ extends to a Hölder continuous function on $X$. Furthermore, for each $y \in Y_n$, we have $\sum_{i=0}^{n-1} DF(T^iy) = F(E(y))$.

For each maximal component $C$, it follows that $\nu$-a.e. $y \in Y_\infty(C)$ are $A_C, \sigma_C$ typical (for the function $DF$) for some $A_C, \sigma_C$ depending only on $C$. Since $F$ is Lipschitz on $G$ in the left and right invariant metrics, the argument of Lemma 3.6.2 implies that $A_C, \sigma_C$ are equal to some common values $A, \sigma$, and therefore we obtain the following corollary:

Corollary 3.7.5. Let $Z$ be a complete CAT($K$) geodesic metric space with $K < 0$, and let $G$ act cocompactly on $Z$ by isometries. Pick a basepoint $z \in Z$, and a finite generating set $S$ for $G$. Then there are constants $A$ and $\sigma$ so that if $g_n$ is a random element of $G_n$ (in the $\nu$ measure), there is convergence in probability $n^{-1/2}(d(z, g_nz) - An) \to N(0,\sigma)$.

Evidently, the only properties of the function $F$ we use are that it is Lipschitz in both the left- and right-invariant metrics, and satisfies a Hölder estimate $|D_sF(g) - D_sF(h)| \le Ca^{-\alpha(g|h)}$ for all $s$ and all $g, h \in G$. Any such function on a hyperbolic group satisfies a central limit theorem analogous to Corollary 3.7.5. For the sake of completeness, therefore, we state this as a theorem:

Theorem 3.7.6 (Hölder central limit theorem). Let $G$ be a hyperbolic group, and $S$ a finite generating set for $G$. Let $F$ be a real-valued function which is Lipschitz in both the left- and right-invariant word metrics on $G$, and satisfies $|D_sF(g) - D_sF(h)| \le Ca^{-\alpha(g|h)}$ for all $s$ in $S$ and all $g, h \in G$. Then there are constants $A$ and $\sigma$ so that if $g_n$ is a random element of $G_n$ (in the $\nu$ measure), there is convergence in probability $n^{-1/2}(F(g_n) - An) \to N(0,\sigma)$.

Remark 3.7.7. The idea of using the thermodynamic formalism to study the relationship between distance and word length in cocompact groups of isometries of hyperbolic space is due to Pollicott–Sharp [35]; Corollary 3.7.5 and Theorem 3.7.6 above are simply the result of combining their work with [12] and [8]. Nevertheless, we believe they are new.

4. Random walks

The main references for this section are Kaimanovich [22] and Kaimanovich–Vershik [23]. The theory of random walks is a vast and deep subject, with connections to many different parts of mathematics. Therefore it is necessary at a few points to appeal to some standard (but deep) results in probability theory, whose proof lies outside the scope of this survey. A basic reference for probability theory is [41]. We give more specialized references in the text where relevant.

This section is brief compared to the earlier sections, and is not meant to be comprehensive.

4.1. Random walk.

Let $G$ be a group and let $\mu$ be a probability measure on $G$. We further assume that $\mu$ is nondegenerate; i.e. that the support of $\mu$ generates $G$ as a semigroup. An important example is the case where $\mu$ is the uniform measure on a symmetric finite generating set $S$. There are two ways to describe random walk on $G$ determined by $\mu$: as a sequence of elements visited in the walk, or as a sequence of increments. In the first description, a random walk $y := \mathrm{id}, y_1, y_2, \cdots$ is a Markov chain with state space $G$, with initial state id, and with transition probability $p_{gh} = \mu(g^{-1}h)$. In the second description, a random walk $z := z_1, z_2, \cdots$ is a sequence of random elements of $G$ (the increments of the walk), independently distributed according to $\mu$. The two descriptions are related by taking $y_n = z_1z_2\cdots z_n$. We write this suggestively as $z = Dy$ and $y = \Sigma z$.

We use the notation $(G^{\mathbb{N}}, \mu^{\mathbb{N}})$ for the product probability space, and $(G^{\mathbb{N}}, P)$ for the probability space of infinite sequences with the measure $P$ on cylinder sets defined by
\[ P(\{y : y \text{ begins } \mathrm{id}, y_1, \cdots, y_n\}) = p_{\mathrm{id}\,y_1}\,p_{y_1y_2}\cdots p_{y_{n-1}y_n}. \]
With this notation, $z$ is a random element of $(G^{\mathbb{N}}, \mu^{\mathbb{N}})$ and $y$ is a random element of $(G^{\mathbb{N}}, P)$.

The shift operator $T$ acts on $G^{\mathbb{N}}$ by $(Tz)_n = z_{n+1}$ or $(Ty)_n = y_{n+1}$. It is measure preserving for $\mu^{\mathbb{N}}$ but not for $P$; in fact, from the definition, the support of $P$ is contained in the set of sequences starting at id. The action of the shift $T$ on $(G^{\mathbb{N}}, \mu^{\mathbb{N}})$ is ergodic. For, if $A$ is a subset satisfying $A = T^{-1}(A)$, then a sequence $z$ is in $A$ if and only if $T^n(z)$ is in $A$ for sufficiently big $n$. This is a tail event for the sequence of independent random variables $z_i$, so by Kolmogorov's 0–1 law (see [41] Thm. 1.1.2) $A$ has measure 0 or 1.

Definition 4.1.1.

Let $G$ be a group and $S$ a finite generating set. Let $\mu$ be a probability measure on $G$. The first moment of $\mu$ is $\sum |g|\mu(g)$; if this is finite, we say $\mu$ has finite first moment.

Lemma 4.1.2. Let $\mu$ be a probability measure on $G$ with finite first moment. Let $\mathrm{id}, y_1, y_2, \cdots$ be a random walk determined by $\mu$. Then $L := \lim_{n\to\infty}|y_n|/n$ exists almost surely, and is independent of $y$. In fact, if $\mu^{*n}$ denotes $n$-fold convolution (i.e. the distribution of the random variable $y_n$), then $L = \lim_{n\to\infty}\frac{1}{n}\sum|g|\mu^{*n}(g)$.

Proof. We set $z = Dy$. Define $h_n(z) := |y_n|$. Then $h_n$ satisfies
\[ h_{n+m}(z) \le h_n(T^mz) + h_m(z) \]
i.e. $h_n$ form a subadditive cocycle. Kingman's subadditive ergodic theorem (see e.g. [40]) says that for any subadditive $L^1$ cocycle $h_n$ on a space with a $T$-invariant measure, the limit $\lim_{n\to\infty}h_n(z)/n$ exists a.s. and is $T$-invariant. In our circumstance, finite first moment implies that $h_1$ (and all the $h_n$) are in $L^1$, so the theorem applies. Since the action of $T$ on $(G^{\mathbb{N}},\mu^{\mathbb{N}})$ is ergodic, the limit is independent of $z$. The lemma follows. □

$L$ as above is called the drift of the random walk associated to $\mu$. Since each $|y_n| \ge 0$, we have $L \ge 0$.

Example 4.1.3. If $G = \mathbb{Z}^n$ and $\mu$ is symmetric (i.e. $\mu(g) = \mu(g^{-1})$ for all $g$) with finite support, then $L = 0$.

We now focus our attention on the case of hyperbolic groups and simple random walk (i.e. when $\mu$ is the uniform measure on a finite symmetric generating set).

Lemma 4.1.4. Let $G$ be a nonelementary hyperbolic group, and let $\mu$ be a nondegenerate probability measure on $G$ with finite first moment. Then the drift $L$ of random walk with respect to $\mu$ is positive.

Proof. We give the idea of a proof. Let $\mu^{*n}$ denote the $n$-fold convolution of $\mu$ as before. The probability measures $\mu^{*n}$ have a subsequence converging to a weak limit $\mu^{*\infty}$ in $\overline{G}$. Clearly the support of $\mu^{*\infty}$ is contained in $\partial_\infty G$ (a group for which $\limsup_{n\to\infty}\mu^{*n}(g) > 0$ for some $g$ and for $\mu$ nondegenerate is a finite group).

To prove the lemma it suffices to show that for any $C$, for sufficiently large $n$ there is an inequality $\sum_{g,h}(|hg| - |h|)\mu^{*n}(g)\mu^{*N}(h) \ge C > 0$ for all $N \ge n$, since then $L \ge C/n$. Now, for each $h$, if $g$ satisfies $|hg| - |h| < C$, then the closest point on the geodesic from $h^{-1}$ to $g$ is within $\delta$ of some geodesic from id to $h^{-1}$. So as $|g|$ goes to infinity, the $a$-distance from $h^{-1}$ to $g$ goes to 0. Hence for this inequality to fail to hold, almost half of the mass of $\mu^{*n}\times\mu^{*N}$ must be concentrated near the antidiagonal; i.e. the set of $(g, g^{-1}) \subset G\times G$.

From this we can deduce that either the desired inequality is satisfied, or else most of the mass of $\mu^{*n}$ must be concentrated near a single geodesic through id. Taking $n \to\infty$, the support of $\mu^{*\infty}$ must consist of exactly two points, and $G$ is seen to be elementary, contrary to hypothesis. □

Remark 4.1.5. It is a theorem of Guivarc'h (see [46], Thm. 8.14) that if $G$ is any group with a nondegenerate measure $\mu$ (always with finite first moment) for which the drift of random walk is zero, then $G$ is amenable. Some care is required to parse this statement: on an amenable group some nondegenerate measures may have positive drift, but on a nonamenable group, every nondegenerate measure has positive drift.

A nonelementary hyperbolic group always contains many nonabelian free groups, and is therefore nonamenable; this gives a more highbrow proof of Lemma 4.1.4.

Lemma 4.1.6 (Kaimanovich [22] 7.2). Let $X$ be a $\delta$-hyperbolic space. The following two conditions are equivalent for a sequence $x_n$ in $X$ and a number $L > 0$:
(1) $d(x_n, x_{n+1}) \le o(n)$ and $d(x_0, x_n) = nL + o(n)$;
(2) there is a geodesic ray $\gamma$ so that $d(x_n, \gamma(Ln)) = o(n)$.
A sequence $x_n$ satisfying either condition is said to be regular.

Proof.

That (2) implies (1) is obvious, so we show that (1) implies (2). For simplicity, we use the notation $|y| := d(x_0, y)$. The path obtained by concatenating geodesics from $x_n$ to $x_{n+1}$ has finite $a$-length, and therefore converges to some unique $x_\infty \in \partial_\infty X$.

Let $\gamma_n$ (resp. $\gamma_\infty$) be geodesic rays from the origin to $x_n$ (resp. $x_\infty$) and parameterize them by distance from the origin. Fix some positive $\epsilon$, and let $N = N(\epsilon)$ be such that for any two $n, m > N$ the geodesics $\gamma_n$ and $\gamma_m$ are within $\delta$ on the interval of length $(L-\epsilon)n$, and let $p_n = \gamma_n((L-\epsilon)n)$ so that $d(p_n, \gamma_m) \le \delta$ for $n > N$. Now, $d(p_{n-1}, p_n) \le L - \epsilon + 4\delta$ and therefore $d(p_n, p_m) \le |n-m|(L-\epsilon+4\delta)$. On the other hand, $d(p_n, p_m) \ge \big||p_m| - |p_n|\big| = |n-m|(L-\epsilon)$. Consequently the sequence $p_i$ is a quasigeodesic, and therefore there is a constant $H = H(\delta, L)$ so that $d(p_n, p_Nx_\infty) \le H$ for any $n \ge N$. Since $p_Nx_\infty$ and $\gamma_\infty$ are asymptotic, $d(p_n, \gamma_\infty) \le H + \delta$ for sufficiently large $n$, and therefore $d(x_n, \gamma_\infty) \le H + \delta + (|x_n| - n(L-\epsilon))$ for sufficiently large $n$. Taking $\epsilon \to 0$ proves the lemma, with $\gamma = \gamma_\infty$. □
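As a concrete illustration of the drift in Lemma 4.1.2 and of the linear progress in condition (1), consider simple random walk on a free group of rank $k$ with respect to a free basis (an example not treated in the text). The word length $|y_n|$ is then itself a Markov chain on the non-negative integers: from length 0 it moves to length 1, and from positive length it decreases with probability $1/2k$ and increases otherwise, so the drift is $1 - 1/k$. The sketch below, whose function name `expected_length` is hypothetical, computes $\frac{1}{n}\sum|g|\mu^{*n}(g)$ up to floating-point error by propagating the distribution of $|y_n|$.

```python
# Illustrative sketch (not from the notes): expected word length of simple
# random walk on a free group of the given rank, with a free basis as the
# generating set.

def expected_length(n, rank=2):
    """Return E|y_n|, computed from the exact law of the length chain."""
    p_back = 1.0 / (2 * rank)   # probability that a step cancels a letter
    dist = [1.0]                # dist[k] = P(|y_m| = k); the walk starts at id
    for _ in range(n):
        new = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            if p == 0.0:
                continue
            if k == 0:
                new[1] += p                       # every step leaves id
            else:
                new[k - 1] += p * p_back          # cancel the last letter
                new[k + 1] += p * (1.0 - p_back)  # append a new letter
        dist = new
    return sum(k * p for k, p in enumerate(dist))


drift_estimate_rank2 = expected_length(400, rank=2) / 400
drift_estimate_rank3 = expected_length(400, rank=3) / 400
```

For rank 2 the estimate is close to $1/2$ and for rank 3 close to $2/3$; the small excess comes from the finitely many expected visits to the identity, where the length is forced to increase.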

Together with Lemma 4.1.4 this gives the following corollary:

Corollary 4.1.7 (Kaimanovich [22] 7.3). Let $G$ be a nonelementary hyperbolic group, and let $\mu$ be a nondegenerate probability measure on $G$ with finite first moment. Then there is $L > 0$ so that for a.e. random walk $y$ there is a unique geodesic ray $\gamma_y$ with $d(y_n, \gamma_y(Ln)) = o(n)$.

Proof. It suffices to show that if $\mu$ has finite first moment, then $d(y_n, y_{n+1}) = o(n)$ almost surely. Let $z = Dy$, and for any $\epsilon > 0$ let $E_n$ be the event that $|z_n| \ge \epsilon n$. Then the probability of $E_n$ is $\sum_{|g|\ge\epsilon n}\mu(g)$, and therefore
\[ \sum_n P(E_n) = \sum_n\sum_{|g|\ge\epsilon n}\mu(g) \le \frac{1}{\epsilon}\sum(|g|+1)\mu(g) < \infty. \]
Therefore by the easy direction of the Borel–Cantelli lemma (see e.g. [41] 1.1.4) the probability that $E_n$ occurs infinitely often is zero. Since this is true for every $\epsilon$, we have $d(y_n, y_{n+1}) = |z_{n+1}| = o(n)$ almost surely. □

4.2. Poisson boundary.

Define an equivalence relation $\sim$ on $G^{\mathbb{N}}$ by $y \sim y'$ if and only if there are integers $k, k'$ so that $T^ky = T^{k'}y'$.

Definition 4.2.1. The measurable envelope of $\sim$ is the smallest measurable equivalence relation generated by $\sim$. The quotient measure space $(\Gamma,\nu)$ of $(G^{\mathbb{N}}, P)$ by the measurable envelope is called the Poisson boundary of $G$ with respect to $\mu$.

In other words, $\nu$-measurable functions on $\Gamma$ correspond precisely to $T$-invariant $P$-measurable functions on $G^{\mathbb{N}}$. We let $\mathrm{bnd} : G^{\mathbb{N}} \to \Gamma$ be the quotient map, so that $\mathrm{bnd}\,P = \nu$.

Now, $G$ acts on $G^{\mathbb{N}}$ on the left coordinatewise. This action commutes with $T$, and descends to an action on $\Gamma$. Since $\sim$ is $T$-invariant, $\mathrm{bnd}\,P = \mathrm{bnd}\,TP$, so $\nu = \sum_g\mu(g)\,g\nu$; i.e. the measure $\nu$ is $\mu$-stationary.

Definition 4.2.2. A $\mu$-boundary is a $G$-space with a $\mu$-stationary measure $\lambda$ which is obtained as a $T$-equivariant (measurable) quotient of $(G^{\mathbb{N}}, P)$.

Any $\mu$-boundary factors through $(\Gamma,\nu)$. A $\mu$-boundary is $\mu$-maximal if the map from $(\Gamma,\nu)$ is a measurable isomorphism. Kaimanovich [22] gave two very useful criteria for a $\mu$-boundary to be maximal.

Theorem 4.2.3 (Kaimanovich ray criterion [22] Thm. 5.5). Let $B$ be a $\mu$-boundary, and for $y \in G^{\mathbb{N}}$ let $\Pi(y) \in B$ be the image of $y$ under the ($G$-equivariant) quotient map $\Pi : G^{\mathbb{N}} \to B$. If there is a family of measurable maps $\pi_n : B \to G$ such that $P$-a.e. $d(y_n, \pi_n(\Pi(y))) = o(n)$ then $B$ is maximal.

Together with Corollary 4.1.7, this gives the following important result:

Corollary 4.2.4 (Kaimanovich [22] Thm. 7.6). Let $G$ be a nonelementary hyperbolic group, and let $\mu$ be a nondegenerate probability measure of finite first moment. Let $\Pi : G^{\mathbb{N}} \to \partial_\infty G$ take a random walk to its endpoint (which exists $P$-a.e.), and let $\lambda = \Pi P$. Then $(\partial_\infty G, \lambda)$ is the Poisson boundary of $G, \mu$.

Proof. Simply define $\pi_n$ to be the maps that take a point $y \in \partial_\infty G$ to $\gamma_y(nL)$ where $\gamma_y$ is a parameterized geodesic ray from id to $y$, and $L$ is the drift. □

4.3. Harmonic functions.

Definition 4.3.1. If $f$ is a function on $G$, the operator $P_\mu$ (convolution with $\mu$) is defined by $P_\mu f(g) := \sum_hf(gh)\mu(h)$. A function $f$ on $G$ is $\mu$-harmonic (or just harmonic if $\mu$ is understood) if it is fixed by $P_\mu$; i.e. if it satisfies $f(g) = \sum_hf(gh)\mu(h)$ for all $g$ in $G$.

In general we need to impose some condition on $f$ for $\sum_hf(gh)\mu(h)$ to be defined. If the support of $\mu$ is finite, then $f$ can be arbitrary, but if the support of $\mu$ is infinite, we usually (but not always!) require $f$ to be in $L^\infty$. We let $H^\infty(G,\mu)$ denote the Banach space of bounded $\mu$-harmonic functions on $G$.

In probabilistic terms, if $f$ is harmonic and $y \in G^{\mathbb{N}}$ is a random walk, the random variables $f_n := f(y_n)$ are a martingale; i.e. the expected value of $f_n$ given $y_{n-1}$ is $f_{n-1}$ (see e.g. [41]).

Lemma 4.3.2.

The Banach spaces H ∞ ( G, µ ) and L ∞ (Γ , ν ) are isometric.Proof. Given f ∈ H ∞ ( G, µ ) and y ∈ G N , the random variables f ( y n ) are a boundedmartingale, and therefore by the martingale convergence theorem ([41] Thm. 5.2.22),converge a.s. to a well-deﬁned limit. Evidently this limit is measurable and T -invariant, and therefore descends to a function on Γ which we denote b f . Explicitly, b f (bnd y ) := lim n →∞ f ( y n ).Conversely, given b f ∈ L ∞ (Γ , ν ) we deﬁne f ( g ) = R Γ b f d ( g ∗ ν ) (this expression isknown as the Poisson formula ). Since ν is stationary, f is harmonic.The mean value property of harmonic functions implies that these maps areisometries, since a harmonic function achieves its maximum on the boundary. (cid:3) Note that the Poisson formula is available for any µ -boundary. That is, if B is a G -space with a µ -stationary probability measure λ , and b f is any element of L ∞ ( B, λ ), then f ( g ) := R B b f d ( gλ ) is a bounded harmonic function on G . If λ isnot invariant, f is typically nonconstant.The remainder of this section is devoted to some miscellaneous applications ofrandom walks to hyperbolic and other groups.4.4. Green metric.

There is a close resemblance between the measure $\nu$ and the Patterson–Sullivan measures constructed in § 2.5. The comparison involves a metric on $G$ adapted to the random walk, namely the so-called Green metric.

Definition 4.4.1. Let $G$ be a group and $\mu$ a probability measure on $G$ with finite first moment. The Green metric on $G$ is the metric for which the distance between $g$ and $h$ is $-\log$ of the probability that random walk starting at $g$ ever hits $h$.

If $\mu$ is symmetric, so is the Green metric, since random walks are time-reversible. Note that the Green metric is degenerate if random walk is recurrent. For simple random walk, this occurs only if $G$ is finite, or is virtually $\mathbb{Z}$ or $\mathbb{Z}^2$, by a classical result of Varopoulos (see [45]). For nondegenerate measures with finite first moment on non-elementary hyperbolic groups, Blachère and Brofferio [1] show that the Green metric and the word metric are quasi-isometric (one needs to be somewhat careful: the Green metric is not in general a geodesic metric).
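As a concrete illustration, which is not taken from the text above, consider simple random walk on the free group $F_2$ with its standard generators, whose Cayley graph is the 4-regular tree. The following is a minimal sketch under that assumption. The probability $q$ of ever hitting a given neighbor satisfies $q = 1/4 + (3/4)q^2$ (step there directly, or step elsewhere, return, and then hit it), and the relevant root is the smallest one. Since the Cayley graph is a tree, the probability of hitting $h$ from $g$ is $q^{|g^{-1}h|}$, so the Green metric here is $\log(1/q)$ times the word metric. The helper names `times_gen`, `hit_neighbor_prob`, `hit_identity_prob`, `green_distance` and `convolve` are hypothetical and chosen only for this example.

```python
import math

GENS = ("a", "A", "b", "B")  # generators of F_2; capitals denote inverses
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}


def times_gen(word, s):
    """Right-multiply a reduced word (a tuple of letters) by the generator s."""
    if word and word[-1] == INVERSE[s]:
        return word[:-1]  # free cancellation
    return word + (s,)


def hit_neighbor_prob(iterations=200):
    """Smallest solution of q = 1/4 + (3/4) q^2, found by iterating from q = 0."""
    q = 0.0
    for _ in range(iterations):
        q = 0.25 + 0.75 * q * q
    return q


def hit_identity_prob(word, q):
    """Probability that the walk started at `word` ever hits the identity."""
    return q ** len(word)


def green_distance(word, q):
    """Green distance from `word` to the identity: -log(hitting probability)."""
    return -math.log(hit_identity_prob(word, q))


def convolve(f, word):
    """(P_mu f)(g) = sum_h f(gh) mu(h), for mu uniform on the four generators."""
    return sum(f(times_gen(word, s)) for s in GENS) / len(GENS)


q = hit_neighbor_prob()
g = ("a", "b", "A")  # a reduced word of length 3


def f(word):
    return hit_identity_prob(word, q)


# The last value is the defect (P_mu f)(g) - f(g) at a non-identity element.
print(q, green_distance(g, q), convolve(f, g) - f(g))
```

In this sketch the last printed value should vanish up to floating-point rounding: the hitting probability, viewed as a function of the starting point, is fixed by $P_\mu$ at every element other than the target. It fails to be harmonic at the target itself, where its value is 1.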

Theorem 4.4.2 (Blachère–Haïssinsky–Mathieu [2], Thm. 1.3). Let $G$ be a non-elementary hyperbolic group, and for $y \in \partial_\infty G$, let $B(y, R)$ denote the ball of radius $R$ in the $a$-metric (see Definition 2.4.6). Let $\mu$ be a symmetric probability measure with finite first moment, and let $\nu$ be the associated harmonic measure on $\partial_\infty G$. Then for $\nu$-almost every $y \in \partial_\infty G$, there is convergence
$$\lim_{R \to 0} \log \nu(B(y, R)) / \log R = \ell_G / aL$$
where $L$ is the drift in the word metric, and $\ell_G$ is the drift in the Green metric.

Note that Kingman's subadditive ergodic theorem implies that the drift $\ell_G$ with respect to the Green metric is well-defined, essentially by the same argument as the proof of Lemma 4.1.2.

4.5. Harnack inequality.

The classical Harnack inequality relates the values of a positive harmonic function at two points. In its infinitesimal version, it asserts an upper bound on the logarithmic derivative of a positive harmonic function.

Let $f$ be a non-negative bounded harmonic function on $\mathbb{H}^n$, for simplicity. The Poisson formula says that $f(p) = \int_{S^{n-1}_\infty} \hat f \, d\nu_p$ where $\nu_p$ is the visual measure as seen from $p$. If $\nu$ is visual measure as seen from the origin, and $g$ is any isometry taking the origin to $p$, then $\nu_p = g_*\nu$. To understand how $f$ varies as a function of $p$ therefore, it suffices to understand how $\nu_p$ varies as a function of $p$. If $B$ is an infinitesimal ball centered at some point $y$ in $S^{n-1}_\infty$, then the visual size of $B$ grows like $e^{t(n-1)}$ as one moves distance $t$ in the direction of $y$. Hence:

Proposition 4.5.1 (Harnack inequality). Let $f$ be a non-negative bounded harmonic function on $\mathbb{H}^n$. Then the logarithmic derivative of $f$ satisfies the inequality $|d \log f| \le (n-1)$.

If $f$ is a non-negative harmonic function on a group $G$, the analog of this inequality is $f(gs)/f(g) \le e^D$ for any $g \in G$ and $s \in S$ where $D$ is the dimension of $\nu$, which can be determined from Theorem 4.4.2.

If $S$ is a closed surface of genus $\ge 2$ and $\rho : \pi_1(S) \to G$ is injective, then $\pi_1(S)$ acts on $G$ by left translation, and there is an associated foliated bundle with fiber the ideal circle $\partial_\infty \pi_1(S)$ with its natural $\pi_1(S)$ action. We can build a harmonic connection for this circle bundle; i.e. a choice of measure $m_g$ on the circle $S^1(g)$ over each $g \in G$ so that for any subset $A \subset S^1$ we have $m_g(A) = \sum_s \mu(s)\, m_{gs}(A)$. Since the circle is 1-dimensional, these measures integrate to metrics on the circles $S^1(g)$ for which the curvature is harmonic. The Harnack inequality then gives a priori bounds on this curvature, and one can deduce local compactness results for families of injective surface maps of variable genus. For stable minimal surfaces in hyperbolic 3-manifolds, such a priori bounds were obtained by Schoen [38] and are an important tool in low-dimensional topology. The idea of using Harnack-type inequalities to obtain curvature bounds is due to Thurston [44] (also see [6], Example 4.6).

4.6. Monotonicity.

A norm on a group is a non-negative function $\tau : G \to \mathbb{R}$ so that $\tau(gh) \le \tau(g) + \tau(h)$ for all $g, h \in G$. A functor from groups to norms is monotone if $\tau_H(\phi(g)) \le \tau_G(g)$ for any $g \in G$ and $\phi : G \to H$.

If $\tau$ is a norm on $G$, and $\mu$ is a probability measure with finite first moment, it makes sense to study the growth rate of $\tau$ under $\mu$-random walk on $G$. If $G$ is finitely generated, one can study the growth rate of $\tau$ under all simple random walks; if they all have the same growth rate, this rate is an invariant of $G$. Since $\mu$ random walk on $G$ pushes forward to $\phi_*\mu$ random walk on $\phi(G) = H$, the growth rate of a monotone family of norms cannot increase under a homomorphism; thus if the growth rate of $\tau_G$ on $G$ is strictly smaller than the growth rate of $\tau_H$ on $H$, there are strong constraints on the homomorphisms from $G$ to $H$.

As an example, consider the commutator length cl. For any group $G$ and any $g$ in the commutator subgroup $[G, G]$, the commutator length $\mathrm{cl}(g)$ is just the least number of commutators in $G$ whose product is $g$ (for technical reasons, one usually studies a closely related quantity, namely the stable commutator length; see e.g. [7] for an introduction).

One of the main theorems of [9] is as follows:

Theorem 4.6.1 (Calegari–Maher [9]). Let $G$ be hyperbolic, and let $\mu$ be a nondegenerate symmetric probability measure with finite first moment whose support generates a nonelementary subgroup. There is a constant $C$ so that if $g_n$ is obtained by random walk of length $n$, conditioned to lie in $[G, G]$, then
$$C^{-1} n / \log(n) \le \mathrm{cl}(g_n) \le C n / \log(n)$$
with probability $1 - O(C^{-n^c})$.
Said another way, commutator length grows like $n/\log(n)$ under random walk in a hyperbolic group. Similar estimates on commutator length can be obtained for groups acting in a suitable way on (not necessarily proper) hyperbolic spaces; the most important examples are mapping class groups and relatively hyperbolic groups.

As a corollary, if $H$ is any finitely generated group, and commutator length in $H$ grows like $o(n/\log(n))$ for simple random walk (with respect to some generating set), then there are no interesting homomorphisms from $H$ to any hyperbolic group $G$, and no interesting actions of $H$ on certain hyperbolic complexes.

5. Acknowledgments

I would like to thank Vadim Kaimanovich, Anders Karlsson, Joseph Maher,Curt McMullen, Richard Sharp, and Alden Walker. I would also like to thank theanonymous referee for some useful comments. Danny Calegari was supported byNSF grant DMS 1005246.

References

[1] S. Blachère and S. Brofferio, Internal diffusion limited aggregation on discrete groups having exponential growth, Probab. Theory Relat. Fields (2007), 323–343
[2] S. Blachère, P. Haïssinsky and P. Mathieu, Harmonic measures versus quasiconformal measures for hyperbolic groups, preprint, arXiv:0806.3915
[3] B. Bowditch, A topological characterization of hyperbolic groups, Jour. AMS (1998), no. 3, 643–667
[4] R. Bowen, Equilibrium states and the ergodic theory of Anosov diffeomorphisms, Lecture Notes in Mathematics, Springer-Verlag, 1975
[5] R. Brooks, Some remarks on bounded cohomology, Riemann surfaces and related topics: Proceedings of the 1978 Stony Brook Conference (SUNY Stony Brook NY 1978), Ann. Math. Studies, Princeton University Press, 1981, 53–63
[6] D. Calegari, Foliations and the geometry of 3-manifolds, Oxford Mathematical Monographs, Oxford University Press, Oxford, 2007
[7] D. Calegari, scl, MSJ Memoirs, Mathematical Society of Japan, Tokyo, 2009
[8] D. Calegari and K. Fujiwara, Combable functions, quasimorphisms and the central limit theorem, Ergodic Theory Dynam. Systems (2010), no. 5, 1343–1369
[9] D. Calegari and J. Maher, Statistics and compression of scl, preprint, arXiv:1008.4952
[10] J. Cannon, The combinatorial structure of cocompact discrete hyperbolic groups, Geom. Ded. (1984), no. 2, 123–148
[11] Z. Coelho and W. Parry, Central limit asymptotics for shifts of finite type, Israel J. Math. (1990), no. 2, 235–249
[12] M. Coornaert, Mesures de Patterson–Sullivan sur le bord d'un espace hyperbolique au sens de Gromov, Pac. J. Math. (1993), no. 2, 241–270
[13] M. Coornaert, T. Delzant and A. Papadopoulos, Géométrie et théorie des groupes, Les groupes hyperboliques de Gromov, Springer-Verlag, Berlin, 1990
[14] M. Coornaert and A. Papadopoulos, Symbolic dynamics and hyperbolic groups, Lecture Notes in Mathematics, Springer-Verlag, 1993
[15] D. Epstein, J. Cannon, D. Holt, S. Levy, M. Paterson and W. Thurston, Word processing in groups, Jones and Bartlett, Boston, MA, 1992
[16] D. Epstein and K. Fujiwara, The second bounded cohomology of word-hyperbolic groups, Topology (1997), 1275–1289
[17] H. Federer, Geometric measure theory, Grund. der math. Wiss., Springer-Verlag New York, 1969
[18] P. Flajolet and R. Sedgewick, Analytic combinatorics, Cambridge University Press, Cambridge, 2009
[19] A. Furman, Random walks on groups and random transformations, Handbook of dynamical systems, Vol. 1a, 931–1014, North-Holland, Amsterdam, 2002
[20] M. Gromov, Hyperbolic groups, Essays in group theory, MSRI Publ., 75–263, Springer, New York, 1987
[21] P. Hall and C. Heyde, Martingale limit theory and its application, Academic Press, New York, 1980
[22] V. Kaimanovich, The Poisson formula for groups with hyperbolic properties, Annals of Mathematics (2000), 659–692
[23] V. Kaimanovich and A. Vershik, Random walks on discrete groups: boundary and entropy, Ann. Probab. (1983), no. 3, 457–490
[24] S. Kakutani, Random ergodic theorems and Markoff processes with a stable distribution, Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, 1950, University of California Press, Berkeley (1951), 247–261
[25] M. Kapovich and B. Kleiner, Hyperbolic groups with low dimensional boundary, Ann. Sci. École Norm. Sup. (4) (2000), no. 5, 647–669
[26] H. Kesten, Full Banach means on countable groups, Math. Scand. (1959), 146–156
[27] C. Maclachlan and A. Reid, The arithmetic of hyperbolic 3-manifolds, Springer-Verlag GTM, Springer-Verlag, New York, 2003
[28] B. Maskit, Kleinian groups, Grund. der Math. Wiss., Springer-Verlag, Berlin, 1988
[29] P. Nicholls, The ergodic theory of discrete groups, LMS, Cambridge University Press, Cambridge, 1989
[30] S. Patterson, The limit set of a Fuchsian group, Acta Math. (1976), 241–273
[31] G. Perelman, The entropy formula for the Ricci flow and its geometric applications, preprint, arXiv:math/0211159
[32] G. Perelman, Ricci flow with surgery on three-manifolds, preprint, arXiv:math/0303109
[33] J.-C. Picaud, Cohomologie bornée des surfaces et courants géodésiques, Bull. Soc. Math. France (1997), no. 1, 115–142
[34] M. Pollicott, A complex Ruelle-Perron-Frobenius theorem and two counterexamples, Ergodic Theory Dynam. Systems (1984), no. 1, 135–146
[35] M. Pollicott and R. Sharp, Comparison theorems and orbit counting in hyperbolic geometry, Trans. AMS (1998), no. 2, 473–499
[36] D. Ruelle, Thermodynamic formalism, Addison-Wesley Publishing Co., Reading, Mass., 1978
[37] R. Sharp, Local limit theorems for free groups, Math. Ann. (2001), 889–904
[38] R. Schoen, Estimates for stable minimal surfaces in three-dimensional manifolds, Seminar on minimal submanifolds, pp. 111–126, Ann. of Math. Stud., Princeton Univ. Press, Princeton, N.J., 1983
[39] J. Stallings, On torsion-free groups with infinitely many ends, Ann. Math. (2) (1968), 312–334
[40] M. Steele, Kingman's subadditive ergodic theorem, Ann. Inst. Henri Poincaré (1989), no. 1, 93–98
[41] D. Stroock, Probability Theory, an analytic view, Cambridge University Press, Cambridge, 1993
[42] D. Sullivan, On the ergodic theory at infinity of an arbitrary discrete group of hyperbolic motions, Riemann surfaces and related topics: Proceedings of the 1978 Stony Brook Conference (State Univ. New York, Stony Brook, N.Y., 1978), pp. 465–496, Ann. of Math. Stud., Princeton Univ. Press, Princeton, N.J., 1981
[43] D. Sullivan, The density at infinity of a discrete group of hyperbolic motions, Publ. Math. IHES (1979), 171–202
[44] W. Thurston, 3-manifolds, foliations and circles II, preprint
[45] N. Varopoulos, L. Saloff-Coste, and T. Coulhon, Analysis and geometry on groups, Cambridge Tracts in Math., Cambridge University Press, Cambridge, 1992
[46] W. Woess, Random walks on infinite graphs and groups, Cambridge University Press, Cambridge, 2000

Department of Mathematics, Caltech, Pasadena CA, 91125
