A quantitative multi-parameter mean ergodic theorem
Andrei Sipoş a,b
a Department of Mathematics, Technische Universität Darmstadt, Schlossgartenstrasse 7, 64289 Darmstadt, Germany
b Simion Stoilow Institute of Mathematics of the Romanian Academy, Calea Griviţei 21, 010702 Bucharest, Romania
E-mail: [email protected]
Abstract
We use techniques of proof mining to obtain a computable and uniform rate of metastability (in the sense of Tao) for the mean ergodic theorem for a finite number of commuting linear contractive operators on a uniformly convex Banach space.
Mathematics Subject Classification 2010: 37A30, 47A35, 03F10.
Keywords: Proof mining, mean ergodic theorem, uniformly convex Banach spaces, rate of metastability.
In the mid-1930s, Riesz proved the following formulation of the classical mean ergodic theorem of von Neumann [27]: if $X$ is a Hilbert space and $T : X \to X$ is a linear operator such that $\|Tx\| \le \|x\|$ for all $x \in X$, then for any $x \in X$ the corresponding sequence of ergodic averages $(x_n)$, where for each $n$,
\[ x_n := \frac{1}{n+1} \sum_{k=0}^{n} T^k x, \]
is convergent.

It is then natural to ask for a rate of convergence. This ties into the area of proof mining, an applied subfield of mathematical logic. This program in its current form has been developed in the last twenty years primarily by Ulrich Kohlenbach and his collaborators (see [9] for a comprehensive monograph; a recent survey which serves as a short and accessible introduction is [12]), and seeks to apply proof-theoretic tools to results in ordinary mathematics in order to extract information which may not be immediately apparent. Unfortunately, it is known that ergodic convergence can be arbitrarily slow [19], i.e. there cannot exist a uniform rate of convergence. Kohlenbach's work then suggests that one should look instead at the following (classically but not constructively) equivalent form of the Cauchy property:
\[ \forall \varepsilon > 0 \ \forall g : \mathbb{N} \to \mathbb{N} \ \exists N \in \mathbb{N} \ \forall i, j \in [N, N + g(N)] \ \left( \|x_i - x_j\| \le \varepsilon \right), \]
which has been arrived at independently by Terence Tao in his own work on ergodic theory [26] and popularized in [25]; as a result of the latter, the property got its name of metastability (at the suggestion of Jennifer Chayes).
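To make the metastability statement concrete, here is a minimal Python sketch that searches by brute force for a witness $N$ for a real-valued sequence. It only illustrates the definition; it is not the proof-theoretic extraction discussed in this paper, and the function name `metastable_point` and the cutoff `n_max` are hypothetical choices made for this example.

```python
from fractions import Fraction
from typing import Callable, Optional


def metastable_point(
    x: Callable[[int], Fraction],
    eps: Fraction,
    g: Callable[[int], int],
    n_max: int,
) -> Optional[int]:
    """Least N <= n_max with |x(i) - x(j)| <= eps on [N, N + g(N)], else None."""
    for N in range(n_max + 1):
        window = [x(i) for i in range(N, N + g(N) + 1)]
        # For real-valued sequences the largest pairwise distance is max - min.
        if max(window) - min(window) <= eps:
            return N
    return None


# Hypothetical example: x_n = 1/(n+1), eps = 1/10, g(N) = N + 5.
N = metastable_point(
    lambda n: Fraction(1, n + 1), Fraction(1, 10), lambda n: n + 5, 100
)
```

For this example the search stops at `N = 6`, since $\frac{1}{7} - \frac{1}{18} = \frac{11}{126} \le \frac{1}{10}$ while $\frac{1}{6} - \frac{1}{16} = \frac{5}{48} > \frac{1}{10}$. Exact rationals are used so that the comparison with $\varepsilon$ is not affected by floating-point rounding.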
Kohlenbach's metatheorems then guarantee the existence of a computable and uniform rate of metastability, that is, a bound $\Theta(\varepsilon, g)$ on the $N$ in the sentence above, extractable from any proof that shows the convergence of a given class of sequences and that may be formalized in one of the logical systems for which such metatheorems have so far been developed. In the late 2000s, Avigad, Gerhardy and Towsner [1] carried out the actual extraction for Riesz's proof mentioned above. (That proof was suggested by an idea of Carleman [4] and it first appeared in Hopf's 1937 book on ergodic theory [8]. The argument was published in an individual paper only in 1943 [22].)

In 1939, Birkhoff generalized [3] the mean ergodic theorem to uniformly convex Banach spaces; in a paper published in 2009 [13], Kohlenbach and Leuştean analyzed that proof in order to obtain a quantitative version. The main technical obstacle to overcome was the use in the proof of the principle that any sequence of nonnegative reals $(a_n)$ has an infimum, which they managed to replace by an arithmetical greatest lower bound principle, stating that for any $\varepsilon > 0$ there is an $n \in \mathbb{N}$ such that for any $m \in \mathbb{N}$, $a_n \le a_m + \varepsilon$, and of which they gave a quantitative version. Thus, they extracted a rate of metastability which depended, in addition, on the modulus of uniform convexity. In the intervening years, proof mining has continued this line of research, yielding rates of metastability for nonlinear generalizations of ergodic averages [10, 11, 14, 15, 24, 18, 20, 7, 17] or bounds on the number of fluctuations [2] (see [16] for a detailed proof-theoretic study of the concept).

Inspired by Birkhoff's proof, Riesz produced a new one in 1941 [21] that separates more clearly the role played by uniform convexity (i.e. the fact that in such spaces minimizing sequences of convex sets are convergent). One of the advantages of this argument, pointed out in [23, p. 412], is that it readily generalizes to the case where one deals with more than one contractive operator (a result attributed there to Dunford [6]): for example, if $d \ge 1$ and $T_1, \ldots, T_d : X \to X$ are commuting linear operators such that for each $l$ and for each $x \in X$, $\|T_l x\| \le \|x\|$, then for any $x \in X$, the sequence $(x_n)$, defined, for any $n$, by
\[ x_n := \frac{1}{(n+1)^d} \sum_{k_1=0}^{n} \cdots \sum_{k_d=0}^{n} T_1^{k_1} \cdots T_d^{k_d} x, \]
is convergent. The goal of this paper is to extract out of that argument a rate of metastability for the above multi-parameter mean ergodic theorem.
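As an illustration of the multi-parameter averages, the following minimal Python sketch evaluates $x_n$ in a toy model chosen only for this example: $X = \mathbb{C}$ (a one-dimensional Hilbert space) and each $T_l$ is multiplication by $e^{i\theta_l}$, so the operators are linear, commuting and contractive. The function name `ergodic_average` and the parameter `thetas` are hypothetical.

```python
import cmath
from itertools import product
from typing import Sequence


def ergodic_average(thetas: Sequence[float], x: complex, n: int) -> complex:
    """x_n = (n+1)^(-d) * sum over k in [0,n]^d of T_1^{k_1} ... T_d^{k_d} x.

    Toy model (an assumption of this sketch): T_l is multiplication by
    exp(i * thetas[l]) on the complex plane.
    """
    d = len(thetas)
    total = 0j
    for k in product(range(n + 1), repeat=d):
        # T_1^{k_1} ... T_d^{k_d} x is x rotated by the sum of k_l * theta_l.
        phase = sum(k_l * theta for k_l, theta in zip(k, thetas))
        total += cmath.exp(1j * phase) * x
    return total / (n + 1) ** d


# Both operators are the identity: every average equals x.
fixed = ergodic_average([0.0, 0.0], 1 + 2j, 5)
# One genuine rotation: the averages tend to 0.
decaying = ergodic_average([1.0, 0.0], 1 + 0j, 200)
```

In this model the limit is $x$ when every $\theta_l$ is a multiple of $2\pi$ and $0$ otherwise, which matches projecting onto the common fixed points of the $T_l$. The direct sum over $[0,n]^d$ costs $(n+1)^d$ terms, so the sketch is only practical for small $d$ and $n$.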
Towards that end, we noticed that the infimum of all the convex combinations of the iterates of $x$ may be effectively replaced by that of just the arithmetic means of pairs of ergodic averages. Therefore, we only need to extend the abovementioned principle of Kohlenbach and Leuştean to double sequences, which we do in Lemma 2.2 by means of the Cantor pairing function. After stating the facts that we shall need about uniformly convex Banach spaces, we express our quantitative metastability result in the form of Theorem 2.4.

For all $f : \mathbb{N} \to \mathbb{N}$, we define $\tilde{f} : \mathbb{N} \to \mathbb{N}$, for all $n$, by $\tilde{f}(n) := n + f(n)$ and $f^M : \mathbb{N} \to \mathbb{N}$, for all $n$, by $f^M(n) := \max_{i \le n} f(i)$; in addition, for all $n \in \mathbb{N}$, we denote by $f^{(n)}$ the $n$-fold composition of $f$ with itself.

We may now state the quantitative arithmetical greatest lower bound principle of Kohlenbach and Leuştean [13].

Lemma 2.1 (cf. [13, Lemma 3.1]). Let $(a_n) \subseteq [0,1]$. Then for all $\varepsilon > 0$ and all $g : \mathbb{N} \to \mathbb{N}$ there is an $N \le \left(g^M\right)^{(\lceil 1/\varepsilon \rceil)}(0)$ such that for all $s \le g(N)$, $a_N \le a_s + \varepsilon$.

In order to extend the above principle to double sequences, we make use of the bijection $c : \mathbb{N}^2 \to \mathbb{N}$, known as the Cantor pairing function, defined for all $m, n \in \mathbb{N}$ by
\[ c(m,n) := \frac{(m+n)(m+n+1)}{2} + n. \]
Note that for all $m$ and $n$, we have firstly that $m, n \le c(m,n)$ and then that for all $s$ with $m, n \le s$, $c(m,n) \le 2s^2 + 2s$.

Lemma 2.2. Let $(a_{m,n}) \subseteq [0,1]$. Define, for any suitable $g$, $\varepsilon$, $s$:
\[ f(s) := 2s^2 + 2s, \qquad G(\varepsilon, g) := \left( (f \circ g)^M \right)^{(\lceil 1/\varepsilon \rceil)}(0). \]
Then for all $\varepsilon > 0$ and all $g : \mathbb{N} \to \mathbb{N}$ there is an $N \le G(\varepsilon, g)$ and $p, q \le N$ such that for all $i, j \le g(N)$, $a_{p,q} \le a_{i,j} + \varepsilon$.

Proof. Put, for all $s$, if $m$ and $n$ are such that $c(m,n) = s$, $b_s := a_{m,n}$ (using that $c$ is bijective). By Lemma 2.1, there is an $N \le G(\varepsilon, g)$ such that for all $s \le f(g(N))$, $b_N \le b_s + \varepsilon$. Let $p$ and $q$ be such that $c(p,q) = N$, so $p, q \le N$. Let $i, j \le g(N)$. Since then $c(i,j) \le f(g(N))$, $b_N \le b_{c(i,j)} + \varepsilon$, so $a_{p,q} \le a_{i,j} + \varepsilon$.

Uniform convexity in Banach spaces was first introduced by Clarkson [5]. We use the following formulation: a Banach space $X$ is uniformly convex if there is an $\eta : (0, \infty) \to (0, 1]$, called a modulus of uniform convexity, such that for all $\varepsilon > 0$ and all $x, y \in X$ with $\|x\| \le 1$, $\|y\| \le 1$ and $\|x - y\| \ge \varepsilon$ one has that
\[ \left\| \frac{x+y}{2} \right\| \le 1 - \eta(\varepsilon). \]
One typically defines the fixed modulus of uniform convexity $\delta_X$ by putting, for all $\varepsilon \in (0, 2]$,
\[ \delta_X(\varepsilon) := \inf \left\{ 1 - \left\| \frac{x+y}{2} \right\| \ \Big| \ \|x\| \le 1, \ \|y\| \le 1, \ \|x - y\| \ge \varepsilon \right\} \]
and says that $X$ is uniformly convex if and only if for all $\varepsilon \in (0, 2]$, $\delta_X(\varepsilon) > 0$. It is immediate that the two definitions coincide: in this case, $\delta_X$ may serve as a modulus $\eta$ in our sense (it is, in fact, the largest such modulus) by putting for all $\varepsilon > 0$, $\eta(\varepsilon) := \delta_X(\min(\varepsilon, 2))$.

Proposition 2.3 ([13, Lemma 3.2]). Let $X$ be a Banach space with modulus of uniform convexity $\eta$. Define, for any $\varepsilon > 0$, $u(\varepsilon) := \frac{\varepsilon}{2} \cdot \eta(\varepsilon)$. Then, for any $\varepsilon > 0$ and any $x, y \in X$ with $\|x\| \le \|y\| \le 1$ and $\|x - y\| \ge \varepsilon$ we have that
\[ \left\| \frac{x+y}{2} \right\| \le \|y\| - u(\varepsilon). \]
In addition, if there is a nondecreasing function $\eta'$ such that for all $\varepsilon$, $\eta(\varepsilon) = \varepsilon \eta'(\varepsilon)$ (e.g. in the case of Hilbert spaces, where $\varepsilon \mapsto \varepsilon^2/8$ is a modulus of uniform convexity), then one can take $u$ to be simply $\eta$.

With that in mind, we may now state our main theorem.
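Before the main theorem, the combinatorial ingredients of Lemmas 2.1 and 2.2 can be sanity-checked numerically. The following is a minimal Python sketch, not part of the paper's argument: it implements the Cantor pairing function $c$, the bound $f(s) = 2s^2 + 2s$, and the quantity $G(\varepsilon, g)$, and then searches by brute force for a witness as in Lemma 2.2. All function names are hypothetical, and the double sequence is passed as a function `a(m, n)` with values in $[0,1]$.

```python
import math
from fractions import Fraction
from typing import Callable, Optional, Tuple


def cantor_pair(m: int, n: int) -> int:
    """c(m, n) = (m + n)(m + n + 1)/2 + n."""
    return (m + n) * (m + n + 1) // 2 + n


def cantor_unpair(s: int) -> Tuple[int, int]:
    """Inverse of cantor_pair: the unique (m, n) with c(m, n) = s."""
    w = (math.isqrt(8 * s + 1) - 1) // 2  # w = m + n
    n = s - w * (w + 1) // 2
    return w - n, n


def f(s: int) -> int:
    """Upper bound on c(m, n) whenever m, n <= s."""
    return 2 * s * s + 2 * s


def iterate_bound(g: Callable[[int], int], eps: Fraction) -> int:
    """(g^M)^(ceil(1/eps))(0), where g^M(n) = max of g(i) over i <= n."""
    value = 0
    for _ in range(math.ceil(1 / Fraction(eps))):
        value = max(g(i) for i in range(value + 1))
    return value


def G(eps: Fraction, g: Callable[[int], int]) -> int:
    """The bound of Lemma 2.2: iterate_bound applied to f composed with g."""
    return iterate_bound(lambda s: f(g(s)), eps)


def lemma_2_2_witness(
    a: Callable[[int, int], Fraction], eps: Fraction, g: Callable[[int], int]
) -> Optional[Tuple[int, int, int]]:
    """Brute-force search for (N, p, q) with N <= G(eps, g), p, q <= N and
    a(p, q) <= a(i, j) + eps for all i, j <= g(N)."""
    for N in range(G(eps, g) + 1):
        r = range(g(N) + 1)
        target = min(a(i, j) for i in r for j in r) + eps
        for p in range(N + 1):
            for q in range(N + 1):
                if a(p, q) <= target:
                    return N, p, q
    return None


# Hypothetical example: a(m, n) = 1/(1 + m + n), eps = 1/2, g(n) = n + 1.
witness = lemma_2_2_witness(
    lambda m, n: Fraction(1, 1 + m + n), Fraction(1, 2), lambda n: n + 1
)
```

The bound $G$ grows very quickly with $\lceil 1/\varepsilon \rceil$ because $f$ is quadratic, so the sketch is only meant for small parameters; the brute-force search typically finds a witness far below the bound.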
Theorem 2.4. Let $X$ be a Banach space. Let $G$ be defined as in Lemma 2.2. Define, for any suitable $\varepsilon$, $g$, $u$, $d$, $\delta$, $Q$, $\gamma$, $n$:
\[ \Phi(d, \delta, Q) := \max\left( Q, \left\lceil \frac{2^d Q}{\delta} \right\rceil \right), \]
\[ h_{\gamma,g,d}(n) := \tilde{g}\left( \Phi\left( d, \frac{\gamma}{2}, n \right) \right), \]
\[ \Psi(d, \gamma, g) := \Phi\left( d, \frac{\gamma}{2}, G\left( \frac{\gamma}{2}, h_{\gamma,g,d} \right) \right), \]
\[ \Theta_{u,d}(\varepsilon, g) := \Psi\left( d, \frac{u(\varepsilon)}{2}, g \right). \]
Let $u$ be such that the property described by Proposition 2.3 holds. Let $d \ge 1$ and $T_1, \ldots, T_d : X \to X$ be commuting linear operators such that for all $l$, $T_l$ is contractive, i.e. for all $x \in X$, $\|T_l x\| \le \|x\|$. We denote, for any $x \in X$ and $i \in \mathbb{N}^d$, $T^i x := T_1^{i_1} \cdots T_d^{i_d} x$. Let $x \in X$ with $\|x\| \le 1$. Put, for all $n$,
\[ x_n := \frac{1}{(n+1)^d} \sum_{k \in [0,n]^d} T^k x. \]
Let $\varepsilon > 0$ and $g : \mathbb{N} \to \mathbb{N}$. Then there is an $N \le \Theta_{u,d}(\varepsilon, g)$ such that for all $i, j \in [N, N + g(N)]$, $\|x_i - x_j\| \le \varepsilon$.

Proof. We first prove the following two claims.
Claim 1. For all $\delta > 0$, all $Q \in \mathbb{N}$ and all $(c_j)_{j \in [0,Q]^d} \subseteq [0,1]$ with $\sum_{j \in [0,Q]^d} c_j = 1$, if we set
\[ z := \sum_{j \in [0,Q]^d} c_j T^j x, \]
then for all $n \ge \Phi(d, \delta, Q)$, $\|x_n\| \le \|z\| + \delta$.

Proof of claim 1: Put, for all $n$,
\[ z_n := \frac{1}{(n+1)^d} \sum_{i \in [0,n]^d} T^i z. \]
Then it is enough to show that for all $n \ge Q$ we have that $\|x_n - z_n\| \le \frac{2^d Q}{n+1}$, since then for all $n \ge \Phi(d, \delta, Q)$, given that $n \ge \frac{2^d Q}{\delta}$, one has $\frac{2^d Q}{n+1} \le \delta$, so $\|x_n - z_n\| \le \delta$ and (using here the contractivity of the $T_l$'s)
\[ \|x_n\| \le \|z_n\| + \|x_n - z_n\| \le \|z\| + \delta. \]
Let, then, $n \ge Q$. We have that (using here that the $T_l$'s are linear and commuting)
\[ z_n = \frac{1}{(n+1)^d} \sum_{i \in [0,n]^d} T^i \sum_{j \in [0,Q]^d} c_j T^j x = \frac{1}{(n+1)^d} \sum_{i \in [0,n]^d} \sum_{j \in [0,Q]^d} c_j T^{i+j} x. \]
If for any $k \in [0, n+Q]^d$ we denote by $R_k$ the set of all $j \in [0,Q]^d$ with the property that there is an $i \in [0,n]^d$ with $i + j = k$, we have that
\[ z_n = \frac{1}{(n+1)^d} \sum_{k \in [0,n+Q]^d} \sum_{j \in R_k} c_j T^k x. \]
We show that for all $k \in [Q,n]^d$, $R_k = [0,Q]^d$. Let $j \in [0,Q]^d$ and put $i := k - j$. We have to show that $i \in [0,n]^d$. Let $s \in [1,d]$. Since $Q \le k_s \le n$ and $-Q \le -j_s \le 0$, we have $0 \le i_s \le n$, which is what we needed. Thus, we have
\[ z_n = \frac{1}{(n+1)^d} \sum_{k \in [0,n+Q]^d \setminus [Q,n]^d} \sum_{j \in R_k} c_j T^k x + \frac{1}{(n+1)^d} \sum_{k \in [Q,n]^d} \sum_{j \in [0,Q]^d} c_j T^k x. \]
Since on the other hand,
\[ x_n = \frac{1}{(n+1)^d} \sum_{k \in [0,n]^d} T^k x = \frac{1}{(n+1)^d} \sum_{k \in [0,n]^d} \sum_{j \in [0,Q]^d} c_j T^k x, \]
we have, since $[Q,n]^d \subseteq [0,n]^d$, that
\[ x_n - z_n = \frac{1}{(n+1)^d} \sum_{\substack{k \in [0,n+Q]^d \setminus [Q,n]^d \\ k \in [0,n]^d}} \ \sum_{\substack{j \in [0,Q]^d \\ j \notin R_k}} c_j T^k x \ - \ \frac{1}{(n+1)^d} \sum_{\substack{k \in [0,n+Q]^d \setminus [Q,n]^d \\ k \notin [0,n]^d}} \ \sum_{j \in R_k} c_j T^k x, \]
so, using that $\|x\| \le 1$ and the contractivity of the $T_l$'s,
\[ \|x_n - z_n\| \le \frac{1}{(n+1)^d} \sum_{\substack{k \in [0,n+Q]^d \setminus [Q,n]^d \\ k \in [0,n]^d}} 1 \ + \ \frac{1}{(n+1)^d} \sum_{\substack{k \in [0,n+Q]^d \setminus [Q,n]^d \\ k \notin [0,n]^d}} 1 = \frac{\left| [0,n+Q]^d \setminus [Q,n]^d \right|}{(n+1)^d}, \]
so, by putting $m := n + 1$,
\[ \|x_n - z_n\| \le \frac{(n+Q+1)^d - (n-Q+1)^d}{(n+1)^d} = \frac{(m+Q)^d - (m-Q)^d}{m^d}. \]
Since
\[ (m+Q)^d = m^d + \binom{d}{1} m^{d-1} Q + \binom{d}{2} m^{d-2} Q^2 + \binom{d}{3} m^{d-3} Q^3 + \ldots \]
and
\[ (m-Q)^d = m^d - \binom{d}{1} m^{d-1} Q + \binom{d}{2} m^{d-2} Q^2 - \binom{d}{3} m^{d-3} Q^3 + \ldots, \]
we have that
\[ (m+Q)^d - (m-Q)^d = 2 \left( \binom{d}{1} m^{d-1} Q + \binom{d}{3} m^{d-3} Q^3 + \ldots \right). \]
Since $m \ge Q$, for each $k \ge 1$, $m^{d-k} Q^k \le m^{d-1} Q$, so
\[ (m+Q)^d - (m-Q)^d \le 2 m^{d-1} Q \left( \binom{d}{1} + \binom{d}{3} + \ldots \right) = 2^d m^{d-1} Q, \]
from which we get
\[ \|x_n - z_n\| \le \frac{2^d m^{d-1} Q}{m^d} = \frac{2^d Q}{m} = \frac{2^d Q}{n+1}, \]
which is what we wanted. ■

Claim 2.
For all $\gamma > 0$ there is an $N \le \Psi(d, \gamma, g)$ such that for all $i, j \in [N, N + g(N)]$,
\[ \|x_i\| \le \left\| \frac{x_i + x_j}{2} \right\| + \gamma; \]
note that by symmetry we also have $\|x_j\| \le \left\| \frac{x_i + x_j}{2} \right\| + \gamma$.

Proof of claim 2: By Lemma 2.2, there is a $Q \le G\left( \frac{\gamma}{2}, h_{\gamma,g,d} \right)$ and $p, q \le Q$ such that for all $i, j \le h_{\gamma,g,d}(Q)$,
\[ \left\| \frac{x_p + x_q}{2} \right\| \le \left\| \frac{x_i + x_j}{2} \right\| + \frac{\gamma}{2}. \]
Put $N := \Phi\left( d, \frac{\gamma}{2}, Q \right)$. Then, using that $\Phi$ is nondecreasing in its last argument,
\[ N \le \Phi\left( d, \frac{\gamma}{2}, G\left( \frac{\gamma}{2}, h_{\gamma,g,d} \right) \right) = \Psi(d, \gamma, g). \]
Let $i, j \in [N, N + g(N)]$. Note that
\[ N + g(N) = \tilde{g}\left( \Phi\left( d, \frac{\gamma}{2}, Q \right) \right) = h_{\gamma,g,d}(Q), \]
so
\[ \left\| \frac{x_p + x_q}{2} \right\| \le \left\| \frac{x_i + x_j}{2} \right\| + \frac{\gamma}{2}. \]
In addition, since $i \ge \Phi\left( d, \frac{\gamma}{2}, Q \right)$, we have by the previous claim (applied with $z := \frac{x_p + x_q}{2}$, which is a convex combination of the $T^j x$ with $j \in [0,Q]^d$ because $p, q \le Q$) that
\[ \|x_i\| \le \left\| \frac{x_p + x_q}{2} \right\| + \frac{\gamma}{2}. \]
Putting together the last two inequalities, we obtain our conclusion. ■
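The threshold $\Phi$ used in both claims, and the counting estimate at the heart of Claim 1, can be checked numerically. The following minimal Python sketch is only a sanity check under the definitions above, not a substitute for the proof; the function names are hypothetical.

```python
import math
from fractions import Fraction


def phi(d: int, delta: Fraction, Q: int) -> int:
    """Phi(d, delta, Q) = max(Q, ceil(2^d * Q / delta))."""
    return max(Q, math.ceil(Fraction(2**d * Q) / Fraction(delta)))


def boundary_ratio(d: int, n: int, Q: int) -> Fraction:
    """|[0, n+Q]^d minus [Q, n]^d| / (n+1)^d, for n >= Q."""
    return Fraction((n + Q + 1) ** d - (n - Q + 1) ** d, (n + 1) ** d)


def claim1_bound(d: int, n: int, Q: int) -> Fraction:
    """The bound 2^d * Q / (n+1) on ||x_n - z_n|| from the proof of Claim 1."""
    return Fraction(2**d * Q, n + 1)


# The counting estimate holds for all n >= Q on a small grid of parameters.
estimate_ok = all(
    boundary_ratio(d, n, Q) <= claim1_bound(d, n, Q)
    for d in range(1, 5)
    for Q in range(0, 6)
    for n in range(Q, Q + 25)
)

# From the threshold Phi onwards, the bound is at most delta.
delta = Fraction(1, 10)
threshold_ok = all(
    claim1_bound(d, n, Q) <= delta
    for d in range(1, 5)
    for Q in range(0, 6)
    for n in range(phi(d, delta, Q), phi(d, delta, Q) + 5)
)
```

For instance, `phi(2, Fraction(1, 10), 3)` evaluates to `120`, and from $n = 120$ on the bound $2^2 \cdot 3/(n+1)$ is at most $12/121 < 1/10$.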
Apply the above claim for $\gamma := \frac{u(\varepsilon)}{2}$ to get an $N \le \Theta_{u,d}(\varepsilon, g)$. Let $i, j \in [N, N + g(N)]$ and assume w.l.o.g. $\|x_j\| \le \|x_i\|$. Assume that $\|x_i - x_j\| > \varepsilon$. Then, using Proposition 2.3 and the conclusion of the above claim, we get
\[ \left\| \frac{x_i + x_j}{2} \right\| \le \|x_i\| - u(\varepsilon) \le \left\| \frac{x_i + x_j}{2} \right\| - \frac{u(\varepsilon)}{2}, \]
a contradiction.

Let us extend the above result to an arbitrary $x \in X$. For any $b > 0$, set $\Theta_{u,d,b}(\varepsilon, g) := \Theta_{u,d}(\varepsilon/b, g)$. Then if $b > 0$ is such that $\|x\| \le b$, if we put $\tilde{x} := x/b$ we see that $\|\tilde{x}\| \le 1$ and that for all $n$, the corresponding $\tilde{x}_n$ is equal to $x_n/b$. By applying the above theorem for $\varepsilon/b$, $g$ and $\tilde{x}$ we get that there is an $N \le \Theta_{u,d,b}(\varepsilon, g)$ such that for all $i, j \in [N, N + g(N)]$, $\|\tilde{x}_i - \tilde{x}_j\| \le \varepsilon/b$, that is, $\|x_i - x_j\| \le \varepsilon$. Thus, $\Theta_{u,d,b}$ is a rate of metastability applicable for this general situation.

This work has been supported by the German Science Foundation (DFG Project KO 1737/6-1).

References
[1] J. Avigad, P. Gerhardy, H. Towsner, Local stability of ergodic averages.
Trans. Amer. Math. Soc.
Ergodic Theory Dynam. Systems 35, 1009–1027, 2015.
[3] G. Birkhoff, The mean ergodic theorem. Duke Math. J. 5, no. 1, 19–20, 1939.
[4] T. Carleman, Application de la théorie des équations intégrales linéaires aux systèmes d'équations différentielles non linéaires. Acta Math. 59, no. 1, 63–87, 1932.
[5] J. A. Clarkson, Uniformly convex spaces. Trans. Amer. Math. Soc. 40, no. 3, 415–420, 1936.
[6] N. Dunford, An ergodic theorem for n-parameter groups. Proc. Natl. Acad. Sci. U.S.A. 25, 195–196, 1939.
[7] F. Ferreira, L. Leuştean, P. Pinto, On the removal of weak compactness arguments in proof mining. Adv. Math.
[8] E. Hopf, Ergodentheorie. Ergebnisse d. Math. u. ihrer Grenzgeb. Band 5, Heft 2, Berlin: J. Springer, 1937.
[9] U. Kohlenbach, Applied proof theory: Proof interpretations and their use in mathematics. Springer Monographs in Mathematics, Springer, 2008.
[10] U. Kohlenbach, On quantitative versions of theorems due to F. E. Browder and R. Wittmann. Adv. Math.
Commun. Contemp. Math. 14, no. 1, 1250006, 2012.
[12] U. Kohlenbach, Proof-theoretic methods in nonlinear analysis. In: B. Sirakov, P. Ney de Souza, M. Viana (eds.), Proceedings of the International Congress of Mathematicians 2018 (ICM 2018), Vol. 2 (pp. 61–82). World Scientific, 2019.
[13] U. Kohlenbach, L. Leuştean, A quantitative mean ergodic theorem for uniformly convex Banach spaces. Ergodic Theory Dynam. Systems 29, 1907–1915, 2009.
[14] U. Kohlenbach, L. Leuştean, Effective metastability of Halpern iterates in CAT(0) spaces. Adv. Math.
Adv. Math.
Philosophical Transactions of the Royal Society A Vol. 370, Issue 1971 (Theme Issue 'The foundations of computation, physics and mentality: the Turing legacy'), 3449–3463, 2012.
[16] U. Kohlenbach, P. Safarik, Fluctuations, effective learnability and metastability in analysis. Ann. Pure Appl. Log.
Commun. Contemp. Math., https://doi.org/10.1142/S0219199719500937 , 2020.
[18] D. Körnlein, Quantitative results for Halpern iterations of nonexpansive mappings. J. Math. Anal. Appl.
Monatshefte für Mathematik
κ) spaces. Ergodic Theory Dynam. Systems 36, 2580–2601, 2016.
[21] F. Riesz, Another proof of the mean ergodic theorem. Acta Univ. Szeged. Sect. Sci. Math. 10, 75–76, 1941.
[22] F. Riesz, B. Szőkefalvi-Nagy, Über Kontraktionen des Hilbertschen Raumes. Acta Univ. Szeged. Sect. Sci. Math. 10, 202–205, 1943.
[23] F. Riesz, B. Szőkefalvi-Nagy, Functional analysis. Translated by Leo F. Boron. Frederick Ungar Publishing Co., New York, 1955.
[24] P. Safarik, A quantitative nonlinear strong ergodic theorem for Hilbert spaces. J. Math. Anal. Appl.
[25] T. Tao, Structure and Randomness: Pages from Year One of a Mathematical Blog. AMS, 298 pp., 2008.
[26] T. Tao, Norm convergence of multiple ergodic averages for commuting transformations. Ergodic Theory Dynam. Systems 28, 657–688, 2008.
[27] J. von Neumann, Proof of the quasi-ergodic hypothesis.