Computer Assisted Proof of Drift Orbits Along Normally Hyperbolic Manifolds

Abstract

Normally hyperbolic invariant manifolds theory provides an efficient tool for proving diffusion in dynamical systems. In this paper we develop a methodology for computer assisted proofs of diffusion in a-priori chaotic systems based on this approach. We devise a method, which allows us to validate the needed conditions in a finite number of steps, which can be performed by a computer by means of rigorous-interval-arithmetic computations. We apply our method to the generalized standard map, obtaining diffusion over an explicit range of actions.



Maciej J. Capiński a,1,∗, Jorge Gonzalez b,2, Jean-Pierre Marco c, J.D. Mireles James d,3

a Faculty of Applied Mathematics, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland.
b School of Mathematics, Georgia Institute of Technology, 686 Cherry Street, Atlanta, GA, 30332, USA.
c Institut de Mathématiques, Analyse algébrique, Université Pierre et Marie Curie, 175 rue du Chevaleret, 75013 Paris, France.
d Department of Mathematical Sciences, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA.

Keywords: Normally hyperbolic manifold, Arnold diffusion, scattering map, topological shadowing, computer assisted proof

1. Introduction

The influence of Celestial Mechanics on the evolution of dynamical systems theory cannot be overstated. The latter originated in Newton's Principia, where the differential equations governing the motion of two planets were formulated and solved. In modern terminology, Newton showed that the two body problem is "completely integrable": that is, there are enough conserved quantities so that all solutions are obtained by taking intersections of their level sets. In the two body case this results in the conic sections.

The Newtonian $n$-body problem for $n \geq$ […] sufficient for describing the behavior of more general gravitating systems. Interest in the problem led to the development of perturbation theory in the works of prominent mathematicians like Euler, Lagrange, Laplace, Gauss, and Hamilton during the next two centuries, and to the introduction of both the Lagrangian and Hamiltonian formulations of mechanics. However, the "only" problem studied during this period was to compute with extreme precision the trajectories of (some of) the planets in the solar system, taking into account mutual interactions between them.

∗ Corresponding author.
1 Partially supported by the NCN grant 2018/[…]/B/ST1/[…].
2 Partially supported by NSF grant MSPRF DMS-2001758.
3 Partially supported by NSF grant DMS 1813501.
Preprint submitted to Elsevier, February 15, 2021. arXiv: […] [math.DS].

The work of Poincaré at the end of the 19th century revolutionized the theory, and the fundamental questions changed in several dramatic ways.

• Instead of looking at individual solutions of a given system, Poincaré realized that considering the evolution of all initial conditions enables one to use (or create new) geometric tools adapted to this setting.
• Instead of looking at the evolution of a trajectory over a finite time interval, he realized that understanding the asymptotic behavior of certain special orbits as time goes to infinity yields useful information.

As an example, recall that Poincaré used the geometry of certain infinitely long homoclinic orbits to establish that the (restricted) three body problem is not integrable, simultaneously shattering the notion that integrability was sufficient for the study of all dynamics and establishing the existence of complex phenomena never before imagined. Another stunning example where both ideas are fully exploited is the celebrated "Poincaré recurrence theorem", which uses measure (probability) theory on the geometric side, and which would have been totally unreachable by any study only focusing on the evolution of single trajectories. The impressive ensemble of ideas introduced and developed by Poincaré is nowadays considered to be the very cornerstone of dynamical systems theory.

Stability remained a major concern in the new theory, and a central problem was to understand perturbations of completely integrable systems.
Indeed the solar system itself can be viewed as a system of weakly coupled (completely integrable) two body problems, and the question of its stability has captivated mathematicians since the days of Newton. To formalize the discussion, let $\mathbb{A}^n = T^*\mathbb{T}^n = \mathbb{T}^n \times \mathbb{R}^n$ denote the annulus with the angle-action coordinates $(\theta, r)$, endowed with the symplectic form $\Omega = \sum_{i=1}^{n} dr_i \wedge d\theta_i$ (we refer to [1] for a comprehensive presentation of symplectic geometry). Consider a Hamiltonian of the form

$$H(\theta, r) = h(r) + f(\theta, r), \tag{1}$$

where $f$ is small in some suitable function space (analytic, $C^\infty$, $C^\kappa$, etcetera). The Hamiltonian differential equations generated by $H$ read

$$\dot\theta_i = \partial_{r_i} H(\theta, r) = \partial_{r_i} h(r) + \partial_{r_i} f(\theta, r),$$
$$\dot r_i = -\partial_{\theta_i} H(\theta, r) = -\partial_{\theta_i} f(\theta, r),$$

and we note that when $f \equiv 0$, all the orbits move with constant velocity on invariant tori.

When $f$ is small it is apparent that the evolution of the action variables $r_i$ is "slow". The fact that this evolution is "extremely slow" emerged from the averaging methods originally developed by Lagrange and Laplace, furthered by Poincaré and Birkhoff, and which culminated in the work of Littlewood [2] and in the major achievements of Nekhoroshev [3]. Thanks to the work of these and many subsequent authors it is now well-known that if $f$ is strictly convex and analytic, then the drift in the action variables cannot exceed a variation of $O(\varepsilon^{[\ldots]/n})$ during an $O\big(\exp(1/\varepsilon)^{[\ldots]}\big)$-long time, where $\varepsilon$ measures the size of the perturbation function $f$.

[…] $\mathbb{T}^n \times \{r\}$ persist and are only slightly deformed when the perturbation $f$ is added to the system, provided that the frequency vector $\nabla h(r)$ is Diophantine, meaning that there are constants $\gamma > 0$, $\tau > 0$ such that

$$|k \cdot \nabla h(r)| \geq \frac{\gamma}{\|k\|^{\tau}}, \qquad \forall k \in \mathbb{Z}^n \setminus \{0\}.$$

Arnold and Moser then added their own contributions to this initial result, giving rise to what is now known as the KAM theory [5, 6, 7, 8]. See also [9] for a much more complete discussion of the KAM theory and its development.

Taken together, the KAM and averaging theories provide indispensable information about the dynamics of perturbations of integrable Hamiltonian systems. The KAM theory tells us that some orbits remain close to the unperturbed dynamics for all time (the KAM tori), while the averaging theory says that all orbits stay close to the unperturbed dynamics for exponentially long times. A natural question is whether there exist orbits which move "arbitrarily" far from the integrable dynamics on a long enough time scale.

Indeed, when $n \geq 3$, a KAM torus has a connected complement in a constant energy level, and the existence of the full family of KAM tori (whose complement has an $O(\sqrt{\varepsilon})$ relative measure in this level) does not prevent trajectories from drifting away from the integrable dynamics on very long timescales.

The first example exhibiting this phenomenon was given by Arnold in [10], and had the following form:

$$H_\varepsilon(\theta, r) = r_{[\ldots]} + (r_{[\ldots]} + r_{[\ldots]}) + \mu \cos\theta_{[\ldots]} + \varepsilon g(\theta, r), \qquad \theta \in \mathbb{T}^3,\ r \in \mathbb{R}^3, \tag{2}$$

where $g$ is an explicit fixed trigonometric polynomial, with $\mu$ and $\varepsilon$ independent parameters. The example has several important special properties, which we state here for a general analytic function $g$.

• When $\mu = \varepsilon = 0$, the system reduces to $h$ and is completely integrable in angle-action form.
• When $\mu > 0$, $\varepsilon = 0$, the system $H$ is Liouville-integrable. In particular, it admits a normally hyperbolic (and symplectic) invariant annulus $\mathcal{A} = \mathbb{A}^2 \times \{O\}$, where $O = (0, 0) \in \mathbb{A}$ is the hyperbolic fixed point of the pendulum $r_{[\ldots]} + \mu \cos\theta_{[\ldots]}$. The stable and unstable manifolds of $\mathcal{A}$ take the form $W^{\pm}(\mathcal{A}) = \mathbb{A}^2 \times W^{\pm}(O)$. The Hamiltonian flow in restriction to $\mathcal{A}$ is completely integrable, in the sense that it admits a foliation by the Lagrangian (for the induced structure) invariant tori $\big(\mathbb{T}^2 \times \{(r_{[\ldots]}, r_{[\ldots]})\}\big)_{(r_{[\ldots]}, r_{[\ldots]}) \in \mathbb{R}^2}$.
• For fixed $\mu$ and small enough $\varepsilon$ (which has to be exponentially small w.r.t. $\mu$ in Arnold's example), the annulus $\mathcal{A}$ is only slightly deformed and gives rise to a 4-dimensional normally hyperbolic (symplectic) invariant annulus $\mathcal{A}_\varepsilon$ close to $\mathcal{A}$, with a rich homoclinic structure, while the Hamiltonian flow on $\mathcal{A}_\varepsilon$ is close to completely integrable.

It is important to stress that the perturbation $g$ is carefully chosen in Arnold's example, so that the annulus $\mathcal{A}$ is still invariant when $\varepsilon > 0$. By exploiting this fact Arnold was able to show that for $\mu, \varepsilon > 0$ […] $H_\varepsilon$ admits a solution $\gamma_\varepsilon(t) = \big(\theta(t), r(t)\big)$ which drifts of order 1 in action for suitable (very large) $T_{\mu,\varepsilon}$. That is, $r_{[\ldots]}(0) <$ […], $r_{[\ldots]}(T_\varepsilon) >$ […], […] "diffuse" as far and as fast from the integrable dynamics as allowed by averaging theory. (The fact that the speed of Arnold diffusion coincides with the prediction of averaging theory was indeed proved much later, see [11, 12, 13, 14].)

The use of two independent parameters (a method originally introduced by Poincaré) in Arnold's example greatly simplifies the study: (2) is to be compared with (1), where the size of $f$ is the only available parameter. Nevertheless, Arnold's example became a jumping off point for a large body of work. By now this is a thriving industry and it is known that diffusion occurs under a wide variety of hypotheses.

Another (deeper) question raised by Arnold is the case where the unperturbed Hamiltonian is completely integrable and in action-angle form (the famous "fundamental problem of dynamics" of Poincaré). Given a Hamiltonian system $h$ which depends only on the actions, does there exist a large (residual) set of perturbations $g$ such that orbits diffuse in the previous fashion, or even visit any prescribed collection of open subsets of an energy level? It turns out that this question is extremely delicate, and there are still many important open problems in this active area of research. The present discussion is by no means intended as a literature review of the field; we refer to [15] for a very nice result in any dimension, together with relevant references.

We consider another line of study, which comes from weakening the hypothesis that the unperturbed system is completely integrable. Consider for example systems of the form (2), in which the parameter $\mu$ is fixed but not small (say $\mu =$ […]). Such systems are called a priori unstable, since they already admit hyperbolic invariant objects when $\varepsilon = 0$. The main difficulty in studying a priori unstable systems is their "singular character" or lack of transversality, coming from the fact that the manifolds $W^{\pm}(\mathcal{A})$ coincide when $\varepsilon = 0$. Detecting homoclinic intersections in such systems for generic $g$ when $\varepsilon \neq 0$ […] $W^{\pm}(\mathcal{A})$ transversely intersect even in the case $\varepsilon = 0$. This class of systems is known as a priori chaotic (see [16] and [17] for examples in this category closely related to Arnold's). Studying such systems is simpler, which leaves open the possibility of asking new and more quantitative questions, e.g. what is the threshold in $\varepsilon$ under which diffusion phenomena can appear, or, what is the maximal length of diffusive trajectories? These questions require new methods, and this is the main concern of the present work. It turns out that in realistic physical systems the relevant quantities to estimate are difficult to compute, and our aim is to provide an explicit example illustrating the relevance of computer-assisted methods of proof in such problems.

In order to simplify the construction we shift our focus to symplectic maps instead of Hamiltonian vector fields. Such a reduction is natural, since taking a Poincaré section in an energy manifold results in a symplectic diffeomorphism. The main example of the paper is the family of symplectic diffeomorphisms $f_\varepsilon : \mathbb{R}^2 \times \mathbb{T}^2 \to \mathbb{R}^2 \times \mathbb{T}^2$ defined by

$$f_\varepsilon(x, y, \theta, I) = \begin{pmatrix} x + y + \alpha \sin(x) \\ y + \alpha \sin(x) \\ \theta + I \\ I \end{pmatrix} + \varepsilon \begin{pmatrix} \cos(x)\sin(\theta) \\ \cos(x)\sin(\theta) \\ \sin(x)\cos(\theta) \\ \sin(x)\cos(\theta) \end{pmatrix} \tag{3}$$

for $(x, y) \in \mathbb{R}^2$ and $(\theta, I) \in \mathbb{T}^2$. Observe that the map $f_\varepsilon$ can be seen as a perturbation of a standard map (variables $(x, y)$) coupled to an $I$-parametrized rotation on $\mathbb{T}$ (variables $(\theta, I)$). Indeed, when $\varepsilon = 0$ […] $O$ in $\mathbb{R}^2$. In the present work we do not treat $\alpha$ as a perturbation parameter, and will show that for some fixed $\alpha$ the stable and unstable manifolds intersect transversely at some point $P$ (so that the parameter $\alpha$ plays the role of $\mu$ in Arnold's example).
Consequently, $f_0$ admits an invariant torus $\{O\} \times \mathbb{T}^2$, which is readily seen to be normally hyperbolic, and whose stable and unstable manifolds intersect transversely along a homoclinic torus $\{P\} \times \mathbb{T}^2$. By the Birkhoff-Smale theorem, a large enough iterate of the standard map admits a horseshoe (homeomorphic to $\{0, 1\}^{\mathbb{Z}}$ endowed with the product topology) near the origin. Consequently, for $N$ large enough, the coupling $f_0^N$ admits a fibered horseshoe, close to $\{0\} \times \mathbb{T}^2$ and homeomorphic to $\{0, 1\}^{\mathbb{Z}} \times \mathbb{T}^2$, on which it induces a fiber-preserving dynamics. This problem was formalized in [18].

Figure 1: Phase space structure of the Chirikov Standard Map when $\alpha =$ […]. Black dots indicate the dynamics of a number of "typical" orbits. The stable and unstable manifolds of the fixed point at the origin are depicted by the red and blue curves respectively.
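To make the example map (3) concrete, the following Python sketch implements one iterate of $f_\varepsilon$ in lifted coordinates and checks two properties numerically: that $I$ is constant when $\varepsilon = 0$, and that the finite-difference Jacobian preserves a symplectic form. It is an illustration only, not part of the proof. The value `ALPHA = 0.7`, the step size `h`, and the choice of the form $dx \wedge dy + d\theta \wedge dI$ are assumptions made here; the excerpt does not state them.

```python
import math

ALPHA = 0.7  # hypothetical value; the alpha used for Figure 1 is not legible here


def f_eps(p, alpha, eps):
    """One iterate of the map (3) in lifted coordinates (angles not reduced mod 2*pi)."""
    x, y, theta, action = p
    y_new = y + alpha * math.sin(x) + eps * math.cos(x) * math.sin(theta)
    action_new = action + eps * math.sin(x) * math.cos(theta)
    return (x + y_new, y_new, theta + action_new, action_new)


def jacobian(p, alpha, eps, h=1e-6):
    """Central finite-difference Jacobian; rows index outputs, columns index inputs."""
    cols = []
    for k in range(4):
        plus, minus = list(p), list(p)
        plus[k] += h
        minus[k] -= h
        fp, fm = f_eps(plus, alpha, eps), f_eps(minus, alpha, eps)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    return [[cols[k][i] for k in range(4)] for i in range(4)]


# Matrix of dx ^ dy + dtheta ^ dI in the coordinate order (x, y, theta, I) (assumed form).
OMEGA = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]


def symplectic_defect(p, alpha, eps):
    """Largest entry of |J^T OMEGA J - OMEGA|; near zero for a symplectic map."""
    J = jacobian(p, alpha, eps)
    defect = 0.0
    for a in range(4):
        for b in range(4):
            entry = sum(J[i][a] * OMEGA[i][j] * J[j][b]
                        for i in range(4) for j in range(4))
            defect = max(defect, abs(entry - OMEGA[a][b]))
    return defect


def action_after(p, alpha, eps, n):
    """The I coordinate after n iterates of f_eps starting from p."""
    for _ in range(n):
        p = f_eps(p, alpha, eps)
    return p[3]
```

For instance, `action_after((0.3, -0.2, 1.1, 0.4), ALPHA, 0.0, 50)` returns the initial action unchanged, while any $\varepsilon > 0$ moves it already after one iterate. Floating point arithmetic of this kind is only a sanity check; the proofs in the paper rely on rigorous interval arithmetic.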

When $\varepsilon > 0$, the preservation of the fibers is broken, and nothing prevents the orbits from drifting along the base $\mathbb{T}^2$ in the $I$ direction. In this paper we use constructive computer assisted arguments to prove that such drift orbits do indeed exist for $f_\varepsilon$, and that they have lengths independent of the size $\varepsilon$ of the perturbation. This makes the system a significant example in the a-priori chaotic case. Moreover, the present work provides a self contained exposition of constructive computer assisted methods for proving the existence of diffusion phenomena in explicit examples.

Our results are based on shadowing theorems for scattering maps worked out in [19]. A scattering map is a function from a normally hyperbolic invariant manifold to itself, defined through appropriate intersections of fibers of its stable and unstable manifolds. In [19] it is shown that pseudo orbits resulting from iterations of scattering maps are shadowed by true orbits of the system. We use this method in our main results, which are contained in Theorems 11, 17, 18 and 19. Theorems 11, 17, 18 establish orbits which diffuse over an explicit interval of actions. Theorem 19 establishes orbits which shadow sequences of actions, chosen from the interval. The aim of this paper is to provide tools which can be used to obtain computer assisted proofs. To check the hypotheses of our theorems one needs to compute the scattering maps of the unperturbed system, and to check certain explicit inequalities which measure the influence of the perturbation on the action. This influence is computed by considering finite fragments of homoclinic orbits. We show two computer-assisted methods with which the scattering map can be computed: by using cones or the parameterization method. We apply our results to give a computer-assisted proof of diffusion for the system given by Equation (3).
In a forthcoming paper we plan an application to the Planar Restricted Three Body Problem, with mass parameters of the Jupiter-Sun system.

An alternative approach for computer assisted proof of diffusion is given in [20]. This work is based on the method of correctly aligned windows. The difference compared to this paper is that [20] requires an explicit construction of 'connecting sequences' of windows. These windows are then used for shadowing arguments. Here we establish transversal intersections of stable/unstable manifolds leading to scattering maps, and check our conditions along homoclinic orbits. The shadowing is automatically ensured by [19].

The remainder of the paper is organized as follows. In Section 2 we review some preliminary information about normally hyperbolic invariant manifolds, scattering maps, and the interval Newton method. In Section 3 we lay out our main theoretical results, namely the constructive hypotheses which are used to establish Arnold diffusion in explicit examples. Section 4 applies the method to the example system. Section 5 is a technical treatment of constructive methods for studying the stable/unstable manifolds of fixed points of maps. Proofs of some of the theorems and lemmas are relegated to the Appendices.

2. Preliminaries

Throughout the paper, for $x \in \mathbb{R}^n$, by $\|x\|$ we shall mean the Euclidean norm. We use $\mathbb{T} = \mathbb{R} / \mathrm{mod}\, 2\pi$ to stand for a one dimensional torus and $\mathbb{T}^k$ to stand for a $k$-dimensional torus. For a set $A$ in a topological space we shall write $\overline{A}$ to denote its closure.

In this section we recall the notion of a normally hyperbolic invariant manifold and state the main result concerning its persistence under small perturbation. A classic reference for this material is [21].

Definition 1. Let $\Lambda \subset \mathbb{R}^n$ be a compact manifold without boundary, invariant under $f : \mathbb{R}^n \to \mathbb{R}^n$, i.e., $f(\Lambda) = \Lambda$, where $f$ is a $C^r$-diffeomorphism, $r \geq 1$. We say that $\Lambda$ is a normally hyperbolic invariant manifold (with symmetric rates) if there exists a constant $C > 0$, rates $0 < \lambda < \mu^{-1} < 1$ and a $Tf$ invariant splitting for every $x \in \Lambda$

$$\mathbb{R}^n = E^u_x \oplus E^s_x \oplus T_x\Lambda$$

such that

$$v \in E^u_x \Leftrightarrow \|Df^k(x) v\| \leq C\lambda^{-k}\|v\|, \quad k \leq 0, \tag{4}$$
$$v \in E^s_x \Leftrightarrow \|Df^k(x) v\| \leq C\lambda^{k}\|v\|, \quad k \geq 0, \tag{5}$$
$$v \in T_x\Lambda \Rightarrow \|Df^k(x) v\| \leq C\mu^{|k|}\|v\|, \quad k \in \mathbb{Z}. \tag{6}$$

Let $d(x, \Lambda)$ stand for the distance between a point $x$ and the manifold $\Lambda$, induced by the Euclidean norm. Given a normally hyperbolic invariant manifold and a suitable small tubular neighbourhood $U \subset \mathbb{R}^n$ of $\Lambda$ one defines its local unstable and local stable manifold [21] as

$$W^u_\Lambda(f, U) = \left\{ y \in \mathbb{R}^n \mid f^k(y) \in U,\ d\big(f^k(y), \Lambda\big) \leq C_y \lambda^{|k|},\ k \leq 0 \right\},$$
$$W^s_\Lambda(f, U) = \left\{ y \in \mathbb{R}^n \mid f^k(y) \in U,\ d\big(f^k(y), \Lambda\big) \leq C_y \lambda^{k},\ k \geq 0 \right\},$$

where $C_y$ is a positive constant, which can depend on $y$. We define the (global) unstable and stable manifolds as

$$W^u_\Lambda(f) = \bigcup_{n \geq 0} f^n\big(W^u_\Lambda(f, U)\big), \qquad W^s_\Lambda(f) = \bigcup_{n \geq 0} f^{-n}\big(W^s_\Lambda(f, U)\big).$$
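The rates in Definition 1 can be made concrete for the unperturbed example map (3), whose invariant torus is $\{O\} \times \mathbb{T}^2$. In the sketch below, the normal dynamics is governed by the derivative of the standard map at the origin, $\begin{pmatrix} 1+\alpha & 1 \\ \alpha & 1 \end{pmatrix}$, and the tangent dynamics by the shear $(\theta, I) \mapsto (\theta + I, I)$, whose $k$-th iterate has derivative $\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$. The values $\alpha = 0.7$ and $\mu = 1.5$ used in the usage note are hypothetical, and the check of (6) covers only finitely many $k$; this is a floating point illustration, not a rigorous verification.

```python
import math


def origin_eigenvalues(alpha):
    """Eigenvalues of [[1 + alpha, 1], [alpha, 1]], the (x, y) derivative of f_0 at the origin."""
    trace = 2.0 + alpha          # the determinant equals 1
    root = math.sqrt(trace * trace - 4.0)
    return (trace - root) / 2.0, (trace + root) / 2.0


def shear_norm(k):
    """Euclidean operator norm of [[1, k], [0, 1]], the k-th iterate derivative on the torus."""
    s = k * k + 2.0
    return math.sqrt((s + math.sqrt(s * s - 4.0)) / 2.0)


def tangent_constant(mu, k_max=500):
    """Smallest C with shear_norm(k) <= C * mu**|k| for |k| <= k_max (finite check only)."""
    return max(shear_norm(k) / mu ** abs(k) for k in range(-k_max, k_max + 1))


def rates_are_admissible(lam, mu):
    """Rate ordering 0 < lambda < 1/mu < 1 from Definition 1."""
    return 0.0 < lam < 1.0 / mu < 1.0
```

With `lam_s, lam_u = origin_eigenvalues(0.7)` the two eigenvalues are reciprocal, and `rates_are_admissible(lam_s, 1.5)` holds, whereas a choice such as `mu = 3.0` violates the ordering $\lambda < \mu^{-1}$. Since the tangent growth is only linear in $k$, any $\mu > 1$ works in (6) at the price of a larger constant $C$.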
The manifolds $W^u_\Lambda(f, U)$, $W^s_\Lambda(f, U)$, $W^u_\Lambda(f)$ and $W^s_\Lambda(f)$ are foliated by

$$W^u_x(f, U) = \left\{ y \in \mathbb{R}^n \mid f^k(y) \in U,\ d\big(f^k(y), f^k(x)\big) \leq C_{x,y} \lambda^{|k|},\ k \leq 0 \right\},$$
$$W^s_x(f, U) = \left\{ y \in \mathbb{R}^n \mid f^k(y) \in U,\ d\big(f^k(y), f^k(x)\big) \leq C_{x,y} \lambda^{k},\ k \geq 0 \right\},$$

where $x \in \Lambda$ and $C_{x,y}$ is a positive constant, which can depend on $x$ and $y$,

$$W^u_x(f) = \bigcup_{n \geq 0} f^n\big(W^u_{f^{-n}(x)}(f, U)\big), \qquad W^s_x(f) = \bigcup_{n \geq 0} f^{-n}\big(W^s_{f^{n}(x)}(f, U)\big).$$

Let

$$l < \min\left\{ r, \frac{|\log \lambda|}{\log \mu} \right\}. \tag{7}$$

The manifold $\Lambda$ is $C^l$ smooth, the manifolds $W^u_\Lambda(f)$, $W^s_\Lambda(f)$ are $C^{l-1}$ and $W^u_x(f)$, $W^s_x(f)$ are $C^r$ [22]. Normally hyperbolic manifolds, as well as their stable and unstable manifolds and their fibres persist under small perturbations [21].

Lemma 2. [22] In the case that the map $f$ preserves a symplectic form $\omega$, the induced form $\omega|_\Lambda$ is symplectic and $f|_\Lambda$ preserves $\omega|_\Lambda$.

2.2. Shadowing of scattering maps

Our diffusion result is based on shadowing lemmas for scattering maps found in [19], which we now summarize.

Let us assume that $\Lambda$ is a normally hyperbolic invariant manifold for $f$, and define two maps,

$$\Omega_+ : W^s_\Lambda(f) \to \Lambda, \qquad \Omega_- : W^u_\Lambda(f) \to \Lambda,$$

where $\Omega_+(x) = x_+$ iff $x \in W^s_{x_+}(f)$, and $\Omega_-(x) = x_-$ iff $x \in W^u_{x_-}(f)$. These are referred to as the wave maps.

Definition 3. We say that a manifold $\Gamma \subset W^u_\Lambda(f) \cap W^s_\Lambda(f)$ is a homoclinic channel for $\Lambda$ if the following conditions hold:

(i) for every $x \in \Gamma$
$$T_x W^s_\Lambda(f) \oplus T_x W^u_\Lambda(f) = \mathbb{R}^n, \tag{8}$$
$$T_x W^s_\Lambda(f) \cap T_x W^u_\Lambda(f) = T_x \Gamma, \tag{9}$$

(ii) the fibres of $\Lambda$ intersect $\Gamma$ transversally in the following sense
$$T_x \Gamma \oplus T_x W^s_{x_+}(f) = T_x W^s_\Lambda(f), \tag{10}$$
$$T_x \Gamma \oplus T_x W^u_{x_-}(f) = T_x W^u_\Lambda(f), \tag{11}$$
for every $x \in \Gamma$,

(iii) the wave maps $(\Omega_\pm)|_\Gamma : \Gamma \to \Lambda$ are diffeomorphisms onto their image.

Definition 4. Assume that $\Gamma$ is a homoclinic channel for $\Lambda$ and let $\Omega^\Gamma_\pm := (\Omega_\pm)|_\Gamma$. We define a scattering map $\sigma^\Gamma$ for the homoclinic channel $\Gamma$ as

$$\sigma^\Gamma := \Omega^\Gamma_+ \circ \big(\Omega^\Gamma_-\big)^{-1} : \Omega^\Gamma_-(\Gamma) \to \Omega^\Gamma_+(\Gamma).$$

We have the following theorem.

Theorem 5. [19] Assume that $f : \mathbb{R}^n \to \mathbb{R}^n$ is a sufficiently smooth map, $\Lambda \subset \mathbb{R}^n$ is a normally hyperbolic invariant manifold with stable and unstable manifolds which intersect transversally along a homoclinic channel $\Gamma \subset \mathbb{R}^n$, and $\sigma$ is the scattering map associated to $\Gamma$.

Assume that $f$ preserves a measure absolutely continuous with respect to the Lebesgue measure on $\Lambda$, and that $\sigma$ sends positive measure sets to positive measure sets.

Let $m_0, \ldots, m_n \in \mathbb{N}$ be a fixed sequence of integers. Let $\{x_i\}_{i=0,\ldots,n}$ be a finite pseudo-orbit in $\Lambda$, that is a sequence of points in $\Lambda$ of the form

$$x_{i+1} = f^{m_i} \circ \sigma^\Gamma(x_i), \qquad i = 0, \ldots, n-1,\ n \geq 1, \tag{12}$$

that is contained in some open set $U \subset \Lambda$ with almost every point of $U$ recurrent for $f|_\Lambda$. (The points $\{x_i\}_{i=0,\ldots,n}$ do not have to be themselves recurrent.)

Then for every $\delta > 0$ there exists an orbit $\{z_i\}_{i=0,\ldots,n}$ of $f$ in $\mathbb{R}^n$, with $z_{i+1} = f^{k_i}(z_i)$ for some $k_i > 0$, such that $d(z_i, x_i) < \delta$ for all $i = 0, \ldots, n$.

Remark 6.

In [19] the statement of the theorem is for pseudo-orbits of the form $x_{i+1} = \sigma^\Gamma(x_i)$. Here we shadow pseudo-orbits of the form (12), but this is the same result as that from [19] for the following reason.

The proof of the theorem in [19] is based on a general shadowing lemma [19, Lemma 3.1] which ensures that given a pseudo-orbit of the form $y_{i+1} = f^{k_i} \circ \sigma^\Gamma \circ f^{n_i}(y_i)$ where the numbers of iterates $k_i$, $n_i$ are big enough, we are able to find an orbit of the form $z_{i+1} = f^{k_i + n_i}(z_i)$, $\delta$-close to the pseudo-orbit $y_i$.

The shadowing of a pseudo-orbit $x_{i+1} = \sigma^\Gamma(x_i)$ is proven in [19] by combining [19, Lemma 3.1] with recurrence. First, by using recurrence, a pseudo-orbit of the form $y_{i+1} = f^{k_i} \circ \sigma^\Gamma \circ f^{n_i}(y_i)$ is constructed close to the pseudo-orbit $x_{i+1} = \sigma^\Gamma(x_i)$. The $k_i$, $n_i$ are chosen to be big enough to apply [19, Lemma 3.11]. The true orbit, which follows from [19, Lemma 3.11], shadows the pseudo-orbit $y_i$, but since this lies close to $x_i$ one obtains the shadowing of the pseudo-orbit $x_{i+1} = \sigma^\Gamma(x_i)$.

The proof of the shadowing of a pseudo-orbit of the form (12) follows from the same construction: one can use recurrence to construct a pseudo-orbit of the form $y_{i+1} = f^{k_i} \circ \sigma^\Gamma \circ f^{n_i}(y_i)$, so that the $y_i$ are close to the $x_i$ from (12). The lemma [19, Lemma 3.1] ensures that $y_i$ can be shadowed by a true orbit. Since $y_i$ is close to the pseudo-orbit $x_i$ from (12) we obtain the shadowing of (12) by a true orbit.

Remark 7.

The result can be immediately extended to the case where we have a finite number of scattering maps $\sigma_1, \ldots, \sigma_L$ to shadow

$$x_{i+1} = f^{m_i} \circ \sigma_{\alpha_i}(x_i), \qquad i = 0, \ldots, n-1,\ n \geq 1,$$

for two prescribed sequences $m_0, \ldots, m_n \in \mathbb{N}$ and $\alpha_0, \ldots, \alpha_n \in \{1, \ldots, L\}$; see [19, Theorem 3.7].

Remark 8.

If $f$ is symplectic for a symplectic form $\omega$, $\Lambda$ is compact and $\omega|_\Lambda$ is non-degenerate on $\Lambda$ then $f|_\Lambda$ is measure-preserving. Hence, by the Poincaré recurrence theorem almost every point of $\Lambda$ is recurrent. In such a setting we can take $U = \Lambda$ in Theorem 5.

2.3. Interval Newton Method

In our computer assisted proofs we use the following classical result, which allows one to conclude from the existence of a "good enough" approximate solution that there exists a true solution to a nonlinear system of equations.

Let $F : \mathbb{R}^k \to \mathbb{R}^k$ be a $C^1$ function and $U \subset \mathbb{R}^k$. We shall denote by $[DF(U)]$ the interval enclosure of a Jacobian matrix on the set $U$. This means that $[DF(U)]$ is an interval matrix defined as

$$[DF(U)] = \left\{ A \in \mathbb{R}^{k \times k} \mid A_{ij} \in \left[ \inf_{x \in U} \frac{dF_i}{dx_j}(x), \sup_{x \in U} \frac{dF_i}{dx_j}(x) \right] \text{ for } i, j = 1, \ldots, k \right\}.$$

Let $\mathbf{A} \subset \mathbb{R}^{k \times k}$ be an interval matrix. We shall write $\mathbf{A}^{-1}$ to denote an interval matrix, for which if $A \in \mathbf{A}$ then $A^{-1} \in \mathbf{A}^{-1}$.

Theorem 9. [23] (Interval Newton method) Let $F : \mathbb{R}^k \to \mathbb{R}^k$ be a $C^1$ function and $X = \prod_{i=1}^{k} [a_i, b_i]$ with $a_i < b_i$. If $[DF(X)]$ is invertible and there exists an $x_0$ in $X$ such that

$$N(x_0, X) := x_0 - [DF(X)]^{-1} F(x_0) \subset X,$$

then there exists a unique point $x^* \in X$ such that $F(x^*) = 0$.
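The following Python sketch illustrates Theorem 9 in the scalar case $k = 1$, where $[DF(X)]^{-1} F(x_0)$ reduces to dividing a number by an interval that does not contain zero. The `Interval` class, the function names and the test problem $F(x) = x^2 - 2$ are assumptions introduced here for illustration. The arithmetic uses ordinary floating point without outward rounding, so it is not rigorous; an actual computer assisted proof would use a validated interval arithmetic library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def contains_interval(self, other):
        """True if other is a subset of self."""
        return self.lo <= other.lo and other.hi <= self.hi

    def contains_zero(self):
        return self.lo <= 0.0 <= self.hi


def interval_newton_operator(F, dF_enclosure, x0, X):
    """N(x0, X) = x0 - F(x0) / [DF(X)] for a scalar function F (k = 1)."""
    D = dF_enclosure(X)
    if D.contains_zero():
        raise ValueError("[DF(X)] contains zero, so it is not invertible")
    fx = F(x0)
    q_lo, q_hi = sorted((fx / D.lo, fx / D.hi))  # enclosure of F(x0) / [DF(X)]
    return Interval(x0 - q_hi, x0 - q_lo)


def F(x):
    return x * x - 2.0


def dF_enclosure(X):
    """Enclosure of F'(x) = 2x on X; valid here because X lies in x > 0."""
    return Interval(2.0 * X.lo, 2.0 * X.hi)


X = Interval(1.0, 2.0)
N = interval_newton_operator(F, dF_enclosure, 1.5, X)
unique_zero_in_X = X.contains_interval(N)
```

Here $N(1.5, X) = [1.375, 1.4375] \subset [1, 2]$, so the hypothesis of Theorem 9 holds and $X$ contains exactly one zero of $F$, namely $\sqrt{2}$. In higher dimensions the division is replaced by an enclosure of the inverse of the interval matrix $[DF(X)]$.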

3. Main results

Let $f, g : \mathbb{R}^{2d} \times \mathbb{T}^2 \to \mathbb{R}^{2d} \times \mathbb{T}^2$ and consider the following system

$$f_\varepsilon(u, s, I, \theta) = f(u, s, I, \theta) + \varepsilon g(u, s, I, \theta),$$

where $u, s \in \mathbb{R}^d$, $\theta, I \in \mathbb{T}$. Assume that $f_\varepsilon$ are symplectic maps for a symplectic form $\omega$, assume that for $\varepsilon = 0$

$$\Lambda = \left\{ (0, 0, I, \theta) : I, \theta \in \mathbb{T} \right\} \simeq \mathbb{T}^2$$

is a normally hyperbolic invariant manifold for which $\omega|_\Lambda$ is non degenerate, and that $I$ is a constant of motion for the unperturbed system, i.e.

$$\pi_I f(x) = \pi_I x, \tag{13}$$

for any $x \in \mathbb{R}^{2d} \times \mathbb{T}^2$, where $\pi_I(u, s, I, \theta) = I$.

Our objective is to provide conditions under which for any sufficiently small $\varepsilon > 0$ there exists a point $x_\varepsilon$ and a number of iterates $n_\varepsilon$ for which

$$\pi_I\big(f_\varepsilon^{n_\varepsilon}(x_\varepsilon) - x_\varepsilon\big) > [\ldots]. \tag{14}$$

The coordinates have the following roles. The $u, s$ are the coordinates on the unstable and stable bundles, respectively, of $\Lambda$. The $\theta$ is an angle and $I$ plays the role of a constant of motion for $\varepsilon = 0$. In the setting of action-angle coordinates, the $I$ would be chosen as the action. We shall refer to $I$ as an 'action', slightly abusing the terminology. In this paper we restrict to the case where the angle and action are one dimensional. We do so to achieve simplicity.

Remark 10.

The assumption that $\Lambda$ is a torus simplifies the arguments, as $\Lambda$ is compact without boundary and the normally hyperbolic manifold theorem ensures that $\Lambda$ is perturbed to a nearby compact normally hyperbolic invariant manifold $\Lambda_\varepsilon$. Having compactness of $\Lambda_\varepsilon$ is convenient, but does not appear to be necessary.

A more typical setting is when $\Lambda$ is a normally hyperbolic invariant cylinder (possibly with a boundary) with $\theta \in \mathbb{T}$ and $I \in \mathbb{R}$. We would then have $f_\varepsilon : \mathbb{R}^{2d} \times \mathbb{R} \times \mathbb{T} \to \mathbb{R}^{2d} \times \mathbb{R} \times \mathbb{T}$. In such case consider $I \in [0, [\ldots]]$ and artificially 'glue' the system so that $I$ is in $\mathbb{T}$ to apply our result. Details of how this can be done are found in Appendix A.

A typical setting where our result can be applied is that of a time dependent perturbation of a Hamiltonian system of the form

$$x' = J \nabla \big( H(x) + \varepsilon G(x, t) \big), \tag{15}$$

where $H : \mathbb{R}^{2d+2} \to \mathbb{R}^{2d+2}$, $G : \mathbb{R}^{2d+2} \times \mathbb{T} \to \mathbb{R}^{2d+2}$ and

$$J = \begin{pmatrix} 0 & \mathrm{Id} \\ -\mathrm{Id} & 0 \end{pmatrix},$$

with $\mathrm{Id}$ the identity matrix. In such case we can take $f_\varepsilon(x) = \Phi^\varepsilon_{2\pi}(x, t_0)$, for some $t_0 \in [0, 2\pi)$, where $\Phi^\varepsilon_t(x, t_0)$ stands for the time $t$ flow induced by (15) with the initial condition $(x, t_0)$. If the unperturbed system admits a normally hyperbolic invariant cylinder, then we are in the setting from Remark 10.

Another possibility is to consider the flow induced by (15) in the extended phase space and consider a section of the form $\Sigma \times \mathbb{T}$ in $\mathbb{R}^{2d+2} \times \mathbb{T}$. Then $f_\varepsilon$ can be chosen as the section-to-section map along the flow in the extended phase space. The time coordinate plays the role of the angle $\theta$ and we choose $I$ as the Hamiltonian $H$ (energy) of the unperturbed system.

The next theorem is our first main result. It provides conditions for the existence of orbits which diffuse in $I$.

Footnote: We believe that our methods can be generalised to the higher dimensional case. We make comments how to do so in Remarks 14, 20 after the statements of our results. We also make a cautionary Remark 44 regarding potential problems while extending the higher dimensional setting to the case of normally hyperbolic cylinders in Appendix A.

Figure 2: The setting for Theorem 11.

Theorem 11.

Assume that there is a neighborhood $U$ of $\Lambda$ and positive constants $L_g$, $C$, $\lambda$, where $\lambda \in (0, 1)$, such that for every $x_1, x_2 \in U$, and every $z \in \Lambda$, $x^u \in W^u_z(f, U)$ and $x^s \in W^s_z(f, U)$ we have

$$|\pi_I(g(x_1) - g(x_2))| \leq L_g \|x_1 - x_2\|, \tag{16}$$
$$\|f^n(z) - f^n(x^u)\| < C\lambda^{|n|} \text{ for all } n \leq 0, \qquad \|f^n(z) - f^n(x^s)\| < C\lambda^{n} \text{ for all } n \geq 0. \tag{17}$$

Assume that for $\varepsilon = 0$ we have a sequence $\Gamma^1, \ldots, \Gamma^L \subset U$ of homoclinic channels for $f$, with corresponding wave maps $\Omega^\alpha_\pm : \Gamma^\alpha \to \Lambda$ and scattering maps $\sigma_\alpha : \mathrm{dom}(\sigma_\alpha) \to \Lambda$ for $\alpha = 1, \ldots, L$.

Assume that for every $z \in \Lambda$

1. There exists an $\alpha \in \{1, \ldots, L\}$ such that $z \in \mathrm{dom}(\sigma_\alpha)$.
2. There exists an $m \in \mathbb{N}$ and a point $x \in \Gamma^\alpha$, $x \in W^u_z(f, U) \cap W^s_{\sigma_\alpha(z)}(f)$ such that $f^m(x) \in W^s_{f^m(\sigma_\alpha(z))}(f, U)$ (see Figure 2) and

$$\sum_{j=0}^{m-1} \pi_I g\big(f^j(x)\big) - \frac{1+\lambda}{1-\lambda} L_g C > 0. \tag{18}$$

(The above $\alpha$, $m$ and $x$ can depend on the choice of $z$.)

Then for sufficiently small $\varepsilon > 0$ there exists an $x_\varepsilon$ and $n_\varepsilon > 0$ such that

$$\pi_I\big(f_\varepsilon^{n_\varepsilon}(x_\varepsilon) - x_\varepsilon\big) > [\ldots].$$

Before giving the proof let us make a couple of comments about the assumptions.

Remark 12.

Assumption (16) will readily hold when $g$ is $C^1$ since $\Lambda$ is compact, so we can take $U$ to be compact as well. Conditions (17) will hold due to the contraction and expansion properties along the stable and unstable manifolds. What is important for us is to have explicit bounds $L_g$, $C$ and $\lambda$ which enter into the key assumption (18). Condition (18) measures the influence of the perturbation term $g$ on the coordinate $I$. This can be thought of as a discrete version of a Melnikov integral. (Instead of an integral we have a sum, since we are working with a discrete system.) An important feature is that we are computing the sum along a finite fragment of a homoclinic orbit, and not along the full orbit as is the case in Melnikov theory. The second term in (18) takes into account the truncated tail.

Remark 13. In Theorem 11 we assume that the homoclinic channels are in $U$, meaning that they are close to $\Lambda$. This is not a restrictive assumption, since a homoclinic channel which is far away can be propagated close to $\Lambda$ by using backward iterates of $f$.

Remark 14.

Theorem 11 can be generalised to the setting of higher dimensional $\theta$ and $I$ as follows. If we have actions $I_1, \ldots, I_k$, we can single out one of them (say $I = I_1$) for the conditions (16) and (18), to obtain diffusion towards the singled out action.

Remark 15.

We have assumed that $f_\varepsilon(x) = f(x) + \varepsilon g(x)$. We can assume just as well that $f_\varepsilon(x) = f(x) + \varepsilon g(\varepsilon, x)$, with smooth $g(\varepsilon, x)$. Then in conditions (16) and (18) we can write $g(0, \cdot)$ instead of $g(\cdot)$, and the result will follow from the same arguments. Analogous modifications can be made also in subsequent theorems. We consider $g(x)$ instead of $g(\varepsilon, x)$ since it simplifies and shortens the notation.

Proof of Theorem 11.

The manifold $\Lambda$ is perturbed to a normally hyperbolic invariant manifold $\Lambda_\varepsilon$ for $f_\varepsilon$. Moreover, for sufficiently small $\varepsilon$, if $z \in \Lambda_\varepsilon$, $x_u \in W^u_z(f_\varepsilon, U)$ and $x_s \in W^s_z(f_\varepsilon, U)$, then
\[
\begin{aligned}
\left\| f^n_\varepsilon(z) - f^n_\varepsilon(x_u) \right\| &< C\lambda_\varepsilon^{|n|} && \text{for all } n \le 0, \\
\left\| f^n_\varepsilon(z) - f^n_\varepsilon(x_s) \right\| &< C\lambda_\varepsilon^{n} && \text{for all } n \ge 0,
\end{aligned}
\tag{19}
\]
with $\lambda_\varepsilon$ converging to $\lambda$ as $\varepsilon$ tends to zero.

Since transversal intersections persist under perturbation, the homoclinic channels $\Gamma^1, \ldots, \Gamma^L$ for $f$ are perturbed to homoclinic channels $\Gamma^1_\varepsilon, \ldots, \Gamma^L_\varepsilon$ for $f_\varepsilon$, provided that $\varepsilon > 0$ is sufficiently small. This leads [22] to a scattering map $\sigma^\varepsilon_\alpha : \Omega^{\Gamma^\alpha_\varepsilon}_-(\Gamma^\alpha_\varepsilon) \to \Omega^{\Gamma^\alpha_\varepsilon}_+(\Gamma^\alpha_\varepsilon)$ for $f_\varepsilon$.

Our first objective is to show that for any $z_\varepsilon \in \Lambda_\varepsilon$ there exist an $m \in \mathbb{N}$ and $\alpha \in \{1, \ldots, L\}$ (both $m$ and $\alpha$ can depend on $z_\varepsilon$) such that
\[
\pi_I\left( f^m_\varepsilon \circ \sigma^\varepsilon_\alpha(z_\varepsilon) - z_\varepsilon \right) > \varepsilon c, \tag{20}
\]
where $c > 0$ is such that
\[
\sum_{j=0}^{m-1} \pi_I g\left( f^j(x) \right) - \frac{1+\lambda}{1-\lambda} L_g C > c \tag{21}
\]
for any $z \in \Lambda$ (with the same $c$). We can find such small $c$ because of (18) and compactness of $\Lambda$. It turns out that (20) is the main step in our proof, since once it is established the result follows from the shadowing Theorem 5. Below we first prove (20) and then discuss how to apply the shadowing method.

Consider now a $z_\varepsilon \in \Lambda_\varepsilon$. By our assumptions, for every $z \in \Lambda$ we have an $\alpha \in \{1, \ldots, L\}$, $m \in \mathbb{N}$ and $x \in W^u_z(f, U) \cap W^s_{\sigma_\alpha(z)}(f)$ such that $f^m(x) \in W^s_{f^m(\sigma_\alpha(z))}(f, U)$ and (21) holds. This means that for sufficiently small $\varepsilon$, for some $\alpha \in \{1, \ldots, L\}$ and some $m \in \mathbb{N}$ we shall have $x_\varepsilon \in W^u_{z_\varepsilon}(f_\varepsilon, U)$ and $f^m_\varepsilon(x_\varepsilon) \in W^s_{f^m_\varepsilon(\sigma^\varepsilon_\alpha(z_\varepsilon))}(f_\varepsilon, U)$, hence by (19)
\[
\left\| f^j_\varepsilon(z_\varepsilon) - f^j_\varepsilon(x_\varepsilon) \right\| < C\lambda_\varepsilon^{|j|} \quad \text{for } j \le 0, \tag{22}
\]
\[
\left\| f^{m+j}_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - f^{m+j}_\varepsilon(x_\varepsilon) \right\| < C\lambda_\varepsilon^{j} \quad \text{for } j \ge 0. \tag{23}
\]
Due to (21) and the continuous dependence of $x_\varepsilon$, $\lambda_\varepsilon$, for sufficiently small $\varepsilon$ we shall have
\[
\sum_{j=0}^{m-1} \pi_I g\left( f^j_\varepsilon(x_\varepsilon) \right) - \frac{1+\lambda_\varepsilon}{1-\lambda_\varepsilon} L_g C > c.
\]

In order to show (20) we split our estimates into three terms
\[
f^m_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - z_\varepsilon = \left[ f^m_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - f^m_\varepsilon(x_\varepsilon) \right] + \left[ f^m_\varepsilon(x_\varepsilon) - x_\varepsilon \right] + \left[ x_\varepsilon - z_\varepsilon \right], \tag{24}
\]
and investigate bounds on the projection $\pi_I$ for each of them. We start by showing that
\[
\left| \pi_I\left[ f^m_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - f^m_\varepsilon(x_\varepsilon) \right] \right| \le \varepsilon \frac{1}{1-\lambda_\varepsilon} L_g C. \tag{25}
\]
Indeed, since $f_\varepsilon(x) = f(x) + \varepsilon g(x)$ and $\pi_I f(x) = \pi_I x$, for any $x_1, x_2$ we have
\[
\pi_I f_\varepsilon(x_1) - \pi_I f_\varepsilon(x_2) = \pi_I f(x_1) + \varepsilon \pi_I g(x_1) - \pi_I f(x_2) - \varepsilon \pi_I g(x_2) = \pi_I(x_1 - x_2) + \varepsilon \pi_I\left( g(x_1) - g(x_2) \right).
\]
It follows by induction that
\[
\pi_I\left( f^j_\varepsilon(x_1) - f^j_\varepsilon(x_2) \right) = \pi_I\left[ x_1 - x_2 \right] + \varepsilon \sum_{i=0}^{j-1} \pi_I\left( g\left(f^i_\varepsilon(x_1)\right) - g\left(f^i_\varepsilon(x_2)\right) \right). \tag{26}
\]
Taking $x_1 = f^m_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right)$ and $x_2 = f^m_\varepsilon(x_\varepsilon)$ in (26), with $\pi_I[x_1 - x_2]$ moved to the left hand side, we have
\[
\begin{aligned}
\left| \pi_I\left[ f^m_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - f^m_\varepsilon(x_\varepsilon) \right] \right|
&= \left| \pi_I\left( f^{m+j}_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - f^{m+j}_\varepsilon(x_\varepsilon) \right) - \varepsilon \sum_{i=0}^{j-1} \pi_I\left( g\left(f^{m+i}_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right)\right) - g\left(f^{m+i}_\varepsilon(x_\varepsilon)\right) \right) \right| \\
&< C\lambda_\varepsilon^j + \varepsilon L_g \sum_{i=0}^{j-1} \left\| f^{m+i}_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - f^{m+i}_\varepsilon(x_\varepsilon) \right\| \\
&< C\lambda_\varepsilon^j + \varepsilon L_g \sum_{i=0}^{j-1} C\lambda_\varepsilon^i,
\end{aligned}
\]
where the last two inequalities follow from (23). Letting $j \to \infty$, we obtain (25).

Now consider the third term from (24). An analogous bound to (23) is obtained as follows. From (26) we have that
\[
\begin{aligned}
\pi_I(x_1 - x_2) &= \pi_I\left[ f^{-j}_\varepsilon(x_1) - f^{-j}_\varepsilon(x_2) \right] + \varepsilon \sum_{i=0}^{j-1} \pi_I\left( g\left(f^{i-j}_\varepsilon(x_1)\right) - g\left(f^{i-j}_\varepsilon(x_2)\right) \right) \\
&= \pi_I\left[ f^{-j}_\varepsilon(x_1) - f^{-j}_\varepsilon(x_2) \right] + \varepsilon \sum_{i=-j}^{-1} \pi_I\left( g\left(f^{i}_\varepsilon(x_1)\right) - g\left(f^{i}_\varepsilon(x_2)\right) \right).
\end{aligned}
\tag{27}
\]
Taking $x_1 = x_\varepsilon$ and $x_2 = z_\varepsilon$, from (27) we obtain
\[
\begin{aligned}
\left| \pi_I(x_\varepsilon - z_\varepsilon) \right|
&\le \left| \pi_I\left[ f^{-j}_\varepsilon(x_\varepsilon) - f^{-j}_\varepsilon(z_\varepsilon) \right] \right| + \varepsilon \sum_{i=-j}^{-1} \left| \pi_I\left( g\left(f^i_\varepsilon(x_\varepsilon)\right) - g\left(f^i_\varepsilon(z_\varepsilon)\right) \right) \right| \\
&< C\lambda_\varepsilon^j + \varepsilon L_g \sum_{i=-j}^{-1} \left\| f^i_\varepsilon(x_\varepsilon) - f^i_\varepsilon(z_\varepsilon) \right\| \\
&< C\lambda_\varepsilon^j + \varepsilon L_g \sum_{i=1}^{j} C\lambda_\varepsilon^i,
\end{aligned}
\]
where the last two inequalities follow from (22). Taking $j \to \infty$ gives
\[
\left| \pi_I(x_\varepsilon - z_\varepsilon) \right| \le \varepsilon \frac{\lambda_\varepsilon}{1-\lambda_\varepsilon} C L_g. \tag{28}
\]

We now turn to the middle term from (24). Since $f_\varepsilon(x) = f(x) + \varepsilon g(x)$ and $\pi_I f(x) = \pi_I x$, it follows that (below we consider $x = f^j_\varepsilon(x_\varepsilon)$)
\[
\pi_I\left( f_\varepsilon\left(f^j_\varepsilon(x_\varepsilon)\right) - f^j_\varepsilon(x_\varepsilon) \right) = \pi_I f\left(f^j_\varepsilon(x_\varepsilon)\right) + \varepsilon \pi_I g\left(f^j_\varepsilon(x_\varepsilon)\right) - \pi_I f^j_\varepsilon(x_\varepsilon) = \varepsilon \pi_I g\left(f^j_\varepsilon(x_\varepsilon)\right),
\]
so
\[
\pi_I\left( f^m_\varepsilon(x_\varepsilon) - x_\varepsilon \right) = \sum_{j=0}^{m-1} \pi_I\left( f^{j+1}_\varepsilon(x_\varepsilon) - f^j_\varepsilon(x_\varepsilon) \right) = \varepsilon \sum_{j=0}^{m-1} \pi_I g\left(f^j_\varepsilon(x_\varepsilon)\right). \tag{29}
\]
Combining (24), (25), (28), (29) gives
\[
\pi_I\left( f^m_\varepsilon\left(\sigma^\varepsilon_\alpha(z_\varepsilon)\right) - z_\varepsilon \right) > \varepsilon \left( \sum_{j=0}^{m-1} \pi_I g\left(f^j_\varepsilon(x_\varepsilon)\right) - \frac{1+\lambda_\varepsilon}{1-\lambda_\varepsilon} C L_g \right).
\]
Since the right hand side of the inequality above depends continuously on $\varepsilon$, from (21) we obtain (20) for sufficiently small $\varepsilon$.

This establishes the key step (20). We now apply Theorem 5 to prove our result. Indeed, since $\omega|_\Lambda$ is nondegenerate, the same is true for $\omega|_{\Lambda_\varepsilon}$ for sufficiently small $\varepsilon$. Since $f_\varepsilon$ is symplectic, by Remark 8 almost every point of $\Lambda_\varepsilon$ is recurrent for $f_\varepsilon|_{\Lambda_\varepsilon}$. Choose $x_0 \in \Lambda_\varepsilon$ having $\pi_I x_0 = 0$, and $\alpha_0$, $m_0$ (which are allowed to depend on $x_0$) such that for $x_1 := f^{m_0}_\varepsilon \circ \sigma^\varepsilon_{\alpha_0}(x_0)$ we have $\pi_I\left( f^{m_0}_\varepsilon \circ \sigma^\varepsilon_{\alpha_0}(x_0) - x_0 \right) > c\varepsilon$. This can be done due to (20). Repeating the procedure, choosing $\alpha_i$, $m_i$ for which $\pi_I\left( f^{m_i}_\varepsilon \circ \sigma^\varepsilon_{\alpha_i}(x_i) - x_i \right) > c\varepsilon$, we obtain a pseudo-orbit $x_0, \ldots, x_N$, where $x_{i+1} := f^{m_i}_\varepsilon \circ \sigma^\varepsilon_{\alpha_i}(x_i)$, for which
\[
\pi_I(x_N - x_0) > N c \varepsilon.
\]
Choosing $N$ large enough, we obtain that $\pi_I(x_N - x_0) > 1$. By Theorem 5 the pseudo-orbit $x_0, \ldots, x_N$ is $\delta$-shadowed by a true orbit, so by choosing $\delta < \frac{1}{2}\left( \pi_I(x_N - x_0) - 1 \right)$ the true orbit also gains more than $1$ in $I$, which proves the claim.

Figure 3: A typically shaped strip (left) and a 'strip' consisting of two connected components (right).

[...] $\Lambda$ we can find a pseudo-orbit such that we have a gain in $I$. Note however that we do not need to have (21) for all $z \in \Lambda$. It is enough to have (21) for $z$ on some smaller subset of $\Lambda$, provided that we can ensure that the pseudo-orbit constructed in the proof of Theorem 11 returns to that set. Below we formulate Theorem 17, which will make this statement precise. First we introduce one notion.

Definition 16.

Consider the topology on $\Lambda \cap \{I \in [0,1]\}$ induced by $\Lambda$. We say that an open set $S \subset \Lambda \cap \{I \in [0,1]\}$ is a strip in $\Lambda$ iff
\[
S \cap \{z \in \Lambda : \pi_I z = \iota\} \neq \emptyset \quad \text{for any } \iota \in [0,1].
\]
(Recall that we consider $\mathbb{T} = \mathbb{R}/\mathrm{mod}\ 2\pi$; the interval $I \in [0,1]$ is a strict subset of $[0, 2\pi)$. Since $S$ is open in the topology induced on $\Lambda \cap \{I \in [0,1]\}$ we require that it contains points with $I = 0$ and $I = 1$.)

We refer to $S$ as a 'strip' because usually we would choose it to be of the shape as in the left hand side of Figure 3. In principle though a strip might look different, for instance as on the right hand side plot in Figure 3. (Provided that the conditions of the results below are fulfilled, the result holds regardless of the shape of the 'strip'.)

In the next two theorems we consider two strips $S^+$ and $S^-$. The strip $S^+$ is used to validate diffusion in $I$, which increases $I$ by order one. The strip $S^-$ will be used to prove diffusion in which $I$ decreases by order one.

Theorem 17.

Assume that conditions (16) and (17) are satisfied, and that for $\varepsilon = 0$ we have the sequence of scattering maps $\sigma_\alpha : \operatorname{dom}(\sigma_\alpha) \to \Lambda$ for $\alpha = 1, \ldots, L$. Let $S^+ \subset \Lambda$ be a strip. Assume that for every $z \in S^+$ there exists an $\alpha \in \{1, \ldots, L\}$ for which $z \in \operatorname{dom}(\sigma_\alpha)$,
\[
f^m \circ \sigma_\alpha(z) \in S^+, \tag{30}
\]
and there exist a constant $m \in \mathbb{N}$ and a point $x \in W^u_z(f, U) \cap W^s_{\sigma_\alpha(z)}(f)$ such that $f^m(x) \in W^s_{f^m(\sigma_\alpha(z))}(f, U)$ and
\[
\sum_{j=0}^{m-1} \pi_I g\left(f^j(x)\right) - \frac{1+\lambda}{1-\lambda} L_g C > 0. \tag{31}
\]
Then for sufficiently small $\varepsilon > 0$ there exists an $x_\varepsilon$ and $n_\varepsilon > 0$ such that
\[
\pi_I\left( f^{n_\varepsilon}_\varepsilon(x_\varepsilon) - x_\varepsilon \right) > 1.
\]

(Footnote: We add the plus in the superscript for $S^+$ since this strip is used to increase $I$. In the subsequent theorem we will have another strip $S^-$ to obtain diffusion in the opposite direction.)

Proof.

The result follows by making minor adjustments to the arguments in the proof of Theorem 11. So, let $S^+_\varepsilon \subset \Lambda_\varepsilon$ be the perturbation of the strip $S^+ \subset \Lambda$. As in the proof of Theorem 11 we construct a pseudo-orbit $x_{i+1} = f^{m_i}_\varepsilon \circ \sigma^\varepsilon_{\alpha_i}(x_i)$, starting with a point $x_0 \in S^+_\varepsilon$ with $\pi_I x_0 = 0$. Note we assume that (30) holds for any $z \in S^+$ (with choices of $m$ and $\alpha$ depending on $z$). This means that for sufficiently small $\varepsilon$, and for any point $z_\varepsilon \in S^+_\varepsilon$, there is an $m = m(z_\varepsilon)$, $\alpha = \alpha(z_\varepsilon)$ such that $f^{m(z_\varepsilon)}_\varepsilon \circ \sigma^\varepsilon_{\alpha(z_\varepsilon)}(z_\varepsilon) \in S^+_\varepsilon$. In other words, $z_\varepsilon$ 'returns' to the strip for sufficiently small $\varepsilon$. Due to the compactness of $S^+$, a sufficiently small choice of $\varepsilon$ guarantees that we have $f^{m(z_\varepsilon)}_\varepsilon \circ \sigma^\varepsilon_{\alpha(z_\varepsilon)}(z_\varepsilon) \in S^+_\varepsilon$ for all $z_\varepsilon \in S^+_\varepsilon$. In short, condition (30) ensures that the pseudo-orbit $x_{i+1} = f^{m_i}_\varepsilon \circ \sigma^\varepsilon_{\alpha_i}(x_i)$ remains within the strip $S^+_\varepsilon$ for sufficiently small $\varepsilon$. By (31) and identical arguments to those from Theorem 11 we therefore have
\[
\pi_I(x_{i+1} - x_i) > \varepsilon c,
\]
for some $c > 0$, and the result follows from the shadowing argument just as in the proof of Theorem 11.

A mirror result gives diffusion in the opposite direction.

Theorem 18.

Assume that conditions (16) and (17) are satisfied, and that for $\varepsilon = 0$ we have the sequence of scattering maps $\sigma_\alpha : \operatorname{dom}(\sigma_\alpha) \to \Lambda$ for $\alpha = 1, \ldots, L$. Let $S^- \subset \Lambda$ be a strip. Assume that for every $z \in S^-$ there exists an $\alpha \in \{1, \ldots, L\}$ for which $z \in \operatorname{dom}(\sigma_\alpha)$,
\[
f^m \circ \sigma_\alpha(z) \in S^-,
\]
and there exist a constant $m \in \mathbb{N}$ and a point $x \in W^u_z(f, U) \cap W^s_{\sigma_\alpha(z)}(f)$ such that $f^m(x) \in W^s_{f^m(\sigma_\alpha(z))}(f, U)$ and
\[
\sum_{j=0}^{m-1} \pi_I g\left(f^j(x)\right) + \frac{1+\lambda}{1-\lambda} L_g C < 0.
\]
Then for sufficiently small $\varepsilon > 0$ there exists an $x_\varepsilon$ and $n_\varepsilon > 0$ such that
\[
\pi_I\left( x_\varepsilon - f^{n_\varepsilon}_\varepsilon(x_\varepsilon) \right) > 1.
\]

Proof.

The proof follows as in the proof of Theorem 17.

By combining the two strips we obtain shadowing of any prescribed finite sequence of actions.

Theorem 19.

Assume that two strips $S^+$ and $S^-$ satisfy assumptions of Theorems 17 and 18, respectively. If in addition

1. for every $z \in S^+$ there exists an $n$ (which can depend on $z$) such that $f^n(z) \in S^-$, and
2. for every $z \in S^-$ there exists an $n$ (which can depend on $z$) such that $f^n(z) \in S^+$,

then for any given finite sequence $\{I_k\}_{k=0}^{N}$ and any given $\delta > 0$, for sufficiently small $\varepsilon$ there exists an orbit of $f_\varepsilon$ which $\delta$-shadows the actions $I_k$; i.e. there exists a point $z_\varepsilon$ and a sequence of integers $n^\varepsilon_0 \le n^\varepsilon_1 \le \ldots \le n^\varepsilon_N$ such that
\[
\left\| \pi_I f^{n^\varepsilon_k}_\varepsilon(z_\varepsilon) - I_k \right\| < \delta.
\]

Proof.

Suppose that $I_1 > I_0$. (The opposite case will be analogous.) As in the proof of Theorem 17, we construct a pseudo-orbit $x_{i+1} = f^{m_i}_\varepsilon \circ \sigma^\varepsilon_{\alpha_i}(x_i)$, $x_i \in S^+_\varepsilon$, starting with a point $x_0$ with $\pi_I x_0 = I_0$, such that
\[
\pi_I(x_{i+1} - x_i) > \varepsilon c,
\]
for some $c > 0$. We can therefore find a pseudo-orbit for which $\left| \pi_I x_{i_1} - I_1 \right| < \delta/2$, for some $i_1 > 0$. If $I_2 > I_1$, we carry on as in the proof of Theorem 17, continuing with our pseudo-orbit along $S^+_\varepsilon$, until we reach $x_{i_2}$ such that $\left| \pi_I x_{i_2} - I_2 \right| < \delta/2$. If on the other hand $I_2 < I_1$, then we take $x_{i_1+1} = f^n_\varepsilon(x_{i_1})$, where $n$ is the number from assumption 1. (for $z = x_{i_1}$). For sufficiently small $\varepsilon$ we will obtain that $x_{i_1+1} \in S^-_\varepsilon$. We now construct the subsequent points $x_i$ along the strip $S^-_\varepsilon$, going down in $I$ along each step, until we reach $x_{i_2}$ satisfying $\left| \pi_I x_{i_2} - I_2 \right| < \delta/2$. Depending on whether $I_{k+1} > I_k$ or $I_{k+1} < I_k$ we proceed in an analogous manner: to move up in $I$ we construct the given fragment of the pseudo-orbit along $S^+_\varepsilon$; and to go down in $I$ we construct the given fragment of the pseudo-orbit along $S^-_\varepsilon$. Assumptions 1., 2. ensure that our pseudo-orbit can be chosen to jump between the strips $S^+_\varepsilon$ and $S^-_\varepsilon$ at any stage of the construction.

This way we construct a pseudo-orbit for which $\left| \pi_I x_{i_k} - I_k \right| < \delta/2$ for $k = 0, \ldots, N$. By Theorem 5 the pseudo-orbit $x_i$ can be $\delta/2$-shadowed by a true orbit, which gives the claim.

Remark 20.

Theorems 17, 18, 19 can be generalised to the setting of higher dimensional $I$ by singling out one action, as in Remark 14. The definition of the strip is then with respect to that particular action.
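
The drift condition (31) of Theorem 17 combines a finite sum along the homoclinic excursion with a penalty coming from two geometric series (the bounds (25) and (28) in the proof of Theorem 11). The following minimal sketch, in plain floating point and therefore not a rigorous validation, shows how the two pieces fit together. The function names `drift_margin` and `geometric_tail_bounds` are hypothetical, and the values of $\pi_I g$ along the orbit are assumed to be supplied by the caller.

```python
from typing import Sequence, Tuple


def drift_margin(pi_I_g_along_orbit: Sequence[float], lam: float,
                 L_g: float, C: float) -> float:
    """Left-hand side of (31): sum_j pi_I g(f^j(x)) - (1+lam)/(1-lam) * L_g * C.

    A positive return value means the (non-rigorous) check of (31) passes.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("the contraction rate lam must lie in (0, 1)")
    tail_penalty = (1.0 + lam) / (1.0 - lam) * L_g * C
    return sum(pi_I_g_along_orbit) - tail_penalty


def geometric_tail_bounds(lam: float, L_g: float, C: float,
                          terms: int = 200) -> Tuple[float, float]:
    """Partial sums of the two series behind (25) and (28), without the factor epsilon.

    forward  ~ L_g * sum_{i>=0} C*lam**i = L_g*C/(1-lam)       (cf. (25))
    backward ~ L_g * sum_{i>=1} C*lam**i = L_g*C*lam/(1-lam)   (cf. (28))
    Their sum approaches the penalty (1+lam)/(1-lam) * L_g * C used in (31).
    """
    forward = L_g * sum(C * lam ** i for i in range(terms))
    backward = L_g * sum(C * lam ** i for i in range(1, terms + 1))
    return forward, backward
```

In a computer assisted proof the same quantities would be evaluated with interval arithmetic, so that rounding errors are enclosed rather than ignored.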

4. Example of application

In this section we discuss our example, the generalized standard map, to which we apply our method. We give a computer assisted proof of the existence of diffusing orbits by applying Theorem 19. We validate the assumptions of the theorem using two independent implementations, which use different methods to obtain bounds on the stable/unstable manifolds of the NHIM. The first is based on cone conditions [24, 25, 26], and the second on the parameterization method [27, 28, 29].

Let $V(q)$ be a $\mathbb{Z}^n$-periodic function. Consider a map $f : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ given by
\[
f(q, p) = \left( q + p + \nabla V(q),\ p + \nabla V(q) \right).
\]

Remark 21. The map $f$ is symplectic and has the generating function
\[
S(q, Q) = \tfrac{1}{2}\| Q - q \|^2 + V(q).
\]

Remark 22.

When $n = 1$ and $V(q) = \alpha \cos(q)$ then we obtain the Chirikov Standard Map.

For our example, taking $q = (x, \theta)$, $p = (y, I)$ and
\[
V_\varepsilon(x, \theta) = \alpha \cos(x) - \varepsilon \sin(x) \sin(\theta),
\]
we obtain a family of maps (3). To be in line with the setup from section 3 we interpret that $f_\varepsilon : \mathbb{R}^2 \times \mathbb{T}^2 \to \mathbb{R}^2 \times \mathbb{T}^2$. (We could just as well interpret $f_\varepsilon$ to be on $\mathbb{T}^4$.)

In our example we take $\alpha = 4$. For this parameter, when $\varepsilon = 0$, on the $x, y$ coordinates we have a hyperbolic fixed point at the origin. The reader can get a sense of the dynamics by referring to the simulation results illustrated in Figure 1.

At $\varepsilon = 0$ the map decouples: for $F : \mathbb{R}^2 \to \mathbb{R}^2$ and $G : \mathbb{T}^2 \to \mathbb{T}^2$,
\[
f(x, y, \theta, I) = \left( F(x, y), G(\theta, I) \right). \tag{32}
\]
The origin on the $x, y$ plane is a hyperbolic fixed point of $F$ and $DF(0)$ has eigenvalues $\lambda, \lambda^{-1}$ for $\lambda = 3 - 2\sqrt{2}$ (with $\alpha = 4$). The set
\[
\Lambda = \left\{ (0, 0, \theta, I) : \theta \in \mathbb{T},\ I \in \mathbb{T} \right\}
\]
is a normally hyperbolic invariant manifold for $f$ with the rates $\lambda$ and $\mu = \sqrt{\left(\sqrt{5} + 3\right)/2}$. (The $\mu$ is the norm of the matrix acting on $\theta, I$ in (3) for $\varepsilon = 0$. In fact, $\mu$ is the famous golden ratio.)

We consider the standard symplectic form $\omega = dx \wedge dy + d\theta \wedge dI$. The maps $f_\varepsilon$ are $\omega$-symplectic and $\omega|_\Lambda$ is non-degenerate.

Remark 23.

Note that with the coupling considered in (3), for $\varepsilon > 0$ the manifold $\Lambda$ remains invariant, and the dynamics on it remains unchanged. We remark that our method does not depend on this property. Rather, we have chosen such a coupling so that it is evident that it is impossible to diffuse in $I$ by using the 'inner dynamics' on the perturbed manifold. Our diffusion is driven by the 'outer dynamics' along the homoclinic connections, which is clearly visible here.

We prove the following result.

Theorem 24 (Diffusion in the generalized standard map). Let $\delta$ be an arbitrary, fixed, strictly positive number. Then for every finite sequence $\{I_l\}_{l=0}^{L} \subset \left[\ldots,\ \pi - \ldots\right]$, there exists an $\varepsilon > 0$, a sequence of integers $n^\varepsilon_1, \ldots, n^\varepsilon_L$ and an orbit $z^\varepsilon_0, \ldots, z^\varepsilon_L$,
\[
z^\varepsilon_l = f^{n^\varepsilon_l}_\varepsilon\left( z^\varepsilon_{l-1} \right), \quad \text{for } l = 1, \ldots, L,
\]
such that
\[
\left| \pi_I z^\varepsilon_l - I_l \right| < \delta, \quad \text{for } l = 0, \ldots, L.
\]

Remark 25.

The proof of this theorem is based on computer assisted validation of the assumptions of Theorem 19. The strips validated by our computer program are depicted in Figure 4.

Figure 4: The strips from Theorem 19 for the map (3), validated by our computer program. The $S^+$ is in black and $S^-$ in gray. The angle $\theta$ is on the horizontal axis and $I$ on the vertical axis.

Remark 26.

From our validation of the strips (see Figure 4) it follows also that we can take the interval $\left[\pi + \ldots,\ 2\pi - \ldots\right]$ instead of $\left[\ldots,\ \pi - \ldots\right]$ in Theorem 24. Between these two intervals though, at $I = \pi$, we have a gap, which our method is unable to overcome. In other words, we are not able to establish an orbit which would start with $I \in (0, \pi)$ and finish with $I \in (\pi, 2\pi)$ (and vice versa).

Remark 27.

The diffusion can in fact be established for intervals reaching in $I$ slightly closer to $0$ and $\pi$ than stated in Theorem 24, where we have rounded down the intervals. Our computer assisted proof based on the parameterization method does a better job and produces higher (in $I$) strips than the method based on cone conditions. This is because the parameterization method leads to much higher accuracy of the bounds on the stable/unstable manifolds, which is then reflected in better accuracy of the remaining computations. Both methods though can be used to validate the $I$-intervals stated in Theorem 24 and Remark 26.

Remark 28.

If we take the parameter $\alpha$ in (3) closer to zero, then the unstable eigenvalue at the origin becomes smaller and the example becomes more challenging numerically. This is because with weak hyperbolicity it is more difficult to obtain good estimates on the manifolds; also the homoclinic excursion takes more iterates. We have found that close to $\alpha = .15$ the method based on cone conditions fails, but the parameterization method can still be applied.
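
The loss of hyperbolicity described in Remark 28 can be made concrete with a short computation. The sketch below assumes that the unperturbed map on $(x, y)$ has the standard-map form $F(x, y) = (x + y + \alpha \sin x,\ y + \alpha \sin x)$; this explicit form is our assumption about (3), and the function name `unstable_eigenvalue` is hypothetical. Under that assumption $DF(0)$ has trace $2 + \alpha$ and determinant $1$, so its eigenvalues solve $\lambda^2 - (2 + \alpha)\lambda + 1 = 0$.

```python
import math


def unstable_eigenvalue(alpha: float) -> float:
    """Largest eigenvalue of DF(0) = [[1 + alpha, 1], [alpha, 1]].

    Assumes F(x, y) = (x + y + alpha*sin(x), y + alpha*sin(x)) with alpha > 0.
    Since det DF(0) = 1, the stable eigenvalue is the reciprocal of the result.
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive for a hyperbolic fixed point")
    trace = 2.0 + alpha
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0))


def hyperbolicity_table(alphas):
    """Return (alpha, unstable eigenvalue, stable eigenvalue) for each alpha."""
    rows = []
    for alpha in alphas:
        lam_u = unstable_eigenvalue(alpha)
        rows.append((alpha, lam_u, 1.0 / lam_u))
    return rows
```

Calling `hyperbolicity_table([4.0, 1.0, 0.15])` shows the unstable eigenvalue decreasing towards $1$ as $\alpha$ decreases, which is the weak hyperbolicity regime the remark refers to.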

The proof of Theorem 24 exploits computer assisted validation methods for studying the local stable/unstable manifolds of fixed points. We apply these for the map $F$ from (32), i.e. the unperturbed map acting on $x, y$. We take the origin as our fixed point of $F$. The methods allow us to obtain an open interval $J \subset \mathbb{R}$ and smooth functions $P^u : J \to \mathbb{R}^2$ and $P^s : J \to \mathbb{R}^2$ such that $P^u(J)$ is the local unstable manifold $W^u(F, U)$ of the origin for $F$, and $P^s(J)$ is the local stable manifold $W^s(F, U)$ of the origin for $F$, for some neighbourhood $U$ of the origin. We give a description of both methods in section 5. For the purpose of this section it is enough that we [...] $C, \lambda \in \mathbb{R}$, $C, \lambda > 0$,
\[
\left\| F^i\left(P^s(x)\right) \right\| \le C\lambda^i, \qquad \left\| F^{-i}\left(P^u(x)\right) \right\| \le C\lambda^i \qquad \text{for all } i \in \mathbb{N} \text{ and } x \in J. \tag{33}
\]

Table 1: Homoclinic orbit (columns $i$, $x_i$, $y_i$).

The functions $P^u$ and $P^s$ give only a local description of the unstable and stable manifolds. To establish their intersections we use the following parallel shooting approach. Define
\[
\mathcal{F} : J \times B_0 \times \ldots \times B_{M-1} \times J \to \mathbb{R}^{2M+2},
\]
where $B_i \subset \mathbb{R}^2$ are cartesian products of two closed intervals, as
\[
\mathcal{F}(x, v_0, \ldots, v_{M-1}, y) := \left( P^u(x) - v_0,\ F(v_0) - v_1,\ \ldots,\ F(v_{M-2}) - v_{M-1},\ F(v_{M-1}) - P^s(y) \right).
\]
If we establish the existence of a point $p^* = \left( x^*, v^*_0, \ldots, v^*_{M-1}, y^* \right)$ for which
\[
\mathcal{F}(p^*) = 0, \tag{34}
\]
then we have established a sequence of points $v^*_0, \ldots, v^*_M$, where $v^*_0 = P^u(x^*)$ and $v^*_M = P^s(y^*)$, along a homoclinic to zero. The bound on the solution of (34) can be established by using the interval Newton theorem; see section 2.3. This way, we obtain a homoclinic orbit within a set of the form
\[
v^*_i \in \left[ x_i - r, x_i + r \right] \times \left[ y_i - r, y_i + r \right] \quad \text{for } i = 0, \ldots, M, \tag{35}
\]
where $x_i, y_i$ are written in Table 1. (Our $M$ is equal to 10.)

We use two methods to obtain bounds on $P^u$ and $P^s$.
In the case of the first method, by using cones, we obtain
\[
r = r_{\mathrm{cones}} = \ldots \cdot 10^{-\ldots}, \tag{36}
\]
and in the case of the second method, by using the parameterization method,
\[
r = r_{\mathrm{param}} = \ldots \cdot 10^{-\ldots}. \tag{37}
\]
(The bounds obtained by our computer program are in fact often tighter and vary from point to point. Here we have rounded them up to write a uniform enclosure $r$ for all considered points.)

(Footnote: An alternative to the interval Newton theorem could be to use the Newton-Krawczyk theorem or a version of the Newton-Kantorovich theorem. We use the interval Newton theorem because of its simplicity and the fact that it is sufficient for our needs in this particular example.)

Since we use the interval Newton method as the tool for our validation we also obtain transversality of the obtained intersection of our manifolds. (Such results are well known, see for instance [30] for a similar approach. We add the proof in the appendix to keep the work self-contained.)

Lemma 29.

The manifolds $W^u(F)$ and $W^s(F)$ intersect transversally.

Proof. The proof is given in Appendix B.

Define the sequence $\left( x^*_i, y^*_i \right) := F^i\left( v^*_0 \right)$ for all $i \in \mathbb{Z}$. Note that $\left( x^*_i, y^*_i \right) = v^*_i$, for $i = 0, \ldots, M$. We now show that for $\varepsilon = 0$ [...]

Lemma 30. The set
\[
\Gamma = \left\{ \left( x^*_0, y^*_0, \theta, I \right) : \theta, I \in \mathbb{T} \right\}
\]
is a homoclinic channel for $f$ and the associated scattering map $\sigma$ is globally defined and is the identity on $\Lambda$.

Proof.

To show that $\Gamma$ is a homoclinic channel for $f$ we need to prove points (i), (ii) and (iii) from Definition 3.

We start by observing that for $p \in \Gamma$
\[
T_p \Gamma = \{(0, 0)\} \times \mathbb{R}^2. \tag{38}
\]
Since $W^u(F)$, $W^s(F)$ intersect transversally in $\mathbb{R}^2$ at $v^*_0$ we also have
\[
T_{v^*_0} W^s(F) \oplus T_{v^*_0} W^u(F) = \mathbb{R}^2, \tag{39}
\]
\[
T_{v^*_0} W^s(F) \cap T_{v^*_0} W^u(F) = \{0\}. \tag{40}
\]
Since $W^u_\Lambda(f) = W^u(F) \times \mathbb{T}^2$ and $W^s_\Lambda(f) = W^s(F) \times \mathbb{T}^2$ we see that for $p \in \Gamma$
\[
T_p W^u_\Lambda(f) = T_{v^*_0} W^u(F) \times \mathbb{R}^2, \tag{41}
\]
\[
T_p W^s_\Lambda(f) = T_{v^*_0} W^s(F) \times \mathbb{R}^2. \tag{42}
\]
From (39), (41), (42) and (40), (41), (42), (38) we obtain, respectively,
\[
T_p W^s_\Lambda(f) \oplus T_p W^u_\Lambda(f) = \mathbb{R}^4, \qquad T_p W^s_\Lambda(f) \cap T_p W^u_\Lambda(f) = \{0\} \times \mathbb{R}^2 = T_p \Gamma,
\]
which proves (i) from Definition 3.

Since any two points that converge to each other need to start with the same values on $\theta, I$ we see that for any $z \in \Lambda$
\[
W^u_z(f) = W^u(F) \times \left\{ \pi_{(\theta, I)} z \right\}, \tag{43}
\]
\[
W^s_z(f) = W^s(F) \times \left\{ \pi_{(\theta, I)} z \right\}. \tag{44}
\]
This means that the wave maps are of the form
\[
\Omega_\pm\left( x^*_0, y^*_0, \theta, I \right) = (0, 0, \theta, I). \tag{45}
\]
Clearly $\Omega_\pm$ are diffeomorphisms as required in (iii) from Definition 3.

From (43), (44) we see that for any $p \in \Gamma$ and $z \in \Lambda$
\[
T_p W^u_z(f) = T_{v^*_0} W^u(F) \times \{(0, 0)\}, \tag{46}
\]
\[
T_p W^s_z(f) = T_{v^*_0} W^s(F) \times \{(0, 0)\}. \tag{47}
\]
Combining (38) with (46), (47) and comparing with (41), (42) gives
\[
T_p \Gamma \oplus T_p W^u_z(f) = T_{v^*_0} W^u(F) \times \mathbb{R}^2 = T_p W^u_\Lambda(f), \qquad
T_p \Gamma \oplus T_p W^s_z(f) = T_{v^*_0} W^s(F) \times \mathbb{R}^2 = T_p W^s_\Lambda(f),
\]
which means that we have (ii) from Definition 3. We have established that $\Gamma$ is a homoclinic channel. From (45) we see that the associated scattering map $\sigma$ is globally defined and is the identity on $\Lambda$.

We validate strips $S^+$ and $S^-$ with shapes as in Figure 4. These are composed of small overlapping rectangular fragments. Below we introduce a lemma which we then apply on each such rectangular part. First we introduce a notation.
For $a, b \in [0, 2\pi)$ we define the interval $[a, b] \subset \mathbb{T} = \mathbb{R}/\mathrm{mod}\ 2\pi$ as
\[
[a, b] = \begin{cases}
\left\{ x \in \mathbb{T} : a \le x \le b \right\} & \text{if } a \le b, \\
\left\{ x \in \mathbb{R} : a \le x \le b + 2\pi \right\} \ \mathrm{mod}\ 2\pi & \text{if } b < a.
\end{cases}
\tag{48}
\]
We define $(a, b) \subset \mathbb{T}$ as the interior of $[a, b]$.

Let $I_1, I_2 \in (0, 2\pi)$ satisfy $I_1 < I_2$. Let $s_1, s_2 \in \mathbb{T}$, and consider strips on $\Lambda$ of the form
\[
\{(0, 0)\} \times [s_1, s_2] \times [I_1, I_2]. \tag{49}
\]
(In (49) the interval $[s_1, s_2]$ is in the sense (48).) We now have the following lemma.

Lemma 31. If
\[
\sum_{i=0}^{M-1} \sin(x^*_i) \cos(\theta + iI) > 2\,\frac{1+\lambda}{1-\lambda} C \tag{50}
\]
and if for every $(\theta, I) \in [s_1, s_2] \times [I_1, I_2]$ there exists an $m \ge M$ (the $m$ can depend on the choice of $(\theta, I)$) such that
\[
\theta + mI \in (s_1, s_2), \tag{51}
\]
then assumptions of Theorem 17 hold true for our map (3) on the strip (49).

Proof. Condition (30) follows from (51). We need to validate (31). Since $v^*_M \in P^s(J)$, from (33) it follows that $|x^*_m| < C\lambda^{m-M}$, for $m \ge M$.

Consider an arbitrary fixed $(\theta, I) \in [s_1, s_2] \times [I_1, I_2]$ and let
\[
C_m := \sum_{j=0}^{m-1} \sin(x^*_j) \cos(\theta + jI).
\]
Since for $j \ge M$ we know that $|x^*_j| < C\lambda^{j-M}$, we see that for $m \ge M$
\[
|C_m - C_M| \le \sum_{j=M}^{m-1} \left| \sin(x^*_j) \right| \left| \cos(\theta + jI) \right| \le C\,\frac{1 - \lambda^{m-M}}{1-\lambda} < C\,\frac{1+\lambda}{1-\lambda}. \tag{52}
\]
Observe that the map $(x, y, \theta, I) \to \sin(x)\cos(\theta)$ is Lipschitz with the constant $L_g = 1$. For $z = (0, 0, \theta, I) \in \{(0, 0)\} \times [s_1, s_2] \times [I_1, I_2]$, consider $x = \left( x^*_0, y^*_0, \theta, I \right) \in W^u_z(f, U) \cap W^s_{\sigma_\alpha(z)}(f)$. Since $\left( x^*_0, y^*_0 \right) = v^*_0 \in P^u(J)$ and $v^*_M \in P^s(J)$, for every $m \ge M$ we have $f^m(x) \in W^s_{f^m(\sigma_\alpha(z))}(f, U)$. Also, for every $m \ge M$, by using (50) and (52), we obtain
\[
\begin{aligned}
\sum_{j=0}^{m-1} \pi_I g\left( f^j(x) \right) - \frac{1+\lambda}{1-\lambda} L_g C
&= \sum_{j=0}^{m-1} \sin(x^*_j)\cos(\theta + jI) - \frac{1+\lambda}{1-\lambda} C \\
&\ge C_M - |C_m - C_M| - \frac{1+\lambda}{1-\lambda} C \\
&\ge C_M - 2\,\frac{1+\lambda}{1-\lambda} C > 0,
\end{aligned}
\]
which ensures (31). This finishes our proof.

Remark 32.

A mirror result lets us validate assumptions of Theorem 18. The only difference is that instead of (50), we require
\[
\sum_{i=0}^{M-1} \sin(x^*_i) \cos(\theta + iI) < -2\,\frac{1+\lambda}{1-\lambda} C.
\]
We are now ready to prove Theorem 24.
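
The interval convention (48) and the return condition (51) are easy to explore pointwise. The sketch below is a floating-point illustration only: a rigorous validation has to cover every $(\theta, I)$ in the rectangle, which this pointwise code does not do. The names `in_torus_interval` and `first_return_time` and the cutoff `m_max` are hypothetical, and we read $[a, b]$ as the arc that starts at $a$ and runs in the positive direction to $b$.

```python
import math
from typing import Optional

TWO_PI = 2.0 * math.pi


def in_torus_interval(x: float, a: float, b: float,
                      open_interval: bool = False) -> bool:
    """Check whether x (mod 2*pi) lies on the arc running from a to b.

    With open_interval=True the endpoints are excluded, which corresponds
    to the interior (a, b) of the arc.
    """
    length = (b - a) % TWO_PI   # length of the arc from a to b
    offset = (x - a) % TWO_PI   # position of x measured from a along the arc
    if open_interval:
        return 0.0 < offset < length
    return offset <= length


def first_return_time(theta: float, I: float, s1: float, s2: float,
                      M: int, m_max: int = 10_000) -> Optional[int]:
    """Smallest m with M <= m <= m_max such that theta + m*I is in (s1, s2).

    Returns None when no such m is found up to the cutoff m_max.
    """
    for m in range(M, m_max + 1):
        if in_torus_interval(theta + m * I, s1, s2, open_interval=True):
            return m
    return None
```

For an irrational rotation number $I / 2\pi$ a return time always exists, but it can be large; the cutoff `m_max` only bounds the search in this sketch.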

Proof of Theorem 24.

By Lemma 29 the stable and unstable manifolds of the origin for the map $F$ intersect transversally. Moreover, we have explicit bounds for a homoclinic orbit along this intersection, written in Table 1 and (35–37). This means that, by Lemma 30, the scattering map for the unperturbed system is well defined.

Using the bounds from Table 1 and (35–37), which give an enclosure of a finite fragment of the homoclinic orbit, together with the aid of Lemma 31, our computer program constructs the strip $S^+$ from Figure 4. This strip is a union of overlapping rectangles, for which assumptions of Theorem 17 are satisfied. We use a mirror result to Lemma 31 (see Remark 32), to construct the strip $S^-$ from Figure 4, for which assumptions of Theorem 18 are satisfied. We also validate that for these two strips conditions 1. and 2. of Theorem 19 are fulfilled.

After such validation the result follows from Theorem 19.

Figure 5: The cone at $z$ intersected with $B$ (in dark grey) is mapped into the cone at $\tilde{F}(z)$ (in light grey).

The computer assisted proof using cone conditions for the validation of intersections of the manifolds was performed with the CAPD library [31]. The parameterization method approach was implemented in Matlab. The source code is available on the web page of the corresponding author.
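
The parallel shooting operator behind (34) has a simple block structure, which the following sketch assembles in floating point. It is an illustration under stated assumptions, not the validated implementation: the chart functions `P_u`, `P_s` are supplied by the caller, the helper names `standard_map` and `shooting_residual` are hypothetical, and the explicit form $F(x, y) = (x + y + \alpha \sin x,\ y + \alpha \sin x)$ is our assumption about the unperturbed map. A zero of the residual corresponds to an orbit segment that starts on the image of `P_u` and ends on the image of `P_s`.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def standard_map(v: Point, alpha: float = 4.0) -> Point:
    """Assumed form of the unperturbed map F on the (x, y) plane."""
    x, y = v
    y_new = y + alpha * math.sin(x)
    return (x + y_new, y_new)


def shooting_residual(x: float, vs: Sequence[Point], y: float,
                      F: Callable[[Point], Point],
                      P_u: Callable[[float], Point],
                      P_s: Callable[[float], Point]) -> List[float]:
    """Stack the blocks
        P_u(x) - v_0, F(v_0) - v_1, ..., F(v_{M-2}) - v_{M-1}, F(v_{M-1}) - P_s(y)
    into one flat vector of length 2*M + 2, where M = len(vs).
    """
    M = len(vs)
    pairs = [(P_u(x), vs[0])]
    pairs += [(F(vs[k]), vs[k + 1]) for k in range(M - 1)]
    pairs.append((F(vs[M - 1]), P_s(y)))
    residual: List[float] = []
    for left, right in pairs:
        residual.extend([left[0] - right[0], left[1] - right[1]])
    return residual
```

The unknowns are $x$, the $M$ planar points $v_0, \ldots, v_{M-1}$ and $y$, so there are $2M + 2$ scalar unknowns and $2M + 2$ scalar equations, which is what makes a Newton-type argument applicable.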

5. Invariant manifolds and their intersections

We now discuss computation of the local stable and unstable manifolds, with a focus on obtaining mathematically rigorous computer assisted error bounds on all approximations.

The ideas presented here are based on the more general method from [24]. We reformulate the results for our particular setting, giving sketches of proofs, in order to keep the paper self-contained.

Let $F$ be the map (32), i.e. the unperturbed map $f$ acting on $x, y$. Let $P \in \mathbb{R}^{2 \times 2}$ and $\tilde{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined as follows
\[
P := \begin{pmatrix} 1 + \sqrt{2} & 1 - \sqrt{2} \\ 2 & 2 \end{pmatrix}, \qquad \tilde{F}(z) := P^{-1} F(P z),
\]
where we recall that $\alpha = 4$. The matrix $P$ is the coordinate change to Jordan form for $DF(0)$ and $\tilde{F}(z)$ is the map expressed in local coordinates, which diagonalizes the stable and unstable directions at the origin, i.e. $D\tilde{F}(0) = \operatorname{diag}\left( \left(3 - 2\sqrt{2}\right)^{-1},\ 3 - 2\sqrt{2} \right)$. We refer to these as the local coordinates and write $z = (u, s)$. (The $u$ stands for 'unstable' and $s$ for 'stable'.)

Let $L \in \mathbb{R}$ be a fixed constant satisfying $L > 0$ and define $C : \mathbb{R}^2 \to \mathbb{R}$ as
\[
C(u, s) = L|u| - |s|.
\]
For $z \in \mathbb{R}^2$ we define the cone at $z$ as $C^+(z) := \{v : C(z - v) \ge 0\}$ (see Figure 5). Let $r > 0$, $J := [-r, r] \subset \mathbb{R}$ and let $B \subset \mathbb{R}^2$ be the rectangle $B := [-r, r] \times [-Lr, Lr]$.

Definition 33.

We say that $\tilde{F}$ satisfies cone conditions in $B$ if for every $z \in B$ we have (see Figure 5)
\[
\tilde{F}\left( C^+(z) \cap B \right) \subset C^+\left( \tilde{F}(z) \right).
\]

(Footnote: Computer Assisted Proofs in Dynamics: http://capd.ii.uj.edu.pl)

We have the following lemma, which gives bounds on the unstable manifold in the local coordinates.

Lemma 34. If $\tilde{F}$ satisfies cone conditions in $B$, and there exists a $\lambda < 1$ such that for every $z = (u, s) \in C^+(0)$ we have
\[
\left| \pi_u \tilde{F}(u, s) \right| > \lambda^{-1} |u|, \tag{53}
\]
then there exists a smooth function $w : J \to [-rL, rL]$, such that
\[
W^u(\tilde{F}, B) = \{(u, w(u)) : u \in J\}.
\]
Moreover, $\left| \frac{d}{du} w(u) \right| \le L$ and for every $u \in J$
\[
\left\| \tilde{F}^{-n}(u, w(u)) \right\| < \lambda^n \sqrt{1 + L^2}\, |u|. \tag{54}
\]

Proof.

This lemma in a slightly more general form was proven in [32]. We therefore limit ourselves to a sketch of the proof, which is given in Appendix C.

In practice we can validate cone conditions and (53) from the interval enclosure of the derivative of $\tilde{F}$ on $B$. We give proofs of the lemmas below in the appendix.

Lemma 35. If $[D\tilde{F}(B)]\left( C^+(0) \right) \subset C^+(0)$ then $\tilde{F}$ satisfies cone conditions.

The above lemma is straightforward to apply in interval arithmetic by checking that
\[
[D\tilde{F}(B)]\left( \{1\} \times [-L, L] \right) \subset C^+(0).
\]

Lemma 36.

Let $a_{11}, a_{12}, a_{21}, a_{22}$ be real intervals such that $[D\tilde{F}(B)] = (a_{ij})_{i,j \in \{1,2\}}$. If
\[
a_{11} - L|a_{12}| > \lambda^{-1}
\]
then (53) is fulfilled.

Using a computer program we compute an interval enclosure $[D\tilde{F}(B)]$. This enclosure is used to validate, via Lemmas 35 and 36, the assumptions of Lemma 34. This way we obtain $w : J \to [-rL, rL]$, and define $P^u : J \to \mathbb{R}^2$ by
\[
P^u(x) := P(x, w(x)).
\]
Note that since $w(x)$ is Lipschitz with constant $L$, $w(x) \in [-Lx, Lx]$, our method allows us to obtain the explicit bound
\[
P^u(x) \in P\left( \{x\} \times [-Lx, Lx] \right), \quad \text{for every } x \in J.
\]
Moreover, by Lemma 34 we know that $\frac{d}{dx}(x, w(x)) \in \{1\} \times [-L, L]$, which gives the bound on the derivative of $P^u$ as
\[
\frac{d}{dx} P^u(x) \in P\left( \{1\} \times [-L, L] \right), \quad \text{for every } x \in J.
\]
From (54) we also see that for every $x \in J$
\[
\left\| F^{-n}\left( P^u(x) \right) \right\| = \left\| P \tilde{F}^{-n}(x, w(x)) \right\| \le \|P\| \lambda^n \sqrt{1 + L^2}\, |x| \le C\lambda^n,
\]
for $C := \|P\| \sqrt{1 + L^2}\, r$; recall that $J = [-r, r]$. We thus see that we have all the bounds for $P^u$, which are required by section 4.

The function $P^s$ and associated bounds can be obtained the same way, by considering $F^{-1}$ instead of $F$.

5.2. Parameterization method for the (un)stable manifold with validated error bounds

We now review a method for computing high order polynomial expansions of chart maps for the local stable/unstable manifolds, providing accurate approximations further from the fixed point. Employing these expansions improves error bounds in computer assisted proofs for connecting orbits, as it shortens the orbit segment between the local stable/unstable manifold segments. This in turn improves the condition number of the matrix appearing in the interval Newton method.

The idea behind the parameterization method is to find a chart map conjugating the given dynamics to the linear dynamics at the fixed point.
The conjugacy is then used to accurately track orbits as they approach the fixed point. Since the chart map expansions are used in computer assisted proofs, it is necessary to develop explicit and mathematically rigorous bounds for the truncation errors. We also need to be able to compute rigorous enclosures on derivatives.

Our approach is adapted from more general results of [30], a work which is itself based on the parameterization method of [33, 34, 29]. We refer to the book of [35] for a much more complete discussion of the parameterization method. The key is that the desired conjugacy relation is viewed as a nonlinear functional equation, and error bounds are obtained via fixed point arguments in appropriate function spaces. Rather than proceeding in full generality, we focus for the sake of simplicity only on the details needed in the present work.

So, suppose that $f : \mathbb{C}^2 \to \mathbb{C}^2$ is a function of two complex variables, analytic in an open set about the fixed point $z_0$. Let $\lambda \in \mathbb{C}$ denote an unstable eigenvalue of $Df(z_0)$, so that $|\lambda| > 1$, and let $\xi \in \mathbb{C}^2$ be an associated eigenvector of $\lambda$.

With $D \subset \mathbb{C}$ denoting the unit disk in the complex plane, we look for an analytic function $P : D \to \mathbb{C}^2$ having that $P(0) = z_0$, $P'(0) = \xi$, and that $P$ solves the invariance equation
\[
f(P(\sigma)) = P(\lambda \sigma), \quad \sigma \in D. \tag{55}
\]
Such a $P$ parameterizes a local unstable manifold attached to $z_0$. Indeed, the equation requires that applying the linear dynamics in $D$ is the same as applying the full dynamics on the image of $P$: that is, $P$ is a conjugacy as desired.

Since $\lambda \neq 0$ we can substitute $\sigma \mapsto \lambda^{-1}\sigma$ and obtain the equivalent fixed point problem
\[
P(\sigma) = f\left( P(\lambda^{-1}\sigma) \right), \quad \text{for } \sigma \in D. \tag{56}
\]
This problem has exactly the same solution, but is better suited for a-posteriori error analysis. Writing
\[
P(\sigma) = \sum_{n=0}^{\infty} \begin{pmatrix} a_n \\ b_n \end{pmatrix} \sigma^n = \sum_{n=0}^{\infty} p_n \sigma^n,
\]
we impose that $p_0 = z_0$ and $p_1 = \xi$, so that $P$ satisfies the first order constraints, i.e. $P(0) = z_0$ and $DP(0) = \xi$. The coefficients $p_2, p_3, p_4, \ldots$ are computed via a power matching argument, which depends strongly on the nonlinearity of $f$. See Appendix F for the derivation when $f$ is the standard map.

Let
\[
P^N(\sigma) = \sum_{n=0}^{N} \begin{pmatrix} a_n \\ b_n \end{pmatrix} \sigma^n
\]
denote the approximate parameterization obtained by truncating $P$ to order $N$. That is, we suppose that the coefficients $p_0, \ldots, p_N$ are exactly the Taylor coefficients of $P$. In practice these must be computed using validated numerical methods and are known only up to interval enclosures.

Our goal is to understand the truncation error on $D$. Define the defect function
\[
E^N(\sigma) = f\left[ P^N(\lambda^{-1}\sigma) \right] - P^N(\sigma).
\]
The quantity
\[
\epsilon_N = \left\| E^N \right\|
\]
is an a-posteriori error indicator on $D$ associated with the approximation $P^N$. We note that $\epsilon_N$ is made small either by taking $N$ large or by taking $\|\xi\|$ small.
In practice this is a delicate balancing act, see Remark 41 below.

Small defects do not necessarily imply small errors, and further hypotheses are needed to bound the truncation error associated with $P_N$ in terms of $\epsilon_N$. We now formulate an a-posteriori theorem, whose proof is given in Appendix H for the sake of completeness. The statement requires a little notation and a few additional definitions having to do with the Taylor remainder at $z_0$.

So, for $\sigma \in \mathbb{C}$ let $|\sigma|$ denote the usual complex absolute value. We write $z = (z_1, z_2) \in \mathbb{C}^2$ and endow $\mathbb{C}^2$ with the norm
$$ \|z\| = \max(|z_1|, |z_2|). $$
This induces a norm on the set of all $2 \times 2$ matrices $A = (a_{ij})$ given by
$$ \|A\| = \max(|a_{11}| + |a_{12}|, |a_{21}| + |a_{22}|). $$
With this norm we have that $\|Az\| \le \|A\| \|z\|$. For a function $P : D \to \mathbb{C}^2$ we write
$$ \|P\| = \sup_{\sigma \in D} \|P(\sigma)\| $$
to denote the usual supremum norm on $C^0(D, \mathbb{C}^2)$. Recall that the set
$$ \mathcal{D} = \left\{ P : D \to \mathbb{C}^2 : P \text{ is analytic on } D \text{ and } \|P\| < \infty \right\} $$
is a Banach space.

Fix $z_0 \in \mathbb{C}^2$ and $r_*, R \in \mathbb{R}$ with $0 < r_* < R$. For $\|z - z_0\| < R$ and $\|w\| < r_*$ write the first order Taylor expansion of $f$ as
$$ f(z + w) = f(z) + Df(z) w + \mathcal{R}_z(w). $$
Here $\mathcal{R}_z(\cdot)$, the first order Taylor remainder at $z$, is analytic in both $z$ and $w$.

By the Taylor remainder theorem there are constants $0 < C_1, C_2$ with
$$ \sup_{\|z - z_0\| \le R} \|\mathcal{R}_z(w)\| \le C_1 \|w\|^2, \qquad \text{for } \|w\| \le r_*, \tag{57} $$
and
$$ \sup_{\|z - z_0\| \le R} \|D\mathcal{R}_z(w)\| \le C_2 \|w\|, \qquad \text{for } \|w\| \le r_*. \tag{58} $$
There is also a $C_3 > 0$ so that
$$ \sup_{\|z - z_0\| \le R} \|Df(z)\| \le C_3. \tag{59} $$
If $f$ is entire then $\mathcal{R}_z(w)$ is entire in both $z$ and $w$, and explicit constants $C_1, C_2, C_3$ for the standard map are derived in Appendix G.
In general what is needed is that $f$ is analytic on a ball about $z_0$ of radius $R + r_*$.

Theorem 37 (A-posteriori error bounds for Equation (55)). Suppose that $f : \mathbb{C}^2 \to \mathbb{C}^2$ fixes the point $z_0 \in \mathbb{C}^2$. Let $0 < r_* < R$ and suppose that $f$ is analytic for all $z \in \mathbb{C}^2$ with $\|z - z_0\| < R + r_*$. Assume that $Df(z_0)$ has a single unstable eigenvalue, denoted by $\lambda$, and let $\mu = \lambda^{-1}$ and $\xi \in \mathbb{C}^2$ be an eigenvector associated with $\lambda$. Let $P$ denote the solution of Equation (55) on the unit disk $D \subset \mathbb{C}$, and let $p_0, \ldots, p_N \in \mathbb{C}^2$ denote the zeroth through $N$-th order power series coefficients of $P$ subject to the constraints $p_0 = z_0$ and $p_1 = \xi$.

Let $C_1$, $C_2$, and $C_3$ be as defined in Equations (57), (58), and (59), and assume that they satisfy
$$ \sum_{n=1}^{N} \mu^n \|p_n\| < R, \tag{60} $$
and
$$ 4 C_2 |\mu|^{N+1} \epsilon_N < \left( 1 - C_3 |\mu|^{N+1} \right)^2. \tag{61} $$
If $r > 0$ has
$$ r \le r_*, \tag{62} $$
and
$$ C_2 |\mu|^{N+1} r^2 - \left( 1 - C_3 |\mu|^{N+1} \right) r + \epsilon_N < 0, \tag{63} $$
then
$$ \sup_{|\sigma| \le 1} \left\| P(\sigma) - P_N(\sigma) \right\| \le r. $$

Remark 38 (Existence of an $r > 0$ satisfying the hypotheses of Equation (63)). Observe that
$$ p(r) = C_2 |\mu|^{N+1} r^2 - \left( 1 - C_3 |\mu|^{N+1} \right) r + \epsilon_N $$
is a quadratic polynomial with $p(0) = \epsilon_N > 0$ and $p'(0) = -\left( 1 - C_3 |\mu|^{N+1} \right) < 0$. If the discriminant condition hypothesized in Equation (61) is met, then $p(r)$ has two positive real roots $r_\pm$, and moreover $p(r)$ is negative for all $r \in (r_-, r_+)$. Given the problem data $C_2$, $C_3$, $|\mu|$, $N$, and $\epsilon_N$, finding an appropriate value of $r$ is a matter of solving a quadratic equation.

Remark 39 (Stable manifold parameterization).

The standard map is symplectic, so that if $\lambda$ with $|\lambda| < 1$ is a stable eigenvalue of $Df(z_0)$ then $1/\lambda$ is an unstable eigenvalue. The standard map is entire with entire inverse, and the theorem above applies to the unstable manifold of $f^{-1}$. A more general alternative, which does not require invertibility of $f$, is to study separately the equation
$$ P(\lambda\sigma) = f(P(\sigma)), $$
and develop analogous a-posteriori analysis for this equation. See [30].

Remark 40 (Real invariant manifolds). Suppose that $f$ is real valued for real inputs, as is the case for the standard map. Then $p_0$ is real, and we are interested in the real image of $P$. In the case considered in the present work, that of a real hyperbolic saddle, the eigenvalues and eigenvectors are real also. It follows that solutions $a_n$ and $b_n$ of Equation (F.5) are real at all orders. Taking real values of $\sigma$ provides the parameterization of the real stable manifold for $f$ at $p_0$, and treating $\sigma$ as a complex variable is a convenience which facilitates the use of analytic function theory in the error analysis.

Remark 41 (Scaling the eigenvector).

The coefficients $p_2, p_3, p_4, \ldots$, and hence the solution $P(\sigma)$, are only unique up to the choice of the scaling of the eigenvector. This is seen explicitly in Appendix F, where the formal series solution of Equation (56) is derived. This lack of uniqueness is exploited in numerical calculations, providing control over the growth rate of the coefficient sequence. Algorithms for determining optimal scalings are discussed in [36]. In the present work we determine good scalings through numerical experimentation. More precisely, we fix at the start of the calculation the order of approximation $N$. Then we adjust the scaling of the eigenvector so that coefficients of order $N$ are roughly of size machine epsilon, as the magnitude of the $N$-th power series coefficient of $P$ serves as a good heuristic indicator of the size of the truncation error.

Remark 42 (Bounds on derivatives).

Suppose that $P(\sigma) = P_N(\sigma) + H(\sigma)$ for $|\sigma| < 1$, with $\|H\| \le r$. Then $P$ is differentiable on $D$ with
$$ \frac{d}{d\sigma} P(\sigma) = \frac{d}{d\sigma} P_N(\sigma) + \frac{d}{d\sigma} H(\sigma), $$
for all $\sigma \in D$. Here $P_N(\sigma)$ is a polynomial whose derivative is given by the standard formula. The derivative of $H$ is bounded on any smaller disk thanks to the Cauchy bound
$$ \sup_{|\sigma| \le e^{-\nu}} \left\| \frac{d}{d\sigma} H(\sigma) \right\| \le \frac{2\pi}{\nu} \|H\| \le \frac{2\pi}{\nu} r, \tag{64} $$
where $\nu > 0$. A proof is in [30].

In practice finding $\epsilon_N$ requires a bound on the tail of $f(P_N(\sigma))$, and this will of course depend on the explicit form of the map $f$. For example, if $f$ is the Chirikov Standard Map
$$ f(z_1, z_2) = \begin{pmatrix} z_1 + z_2 + \alpha \sin(z_1) \\ z_2 + \alpha \sin(z_1) \end{pmatrix}, \tag{65} $$
then we write
$$ f\left( P_N\left( \lambda^{-1}\sigma \right) \right) = \sum_{n=0}^{\infty} f_n \sigma^n, $$
and observe that we have to study the term $[\sin(P_N(\lambda^{-1}\sigma))]_n$, the Taylor coefficients of the composition of $P_N$ with the sine function. Indeed, we have that
$$ E_N(\sigma) = f(P_N(\lambda^{-1}\sigma)) - P_N(\sigma) = \sum_{n=N+1}^{\infty} f_n \sigma^n = \alpha \sum_{n=N+1}^{\infty} [\sin(P_N(\lambda^{-1}\sigma))]_n \begin{bmatrix} 1 \\ 1 \end{bmatrix} \sigma^n, $$
as the lower order terms cancel exactly by hypothesis, and the linear operations on $P_N$ do not contribute to the tail of the series. Then by taking
$$ \epsilon_N = \sum_{n=N+1}^{\infty} \|f_n\| = \sum_{n=N+1}^{\infty} |\alpha| \left| [\sin(P_N(\lambda^{-1}\sigma))]_n \right| $$
we obtain $\|E_N\| \le \epsilon_N$, as needed.

However the Taylor series expansion of $\sin(P_N(\lambda^{-1}\sigma))$ is an infinite series, even though $P_N$ is a polynomial, and bounding the tail is not a finite calculation. The following Lemma, whose proof is found in Appendix I, exploits the fact that $\sin(P_N(\lambda^{-1}\sigma))$ is the solution of a certain linear differential equation involving only the known data $P_N$. This analysis reduces the necessary bound to a finite sum.

Lemma 43.

Suppose that $g_N : \mathbb{C} \to \mathbb{C}$ is an $N$-th order polynomial denoted by
$$ g_N(\sigma) = \sum_{n=0}^{N} \beta_n \sigma^n. $$
We write
$$ c(\sigma) = \cos\left( g_N(\sigma) \right) = \sum_{n=0}^{\infty} c_n \sigma^n, \qquad \text{and} \qquad s(\sigma) = \sin\left( g_N(\sigma) \right) = \sum_{n=0}^{\infty} s_n \sigma^n, $$
to denote the power series of the compositions with sine and cosine. Let
$$ s^N(\sigma) = \sum_{n=0}^{N} s_n \sigma^n, \qquad \text{and} \qquad c^N(\sigma) = \sum_{n=0}^{N} c_n \sigma^n, $$
be the Taylor polynomials to $N$-th order, where recursion relations for the coefficients $s_n$ and $c_n$ are worked out via power matching in Appendix F. Let
$$ \hat{K} = \sum_{n=0}^{N-1} (n+1) |\beta_{n+1}|, $$
and
$$ e_N = \max\left( \sum_{n=N+1}^{2N} \left| \sum_{k=0}^{n-1} \frac{k+1}{n} s_{n-k-1} \beta_{k+1} \right|, \ \sum_{n=N+1}^{2N} \left| \sum_{k=0}^{n-1} \frac{k+1}{n} c_{n-k-1} \beta_{k+1} \right| \right), $$
where in these finite sums only the coefficients of $s^N$, $c^N$ and $g_N$ enter, that is, $s_j$, $c_j$ and $\beta_j$ with $j > N$ are replaced by zero. Assume that
$$ \frac{\hat{K}}{N+1} < 1. $$
Then the truncation error on the unit disk $D$ satisfies
$$ \sup_{|\sigma| \le 1} \left\| \sin(g_N(\sigma)) - s^N(\sigma) \right\| \le \frac{e_N}{1 - \frac{\hat{K}}{N+1}}, \tag{66} $$
and similarly
$$ \sup_{|\sigma| \le 1} \left\| \cos(g_N(\sigma)) - c^N(\sigma) \right\| \le \frac{e_N}{1 - \frac{\hat{K}}{N+1}}. \tag{67} $$

Figure A.6: For $I \in [0, 1]$ the system is not modified (bottom grey area). In the white regions the system is modified by the 'bump' function to allow for gluing at $I = 2$ and $I = -1$. The system on $I \in [2, 5]$ is a 'flipped copy' of the system on $I \in [-1, 2]$. For $I \in [5, 2\pi - 1]$ we 'freeze' $I = -1 = 2\pi - 1$.

Appendix A. Modification of a system with a normally hyperbolic invariant cylinder to one with a normally hyperbolic invariant torus

Consider a family of maps f ε : R d × R × T → R d × R × T for which π I f ( x ) = π I x . Wenow modify the family to ˜ f ε : R d × T → R d × T so that f ε ( u , s , I , θ ) = ˜ f ε ( u , s , I , θ ) for I ∈ [0 , . Before writing out the slightly technical formulae we ﬁrst explain the idea, which is depictedin Figure A.6. For I ∈ [0 ,

1] we leave the system as it is. We then employ a ‘bump’ function sothat at the edges of the domain I ∈ [ − , I = I = − = π −

1, we have ˜ f ε = f .Then for I ∈ [2 ,

5] we ‘ﬂip’ the system and glue at I =

2. For the remaining I ∈ [5 , π −

1] we‘freeze’ the system taking f with I = − = π − f ε is chosen are of secondary importance. The important issueis that if we prove di ﬀ usion in I over the range I ∈ [0 ,

1] for ˜ f ε , then this implies di ﬀ usion for f ε , as for I ∈ [0 ,

1] the systems are the same. Since all the assumptions of Theorem 11 are for ε =

0, the discussion here is of an abstract nature. The assumptions of Theorems 11 need tobe validated for f over I ∈ [0 , b : R → [0 ,

1] for which b ( I ) = I ∈ R \ ( − , b ( I ) = I ∈ [0 , g ( u , s , I , θ ) : = ( u , s , − I , θ ), h ( u , s , I , θ ) : = ( u , s , − , θ ) and take˜ f ε ( x ) =  f ( x ) + b ( π I x ) ( f ε ( x ) − f ( x )) for π I x ∈ [ − , , f ( g ( x )) + b (4 − π I x ) ( f ε ( g ( x )) − f ( g ( x ))) for π I x ∈ [2 , , f ( h ( x )) for π I x ∈ [5 , π − . Remark 44.

A similar construction works in the case when I is higher dimensional; say I ∈ [0 , k . In this case we control one action as mentioned in Remarks 14, 20, but we need to makesure that the strips for Theorems 17, 18, 19 do not intersect the boundary of { I ∈ [0 , k } on anyaction coordinate, except the one action which we control. Otherwise the dynamics could escape { I ∈ [0 , k } through the remaining actions, and the results obtained for the artiﬁcial ‘glued’system would not need to be realized by the true system. Appendix B. Proof of Lemma 29

We will show that the tangent lines to W u ( F ) and W s ( F ) at the intersection point v ∗ M span R . Note that v ∗ M = F M ( P u ( x ∗ )). Taking w = DP u ( x ∗ ) ∈ R and w k = DF (cid:16) v ∗ k (cid:17) w k − ∈ R wesee that ddx F M ( P u ( x )) | x = x ∗ = w M − . If w M − was colinear with ddy P s ( y ) | y = y ∗ , then there would exist an α (cid:44) ddy P s ( y ∗ ) = α w M − . Taking the vector V = (1 , w , . . . , w M − , /α ) would lead to D F ( p ∗ ) V = . This is a contradiction, since if p ∗ is validated by the use of Theorem 9, so the matrix D F ( p ∗ )must be invertible. Appendix C. Proof of Lemma 34

Since 0 is a hyperbolic ﬁxed point of ˜ F , locally at the ﬁxed point the unstable manifold exists,is smooth, and tangent to the horizontal axis, hence it is contained in C + (0). Cone conditiontogether with (53) ensure that the unstable manifold is streched through B to become a graphabove J . Since locally, close to zero, the unstable manifold is tangent to the horizontal axis it isa graph of a function with the Lipschitz constant smaller than L . This property is preserved asthe manifold is stretched throughout B thanks to the cone condition.To show (54) note that for z ∈ C + (0), since | π s z | < L | π u z | , we obtain (cid:107) z (cid:107) ≤ √ + L | π u z | .Thus, from (53), (cid:107) z (cid:107) < (cid:112) + L | π u z | < (cid:112) + L λ (cid:12)(cid:12)(cid:12) π u ˜ F ( z ) (cid:12)(cid:12)(cid:12) . For instance b ( x ) = exp( − (cid:16) − x (cid:17) − ) for x ∈ [ − , b ( x ) = x ∈ [0 , b ( x ) = exp( − (cid:16) − (1 − x ) (cid:17) − ) for x ∈ [1 ,

2] and zero otherwise. z = ˜ F − n ( w ( u )) and using (53) we obtain (cid:13)(cid:13)(cid:13) ˜ F − n ( w ( u )) (cid:13)(cid:13)(cid:13) < (cid:112) + L λ (cid:12)(cid:12)(cid:12) π u ˜ F − n + ( w ( u )) (cid:12)(cid:12)(cid:12) < . . . < (cid:112) + L λ n | u | , as required. Appendix D. Proof of Lemma 35

Let z ∈ B and v ∈ C + ( z ) ∩ B . Since v − z ∈ C + (0), from our assumption it follows that˜ F ( v ) − ˜ F ( z ) = (cid:90) ddt ˜ F ( z + t ( v − z )) dt (D.1) = (cid:90) D ˜ F ( z + t ( v − z )) dt ( v − z ) ∈ [ DF ( B )] ( v − z ) ⊂ C + (0) , hence ˜ F ( C + ( z )) ⊂ C + ( ˜ F ( z )), as required. Appendix E. Proof of Lemma 36

Let ( u , s ) ∈ C + (0) ∩ B . From a mirror argument to (D.1) and since | s | ≤ L | u | , | π u F ( u , s ) | ∈ | π u [ DF ( B )] ( u , s ) | ≥ a | u | − L | a | | u | > λ − | u | , as required. Appendix F. Formal series calculations

Suppose that
$$ p_0 = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} $$
is a hyperbolic fixed point of the Standard map $f$ (defined in Equation (65)), that $\lambda \in \mathbb{R}$ is an eigenvalue of $Df(p_0)$, and that $\xi \in \mathbb{R}^2$ is an associated eigenvector. Let
$$ P(\sigma) = \begin{pmatrix} X(\sigma) \\ Y(\sigma) \end{pmatrix} = \sum_{n=0}^{\infty} \begin{pmatrix} a_n \\ b_n \end{pmatrix} \sigma^n, $$
and note that
$$ \begin{pmatrix} a_0 \\ b_0 \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = p_0, \qquad \text{and} \qquad \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} = \xi. $$
The equations
$$ f(P(\sigma)) = P(\lambda\sigma) \qquad \text{or} \qquad f(P(\lambda^{-1}\sigma)) = P(\sigma) $$
are equivalent, and we have
$$ P(\lambda\sigma) = \sum_{n=0}^{\infty} \lambda^n \begin{pmatrix} a_n \\ b_n \end{pmatrix} \sigma^n. $$
Computing the power series coefficients of $f(P(\sigma))$ is more delicate, due to the appearance of the composition term $\sin(X(\sigma))$. To work out the power series of the composition let
$$ s(\sigma) = \sin(X(\sigma)) = \sum_{n=0}^{\infty} s_n \sigma^n, \qquad \text{and} \qquad c(\sigma) = \cos(X(\sigma)) = \sum_{n=0}^{\infty} c_n \sigma^n, $$
where we note that $s_n$ and $c_n$ depend on the $a_n$ and $b_n$. Indeed, to zeroth order we have that
$$ s_0 = \sin(X(0)) = \sin(a_0), \qquad \text{and} \qquad c_0 = \cos(a_0). $$
Differentiating $s(\sigma)$ and $c(\sigma)$ leads to
$$ s'(\sigma) = \cos(X(\sigma)) X'(\sigma) = c(\sigma) X'(\sigma), \tag{F.1} $$
and
$$ c'(\sigma) = -\sin(X(\sigma)) X'(\sigma) = -s(\sigma) X'(\sigma), \tag{F.2} $$
and evaluating at $\sigma = 0$ gives
$$ s_1 = c_0 a_1, \qquad \text{and} \qquad c_1 = -s_0 a_1. $$
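As a worked check of the pattern, the second order coefficients can be obtained by hand. The display below is a sketch derived here by differentiating (F.1) and (F.2) once more and using $X'(0) = a_1$, $X''(0) = 2 a_2$; it uses only symbols already introduced above.

```latex
\begin{aligned}
s''(\sigma) &= c'(\sigma) X'(\sigma) + c(\sigma) X''(\sigma)
             = -s(\sigma) X'(\sigma)^2 + c(\sigma) X''(\sigma), \\
c''(\sigma) &= -s'(\sigma) X'(\sigma) - s(\sigma) X''(\sigma)
             = -c(\sigma) X'(\sigma)^2 - s(\sigma) X''(\sigma), \\
s_2 &= \tfrac{1}{2} s''(0) = c_0 a_2 - \tfrac{1}{2} s_0 a_1^2, \qquad
c_2 = \tfrac{1}{2} c''(0) = -s_0 a_2 - \tfrac{1}{2} c_0 a_1^2 .
\end{aligned}
```

In both cases the unknown $a_2$ enters linearly, with coefficient $c_0$ or $-s_0$, while the remaining term involves only $a_1$.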
To work out the higher order terms, we expand Equations (F.1) and (F.2) as power series and obtain that
$$ \sum_{n=0}^{\infty} (n+1) s_{n+1} \sigma^n = \left( \sum_{n=0}^{\infty} c_n \sigma^n \right) \left( \sum_{n=0}^{\infty} (n+1) a_{n+1} \sigma^n \right) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} (k+1) c_{n-k} a_{k+1} \right) \sigma^n, $$
and
$$ \sum_{n=0}^{\infty} (n+1) c_{n+1} \sigma^n = -\left( \sum_{n=0}^{\infty} s_n \sigma^n \right) \left( \sum_{n=0}^{\infty} (n+1) a_{n+1} \sigma^n \right) = -\sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} (k+1) s_{n-k} a_{k+1} \right) \sigma^n. $$
Matching like powers (and reindexing) leads to
$$ s_n = \frac{1}{n} \sum_{k=0}^{n-1} (k+1) c_{n-k-1} a_{k+1} = c_0 a_n + \frac{1}{n} \sum_{k=0}^{n-2} (k+1) c_{n-k-1} a_{k+1}, \tag{F.3} $$
and
$$ c_n = -\frac{1}{n} \sum_{k=0}^{n-1} (k+1) s_{n-k-1} a_{k+1} = -s_0 a_n - \frac{1}{n} \sum_{k=0}^{n-2} (k+1) s_{n-k-1} a_{k+1}, \tag{F.4} $$
for $n \ge 2$. Note that the sums on the right hand sides depend only on terms of order less than $n$. Then
$$ f(P(\sigma)) = f(X(\sigma), Y(\sigma)) = \begin{pmatrix} X(\sigma) + Y(\sigma) + \alpha \sin(X(\sigma)) \\ Y(\sigma) + \alpha \sin(X(\sigma)) \end{pmatrix} = \sum_{n=0}^{\infty} \begin{pmatrix} a_n + b_n + \alpha s_n \\ b_n + \alpha s_n \end{pmatrix} \sigma^n, $$
where, by (F.3), for $n \ge 2$ the coefficient of $\sigma^n$ is
$$ \begin{pmatrix} a_n + b_n + \alpha \left( c_0 a_n + \frac{1}{n} \sum_{k=0}^{n-2} (k+1) c_{n-k-1} a_{k+1} \right) \\ b_n + \alpha \left( c_0 a_n + \frac{1}{n} \sum_{k=0}^{n-2} (k+1) c_{n-k-1} a_{k+1} \right) \end{pmatrix}. $$
Setting this series equal to $P(\lambda\sigma)$ and matching like powers leads to
$$ \begin{pmatrix} a_n + b_n + \alpha c_0 a_n + \frac{\alpha}{n} \sum_{k=0}^{n-2} (k+1) c_{n-k-1} a_{k+1} \\ b_n + \alpha c_0 a_n + \frac{\alpha}{n} \sum_{k=0}^{n-2} (k+1) c_{n-k-1} a_{k+1} \end{pmatrix} = \lambda^n \begin{pmatrix} a_n \\ b_n \end{pmatrix}, $$
or, upon rearranging,
$$ \begin{bmatrix} 1 + \alpha c_0 - \lambda^n & 1 \\ \alpha c_0 & 1 - \lambda^n \end{bmatrix} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = -\frac{\alpha}{n} \sum_{k=0}^{n-2} (k+1) c_{n-k-1} a_{k+1} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \tag{F.5} $$
for $n \ge 2$. Observe that the right hand side of the equation does not depend on $a_n$. Indeed, the right hand side depends only on $c_1, \ldots, c_{n-1}$, and $a_1, \ldots, a_{n-1}$.

Moreover, noting that
$$ \begin{bmatrix} 1 + \alpha c_0 - \lambda^n & 1 \\ \alpha c_0 & 1 - \lambda^n \end{bmatrix} = \begin{bmatrix} 1 + \alpha \cos(x_0) - \lambda^n & 1 \\ \alpha \cos(x_0) & 1 - \lambda^n \end{bmatrix} = Df(p_0) - \lambda^n \mathrm{Id}, $$
and observing that for $n \ge 2$, $\lambda^n$ is never an eigenvalue of $Df(p_0)$ (as $|\lambda| \neq 1$), we see that the coefficients of $P$ are formally well defined to all orders. Observe also that once we solve Equation (F.5) for $a_n$ and $b_n$ we compute $s_n$ and $c_n$ using Equations (F.3) and (F.4), as this information is needed to solve the homological equations at order $n + 1$.

Appendix G. Explicit constants for parameterization of the standard map

The next lemmas provide constants C , C , C when f is the standard map deﬁned in Equa-tion (65). Lemma 45 (Explicit constants for the Standard Map).

Consider f : C → C as given inEquation (65) . Recall that λ is an unstable eigenvalue and that µ = λ − . Let R and r ∗ be onstants satisfying < r ∗ < R. If we chooseC = | α | e R ( e r ∗ + , C = | α | e R ( e r ∗ + , C = + | α | e R , then (57–59) are satisﬁed. Proof.

To prove the lemma, start by expanding f ( z + w , z + w ) to ﬁnd that R ( z , z ) ( w , w ) = α [sin( z )(cos( w ) − + cos( z )(sin( w ) − w )] (cid:34) (cid:35) , is the ﬁrst order Taylor remainder for the standard map as a function of the base point ( z , z ).Recalling that z , z , w , w ∈ C , the bounds C and C follow immediately from the identitiessin( z ) = ( e z − e − z ) / i and cos( z ) = ( e z + e − z ) /

2. Indeed we have that | cos( z ) | ≤ e | z | + ≤ e | z | , | sin( z ) | ≤ e | z | + ≤ e | z | , | cos( z ) − | ≤ | z | e | z | + , and | sin( z ) − z | ≤ | z |

12 ( e | z | + . The form of C follows directly.Di ﬀ erentiating with respect to w = ( w , w ) gives that D R z , z ( w , w ) = α ( − sin( z ) sin( w ) + cos( z )(cos( w ) − (cid:32) (cid:33) , from which follows the form of C .Finally, for the standard map we have that D f ( z , z ) = (cid:34) + α cos( z ) 1 α cos( z ) 1 (cid:35) , and the bound on C follows from the formula for the matrix norm.Observing that f − ( z + w , z + w ) = f − ( z , z ) + D f − ( z , z ) (cid:34) w w (cid:35) + ˜ R ( z , z ) ( w , w ) , with˜ R ( z , z ) ( w , w ) = (cid:32) − α sin( z − z ) (cos( w − w ) − − α cos( z − z ) (sin( w − w ) − ( w − w )) (cid:33) , D f − ( z , z ) = (cid:32) − − α cos( z − z ) 1 + α cos( z − z ) (cid:33) , gives the following bounds. Lemma 46 (Explicit constants for the inverse Standard Map).

Let f − : C → C denote theinverse of the standard map. ChoosingC = | α | e M ( e r ∗ + , C = | α | e M ( e r ∗ + , C = max (cid:16) , + | α | e ˜ M (cid:17) , gives that sup (cid:107) z − z (cid:107)≤ R (cid:107) ˜ R z ( w ) (cid:107) ≤ C (cid:107) w (cid:107) , for (cid:107) w (cid:107) ≤ r ∗ , sup (cid:107) z − z (cid:107)≤ R (cid:107) D ˜ R z ( w ) (cid:107) ≤ C (cid:107) w (cid:107) , for (cid:107) w (cid:107) ≤ r ∗ , and sup (cid:107) z − z (cid:107)≤ R (cid:107) D f − ( z ) (cid:107) ≤ C . Appendix H. Proof of Theorem 37

The idea behind the proof is to write $P(\sigma) = P_N(\sigma) + H(\sigma)$ where $H$ is analytic on $D$, and to rewrite (56) as a fixed point problem for $H$.

Since the coefficients of $P_N$ are exactly the Taylor coefficients of $P$, we have that
$$ H(0) = \frac{d}{d\sigma} H(0) = \ldots = \frac{d^N}{d\sigma^N} H(0) = 0. $$
That is, the truncation error function $H$ is zero to order $N$ at $\sigma = 0$. We refer to $H$ as an analytic $N$-tail, and let
$$ X = \left\{ H : D \to \mathbb{C}^2 : H \text{ is analytic}, \ H(0) = \ldots = H^{(N)}(0) = 0, \ \|H\| < \infty \right\}, \tag{H.1} $$
denote the Banach space of all bounded analytic $N$-tails endowed with the $C^0$ norm. For a linear operator $\mathcal{M} : X \to X$ we write
$$ \|\mathcal{M}\|_{B(X)} = \sup_{\|H\| = 1} \|\mathcal{M}(H)\|, $$
to denote the operator norm. The collection of all bounded linear operators (bounded in the $\| \cdot \|_{B(X)}$ norm) is denoted $B(X)$, and is a Banach algebra.

Using the first order Taylor expansion we rewrite the invariance equation (56) as
$$ P_N(\sigma) + H(\sigma) = f(P_N(\mu\sigma) + H(\mu\sigma)) = f(P_N(\mu\sigma)) + Df(P_N(\mu\sigma)) H(\mu\sigma) + \mathcal{R}_{P_N(\mu\sigma)}(H(\mu\sigma)). $$
Recall that
$$ E_N(\sigma) = f(P_N(\mu\sigma)) - P_N(\sigma), $$
and note that $E_N$ is an analytic $N$-tail. Moreover, $E_N$ does not depend on $H$. Rearranging leads to the fixed point problem
$$ H(\sigma) = E_N(\sigma) + Df(P_N(\mu\sigma)) H(\mu\sigma) + \mathcal{R}_{P_N(\mu\sigma)}(H(\mu\sigma)), \tag{H.2} $$
for the truncation error. Observe that if $H \in X$ is a solution of Equation (H.2) with $\|H\| \le r$ then $P = P_N + H$ solves Equation (56) and has
$$ \left\| P - P_N \right\| = \|H\| \le r. $$
Since $H \in X$ and $P_N$ is entire, $P$ is analytic on $D$.

Writing $P_N(\mu\sigma) = z$, we see that the condition given in Equation (60) gives
$$ \|z - z_0\| = \|P_N(\mu\sigma) - p_0\| \le \sum_{n=1}^{N} \mu^n \|p_n\| \le R, $$
so that for all $H \in X$ with $\|H\| \le r_*$ the estimates of Equations (57), (58), and (59) give us
$$ \|\mathcal{R}_{P_N(\mu\sigma)}(H)\| \le C_1 \|H\|^2, \qquad \text{and} \qquad \|D\mathcal{R}_{P_N(\mu\sigma)}(H)\|_{B(X)} \le C_2 \|H\|. $$
Define the linear operator $\mu : X \to X$ by
$$ \mu(H)(\sigma) = H(\mu\sigma). $$
The main estimate is that
$$ \|\mu\|_{B(X)} \le |\mu|^{N+1}. $$
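As a numerical sanity check of this estimate (not a substitute for the argument that follows), one can compare the two suprema for a concrete scalar $N$-tail. The sketch below assumes a test function of the form $H(\sigma) = \sigma^{N+1} h(\sigma)$ with $h$ a polynomial chosen here purely for illustration, and approximates each supremum over the closed disk by sampling the bounding circle, where an analytic function attains its maximum modulus.

```python
import numpy as np

def tail_contraction_check(N, mu, h_coeffs, samples=2048):
    """Return (sup |H(mu*sigma)|, |mu|^(N+1) * sup |H(sigma)|) sampled on |sigma| = 1.

    H(sigma) = sigma^(N+1) * h(sigma), with h given by ascending coefficients.
    """
    sigma = np.exp(2j * np.pi * np.arange(samples) / samples)
    polyval = np.polynomial.polynomial.polyval

    def H(z):
        return z ** (N + 1) * polyval(z, h_coeffs)

    lhs = float(np.max(np.abs(H(mu * sigma))))
    rhs = float(abs(mu) ** (N + 1) * np.max(np.abs(H(sigma))))
    return lhs, rhs
```

For instance, `tail_contraction_check(10, 0.5, [1.0, 1.0, 0.5])` returns a pair whose first entry does not exceed the second. The factor $|\mu|^{N+1}$ comes entirely from the vanishing of $H$ to order $N$ at the origin.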
To see this note that for any $H \in X$ we have that
$$ \sup_{\sigma \in D} \|H(\mu\sigma)\| \le |\mu|^{N+1} \sup_{\sigma \in D} \|H(\sigma)\|, \tag{H.3} $$
by the maximum modulus principle.

Now define the fixed point operator $\Psi : X \to X$ by
$$ \Psi[H](\sigma) = E_N(\sigma) + Df(P_N(\mu\sigma)) H(\mu\sigma) + \mathcal{R}_{P_N(\mu\sigma)}(H(\mu\sigma)). $$
Let
$$ B_r = \{ H \in X : \|H\| \le r \}. $$
We will show that $\Psi$ has a unique fixed point in $B_r \subset X$ using the contraction mapping theorem.

Observe first that $\Psi$ is Fréchet differentiable with
$$ D\Psi[H_1] H_2 = Df(P_N(\mu\sigma)) H_2(\mu\sigma) + D\mathcal{R}_{P_N(\mu\sigma)}(H_1(\mu\sigma)) H_2(\mu\sigma), $$
for all $H_1, H_2 \in X$. From the inequality hypothesized in Equation (63) we obtain that
$$ C_2 |\mu|^{N+1} r^2 + C_3 |\mu|^{N+1} r + \epsilon_N < r, \tag{H.4} $$
by adding $r$ to both sides of Equation (63). Dividing the inequality in Equation (H.4) by $r > 0$ gives
$$ C_2 |\mu|^{N+1} r + C_3 |\mu|^{N+1} + \frac{\epsilon_N}{r} < 1, $$
and since each term in the sum is positive we have that
$$ C_2 |\mu|^{N+1} r + C_3 |\mu|^{N+1} < 1. \tag{H.5} $$
Now for $H \in B_r$ we have that
$$ \|D\Psi[H]\|_{B(X)} \le \|Df(P_N(\mu\sigma)) \mu + D\mathcal{R}_{P_N(\mu\sigma)}(H(\mu\sigma)) \mu\|_{B(X)} \le C_3 \|\mu\|_{B(X)} + C_2 |\mu|^{N+1} \|H\| \, \|\mu\|_{B(X)} \le C_3 |\mu|^{N+1} + C_2 |\mu|^{N+1} r < 1, $$
by the inequality given in Equation (H.5). It follows from the Mean Value Inequality that $\Psi$ is a contraction on $B_r$.

Finally, to verify that $\Psi$ maps $B_r$ into itself choose $H \in B_r$ and note that
$$ \|\Psi[H]\| \le \|\Psi[H] - \Psi[0]\| + \|\Psi[0]\| \le \sup_{\|V\| \le r} \|D\Psi[V]\| \, \|H\| + \|E_N\| \le \left( C_3 |\mu|^{N+1} + C_2 |\mu|^{N+1} r \right) r + \epsilon_N \le C_2 |\mu|^{N+1} r^2 + C_3 |\mu|^{N+1} r + \epsilon_N \le r, $$
again by the mean value inequality, the bound on $D\Psi$ obtained above, and Equation (H.4). Then $\Psi$ is a contraction mapping from $B_r$ into itself, and the proof is complete.

Appendix I. Proof of Lemma 43

Define the function $Q : D \to \mathbb{C}$ by
$$ Q(\sigma) = \frac{d}{d\sigma} g_N(\sigma) = \sum_{n=0}^{N-1} (n+1) \beta_{n+1} \sigma^n. $$
The main observation is that $s(\sigma)$, $c(\sigma)$ solve the system of differential equations
$$ s' = cQ, \qquad c' = -sQ, $$
subject to the initial conditions
$$ s(0) = s_0 = \sin(g_N(0)) = \sin(\beta_0), \qquad \text{and} \qquad c(0) = c_0 = \cos(g_N(0)) = \cos(\beta_0). $$
Integrating gives
$$ s(\sigma) = s_0 + \int_0^\sigma c(z) Q(z) \, dz, \qquad c(\sigma) = c_0 - \int_0^\sigma s(z) Q(z) \, dz. \tag{I.1} $$
We now write $s = s^N + s^\infty$ and $c = c^N + c^\infty$ with $(s^\infty, c^\infty) \in X$ (the space of analytic $N$-tails, see Equation (H.1)). Below, $[\,\cdot\,]^N$ denotes truncation of a power series to order $N$ and $[\,\cdot\,]^\infty$ denotes the remaining tail. Then
$$ s^N(\sigma) + s^\infty(\sigma) = s_0 + \int_0^\sigma c^N(z) Q(z) \, dz + \int_0^\sigma c^\infty(z) Q(z) \, dz, $$
$$ c^N(\sigma) + c^\infty(\sigma) = c_0 - \int_0^\sigma s^N(z) Q(z) \, dz - \int_0^\sigma s^\infty(z) Q(z) \, dz, $$
or
$$ s^\infty(\sigma) - \int_0^\sigma c^\infty(z) Q(z) \, dz = -s^N(\sigma) + s_0 + \int_0^\sigma c^N(z) Q(z) \, dz, $$
$$ c^\infty(\sigma) + \int_0^\sigma s^\infty(z) Q(z) \, dz = -c^N(\sigma) + c_0 - \int_0^\sigma s^N(z) Q(z) \, dz. \tag{I.2} $$
Since $c^N$, $s^N$ solve the system of Equations (I.1) exactly to $N$-th order we have that
$$ s^N(\sigma) = s_0 + \left[ \int_0^\sigma c^N(z) Q(z) \, dz \right]^N, \qquad c^N(\sigma) = c_0 - \left[ \int_0^\sigma s^N(z) Q(z) \, dz \right]^N, $$
so that Equation (I.2) becomes
$$ s^\infty(\sigma) - \int_0^\sigma c^\infty(z) Q(z) \, dz = -s_0 - \left[ \int_0^\sigma c^N(z) Q(z) \, dz \right]^N + s_0 + \int_0^\sigma c^N(z) Q(z) \, dz, $$
$$ c^\infty(\sigma) + \int_0^\sigma s^\infty(z) Q(z) \, dz = -c_0 + \left[ \int_0^\sigma s^N(z) Q(z) \, dz \right]^N + c_0 - \int_0^\sigma s^N(z) Q(z) \, dz, $$
which, after cancellation, is
$$ s^\infty(\sigma) - \int_0^\sigma c^\infty(z) Q(z) \, dz = \left[ \int_0^\sigma c^N(z) Q(z) \, dz \right]^\infty, $$
$$ c^\infty(\sigma) + \int_0^\sigma s^\infty(z) Q(z) \, dz = -\left[ \int_0^\sigma s^N(z) Q(z) \, dz \right]^\infty, \tag{I.3} $$
a linear system of equations for $s^\infty, c^\infty \in X$, amenable to a straightforward Neumann series analysis.

To this end define
$$ H(\sigma) = \begin{pmatrix} s^\infty(\sigma) \\ c^\infty(\sigma) \end{pmatrix}, \qquad \mathcal{E}_N(\sigma) = \begin{pmatrix} \left[ \int_0^\sigma c^N(z) Q(z) \, dz \right]^\infty \\ -\left[ \int_0^\sigma s^N(z) Q(z) \, dz \right]^\infty \end{pmatrix}. $$
Note that $\sin(g_N(\sigma)) - s^N(\sigma) = s^\infty(\sigma)$ and $\cos(g_N(\sigma)) - c^N(\sigma) = c^\infty(\sigma)$, so obtaining bounds on $H(\sigma)$ will lead to (66–67). Observe that
$$ c^N(z) Q(z) = \sum_{n=0}^{2N-1} \left( \sum_{k=0}^{n} (k+1) c_{n-k} \beta_{k+1} \right) z^n, $$
as $c^N$ and $Q$ are $N$-th and $(N-1)$-th order polynomials. Here and below, coefficients $c_j$, $s_j$ and $\beta_j$ with $j > N$ are taken to be zero inside these finite sums. Then for $\sigma \in D$ we have that
$$ \int_0^\sigma c^N(z) Q(z) \, dz = \sum_{n=0}^{2N-1} \sum_{k=0}^{n} (k+1) c_{n-k} \beta_{k+1} \left( \int_0^\sigma z^n \, dz \right) = \sum_{n=0}^{2N-1} \sum_{k=0}^{n} (k+1) c_{n-k} \beta_{k+1} \frac{\sigma^{n+1}}{n+1} = \sum_{n=1}^{2N} \left( \sum_{k=0}^{n-1} \frac{k+1}{n} c_{n-k-1} \beta_{k+1} \right) \sigma^n. $$
Then
$$ \left[ \int_0^\sigma c^N(z) Q(z) \, dz \right]^\infty = \sum_{n=N+1}^{2N} \left( \sum_{k=0}^{n-1} \frac{k+1}{n} c_{n-k-1} \beta_{k+1} \right) \sigma^n. $$
A similar calculation shows that
$$ \left[ \int_0^\sigma s^N(z) Q(z) \, dz \right]^\infty = \sum_{n=N+1}^{2N} \left( \sum_{k=0}^{n-1} \frac{k+1}{n} s_{n-k-1} \beta_{k+1} \right) \sigma^n. $$
Combining these with the maximum modulus principle and the triangle inequality gives
$$ \|\mathcal{E}_N\| \le e_N, $$
where $e_N$ is as defined in the hypothesis of the Lemma. We seek a solution $H \in X$ of (I.3). Define the linear operator $M : X \to X$ by
$$ M(H)(\sigma) = \begin{pmatrix} -\int_0^\sigma c^\infty(z) Q(z) \, dz \\ \int_0^\sigma s^\infty(z) Q(z) \, dz \end{pmatrix}. $$
The linear equation for $H$ is now
$$ (\mathrm{Id} + M) H = \mathcal{E}_N. $$
We will show that
$$ \|M\|_{B(X)} \le \frac{\hat{K}}{N+1} < 1. $$
To see this note that for any analytic $N$-tail $H$ with $\|H\| < \infty$ there exists an analytic function $\hat{H} : D \to \mathbb{C}^2$ so that
$$ H(\sigma) = \hat{H}(\sigma) \sigma^{N+1}, \qquad \|H\| = \|\hat{H}\|. $$
Here equality of the $C^0$ norms is a consequence of the maximum modulus principle and the observation that $|\sigma|^{N+1} = 1$ on the boundary of $D$. Then for $\sigma \in D$ we have that
$$ \left| \int_0^\sigma H(z) Q(z) \, dz \right| = \left| \int_0^\sigma z^{N+1} \hat{H}(z) Q(z) \, dz \right| = \left| \int_0^1 (t\sigma)^{N+1} \hat{H}(t\sigma) Q(t\sigma) \sigma \, dt \right| \le \int_0^1 |\sigma|^{N+2} |\hat{H}(t\sigma)| |Q(t\sigma)| t^{N+1} \, dt \le \|\hat{H}\| \, \|Q\| \, |\sigma|^{N+2} \int_0^1 t^{N+1} \, dt \le \|H\| \hat{K} \left. \frac{t^{N+2}}{N+2} \right|_0^1 \le \frac{\hat{K}}{N+1} \|H\|, $$
as $|\sigma| \le 1$ and $\|Q\| \le \hat{K}$ on $D$. Taking the sup over all $H$ with norm one yields the result.

It now follows from the assumption that $\hat{K} / (N+1) < 1$ and the Neumann series that $\mathrm{Id} + M$ is invertible with
$$ \|(\mathrm{Id} + M)^{-1}\|_{B(X)} \le \frac{1}{1 - \frac{\hat{K}}{N+1}}. $$
Then
$$ H = (\mathrm{Id} + M)^{-1} \mathcal{E}_N, \qquad \text{and} \qquad \|H\| \le \frac{e_N}{1 - \frac{\hat{K}}{N+1}}, $$
as required.

References

[1] P. Libermann, C.-M. Marle, Symplectic geometry and analytical mechanics, Vol. 35 of Mathematics and its Applications, D. Reidel Publishing Co., Dordrecht, 1987, translated from the French by Bertram Eugene Schwarzbach. doi:10.1007/978-94-009-3807-6. URL https://doi.org/10.1007/978-94-009-3807-6
[2] J. E. Littlewood, The Lagrange Configuration in Celestial Mechanics, Proc. London Math. Soc. (3) 9 (4) (1959) 525–543. doi:10.1112/plms/s3-9.4.525. URL https://doi.org/10.1112/plms/s3-9.4.525
[3] N. N. Nehorošev, An exponential estimate of the time of stability of nearly integrable Hamiltonian systems, Uspehi Mat. Nauk 32 (6(198)) (1977) 5–66, 287.
[4] A. N. Kolmogorov, On conservation of conditionally periodic motions for a small change in Hamilton's function, Dokl. Akad. Nauk SSSR (N.S.) 98 (1954) 527–530.
[5] V. I. Arnol'd, Proof of a theorem of A. N. Kolmogorov on the preservation of conditionally periodic motions under a small perturbation of the Hamiltonian, Uspehi Mat. Nauk 18 (5 (113)) (1963) 13–40.

6] J. Moser, Convergent series expansions for quasi-periodic motions, Math. Ann. 169 (1967) 136–176.
[7] V. I. Arnol'd, Small denominators and problems of stability of motion in classical and celestial mechanics, Uspehi Mat. Nauk 18 (6 (114)) (1963) 91–192.
[8] C. L. Siegel, J. K. Moser, Lectures on celestial mechanics, Classics in Mathematics, Springer-Verlag, Berlin, 1995, translated from the German by C. I. Kalme, Reprint of the 1971 translation.
[9] R. de la Llave, A tutorial on KAM theory, in: Smooth ergodic theory and its applications (Seattle, WA, 1999), Vol. 69 of Proc. Sympos. Pure Math., Amer. Math. Soc., Providence, RI, 2001, pp. 175–292. doi:10.1090/pspum/069/1858536. URL https://doi.org/10.1090/pspum/069/1858536
[10] V. I. Arnol'd, Instability of dynamical systems with many degrees of freedom, Dokl. Akad. Nauk SSSR 156 (1964) 9–12.
[11] J.-P. Marco, D. Sauzin, Stability and instability for Gevrey quasi-convex near-integrable Hamiltonian systems, Publ. Math. Inst. Hautes Études Sci. (96) (2002) 199–275 (2003). doi:10.1007/s10240-003-0011-5. URL https://doi.org/10.1007/s10240-003-0011-5
[12] P. Lochak, J.-P. Marco, Diffusion times and stability exponents for nearly integrable analytic systems, Cent. Eur. J. Math. 3 (3) (2005) 342–397. doi:10.2478/BF02475913. URL https://doi.org/10.2478/BF02475913
[13] A. Bounemoura, J.-P. Marco, Improved exponential stability for near-integrable quasi-convex Hamiltonians, Nonlinearity 24 (1) (2011) 97–112. doi:10.1088/0951-7715/24/1/005. URL https://doi.org/10.1088/0951-7715/24/1/005
[14] J. Zhang, K. Zhang, Improved stability for analytic quasi-convex nearly integrable systems and optimal speed of Arnold diffusion, Nonlinearity 30 (7) (2017) 2918–2929. doi:10.1088/1361-6544/aa72b7. URL https://doi.org/10.1088/1361-6544/aa72b7
[15] P. Bernard, V. Kaloshin, K. Zhang, Arnold diffusion in arbitrary degrees of freedom and normally hyperbolic invariant cylinders, Acta Math. 217 (1) (2016) 1–79. doi:10.1007/s11511-016-0141-5. URL https://doi.org/10.1007/s11511-016-0141-5
[16] V. Gelfreich, D. Turaev, Arnold diffusion in a priori chaotic symplectic maps, Comm. Math. Phys. 353 (2) (2017) 507–547. doi:10.1007/s00220-017-2867-0. URL https://doi.org/10.1007/s00220-017-2867-0
[17] J.-P. Marco, Twist maps and Arnold diffusion for diffeomorphisms, in: Variational methods, Vol. 18 of Radon Ser. Comput. Appl. Math., De Gruyter, Berlin, 2017, pp. 473–495.
[18] J.-P. Marco, Modèles pour les applications fibrées et les polysystèmes, C. R. Math. Acad. Sci. Paris 346 (3-4) (2008) 203–208. doi:10.1016/j.crma.2007.11.017. URL https://doi.org/10.1016/j.crma.2007.11.017
[19] M. Gidea, R. de la Llave, T. M-Seara, A general mechanism of diffusion in Hamiltonian systems: qualitative results, Comm. Pure Appl. Math. 73 (1) (2020) 150–209. doi:10.1002/cpa.21856. URL https://doi.org/10.1002/cpa.21856
[20] M. J. Capiński, M. Gidea, Arnold diffusion, quantitative estimates and stochastic behavior in the three-body problem. URL https://arxiv.org/abs/1812.03665
[21] M. W. Hirsch, C. C. Pugh, M. Shub, Invariant manifolds, Bull. Amer. Math. Soc. 76 (1970) 1015–1019. doi:10.1090/S0002-9904-1970-12537-X. URL https://doi.org/10.1090/S0002-9904-1970-12537-X
[22] A. Delshams, R. de la Llave, T. M. Seara, Geometric properties of the scattering map of a normally hyperbolic invariant manifold, Adv. Math. 217 (3) (2008) 1096–1153. URL https://doi-org.ezproxy.fau.edu/10.1016/j.aim.2007.08.014
[23] G. Alefeld, Inclusion methods for systems of nonlinear equations—the interval Newton method and modifications, in: Topics in validated computations (Oldenburg, 1993), Vol. 5 of Stud. Comput. Math., North-Holland, Amsterdam, 1994, pp. 7–26.
[24] P. Zgliczyński, Covering relations, cone conditions and the stable manifold theorem, J. Differential Equations 246 (5) (2009) 1774–1819. doi:10.1016/j.jde.2008.12.019. URL https://doi.org/10.1016/j.jde.2008.12.019
[25] M. J. Capiński, P. Zgliczyński, Cone conditions and covering relations for topologically normally hyperbolic invariant manifolds, Discrete Contin. Dyn. Syst. 30 (3) (2011) 641–670. doi:10.3934/dcds.2011.30.641. URL https://doi.org/10.3934/dcds.2011.30.641
[26] M. J. Capiński, P. Zgliczyński, Geometric proof for normally hyperbolic invariant manifolds, J. Differential Equations 259 (11) (2015) 6215–6286. doi:10.1016/j.jde.2015.07.020. URL https://doi.org/10.1016/j.jde.2015.07.020
[27] X. Cabré, E. Fontich, R. de la Llave, The parameterization method for invariant manifolds. I. Manifolds associated to non-resonant subspaces, Indiana Univ. Math. J. 52 (2) (2003) 283–328. doi:10.1512/iumj.2003.52.2245. URL https://doi.org/10.1512/iumj.2003.52.2245
[28] X. Cabré, E. Fontich, R. de la Llave, The parameterization method for invariant manifolds. II. Regularity with respect to parameters, Indiana Univ. Math. J. 52 (2) (2003) 329–360. doi:10.1512/iumj.2003.52.2407. URL https://doi.org/10.1512/iumj.2003.52.2407
[29] X. Cabré, E. Fontich, R. de la Llave, The parameterization method for invariant manifolds. III. Overview and applications, J. Differential Equations 218 (2) (2005) 444–515. doi:10.1016/j.jde.2004.12.003. URL https://doi.org/10.1016/j.jde.2004.12.003
[30] J. D. Mireles James, K. Mischaikow, Rigorous a-posteriori computation of (un)stable manifolds and connecting orbits for analytic maps, SIAM J. Appl. Dyn. Syst. 12 (2) (2013) 957–1006. doi:10.1137/12088224X. URL http://dx.doi.org/10.1137/12088224X
[31] T. Kapela, M. Mrozek, D. Wilczak, Z. P., Capd::dynsys: a flexible C++ toolbox for rigorous numerical analysis of dynamical systems, Commun. Nonlinear Sci. Numer. Simul. URL https://arxiv.org/abs/1812.03665
[32] M. J. Capiński, Computer assisted existence proofs of Lyapunov orbits at L and transversal intersections of invariant manifolds in the Jupiter-Sun PCR3BP, SIAM J. Appl. Dyn. Syst. 11 (4) (2012) 1723–1753. doi:10.1137/110847366. URL https://doi.org/10.1137/110847366
[33] X. Cabré, E. Fontich, R. de la Llave, The parameterization method for invariant manifolds. I. Manifolds associated to non-resonant subspaces, Indiana Univ. Math. J. 52 (2) (2003) 283–328.
[34] X. Cabré, E. Fontich, R. de la Llave, The parameterization method for invariant manifolds. II. Regularity with respect to parameters, Indiana Univ. Math. J. 52 (2) (2003) 329–360.
[35] A. Haro, M. Canadell, J.-L. Figueras, A. Luque, J.-M. Mondelo, The parameterization method for invariant manifolds, Vol. 195 of Applied Mathematical Sciences, Springer, [Cham], 2016, from rigorous results to effective computations. doi:10.1007/978-3-319-29662-3. URL http://dx.doi.org.ezproxy.fau.edu/10.1007/978-3-319-29662-3
[36] M. Breden, J.-P. Lessard, J. D. Mireles James, Computation of maximal local (un)stable manifold patches by the parameterization method, Indag. Math. (N.S.) 27 (1) (2016) 340–367. doi:10.1016/j.indag.2015.11.001. URL http://dx.doi.org/10.1016/j.indag.2015.11.001