To appear in:
Rendiconti dell’Istituto Lombardo Accademia di Scienze e Lettere, Classe di Scienze.
ON A THEOREM OF LYAPOUNOV
ANTONIO GIORGILLI
Dipartimento di Matematica, Via Saldini 50, 20133 Milano, Italy.
Sunto.
Si mostra che un sistema Hamiltoniano nell'intorno di un punto di equilibrio, sotto condizione che gli autovalori soddisfino delle condizioni di non-risonanza del tipo di Melnikov, ammette una forma normale che rende evidente l'esistenza di una varietà invariante (locale) a due dimensioni sulla quale si hanno soluzioni note. Nel caso di un autovalore puramente immaginario tali soluzioni formano una famiglia periodica a due parametri che costituisce la continuazione naturale di un modo normale. Questo secondo risultato è stato dimostrato in precedenza da Lyapounov. In questo lavoro si completa quello di Lyapounov dimostrando la convergenza della trasformazione dell'Hamiltoniana a forma normale e rimuovendo la restrizione che gli autovalori siano puramente immaginari.
Abstract.
It is shown that a Hamiltonian system in the neighbourhood of an equilibrium may be given a special normal form in case the eigenvalues of the linearized system satisfy non-resonance conditions of Melnikov's type. The normal form possesses a two dimensional (local) invariant manifold on which the solutions are known. If the eigenvalue is pure imaginary then these solutions are the natural continuation of a normal mode of the linear system. The latter result was first proved by Lyapounov. The present paper completes Lyapounov's result in that the convergence of the transformation of the Hamiltonian to a normal form is proven and the condition that the eigenvalues be pure imaginary is removed.
1. Introduction
Consider a canonical system of differential equations in a neighbourhood of an equilibrium, with Hamiltonian
$$H(x,y) = H_0(x,y) + H_1(x,y) + \dots, \qquad (x,y)\in\mathbb{C}^{2n}, \tag{1}$$
where the unperturbed quadratic part of the Hamiltonian is
$$H_0(x,y) = \sum_{j=1}^{n} \lambda_j x_j y_j, \qquad (\lambda_1,\dots,\lambda_n)\in\mathbb{C}^{n}, \tag{2}$$
and $H_s(x,y)$ for $s\ge 1$ is a homogeneous polynomial of degree $s+2$. The form (2) is a typical one for the quadratic part of a Hamiltonian system in the neighbourhood of an equilibrium, as under quite general conditions the system may be given that form via a (complex) linear canonical transformation (see, e.g., [10] or [12], § …). The Hamiltonian is assumed to be analytic in a neighbourhood of the origin of $\mathbb{C}^{2n}$. Moreover $\lambda$ will be assumed to satisfy at least the first of the following non-resonance conditions:
(i) First Melnikov's condition:
$$\lambda_\nu - k\lambda_1 \ne 0 \quad\text{for } k\in\mathbb{Z} \text{ and } \nu = 2,\dots,n. \tag{3}$$
(ii) Second Melnikov's condition:
$$\lambda_\nu \pm \lambda_{\nu'} - k\lambda_1 \ne 0 \quad\text{for } k\in\mathbb{Z} \text{ and } \nu,\nu' = 2,\dots,n, \tag{4}$$
the case $\nu'=\nu$ being included.
In [10] Lyapounov proved that if $\lambda_1 = i\omega$ is pure imaginary and the non-resonance condition (i) above is satisfied then there exists a two parameter family of solutions of the form
$$x_j = \varphi_j(\xi,\eta), \qquad y_j = \psi_j(\xi,\eta) \tag{5}$$
written as convergent power series in the arguments
$$\xi = \mathring{\xi}\, e^{ita(\mathring{\zeta})}, \qquad \eta = \mathring{\eta}\, e^{-ita(\mathring{\zeta})}, \tag{6}$$
where $a(\mathring{\zeta}) = \lambda_1 + \dots$ is a convergent power series in $\mathring{\zeta} = \mathring{\xi}\mathring{\eta}$. In the case $n=1$ this actually describes all solutions of the system. A proof in case all $\lambda$'s are pure imaginary is reported in [12].
The proof of the theorem is worked out by the authors quoted above by expanding the solution in the form (5) and proceeding by comparison of coefficients. From a formal viewpoint the statement above looks equivalent to the existence of a canonical transformation that gives the system (1) a suitable normal form, making the Hamiltonian depend at least quadratically on $x_2,\dots,x_n,y_2,\dots,y_n$. A formal construction giving such a normal form can be easily produced.
However, proving the convergence of the normalization procedure seems to be more difficult. The aim of this paper is precisely to produce a proof of convergence of the transformation to normal form.
I will actually give two different statements that can be proved with the same method. The first one is
Theorem 1:
With the non-resonance hypothesis (i) above (first Melnikov's condition) on $\lambda_1,\dots,\lambda_n$, there exists a canonical, near the identity transformation in the form of a power series convergent in a neighbourhood of the origin, which gives the Hamiltonian (1) the normal form
$$H(x,y) = H_0(x,y) + \Gamma(x_1y_1) + F(x,y), \tag{7}$$
where $H_0(x,y)$ is as in (1), $\Gamma(x_1y_1)$ depends only on the product $x_1y_1$, and $F(x,y)$ is at least quadratic in $x_2,\dots,x_n,y_2,\dots,y_n$.
The existence of the Lyapounov orbits for $\lambda_1$ pure imaginary is evident from the normal form: just put initially $x_2=\dots=x_n=y_2=\dots=y_n=0$, which defines a local invariant two dimensional manifold on which the dynamics is generated by the Hamiltonian $\lambda_1x_1y_1+\Gamma(x_1y_1)$. The advantage of the normal form is that it also allows one to investigate the dynamics in the neighbourhood of the orbits so found. To this end the following statement may be even more useful.
Theorem 2:
With the non-resonance hypotheses (i) and (ii) above (first and second Melnikov's conditions) on $\lambda_1,\dots,\lambda_n$, there exists a canonical, near the identity transformation in the form of a power series convergent in a neighbourhood of the origin, which gives the Hamiltonian (1) the normal form
$$H(x,y) = H_0(x,y) + \Gamma(x_1y_1,\dots,x_ny_n) + F(x,y), \tag{8}$$
where $H_0(x,y)$ is as in (1), $\Gamma$ contains only monomials of the form $x_1^{j}y_1^{j}x_\nu y_\nu$ with a positive integer $j$ and with $\nu=2,\dots,n$, and $F(x,y)$ is at least cubic in $x_2,\dots,x_n,y_2,\dots,y_n$.
This requires a stronger non-resonance condition. However this normal form may be more convenient if one is interested in the stability of a Lyapounov orbit. Indeed, let all $\lambda$ be pure imaginary, say $\lambda_j = i\omega_j$, and write the Hamiltonian restricted to the invariant manifold $x_2=\dots=x_n=y_2=\dots=y_n=0$ in action-angle variables by transforming $x_1=\sqrt{p}\,e^{iq}$, $y_1=-i\sqrt{p}\,e^{-iq}$, so that $\lambda_1x_1y_1=\omega_1p$. Thus one gets the Hamiltonian $\omega_1p+\Gamma(p)$, which represents a non linear oscillator, with orbits written as $p(t)=p^*$, $q(t)=q(0)+\Omega(p^*)\,t$, where $\Omega(p^*)=\omega_1+O(p^*)$ is a fixed frequency. By a translation $p'=p-p^*$ the Hamiltonian may be reexpanded (omitting primes) as
$$H(q,p,x,y) = \Omega p + \sum_{j=2}^{n}\lambda_jx_jy_j + H_1 + H_2 + \dots$$
where $H_s$ is a homogeneous polynomial of degree $s+2$ in $p^{1/2},x_2,\dots,y_n$ with coefficients periodically depending on $q$. The dynamics of the latter Hamiltonian may be investigated with known methods from perturbation theory. The advantage with respect to the normal form of theorem 1 is that the quadratic part of the Hamiltonian is independent of the angle $q$.
The proof is based on a previous work by the author [8] concerning the construction of the normal form in a case investigated by Cherry [2] and Moser [11]. It must be stressed that this problem does not involve small divisors.
Rather, the possible source of divergence is due to the use of Cauchy's estimates for the derivatives required by the normalization algorithm. The global effect of accumulation of derivatives is controlled with a technique introduced by the author and U. Locatelli in order to achieve a proof of KAM theorem using classical expansions in a perturbation parameter (see [4],[5],[6],[7]).
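The non-resonance conditions (3) and (4) can be probed numerically for a concrete vector of eigenvalues. The following is only a minimal sketch: the function names are hypothetical, the scan is restricted to $|k|\le k_{\max}$ (so it can detect a resonance but cannot prove its absence), and the identically vanishing combination $\lambda_\nu-\lambda_\nu$ with $k=0$ is skipped, which is an assumption on how (4) is meant to be read.

```python
import math

def first_melnikov_min(lam, kmax):
    """Smallest |lam_nu - k*lam_1| over nu = 2..n and |k| <= kmax (condition (3))."""
    lam1 = lam[0]
    return min(abs(l - k * lam1)
               for l in lam[1:]
               for k in range(-kmax, kmax + 1))

def second_melnikov_min(lam, kmax):
    """Smallest |lam_nu +/- lam_nu' - k*lam_1| over nu, nu' = 2..n, |k| <= kmax.

    Assumption: the trivial combination lam_nu - lam_nu with k = 0 is skipped.
    """
    lam1, rest = lam[0], lam[1:]
    best = float("inf")
    for a, la in enumerate(rest):
        for b, lb in enumerate(rest):
            for sign in (1, -1):
                for k in range(-kmax, kmax + 1):
                    if sign == -1 and a == b and k == 0:
                        continue
                    best = min(best, abs(la + sign * lb - k * lam1))
    return best

# Illustrative pure imaginary eigenvalues i*(1, sqrt 2, sqrt 3).
lam = (1j, 1j * math.sqrt(2), 1j * math.sqrt(3))
print(first_melnikov_min(lam, 10), second_melnikov_min(lam, 10))
# A resonant pair: lam_2 = 2*lam_1 violates (3).
print(first_melnikov_min((1j, 2j), 10))
```

A strictly positive minimum over the scanned range is consistent with the hypotheses of theorems 1 and 2, while a vanishing value signals a resonance.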
2. Formal algorithm
Reducing the Hamiltonian to a normal form is a quite general problem which may be solved in a number of different ways. Moreover, the concept of "normal form" may assume a quite general meaning, depending on what one is looking for. Here I state the algorithm in a general form, using the method of composition of Lie series.
Write the Hamiltonian after $r$ normalization steps as
$$H^{(r)}(x,y) = H_0(x,y) + Z_1(x,y) + \dots + Z_r(x,y) + \sum_{s>r} H^{(r)}_s(x,y), \tag{9}$$
where $Z_1(x,y),\dots,Z_r(x,y)$ are in normal form, whatever it means, and are homogeneous polynomials of degree $3,\dots,r+2$. For $r=0$ the Hamiltonian (1) is considered to be already in the wanted form, with no functions $Z$.
Assume that the Hamiltonian has been given a normal form (9) up to order $r-1$, so that $H^{(r-1)}$ is known. The generating function $\chi_r$ and the normal form $Z_r$ are determined by solving the equation
$$L_{H_0}\chi_r + Z_r = H^{(r-1)}_r, \tag{10}$$
where the common notation $L_\varphi\,\cdot := \{\cdot,\varphi\}$ has been used. The solution of this equation depends on what is meant by "normal form". At a formal level, any form that allows one to solve the equation above for $Z_r$ and $\chi_r$ is acceptable. Assume for a moment that a method of solution has been found. Then the transformed Hamiltonian is expanded as
$$\begin{aligned}
H^{(r)}_{sr+m} &= \frac{1}{s!}L^{s}_{\chi_r}Z_m + \sum_{p=0}^{s-1}\frac{1}{p!}L^{p}_{\chi_r}H^{(r-1)}_{(s-p)r+m} && \text{for } r\ge 2,\ s\ge 1,\ 1\le m<r,\\
H^{(r)}_{sr} &= \frac{1}{(s-1)!}L^{s-1}_{\chi_r}\Bigl(\frac{1}{s}Z_r + \frac{s-1}{s}H^{(r-1)}_r\Bigr) + \sum_{p=0}^{s-2}\frac{1}{p!}L^{p}_{\chi_r}H^{(r-1)}_{(s-p)r} && \text{for } r\ge 1,\ s\ge 2.
\end{aligned} \tag{11}$$
The justification of the algorithm requires only some straightforward calculation, and is deferred to appendix A.
Thus, the problem is how to solve the equation (10) for the generating function and the normal form. Let me make some general considerations.
Let $\mathcal{P}_s$ denote the linear space of homogeneous polynomials of degree $s$ in the complex variables $x,y$. Let also $\mathcal{P}=\bigcup_{s\ge 0}\mathcal{P}_s$, so that a formal power series is an element of $\mathcal{P}$. A basis in $\mathcal{P}$ is given by the monomials $x^jy^k := x_1^{j_1}\cdots x_n^{j_n}\,y_1^{k_1}\cdots y_n^{k_n}$, where $j,k$ are integer vectors with non-negative components. The linear operator $L_{H_0}$ maps every space $\mathcal{P}_s$ into itself.
If, due to the choice of the coordinates, the unperturbed Hamiltonian $H_0$ has the form (2) then the operator $L_{H_0}$ is diagonal, since
$$L_{H_0}\,x^jy^k = \langle j-k,\lambda\rangle\,x^jy^k.$$
The kernel and the range of $L_{H_0}$ are defined as usual, namely $\mathcal{N}=L_{H_0}^{-1}(0)$, the inverse image of the null vector in $\mathcal{P}$, and $\mathcal{R}=L_{H_0}(\mathcal{P})$. Both $\mathcal{N}$ and $\mathcal{R}$ are actually subspaces of the same space $\mathcal{P}$, and it turns out that they are complementary subspaces, i.e., $\mathcal{N}\cap\mathcal{R}=\{0\}$, the null vector, and $\mathcal{N}\oplus\mathcal{R}=\mathcal{P}$. A consequence of the properties above is that $L_{H_0}$ restricted to the subspace $\mathcal{R}$ is uniquely inverted, i.e., the equation $L_{H_0}\chi=\psi$ with $\psi\in\mathcal{R}$ admits a unique solution $\chi$ satisfying the condition $\chi\in\mathcal{R}$. That unique solution will be written as $\chi=L_{H_0}^{-1}\psi$, i.e., $L_{H_0}^{-1}$ is defined as the inverse of $L_{H_0}$ restricted to $\mathcal{R}$. It is easy to identify the subspaces $\mathcal{N}$ and $\mathcal{R}$ using the coordinates. Thanks to the diagonal form of $L_{H_0}$ one has
$$\mathcal{N}=\operatorname{span}\bigl\{x^jy^k : \langle j-k,\lambda\rangle=0\bigr\}, \qquad \mathcal{R}=\operatorname{span}\bigl\{x^jy^k : \langle j-k,\lambda\rangle\ne 0\bigr\}. \tag{12}$$
Given $\psi\in\mathcal{R}$ and writing $\psi=\sum_{j,k}\psi_{j,k}x^jy^k$, with $\psi_{j,k}=0$ for $x^jy^k\in\mathcal{N}$, one has
$$L_{H_0}^{-1}\psi = \sum_{j,k}\frac{\psi_{j,k}}{\langle j-k,\lambda\rangle}\,x^jy^k. \tag{13}$$
In view of the general considerations above we can conclude that the choice of a normal form is subjected to the constraint that in equation (10) we have $H^{(r-1)}_r-Z_r\in\mathcal{R}$. The simplest choice is to ask also $Z_r\in\mathcal{N}$, i.e., to set $Z_r$ to be the projection of $H^{(r-1)}_r$ on the subspace $\mathcal{N}$. This is known indeed as Birkhoff's normal form.
I come now to show that the construction of the normal form of theorem 1 is formally consistent. Consider the disjoint subsets of $\mathbb{Z}^n$
$$\begin{aligned}
\mathcal{K}^\sharp &= \{k\in\mathbb{Z}^n : k_2=\dots=k_n=0\},\\
\mathcal{K}^\natural &= \{k\in\mathbb{Z}^n : |k_2|+\dots+|k_n|=1\},\\
\mathcal{K}^\flat &= \{k\in\mathbb{Z}^n : |k_2|+\dots+|k_n|>1\}.
\end{aligned} \tag{14}$$
One has $\mathbb{Z}^n=\mathcal{K}^\sharp\cup\mathcal{K}^\natural\cup\mathcal{K}^\flat$, of course.
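Considering only integer vectors $j,k$ with non-negative components, introduce the subspaces of $\mathcal{P}$
$$\begin{aligned}
\mathcal{P}^\sharp &= \operatorname{span}\bigl\{x^jy^k : j+k\in\mathcal{K}^\sharp\bigr\},\\
\mathcal{P}^\natural &= \operatorname{span}\bigl\{x^jy^k : j+k\in\mathcal{K}^\natural\bigr\},\\
\mathcal{P}^\flat &= \operatorname{span}\bigl\{x^jy^k : j+k\in\mathcal{K}^\flat\bigr\}.
\end{aligned} \tag{15}$$
These subspaces are clearly disjoint, and moreover one has $\mathcal{P}=\mathcal{P}^\sharp\oplus\mathcal{P}^\natural\oplus\mathcal{P}^\flat$. Finally, let $\mathcal{N}^\sharp=\mathcal{N}\cap\mathcal{P}^\sharp$ and $\mathcal{R}^\sharp=\mathcal{R}\cap\mathcal{P}^\sharp$, and define the subspaces $\mathcal{Z}$ and $\mathcal{W}$ of $\mathcal{P}$ as
$$\mathcal{Z}=\mathcal{N}^\sharp\oplus\mathcal{P}^\flat, \qquad \mathcal{W}=\mathcal{R}^\sharp\oplus\mathcal{P}^\natural. \tag{16}$$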
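It is an easy matter to check that $\mathcal{Z}\cap\mathcal{W}=\{0\}$ and $\mathcal{Z}\oplus\mathcal{W}=\mathcal{P}$. The construction of bases for $\mathcal{Z}$ and $\mathcal{W}$ is quite straightforward: a monomial $x^jy^k$ belongs to $\mathcal{Z}$ in either case ($j+k\in\mathcal{K}^\sharp$ and $\langle j-k,\lambda\rangle=0$) or ($j+k\in\mathcal{K}^\flat$); else it belongs to $\mathcal{W}$. The hypothesis (i) on $\lambda$ (first Melnikov's condition) formulated at the beginning of the introduction means that the non-resonance condition
$$\langle k,\lambda\rangle\ne 0 \quad\text{for } 0\ne k\in\mathcal{K}^\sharp\cup\mathcal{K}^\natural \tag{17}$$
is satisfied. This implies $\mathcal{W}\subset\mathcal{R}$, so that for every $\psi\in\mathcal{W}$ the unique solution $\chi=L_{H_0}^{-1}\psi$, $\chi\in\mathcal{W}$, of the equation $L_{H_0}\chi=\psi$ exists. With this setting, the equation
$$L_{H_0}\chi + Z = \Psi, \tag{18}$$
with $\Psi$ known, admits a straightforward solution. Split $\Psi=\Psi_{\mathcal{Z}}+\Psi_{\mathcal{W}}$ with $\Psi_{\mathcal{Z}}\in\mathcal{Z}$ and $\Psi_{\mathcal{W}}\in\mathcal{W}$; such a decomposition exists and is unique, because $\mathcal{Z}$ and $\mathcal{W}$ are complementary subspaces. Then set $Z=\Psi_{\mathcal{Z}}$, and determine $\chi=L_{H_0}^{-1}\Psi_{\mathcal{W}}$ according to (13).
With minor changes one can also prove that the normal form of theorem 2 can be constructed. Let
$$\begin{aligned}
\mathcal{K}^\sharp &= \{k\in\mathbb{Z}^n : k_2=\dots=k_n=0\},\\
\mathcal{K}^\natural &= \{k\in\mathbb{Z}^n : |k_2|+\dots+|k_n|=1,2\},\\
\mathcal{K}^\flat &= \{k\in\mathbb{Z}^n : |k_2|+\dots+|k_n|>2\}.
\end{aligned} \tag{19}$$
One has again $\mathbb{Z}^n=\mathcal{K}^\sharp\cup\mathcal{K}^\natural\cup\mathcal{K}^\flat$, of course. The subspaces of $\mathcal{P}$ are defined again as in (15), although they turn out to be different in view of the differences in the sets $\mathcal{K}$. Finally, let $\mathcal{N}^\sharp=\mathcal{N}\cap\mathcal{P}^\sharp$, $\mathcal{R}^\sharp=\mathcal{R}\cap\mathcal{P}^\sharp$, $\mathcal{N}^\natural=\mathcal{N}\cap\mathcal{P}^\natural$ and $\mathcal{R}^\natural=\mathcal{R}\cap\mathcal{P}^\natural$, and define the subspaces $\mathcal{Z}$ and $\mathcal{W}$ of $\mathcal{P}$ as
$$\mathcal{Z}=\mathcal{N}^\sharp\oplus\mathcal{N}^\natural\oplus\mathcal{P}^\flat, \qquad \mathcal{W}=\mathcal{R}^\sharp\oplus\mathcal{R}^\natural. \tag{20}$$
The difference with respect to the previous case is just that now $\mathcal{N}^\natural$ is not empty, because it contains all monomials of the form $(x_1y_1)^j\,x_\nu y_\nu$ with $\nu=2,\dots,n$ and positive $j$. This forces the change in the definition of the subspaces $\mathcal{Z}$ and $\mathcal{W}$. However, the properties $\mathcal{Z}\cap\mathcal{W}=\{0\}$ and $\mathcal{Z}\oplus\mathcal{W}=\mathcal{P}$ remain true. Furthermore, in view of the second Melnikov's condition, also the property $\mathcal{W}\subset\mathcal{R}$ holds true, so that for every $\psi\in\mathcal{W}$ the unique solution $\chi=L_{H_0}^{-1}\psi$, $\chi\in\mathcal{W}$, of the equation $L_{H_0}\chi=\psi$ exists.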
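The solution of equation (18) in the setting of theorem 1 can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's implementation: a polynomial is stored as a dictionary mapping a pair of exponent tuples `(j, k)` to the coefficient of $x^jy^k$, exact rational eigenvalues are used only to keep the arithmetic exact, and all function names are hypothetical.

```python
from fractions import Fraction

def divisor(j, k, lam):
    """The eigenvalue <j - k, lambda> of L_{H_0} on the monomial x^j y^k."""
    return sum((a - b) * l for a, b, l in zip(j, k, lam))

def transverse_degree(j, k):
    """Total degree of x^j y^k in the variables x_2..x_n, y_2..y_n."""
    return sum(a + b for a, b in zip(j[1:], k[1:]))

def solve_homological(psi, lam):
    """Return (chi, Z) with L_{H_0} chi + Z = psi, Z in N# + Pb and chi in W, cf. (16)."""
    chi, Z = {}, {}
    for (j, k), c in psi.items():
        t, d = transverse_degree(j, k), divisor(j, k, lam)
        if t >= 2 or (t == 0 and d == 0):
            Z[(j, k)] = c            # monomial in P-flat or in N-sharp
        elif d == 0:
            raise ValueError("first Melnikov condition violated")
        else:
            chi[(j, k)] = c / d      # formula (13) on R-sharp + P-natural
    return chi, Z

def apply_LH0(f, lam):
    """Apply the diagonal operator L_{H_0} to f, dropping vanishing terms."""
    out = {m: c * divisor(m[0], m[1], lam) for m, c in f.items()}
    return {m: c for m, c in out.items() if c != 0}

lam = (Fraction(1), Fraction(1, 2))           # illustrative, satisfies (3)
psi = {((2, 0), (0, 1)): Fraction(3),         # x1^2 y2      -> W
       ((1, 0), (0, 2)): Fraction(5),         # x1 y2^2      -> P-flat
       ((2, 0), (2, 0)): Fraction(7),         # x1^2 y1^2    -> N-sharp
       ((2, 0), (1, 0)): Fraction(2)}         # x1^2 y1      -> R-sharp
chi, Z = solve_homological(psi, lam)
print(chi, Z)
```

Note that the monomial $x_1y_2^2$ is resonant for this choice of $\lambda$, yet it causes no trouble: it lies in $\mathcal{P}^\flat$ and is simply kept in $Z$, which is why only the first Melnikov condition is needed.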
3. Quantitative estimates
Pick a real vector $R\in\mathbb{R}^n$ with positive components, and consider the domain
$$\Delta_R = \bigl\{(x,y)\in\mathbb{C}^{2n} : |x_j|\le R_j,\ |y_j|\le R_j,\ 1\le j\le n\bigr\}, \tag{21}$$
namely a polydisk which is the product of disks of radii $R_1,\dots,R_n$ in the planes of the complex coordinates $(x_1,\dots,x_n)$ and $(y_1,\dots,y_n)$, respectively. Let also
$$\Lambda = \min_{1\le j\le n} R_j. \tag{22}$$
The norm $\|f\|_R$ in the polydisk $\Delta_R$ of a homogeneous polynomial $f=\sum_{|j+k|=r}f_{j,k}\,x^jy^k$ of degree $r$ is defined as
$$\|f\|_R = \sum_{|j+k|=r}|f_{j,k}|\,R^{j+k}. \tag{23}$$
A family of polydisks $\Delta_{\delta R}$ of radii $\delta R$, with $0<\delta\le 1$, will be considered, and the shorter notation $\|\cdot\|_{\delta}$ in place of $\|\cdot\|_{\delta R}$ will be used.
The main result of this section is
Lemma 1:
Let the Hamiltonian $H^{(0)}$ satisfy $\|H^{(0)}_s\|_1\le h^{s-1}E$ for $s\ge 1$, with some constants $h\ge 0$ and $E>0$. Let $0<d<1/2$. Then there exist positive constants $\beta$ and $G$ depending on $E,h,\Lambda,d$ and on $\lambda_1,\dots,\lambda_n$ such that
$$\|\chi_r\|_{1-d}\le\beta^{r-1}G \quad\text{for all } r\ge 1.$$
The rest of this section is devoted to the proof. Some technical calculations are deferred to appendix B.
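To make the weighted norm (23) concrete, here is a minimal sketch that evaluates $\|f\|_{\delta R}$ for a polynomial stored as a dictionary from exponent pairs `(j, k)` to coefficients. The representation and the function name are assumptions made for illustration. For a homogeneous polynomial of degree $r$ the definition gives $\|f\|_{\delta R}=\delta^{r}\|f\|_R$, which is the scaling that makes the restriction to smaller polydisks useful.

```python
def weighted_norm(f, R, delta=1.0):
    """Norm (23) on the polydisk of radii delta*R: sum of |f_{j,k}| (delta R)^(j+k)."""
    total = 0.0
    for (j, k), c in f.items():
        weight = 1.0
        for a, b, r in zip(j, k, R):
            weight *= (delta * r) ** (a + b)
        total += abs(c) * weight
    return total

# Illustrative cubic polynomial 3 x1^2 y2 - 4i x1 x2 y1 with radii R = (1, 1/2).
f = {((2, 0), (0, 1)): 3, ((1, 1), (1, 0)): -4j}
R = (1.0, 0.5)
print(weighted_norm(f, R), weighted_norm(f, R, delta=0.5))
```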
The following lemma will play a crucial role in the proof of lemma 1.
Lemma 2:
Let $\lambda\in\mathbb{C}^n$ be such that $\lambda$ satisfies the non-resonance condition (3). Then there exists a positive $\gamma$ such that the inequality
$$|\langle k,\lambda\rangle|\ge|k|\,\gamma$$
holds true for all non-zero $k\in\mathcal{K}^\sharp\cup\mathcal{K}^\natural$ defined as in (14).
Corollary 1:
Let in addition the non-resonance condition (4) be satisfied. Then the same statement holds true for all non-zero $k\in\mathcal{K}^\sharp\cup\mathcal{K}^\natural$ defined as in (19).
The proof of the corollary is just a trivial modification of the
Proof of lemma 2.
For k ∈ K ♯ the claim is obvious, since |h k, λ i| = | k λ | . So, let k ∈ K ♮ . Set ϑ = max (cid:0) | λ | , . . . , | λ n | (cid:1) (for the corollary maximize also over | λ ν | and | λ ν ± λ ν ′ | with ν ′ = ν ). Pick an integer N ≥ ϑ and set δ = min k ∈K ♮ | k |≤ N (cid:12)(cid:12) h k, λ i (cid:12)(cid:12) , γ = min (cid:18) δN , | λ | (cid:19) ;in view of the non-resonance condition (17) one has δ >
0. Then the claim of the lemmaholds true with the given value of γ . Indeed, let k ∈ K ♮ , so that | k | = | k | −
1. If | k | ≤ N then (cid:12)(cid:12) h k, λ i (cid:12)(cid:12) ≥ δ ≥ N γ ≥ | k | γ . If | k | > N use ϑ ≤ ( N − δ/
2, which follows from thechoice of N , and evaluate (cid:12)(cid:12) h k, λ i (cid:12)(cid:12) ≥ (cid:12)(cid:12) k λ (cid:12)(cid:12) − ϑ ≥ (cid:0) | k | − (cid:1) δ − ( N − δ A. Giorgilli ≥ | k | − δ + N δ − ( N − δ = | k | δ ≥ | k | γ . Q.E.D.
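The content of lemma 2 can be illustrated numerically by scanning the ratio $|\langle k,\lambda\rangle|/|k|$ over the non-zero vectors of $\mathcal{K}^\sharp\cup\mathcal{K}^\natural$ in (14) with $|k_1|$ below a cutoff. The sketch below is an assumption-laden illustration (hypothetical function name, finite scan only): it yields an empirical candidate for $\gamma$, not a proof. Since the ratio tends to $|\lambda_1|$ as $|k_1|$ grows, the minimum is expected to be attained at small $|k_1|$.

```python
import math

def empirical_gamma(lam, kmax):
    """min |<k, lam>| / |k| over non-zero k in K-sharp and K-natural, |k_1| <= kmax."""
    best = float("inf")
    for k1 in range(-kmax, kmax + 1):
        if k1 != 0:                                  # k in K-sharp: k = (k1, 0, ..., 0)
            best = min(best, abs(k1 * lam[0]) / abs(k1))
        for nu in range(1, len(lam)):                # k in K-natural: k = k1*e_1 +/- e_nu
            for sign in (1, -1):
                value = abs(k1 * lam[0] + sign * lam[nu])
                best = min(best, value / (abs(k1) + 1))
    return best

lam = (1j, 1j * math.sqrt(2))                        # illustrative eigenvalues
print(empirical_gamma(lam, 5), empirical_gamma(lam, 50))
```

For this example the two scans agree, the minimum being attained at $k_1=\mp 2$.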
Here I refer to the more restrictive hypotheses of theorem 2, and in particular to the spaces $\mathcal{P}$ defined as in sect. 2.3. However, the same arguments with very little simplifications apply also to the setting of sect. 2.2, which applies to theorem 1. I will insert short comments in parentheses concerning the latter case, where appropriate.
The estimates in this section strongly depend on a suitable splitting of all functions over the subspaces $\mathcal{P}^\sharp$, $\mathcal{P}^\natural$ and $\mathcal{P}^\flat$. At a formal level, it is useful to keep in mind the following table concerning the Poisson bracket:
$$\begin{array}{c|c|c|c}
\{\cdot,\cdot\} & \mathcal{P}^\sharp & \mathcal{P}^\natural & \mathcal{P}^\flat\\ \hline
\mathcal{P}^\sharp & \mathcal{P}^\sharp & \mathcal{P}^\natural & \mathcal{P}^\flat\\ \hline
\mathcal{P}^\natural & \mathcal{P}^\natural & \mathcal{P}^\sharp\oplus\mathcal{P}^\natural\oplus\mathcal{P}^\flat & \mathcal{P}^\natural\oplus\mathcal{P}^\flat\\ \hline
\mathcal{P}^\flat & \mathcal{P}^\flat & \mathcal{P}^\natural\oplus\mathcal{P}^\flat & \mathcal{P}^\flat
\end{array} \tag{24}$$
(For the subspaces defined as in sect. 2.2 just remove $\mathcal{P}^\natural$ from the central case corresponding to the Poisson bracket between functions in $\mathcal{P}^\natural$.) In view of the transformation formulæ (11) the situation to be considered is the following. A generating function $\chi\in\mathcal{W}\cap\mathcal{P}_r$ with some $r\ge 1$ is given by $\chi=L_{H_0}^{-1}\psi$, with known $\psi\in\mathcal{W}\cap\mathcal{P}_r$. Since $\mathcal{W}=\mathcal{R}^\sharp\oplus\mathcal{R}^\natural$ one has $\chi=\chi^\sharp+\chi^\natural$, with an obvious meaning of the notation. The operator $L_\chi$ may be applied either to a generic function $f=f^\sharp+f^\natural+f^\flat\in\mathcal{P}_s$ with $s\ge r$ or to a function in normal form $Z=Z^\sharp+Z^\natural+Z^\flat\in\mathcal{Z}\cap\mathcal{P}_m$ with $0<m<r$ (in theorem 1 $Z^\natural=0$). In particular one has
$$Z^\sharp=\sum_{j>1}z_j\,x_1^{j}y_1^{j}, \qquad Z^\natural=\sum_{j>0,\ 2\le\nu\le n}z_{j,\nu}\,x_1^{j}y_1^{j}\,x_\nu y_\nu,$$
due to the non-resonance conditions on $\lambda$. For some non-negative $\delta',\delta'',\delta$ satisfying $0\le\max(\delta',\delta'')<\delta\le 1/2$ the norms $\|\psi\|_{1-\delta'}$, $\|f\|_{1-\delta''}$ and $\|Z\|_{1-\delta''}$ are assumed to be known, and one looks for an estimate of the Lie derivative in a domain $\Delta_{(1-\delta)R}$.
Thefollowing estimates will be used in the rest of the paper.(i) The generating function χ is estimated by(25) k χ k − δ ′ ≤ γ k ψ k − δ ′ , with γ as in lemma 2.(ii) The general estimate for the Lie derivative of a generic function f is(26) (cid:13)(cid:13) L χ f (cid:13)(cid:13) − δ ≤ δ − δ ′ )( δ − δ ′′ )Λ k χ k − δ ′ k f k − δ ′′ n a theorem of Lyapounov 9 with Λ as in (22). Denoting by (cid:0) L χ ♮ f ♭ (cid:1) ♮ the projection of L χ ♮ f ♭ over P ♮ one has(27) (cid:13)(cid:13)(cid:0) L χ ♮ f ♭ (cid:1) ♮ (cid:13)(cid:13) − δ ≤ δ − δ ′′ )Λ k χ k − δ ′ k f k − δ ′′ . (iii) For a function Z in normal form one has(28) (cid:13)(cid:13) L χ ( Z ♯ + Z ♮ ) (cid:13)(cid:13) − δ ≤ δ − δ ′′ ) γ Λ k ψ k − δ ′ k Z k − δ ′′ . I recall the reader’s attention on the missing denominator δ − δ ′ in (27) and (28). Thisis crucial for the convergence proof. For, working out the convergence proof requiresa quite accurate control of the accumulation of the divisors δ − δ ′ , δ − δ ′′ that appearin the generalized Cauchy estimates for derivatives. The scheme in the next section isspecially devised in order to allow such a control.The proof of (25) is a straightforward consequence of the definition of the normand of (13). For, the denominators are uniformly estimated from below by γ , in view oflemma 2.The proof of the estimates (26), (27) and (28) is a purely technical matter, and isdeferred to appendix B. The aim of this section is to obtain estimates for the norms of the generating functionsand of the transformed Hamiltonians, at every step of the normalization procedure.Consider a sequence of boxed domains ∆ (1 − δ r ) R , where { δ r } r ≥ is a monotonicallyincreasing sequence of positive numbers converging to some d < /
2. Let also δ = 0,and d r = δ r − δ r − for r ≥
1, so that d r < r . The purpose is to lookfor estimates of the norms of the generating function χ r and of the normal form Z r inthe polydisk ∆ (1 − δ r − ) R , and of the functions H ( r ) s in the domain ∆ (1 − δ r ) R .Let J r,s for 1 < r < s be the set of integer arrays defined as(29) J r,s = n J = { j , . . . j k } : j m ∈ { , . . . , r } , ≤ k ≤ s − , k X m =1 log j m ≤ s − − log s ) o . Let also J ,s = ∅ for s ≥
1. Recalling that { d r } r ≥ is a sequence of positive numbersnot exceeding 1 define the sequence { T r,s } ≤ r
1; for r > J r,s ⊂ J r ′ ,s for r < r ′ . In order to prove (32) remark that by definition one has1 d r T r − ,r T r ′ ,s = 1 d r max J ∈J r − ,r Y j ∈ J d − j max J ′ ∈J r ′ ,s Y j ′ ∈ J ′ d − j ′ = max J ∈J r − ,r max J ′ ∈J r ′ ,s Y j ∈{ r,r }∪ J ∪ J ′ d − j . It is enough to prove that { r, r } ∪ J ∪ J ′ =: ˜ J ∈ J r ′ ,r + s . First check that (cid:0) ˜ J ) = 2 + J ) + J ′ ) ≤ r −
1) + 2( s −
1) = 2( r + s − . On the other hand, since 1 ≤ j ≤ r − j ∈ J and 1 ≤ j ′ ≤ r ′ for all j ′ ∈ J ′ , onealso has 1 ≤ ˜ j ≤ r ′ for all ˜ j ∈ ˜ J . Finally, evaluate X ˜ j ∈ ˜ J log ˜ j = 2 log r + X j ∈ J log j + X j ′ ∈ J ′ log j ′ ≤ r + 2( r − − log r ) + 2( s − − log s ) ≤ (cid:2) r + s − − (1 + log s ) (cid:3) ≤ (cid:2) r + s − − log ( r + s ) (cid:3) , where the elementary inequality 1 + log s = log s = log (2 s ) > log ( r + s ) hasbeen used (recall that r ≤ r ′ < s ). Hence, ˜ J ∈ J r ′ ,r + s , as claimed.I shall also use the numerical sequence { µ r,s } r ≥ ,s ≥ defined as(33) µ , = 0 , µ ,s = 1 for s > ,µ r,s = X ≤ rp s ≥ . The recursive estimates are collected in
Lemma 3:
Let the Hamiltonian H (0) satisfy k H (0) s k ≤ h s − E for some constants h ≥ and E > . Let d = 1 and { d r } r ≥ be an arbitrary sequence of positive numberssatisfying P r ≥ d r = d with d < . Let also δ = 0 and δ r = d + . . . + d r . Then for s > r ≥ the following estimates hold true: k χ r k − δ r − ≤ µ r − ,r T r − ,r C r − Eγ , (34) k Z r k − δ r − ≤ µ r − ,r T r − ,r C r − Ed r − , (35) k Z ♯r + Z ♮r k − δ r − ≤ µ r − ,r T r − ,r C r − E , (36) k H ( r ) s k − δ r ≤ µ r,s T r,s C s − Ed r , (37) k H ( r ) ,♯s + H ( r ) ,♮s k − δ r ≤ µ r,s T r,s C s − E , (38) n a theorem of Lyapounov 11where (39) C = h + 4 e Eγ Λ , and µ r,s and T r,s are the sequences defined by (30) and (33). Remark that (34), (36) and (38) differ from (35) and (37), respectively, only because adivisor d r is missing. Proof.
By induction. For r = 0 only (37) and (38) are meaningful, and hold true inview of d = µ ,s = T ,s = 1 and of h < C . The induction consists in first provingthat if (37) and (38) hold true up to r − r ; nextproving that if (34), (35) and (36) hold true up to r then (37) and (38) are true for r .Let r > r − r and r in place of s in (37) and (38). Recallingthat only H ( r − ,♯r + H ( r − ,♮r is used in order to determine χ r use the definition of thenorm, the form of the solution of eq. (10) discussed in sect. 2, and the estimate (25).This immediately shows that (34), (35) and (36) are true for r provided (37) and (38)hold true for r −
1. Coming to (37) and (38) and recalling the recursive definitions (11)there are only two kinds of terms to be estimated, namely s ! L sχ r Z m for 1 ≤ m < r and p ! L pχ r H ( r − s − p ) r + m for 0 ≤ p ≤ s and 0 ≤ m < r . For, remarking that Z r and H ( r − r are estimated by exactly the same quantity it is safe to estimate (cid:13)(cid:13) H ( r ) sr (cid:13)(cid:13) by replacing Z r with H ( r − r in the second of (11). This is tantamount to extending the sum in thesecond of (11) to p = s − m = 0.Denote ϕ s = L sχ r Z m , where r >
1, and split ϕ s = ϕ ♯s + ϕ ♮s + ϕ ♭s . I claim (cid:13)(cid:13) ϕ s (cid:13)(cid:13) − δ r ≤ s ! (cid:18) ed r Λ (cid:19) s − k χ r k s − − δ r − Dd r , (40) (cid:13)(cid:13) ϕ ♯s + ϕ ♮s (cid:13)(cid:13) − δ r ≤ s ! (cid:18) ed r Λ (cid:19) s − k χ r k s − − δ r − D , (41)for s ≥
1, where(42) D = µ r − ,r µ m − ,m T r,r + m C r + m − E The proof proceeds by induction. Let s = 1. By the general estimate (26) one has (cid:13)(cid:13) ϕ (cid:13)(cid:13) − δ r ≤ d r d m Λ (cid:13)(cid:13) χ r (cid:13)(cid:13) − δ r − (cid:13)(cid:13) Z m (cid:13)(cid:13) − δ m − . Using (34) and (35) one gets (cid:13)(cid:13) ϕ (cid:13)(cid:13) − δ r ≤ d r µ m − ,m µ r − ,r d m − d m T m − ,m T r − ,r C r + m − E γ Λ , so that (40) immediately follows from (31), (32) and (39)Still keeping s = 1, (41) is obtained by remarking that the contributions to ϕ ♯ + ϕ ♮ come only from L χ r ( Z ♯m + Z ♮m ) and (cid:0) L χ ♮r Z ♭m (cid:1) ♮ . Proceeding as above, from (28) and (27) one gets (cid:13)(cid:13) ϕ ♯ + ϕ ♮ (cid:13)(cid:13) − δ r ≤ γd m Λ (cid:13)(cid:13) H ( r − ,♯r + H ( r − ,♮r (cid:13)(cid:13) − δ r − (cid:13)(cid:13) Z m (cid:13)(cid:13) − δ m − + 4 d m Λ (cid:13)(cid:13) χ r (cid:13)(cid:13) − δ r − (cid:13)(cid:13) Z m (cid:13)(cid:13) − δ m − . Then (41) for s = 1 follows from (35), (38) for r − d r does not appear here.Let now s >
1, and assume that (40) be true up to s −
1. Recalling that the divisor d r due to the generalized Cauchy estimates is arbitrary, replace d r with s − s d r in theestimates (40) and (41) for ϕ s − , thus getting(43) (cid:13)(cid:13) ϕ s − (cid:13)(cid:13) − δ r + d r /s ≤ ( s − (cid:18) ss − (cid:19) s − (cid:18) ed r Λ (cid:19) s − k χ r k s − − δ r − Dd r , (cid:13)(cid:13) ϕ ♯s − + ϕ ♮s − (cid:13)(cid:13) − δ r + d r /s ≤ ( s − (cid:18) ss − (cid:19) s − (cid:18) ed r Λ (cid:19) s − k χ r k s − − δ r − D .
Consider first the estimate (41). Remarking that the contributions to ϕ ♯s + ϕ ♮s come onlyfrom L χ r (cid:0) ϕ ♯s − + ϕ ♮s − (cid:1) and (cid:0) L χ ♮r ϕ ♭s − (cid:1) ♮ , use (26) and (27) to estimate (cid:13)(cid:13) ϕ ♯s + ϕ ♮s (cid:13)(cid:13) − δ r ≤ (cid:13)(cid:13) L χ r (cid:0) ϕ ♯s − + ϕ ♮s − (cid:1)(cid:13)(cid:13) − δ r + (cid:13)(cid:13)(cid:0) L χ ♮r ϕ ♭s − (cid:1) ♮ (cid:13)(cid:13) − δ r ≤ sd r Λ (cid:13)(cid:13) χ r (cid:13)(cid:13) − δ r − (cid:13)(cid:13) ϕ ♯s − + ϕ ♮s − (cid:13)(cid:13) − δ r + d r /s + 2 sd r Λ (cid:13)(cid:13) χ r (cid:13)(cid:13) − δ r − (cid:13)(cid:13) ϕ s − (cid:13)(cid:13) − δ r + d r /s Replacing (43) in the latter expression one gets (cid:13)(cid:13) ϕ ♯s + ϕ ♮s (cid:13)(cid:13) − δ r ≤ s ! d r Λ (cid:18) ss − (cid:19) s − (cid:18) ed r Λ (cid:19) s − (cid:13)(cid:13) χ r (cid:13)(cid:13) s − − δ r − D , so that (41) follows from the trivial inequality (cid:0) ss − (cid:1) s − < e . The estimate (40) ischecked with a similar calculation, just taking into account that (26) must be usedin order to estimate L χ r ϕ s − . This produces an extra divisor d r with respect to thecalculation above.Finally, replace (34) and (42) in (40) and (41). Using also (39), one gets (cid:13)(cid:13) ϕ s (cid:13)(cid:13) − δ r ≤ s ! µ s − r − ,r (cid:18) d r T r − ,r (cid:19) s − T r,r + m C sr + m − Ed r . (cid:13)(cid:13) ϕ ♯s + ϕ ♮s (cid:13)(cid:13) − δ r ≤ s ! µ s − r − ,r (cid:18) d r T r − ,r (cid:19) s − T r,r + m C sr + m − E .
Using s − (cid:18) d r T r − ,r (cid:19) s − T r,r + m ≤ (cid:18) d r T r − ,r (cid:19) s − T r, r + m ≤ . . . ≤ T r,sr + m . n a theorem of Lyapounov 13 Thus one concludes 1 s ! (cid:13)(cid:13) L sχ r Z m (cid:13)(cid:13) − δ r ≤ µ sr − ,r µ m − ,m T r,sr + m C sr + m − Ed r , (44) 1 s ! (cid:13)(cid:13)(cid:0) L sχ r Z m (cid:1) ♯ + (cid:0) L sχ r Z m (cid:1) ♮ (cid:13)(cid:13) − δ r ≤ µ sr − ,r µ m − ,m T r,sr + m C sr + m − E . (45) The estimate for p ! L pχ r H ( r − s − p ) r + m is a minor variazione of the scheme above. Onlythe first step must be omitted. E.g., set ϕ p = L pχ r H ( r − s − p ) r + m and proceed as follows.Using (37) for r − (cid:13)(cid:13) ϕ (cid:13)(cid:13) − δ r − ≤ µ r − ,sr + m T r,sr + m C sr + m − E ;this starts the induction on p . Then proceed for p > p ! (cid:13)(cid:13) L pχ r H ( r − s − p ) r + m (cid:13)(cid:13) − δ r ≤ µ pr − ,r µ r − , ( s − p ) r + m T r,sr + m C sr + m − Ed r , (47) 1 p ! (cid:13)(cid:13)(cid:0) L pχ r H ( r − s − p ) r + m (cid:1) ♯ + (cid:0) L pχ r H ( r − s − p ) r + m (cid:1) ♮ (cid:13)(cid:13) − δ r ≤ µ pr − ,r µ r − , ( s − p ) r + m T r,sr + m C sr + m − E .
Collecting (44), (45), (46) and (47) and referring to the transformation formulæ (11)it is now an easy matter to verify that (37) and (38) hold true provided the sequence µ r,s for 0 < r < s is defined as(48) µ ,s = 1 for s > ,µ r,sr + m = µ sr − ,r µ m − ,m + s − X p =0 µ pr − ,r µ r − , ( s − p ) r + m for r ≥ , s ≥ , ≤ m < r .µ r,sr = s − X p =0 µ pr − ,r µ r − , ( s − p ) r for r ≥ , s ≥ . This looks quite different from (33). However, I claim that (33) is just a harmlessextension of (48). Indeed, just redefine the indexes by replacing sr + m with s , alsoaccepting s ≥
0, which removes the implicit restriction s > r . This is harmless, be-cause for s ≤ r one gets for (33) µ r,s = µ r − ,s . Therefore, in the second line one canreplace µ m − ,m = µ r − ,m and include it into the sum. This completes the proof oflemma 3. Q.E.D.
The statement of the lemma concerns only the sequence of generating functions, thatare estimated by (34). The completion of the proof rests on a suitable choice of thesequence { d r } r ≥ , that was left arbitrary, and on a suitable estimate of the sequence { µ r,s } s ≥ r ≥ . As a matter of fact, only the diagonal elements of the latter sequence need to be estimated, because the estimate for the generating functions in lemma 3 involvesonly µ r − ,r = µ r,r .First, pick a value for d , with 0 < d < /
2, and set d r = br , b = 6 dπ , so that P r ≥ d r = d in view of P r ≥ /r = π /
6. The immediate consequence is that(49) T r,s ≤ (cid:18) b (cid:19) s − . For, use the definition (30) and recall the definition (29) of J r,s ; then let J ∈ J r,s andevaluate Y j ∈ J d j ≤ b s − Y j ∈ J j , because J ) ≤ s − Y j ∈ J j = 2 X j ∈ J log j ≤ s − . This proves (49)Coming to the sequence (33), the problem is to show that µ r − ,r ≤ η r − for somepositive η . For, only µ r − ,r enters the estimate (34). By separating the term p = 0 inthe sum one gets µ r,s = µ r − ,s + µ r − ,r X ≤ q n a theorem of Lyapounov 15 is a majorant of { µ r − ,r } r ≥ . This is known as the Catalan’s sequence, and one has(51) ν r = 2 r − (2 r − r ! ≤ r − , where the common notation (2 n + 1)!! = 1 · · . . . · (2 n + 1) has been used.Thus, we conclude that µ r − ,r ≤ r − . Inserting the latter inequality and (49)in (34) the statement of lemma 1 follows.
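The two numerical facts used in this last step, namely the bound (51) on the Catalan sequence and the choice $d_r=b/r^2$ with $b=6d/\pi^2$, can be checked with a short script. This is a sketch under an explicit assumption: the majorant is taken to satisfy the standard Catalan recursion $\nu_1=1$, $\nu_r=\sum_{j=1}^{r-1}\nu_j\nu_{r-j}$; the function names are hypothetical.

```python
import math

def catalan_by_recursion(rmax):
    """nu_1 = 1, nu_r = sum_{j=1}^{r-1} nu_j nu_{r-j} (assumed recursion)."""
    nu = {1: 1}
    for r in range(2, rmax + 1):
        nu[r] = sum(nu[j] * nu[r - j] for j in range(1, r))
    return nu

def odd_double_factorial(m):
    """m!! for odd m >= -1, with the convention (-1)!! = 1."""
    out = 1
    while m > 1:
        out *= m
        m -= 2
    return out

def catalan_closed_form(r):
    """Closed form in (51): 2^(r-1) (2r-3)!! / r!."""
    return 2 ** (r - 1) * odd_double_factorial(2 * r - 3) // math.factorial(r)

def step_sizes(d, rmax):
    """d_r = b / r^2 with b = 6 d / pi^2, for r = 1..rmax."""
    b = 6.0 * d / math.pi ** 2
    return [b / r ** 2 for r in range(1, rmax + 1)]

nu = catalan_by_recursion(12)
for r in range(1, 13):
    assert nu[r] == catalan_closed_form(r) <= 4 ** (r - 1)
print([nu[r] for r in range(1, 8)])
print(sum(step_sizes(0.25, 100000)))   # partial sums approach d = 0.25 from below
```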
4. Proof of theorem 1
Having established the estimate of lemma 1 on the sequence of generating functions itis now a standard matter to complete the proof of theorem 1. Hence this section will beless detailed with respect to the previous ones.The situation to be dealt with is the following. An infinite sequence { χ r } r ≥ ofgenerating functions is given, with χ r ∈ P r +2 (a homogeneous polynomial of degree r + 2) satisfying k χ r k R ≤ β r − G for some real vector R with positive components andsome positive β and G . Define a corresponding sequence of canonical transformations( x ( r − , y ( r − ) = exp( L χ r )( x ( r ) , y ( r ) ). By composition one also constructs a sequence {C ( r ) } r ≥ of canonical transformations ( x (0) , y (0) ) = C ( r ) ( x ( r ) , y ( r ) ) recursively definedas C (0) = Id , C ( r ) = exp( L χ r ) ◦ C ( r − , Id denoting the identity operator. The problem is to prove the following statements.(i) Every near the identity canonical transformation defined via the exponential oper-ator exp( L χ r ) is expressed as a power series which is convergent in a polydisk ∆ ̺R for some positive ̺ .(ii) For any function f ( x ( r − , y ( r − ) analytic in ∆ ̺R the transformed function is an-alytic in the same polydisk, and moreover f ( x ( r − , y ( r − ) (cid:12)(cid:12)(cid:12) ( x ( r − ,y ( r − )=exp( L χr )( x ( r ) ,y ( r ) ) = (cid:2) exp( L χ r ) f (cid:3) ( x ( r ) , y ( r ) ) . (iii) The sequence (cid:8) C ( r ) (cid:9) r ≥ of canonical transformations converges for r → ∞ to acanonical transformation C ( ∞ ) which is analytic in a polydisk ∆ (1 − d ) ̺R for somepositive d < / f analytic in ∆ ̺R the sequence recursively defined as f (0) = f , f ( r ) = exp( L χ r ) f ( r − converges for r → ∞ to a function f ( ∞ ) that is analytic in∆ (1 − d ) ̺R , and moreover one has f ( ∞ ) = f ◦ C ( ∞ ) . The statement (i) actually reduces to Cauchy’s proof of the existence and uniqueness ofthe local solution of an analytic system of differential equations. 
Indeed, the transformation defined via the exponential operator is the time-one canonical flow induced by the Hamiltonian vector field generated by $\chi_r$. The statement (ii) actually claims that the substitution of variables in a function $f$ may be effectively replaced by the application of the exponential operator to $f$; this is indeed the basis of the algorithm for constructing the normal form used in sect. 2. A detailed proof of both these statements may be found, e.g., in [9]; however, the reader may be able to reconstruct the proof by following the hints in [3].

The proof of (iii) rests on the following remarks. In the polydisk $\Delta_{\varrho R}$ one has $|\chi_r(x,y)| \le \varrho^{r+2}\,\|\chi_r\|_R$; this, in turn, implies that $|x^{(r)} - x^{(r-1)}| \sim \beta^{r-1}\varrho^{r+2}$ and $|y^{(r)} - y^{(r-1)}| \sim \beta^{r-1}\varrho^{r+2}$. The geometric bound on the latter quantities implies that $\sum_{r>0} |x^{(r)} - x^{(r-1)}|$ and $\sum_{r>0} |y^{(r)} - y^{(r-1)}|$ behave like geometric series, i.e., converge for $\varrho$ small enough. Thus, the claim follows from Weierstrass' theorem. Finally, the statement (iv) follows from (ii) being true for all $r > 0$, which implies that both sequences $f^{(r)} = \mathcal{C}^{(r)} f$ and $f \circ \mathcal{C}^{(r)}$ converge to the same limit. This concludes the proof of theorem 1.

A. Justification of the normalization algorithm
Justifying the normalization algorithm of sect. 2 is just a matter of rearranging terms in the expansion of $\exp(L_{\chi_r}) H^{(r-1)}$. Considering first $H_0$ and $H_r^{(r-1)}$ together, one has

$$\exp(L_{\chi_r})\bigl(H_0 + H_r^{(r-1)}\bigr) = H_0 + L_{\chi_r} H_0 + \sum_{s\ge 2} \frac{1}{s!} L_{\chi_r}^{s} H_0 + H_r^{(r-1)} + \sum_{s\ge 1} \frac{1}{s!} L_{\chi_r}^{s} H_r^{(r-1)}.$$

Here, $H_0$ is the first term in the transformed Hamiltonian $H^{(r)}$ in (9). In view of (10) one has $L_{\chi_r} H_0 + H_r^{(r-1)} = Z_r$, which kills the unwanted term $H_r^{(r-1)}$ and replaces it with the normalized term $Z_r$. The two sums may be collected and simplified by calculating

$$\begin{aligned}
\sum_{s\ge 2} \frac{1}{s!} L_{\chi_r}^{s} H_0 + \sum_{s\ge 1} \frac{1}{s!} L_{\chi_r}^{s} H_r^{(r-1)}
&= \sum_{s\ge 2} \frac{1}{(s-1)!} L_{\chi_r}^{s-1} \left[ \frac{1}{s}\Bigl( L_{\chi_r} H_0 + H_r^{(r-1)} \Bigr) + \frac{s-1}{s} H_r^{(r-1)} \right] \\
&= \sum_{s\ge 2} \frac{1}{(s-1)!} L_{\chi_r}^{s-1} \left( \frac{1}{s} Z_r + \frac{s-1}{s} H_r^{(r-1)} \right).
\end{aligned}$$

Here, both $L_{\chi_r}^{s-1} Z_r$ and $L_{\chi_r}^{s-1} H_r^{(r-1)}$ are homogeneous polynomials of degree $sr+2$, that are added to $H^{(r)}_{sr}$ in the second of (11).

Proceed now by transforming the functions $Z_1, \ldots, Z_{r-1}$ that are already in normal form. Recall that no such term exists for $r = 1$. For $r > 1$ one gets

$$\exp(L_{\chi_r}) Z_m = Z_m + \sum_{s\ge 1} \frac{1}{s!} L_{\chi_r}^{s} Z_m, \qquad 1 \le m < r.$$

The term $Z_m$ is copied into $H^{(r)}$ in (9). The term $L_{\chi_r}^{s} Z_m$ is a homogeneous polynomial of degree $sr+m+2$ that is added to $H^{(r)}_{sr+m}$ in the first of (11).

Finally, consider all terms $H_s^{(r-1)}$ with $s > r$, that may be written as $H^{(r-1)}_{lr+m}$ with $l \ge 1$ and $0 \le m < r$, the case $l = 1$, $m = 0$ being excluded. One gets

$$\exp(L_{\chi_r}) H^{(r-1)}_{lr+m} = \sum_{p\ge 0} \frac{1}{p!} L_{\chi_r}^{p} H^{(r-1)}_{lr+m},$$

where $L_{\chi_r}^{p} H^{(r-1)}_{lr+m}$ is a homogeneous polynomial of degree $(p+l)r+m+2$. Collecting all homogeneous terms with $m = 0$, $l \ge 2$ and $p + l = s \ge 2$ one gets $\sum_{p=0}^{s-2} \frac{1}{p!} L_{\chi_r}^{p} H^{(r-1)}_{(s-p)r}$, that is added to $H^{(r)}_{sr}$ in the second of (11). Similarly, collecting all homogeneous terms with $0 < m < r$, $l \ge 1$ and $p + l = s \ge 1$ one gets $\sum_{p=0}^{s-1} \frac{1}{p!} L_{\chi_r}^{p} H^{(r-1)}_{(s-p)r+m}$, that is added to $H^{(r)}_{sr+m}$ in the first of (11). The latter case does not occur for $r = 1$.
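The rearrangement of the two sums is purely algebraic: it holds for any $\chi$ once $Z$ is defined as $L_\chi H_0 + H_r$, whether or not $\chi$ solves (10). The following is a minimal Python sketch, assuming a toy case with $n = 2$, $r = 1$, $\lambda = (1, 3)$ and arbitrarily chosen cubic polynomials (all illustrative choices, not taken from the paper). It checks the identity truncated at $s \le S$ with exact rational arithmetic, together with the degree count $sr + 2$. The operator $L_\chi$ is coded from the monomial formula (52) of appendix B; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

N = 2  # degrees of freedom of the toy example (an assumption)

def lie(chi, f):
    """L_chi f for polynomials stored as {(j, k): coeff}, following (52)."""
    out = {}
    for (j, k), c in chi.items():
        for (jp, kp), g in f.items():
            for l in range(N):
                w = jp[l] * k[l] - j[l] * kp[l]
                if w == 0:
                    continue
                # dividing by x_l y_l lowers both exponents of index l by one
                jn = tuple(a + b - (i == l) for i, (a, b) in enumerate(zip(j, jp)))
                kn = tuple(a + b - (i == l) for i, (a, b) in enumerate(zip(k, kp)))
                out[(jn, kn)] = out.get((jn, kn), 0) + w * c * g
    return {m: v for m, v in out.items() if v != 0}

def add(f, g, a=1, b=1):
    """Linear combination a*f + b*g, dropping zero coefficients."""
    out = {}
    for m, v in f.items():
        out[m] = out.get(m, 0) + a * v
    for m, v in g.items():
        out[m] = out.get(m, 0) + b * v
    return {m: v for m, v in out.items() if v != 0}

def lie_pow(chi, f, s):
    """Apply L_chi to f exactly s times."""
    for _ in range(s):
        f = lie(chi, f)
    return f

def degrees(f):
    """Set of total degrees of the monomials of f."""
    return {sum(j) + sum(k) for (j, k) in f}

def truncated_sides(chi, h0, hr, S):
    """Both sides of the rearrangement, keeping the terms s = 2, ..., S."""
    z = add(lie(chi, h0), hr)  # Z := L_chi H_0 + H_r
    lhs, rhs = {}, {}
    for s in range(2, S + 1):
        lhs = add(lhs, lie_pow(chi, h0, s), 1, Fraction(1, factorial(s)))
        lhs = add(lhs, lie_pow(chi, hr, s - 1), 1, Fraction(1, factorial(s - 1)))
        inner = add(z, hr, Fraction(1, s), Fraction(s - 1, s))
        rhs = add(rhs, lie_pow(chi, inner, s - 1), 1, Fraction(1, factorial(s - 1)))
    return lhs, rhs

# Toy data: H_0 = x1 y1 + 3 x2 y2, and cubic chi, H_r (so r = 1).
H0 = {((1, 0), (1, 0)): 1, ((0, 1), (0, 1)): 3}
CHI = {((2, 0), (1, 0)): 1, ((1, 1), (0, 1)): Fraction(1, 2), ((0, 0), (1, 2)): -1}
HR = {((3, 0), (0, 0)): 2, ((1, 0), (1, 1)): -1, ((0, 1), (2, 0)): 1}

if __name__ == "__main__":
    lhs, rhs = truncated_sides(CHI, H0, HR, 4)
    print("identity holds up to s = 4:", lhs == rhs)
    z = add(lie(CHI, H0), HR)
    for s in range(2, 5):
        print(s, degrees(lie_pow(CHI, z, s - 1)))  # expected to lie in {s*r + 2}
```

In the paper $\chi_r$ is not arbitrary: it is chosen through (10) so that $Z_r$ is in normal form. The sketch only confirms that the bookkeeping of coefficients and degrees does not depend on that choice.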
This completes the justification of the formal algorithm.

B. Technical calculations
The aim is to check the estimates (26), (27) and (28). Write, generically, $\chi = \sum_{j,k} c_{j,k}\, x^j y^k$ and $f = \sum_{j,k} f_{j,k}\, x^j y^k$. Then compute

$$L_\chi f = \sum_{j,k,j',k'} \sum_{l=1}^{n} \frac{j'_l k_l - j_l k'_l}{x_l y_l}\, c_{j,k}\, f_{j',k'}\, x^{j+j'} y^{k+k'}. \tag{52}$$

Using the definition of the norm evaluate

$$\begin{aligned}
\|L_\chi f\|_{1-\delta}
&\le \sum_{j,k,j',k'} \sum_{l=1}^{n} \frac{|j'_l k_l - j_l k'_l|}{R_l^2}\, |c_{j,k}|\, |f_{j',k'}|\, (1-\delta)^{|j+k|+|j'+k'|-2} R^{j+k+j'+k'} \\
&\le \sum_{j,k} \sum_{j',k'} \sum_{l=1}^{n} \frac{|j'_l k_l - j_l k'_l|}{R_l^2}\, |c_{j,k}|\, \bigl((1-\delta') - (\delta-\delta')\bigr)^{|j+k|-1} R^{j+k} \\
&\qquad \times |f_{j',k'}|\, \bigl((1-\delta'') - (\delta-\delta'')\bigr)^{|j'+k'|-1} R^{j'+k'}.
\end{aligned}$$

If $f$ is a generic function, then in view of $|j_l| \le |j+k|$ and $|k_l| \le |j+k|$ one has

$$\sum_{l=1}^{n} |j'_l k_l - j_l k'_l| \le |j+k| \sum_{l=1}^{n} |j'_l + k'_l| = |j+k| \cdot |j'+k'|. \tag{53}$$

Replacing in the estimate above, bounding $1/R_l^2$ by $1/\Lambda^2$, and using the elementary inequality

$$m(\lambda - x)^{m-1} < \frac{\lambda^m}{x} \quad \text{for } 0 < x < \lambda \text{ and } m \ge 1, \tag{54}$$

one gets

$$\begin{aligned}
\|L_\chi f\|_{1-\delta}
&\le \frac{1}{\Lambda^2} \sum_{j,k} |c_{j,k}|\, |j+k|\, \bigl((1-\delta') - (\delta-\delta')\bigr)^{|j+k|-1} R^{j+k} \\
&\qquad \times \sum_{j',k'} |f_{j',k'}|\, |j'+k'|\, \bigl((1-\delta'') - (\delta-\delta'')\bigr)^{|j'+k'|-1} R^{j'+k'} \\
&\le \frac{1}{(\delta-\delta')(\delta-\delta'')\Lambda^2} \sum_{j,k} |c_{j,k}|\, (1-\delta')^{|j+k|} R^{j+k} \times \sum_{j',k'} |f_{j',k'}|\, (1-\delta'')^{|j'+k'|} R^{j'+k'},
\end{aligned} \tag{55}$$

from which (26) immediately follows in view of the definition of the norm.

In order to prove (27) recall that $\chi^\natural$ contains only monomials $c_{j,k}\, x^j y^k$ with $j+k \in \mathcal{K}^\natural$. The projection $\bigl(L_{\chi^\natural} f^\flat\bigr)^\natural$ is just part of the general expression (52). In particular the value $l = 1$ in the sum must be discarded because the resulting monomials belong to $\mathcal{P}^\flat$. Moreover, for $j+k \in \mathcal{K}^\natural$ one has $\sum_{l=2}^{n} |j_l + k_l| \le 2$. Thus the estimate (53) may be replaced by

$$\sum_{l=2}^{n} |j'_l k_l - j_l k'_l| \le 2 \sum_{l=2}^{n} |j'_l + k'_l| \le 2\,|j'+k'|. \tag{56}$$

Hence the inequality (54) must be used only for the term involving $|j'+k'|$, and there is no need to introduce the divisor $\delta - \delta'$. Use instead $1/(1-\delta) < 2$ for $\delta < 1/2$.

In order to prove (28) replace $f$ in the general expression (52) by $Z^\sharp + Z^\natural = \sum_{\nu \in \mathcal{K}^\sharp \cup \mathcal{K}^\natural} z_{\nu,\nu}\, x^\nu y^\nu$. Recall also that the coefficients $c_{j,k}$ of $\chi$ have the form $c_{j,k} = \psi_{j,k} / \langle k-j, \lambda \rangle$, in view of (13). Then (53) may be replaced by

$$\sum_{l} |\nu_l (j_l - k_l)| \le |\nu| \sum_{l} |j_l - k_l| \le |\nu|\, |j-k|. \tag{57}$$

On the other hand, by lemma 2 one has $|c_{j,k}| \le \dfrac{|\psi_{j,k}|}{|j-k|\,\gamma}$, so that the factor $|j-k|$ in (57) is compensated by the divisor here. This removes the need to introduce the divisor $\delta - \delta'$ in the rest of the estimates. Use instead $(1-\delta)^{|j+k|-1} \le 2(1-\delta)^{|j+k|}$, which holds true in view of $\delta < 1/2$. Since every monomial $x^\nu y^\nu$ has degree $2|\nu|$, (55) is replaced by

$$\begin{aligned}
\|L_\chi (Z^\sharp + Z^\natural)\|_{1-\delta}
&\le \frac{1}{\Lambda^2} \sum_{j,k} \frac{2}{\gamma}\, |\psi_{j,k}|\, (1-\delta)^{|j+k|} R^{j+k} \times \sum_{\nu \in \mathcal{K}^\sharp \cup \mathcal{K}^\natural} |z_{\nu,\nu}|\, |\nu|\, \bigl((1-\delta'') - (\delta-\delta'')\bigr)^{2|\nu|-1} R^{2\nu} \\
&\le \frac{1}{(\delta-\delta'')\,\gamma\,\Lambda^2} \sum_{j,k} |\psi_{j,k}|\, (1-\delta')^{|j+k|} R^{j+k} \sum_{\nu \in \mathcal{K}^\sharp \cup \mathcal{K}^\natural} |z_{\nu,\nu}|\, (1-\delta'')^{2|\nu|} R^{2\nu}.
\end{aligned}$$

Thus (28) follows in view of the definition of the norm.
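The two elementary facts used in this appendix, the combinatorial bound (53) and the inequality (54), can be spot-checked numerically. The following is a minimal Python sketch, assuming a small grid of exponents and of values of $\lambda$, $x$, $m$ (the grid and the function names are illustrative choices, not taken from the paper). The bound (53) is tested in its non-strict form, since equality can occur, for instance for $\chi = x_1$ and $f = y_1$. The check of (54) uses exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def weight(j, k, jp, kp):
    """Left-hand side of (53): sum over l of |j'_l k_l - j_l k'_l|."""
    return sum(abs(a2 * b1 - a1 * b2) for a1, b1, a2, b2 in zip(j, k, jp, kp))

def check_53(n=2, max_exp=2):
    """Return (bound holds on the grid, equality is attained somewhere)."""
    vecs = list(product(range(max_exp + 1), repeat=n))
    holds, equality = True, False
    for j, k, jp, kp in product(vecs, repeat=4):
        bound = (sum(j) + sum(k)) * (sum(jp) + sum(kp))  # |j+k| * |j'+k'|
        w = weight(j, k, jp, kp)
        holds = holds and w <= bound
        equality = equality or (w == bound and bound > 0)
    return holds, equality

def check_54(max_m=12):
    """Check m (lam - x)^(m-1) < lam^m / x for sampled 0 < x < lam, m >= 1."""
    for lam in (Fraction(1, 2), Fraction(1), Fraction(3)):
        for t in range(1, 10):
            x = lam * Fraction(t, 10)
            for m in range(1, max_m + 1):
                if not m * (lam - x) ** (m - 1) < lam ** m / x:
                    return False
    return True

if __name__ == "__main__":
    print("(53) holds, equality attained:", check_53())
    print("(54) holds on the sample grid:", check_54())
```

A finite grid proves nothing, of course. The inequality (54) itself follows from the binomial expansion of $\lambda^m = \bigl((\lambda - x) + x\bigr)^m$, which contains the term $m(\lambda - x)^{m-1}x$ plus other positive terms.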
References

[1] Birkhoff, G. D.: Dynamical systems, New York (1927).
[2] Cherry, T. M.: On the solutions of Hamiltonian systems in the neighborhood of a singular point, Proc. London Math. Soc., Ser. 2, , 151–170 (1926).
[3] Giorgilli, A.: Quantitative methods in classical perturbation theory, proceedings of the NATO ASI school "From Newton to chaos: modern techniques for understanding and coping with chaos in N-body dynamical systems", A.E. Roy and B.D. Steves eds., Plenum Press, New York (1995).
[4] Giorgilli, A. and Locatelli, U.: Kolmogorov theorem and classical perturbation theory, ZAMP , 220–261 (1997).
[5] Giorgilli, A. and Locatelli, U.: On classical series expansions for quasi-periodic motions, MPEJ N. 5 (1997).
[6] Giorgilli, A.: Classical constructive methods in KAM theory, PSS.
[7] Giorgilli, A. and Locatelli, U.: A classical self-contained proof of Kolmogorov's theorem on invariant tori, in Hamiltonian systems with three or more degrees of freedom, Carles Simó ed., NATO ASI series C, Vol. 533, Kluwer Academic Publishers, Dordrecht-Boston-London (1999).
[8] Giorgilli, A.: Unstable equilibria of Hamiltonian systems, Disc. and Cont. Dynamical Systems, Vol. 7, N. 4, 855–871 (2001).
[9] Gröbner, W.: Die Lie-Reihen und Ihre Anwendungen, VEB Deutscher Verlag der Wissenschaften (1967).
[10] Lyapunov, A.M.: The General Problem of the Stability of Motion (in Russian), Doctoral dissertation, Univ. Kharkov (1892). French translation in: Problème général de la stabilité du mouvement, Annales de la Faculté des Sciences de Toulouse, deuxième série, Tome IX, 203–474 (1907). Reprinted in: Ann. Math. Study, Princeton University Press, n. 17 (1949).
[11] Moser, J.: