Optimal time averages in non-autonomous nonlinear dynamical systems
CHARLES R. DOERING AND ANDREW MCMILLAN
Abstract.
The auxiliary function method allows computation of extremal long-time averages of functions of dynamical variables in autonomous nonlinear ordinary differential equations via convex optimization. For dynamical systems defined by autonomous polynomial vector fields, it is operationally realized as a semidefinite program utilizing sum of squares technology. In this contribution we review the method and extend it for application to periodically driven non-autonomous nonlinear vector fields involving trigonometric functions of the dynamical variables. The damped driven Duffing oscillator and periodically driven pendulum are presented as examples to illustrate the auxiliary function method's utility.

1. Introduction
Dynamical systems governed by ordinary differential equations (ODEs) can have complex global attractors containing complicated and chaotic solutions. The primary interest in such cases is often on statistics of solutions, e.g., long-time averages of functions of the dynamical variables. Averages along trajectories generally depend on initial conditions, and it is natural to seek the largest or smallest such averages among all solutions, as well as the extremal trajectories that realize them. Moreover, for various purposes, notably control of chaos [21, 10, 30], it is valuable to know extremal trajectories regardless of their stability.

The brute-force approach to searching for extremal time averages is to construct a large number of candidate trajectories, which is ad hoc, computationally expensive, and operationally limited to sufficiently stable solutions. An alternative approach that is broadly applicable and often more tractable is to construct sharp a priori bounds on long-time averages via convex optimization. In this paper we review a mathematical device that has come to be known as the auxiliary function method [3, 5, 27] and its computational implementation, and describe some new developments to generalize its applicability. To illustrate the method's utility we apply the tools to the damped driven Duffing oscillator and periodically driven pendulum.
Mathematics Subject Classification.
Key words and phrases. dynamical systems, ergodic optimization, polynomial optimization, semidefinite programming, sum-of-squares optimization.
To introduce the auxiliary function method we focus on determining upper bounds for time averages of functions of the dynamical variable for autonomous ODEs; lower bounds are analogous. Consider x(t) ∈ R^d satisfying

(1.1)  dx/dt = f(x)

for continuously differentiable vector fields f(x). When there is no confusion, we will denote the vector components of x(t) and f(x) as x_i(t) and f_i(x), respectively.

Given a quantity of interest Φ(x), define its long-time average along the trajectory x(t) with x(0) = x₀ by

(1.2)  Φ̄(x₀) = limsup_{T→∞} (1/T) ∫₀^T Φ(x(t)) dt.

The choice of Φ(x) is subject to the particular application in mind. Let B ⊂ R^d be a compact invariant region in the phase space. In a dissipative system B could be an absorbing compact set, or in a conservative system B could be defined by constraints on dynamical invariants. We are interested in the maximal long-time average among all trajectories (eventually) remaining in B, i.e.,

(1.3)  Φ* = max_{x₀ ∈ B} Φ̄(x₀).

The fundamental questions are: what is the value of Φ* and what trajectories attain it?

Upper bounds on averages can be deduced using the fact that time derivatives of bounded functions average to zero. This elementary observation follows from the fact that for every V(x) ∈ C¹(B), the set of continuously differentiable functions on B, we have

(1.4)  0 = limsup_{T→∞} [V(x(T)) − V(x(0))]/T = avg of (d/dt)V(x(·)) = avg of f(x(·)) · ∇V(x(·)).

We hereafter refer to such V(x) ∈ C¹(B) as "auxiliary" functions. Note that equation (1.4) holds for any auxiliary function, so there is an infinite family of functions with the same time average as Φ(x). In particular,

(1.5)  Φ̄(x₀) = [Φ + f · ∇V]‾(x₀).

For any auxiliary function one can obtain a trivial upper bound on Φ̄(x₀) by bounding the right-hand side point-wise on B and subsequently maximizing the left-hand side over initial data x₀:

(1.6)  Φ* ≤ max_{x ∈ B} [Φ(x) + f(x) · ∇V(x)].
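The dependence of the long-time average (1.2) on initial data can be seen in a minimal numerical sketch. The harmonic oscillator used here (with Φ = x²) is an illustrative choice, not an example from the text; its trajectories are circles of radius r, along which the average of x² is r²/2.

```python
import numpy as np

def rk4(f, x, dt):
    # one classical Runge-Kutta step
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def time_average(f, phi, x0, T=200.0, dt=0.01):
    # approximate (1/T) * integral_0^T phi(x(t)) dt along the trajectory,
    # as in (1.2) with a large but finite T
    x, n = np.asarray(x0, float), int(T / dt)
    total = 0.0
    for _ in range(n):
        total += phi(x)
        x = rk4(f, x, dt)
    return total / n

# harmonic oscillator: dx/dt = y, dy/dt = -x; trajectories are circles
osc = lambda s: np.array([s[1], -s[0]])
phi = lambda s: s[0] ** 2            # quantity of interest Phi = x^2

avg1 = time_average(osc, phi, [1.0, 0.0])   # radius 1 -> average 1/2
avg2 = time_average(osc, phi, [2.0, 0.0])   # radius 2 -> average 2
```

Different initial conditions yield different averages, which is precisely why the maximization over x₀ in (1.3) is nontrivial.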
The best such a priori upper bound on Φ* is then

(1.7)  Φ* ≤ inf_{V ∈ C¹(B)} max_{x ∈ B} [Φ(x) + f(x) · ∇V(x)].
The minimization over the right-hand side of (1.7) is a convex optimization in the auxiliary function V. Indeed, define the functional

(1.8)  F(V) = max_{x ∈ B} [Φ(x) + f(x) · ∇V(x)],

insert a convex combination of auxiliary functions, and apply the triangle inequality to deduce

F(λV₁ + (1−λ)V₂) = max_{x ∈ B} [Φ(x) + f(x) · ∇(λV₁(x) + (1−λ)V₂(x))]
  = max_{x ∈ B} [λ{Φ(x) + f(x) · ∇V₁(x)} + (1−λ){Φ(x) + f(x) · ∇V₂(x)}]
  ≤ λ max_{x ∈ B} [Φ(x) + f(x) · ∇V₁(x)] + (1−λ) max_{x ∈ B} [Φ(x) + f(x) · ∇V₂(x)]
  = λF(V₁) + (1−λ)F(V₂).

The remarkable fact is that the inequality in (1.7) is actually an equality:

(1.9)  Φ* = inf_{V ∈ C¹(B)} max_{x ∈ B} [Φ(x) + f(x) · ∇V(x)].

Details of the proof can be found elsewhere [27] but we sketch it here in four lines for completeness:

max_{x₀ ∈ B} Φ̄ = max_{μ ∈ Pr(B), μ invariant} ∫ Φ dμ
  = sup_{μ ∈ Pr(B)} inf_{V ∈ C¹(B)} ∫ (Φ + f · ∇V) dμ
  = inf_{V ∈ C¹(B)} sup_{μ ∈ Pr(B)} ∫ (Φ + f · ∇V) dμ
  = inf_{V ∈ C¹(B)} max_{x ∈ B} [Φ(x) + f(x) · ∇V(x)].

The key observations above are (i) that time averages can be realized as phase space averages against invariant measures, (ii) maximizing over invariant probability measures can be realized as a Lagrange multiplier problem where ∫ f · ∇V dμ = 0 for all V ensures μ is invariant, (iii) swapping the order of supremum and infimum can be performed due to standard abstract min-max theorems, and (iv) sup_{μ ∈ Pr(B)} ∫ (Φ + f · ∇V) dμ is realized by a delta-mass located where Φ(x) + f(x) · ∇V(x) assumes its maximum.

Thus arbitrarily sharp bounds on the maximal long-time average are available via convex optimization over auxiliary functions. Optimal or sequences of near-optimal V produce optimal or sequences of increasingly near-optimal bounds.
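The convexity of the functional F(V) defined in (1.8) can be checked numerically on a grid. The system f(x) = x − x³ on B = [−2, 2] with Φ = x² and the two auxiliary functions V₁ = x²/2, V₂ = x⁴/4 are illustrative choices made here for the sketch:

```python
import numpy as np

# grid over the compact set B = [-2, 2] (illustrative choice)
xs = np.linspace(-2.0, 2.0, 4001)
f   = xs - xs**3          # vector field f(x) = x - x^3 on the grid
phi = xs**2               # quantity of interest Phi(x) = x^2

def F(grad_V):
    # F(V) = max_{x in B} [Phi(x) + f(x) V'(x)], evaluated on the grid
    return np.max(phi + f * grad_V)

gradV1 = xs               # V1 = x^2/2, so V1' = x
gradV2 = xs**3            # V2 = x^4/4, so V2' = x^3

# convexity: F(lam*V1 + (1-lam)*V2) <= lam*F(V1) + (1-lam)*F(V2)
for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    mix = F(lam * gradV1 + (1 - lam) * gradV2)
    assert mix <= lam * F(gradV1) + (1 - lam) * F(gradV2) + 1e-12
```

The inequality follows because the pointwise maximum of a sum never exceeds the sum of the pointwise maxima, which is exactly the step used in the displayed derivation.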
Moreover, if V ∈ C¹(B) is an optimal auxiliary function, then it is straightforward to see that the corresponding optimal trajectory or trajectories reside in the subset of B where the continuous function Φ(x) + f(x) · ∇V(x) = Φ*. Likewise, if V is just near-optimal, then corresponding near-optimal trajectories spend a significant fraction of time in high altitude level sets of Φ(x) + f(x) · ∇V(x). Either way, the auxiliary function approach can be used to localize extremal trajectories in the phase space; see [27] for further details and an example application to the Lorenz equations.

On the surface the minimization over auxiliary functions in (1.9) seems computationally intractable, as the optimization must be performed over an infinite dimensional function space. However, there are two key observations to be made. The first is that (1.3) is equivalent to finding

(1.10)  min U  s.t.  Φ̄(x₀) ≤ U  ∀ x₀ ∈ B,

and a sufficient condition for the constraint is that Φ(x) + f(x) · ∇V(x) ≤ U for all x ∈ B, because a global point-wise constraint obviously implies a global average constraint. Therefore, (1.10) can be replaced with a minimization subject to a point-wise non-negativity constraint,

(1.11)  min U  s.t.  U − Φ(x) − f(x) · ∇V(x) ≥ 0  ∀ x ∈ B.

As stated, the problem reduces to determining the non-negativity of a given multivariate function. Unfortunately determining the non-negativity of multivariate functions is NP-hard [20], but in Section 2 we will formulate the problem as a semidefinite program and perform suitable and somewhat natural relaxations to make the problem computationally accessible.

Meanwhile it is important to recognize that interest in extreme time averages is not new.
In the abstract dynamical systems community it goes under the name "ergodic optimization" [12, 1] and was motivated in part by conjectures late last century that many quantities of interest in applications for chaotic dynamical systems are optimized, in a time averaged sense, on (relatively) simple unstable periodic orbits [10]. Those conjectures, in turn, underlie "control of chaos" notions [21] that emerged earlier.

Rather than developing theoretical or quantitative computational tools to evaluate extreme time averages, however, the ergodic optimization field focused on more conceptual questions resulting in theorems such as that every ergodic measure is the unique maximizing measure for some continuous function. (In our setting this is the statement that for every initial condition x₀ ∈ B, Φ̄(x₀) = Φ* for some continuous function Φ.) The ergodic optimization community recognized the variational structure reflected in (1.9) and, given complete knowledge of the flow map (which takes x₀ to x(t) along the ensuing trajectory for each time t > 0), proposed a strategy to produce a sequence of increasingly near-optimal auxiliary functions [2]. In Section 3 we offer an elementary example to explicate both the ideas discussed in this section and those developed in the immediately following Section 2.

2. Semidefinite Programming, Sum of Squares Technology, and Polynomial Dynamics
2.1. Semidefinite Programming.
Computing upper and lower bounds on the quantity of interest, Φ̄, can be simplified to a convex optimization problem over a finite dimensional vector space of polynomials in a Semidefinite Program (SDP) under suitable relaxations. In general, a SDP takes C, A_i ∈ R^{n×n} and b ∈ R^m for i ∈ {1, 2, ..., m} as inputs with the goal of determining

(2.1)  min_{X ∈ R^{n×n}} ⟨C, X⟩  s.t.  ⟨A_i, X⟩ = b_i for i ∈ {1, 2, ..., m} and X ⪰ 0,

where, for B, D ∈ R^{n×n}, ⟨B, D⟩ = Σ_{i=1}^n Σ_{j=1}^n b_{ij} d_{ij}, and X ⪰ 0 means X is positive semi-definite. Equation (2.1) is frequently called the primal problem of a SDP, which also has a dual problem of the form

(2.2)  max_{y ∈ R^m} ⟨b, y⟩  s.t.  C ⪰ Σ_{i=1}^m y_i A_i,

where P ⪰ Q means P − Q ⪰ 0. If the solutions to both the primal and dual problem are the same, then we say that the SDP has a duality gap of zero. See [28] for a review of SDPs and their applications.

An important class of SDPs are polynomial optimization problems. In particular, one is frequently interested in optimizing a multivariate polynomial subject to a set of non-negativity constraints. If the polynomial to optimize is p(x) with constraints g_i(x) ≥ 0 for i ∈ {1, 2, ..., m} and x ∈ R^d, the problem is of the form

(2.3)  min_{x ∈ R^d} p(x)  s.t.  g_i(x) ≥ 0 for i ∈ {1, 2, ..., m}.

In Section 2.2 we will see how problems of the form (2.3) may be written as a SDP just as in (2.1). SDPs are now somewhat standard and easily implementable, as there is a wide array of software to solve well-posed problems of the form (2.1) and (2.2). One needs only a parser, such as YALMIP, SOSTOOLS, or GloptiPoly, and a semi-definite program solver, such as Mosek, SeDuMi, or SCS. Many of these are implementable in a standard Matlab toolbox. Computations in this paper were performed using YALMIP paired with Mosek.
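A hand-checkable toy instance makes the primal-dual structure of (2.1) and (2.2) concrete. The instance below (C = A₁ = I, b₁ = 1, so both optimal values equal 1) is an assumption chosen for illustration; it uses plain numpy feasibility checks rather than the Matlab toolchain named above.

```python
import numpy as np

# toy SDP: min <C, X> s.t. <A1, X> = b1, X >= 0 (PSD), with
# C = A1 = I (2x2) and b1 = 1; the optimal value of both problems is 1
C  = np.eye(2)
A1 = np.eye(2)
b1 = 1.0

inner  = lambda P, Q: float(np.sum(P * Q))            # <P,Q> = sum_ij p_ij q_ij
is_psd = lambda P: np.linalg.eigvalsh(P).min() >= -1e-12

X = 0.5 * np.eye(2)                  # a primal feasible point
assert abs(inner(A1, X) - b1) < 1e-12 and is_psd(X)
primal_val = inner(C, X)             # primal objective value

y = 1.0                              # dual feasible: C - y*A1 = 0 is PSD
assert is_psd(C - y * A1)
dual_val = b1 * y                    # dual objective value

# weak duality: any dual feasible value lower-bounds any primal feasible value;
# here the two coincide, so the duality gap is zero
assert dual_val <= primal_val + 1e-12
gap = primal_val - dual_val
```

In practice a solver such as Mosek searches over X and y simultaneously; the zero duality gap certifies optimality of both.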
2.2. Positivity of Polynomials.
2.2.1. Global Positivity.
Differential equations for many applications are purely polynomial in their arguments. That is, one is frequently interested in ẋ = f(x) for f_i(x) ∈ R[x], where R[x] is the vector space of all polynomials over x. If f_i(x) is polynomial and V(x) is restricted to R[x], then the constraint in (1.11) simplifies to determining whether a multivariate polynomial is non-negative. Determining the non-negativity of a multivariate polynomial is still NP-hard, however, except for an extremely limited set of examples such as univariate or quadratic polynomials [20]. The key observation is that determining the stronger condition that the polynomial is a Sum of Squares (SOS) of polynomials is a problem that can be solved in polynomial time [23, 24].
Definition 2.1.
A polynomial p(x) ∈ R[x] is a Sum of Squares if there is a finite collection of polynomials p_i(x) ∈ R[x] such that p(x) = Σ_{i=1}^N [p_i(x)]². We denote S[x] and S_d[x] as the cones of all SOS polynomials and all SOS polynomials up to degree d, respectively.

This sum of squares condition is of course sufficient for the non-negativity of a polynomial, and it is necessary if the polynomials in question are up to quadratic [9], but in general being SOS is not equivalent to non-negativity. Therefore, one might be concerned that the proposed strengthening, of going from a non-negative polynomial to one with a SOS representation, has given up too much. However, there is a wonderful result [16] which states that SOS polynomials are dense in the space of non-negative, real polynomials of arbitrary degree and of arbitrary dimension in the ℓ¹ norm of the polynomial's coefficients. There is quite a rich history in determining whether a polynomial, or more generally a rational function, can be written as a SOS or a sum of rational functions with square numerators and denominators, dating back to Hilbert's 17th problem; see [19] for a historical review.
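The gap between non-negativity and the SOS property has a classical witness: the Motzkin polynomial is non-negative on all of R² (by the arithmetic-geometric mean inequality) yet admits no SOS representation. It is not an example from this text, but a quick numerical sanity check of its non-negativity is easy:

```python
import numpy as np

def motzkin(x, y):
    # Motzkin polynomial: nonnegative on R^2 (by AM-GM applied to
    # x^4 y^2, x^2 y^4, and 1) but provably not a sum of squares
    return x**4 * y**2 + x**2 * y**4 - 3 * x**2 * y**2 + 1

rng = np.random.default_rng(0)
pts = rng.uniform(-5, 5, size=(100000, 2))
vals = motzkin(pts[:, 0], pts[:, 1])

min_val = vals.min()     # never negative; the minimum 0 occurs at |x| = |y| = 1
assert min_val >= 0.0
assert abs(motzkin(1.0, 1.0)) < 1e-12
```

Random sampling cannot prove non-negativity, of course; it only illustrates why certifying it is the hard part that the SOS relaxation sidesteps.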
Fortunately for applications there is a simple yet computationally useful result about representations of SOS polynomials:

Theorem 2.2. Given a multivariate polynomial p(x) in n variables and of degree 2d, p(x) is representable as a sum of squares if and only if there exists a positive semi-definite and symmetric matrix Q such that p(x) = z(x)^T Q z(x), where z(x) = [1, x₁, x₂, ..., x_n, x₁x₂, ..., x_n^d] is the vector of monomials of degree at most d.

Proof. The "if" is evident. Conversely, suppose that p(x) has a sum of squares representation p(x) = Σ_i q_i(x)², and write each q_i(x) = a_i^T z(x). Then

p(x) = Σ_i q_i(x)² = Σ_i (z^T(x) a_i)(a_i^T z(x)) = z^T(x) (Σ_i a_i a_i^T) z(x) = z^T(x) Q z(x),

and Q = Σ_i a_i a_i^T is symmetric and positive semi-definite. □
Therefore, determining whether an even degree, non-negative polynomial is a SOS is equivalent to finding a positive semi-definite and symmetric matrix Q such that

(2.4)  p(x) = z(x)^T Q z(x) ≥ 0,

where z(x) is a suitable polynomial basis. We compute an example for demonstration purposes: suppose we wish to represent f(x, y) = 2x⁴ + 5y⁴ + x²y², a SOS polynomial, in the form of (2.4). Write

(2.5)  f(x, y) = 2x⁴ + 5y⁴ + x²y² = [x², y², xy] [[q₁₁, q₁₂, q₁₃], [q₁₂, q₂₂, q₂₃], [q₁₃, q₂₃, q₃₃]] [x², y², xy]^T
     = q₁₁x⁴ + q₂₂y⁴ + (q₃₃ + 2q₁₂)x²y² + 2q₁₃x³y + 2q₂₃xy³.

Equating coefficients we find

(2.6)  q₁₁ = 2,  q₂₂ = 5,  q₃₃ + 2q₁₂ = 1,  q₁₃ = 0,  q₂₃ = 0,

so that the matrix is positive semi-definite for −√10 ≤ q₁₂ ≤ 1/2 with q₃₃ = 1 − 2q₁₂.
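The one-parameter family of Gram matrices produced by (2.6) can be verified directly: every choice of q₁₂ reproduces f(x, y) identically, but only the stated range yields a positive semi-definite Q. A minimal numerical check:

```python
import numpy as np

def Q(q12):
    # one-parameter family of Gram matrices for f(x,y) = 2x^4 + 5y^4 + x^2 y^2
    # in the monomial basis z = [x^2, y^2, xy], following equations (2.6)
    return np.array([[2.0, q12, 0.0],
                     [q12, 5.0, 0.0],
                     [0.0, 0.0, 1.0 - 2.0 * q12]])

f = lambda x, y: 2 * x**4 + 5 * y**4 + x**2 * y**2

rng = np.random.default_rng(1)
for q12 in (-np.sqrt(10), -1.0, 0.0, 0.5):
    M = Q(q12)
    # any q12 reproduces f(x,y) = z^T Q z identically ...
    for x, y in rng.uniform(-2, 2, size=(20, 2)):
        z = np.array([x**2, y**2, x * y])
        assert abs(z @ M @ z - f(x, y)) < 1e-9
    # ... but Q is positive semi-definite only for -sqrt(10) <= q12 <= 1/2
    assert np.linalg.eigvalsh(M).min() >= -1e-9

assert np.linalg.eigvalsh(Q(1.0)).min() < 0   # outside the range: not PSD
```

This is the structure an SDP solver exploits: the affine coefficient constraints (2.6) together with Q ⪰ 0 form exactly a feasibility problem of type (2.1).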
2.2.2. Local Positivity.

The SOS criterion in (2.4) is a global condition insofar as it insists that our desired polynomial is non-negative for all x ∈ R^d. But frequently one is satisfied with local positivity of a polynomial, and this was the original formulation of the problem in (1.11). Due to the robustness of characterizing regions in phase space with polynomials, we can restrict our attention to locality constraints defined in terms of only polynomials.
A set K is called semi-algebraic if K is defined by finitely many polynomial equalities or inequalities. A prototypical example of such K is

(2.7)  K := {x ∈ R^d | g_i(x) ≤ 0, h_j(x) = 0 for i = 1, ..., m and j = 1, ..., n},

where g_i(x), h_j(x) ∈ R[x].

We now look to enforce equation (2.4) under the strengthened constraint

(2.8)  p(x) = z(x)^T Q z(x) ≥ 0  ∀ x ∈ semi-algebraic set K.

One way of viewing the localized constraint of being within K is to say that for all x ∈ K

(2.9)  Σ_{j=1}^n h_j(x) r_j(x) = 0  ∀ r_j(x) ∈ R[x]  and  Σ_{i=1}^m g_i(x) s_i(x) ≤ 0  ∀ s_i(x) ∈ R₊[x],

where R₊[x] is the positive cone in R[x]. This can be realized by writing (2.8) as

(2.10)  Find r₁, ..., r_n and s₁, ..., s_m
        s.t.  p(x) + Σ_{i=1}^n h_i(x) r_i(x) + Σ_{j=1}^m g_j(x) s_j(x) ≥ 0,
              s₁, ..., s_m ∈ S[x],

where we have replaced the positive cone condition with being representable as a SOS. Equation (2.10) is frequently called the S-Procedure, where the "S" comes from the SOS constraints on the s_i polynomials.

2.2.3. Sum of Squares in Dynamical Systems.
Returning to (1.11), replacing positivity of U − Φ(x) − f(x) · ∇V on all of R^d by membership in the SOS cone allows us to relax the problem to

(2.11)  min U  s.t.  U − Φ(x) − f(x) · ∇V(x) ∈ S[x].

Moreover, if we wish to ensure that the polynomial in question is only locally positive on the compact set B ⊂ semi-algebraic set K defined as in (2.7), we can augment (2.11) via the S-procedure:

(2.12)  min U  s.t.  U − Φ(x) − f(x) · ∇V(x) + Σ_{i=1}^m h_i(x) r_i(x) + Σ_{j=1}^n g_j(x) s_j(x) ∈ S[x]
        and s₁(x), ..., s_n(x) ∈ S[x].

A few key remarks are required here. The polynomials f_i(x) are exogenously given as part of the dynamical system in question, but there are choices to be made for the polynomials Φ(x) and V(x). Φ(x) is chosen according to the particular application in mind. However, upon further inspecting the programs from a computational perspective, it turns out that the resulting U is generally very sensitive to the choice of V(x). In particular the degree of V(x) is pertinent, and the reason is twofold.

Firstly, if the degree of V(x) is too small then the SOS constraint may fail to be feasible even within a reasonable tolerance for numerical error. Secondly, the resulting U may fail to be a sharp upper bound for Φ̄. The restriction that V(x) is polynomial is completely absent in (1.10) as well as (1.11), the original problem and its slight strengthening, so it is unreasonable to expect that sharp bounds can be achieved by restricting to the space of polynomials. Fortunately, though, there is the following result [15]:
Suppose that K is a compact, semi-algebraic set defined in terms of {g_i}_{i=1}^m. Let s = max_i deg(g_i), r = deg(U − Φ − f · ∇V), and let Γ_d denote the set of polynomials that are a weighted sum of the g_i's, where the weights are SOS polynomials of degree no more than r − s. If there exists L such that L − ||x||² ∈ Γ_d for some d, then

Φ* = lim_{d→∞} inf_{U ∈ R, V ∈ R_d[x]} {U | U − Φ(x) − f(x) · ∇V(x) ∈ Γ_d}.

Therefore, by taking the polynomial degree of our auxiliary function to infinity we are guaranteed to achieve the sharp bounds of the theoretical formalism in Section 1, i.e., in theory we have lost nothing in restricting V to being polynomial. Coupled with Lasserre's density result [16], it is operationally reasonable to restrict our attention to polynomial auxiliary functions as well as positive polynomials with a sum of squares representation.

In practice, one incrementally increases the allowed degree of V, and the bounds are declared sharp if increasing the degree only yields small (near numerical precision) improvements in the bounds U. In practice sharp bounds may be achieved for auxiliary functions of relatively small degree, say around degree 8 or 10, that are computationally accessible on a standard laptop for systems with relatively low degrees of freedom.

3. A Simple Example
To illustrate the ideas introduced above in the context of a concrete example, consider the one dimensional polynomial dynamical system

(3.1)  dx/dt = x − x³ = f(x)

and quantity of interest

(3.2)  Φ(x) = x².

For our purposes (3.1) possesses three classes of solutions corresponding to three classes of initial data:

(3.3)  x(t) → −1 for −∞ < x₀ < 0,  x(t) → 0 for x₀ = 0,  and  x(t) → +1 for 0 < x₀ < ∞.

Therefore Φ̄(0) = 0 and Φ̄(x₀) = 1 for all x₀ ≠ 0, so that Φ* = 1. But how might one discern this within the auxiliary function formulation?

In this example it is easy to divine an optimal polynomial auxiliary function, namely V(x) = x²/2. Indeed,

(3.4)  Φ + fV′ = x² + (x − x³)x = 2x² − x⁴ = 1 − (x + 1)²(x − 1)².

That is, for this optimal auxiliary function Φ(x) + f(x)V′(x) = Φ* − S(x) where S(x) is a (sum of) square(s) of polynomials. Moreover, for this particular Φ(x) and optimal auxiliary function V(x), the quantity Φ(x) + f(x)V′(x) achieves its pointwise maximum Φ* precisely, and only, at x = ±1, points that are both optimal initial conditions (but not uniquely so; any x₀ ≠ 0 is optimal) and optimal trajectories such that every neighborhood thereof hosts every optimal trajectory 100% of the time over the infinite time interval of averaging.

Given our quantitative analytical knowledge of the flow map for this simple example, however, by relaxing the polynomial restriction we can also conceive a sequence of near-optimal auxiliary functions V_ε ∈ C¹(R) so that lim_{ε→0} {Φ(x) + f(x)V′_ε(x)} = Φ* for every x ≠ 0. Indeed, for every ε > 0 let

(3.5)  V_ε(x) = (1/2) ln(x² + ε²)

so that

(3.6)  Φ(x) + f(x)V′_ε(x) = x² + (x − x³) x/(x² + ε²) = (1 + ε²) x²/(x² + ε²).

Then it is evident that V_ε is an increasingly near-optimal sequence of auxiliary functions in the sense that inf_{ε>0} sup_x {Φ(x) + f(x)V′_ε(x)} = Φ*, and furthermore, as advertised, lim_{ε→0} {Φ(x) + f(x)V′_ε(x)} = Φ* for every x ≠ 0. Note as well that Φ(x) + f(x) × [lim_{ε→0} V′_ε(x)] = Φ* for every x ≠ 0 even though the limit of the sequence V_ε, i.e., ln|x|, is not C¹.

But even more is true about this sequence of auxiliary functions: for every x ∈ R

(3.7)  lim_{ε→0} {Φ(x) + f(x)V′_ε(x)} = Φ̄(x).

That is, Φ(x) + f(x)V′_ε(x) is a sequence of functions such that its limit at each point in the phase space yields the infinite time average of Φ(·) along the trajectories passing through that point.

While this impressive feature of the sequence V_ε(x) is apparent in this simple example, such sequences of increasingly near-optimal auxiliary functions V_ε(x) also exist more generally for well behaved dx/dt = f(x) defined by sufficiently smooth vector fields f.
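The claims of this example are easy to confirm numerically: trajectories of (3.1) with x₀ ≠ 0 have Φ̄ = 1, and with V(x) = x²/2 the combination Φ + fV′ is bounded above by Φ* = 1. A verification sketch (the integration times and tolerances are ad hoc choices):

```python
import numpy as np

f   = lambda x: x - x**3        # the vector field (3.1)
phi = lambda x: x**2            # the quantity of interest (3.2)

def avg_phi(x0, T=500.0, dt=0.01):
    # time average of x^2 along a trajectory of (3.1), via RK4
    x, n, total = x0, int(T / dt), 0.0
    for _ in range(n):
        total += phi(x)
        k1 = f(x); k2 = f(x + 0.5*dt*k1); k3 = f(x + 0.5*dt*k2); k4 = f(x + dt*k3)
        x += dt/6 * (k1 + 2*k2 + 2*k3 + k4)
    return total / n

# any x0 != 0 approaches a fixed point at +1 or -1, so the average tends to 1
assert abs(avg_phi(0.5) - 1.0) < 0.01
assert abs(avg_phi(-3.0) - 1.0) < 0.01
assert avg_phi(0.0) == 0.0

# with the optimal auxiliary function V(x) = x^2/2 (so V'(x) = x),
# Phi + f V' = 2x^2 - x^4 = 1 - (x^2 - 1)^2 <= 1 = Phi* everywhere
xs = np.linspace(-2, 2, 2001)
bound = phi(xs) + f(xs) * xs
assert bound.max() <= 1.0 + 1e-12 and abs(bound.max() - 1.0) < 1e-6
```

The pointwise maximum of the bound is attained exactly at x = ±1, the locations of the extremal trajectories, illustrating the localization property discussed in Section 1.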
If we could deduce these sequences then we could bypass the dynamics altogether to estimate and evaluate long time averages Φ̄(x₀) along all trajectories. But, alas, as of now construction of such sequences requires explicit knowledge of the flow map, i.e., complete access to all information about all trajectories [2, 1], so this approach is essentially tautological in an operational sense. In practice at the present time we are limited to the variational methods described in Section 2 to effectively compute sequences of increasingly optimal auxiliary functions.

4. Application to Non-Autonomous and Non-Polynomial Systems
The theoretical formalism and computational implementation via SDP described in Sections 1 and 2 depend very much on, respectively, the autonomous nature of the dynamics and the polynomial nature of the equations of motion. But models in many applications involve non-autonomous, i.e., driven systems, and non-polynomial vector fields. Therefore, it is useful to consider how broader classes of ODEs might be recast as autonomous polynomial systems.

Periodically forced dynamics of the form

(4.1)  dx(t)/dt = f(x, cos(ωt), sin(ωt))

with x = (x₁, x₂, ..., x_d) are particularly interesting and ubiquitous. The traditional way of "autonomizing" such systems, introducing a new coordinate x_{d+1} = t and extending the system dimension from d to d + 1, has the drawbacks, however, of introducing an unbounded dependent variable while retaining non-polynomial dependence on it.

For our purposes these problems can be circumvented by introducing two new dynamical variables satisfying the polynomial sub-system

(4.2)  dx_{d+1}/dt = −ωx_{d+2} + (1 − x_{d+1}² − x_{d+2}²) x_{d+1},
       dx_{d+2}/dt = ωx_{d+1} + (1 − x_{d+1}² − x_{d+2}²) x_{d+2}.

After a uniform-in-initial-condition exponentially decaying transient, x_{d+1} = cos(ωt + φ) and x_{d+2} = sin(ωt + φ) with arbitrary phase φ. Insofar as we are ultimately interested in extreme long time behavior among all initial data, however, the phase is irrelevant: φ ≠ 0 corresponds to a translation of the time origin which can be absorbed into a shift in initial conditions.

Note as well that x_{d+1} = x_{d+2} ≡ 0 is an invariant solution of (4.2), so care must be taken when extremizing a quantity of interest Φ(x) over all initial conditions. In practice appearance of this "spurious" solution may be obviated theoretically by adding appropriate multiples of x_{d+1} and/or x_{d+2} to Φ, or computationally by implementing the S-procedure.

This approach can also be used to formulate equivalent autonomous polynomial dynamics for both quasiperiodic and substantially more complex 2π/ω-periodic time dependences in the vector field.
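The behavior of the auxiliary pair (4.2) is straightforward to verify numerically: off-circle initial data relax onto the unit circle rotating at rate ω, while the origin is the invariant "spurious" solution. A sketch (ω = 2 and the integration settings are illustrative choices):

```python
import numpy as np

omega = 2.0   # driving frequency (illustrative value)

def rhs(s):
    # the polynomial sub-system (4.2): the unit circle, traversed at angular
    # rate omega, is the attractor of the radial normal form (1 - r^2) r
    x1, x2 = s
    r2 = x1**2 + x2**2
    return np.array([-omega * x2 + (1 - r2) * x1,
                      omega * x1 + (1 - r2) * x2])

def integrate(s0, T=20.0, dt=0.002):
    s = np.asarray(s0, float)
    for _ in range(int(T / dt)):
        k1 = rhs(s); k2 = rhs(s + 0.5*dt*k1)
        k3 = rhs(s + 0.5*dt*k2); k4 = rhs(s + dt*k3)
        s = s + dt/6 * (k1 + 2*k2 + 2*k3 + k4)
    return s

# off-circle initial data converge to the circle: (x1, x2) -> (cos, sin)
for s0 in ([0.2, 0.0], [3.0, -1.0]):
    s = integrate(s0)
    assert abs(np.hypot(s[0], s[1]) - 1.0) < 1e-6

# ... except the invariant "spurious" solution at the origin
assert np.allclose(integrate([0.0, 0.0]), [0.0, 0.0])
```

The exponential decay of the transient is uniform in the initial condition off the origin, which is what makes the substitution x_{d+1} → cos(ωt + φ), x_{d+2} → sin(ωt + φ) valid for long-time averages.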
Employing a new pair of dynamical variables like those in (4.2) for each independent frequency allows for quasiperiodic time dependence, at least for quasiperiodicity involving only a finite number of independent frequencies. Other 2π/ω-periodic time functions can be expressed as finite linear combinations of cos(nωt) and sin(nωt), each of which in turn is a finite polynomial combination of cos(ωt) and sin(ωt). The overall order of the dynamical system necessarily increases, but autonomous polynomial dynamics are still sufficient to capture the systems' dynamics.

A broad class of autonomous vector fields with trigonometric variable dependence can be handled similarly [22]. Consider, for example, vector fields f(x) where the components f₁, ..., f_d depend polynomially on x_j for j ≠ i and on x_i via cos x_i and/or sin x_i but not on x_i itself, i.e.,

(4.3)  f_j = f_j(x₁, ..., x_{i−1}, cos x_i, sin x_i, x_{i+1}, ..., x_d) for each j = 1, ..., d.

For notational simplicity let us denote the "angular" variable x_i(t) = θ(t) and the corresponding component of the vector field

(4.4)  f_i = Ω(x₁, ..., x_{i−1}, cos θ, sin θ, x_{i+1}, ..., x_d).

Then augment the system with two new variables evolving according to

(4.5)  dx_{d+1}/dt = −Ω x_{d+2} + (1 − x_{d+1}² − x_{d+2}²) x_{d+1},
       dx_{d+2}/dt = Ω x_{d+1} + (1 − x_{d+1}² − x_{d+2}²) x_{d+2}.

After transients,

(4.6)  x_{d+1}(t) = cos(∫₀^t Ω ds + θ₀)  and  x_{d+2}(t) = sin(∫₀^t Ω ds + θ₀),

where θ₀ is determined by initial data.

The claim now is that solutions of the original d-dimensional system

(4.7)  dx_k/dt = f_k(x₁, ..., x_{i−1}, cos x_i, sin x_i, x_{i+1}, ..., x_d) for k = 1, ..., d

are in 1-to-1 correspondence with solutions of the (d+1)-dimensional system consisting of (4.5) and the remaining d − 1 equations

(4.8)  dx_j/dt = f_j(x₁, ..., x_{i−1}, x_{d+1}, x_{d+2}, x_{i+1}, ..., x_d) for j = 1, ..., i−1, i+1, ..., d.

The new system does not involve the original x_i variable, which evolves passively via dx_i/dt = Ω(x₁, ..., x_{i−1}, x_{d+1}, x_{d+2}, x_{i+1}, ..., x_d), the right-hand side of which does not depend on x_i. Variation in the arbitrary phase θ₀ in (4.6) corresponds to a translation of the initial condition for the eliminated x_i variable, and this does not matter when we are concerned with functions of interest Φ that depend only on x₁, ..., x_{i−1}, x_{d+1}, x_{d+2}, x_{i+1}, ..., x_d extremized over trajectories.

We remark that the (1 − x_{d+1}² − x_{d+2}²)x_{d+1} and (1 − x_{d+1}² − x_{d+2}²)x_{d+2} terms in both (4.2) and (4.5), enforcing amplitude constraints, may be dropped and replaced with the S-procedure to constrain x_{d+1} and x_{d+2} to circles in their subspace of the phase space. In the following subsections we illustrate these approaches and their robustness by converting the periodically forced Duffing equation and the damped driven pendulum into autonomous polynomial form and applying the SOS/SDP technologies.
4.1. The Periodically Driven Duffing Equation.
The damped driven Duffing system is the non-autonomous second order nonlinear ODE

(4.9)  ẍ + δẋ + αx + βx³ = F cos(ωt).

It has received widespread attention for its various engineering applications, as a simple paradigmatic model that displays dynamical hysteresis, and for exhibiting chaotic behavior [14].

The Harmonic Balance method produces 2π/ω-periodic approximate solutions via insertion of the ansatz

(4.10)  x(t) = A cos(ωt) + B sin(ωt)

into (4.9) and projecting onto cos(ωt) and sin(ωt). Harmonic Balance yields an implicit prediction for the frequency response curve in the form

(4.11)  [(ω² − α − (3/4)βR²)² + (δω)²] R² − F² = 0,

where R = √(A² + B²). For fixed parameters α, β, F, and δ, one can solve for the roots of (4.11) to deduce the oscillation amplitude R. When α > 0 the frequency response curve may be single valued or, for sufficiently strong nonlinearity β, multiple valued; see Figure 1.

Figure 1.
Harmonic Balance approximate mean amplitude R = √(A² + B²) vs. driving frequency ω with fixed δ, α = 1, F = 1, and β = 0.04, 0.06, 0.09.

The Duffing equation (4.9) is not of the form (1.1), so we proceed by augmenting it with two additional variables to make the system autonomous. It is then realized as the 4-dimensional first order system

(4.12)  ẋ = y,  ẏ = z₂ − δy − αx − βx³,  ż₁ = ωz₂,  ż₂ = −ωz₁,

where the amplitudes of z₁ and z₂ will be enforced by the S-procedure so that z₁ = F sin(ωt + φ) and z₂ = F cos(ωt + φ) with phase φ determined by initial conditions but which is irrelevant for long time averages.

When the function to be maximized is

(4.13)  Φ(x, y, z₁, z₂) = x²,

the relevant SDP is

(4.14)  min U  s.t.  U − x² − f(x, y, z₁, z₂) · ∇V + S(x, y, z₁, z₂)(F² − z₁² − z₂²) ∈ S[x, y, z₁, z₂],
        S ∈ S[x, y, z₁, z₂].

We now systematically increase the polynomial degrees of both V and S until sharp bounds are achieved. (Lower bounds on x̄² can be computed by negating Φ, performing the SDP, and taking the absolute value of the resulting U.)
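The frequency response relation (4.11) is a cubic in u = R², so the Harmonic Balance amplitudes can be extracted by a polynomial root-finder. The parameter values below (including δ = 0.1) are illustrative assumptions and need not match those used in the figures:

```python
import numpy as np

def hb_amplitudes(omega, alpha=1.0, beta=0.04, delta=0.1, F=1.0):
    # (4.11) rewritten as a cubic in u = R^2:
    #   a^2 u^3 - 2 a c u^2 + (c^2 + (delta*omega)^2) u - F^2 = 0,
    # with a = (3/4) beta and c = omega^2 - alpha
    a = 0.75 * beta
    c = omega**2 - alpha
    coeffs = [a**2, -2 * a * c, c**2 + (delta * omega)**2, -F**2]
    roots = np.roots(coeffs)
    us = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0]
    return sorted(np.sqrt(u) for u in us)   # admissible amplitudes R

# every returned amplitude satisfies the implicit response equation (4.11);
# one or three positive roots correspond to single- or multi-valued response
for omega in (0.8, 1.2, 1.6):
    Rs = hb_amplitudes(omega)
    assert len(Rs) >= 1
    for R in Rs:
        resid = ((omega**2 - 1.0 - 0.75 * 0.04 * R**2)**2
                 + (0.1 * omega)**2) * R**2 - 1.0
        assert abs(resid) < 1e-6
```

Sweeping ω and recording all positive roots reproduces a frequency response curve of the kind shown in Figure 1, including its multi-valued branches.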
Figure 2.
Harmonic Balance approximate mean amplitude, √(x̄²), and upper and lower bounds on the solutions' mean amplitude vs. driving frequency ω with fixed δ, α = 1, β = 0.04, and F = 1 for a degree 10 polynomial auxiliary function.
The Harmonic Balance approximation of the mean amplitude agrees remarkably well with the upper and lower bounds on the true solution's mean amplitude; see Figure 2. The differences between the upper and lower bounds plotted in Figure 3 suggest that they agree (to computational precision) for points on the frequency response curve that are single valued, when the degree of the auxiliary function is sufficiently high. Not unexpectedly, there is an order 1 difference between the bounds when the curve is multi-valued. We conclude that for this sort of small amplitude forcing and weak nonlinearity, the Harmonic Balance approximation does exceptionally well quantitatively approximating the true solution's mean amplitude, even though the forcing and nonlinearity are strong enough to induce multi-stability and hysteresis.
Figure 3.
Difference between the upper and lower bounds on the solution's mean amplitude vs. driving frequency for degree 4, 6, 8 and 10 polynomial auxiliary functions.

It is worthwhile remarking that if we consider the degree of the auxiliary functions as a parameter, then there seem to be ω-dependent thresholds in the parameter space at which the degree 6 bounds become sharp. In this example a transition occurs at a critical value of ω where the optimal bound Φ* = Φ(x) + f(x) · ∇V(x) is attained by polynomial optimal V whose complexity (degrees) changes discontinuously with the system parameters. Under what conditions we might expect such transitions to occur, especially with a smoothly varying parameter such as ω, is a question for research in its own right.

4.2. The Damped Periodically Driven Pendulum.
Consider damped and periodically driven pendulum dynamics defined by the non-polynomial and non-autonomous 2nd order ODE

(4.15) θ̈ + γθ̇ + sin(θ) = F cos(ωt).

For weak forcing, the sinusoidal nonlinearity may be modeled by expanding the sin(θ) term in a Taylor series, and we employ a procedure similar to that of the Duffing example to test the validity of the Harmonic Balance approximation. We expect the two-term expansion of sin(θ), which results in a Duffing equation, to perform poorly for moderately large forcing amplitudes, however. Hence, we expand sin(θ) in a Taylor series to 7th order and employ (4.10) to obtain an approximate frequency response curve. The result is

(4.16) ((1 − ω²)R − R³/8 + R⁵/192 − R⁷/9216)² + γ²ω²R² − F² = 0,

where R = √(A² + B²). The calculation is tedious and purely algebraic, but plots of R vs. ω for several forcing amplitudes are shown in Figure 4.
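The frequency response condition above is a polynomial equation in R at each fixed ω, so the response curve can be traced numerically by simple root bracketing. A minimal sketch, using the first-harmonic coefficients of the 7th-order Taylor expansion of sin(θ) (namely R − R³/8 + R⁵/192 − R⁷/9216) and illustrative parameter values:

```python
import numpy as np
from scipy.optimize import brentq

# First-harmonic restoring force from the 7th-order Taylor expansion
# of sin(theta).  The frequency-response condition is then
#   ((1 - w^2) R - R^3/8 + R^5/192 - R^7/9216)^2 + (g w R)^2 = F^2.
GAMMA, F = 0.1, 0.1   # illustrative values

def residual(R, w):
    restoring = (1.0 - w**2) * R - R**3 / 8 + R**5 / 192 - R**7 / 9216
    return restoring**2 + (GAMMA * w * R) ** 2 - F**2

def amplitude(w, R_max=3.0, n=3000):
    """Smallest positive root R of the response condition at frequency w."""
    grid = np.linspace(1e-6, R_max, n)
    vals = residual(grid, w)
    for a, b, fa, fb in zip(grid, grid[1:], vals, vals[1:]):
        if fa * fb < 0:                       # sign change brackets a root
            return brentq(residual, a, b, args=(w,))
    return None                               # no response in [0, R_max]

for w in (0.5, 0.8, 1.2):
    print(f"w = {w}: R ~ {amplitude(w):.4f}")
```

Scanning all sign changes of the residual, rather than only the first, recovers the multiple branches of the response curve where it is multi-valued.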
Figure 4. Plot of the Harmonic Balance approximate mean amplitude vs. ω with γ = … and F = 0.1, 0.15 and 0.2.

We next compare the total mean energy E = ½(θ̇)² − cos(θ) from the Harmonic Balance approximation with auxiliary function bounds on solutions to (4.15). But as written, (4.15) is neither polynomial nor autonomous, which prevents immediate implementation of the polynomial optimization via an SDP.
Augmenting the system with four additional variables, however, we may rewrite equation (4.15) as the 5-dimensional first order polynomial system

(4.17)
φ̇ = z₁ − γφ − ψ₁
ψ̇₁ = φψ₂
ψ̇₂ = −φψ₁
ż₁ = ωz₂
ż₂ = −ωz₁.

The quantity of interest to extremize is the total energy plus z₂², given by

(4.18) E + z₂² = ½(θ̇)² − cos(θ) + z₂² = ½φ² − ψ₂ + z₂² = Φ.

The z₂² term makes the SDP more computationally tractable and, because ⟨z₂²⟩ = F²/2, upper and lower bounds on the mean of Φ can be interpreted as bounds on the mean energy shifted by F²/2. Letting x = [φ, ψ₁, ψ₂, z₁, z₂], the semidefinite program for upper bounds becomes

(4.19)
min U
s.t. U − Φ(x) − f(x) · ∇V(x) + C₁(x) + C₂(x) ∈ Σ[x],
S₁, S₂ ∈ Σ[x],

where C₁(x) = S₁(x)(F² − z₁² − z₂²) and C₂(x) = S₂(x)(1 − ψ₁² − ψ₂²). Lower bounds on Φ are computed just as in the Duffing setting.

Figure 5.
Plot of the bounds and Harmonic Balance approximate total mean energy vs. driving frequency, ω, with γ = …, F = 0.10 and degree 6 polynomial auxiliary functions.
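As a consistency check on the augmentation, the polynomial system (4.17) can be integrated directly: with initial data on the constraint sets, the invariants ψ₁² + ψ₂² = 1 and z₁² + z₂² = F² are preserved by the flow, and the running mean energy can be monitored. A minimal sketch with illustrative parameter values:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Augmented pendulum (4.17): phi = theta-dot, psi1 = sin(theta),
# psi2 = cos(theta), and (z1, z2) carries the harmonic forcing.
GAMMA, OMEGA, F = 0.1, 0.8, 0.1   # illustrative values

def rhs(t, u):
    phi, psi1, psi2, z1, z2 = u
    return [z1 - GAMMA * phi - psi1,
            phi * psi2,
            -phi * psi1,
            OMEGA * z2,
            -OMEGA * z1]

t = np.linspace(0.0, 600.0, 60001)
u0 = [0.0, np.sin(0.1), np.cos(0.1), 0.0, F]   # start near theta = 0.1
sol = solve_ivp(rhs, (t[0], t[-1]), u0, t_eval=t, rtol=1e-10, atol=1e-10)

phi, psi1, psi2, z1, z2 = sol.y
keep = t > 200.0                               # discard the transient
mean_E = np.mean(0.5 * phi[keep]**2 - psi2[keep])   # <E> = <phi^2/2 - cos(theta)>
psi_drift = np.max(np.abs(psi1**2 + psi2**2 - 1.0))
z_drift = np.max(np.abs(z1**2 + z2**2 - F**2))
print(f"<E> ~ {mean_E:.4f}, drifts {psi_drift:.2e}, {z_drift:.2e}")
```

For weak forcing the trajectory stays near the bottom of the potential well, so ⟨E⟩ sits just above −1; the mean energy of any such trajectory must lie between the lower and upper bounds produced by (4.19), after the F²/2 shift.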
Performing the SDP in (4.19), we find that the auxiliary function method’s lower bound on the mean energy and the Harmonic Balance approximation to the mean energy can agree quite nicely for sufficiently weak forcing; see Figure 5. As the forcing amplitude increases, however, the Harmonic Balance approximation is bound to fail. On the other hand, the upper bound in Figure 5 clearly does not correspond to the Harmonic Balance approximations we found. Indeed, the upper bound with Φ∗ ≈ … corresponds to a solution oscillating about θ = π, as illustrated in Figure 6. Due to its dynamical instability, however, one would never expect to discover it via direct numerical simulation.

Figure 6.
A potentially unstable solution oscillating about a neighborhood of θ = π.

With this interpretation in mind, we can make the linear change of variables θ′ = π − θ. Then when θ has low potential energy, θ′ has high potential energy and vice versa. Figure 7 is the Harmonic Balance approximation with the Taylor expansion performed about θ = π, the analog of Figure 4. Meanwhile, Figure 8 shows that the Harmonic Balance approximation of the high-potential-energy solution’s total mean energy agrees quite well with the auxiliary function upper bound on the true solution’s total mean energy.

This example illustrates one of the operational “quirks” of the auxiliary function method: it produces upper or lower bounds on the chosen Φ across all potential initial conditions, including those that generate unstable solutions which are not readily observed. Of course, knowledge of the existence of such unstable solutions is frequently of concern in its own right (it is certainly the central concern for control-of-chaos applications), but if one is interested in estimates of long-time averages of Φ on particular solutions (or branches of solutions), there is currently no supplementary procedure one can employ to ensure that the computed bounds correspond to specific trajectories.
Figure 7. Harmonic Balance approximate mean amplitude about θ = π vs. driving frequency ω with γ = … and F = 0.10, 0.15 and 0.20.

Figure 8.
Plot of the bounds and Harmonic Balance approximation (about θ = π) of Φ vs. driving frequency ω with γ = …, F = 0.20 and degree 6 polynomial auxiliary functions.

Summary & Discussion
There are several key remarks to be made regarding both what we have done and future directions. First, we have demonstrated the robustness of the auxiliary function and SOS-SDP technologies in handling both non-autonomous and (certain forms of) trigonometric dependence in nonlinear ODEs. Such systems are ubiquitous in applications and as canonical case studies. Second, we note that the SOS technology is computationally tractable even for much larger ODE systems than we considered here. The plots for this paper were produced on a standard laptop, and the procedure of augmenting a dynamical system with additional polynomial degrees of freedom appears robust. Additionally, we observe that sharp (within computer precision) bounds are often recovered for polynomial auxiliary functions of reasonably restricted degree. This appears to be the case not only in this work but also in various others [3, 5], so the SDP algorithm is able to concentrate the relevant coefficients on potentially severely truncated polynomials for which sharp bounds are guaranteed.

We can clearly see areas of ongoing research to which this technology is broadly applicable. Many such problems can be cast directly in the light of the auxiliary function method and have been inaccessible until now due to both theoretical and computational limitations. There are a variety of applied science and engineering applications where moderately low dimensional ODE systems serve as central models for both conceptual and design purposes. These include energy harvesting [18, 29], where the challenge is to optimally extract power from vibrations of a continuously stimulated mechanical body and mathematical models often consist of periodically driven nonlinear oscillators [4]. Another area is the periodic operation of chemical and biochemical reactors [25], where the task is to optimize the time-averaged production of certain byproducts; mass action and related kinetic models often consist of ODEs with polynomial vector fields.
Circadian [7, 13] or seasonally forced [8, 26] models in biology, ecology and epidemiology are often described by such periodically driven ODEs with polynomial vector fields as well.

Finally, we recognize the frontier for application of the auxiliary function approach and related numerical methods to systems described by partial differential equations (PDEs). Of course PDEs are often approximated by finite, albeit sometimes very large, systems of ODEs, but fundamental mathematical and computational questions remain for future research.

Acknowledgements
Some of the ideas presented here were developed in discussions with David Goluskin (Victoria) and Jeremy Parker (Cambridge), supported in part by US National Science Foundation Awards DMS-1813003 at the University of Michigan and OCE-1829864 for the Geophysical Fluid Dynamics Program at Woods Hole Oceanographic Institution.
References

[1] J. Bochi, Ergodic optimization of Birkhoff averages and Lyapunov exponents, Proceedings of the International Congress of Mathematicians (2018): 1843–1864.
[2] J.-P. Conze and Y. Guivarc'h, Croissance des sommes ergodiques (unpublished, circa 1993).
[3] S.I. Chernyshenko, P. Goulart, D. Huang and A. Papachristodoulou, Polynomial sum of squares in fluid dynamics: a review with a look ahead, Philosophical Transactions of the Royal Society A (2014): 20130350.
[4] A. Erturk and D.J. Inman, Broadband Piezoelectric Power Generation on High-Energy Orbits of the Bistable Duffing Oscillator with Electromechanical Coupling, Journal of Sound & Vibration (2011): 2339–2353.
[5] G. Fantuzzi, D. Goluskin, D. Huang and S.I. Chernyshenko, Bounds for Deterministic and Stochastic Dynamical Systems using Sum-of-Squares Optimization, SIAM Journal on Applied Dynamical Systems (2016): 1962–1988.
[6] D. Goluskin, Bounding Averages Rigorously Using Semidefinite Programming: Mean Moments of the Lorenz System, Journal of Nonlinear Science (2018): 621–651.
[7] D. Gonze, S. Bernard, C. Waltermann, A. Kramer and H. Herzel, Spontaneous Synchronization of Coupled Circadian Oscillators, Biophysical Journal (2005): 120–129.
[8] J.V. Greenman and R.A. Norman, Environmental Forcing, Invasion and Control of Ecological and Epidemiological Systems, Journal of Theoretical Biology (2007): 492–506.
[9] D. Hilbert, Über die Darstellung definiter Formen als Summe von Formenquadraten, Mathematische Annalen (1888): 342–350.
[10] B.R. Hunt and E. Ott, Optimal periodic orbits of chaotic systems occur at low period, Physical Review E (1996): 328–337.
[11] Z.-P. Jiang, Advanced Feedback Control of the Chaotic Duffing Equation, IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications (2002): 244–249.
[12] O. Jenkinson, Ergodic Optimization, Discrete & Continuous Dynamical Systems (2006): 197–224.
[13] N. Komin, et al., Synchronization and Entrainment of Coupled Circadian Oscillators, Interface Focus (2010): 167–176.
[14] I. Kovacic and M.J. Brennan, The Duffing Equation: Nonlinear Oscillators and Their Behaviour (Wiley, 2011).
[15] M. Lakshmi, G. Fantuzzi, J. Fernandez-Caballero, Y. Hwang and S.I. Chernyshenko, Finding extremal periodic orbits with polynomial optimization, with application to a nine-mode model of shear flow, arXiv:1906.04001 (2020).
[16] J.B. Lasserre, A Sum of Squares Approximation of Nonnegative Polynomials, IFAC Proceedings Volumes (2005): 441–444.
[17] J.B. Lasserre, Moments, Positive Polynomials and Their Applications (Imperial College Press, 2010).
[18] B.P. Mann and N.D. Sims, Energy Harvesting from the Nonlinear Oscillations of Magnetic Levitation, Journal of Sound and Vibration (2009): 515–530.
[19] M. Marshall, Positive Polynomials and Sums of Squares (American Mathematical Society, 2008).
[20] K.G. Murty and S.N. Kabadi, Some NP-complete problems in quadratic and nonlinear programming, Mathematical Programming (1987): 117–129.
[21] E. Ott, C. Grebogi and J.A. Yorke, Controlling Chaos, Physical Review Letters (1990): 1196–1199.
[22] J. Parker, Can sum-of-squares programming tell us anything useful about Hamiltonian chaos?, GFD Report (Woods Hole Oceanographic Institution, 2019).
[23] P. Parrilo, Polynomial Optimization, Sums of Squares, and Applications, in Semidefinite Optimization and Convex Algebraic Geometry, MOS-SIAM Series on Optimization (SIAM, Philadelphia, 2012).
[24] P.A. Parrilo, Semidefinite programming relaxations for semialgebraic problems, Mathematical Programming Series B (2003): 293–320.
[25] P.L. Silveston and R.R. Hudgins, Periodic Operation of Reactors (Butterworth-Heinemann, 2013).
[26] R.A. Taylor, J.A. Sherratt and A. White, Seasonal Forcing and Multi-Year Cycles in Interacting Populations: Lessons from a Predator-Prey Model, Journal of Mathematical Biology (2012): 1741–1764.
[27] I. Tobasco, D. Goluskin and C.R. Doering, Optimal bounds and extremal trajectories for time averages in nonlinear dynamical systems, Physics Letters A (2018): 382–386.
[28] L. Vandenberghe and S. Boyd, Semidefinite Programming, SIAM Review (1996): 49–95.
[29] C. Wei and X. Jing, A Comprehensive Review on Vibration Energy Harvesting: Modelling and Realization, Renewable and Sustainable Energy Reviews (2017): 1–18.
[30] T.H. Yang, B.R. Hunt and E. Ott, Optimal periodic orbits of continuous time chaotic systems, Physical Review E (2000): 1950–1959.

(C. R. Doering) University of Michigan, Ann Arbor MI, USA
E-mail address: [email protected]

(A. McMillan) University of Michigan, Ann Arbor MI, USA
E-mail address: