Saddle-point dynamics: conditions for asymptotic stability of saddle points
ASHISH CHERUKURI†, BAHMAN GHARESIFARD‡, AND JORGE CORTÉS†

Abstract.
This paper considers continuously differentiable functions of two vector variables that have (possibly a continuum of) min-max saddle points. We study the asymptotic convergence properties of the associated saddle-point dynamics (gradient descent in the first variable and gradient ascent in the second one). We identify a suite of complementary conditions under which the set of saddle points is asymptotically stable under the saddle-point dynamics. Our first set of results is based on the convexity-concavity of the function defining the saddle-point dynamics to establish the convergence guarantees. For functions that do not enjoy this feature, our second set of results relies on properties of the linearization of the dynamics, the function along the proximal normals to the saddle set, and the linearity of the function in one variable. We also provide global versions of the asymptotic convergence results. Various examples illustrate our discussion.
Key words. saddle-point dynamics, asymptotic convergence, convex-concave functions, proximal calculus, center manifold theory, nonsmooth dynamics
AMS subject classifications.
1. Introduction.
It is well known that the trajectories of the gradient dynamics of a continuously differentiable function with bounded sublevel sets converge asymptotically to its set of critical points, see e.g. [20]. This fact, however, is not true in general for the saddle-point dynamics (gradient descent in one variable and gradient ascent in the other) of a continuously differentiable function of two variables, see e.g. [2, 13]. In this paper, our aim is to investigate conditions under which the above statement is true for the case where the critical points are min-max saddle points and they possibly form a continuum. Our motivation comes from the applications of the saddle-point dynamics (also known as primal-dual dynamics) to find solutions of equality constrained optimization problems and Nash equilibria of zero-sum games.
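This contrast can be seen numerically. The sketch below is our own illustration (the specific functions g and F are our choice, not the paper's): the gradient flow of g(x) = ½x² converges to its critical point, while the saddle-point dynamics of the bilinear function F(x, z) = xz orbits the saddle point at the origin without converging.

```python
import numpy as np

def euler(field, state, dt=1e-3, steps=5000):
    """Forward-Euler integration of the vector field `field`."""
    state = np.asarray(state, dtype=float)
    for _ in range(steps):
        state = state + dt * field(state)
    return state

# Gradient flow of g(x) = 0.5 x^2: xdot = -x. Trajectories converge to the
# critical point x = 0.
x_final = euler(lambda s: -s, [1.0])

# Saddle-point dynamics of F(x, z) = x z: xdot = -z, zdot = x. The only
# saddle point is (0, 0), but 0.5 (x^2 + z^2) is conserved along solutions,
# so trajectories circle the saddle point without converging to it.
sp_final = euler(lambda s: np.array([-s[1], s[0]]), [1.0, 0.0])

print(abs(x_final[0]))           # small: gradient flow converged
print(np.linalg.norm(sp_final))  # stays near 1: the trajectory kept orbiting
```

The bilinear F is convex-concave but not strictly so, which is precisely the gap the strictness assumptions in Section 4 close.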
Literature review.
In constrained optimization problems, the pioneering works [2, 25] popularized the use of the primal-dual dynamics to arrive at the saddle points of the Lagrangian. For inequality constrained problems, this dynamics is modified with a projection operator on the dual variables to preserve their nonnegativity, which results in a discontinuous vector field. Recent works have further explored the convergence analysis of such dynamics, both in continuous [17, 9] and in discrete [27] time. The work [16] proposes instead a continuous dynamics whose design builds on first- and second-order information of the Lagrangian. In the context of distributed control and multi-agent systems, an important motivation to study saddle-point dynamics comes from network optimization problems where the objective function is an aggregate of each agent's local objective function and the constraints are given by a set of conditions that are locally computable at the agent level. Because of this structure, the saddle-point dynamics of the Lagrangian for such problems is inherently amenable to distributed implementation. This observation explains the emerging body of work that, from this perspective, looks at problems in distributed convex optimization [33, 18, 14], distributed linear programming [30], and applications to power networks [26, 34, 35] and bargaining problems [31]. The work [23] shows an interesting application of the saddle-point dynamics to find a common Lyapunov function for a linear differential inclusion.

† Department of Mechanical and Aerospace Engineering, University of California, San Diego, 9500 Gilman Dr., La Jolla, CA 92093-0411, United States, {acheruku,cortes}@ucsd.edu
‡ Department of Mathematics and Statistics, Queen's University, 403 Jeffery Hall, University Ave., Kingston, ON, Canada K7L 3N6, [email protected]
∗ A preliminary version of this paper appeared at the 2015 American Control Conference as [8].
In game theory, it is natural to study the convergence properties of saddle-point dynamics to find the Nash equilibria of two-person zero-sum games [3, 29]. A majority of these works assume the function whose saddle points are sought to be convex-concave in its arguments. Our focus here instead is on the asymptotic stability of the min-max saddle points under the saddle-point dynamics for a wider class of functions, and without any nonnegativity-preserving projection on individual variables. We explicitly allow for the possibility of a continuum of saddle points, instead of isolated ones, and focus, wherever feasible, on establishing convergence of the trajectories to a point in the set. The issue of asymptotic convergence, even in the case of standard gradient systems, is a delicate one when the equilibria form a continuum [1]. In such scenarios, convergence to a point might not be guaranteed, see e.g. the counterexample in [28]. Our work here is complementary to [21], which focuses on the characterization of the asymptotic behavior of the saddle-point dynamics when trajectories do not converge to saddle points and instead show oscillatory behavior.
Statement of contributions.
Our starting point is the definition of the saddle-point dynamics for continuously differentiable functions of two (vector) variables, which we term saddle functions. The saddle-point dynamics consists of gradient descent of the saddle function in the first variable and gradient ascent in the second variable. Our objective is to characterize the asymptotic convergence properties of the saddle-point dynamics to the set of min-max saddle points of the saddle function. Assuming this set is nonempty, our contributions can be understood as a catalog of complementary conditions on the saddle function that guarantee that the trajectories of the saddle-point dynamics converge to the set of saddle points, and possibly to a point in the set. We broadly divide our results in two categories, one in which the saddle function has convexity-concavity properties and the other in which it does not. For the first category, our starting result considers saddle functions that are locally convex-concave on the set of saddle points. We show that asymptotic stability of the set of saddle points is guaranteed if either the convexity or concavity properties are strict, and convergence is pointwise. Furthermore, motivated by equality constrained optimization problems, our second result shows that the same conclusions on convergence hold for functions that depend linearly on one of their arguments if the strictness requirement is dropped. For the third and last result in this category, we relax the convexity-concavity requirement and establish asymptotic convergence for strongly jointly quasiconvex-quasiconcave saddle functions. Moving on to the second category of scenarios, where functions lack convexity-concavity properties, our first condition is based on linearization.
We consider piecewise twice continuously differentiable saddle-point dynamics and provide conditions on the eigenvalues of the limit points of Jacobian matrices of the saddle function at the saddle points that ensure local asymptotic stability of a manifold of saddle points. Our convergence analysis is based on a general result of independent interest on the stability of a manifold of equilibria for piecewise smooth vector fields that we state and prove using ideas from center manifold theory. The next two results are motivated by the observation that saddle functions exist in the second category that do not satisfy the linearization hypotheses and yet have convergent dynamics. In one result, we justify convergence by studying the variation of the function and its Hessian along the proximal normal directions to the set of saddle points. Specifically, we assume polynomial bounds for these variations and derive an appropriate relationship between these bounds that ensures asymptotic convergence. In the other result, we assume the saddle function to be linear in one variable and indefinite in another, where the indefinite part satisfies some appropriate regularity conditions. When discussing each of the above scenarios, we extend the conditions to obtain global convergence wherever feasible. Our analysis is based on tools and notions from saddle points, stability analysis of nonlinear systems, proximal normals, and center manifold theory. Various examples throughout the paper justify the complementary character of the hypotheses in our results.

Organization.
Section 2 introduces notation and basic preliminaries. Section 3 presents the saddle-point dynamics and the problem statement. Section 4 deals with saddle functions with convexity-concavity properties. For the case when this property does not hold, Section 5 relies on linearization techniques, proximal normals, and the linearity structure of the saddle function to establish convergence guarantees. Finally, Section 6 summarizes our conclusions and ideas for future work.
2. Preliminaries.
This section introduces basic notation and presents preliminaries on proximal calculus and saddle points.
2.1. Notation. We let R, R_{≥0}, R_{≤0}, R_{>0}, and Z_{≥1} be the sets of real, nonnegative real, nonpositive real, positive real, and positive integer numbers, respectively. Given two sets A₁, A₂ ⊂ Rⁿ, we let A₁ + A₂ = {x + y | x ∈ A₁, y ∈ A₂}. We denote by ‖·‖ the 2-norm on Rⁿ and also the induced 2-norm on R^{n×n}. Let B_δ(x) represent the open ball centered at x ∈ Rⁿ of radius δ > 0. Given x ∈ Rⁿ, x_i denotes the i-th component of x. For vectors u ∈ Rⁿ and w ∈ Rᵐ, the vector (u; w) ∈ R^{n+m} denotes their concatenation. For A ∈ R^{n×n}, we use A ⪰ 0, A ⪯ 0, A ≻ 0, and A ≺ 0 to denote that A is positive semidefinite, negative semidefinite, positive definite, and negative definite, respectively. The eigenvalues of A are λ_i(A) for i ∈ {1, …, n}. If A is symmetric, λ_max(A) and λ_min(A) represent the maximum and minimum eigenvalues, respectively. The range and null spaces of A are denoted by range(A) and null(A), respectively. We use the notation C^k for a function being k ∈ Z_{≥1} times continuously differentiable. A set S ⊂ Rⁿ is path connected if for any two points a, b ∈ S there exists a continuous map γ : [0, 1] → S such that γ(0) = a and γ(1) = b. A set S_c ⊂ S ⊂ Rⁿ is an isolated path connected component of S if it is path connected and there exists an open neighborhood U of S_c in Rⁿ such that U ∩ S = S_c. For a real-valued function F : Rⁿ × Rᵐ → R, we denote the partial derivative of F with respect to the first argument by ∇_x F and with respect to the second argument by ∇_z F. The higher-order derivatives follow the convention ∇_{xz} F = ∂²F/∂x∂z, ∇_{xx} F = ∂²F/∂x², and so on. The restriction of f : Rⁿ → Rᵐ to a subset S ⊂ Rⁿ is denoted by f|_S. The Jacobian of a C¹ map f : Rⁿ → Rᵐ at x ∈ Rⁿ is denoted by Df(x) ∈ R^{m×n}. For a real-valued function V : Rⁿ → R and α > 0, we denote the sublevel set of V by V⁻¹(≤ α) = {x ∈ Rⁿ | V(x) ≤ α}. Finally, a vector field f : Rⁿ → Rⁿ is said to be piecewise C² if it is continuous and there exist (1) a finite collection of disjoint open sets D₁, …, D_m ⊂ Rⁿ, referred to as patches, whose closures cover Rⁿ, that is, Rⁿ = ∪_{i=1}^m cl(D_i), and (2) a finite collection of C² functions {f_i : D_i^e → Rⁿ}_{i=1}^m where, for each i ∈ {1, …, m}, D_i^e is open with cl(D_i) ⊂ D_i^e, such that f|_{cl(D_i)} and f_i take the same values over cl(D_i).

2.2. Proximal calculus. We present here a few notions on proximal calculus following [11]. Given a closed set
E ⊂ Rⁿ and a point x ∈ Rⁿ \ E, the distance from x to E is

(2.1) d_E(x) = min_{y ∈ E} ‖x − y‖.

We let proj_E(x) denote the set of points in E that are closest to x, i.e., proj_E(x) = {y ∈ E | ‖x − y‖ = d_E(x)} ⊂ E. For y ∈ proj_E(x), the vector x − y is a proximal normal direction to E at y, and any nonnegative multiple ζ = t(x − y), t ≥ 0, is a proximal normal (P-normal) to E at y. The distance function d_E might not be differentiable in general (unless E is convex), but is globally Lipschitz and regular [11, p. 23]. For a locally Lipschitz function f : Rⁿ → R, the generalized gradient ∂f : Rⁿ ⇒ Rⁿ is

∂f(x) = co{ lim_{i→∞} ∇f(x_i) | x_i → x, x_i ∉ S ∪ Ω_f },

where co denotes convex hull, S ⊂ Rⁿ is any set of measure zero, and Ω_f is the set (of measure zero) of points at which f is not differentiable. In the case of the square of the distance function, one can compute [11, p. 99] the generalized gradient as

(2.2) ∂d_E²(x) = co{ 2(x − y) | y ∈ proj_E(x) }.

2.3. Saddle points. Here, we provide basic definitions pertaining to the notion of saddle points. A point (x*, z*) ∈ Rⁿ × Rᵐ is a local min-max saddle point of a continuously differentiable function F : Rⁿ × Rᵐ → R if there exist open neighborhoods U_{x*} ⊂ Rⁿ of x* and U_{z*} ⊂ Rᵐ of z* such that

(2.3) F(x*, z) ≤ F(x*, z*) ≤ F(x, z*),

for all z ∈ U_{z*} and x ∈ U_{x*}. The point (x*, z*) is a global min-max saddle point of F if U_{x*} = Rⁿ and U_{z*} = Rᵐ. Min-max saddle points are a particular case of the more general notion of saddle points. We focus here on min-max saddle points motivated by problems in constrained optimization and zero-sum games, whose solutions correspond to min-max saddle points. With a slight abuse of terminology, throughout the paper we refer to the local min-max saddle points simply as saddle points. We denote by Saddle(F) the set of saddle points of F. From (2.3), for (x*, z*) ∈ Saddle(F), the point x* ∈ Rⁿ (resp. z* ∈ Rᵐ) is a local minimizer (resp. local maximizer) of the map x ↦ F(x, z*) (resp. z ↦ F(x*, z)). Each saddle point is a critical point of F, i.e., ∇_x F(x*, z*) = 0 and ∇_z F(x*, z*) = 0. Additionally, if F is C², then ∇_{xx} F(x*, z*) ⪰ 0 and ∇_{zz} F(x*, z*) ⪯ 0. Also, if ∇_{xx} F(x*, z*) ≻ 0 and ∇_{zz} F(x*, z*) ≺ 0, then the inequalities in (2.3) are strict.

A function F : Rⁿ × Rᵐ → R is locally convex-concave at a point (x̃, z̃) ∈ Rⁿ × Rᵐ if there exists an open neighborhood U of (x̃, z̃) such that for all (x̄, z̄) ∈ U, the functions x ↦ F(x, z̄) and z ↦ F(x̄, z) are convex over U ∩ (Rⁿ × {z̄}) and concave over U ∩ ({x̄} × Rᵐ), respectively. If, in addition, either x ↦ F(x, z̃) is strictly convex in an open neighborhood of x̃, or z ↦ F(x̃, z) is strictly concave in an open neighborhood of z̃, then F is locally strictly convex-concave at (x̃, z̃). F is locally (resp. locally strictly) convex-concave on a set S ⊂ Rⁿ × Rᵐ if it is so at each point in S. F is globally convex-concave if in the local definition U = Rⁿ × Rᵐ. Finally, F is globally strictly convex-concave if it is globally convex-concave and, for any (x̄, z̄) ∈ Rⁿ × Rᵐ, either x ↦ F(x, z̄) is strictly convex or z ↦ F(x̄, z) is strictly concave. Note that this notion is different from saying that F is both strictly convex and strictly concave.

Next, we define strongly quasiconvex functions following [22]. A function f : Rⁿ → R is strongly quasiconvex with parameter s > 0 over a set D ⊂ Rⁿ if for all x, y ∈ D and all λ ∈ [0, 1] we have

max{f(x), f(y)} − f(λx + (1 − λ)y) ≥ s λ(1 − λ)‖x − y‖².

A function f is strongly quasiconcave with parameter s > 0 over D if −f is strongly quasiconvex with parameter s over D. A function F : Rⁿ × Rᵐ → R is locally jointly strongly quasiconvex-quasiconcave at a point (x̃, z̃) ∈ Rⁿ × Rᵐ if there exist s > 0 and an open neighborhood U of (x̃, z̃) such that for all (x̄, z̄) ∈ U, the function x ↦ F(x, z̄) is strongly quasiconvex with parameter s over U ∩ (Rⁿ × {z̄}) and the function z ↦ F(x̄, z) is strongly quasiconcave with parameter s over U ∩ ({x̄} × Rᵐ). F is locally jointly strongly quasiconvex-quasiconcave on a set S ⊂ Rⁿ × Rᵐ if it is so at each point in S. F is globally jointly strongly quasiconvex-quasiconcave if in the local definition U = Rⁿ × Rᵐ.
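As a quick numerical sanity check of the proximal notions above (our own illustration, not part of the paper), one can approximate d_E, proj_E, and a proximal normal direction for a densely sampled closed set E, here taken to be the unit circle in R², for which d_E(x) = |‖x‖ − 1| is known in closed form.

```python
import numpy as np

# Densely sample the unit circle, a closed set E in R^2.
theta = np.linspace(0.0, 2.0 * np.pi, 200000, endpoint=False)
E = np.stack([np.cos(theta), np.sin(theta)], axis=1)

def dist_and_proj(x, E):
    """Approximate d_E(x) as in (2.1) and return one point of proj_E(x)."""
    d = np.linalg.norm(E - x, axis=1)
    i = np.argmin(d)
    return d[i], E[i]

x = np.array([2.0, 0.0])   # a point outside E
d, y = dist_and_proj(x, E)
zeta = x - y               # a proximal normal direction to E at y

print(d)   # ≈ 1.0, matching |‖x‖ − 1| for the circle
print(y)   # ≈ (1, 0), the closest point on the circle
```

For this E and x the projection is unique; for a nonconvex set (e.g., two disjoint circles) proj_E(x) can contain several points, which is why (2.2) involves a convex hull.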
3. Problem statement.
Here we formulate the problem of interest in the paper. Given a continuously differentiable function F : Rⁿ × Rᵐ → R, which we refer to as saddle function, we consider its saddle-point dynamics, i.e., gradient descent in one argument and gradient ascent in the other,

(3.1a) ẋ = −∇_x F(x, z),
(3.1b) ż = ∇_z F(x, z).

When convenient, we use the shorthand notation X_sp : Rⁿ × Rᵐ → Rⁿ × Rᵐ to refer to this dynamics. Our aim is to provide conditions on F under which the trajectories of its saddle-point dynamics (3.1) locally asymptotically converge to its set of saddle points, and possibly to a point in the set. We are also interested in identifying conditions to establish global asymptotic convergence. Throughout our study, we assume that the set Saddle(F) is nonempty. This assumption is valid under mild conditions in the application areas that motivate our study: the Lagrangian of a constrained optimization problem [6] and the value function for zero-sum games [3]. Our forthcoming discussion is divided in two threads, one for the case of convex-concave functions, cf. Section 4, and one for the case of general functions, cf. Section 5. In each case, we provide illustrative examples to show the applicability of the results.
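For intuition, the following sketch (our illustration; the quadratic F is our choice) integrates (3.1) for the globally strictly convex-concave function F(x, z) = ½x² + xz − ½z², whose unique saddle point is the origin; the trajectory spirals into it.

```python
# Saddle-point dynamics (3.1) for F(x, z) = 0.5 x^2 + x z - 0.5 z^2,
# integrated with forward Euler. Saddle(F) = {(0, 0)}.
def grad_x(x, z): return x + z   # ∇_x F
def grad_z(x, z): return x - z   # ∇_z F

x, z = 1.0, 1.0
dt = 0.01
for _ in range(5000):
    # gradient descent in x, gradient ascent in z (simultaneous update)
    x, z = x - dt * grad_x(x, z), z + dt * grad_z(x, z)

print(x, z)   # both ≈ 0: the trajectory converged to the saddle point
```

The linearization here has eigenvalues −1 ± i, so the trajectory spirals while the distance to the saddle point decays, a behavior the Lyapunov-type analysis of Section 4 captures.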
4. Convergence analysis for convex-concave saddle functions.
This section presents conditions for the asymptotic stability of saddle points under the saddle-point dynamics (3.1) that rely on the convexity-concavity properties of the saddle function.
Our first result provides conditions that guarantee the local asymptotic stability of the set of saddle points.
Proposition 4.1. (Local asymptotic stability of the set of saddle points via convexity-concavity):
For F : Rⁿ × Rᵐ → R continuously differentiable and locally strictly convex-concave on Saddle(F), each isolated path connected component of Saddle(F) is locally asymptotically stable under the saddle-point dynamics X_sp and, moreover, the convergence of each trajectory is to a point.

Proof. Let S be an isolated path connected component of Saddle(F) and take (x*, z*) ∈ S. Without loss of generality, we consider the case when x ↦ F(x, z*) is locally strictly convex (the proof for the case when z ↦ F(x*, z) is locally strictly concave is analogous). Consider the function V : Rⁿ × Rᵐ → R_{≥0},

(4.1) V(x, z) = ½ ( ‖x − x*‖² + ‖z − z*‖² ),

which we note is radially unbounded (and hence has bounded sublevel sets). We refer to V as a LaSalle function because locally, as we show next, its Lie derivative is negative, but not strictly negative. Let U be the neighborhood of (x*, z*) where local convexity-concavity holds. The Lie derivative of V along the dynamics (3.1) at (x, z) ∈ U can be written as

(4.2) L_{X_sp} V(x, z) = −(x − x*)ᵀ ∇_x F(x, z) + (z − z*)ᵀ ∇_z F(x, z)
  ≤ F(x*, z) − F(x, z) + F(x, z) − F(x, z*)
  = F(x*, z) − F(x*, z*) + F(x*, z*) − F(x, z*) ≤ 0,

where the first inequality follows from the first-order condition for convexity and concavity, and the last inequality follows from the definition of saddle point. As a consequence, for α > 0 small enough so that V⁻¹(≤ α) ⊂ U, we conclude that V⁻¹(≤ α) is positively invariant under X_sp. The application of the LaSalle Invariance Principle [24, Theorem 4.4] yields that any trajectory starting from a point in V⁻¹(≤ α) converges to the largest invariant set M contained in {(x, z) ∈ V⁻¹(≤ α) | L_{X_sp} V(x, z) = 0}. Let (x, z) ∈ M. From (4.2), L_{X_sp} V(x, z) = 0 implies that F(x*, z) = F(x*, z*) = F(x, z*). In turn, the local strict convexity of x ↦ F(x, z*) implies that x = x*.
Since M is positively invariant, the trajectory t ↦ (x(t), z(t)) of X_sp starting at (x, z) is contained in M. This implies that along the trajectory, for all t ≥ 0, (a) x(t) = x*, i.e., ẋ(t) = −∇_x F(x(t), z(t)) = 0, and (b) F(x*, z(t)) = F(x*, z*). The latter implies

0 = L_{X_sp} F(x*, z(t)) = X_sp(x*, z(t)) · (0, ∇_z F(x*, z(t))) = ‖∇_z F(x*, z(t))‖²,

for all t ≥ 0. Thus, we get ∇_x F(x, z) = 0 and ∇_z F(x, z) = 0. Further, since (x, z) ∈ U, local convexity-concavity holds over U, and S is an isolated component, we obtain (x, z) ∈ S, which shows M ⊂ S. Since (x*, z*) is arbitrary, the asymptotic convergence property holds in a neighborhood of S. The pointwise convergence follows from the application of Lemma A.3.

The result above shows that each saddle point is stable and that each path connected component of Saddle(F) is asymptotically stable. Note that each saddle point might not be asymptotically stable. However, if a component consists of a single point, then that point is asymptotically stable. Interestingly, a close look at the proof of Proposition 4.1 reveals that, if the assumptions hold globally, then the asymptotic stability of the set of saddle points is also global, as stated next.

Corollary 4.2. (Global asymptotic stability of the set of saddle points via convexity-concavity):
For F : Rⁿ × Rᵐ → R continuously differentiable and globally strictly convex-concave, Saddle(F) is globally asymptotically stable under the saddle-point dynamics X_sp and the convergence of trajectories is to a point.

Remark 4.3. (Relationship with results on primal-dual dynamics: I):
Corollary 4.2 is an extension to more general functions and less stringent assumptions of the results stated for Lagrangian functions of constrained convex (or concave) optimization problems in [33, 2, 17] and cost functions of differential games in [29]. In [2, 17], for a concave optimization, the matrix ∇_{xx} F is assumed to be negative definite at every saddle point, and in [33] the set Saddle(F) is assumed to be a singleton. The work [29] assumes a sufficient condition on the cost functions to guarantee convergence that in the current setup is equivalent to having ∇_{xx} F and ∇_{zz} F positive and negative definite, respectively. •

Here we study the asymptotic convergence properties of the saddle-point dynamics when the convexity-concavity of the saddle function is not strict but, instead, the function depends linearly on its second argument. The analysis follows analogously for saddle functions that are linear in the first argument and concave in the other. The consideration of this class of functions is motivated by equality constrained optimization problems.
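As an illustration of this setting (our own sketch; the particular problem instance is hypothetical, not taken from the paper), consider minimizing ‖x‖² subject to Ax = b. The Lagrangian L(x, z) = ‖x‖² + zᵀ(Ax − b) is convex in x and linear in the multiplier z, and its saddle-point dynamics drives (x, z) to a primal-dual solution:

```python
import numpy as np

# Hypothetical equality constrained problem: minimize ||x||^2 s.t. A x = b.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])

# Saddle-point dynamics of L(x, z) = ||x||^2 + z^T (A x - b):
#   xdot = -∇_x L = -(2 x + A^T z),   zdot = ∇_z L = A x - b.
x = np.zeros(3)
z = np.zeros(1)
dt = 0.01
for _ in range(20000):
    x, z = x - dt * (2 * x + A.T @ z), z + dt * (A @ x - b)

print(x)   # ≈ (1, 1, 1): the minimum-norm feasible point
print(z)   # ≈ -2: the multiplier satisfying 2 x + A^T z = 0
```

Because ‖x‖² is strictly convex, convergence here is to the unique primal-dual pair; the point of the convexity-linearity results below is that convergence to the saddle set survives even when this strictness is absent.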
Proposition 4.4. (Local asymptotic stability of the set of saddle points via convexity-linearity):
For a continuously differentiable function F : Rⁿ × Rᵐ → R, if
(i) F is locally convex-concave on Saddle(F) and linear in z,
(ii) for each (x*, z*) ∈ Saddle(F), there exists a neighborhood U_{x*} ⊂ Rⁿ of x* where, if F(x, z*) = F(x*, z*) with x ∈ U_{x*}, then (x, z*) ∈ Saddle(F),
then each isolated path connected component of Saddle(F) is locally asymptotically stable under the saddle-point dynamics X_sp and, moreover, the convergence of trajectories is to a point.

Proof. Given an isolated path connected component S of Saddle(F), Lemma A.1 implies that F|_S is constant. Our proof proceeds along similar lines as those of Proposition 4.1. With the same notation, given (x*, z*) ∈ S, the arguments follow verbatim until the identification of the largest invariant set M contained in {(x, z) ∈ V⁻¹(≤ α) | L_{X_sp} V(x, z) = 0}. Let (x, z) ∈ M. From (4.2), L_{X_sp} V(x, z) = 0 implies F(x*, z) = F(x*, z*) = F(x, z*). By assumption (ii), this means (x, z*) ∈ S, and by assumption (i), the linearity property gives ∇_z F(x, z) = ∇_z F(x, z*) = 0. Therefore, ∇_z F|_M = 0. For (x, z) ∈ M, the trajectory t ↦ (x(t), z(t)) of X_sp starting at (x, z) is contained in M. Consequently, z(t) = z for all t ∈ [0, ∞) and ẋ(t) = −∇_x F(x(t), z) corresponds to the gradient dynamics of the (locally) convex function y ↦ F(y, z). Therefore, x(t) converges to a minimizer x′ of this function, i.e., ∇_x F(x′, z) = 0. Since ∇_z F|_M = 0, the continuity of ∇_z F implies that ∇_z F(x′, z) = 0, and hence (x′, z) ∈ S. By continuity of F, it follows that F(x(t), z) → F(x′, z) = F(x*, z*), where for the equality we use the fact that F|_S is constant.
On the other hand, note that

0 = L_{X_sp} V(x(t), z) = −(x(t) − x*)ᵀ ∇_x F(x(t), z) ≤ F(x*, z) − F(x(t), z)

implies

F(x(t), z) ≤ F(x*, z) = F(x*, z*),

for all t ∈ [0, ∞). Therefore, the monotonically nonincreasing function t ↦ F(x(t), z) converges to F(x*, z*), which is also an upper bound on its values. This can only be possible if F(x(t), z) = F(x*, z*) for all t ∈ [0, ∞). This further implies ∇_x F(x(t), z) = 0 for all t ∈ [0, ∞), and hence (x, z) ∈ S. Consequently, M ⊂ S. Since (x*, z*) has been chosen arbitrarily, the convergence property holds in a neighborhood of S. The pointwise convergence follows now from the application of Lemma A.3.

The assumption (ii) in the above result is a generalization of the local strict convexity condition for the function F(·, z*). That is, (ii) allows other points in the neighborhood of x* to have the same value of the function F(·, z*) as that at x*, as long as they are saddle points (whereas, under local strict convexity, x* is the unique local minimizer of F(·, z*)). The next result extends the conclusions of Proposition 4.4 globally when the assumptions hold globally.

Corollary 4.5. (Global asymptotic stability of the set of saddle points via convexity-linearity):
For a C¹ function F : Rⁿ × Rᵐ → R, if
(i) F is globally convex-concave and linear in z,
(ii) for each (x*, z*) ∈ Saddle(F), if F(x, z*) = F(x*, z*), then (x, z*) ∈ Saddle(F),
then Saddle(F) is globally asymptotically stable under the saddle-point dynamics X_sp and, moreover, convergence of trajectories is to a point.

Example 4.6. (Saddle-point dynamics for convex optimization):
Consider the following convex optimization problem on R³,

(4.3a) minimize (x₁ + x₂ + x₃)²,
(4.3b) subject to x₂ = x₃.

The set of solutions of this optimization is {x ∈ R³ | x₁ + x₂ + x₃ = 0, x₂ = x₃}, with Lagrangian

(4.4) L(x, z) = (x₁ + x₂ + x₃)² + z(x₂ − x₃),

where z ∈ R is the Lagrange multiplier. The set of saddle points of L (which corresponds to the set of primal-dual solutions to (4.3)) is Saddle(L) = {(x, z) ∈ R³ × R | x₁ + x₂ + x₃ = 0, x₂ = x₃, and z = 0}. However, L is not strictly convex-concave and hence it does not satisfy the hypotheses of Corollary 4.2. While L is globally convex-concave and linear in z, it does not satisfy assumption (ii) of Corollary 4.5. Therefore, to identify a dynamics that renders Saddle(L) asymptotically stable, we form the augmented Lagrangian

(4.5) L̃(x, z) = L(x, z) + (x₂ − x₃)²,

which has the same set of saddle points as L. Note that L̃ is not strictly convex-concave but it is globally convex-concave (this can be seen by computing its Hessian) and is linear in z. Moreover, given any (x*, z*) ∈ Saddle(L), we have L̃(x*, z*) = 0, and if L̃(x, z*) = L̃(x*, z*) = 0, then (x, z*) ∈ Saddle(L). By Corollary 4.5, the trajectories of the saddle-point dynamics of L̃ converge to a point in Saddle(L) and hence solve the optimization problem (4.3). Figure 4.1 illustrates this fact. Note that the point of convergence depends on the initial condition. •

Remark 4.7. (Relationship with results on primal-dual dynamics: II):
The work [17, Section 4] considers concave optimization problems under inequality constraints where the objective function is not strictly concave, but analyzes the convergence properties of a different dynamics. Specifically, the paper studies a discontinuous dynamics based on the saddle-point information of an augmented Lagrangian combined with a projection operator that restricts the dual variables to the nonnegative orthant. We have verified that, for the formulation of the concave optimization problem in [17] but with equality constraints, the augmented Lagrangian satisfies the hypotheses of Corollary 4.5, implying that the dynamics X_sp renders the primal-dual optima of the problem asymptotically stable. •

Fig. 4.1. (a) Trajectory of the saddle-point dynamics of the augmented Lagrangian L̃ in (4.5) for the optimization problem (4.3). The trajectory converges to a point in Saddle(L). (b) Evolution of the objective function of the optimization (4.3) along the trajectory. The value converges to the minimum, 0.

Motivated by the aim of further relaxing the conditions for asymptotic convergence, we conclude this section by weakening the convexity-concavity requirement on the saddle function. The next result shows that strong quasiconvexity-quasiconcavity is sufficient to ensure convergence of the saddle-point dynamics.

Proposition 4.8. (Local asymptotic stability of the set of saddle points via strong quasiconvexity-quasiconcavity):
Let F : Rⁿ × Rᵐ → R be C² and the map (x, z) ↦ ∇_{xz} F(x, z) be locally Lipschitz. Assume that F is locally jointly strongly quasiconvex-quasiconcave on Saddle(F). Then, each isolated path connected component of Saddle(F) is locally asymptotically stable under the saddle-point dynamics X_sp and, moreover, the convergence of trajectories is to a point. Further, if F is globally jointly strongly quasiconvex-quasiconcave and ∇_{xz} F is constant over Rⁿ × Rᵐ, then Saddle(F) is globally asymptotically stable under X_sp and the convergence of trajectories is to a point.

Proof. Let (x*, z*) ∈ S, where S is an isolated path connected component of Saddle(F), and consider the function V : Rⁿ × Rᵐ → R_{≥0} defined in (4.1). Let U be the neighborhood of (x*, z*) where the local joint strong quasiconvexity-quasiconcavity holds. The Lie derivative of V along the saddle-point dynamics at (x, z) ∈ U can be written as

(4.6) L_{X_sp} V(x, z) = −(x − x*)ᵀ ∇_x F(x, z) + (z − z*)ᵀ ∇_z F(x, z)
  = −(x − x*)ᵀ ∇_x F(x, z*) + (z − z*)ᵀ ∇_z F(x*, z) + M₁ + M₂,

where

M₁ = −(x − x*)ᵀ ( ∇_x F(x, z) − ∇_x F(x, z*) ),
M₂ = (z − z*)ᵀ ( ∇_z F(x, z) − ∇_z F(x*, z) ).

Writing

∇_x F(x, z) − ∇_x F(x, z*) = ∫₀¹ ∇_{zx} F(x, z* + t(z − z*)) (z − z*) dt,
∇_z F(x, z) − ∇_z F(x*, z) = ∫₀¹ ∇_{xz} F(x* + t(x − x*), z) (x − x*) dt,

we get

(4.7) M₁ + M₂ = (z − z*)ᵀ ( ∫₀¹ ( ∇_{xz} F(x* + t(x − x*), z) − ∇_{xz} F(x, z* + t(z − z*)) ) dt ) (x − x*)
  ≤ ‖z − z*‖ ( L‖x − x*‖ + L‖z − z*‖ ) ‖x − x*‖,

where in the inequality we have used the fact that ∇_{xz} F is locally Lipschitz with some constant L > 0. From the first-order property of a strongly quasiconvex function, cf. Lemma A.2, there exist constants s₁, s₂ > 0 such that

(4.8a) −(x − x*)ᵀ ∇_x F(x, z*) ≤ −s₁ ‖x − x*‖²,
(4.8b) (z − z*)ᵀ ∇_z F(x*, z) ≤ −s₂ ‖z − z*‖²,

for all (x, z) ∈ U. Substituting (4.7) and (4.8) into the expression for the Lie derivative (4.6), we obtain

L_{X_sp} V(x, z) ≤ −s₁ ‖x − x*‖² − s₂ ‖z − z*‖² + L ‖x − x*‖² ‖z − z*‖ + L ‖x − x*‖ ‖z − z*‖².

To conclude the proof, note that if ‖z − z*‖ < s₁/L and ‖x − x*‖ < s₂/L, then L_{X_sp} V(x, z) < 0, which implies local asymptotic stability. The pointwise convergence follows from Lemma A.3. The global asymptotic stability can be reasoned using similar arguments as above, using the fact that here M₁ + M₂ = 0 because ∇_{xz} F is constant.

In the following, we present an example where the above result is employed to explain local asymptotic convergence. In this case, none of the results from Sections 4.1 and 4.2 apply, thereby justifying the importance of the above result.

Example 4.9. (Convergence for a locally jointly strongly quasiconvex-quasiconcave function):
Consider $F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ given by
$$F(x,z) = (2 - e^{-x^2})(1 + e^{-z^2}). \quad (4.9)$$
Note that $F$ is $C^2$ and $\nabla_{xz}F(x,z) = -4xz\,e^{-x^2}e^{-z^2}$ is locally Lipschitz. To see this, note that the function $x \mapsto x e^{-x^2}$ is bounded and is locally Lipschitz (as its derivative is bounded). Further, the product of two bounded and locally Lipschitz functions is locally Lipschitz [32, Theorem 4.6.3] and so $(x,z) \mapsto \nabla_{xz}F(x,z)$ is locally Lipschitz. The set of saddle points of $F$ is $\mathrm{Saddle}(F) = \{(0,0)\}$. Next, we show that $x \mapsto f(x) = c_1 - c_2 e^{-x^2}$, $c_1, c_2 > 0$, is locally strongly quasiconvex at $0$. Fix $\delta > 0$ and let $x, y \in B_\delta(0)$ be such that $f(y) \le f(x)$. Then $|y| \le |x|$ and
$$\max\{f(x), f(y)\} - f(\lambda x + (1-\lambda)y) - s\lambda(1-\lambda)(x-y)^2$$
$$= c_2\big( -e^{-x^2} + e^{-(\lambda x + (1-\lambda)y)^2} \big) - s\lambda(1-\lambda)(x-y)^2$$
$$= c_2 e^{-x^2}\big( e^{x^2 - (\lambda x + (1-\lambda)y)^2} - 1 \big) - s\lambda(1-\lambda)(x-y)^2$$
$$\ge c_2 e^{-x^2}\big( x^2 - (\lambda x + (1-\lambda)y)^2 \big) - s\lambda(1-\lambda)(x-y)^2$$
$$= (1-\lambda)(x-y)\Big( c_2 e^{-x^2}(x+y) + \lambda(x-y)\big( c_2 e^{-x^2} - s \big) \Big) \ge 0,$$
for $s \le c_2 e^{-\delta^2}$, given the fact that $|y| \le |x|$. Therefore, $f$ is locally strongly quasiconvex and so $-f$ is locally strongly quasiconcave. Using these facts, we deduce that $F$ is locally jointly strongly quasiconvex-quasiconcave. Thus, the hypotheses of Proposition 4.8 are met, implying local asymptotic stability of $\mathrm{Saddle}(F)$ under the saddle-point dynamics. Figure 4.2 illustrates this fact in simulation. Note that $F$ does not satisfy the conditions outlined in the results of Sections 4.1 and 4.2. $\bullet$

Fig. 4.2. (a) Trajectory of the saddle-point dynamics for $F$ given in (4.9), converging to the saddle point $(0,0)$. (b) Evolution of the function $V$ along the trajectory.
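As a sanity check on Example 4.9, the following sketch numerically integrates the saddle-point dynamics for (4.9) with a forward-Euler scheme; the integrator, step size, and initial condition are our own illustrative choices, not part of the paper.

```python
import math

# Forward-Euler simulation of the saddle-point dynamics
#   xdot = -dF/dx,  zdot = +dF/dz
# for F(x, z) = (2 - exp(-x^2)) * (1 + exp(-z^2)) from Example 4.9.

def grad_x(x, z):
    # dF/dx = 2 x exp(-x^2) (1 + exp(-z^2))
    return 2.0 * x * math.exp(-x**2) * (1.0 + math.exp(-z**2))

def grad_z(x, z):
    # dF/dz = -2 z exp(-z^2) (2 - exp(-x^2))
    return -2.0 * z * math.exp(-z**2) * (2.0 - math.exp(-x**2))

def simulate(x, z, h=1e-2, steps=20000):
    for _ in range(steps):
        x, z = x - h * grad_x(x, z), z + h * grad_z(x, z)
    return x, z

x, z = simulate(0.5, 0.8)
print(abs(x), abs(z))  # both should end up close to the saddle point (0, 0)
```

Consistent with Proposition 4.8, the trajectory approaches the unique saddle point $(0,0)$.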
5. Convergence analysis for general saddle functions.
We study here the convergence properties of the saddle-point dynamics associated to functions that are not convex-concave. Our first result explores conditions for local asymptotic stability based on the linearization of the dynamics and properties of the eigenstructure of the Jacobian matrices. In particular, we assume that $X_{\mathrm{sp}}$ is piecewise $C^2$ and that the set of limit points of the Jacobian of $X_{\mathrm{sp}}$ at any saddle point has a common kernel and negative real parts for the nonzero eigenvalues. The proof is a direct consequence of Proposition A.5.

Proposition 5.1. (Local asymptotic stability of manifold of saddle points via linearization -- piecewise $C^2$ saddle function): Given $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, let $\mathcal{S} \subset \mathrm{Saddle}(F)$ be a $p$-dimensional submanifold of saddle points. Assume that $F$ is $C^1$ with locally Lipschitz gradient on a neighborhood of $\mathcal{S}$ and that the vector field $X_{\mathrm{sp}}$ is piecewise $C^2$. Assume that at each $(x_*,z_*) \in \mathcal{S}$, the set of matrices $\mathcal{A}_* \subset \mathbb{R}^{(n+m)\times(n+m)}$ defined as
$$\mathcal{A}_* = \{ \lim_{k\to\infty} DX_{\mathrm{sp}}(x_k,z_k) \mid (x_k,z_k) \to (x_*,z_*),\ (x_k,z_k) \in \mathbb{R}^{n+m} \setminus \Omega_{X_{\mathrm{sp}}} \},$$
where $\Omega_{X_{\mathrm{sp}}}$ is the set of points where $X_{\mathrm{sp}}$ is not differentiable, satisfies the following:
(i) there exists an orthogonal matrix $Q \in \mathbb{R}^{(n+m)\times(n+m)}$ such that
$$Q^\top A Q = \begin{bmatrix} \tilde{A} & 0 \\ 0 & 0 \end{bmatrix} \quad (5.1)$$
for all $A \in \mathcal{A}_*$, where $\tilde{A} \in \mathbb{R}^{(n+m-p)\times(n+m-p)}$,
(ii) the nonzero eigenvalues of the matrices in $\mathcal{A}_*$ have negative real parts,
(iii) there exists a positive definite matrix $P \in \mathbb{R}^{(n+m-p)\times(n+m-p)}$ such that $\tilde{A}^\top P + P\tilde{A} \prec 0$ for all $\tilde{A}$ obtained by applying the transformation (5.1) to each $A \in \mathcal{A}_*$.
Then, $\mathcal{S}$ is locally asymptotically stable under the saddle-point dynamics $X_{\mathrm{sp}}$ and the trajectories converge to a point in $\mathcal{S}$.

When $F$ is sufficiently smooth, we can refine the above result as follows.

Corollary 5.2. (Local asymptotic stability of manifold of saddle points via linearization -- $C^3$ saddle function): Given $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, let $\mathcal{S} \subset \mathrm{Saddle}(F)$ be a $p$-dimensional manifold of saddle points. Assume $F$ is $C^3$ on a neighborhood of $\mathcal{S}$ and that the Jacobian of $X_{\mathrm{sp}}$ at each point in $\mathcal{S}$ has no eigenvalues on the imaginary axis other than $0$, which is semisimple with multiplicity $p$. Then, $\mathcal{S}$ is locally asymptotically stable under the saddle-point dynamics $X_{\mathrm{sp}}$ and the trajectories converge to a point.

Proof. Since $F$ is $C^3$, the map $X_{\mathrm{sp}}$ is $C^2$ and so the limit point of the Jacobian matrices at a saddle point $(x_*,z_*) \in \mathcal{S}$ is the Jacobian at that point itself, that is,
$$DX_{\mathrm{sp}} = \begin{bmatrix} -\nabla_{xx}F & -\nabla_{xz}F \\ \nabla_{zx}F & \nabla_{zz}F \end{bmatrix}_{(x_*,z_*)}.$$
From the definition of saddle point, we have $\nabla_{xx}F(x_*,z_*) \succeq 0$ and $\nabla_{zz}F(x_*,z_*) \preceq 0$. Hence $DX_{\mathrm{sp}} + DX_{\mathrm{sp}}^\top \preceq 0$, and since $\mathrm{Re}(\lambda_i(DX_{\mathrm{sp}})) \le \lambda_{\max}\big(\tfrac{1}{2}(DX_{\mathrm{sp}} + DX_{\mathrm{sp}}^\top)\big)$ [4, Fact 5.10.28], we deduce that $\mathrm{Re}(\lambda_i(DX_{\mathrm{sp}})) \le 0$. The statement now follows from Proposition 5.1, using the fact that the properties of the eigenvalues of $DX_{\mathrm{sp}}$ shown here imply the existence of an orthonormal transformation leading to a form of $DX_{\mathrm{sp}}$ that satisfies assumptions (i)-(iii) of Proposition 5.1.

Next, we provide a sufficient condition under which the Jacobian of $X_{\mathrm{sp}}$ for a saddle function $F$ that is linear in its second argument satisfies the hypothesis of Corollary 5.2 regarding the lack of eigenvalues on the imaginary axis other than $0$.

Lemma 5.3. (Sufficient condition for absence of imaginary eigenvalues of the Jacobian of $X_{\mathrm{sp}}$): Let $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be $C^2$ and linear in the second argument. Then, the Jacobian of $X_{\mathrm{sp}}$ at any saddle point $(x_*,z_*)$ of $F$ has no eigenvalues on the imaginary axis except for $0$ if $\mathrm{range}(\nabla_{zx}F(x_*,z_*)) \cap \mathrm{null}(\nabla_{xx}F(x_*,z_*)) = \{0\}$.

Proof. The Jacobian of $X_{\mathrm{sp}}$ at a saddle point $(x_*,z_*)$ for a saddle function $F$ that is linear in $z$ is given by
$$DX_{\mathrm{sp}} = \begin{bmatrix} A & B \\ -B^\top & 0 \end{bmatrix},$$
where $A = -\nabla_{xx}F(x_*,z_*)$ and $B = -\nabla_{zx}F(x_*,z_*)$. We reason by contradiction. Let $i\lambda$, $\lambda \neq 0$, be an imaginary eigenvalue of $DX_{\mathrm{sp}}$ with corresponding eigenvector $a + ib$. Let $a = (a_1; a_2)$ and $b = (b_1; b_2)$, where $a_1, b_1 \in \mathbb{R}^n$ and $a_2, b_2 \in \mathbb{R}^m$. Then, the real and imaginary parts of the condition $DX_{\mathrm{sp}}(a+ib) = (i\lambda)(a+ib)$ yield
$$A a_1 + B a_2 = -\lambda b_1, \qquad -B^\top a_1 = -\lambda b_2, \quad (5.2)$$
$$A b_1 + B b_2 = \lambda a_1, \qquad -B^\top b_1 = \lambda a_2. \quad (5.3)$$
Pre-multiplying the first equation of (5.2) with $a_1^\top$ gives $a_1^\top A a_1 + a_1^\top B a_2 = -\lambda a_1^\top b_1$. Using the second equation of (5.2), we get $a_1^\top A a_1 = -\lambda(a_1^\top b_1 + a_2^\top b_2)$. A similar procedure for the set of equations in (5.3) gives $b_1^\top A b_1 = \lambda(a_1^\top b_1 + a_2^\top b_2)$. These conditions imply that $a_1^\top A a_1 = -b_1^\top A b_1$. Since $A$ is negative semidefinite, we obtain $a_1, b_1 \in \mathrm{null}(A)$. Note that $(a_1, b_1) \neq (0,0)$, because otherwise it would mean that $a = b = 0$. Further, using this fact in the first equations of (5.2) and (5.3), respectively, we get
$$B a_2 = -\lambda b_1, \qquad B b_2 = \lambda a_1.$$
That is, $a_1, b_1 \in \mathrm{range}(B)$, a contradiction.

The following example illustrates an application of the above results to a nonconvex constrained optimization problem.

Example 5.4. (Saddle-point dynamics for nonconvex optimization):
Consider the following constrained optimization problem on $\mathbb{R}^3$:
$$\text{minimize} \quad (\|x\|^2 - 1)^2, \quad (5.4a)$$
$$\text{subject to} \quad x_3 = 0.5, \quad (5.4b)$$
where $x = (x_1, x_2, x_3) \in \mathbb{R}^3$. The optimizers are $\{x \in \mathbb{R}^3 \mid x_3 = 0.5,\ x_1^2 + x_2^2 = 0.75\}$. The Lagrangian $L : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}$ is given by
$$L(x,z) = (\|x\|^2 - 1)^2 + z(x_3 - 0.5),$$
and its set of saddle points is the one-dimensional manifold $\mathrm{Saddle}(L) = \{(x,z) \in \mathbb{R}^3 \times \mathbb{R} \mid x_3 = 0.5,\ x_1^2 + x_2^2 = 0.75,\ z = 0\}$. The saddle-point dynamics of $L$ takes the form
$$\dot{x} = -4(\|x\|^2 - 1)x - [0, 0, z]^\top, \quad (5.5a)$$
$$\dot{z} = x_3 - 0.5. \quad (5.5b)$$
Note that $\mathrm{Saddle}(L)$ is nonconvex and that $L$ is nonconvex in its first argument on any neighborhood of any saddle point. Therefore, results that rely on the convexity-concavity properties of $L$ are not applicable to establish the asymptotic convergence of (5.5) to the set of saddle points. This can, however, be established through Corollary 5.2 by observing that the Jacobian of $X_{\mathrm{sp}}$ at any point of $\mathrm{Saddle}(L)$ has $0$ as an eigenvalue with multiplicity one and the rest of the eigenvalues are not on the imaginary axis. To show this, consider $(x_*,z_*) \in \mathrm{Saddle}(L)$. Note that
$$DX_{\mathrm{sp}}(x_*,z_*) = \begin{bmatrix} -8 x_* x_*^\top & -e_3 \\ e_3^\top & 0 \end{bmatrix},$$
where $e_3 = [0,0,1]^\top$. One can deduce from this that $v \in \mathrm{null}(DX_{\mathrm{sp}}(x_*,z_*))$ if and only if $x_*^\top [v_1, v_2, v_3]^\top = 0$, $v_3 = 0$, and $v_4 = 0$. These three conditions define a one-dimensional space and so $0$ is an eigenvalue of $DX_{\mathrm{sp}}(x_*,z_*)$ with multiplicity $1$. To show that the rest of the eigenvalues do not lie on the imaginary axis, we show that the hypotheses of Lemma 5.3 are met. At any saddle point $(x_*,z_*)$, we have $\nabla_{zx}L(x_*,z_*) = e_3$ and $\nabla_{xx}L(x_*,z_*) = 8 x_* x_*^\top$. If $v \in \mathrm{range}(\nabla_{zx}L(x_*,z_*)) \cap \mathrm{null}(\nabla_{xx}L(x_*,z_*))$, then $v = [0,0,\lambda]^\top$, $\lambda \in \mathbb{R}$, and $x_*^\top v = 0$. Since $(x_*)_3 = 0.5$, we get $\lambda = 0$ and hence the hypotheses of Lemma 5.3 are satisfied. Figure 5.1 illustrates in simulation the convergence of the trajectories to a saddle point. The point of convergence depends on the initial condition. $\bullet$

There are functions that do not satisfy the hypotheses of Proposition 5.1 whose saddle-point dynamics still seems to enjoy local asymptotic convergence properties.
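The eigenvalue structure invoked in Example 5.4 can also be checked numerically. The sketch below builds the Jacobian of $X_{\mathrm{sp}}$ at one saddle point of the Lagrangian, using the block form derived in the example, and verifies that $0$ is an eigenvalue of multiplicity one while every other eigenvalue has negative real part; the particular saddle point and the tolerances are our own choices.

```python
import numpy as np

# Jacobian of the saddle-point dynamics for the Lagrangian of (5.4) at a
# saddle point (x*, 0) with x3* = 0.5 and x1*^2 + x2*^2 = 0.75, using the
# block form [[-8 x* x*^T, -e3], [e3^T, 0]] from Example 5.4.
x_star = np.array([np.sqrt(0.75), 0.0, 0.5])
e3 = np.array([0.0, 0.0, 1.0])

J = np.zeros((4, 4))
J[:3, :3] = -8.0 * np.outer(x_star, x_star)  # -grad_xx L at the saddle point
J[:3, 3] = -e3                               # -grad_xz L
J[3, :3] = e3                                # grad_zx L
# J[3, 3] = 0 since L is linear in z

eigs = np.linalg.eigvals(J)
zero_eigs = [e for e in eigs if abs(e) < 1e-8]
other_eigs = [e for e in eigs if abs(e) >= 1e-8]
print(len(zero_eigs))                        # multiplicity of the 0 eigenvalue
print(all(e.real < 0 for e in other_eigs))   # rest are off the imaginary axis
```

The computed spectrum has a single zero eigenvalue (matching the one-dimensional saddle manifold) and three eigenvalues with strictly negative real part, as Corollary 5.2 requires.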
Fig. 5.1. (a) Trajectory of the saddle-point dynamics (5.5) for the Lagrangian of the constrained optimization problem (5.4), converging to a point of $\mathrm{Saddle}(L)$. (b) Evolution of the objective function of the optimization (5.4) along the trajectory; the value converges to the minimum, $0$.

As an example, consider the function $F : \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}$,
$$F(x,z) = (\|x\| - 1)^4 - z^2\|x\|^2, \quad (5.6)$$
whose set of saddle points is the one-dimensional manifold $\mathrm{Saddle}(F) = \{(x,z) \in \mathbb{R}^2 \times \mathbb{R} \mid \|x\| = 1,\ z = 0\}$. The Jacobian of the saddle-point dynamics at any $(x,z) \in \mathrm{Saddle}(F)$ has $0$ as an eigenvalue with multiplicity two, greater than the dimension of $\mathrm{Saddle}(F)$ (and therefore Proposition 5.1 cannot be applied). Simulations show that the trajectories of the saddle-point dynamics asymptotically approach $\mathrm{Saddle}(F)$ if the initial condition is close enough to this set. Our next result allows us to formally establish this fact by studying the behavior of the distance function along the proximal normals to $\mathrm{Saddle}(F)$.

Proposition 5.5. (Asymptotic stability of manifold of saddle points via proximal normals): Let $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ be $C^2$ and $\mathcal{S} \subset \mathrm{Saddle}(F)$ be a closed set. Assume there exist constants $\lambda_M, k_1, k_2, \alpha_1, \beta_1 > 0$ and $L_x, L_z, \alpha_2, \beta_2 \ge 0$ such that the following hold:
(i) either $L_x = 0$ or $\alpha_1 \le \alpha_2 + 1$,
(ii) either $L_z = 0$ or $\beta_1 \le \beta_2 + 1$,
(iii) for every $(x_*,z_*) \in \mathcal{S}$ and every proximal normal $\eta = (\eta_x, \eta_z) \in \mathbb{R}^n \times \mathbb{R}^m$ to $\mathcal{S}$ at $(x_*,z_*)$ with $\|\eta\| = 1$, the functions
$$[0,\lambda_M) \ni \lambda \mapsto F(x_* + \lambda\eta_x, z_*), \qquad [0,\lambda_M) \ni \lambda \mapsto F(x_*, z_* + \lambda\eta_z)$$
are convex and concave, respectively, with
$$F(x_* + \lambda\eta_x, z_*) - F(x_*,z_*) \ge k_1 \|\lambda\eta_x\|^{\alpha_1}, \quad (5.7a)$$
$$F(x_*, z_* + \lambda\eta_z) - F(x_*,z_*) \le -k_2 \|\lambda\eta_z\|^{\beta_1}, \quad (5.7b)$$
and, for all $\lambda \in [0,\lambda_M)$ and all $t \in [0,1]$,
$$\|\nabla_{xz}F(x_* + t\lambda\eta_x, z_* + \lambda\eta_z) - \nabla_{xz}F(x_* + \lambda\eta_x, z_* + t\lambda\eta_z)\| \le L_x \|\lambda\eta_x\|^{\alpha_2} + L_z \|\lambda\eta_z\|^{\beta_2}. \quad (5.8)$$
Then, $\mathcal{S}$ is locally asymptotically stable under the saddle-point dynamics $X_{\mathrm{sp}}$. Moreover, the convergence of the trajectories is to a point if every point of $\mathcal{S}$ is stable. The convergence is global if, for every $\lambda_M > 0$, there exist $k_1, k_2, \alpha_1, \beta_1 > 0$ such that the above hypotheses (i)-(iii) are satisfied by these constants along with $L_x = L_z = 0$.

Proof. Our proof is based on showing that there exists $\bar{\lambda} \in (0,\lambda_M]$ such that the squared distance function $d_{\mathcal{S}}^2$ decreases monotonically and converges to zero along the trajectories of $X_{\mathrm{sp}}$ that start in $\mathcal{S} + B_{\bar{\lambda}}(0)$. From (2.2),
$$\partial d_{\mathcal{S}}^2(x,z) = \mathrm{co}\{ 2(x - x_*; z - z_*) \mid (x_*,z_*) \in \mathrm{proj}_{\mathcal{S}}(x,z) \}.$$
Following [12], we compute the set-valued Lie derivative of $d_{\mathcal{S}}^2$ along $X_{\mathrm{sp}}$, denoted $\mathcal{L}_{X_{\mathrm{sp}}} d_{\mathcal{S}}^2 : \mathbb{R}^n \times \mathbb{R}^m \rightrightarrows \mathbb{R}$, as
$$\mathcal{L}_{X_{\mathrm{sp}}} d_{\mathcal{S}}^2(x,z) = \mathrm{co}\{ -2(x-x_*)^\top \nabla_x F(x,z) + 2(z-z_*)^\top \nabla_z F(x,z) \mid (x_*,z_*) \in \mathrm{proj}_{\mathcal{S}}(x,z) \}.$$
Since $d_{\mathcal{S}}^2$ is locally Lipschitz and regular, cf. Section 2.2, the evolution of the function $d_{\mathcal{S}}^2$ along any trajectory $t \mapsto (x(t), z(t))$ of (3.1) is differentiable at almost all $t \in \mathbb{R}_{\ge 0}$, and furthermore, cf. [12, Proposition 10],
$$\frac{d}{dt}\big( d_{\mathcal{S}}^2(x(t), z(t)) \big) \in \mathcal{L}_{X_{\mathrm{sp}}} d_{\mathcal{S}}^2(x(t), z(t))$$
for almost all $t \in \mathbb{R}_{\ge 0}$. Therefore, our goal is to show that $\max \mathcal{L}_{X_{\mathrm{sp}}} d_{\mathcal{S}}^2(x,z) < 0$ for all $(x,z) \in (\mathcal{S} + B_{\bar{\lambda}}(0)) \setminus \mathcal{S}$ for some $\bar{\lambda} \in (0,\lambda_M]$. Let $(x,z) \in \mathcal{S} + B_{\lambda_M}(0)$ and take $(x_*,z_*) \in \mathrm{proj}_{\mathcal{S}}(x,z)$. By definition, there exists a proximal normal $\eta = (\eta_x, \eta_z)$ to $\mathcal{S}$ at $(x_*,z_*)$ with $\|\eta\| = 1$ such that $x = x_* + \lambda\eta_x$, $z = z_* + \lambda\eta_z$, with $\lambda \in [0,\lambda_M)$. Let $\xi \in \mathcal{L}_{X_{\mathrm{sp}}} d_{\mathcal{S}}^2(x,z)$ denote
$$\xi = -2(x-x_*)^\top \nabla_x F(x,z) + 2(z-z_*)^\top \nabla_z F(x,z). \quad (5.9)$$
Writing
$$\nabla_x F(x,z) = \nabla_x F(x,z_*) + \int_0^1 \nabla_{zx}F(x, z_*+t(z-z_*))(z-z_*)\, dt,$$
$$\nabla_z F(x,z) = \nabla_z F(x_*,z) + \int_0^1 \nabla_{xz}F(x_*+t(x-x_*), z)(x-x_*)\, dt,$$
and substituting in (5.9), we get
$$\xi = -2(x-x_*)^\top \nabla_x F(x,z_*) + 2(z-z_*)^\top \nabla_z F(x_*,z) + 2(z-z_*)^\top M (x-x_*), \quad (5.10)$$
where $M = \int_0^1 \big( \nabla_{xz}F(x_*+t(x-x_*), z) - \nabla_{xz}F(x, z_*+t(z-z_*)) \big)\, dt$. Using the convexity and concavity along the proximal normal and applying the bounds (5.7), we obtain
$$-(x-x_*)^\top \nabla_x F(x,z_*) \le F(x_*,z_*) - F(x,z_*) \le -k_1 \|\lambda\eta_x\|^{\alpha_1}, \quad (5.11a)$$
$$(z-z_*)^\top \nabla_z F(x_*,z) \le F(x_*,z) - F(x_*,z_*) \le -k_2 \|\lambda\eta_z\|^{\beta_1}. \quad (5.11b)$$
On the other hand, using (5.8), we bound $M$ by
$$\|M\| \le L_x \|\lambda\eta_x\|^{\alpha_2} + L_z \|\lambda\eta_z\|^{\beta_2}. \quad (5.12)$$
Using (5.11) and (5.12) in (5.10), and rearranging the terms, yields
$$\xi \le 2\big( -k_1 \|\lambda\eta_x\|^{\alpha_1} + L_x \|\lambda\eta_x\|^{\alpha_2+1} \|\lambda\eta_z\| \big) + 2\big( -k_2 \|\lambda\eta_z\|^{\beta_1} + L_z \|\lambda\eta_z\|^{\beta_2+1} \|\lambda\eta_x\| \big).$$
If $L_x = 0$, then the first parenthesis is negative whenever $\lambda\eta_x \neq 0$ (i.e., $x \neq x_*$). If $L_x \neq 0$ and $\alpha_1 \le \alpha_2 + 1$, then for $\|\lambda\eta_x\| < 1$ and $\|\lambda\eta_z\| < \min(1, k_1/L_x)$, the first parenthesis is negative whenever $\lambda\eta_x \neq 0$. Analogously, the second parenthesis is negative for $z \neq z_*$ if either $L_z = 0$ or $\beta_1 \le \beta_2 + 1$ with $\|\lambda\eta_z\| < 1$ and $\|\lambda\eta_x\| < \min(1, k_2/L_z)$.
Thus, if $\bar{\lambda} < \min\{1, k_1/L_x, k_2/L_z\}$ (excluding from the min operation the elements that are not well defined due to the denominator being zero), then hypotheses (i)-(ii) imply that $\xi < 0$ whenever $(x,z) \neq (x_*,z_*)$. Moreover, since $(x_*,z_*) \in \mathrm{proj}_{\mathcal{S}}(x,z)$ was chosen arbitrarily, we conclude that $\max \mathcal{L}_{X_{\mathrm{sp}}} d_{\mathcal{S}}^2(x,z) < 0$ for all $(x,z) \in (\mathcal{S} + B_{\bar{\lambda}}(0)) \setminus \mathcal{S}$, where $\bar{\lambda} \in (0,\lambda_M]$ satisfies $\bar{\lambda} < \min\{1, k_1/L_x, k_2/L_z\}$. This proves the local asymptotic stability. Finally, convergence to a point follows from Lemma A.3 and global convergence follows from the analysis done above.

Intuitively, the hypotheses of Proposition 5.5 imply that, along the proximal normals to the saddle set, the convexity (resp. concavity) in the $x$-coordinate (resp. $z$-coordinate) is 'stronger' than the influence of the $x$- and $z$-dynamics on each other, represented by the off-diagonal Hessian terms. When this coupling is absent (i.e., $\nabla_{xz}F \equiv 0$), the $x$- and $z$-dynamics are independent of each other and they function as individually aiming to minimize (resp. maximize) a function of one variable, thereby reaching a saddle point. Note that the assumptions of Proposition 5.5 do not imply that $F$ is locally convex-concave. As an example, the function in (5.6) is not convex-concave in any neighborhood of any saddle point, but we show next that it satisfies the assumptions of Proposition 5.5, establishing local asymptotic convergence of the respective saddle-point dynamics.

Example 5.6. (Convergence analysis via proximal normals):
Consider the function $F$ defined in (5.6). Consider a saddle point $(x_*,z_*) = ((\cos\theta, \sin\theta), 0) \in \mathrm{Saddle}(F)$, where $\theta \in [0, 2\pi)$. Let
$$\eta = (\eta_x, \eta_z) = ((a_1\cos\theta, a_1\sin\theta), a_2),$$
with $a_1, a_2 \in \mathbb{R}$ and $a_1^2 + a_2^2 = 1$, be a proximal normal to $\mathrm{Saddle}(F)$ at $(x_*,z_*)$. Note that the function $\lambda \mapsto F(x_* + \lambda\eta_x, z_*) = (\lambda a_1)^4$ is convex, satisfying (5.7a) with $k_1 = 1$ and $\alpha_1 = 4$. The function $\lambda \mapsto F(x_*, z_* + \lambda\eta_z) = -(\lambda a_2)^2$ is concave, satisfying (5.7b) with $k_2 = 1$, $\beta_1 = 2$. Also, given any $\lambda_M > 0$ and $t \in [0,1]$,
$$\|\nabla_{xz}F(x_* + t\lambda\eta_x, z_* + \lambda\eta_z) - \nabla_{xz}F(x_* + \lambda\eta_x, z_* + t\lambda\eta_z)\|$$
$$= \| -4(\lambda a_2)(1 + t\lambda a_1)(\cos\theta, \sin\theta) + 4(t\lambda a_2)(1 + \lambda a_1)(\cos\theta, \sin\theta) \|$$
$$\le 4|(\lambda a_2)(1 + t\lambda a_1)| + 4|(t\lambda a_2)(1 + \lambda a_1)|$$
$$\le 8(1 + \lambda_M)|\lambda a_2| \le L_z \|\lambda\eta_z\|,$$
for $\lambda \le \lambda_M$, where $L_z = 8(1 + \lambda_M)$. This implies that $L_x = 0$, $L_z \neq 0$, and $\beta_2 = 1$. Therefore, hypotheses (i)-(iii) of Proposition 5.5 are satisfied and this establishes local asymptotic convergence of the saddle-point dynamics. Figure 5.2 illustrates this fact. Note that since $L_z \neq 0$, we cannot guarantee global convergence. $\bullet$

Fig. 5.2. (a) Trajectory of the saddle-point dynamics for the function defined by (5.6), converging to a point of $\mathrm{Saddle}(F)$. (b) Evolution of the function $F$ along the trajectory; the value converges to $0$, the value that the function takes on its saddle set.

Interestingly, Propositions 5.1 and 5.5 complement each other. The function (5.6) satisfies the hypotheses of Proposition 5.5 but not those of Proposition 5.1. Conversely, the Lagrangian of the constrained optimization (5.4) satisfies the hypotheses of Proposition 5.1 but not those of Proposition 5.5.

In the next result, we consider yet another scenario where the saddle function might not be convex-concave in its arguments but the saddle-point dynamics converges to the set of equilibrium points. As a motivation, consider the function $F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, $F(x,z) = xz^2$. The set of saddle points of $F$ is $\mathrm{Saddle}(F) = \mathbb{R}_{\le 0} \times \{0\}$. At the saddle point $(0,0)$, the Jacobian of the saddle-point dynamics vanishes, so the linearization results above do not apply; nevertheless, the trajectories converge to the set of saddle points from almost all initial conditions in $\mathbb{R} \times \mathbb{R}$, see Figure 5.3 below. This asymptotic behavior can be characterized through the following result, which generalizes [23, Theorem 3].

Proposition 5.7. (Global asymptotic stability of equilibria of saddle-point dynamics for saddle functions linear in one argument):
For $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, assume the form $F(x,z) = g(z)^\top x$, where $g : \mathbb{R}^m \to \mathbb{R}^n$ is $C^1$. Assume that there exists $(x_*,z_*) \in \mathrm{Saddle}(F)$ such that
(i) $F(x_*,z_*) \ge F(x_*,z)$ for all $z \in \mathbb{R}^m$,
(ii) for any $z \in \mathbb{R}^m$, the condition $g(z)^\top x_* = 0$ implies $g(z) = 0$,
(iii) any trajectory of $X_{\mathrm{sp}}$ is bounded.
Then, all trajectories of the saddle-point dynamics $X_{\mathrm{sp}}$ converge asymptotically to the set of equilibrium points of $X_{\mathrm{sp}}$.

Proof. Consider the function $V : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, $V(x,z) = -x_*^\top x$. The Lie derivative of $V$ along the saddle-point dynamics $X_{\mathrm{sp}}$ is
$$\mathcal{L}_{X_{\mathrm{sp}}}V(x,z) = x_*^\top \nabla_x F(x,z) = x_*^\top g(z) = F(x_*,z) \le F(x_*,z_*) = 0, \quad (5.13)$$
where in the inequality we have used assumption (i), and $F(x_*,z_*) = 0$ is implied by the definition of the saddle point, that is, $\nabla_x F(x_*,z_*) = g(z_*) = 0$. Now consider any trajectory $t \mapsto (x(t), z(t))$, $(x(0), z(0)) \in \mathbb{R}^n \times \mathbb{R}^m$, of $X_{\mathrm{sp}}$. Since the trajectory is bounded by assumption (iii), the application of the LaSalle Invariance Principle [24, Theorem 4.4] yields that the trajectory converges to the largest invariant set $\mathcal{M}$ contained in $\{(x,z) \in \mathbb{R}^n \times \mathbb{R}^m \mid \mathcal{L}_{X_{\mathrm{sp}}}V(x,z) = 0\}$, which from (5.13) is equal to the set $\{(x,z) \in \mathbb{R}^n \times \mathbb{R}^m \mid F(x_*,z) = 0\}$. Let $(x,z) \in \mathcal{M}$. Then, we have $F(x_*,z) = g(z)^\top x_* = 0$ and by hypothesis (ii) we get $g(z) = 0$. Therefore, if $(x,z) \in \mathcal{M}$, then $g(z) = 0$. Consider the trajectory $t \mapsto (x(t), z(t))$ of $X_{\mathrm{sp}}$ with $(x(0), z(0)) = (x,z)$, which is contained in $\mathcal{M}$. Then, along the trajectory we have
$$\dot{x}(t) = -\nabla_x F(x(t), z(t)) = -g(z(t)) = 0.$$
Further, note that along this trajectory we have $g(z(t)) = 0$ for all $t \ge 0$. Thus, $\frac{d}{dt} g(z(t)) = 0$ for all $t \ge 0$, which implies that
$$\frac{d}{dt} g(z(t)) = Dg(z(t))\dot{z}(t) = Dg(z(t)) Dg(z(t))^\top x = 0.$$
From the above expression we deduce that $\dot{z}(t) = Dg(z(t))^\top x = 0$. This can be seen from the fact that $Dg(z(t)) Dg(z(t))^\top x = 0$ implies $x^\top Dg(z(t)) Dg(z(t))^\top x = \|Dg(z(t))^\top x\|^2 = 0$. From the above reasoning, we conclude that $(x,z)$ is an equilibrium point of $X_{\mathrm{sp}}$.

The proof of Proposition 5.7 hints at the fact that hypothesis (ii) can be omitted if information about other saddle points of $F$ is known. Specifically, consider the case where $n$ saddle points $(x_*^{(1)}, z_*^{(1)}), \ldots, (x_*^{(n)}, z_*^{(n)})$ of $F$ exist, each satisfying hypothesis (i) of Proposition 5.7 and such that the vectors $x_*^{(1)}, \ldots, x_*^{(n)}$ are linearly independent. In this scenario, for those points $z \in \mathbb{R}^m$ such that $g(z)^\top x_*^{(i)} = 0$ for all $i \in \{1, \ldots, n\}$ (as would be obtained in the proof), the linear independence of the $x_*^{(i)}$'s already implies that $g(z) = 0$, making hypothesis (ii) unnecessary.

Corollary 5.8. (Almost global asymptotic stability of saddle points for saddle functions linear in one argument):
If, in addition to the hypotheses of Proposition 5.7, the equilibria of $X_{\mathrm{sp}}$ other than those belonging to $\mathrm{Saddle}(F)$ are unstable, then the trajectories of $X_{\mathrm{sp}}$ converge asymptotically to $\mathrm{Saddle}(F)$ from almost all initial conditions (all but the unstable equilibria). Moreover, if each point in $\mathrm{Saddle}(F)$ is stable under $X_{\mathrm{sp}}$, then $\mathrm{Saddle}(F)$ is almost globally asymptotically stable under the saddle-point dynamics $X_{\mathrm{sp}}$ and the trajectories converge to a point in $\mathrm{Saddle}(F)$.

Next, we illustrate how the above result can be applied to the motivating example given before Proposition 5.7 to infer almost global convergence of the trajectories.
Example 5.9. (Convergence for saddle functions linear in one argument):
Consider again $F(x,z) = xz^2$, with $\mathrm{Saddle}(F) = \{(x,z) \in \mathbb{R} \times \mathbb{R} \mid x \le 0,\ z = 0\}$. Pick $(x_*,z_*) = (-1, 0)$; hypotheses (i) and (ii) of Proposition 5.7 hold with this choice (here $g(z) = z^2$). Moreover, along the saddle-point dynamics of $F$, the function $x^2 + z^2/2$ is preserved, which implies that all trajectories are bounded. One can also see that the equilibria of the saddle-point dynamics that are not saddle points, that is, the set $\mathbb{R}_{>0} \times \{0\}$, are unstable. Therefore, from Corollary 5.8, we conclude that the trajectories of the saddle-point dynamics asymptotically converge to the set of saddle points from almost all initial conditions. Figure 5.3 illustrates these observations. $\bullet$

Fig. 5.3. (a) Trajectory of the saddle-point dynamics for the function $F(x,z) = xz^2$, converging to a point of $\mathrm{Saddle}(F)$. (b) Evolution of the function $F$ along the trajectory; the value converges to $0$, the value that the function takes on its saddle set. (c) The vector field $X_{\mathrm{sp}}$, depicting that the set of saddle points is attractive while the other equilibrium points $\mathbb{R}_{>0} \times \{0\}$ are unstable.
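To complement Figure 5.3, the following forward-Euler sketch of the dynamics $\dot{x} = -z^2$, $\dot{z} = 2xz$ for $F(x,z) = xz^2$ checks both the convergence to the saddle set and the (approximate) conservation of $x^2 + z^2/2$; the step size, horizon, and initial condition are our own illustrative choices.

```python
# Forward-Euler sketch of the saddle-point dynamics for F(x, z) = x z^2:
#   xdot = -dF/dx = -z^2,   zdot = +dF/dz = 2 x z.

def simulate(x, z, h=1e-4, steps=50000):
    for _ in range(steps):
        x, z = x - h * z * z, z + h * 2.0 * x * z
    return x, z

x0, z0 = 5.0, 1.0
c0 = x0**2 + z0**2 / 2.0      # quantity conserved along the exact flow
x, z = simulate(x0, z0)
print(x < 0 and abs(z) < 1e-6)            # trajectory ends on R_{<=0} x {0}
print(abs(x**2 + z**2 / 2.0 - c0) < 1.0)  # conservation up to Euler error
```

Even though the initial condition sits on the unstable side ($x_0 > 0$), the trajectory crosses over and settles on the attractive portion of the equilibrium set, as the vector field in Figure 5.3(c) suggests.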
6. Conclusions.
We have studied the asymptotic stability of the saddle-point dynamics associated to a continuously differentiable function. We have identified a set of complementary conditions under which the trajectories of the dynamics are guaranteed to converge to the set of saddle points of the saddle function and, wherever feasible, we have also established global stability guarantees and convergence to a point in the set. Our first class of convergence results is based on the convexity-concavity properties of the saddle function. In the absence of these properties, our second class of results explores, respectively, the existence of convergence guarantees using linearization techniques, the properties of the saddle function along proximal normals to the set of saddle points, and the linearity properties of the saddle function in one variable. For the linearization result, borrowing ideas from center manifold theory, we have established a general stability result for a manifold of equilibria of a piecewise twice continuously differentiable vector field. Several examples throughout the paper highlight the connections among the results and illustrate their applicability, in particular, for finding the primal-dual solutions of constrained optimization problems. Future work will study the robustness properties of the dynamics against disturbances, investigate the characterization of the rate of convergence, generalize the results to the case of nonsmooth functions (where the associated saddle-point dynamics takes the form of a differential inclusion involving the generalized gradient of the function), and explore the application to optimization problems with inequality constraints. We also plan to build on our results to synthesize distributed algorithmic solutions for various networked optimization problems in power networks.
7. Acknowledgements.
Ashish Cherukuri and Jorge Cortés's research was partially supported by NSF award ECCS-1307176. Bahman Gharesifard's research was supported by an NSERC Discovery grant.
REFERENCES

[1] P.-A. Absil and K. Kurdyka, On the stable equilibrium points of gradient systems, Systems & Control Letters, 55 (2006), pp. 573-577.
[2] K. Arrow, L. Hurwicz, and H. Uzawa, Studies in Linear and Non-Linear Programming, Stanford University Press, Stanford, California, 1958.
[3] T. Başar and G. J. Olsder, Dynamic Noncooperative Game Theory, Academic Press, 1982.
[4] D. S. Bernstein, Matrix Mathematics, Princeton University Press, Princeton, NJ, 2005.
[5] S. P. Bhat and D. S. Bernstein, Nontangency-based Lyapunov tests for convergence and stability in systems having a continuum of equilibria, SIAM Journal on Control and Optimization, 42 (2003), pp. 1745-1775.
[6] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[7] J. Carr, Applications of Centre Manifold Theory, Springer, New York, 1982.
[8] A. Cherukuri and J. Cortés, Asymptotic stability of saddle points under the saddle-point dynamics, in American Control Conference, Chicago, IL, July 2015, pp. 2020-2025.
[9] A. Cherukuri, E. Mallada, and J. Cortés, Asymptotic convergence of primal-dual dynamics, Systems & Control Letters, 87 (2016), pp. 10-15.
[10] F. H. Clarke, Optimization and Nonsmooth Analysis, Canadian Mathematical Society Series of Monographs and Advanced Texts, Wiley, 1983.
[11] F. H. Clarke, Y. S. Ledyaev, R. J. Stern, and P. R. Wolenski, Nonsmooth Analysis and Control Theory, vol. 178 of Graduate Texts in Mathematics, Springer, 1998.
[12] J. Cortés, Discontinuous dynamical systems - a tutorial on solutions, nonsmooth analysis, and stability, IEEE Control Systems Magazine, 28 (2008), pp. 36-73.
[13] R. Dorfman, P. A. Samuelson, and R. Solow, Linear Programming in Economic Analysis, McGraw Hill, New York, Toronto, and London, 1958.
[14] G. Droge and M. Egerstedt, Proportional integral distributed optimization for dynamic network topologies, in American Control Conference, Portland, OR, June 2014, pp. 3621-3626.
[15] J. Dugundji, Topology, Allyn and Bacon, Inc., Boston, MA, 1966.
[16] H. B. Dürr and C. Ebenbauer, A smooth vector field for saddle point problems, in IEEE Conf. on Decision and Control, Orlando, Florida, Dec. 2011, pp. 4654-4660.
[17] D. Feijer and F. Paganini, Stability of primal-dual gradient dynamics and applications to network optimization, Automatica, 46 (2010), pp. 1974-1981.
[18] B. Gharesifard and J. Cortés, Distributed continuous-time convex optimization on weight-balanced digraphs, IEEE Transactions on Automatic Control, 59 (2014), pp. 781-786.
[19] D. Henry, Geometric Theory of Semilinear Parabolic Equations, vol. 840 of Lecture Notes in Mathematics, Springer, New York, 1981.
[20] M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems and Linear Algebra, Academic Press, 1974.
[21] T. Holding and I. Lestas, On the convergence of saddle points of concave-convex functions, the gradient method and emergence of oscillations, in IEEE Conf. on Decision and Control, Los Angeles, CA, 2014, pp. 1143-1148.
[22] M. V. Jovanović, A note on strongly convex and quasiconvex functions, Mathematical Notes, 60 (1996), pp. 584-585.
[23] V. A. Kamenetskiy and Y. S. Pyatnitskiy, An iterative method of Lyapunov function construction for differential inclusions, Systems & Control Letters, 8 (1987), pp. 445-451.
[24] H. K. Khalil, Nonlinear Systems, Prentice Hall, 3rd ed., 2002.
[25] T. Kose, Solutions of saddle value problems by differential equations, Econometrica, 24 (1956), pp. 59-70.
[26] X. Ma and N. Elia, A distributed continuous-time gradient dynamics approach for the active power loss minimizations, in Allerton Conf. on Communications, Control and Computing, Monticello, IL, Oct. 2013, pp. 100-106.
[27] A. Nedić and A. Ozdaglar, Subgradient methods for saddle-point problems, Journal of Optimization Theory & Applications, 142 (2009), pp. 205-228.
[28] J. Palis, Jr. and W. de Melo, Geometric Theory of Dynamical Systems, Springer, New York, 1982.
[29] L. J. Ratliff, S. A. Burden, and S. S. Sastry, Characterization and computation of local Nash equilibrium in continuous games, in Allerton Conf. on Communications, Control and Computing, Monticello, IL, Oct. 2013.
[30] D. Richert and J. Cortés, Robust distributed linear programming, IEEE Transactions on Automatic Control, 60 (2015), pp. 2567-2582.
[31] D. Richert and J. Cortés, Distributed bargaining in dyadic-exchange networks, IEEE Transactions on Control of Network Systems, 3 (2016), pp. 310-321.
[32] H. H. Sohrab, Basic Real Analysis, Birkhäuser, Boston, MA, 2003.
[33] J. Wang and N. Elia, A control perspective for centralized and distributed convex optimization, in IEEE Conf. on Decision and Control, Orlando, Florida, 2011, pp. 3800-3805.
[34] X. Zhang and A. Papachristodoulou, A real-time control framework for smart power networks with star topology, in American Control Conference, Washington, DC, June 2013, pp. 5062-5067.
[35] C. Zhao, U. Topcu, N. Li, and S. Low, Design and stability of load-side primary frequency control in power systems, IEEE Transactions on Automatic Control, 59 (2014), pp. 1177-1189.
Appendix.
This section contains some auxiliary results for our convergence analysis in Sections 4 and 5. Our first result establishes the constant value of the saddle function over its set of (local) saddle points.
Lemma A.1. (Constant function value over saddle points):
For $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ continuously differentiable, let $\mathcal{S} \subset \mathrm{Saddle}(F)$ be a path-connected set. If $F$ is locally convex-concave on $\mathcal{S}$, then $F|_{\mathcal{S}}$ is constant.

Proof. We start by considering the case when $\mathcal{S}$ is compact. Given $(x,z) \in \mathcal{S}$, let $\delta(x,z) > 0$ be such that $B_{\delta(x,z)}(x,z) \subset (U_x \times U_z) \cap U$, where $U_x$ and $U_z$ are neighborhoods where the saddle property (2.3) holds and $U$ is the neighborhood of $(x,z)$ where local convexity-concavity holds (cf. Section 2.3). This defines a covering of $\mathcal{S}$ by open sets as
$$\mathcal{S} \subset \bigcup_{(x,z) \in \mathcal{S}} B_{\delta(x,z)}(x,z).$$
Since $\mathcal{S}$ is compact, there exist a finite number of points $(x_1,z_1), (x_2,z_2), \ldots, (x_n,z_n)$ in $\mathcal{S}$ such that $\bigcup_{i=1}^n B_{\delta(x_i,z_i)}(x_i,z_i)$ covers $\mathcal{S}$. For convenience, denote $B_{\delta(x_i,z_i)}(x_i,z_i)$ by $B_i$. Next, we show that $F|_{\mathcal{S} \cap B_i}$ is constant for all $i \in \{1, \ldots, n\}$. To see this, let $(\bar{x},\bar{z}) \in \mathcal{S} \cap B_i$. From (2.3), we have
$$F(x_i, \bar{z}) \le F(x_i, z_i) \le F(\bar{x}, z_i). \quad (A.1)$$
From the convexity of $x \mapsto F(x,\bar{z})$ over $U \cap (\mathbb{R}^n \times \{\bar{z}\})$ (cf. the definition of local convexity-concavity in Section 2.3) and the fact that $\nabla_x F(\bar{x},\bar{z}) = 0$, we obtain $F(x_i,\bar{z}) \ge F(\bar{x},\bar{z}) + (x_i - \bar{x})^\top \nabla_x F(\bar{x},\bar{z}) = F(\bar{x},\bar{z})$. Similarly, using the concavity of $z \mapsto F(\bar{x},z)$, we get $F(\bar{x},z_i) \le F(\bar{x},\bar{z})$. These inequalities together with (A.1) yield
$$F(x_i,z_i) \le F(\bar{x},z_i) \le F(\bar{x},\bar{z}) \le F(x_i,\bar{z}) \le F(x_i,z_i).$$
That is, $F(\bar{x},\bar{z}) = F(x_i,z_i)$ and hence $F|_{\mathcal{S} \cap B_i}$ is constant. Using this reasoning, if $\mathcal{S} \cap B_i \cap B_j \neq \emptyset$ for any $i, j \in \{1, \ldots, n\}$, then $F|_{\mathcal{S} \cap (B_i \cup B_j)}$ is constant. Using that $\mathcal{S}$ is path connected, the fact [15, p. 117] states that, for any two points $(x_l,z_l), (x_m,z_m) \in \mathcal{S}$, there exist distinct members $i_1, i_2, \ldots, i_k$ of the set $\{1, \ldots, n\}$ such that $(x_l,z_l) \in \mathcal{S} \cap B_{i_1}$, $(x_m,z_m) \in \mathcal{S} \cap B_{i_k}$, and $\mathcal{S} \cap B_{i_t} \cap B_{i_{t+1}} \neq \emptyset$ for all $t \in \{1, \ldots, k-1\}$. Hence, we conclude that $F|_{\mathcal{S}}$ is constant. Finally, in the case when $\mathcal{S}$ is not compact, pick any two points $(x_l,z_l), (x_m,z_m) \in \mathcal{S}$ and let $\gamma : [0,1] \to \mathcal{S}$ be a continuous map with $\gamma(0) = (x_l,z_l)$ and $\gamma(1) = (x_m,z_m)$ denoting a path between these points. The image $\gamma([0,1]) \subset \mathcal{S}$ is closed and bounded, hence compact, and therefore $F|_{\gamma([0,1])}$ is constant. Since the two points are arbitrary, we conclude that $F|_{\mathcal{S}}$ is constant.

The difficulty in Lemma A.1 arises due to the local nature of the saddle points (the result is instead straightforward for global saddle points). The next result provides a first-order condition for strongly quasiconvex functions.

Lemma A.2. (First-order property of a strongly quasiconvex function):
Let $f:\mathbb{R}^n\to\mathbb{R}$ be a $C^1$ function that is strongly quasiconvex on a convex set $D\subset\mathbb{R}^n$. Then, there exists a constant $s>0$ such that

(A.2)  $f(x) \le f(y) \;\Rightarrow\; \nabla f(y)^\top (x-y) \le -s\|x-y\|^2$,

for any $x,y\in D$.

Proof. Consider $x,y\in D$ such that $f(x)\le f(y)$. From strong quasiconvexity we have $f(y) \ge f(\lambda x + (1-\lambda)y) + s\lambda(1-\lambda)\|x-y\|^2$ for any $\lambda\in[0,1]$, that is,

(A.3)  $f(\lambda x + (1-\lambda)y) - f(y) \le -s\lambda(1-\lambda)\|x-y\|^2$.

On the other hand, the Taylor approximation of $f$ at $y$ yields the following equality at the point $y+\lambda(x-y)$, which is equal to $\lambda x + (1-\lambda)y$:

(A.4)  $f(\lambda x+(1-\lambda)y) - f(y) = \nabla f(y)^\top(\lambda x+(1-\lambda)y - y) + g(\lambda x+(1-\lambda)y - y) = \lambda\,\nabla f(y)^\top(x-y) + g(\lambda(x-y))$,

for some function $g$ with the property $\lim_{\lambda\to 0} g(\lambda(x-y))/\lambda = 0$. Using (A.4) in (A.3), dividing by $\lambda$, and taking the limit $\lambda\to 0^{+}$ yields (A.2).

Lemma A.3. (Asymptotic convergence to a point [5, Corollary 5.2]):
Consider the nonlinear system

(A.5)  $\dot x(t) = f(x(t)), \quad x(0) = x_0$,

where $f:\mathbb{R}^n\to\mathbb{R}^n$ is locally Lipschitz. Let $W\subset\mathbb{R}^n$ be a compact set that is positively invariant under (A.5) and let $E\subset W$ be a set of stable equilibria. If a trajectory $t\mapsto x(t)$ of (A.5) with $x_0\in W$ satisfies $\lim_{t\to\infty} d_E(x(t)) = 0$, then the trajectory converges to a point in $E$.

Finally, we establish the asymptotic stability of a manifold of equilibria through linearization techniques. We start with a useful intermediary result.
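Before turning to the linearization results, we note that the first-order inequality (A.2) of Lemma A.2 lends itself to a quick numerical sanity check. The sketch below is illustrative only: the instance $f(x)=\|x\|^2$ (strongly convex, hence strongly quasiconvex on any convex set) and the constant $s=1$ are our own choices, not taken from the text.

```python
import numpy as np

# Sanity check of (A.2): f(x) <= f(y)  =>  grad f(y)^T (x - y) <= -s ||x - y||^2.
# Hypothetical instance: f(x) = ||x||^2, for which s = 1 works, since
# 2 y^T (x - y) + ||x - y||^2 = ||x||^2 - ||y||^2 <= 0 whenever f(x) <= f(y).
rng = np.random.default_rng(0)

def f(x):
    return float(x @ x)

def grad_f(y):
    return 2.0 * y

s = 1.0
violations = 0
for _ in range(10_000):
    x = rng.uniform(-1.0, 1.0, size=3)
    y = rng.uniform(-1.0, 1.0, size=3)
    if f(x) <= f(y):
        # test (A.2) with a small tolerance for floating-point rounding
        if grad_f(y) @ (x - y) > -s * np.sum((x - y) ** 2) + 1e-9:
            violations += 1

print("violations of (A.2):", violations)
```

Random sampling over the cube $[-1,1]^3$ produces no violations, consistent with the algebraic identity noted in the comment.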
Lemma A.4. (Limit points of the Jacobian of a piecewise $C^2$ function): Let $f:\mathbb{R}^n\to\mathbb{R}^n$ be piecewise $C^2$. Then, for every $x\in\mathbb{R}^n$, there exist a finite index set $I_x\subset\mathbb{Z}_{\ge 1}$ and a set of matrices $\{A_{x,i}\in\mathbb{R}^{n\times n}\}_{i\in I_x}$ such that

(A.6)  $\{A_{x,i} \mid i\in I_x\} = \{\lim_{k\to\infty} Df(x_k) \mid x_k\to x,\; x_k\in\mathbb{R}^n\setminus\Omega_f\}$,

where $\Omega_f$ is the set of points where $f$ is not differentiable.

Proof. Since $f$ is piecewise $C^2$, cf. Section 2.1, let $D_1,\dots,D_m\subset\mathbb{R}^n$ be the finite collection of disjoint open sets such that $f$ is $C^2$ on $D_i$ for each $i\in\{1,\dots,m\}$ and $\mathbb{R}^n=\cup_{i=1}^m \mathrm{cl}(D_i)$. Let $x\in\mathbb{R}^n$ and define $I_x = \{i\in\{1,\dots,m\} \mid x\in\mathrm{cl}(D_i)\}$ and $A_{x,i} = \lim_{k\to\infty} Df(x_k)$ for $x_k\to x$ with $x_k\in D_i$. Note that $A_{x,i}$ is uniquely defined for each $i$ since, by definition, $f|_{\mathrm{cl}(D_i)}$ is $C^1$. To show that (A.6) holds for the matrices so defined, first note that the set $\{A_{x,i}\mid i\in I_x\}$ is included in the right-hand side of (A.6) by definition. To show the other inclusion, consider any sequence $\{x_k\}_{k=1}^{\infty}\subset\mathbb{R}^n\setminus\Omega_f$ with $x_k\to x$ for which $\lim_{k\to\infty} Df(x_k)$ exists. One can partition this sequence into subsequences, each contained in one of the sets $D_i$, $i\in I_x$, and each converging to $x$. Thus, the limit $\lim_{k\to\infty} Df(x_k)$ is contained in the set $\{A_{x,i}\}_{i\in I_x}$, proving the other inclusion and yielding (A.6). Note that, in the nonsmooth analysis literature [10, Chapter 2], the convex hull of the matrices $\{A_{x,i}\}_{i\in I_x}$ is known as the generalized Jacobian of $f$ at $x$.

The following statement is an extension of [19, Exercise 6] to vector fields that are only piecewise twice continuously differentiable. Its proof is inspired by, but cannot be directly deduced from, center manifold theory [7].

Proposition A.5. (Asymptotic stability of a manifold of equilibrium points for piecewise $C^2$ vector fields): Consider the system

(A.7)  $\dot x = f(x)$,

where $f:\mathbb{R}^n\to\mathbb{R}^n$ is piecewise $C^2$ and locally Lipschitz in a neighborhood of a $p$-dimensional submanifold of equilibrium points $E\subset\mathbb{R}^n$ of (A.7).
Assume that, at each $x^*\in E$, the set of matrices $\{A_{x^*,i}\}_{i\in I_{x^*}}$ from Lemma A.4 satisfies:

(i) there exists an orthogonal matrix $Q\in\mathbb{R}^{n\times n}$ such that, for all $i\in I_{x^*}$,

$Q^\top A_{x^*,i} Q = \begin{bmatrix} 0 & 0 \\ 0 & \tilde A_{x^*,i} \end{bmatrix}$, where $\tilde A_{x^*,i}\in\mathbb{R}^{(n-p)\times(n-p)}$;

(ii) the eigenvalues of the matrices $\{\tilde A_{x^*,i}\}_{i\in I_{x^*}}$ have negative real parts;

(iii) there exists a positive definite matrix $P\in\mathbb{R}^{(n-p)\times(n-p)}$ such that $\tilde A_{x^*,i}^\top P + P\tilde A_{x^*,i} \prec 0$, for all $i\in I_{x^*}$.

Then, $E$ is locally asymptotically stable under (A.7) and the trajectories converge to a point in $E$.

Proof. Our strategy to prove the result is to linearize the vector field $f$ on each of the patches around any equilibrium point and to employ a common Lyapunov function, together with a common upper bound on the growth of the second-order terms, to establish convergence of the trajectories. This approach is an extension of the proof of [24, Theorem 8.2], where the vector field $f$ is assumed to be $C^2$ everywhere. Let $x^*\in E$. For convenience, translate $x^*$ to the origin of (A.7). We divide the proof into several steps to make the technical arguments easier to follow.

Step I: linearization of the vector field on patches around the equilibrium point.
From Lemma A.4, define $I_0 = \{i\in\{1,\dots,m\} \mid 0\in\mathrm{cl}(D_i)\}$ and the matrices $\{A_{0,i}\}_{i\in I_0}$ as the limit points of the Jacobian matrices at the origin. From the definition of a piecewise $C^2$ function, there exist $C^2$ functions $\{f_i: D_i^e\to\mathbb{R}^n\}_{i\in I_0}$, with $D_i^e$ open, such that $\mathrm{cl}(D_i)\subset D_i^e$ and the maps $f|_{\mathrm{cl}(D_i)}$ and $f_i$ take the same values over the set $\mathrm{cl}(D_i)$. Note that $0\in D_i^e$ for every $i\in I_0$. By definition of the matrices $\{A_{0,i}\}_{i\in I_0}$, we deduce that $Df_i(0) = A_{0,i}$ for each $i\in I_0$. Therefore, there exist a neighborhood $N\subset\mathbb{R}^n$ of the origin and a set of $C^1$ functions $\{g_i:\mathbb{R}^n\to\mathbb{R}^n\}_{i\in I_0}$ such that, for all $i\in I_0$, $f_i(x) = A_{0,i}x + g_i(x)$ for all $x\in N\cap D_i^e$, where

(A.8)  $g_i(0) = 0$ and $\frac{\partial g_i}{\partial x}(0) = 0$.

Without loss of generality, select $N$ such that $N\cap D_i$ is empty for every $i$
$\notin I_0$. That is, $\cup_{i\in I_0}(N\cap\mathrm{cl}(D_i))$ contains a neighborhood of the origin. With the above construction, the vector field $f$ in a neighborhood of the origin is written as

(A.9)  $f(x) = f_i(x) = A_{0,i}x + g_i(x)$, for all $x\in N\cap\mathrm{cl}(D_i)$, $i\in I_0$,

where, for each $i\in I_0$, $g_i$ satisfies (A.8).

Step II: change of coordinates.
Subsequently, from hypothesis (i), there exists an orthogonal matrix $Q\in\mathbb{R}^{n\times n}$, defining an orthonormal transformation $T_Q:\mathbb{R}^n\to\mathbb{R}^n$, $x\mapsto (u,v)$, that yields the new form of (A.9) as

(A.10)  $\begin{bmatrix}\dot u\\ \dot v\end{bmatrix} = \begin{bmatrix} 0 & 0\\ 0 & \tilde A_{0,i}\end{bmatrix}\begin{bmatrix} u\\ v\end{bmatrix} + \begin{bmatrix}\tilde g_{i,1}(u,v)\\ \tilde g_{i,2}(u,v)\end{bmatrix}$, for all $(u,v)\in T_Q(N\cap\mathrm{cl}(D_i))$, $i\in I_0$,

where, for each $i\in I_0$, the matrix $\tilde A_{0,i}$ has eigenvalues with negative real parts (cf. hypothesis (ii)) and, for each $i\in I_0$ and $k\in\{1,2\}$, we have

(A.11)  $\tilde g_{i,k}(0,0) = 0$, $\frac{\partial\tilde g_{i,k}}{\partial u}(0,0) = 0$, and $\frac{\partial\tilde g_{i,k}}{\partial v}(0,0) = 0$.

With a slight abuse of notation, denote the manifold of equilibrium points in the transformed coordinates by $E$ itself, i.e., $E = T_Q(E)$. From (A.10), we deduce that the tangent and normal spaces to the equilibrium manifold $E$ at the origin are $\{(u,v)\in\mathbb{R}^p\times\mathbb{R}^{n-p} \mid v = 0\}$ and $\{(u,v)\in\mathbb{R}^p\times\mathbb{R}^{n-p} \mid u = 0\}$, respectively. Due to this fact, and since $E$ is a submanifold of $\mathbb{R}^n$, there exist a smooth function $h:\mathbb{R}^p\to\mathbb{R}^{n-p}$ and a neighborhood $U\subset T_Q(N)\subset\mathbb{R}^n$ of the origin such that, for any $(u,v)\in U$, $v = h(u)$ if and only if $(u,v)\in E\cap U$. Moreover,

(A.12)  $h(0) = 0$ and $\frac{\partial h}{\partial u}(0) = 0$.

Now, consider the coordinate $w = v - h(u)$ to quantify the distance of a point $(u,v)$ from the set $E$ in the neighborhood $U$. To conclude the proof, we focus on showing that there exists a neighborhood of the origin such that, along any trajectory of (A.10) initialized in this neighborhood, we have $w(t)\to 0$ and $(u(t), w(t) + h(u(t)))\in U$ for all $t\ge 0$. In the $(u,w)$-coordinates, over the set $U$, the system (A.10) reads as

(A.13)  $\begin{bmatrix}\dot u\\ \dot w\end{bmatrix} = \begin{bmatrix}0 & 0\\ 0 & \tilde A_{0,i}\end{bmatrix}\begin{bmatrix}u\\ w\end{bmatrix} + \begin{bmatrix}\bar g_{i,1}(u,w)\\ \bar g_{i,2}(u,w)\end{bmatrix}$, for $(u, w + h(u))\in U\cap T_Q(\mathrm{cl}(D_i))$, $i\in I_0$,

where $\bar g_{i,1}(u,w) = \tilde g_{i,1}(u, w + h(u))$ and $\bar g_{i,2}(u,w) = \tilde A_{0,i} h(u) + \tilde g_{i,2}(u, w + h(u)) - \frac{\partial h}{\partial u}(u)\,\tilde g_{i,1}(u, w + h(u))$. Further, the equilibrium points $E\cap U$ in these coordinates are represented by the set of points $(u,0)$ where $u$ satisfies $(u, h(u))\in E\cap U$. These facts, along with the conditions on the first-order derivatives of $\tilde g_{i,1}$, $\tilde g_{i,2}$ in (A.11) and that of $h$ in (A.12), yield

(A.14)  $\bar g_{i,k}(u,0) = 0$ and $\frac{\partial\bar g_{i,k}}{\partial w}(0,0) = 0$,

for all $i\in I_0$ and $k\in\{1,2\}$. Note that the functions $\bar g_{i,1}$ and $\bar g_{i,2}$ are $C^1$. This implies that, for small enough $\epsilon > 0$, we have $\|\bar g_{i,k}(u,w)\| \le M_{i,k}\|w\|$, for $k\in\{1,2\}$, $i\in I_0$, and $(u,w)\in B_\epsilon(0)$, where the constants $\{M_{i,k}\}_{i\in I_0,\,k\in\{1,2\}}\subset\mathbb{R}_{>0}$ can be made arbitrarily small by selecting a smaller $\epsilon$. Defining $M_\epsilon = \max\{M_{i,k} \mid i\in I_0,\; k\in\{1,2\}\}$, we obtain

(A.15)  $\|\bar g_{i,k}(u,w)\| \le M_\epsilon\|w\|$, for $k\in\{1,2\}$ and $i\in I_0$.

Step III: Lyapunov analysis.
With the bounds above, we proceed to carry out the Lyapunov analysis for (A.13). Using the matrix $P$ from hypothesis (iii), define the candidate Lyapunov function $V:\mathbb{R}^{n-p}\to\mathbb{R}_{\ge 0}$ for (A.13) as $V(w) = w^\top P w$, whose Lie derivative along (A.13) is

$\mathcal{L}_{(A.13)}V(w) = w^\top(\tilde A_{0,i}^\top P + P\tilde A_{0,i})w + 2w^\top P\,\bar g_{i,2}(u,w)$, for $(u, w+h(u))\in U\cap T_Q(\mathrm{cl}(D_i))$, $i\in I_0$.

By hypothesis (iii), there exists $\lambda > 0$ such that $w^\top(\tilde A_{0,i}^\top P + P\tilde A_{0,i})w \le -\lambda\|w\|^2$ for all $i\in I_0$. Pick $\epsilon$ such that $(u,w)\in B_\epsilon(0)$ implies $(u, h(u)+w)\in U$. Then, using (A.15), the above Lie derivative can be upper bounded as

$\mathcal{L}_{(A.13)}V(w) \le -\lambda\|w\|^2 + 2M_\epsilon\|P\|\|w\|^2 = -\beta\|w\|^2$, for $(u,w)\in B_\epsilon(0)$,

where $\beta = \lambda - 2M_\epsilon\|P\|$. Let $\epsilon$ be small enough so that $\beta > 0$. Then, $\mathcal{L}_{(A.13)}V(w) \le -\beta\|w\|^2 < 0$ whenever $w\ne 0$. Now assume that there exists a trajectory $t\mapsto(u(t),w(t))$ of (A.13) that satisfies $(u(t),w(t))\in B_\epsilon(0)$ for all $t\ge 0$. Then, using the inequalities

$\lambda_{\min}(P)\|w\|^2 \le w^\top P w \le \lambda_{\max}(P)\|w\|^2$,

we get $V(w(t)) \le e^{-\beta t/\lambda_{\max}(P)}\,V(w(0))$ along this trajectory. Employing the same inequalities again, we get

(A.16)  $\|w(t)\| \le K\|w(0)\|e^{-\bar\beta t}$,

where $K = \sqrt{\lambda_{\max}(P)/\lambda_{\min}(P)}$ and $\bar\beta = \beta/(2\lambda_{\max}(P)) > 0$. This proves that $w(t)\to 0$. It remains to show that there exists $\delta > 0$ such that trajectories starting with $(u(0),w(0))\in B_\delta(0)$ satisfy $(u(t),w(t))\in B_\epsilon(0)$ for all $t\ge 0$, and that they converge to $E$. From (A.13), (A.15), and (A.16), we have

(A.17)  $\|u(t)\| \le \|u(0)\| + \int_0^t M_\epsilon K e^{-\bar\beta s}\|w(0)\|\,ds \le \|u(0)\| + \frac{M_\epsilon K}{\bar\beta}\|w(0)\|$.

By choosing $\epsilon$ small enough, $M_\epsilon$ can be made arbitrarily small and $\bar\beta$ can be bounded away from zero. With this, from (A.16) and (A.17), one can select a small enough $\delta > 0$ such that trajectories starting with $(u(0),w(0))\in B_\delta(0)$ satisfy $(u(t),w(t))\in B_\epsilon(0)$ for all $t\ge 0$ and $w(t)\to 0$. From this, we deduce that the trajectories starting in $B_\delta(0)$ converge to the set $E$ and that the origin is stable. Since $x^*$ was arbitrary, we conclude local asymptotic stability of $E$. Convergence to a point follows from the application of Lemma A.3.

The next example illustrates the application of the above result to conclude local convergence of trajectories to a point in the manifold of equilibria.

Example A.6. (Asymptotic stability of a manifold of equilibria for piecewise $C^2$ vector fields): Consider the system $\dot x = f(x)$, where $f:\mathbb{R}^3\to\mathbb{R}^3$ is the piecewise $C^2$ vector field defined in (A.18), consisting of two smooth expressions glued along a hyperplane. The set of equilibria of $f$ is the one-dimensional manifold $E = \{x\in\mathbb{R}^3 \mid x_1 = x_2 = x_3\}$. Let $D_1$ and $D_2$ denote the open regions on the two sides of the switching hyperplane in (A.18). Note that $f$ is locally Lipschitz on $\mathbb{R}^3$ and $C^2$ on $D_1$ and $D_2$. At any equilibrium point $x^*\in E$, the limit points of the generalized Jacobian belong to $\{A_1, A_2\}$, the Jacobians of the two smooth pieces evaluated on $E$. There is an orthogonal matrix $Q$ such that both $Q^\top A_1 Q$ and $Q^\top A_2 Q$ have the block-diagonal form required by hypothesis (i) of Proposition A.5, and the nonzero $2\times 2$ blocks satisfy hypotheses (ii) and (iii). Hence, $E$ is locally asymptotically stable under $\dot x = f(x)$, as illustrated in Figure 7.1. •

Fig. 7.1. (a) Trajectory of the vector field $f$ defined in (A.18); the trajectory converges to a point in the equilibrium set $E$. (b) Evolution of the distance of the trajectory to the equilibrium set $E$.
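The convergence mechanism of Proposition A.5 can be reproduced numerically on a small analogue of Example A.6. The system below is our own construction, not the field (A.18): a piecewise-$C^2$ field on $\mathbb{R}^2$ with equilibrium manifold $E = \{x \mid x_2 = 0\}$ whose limit Jacobians on $E$ are $\mathrm{diag}(0,-1)$ and $\mathrm{diag}(0,-2)$, so hypotheses (i)-(iii) hold with $Q = I$ and $P = 1$.

```python
import numpy as np

# Hypothetical piecewise-C^2 analogue of Example A.6 on R^2 (our construction,
# not the field (A.18)). Equilibrium manifold: E = { x in R^2 : x2 = 0 }.
def f(x):
    x1, x2 = x
    rate = 1.0 if x2 >= 0.0 else 2.0   # two C^2 pieces glued along {x2 = 0}
    # Jacobian at any point of E is diag(0, -rate): the block form of
    # hypothesis (i) with Q = I; the 1x1 blocks -1, -2 are Hurwitz and
    # share the common Lyapunov matrix P = 1 of hypothesis (iii).
    return np.array([x2 ** 2, -rate * x2])

# forward-Euler integration from an initial condition off the manifold
x = np.array([1.0, 0.5])
dt = 1e-3
for _ in range(20_000):                 # integrate up to t = 20
    x = x + dt * f(x)

dist_to_E = abs(x[1])                   # distance to E decays exponentially
print("final state:", x, "distance to E:", dist_to_E)
```

For $x_2(0) > 0$ the trajectory stays in the piece with rate $1$, so $x_2(t) = x_2(0)e^{-t}$ and $x_1(t)$ converges to the finite limit $x_1(0) + \int_0^\infty x_2(s)^2\,ds = x_1(0) + x_2(0)^2/2$; that is, the trajectory converges to a single point of $E$, as Lemma A.3 predicts.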