Global exponential stability of primal-dual gradient flow dynamics based on the proximal augmented Lagrangian: A Lyapunov-based approach
Dongsheng Ding, Student Member, IEEE, and Mihailo R. Jovanović, Fellow, IEEE
Abstract: For a class of nonsmooth composite optimization problems with linear equality constraints, we utilize a Lyapunov-based approach to establish the global exponential stability of the primal-dual gradient flow dynamics based on the proximal augmented Lagrangian. The result holds when the differentiable part of the objective function is strongly convex with a Lipschitz continuous gradient; the non-differentiable part is proper, lower semi-continuous, and convex; and the matrix in the linear constraint is full row rank. Our quadratic Lyapunov function generalizes a recent result from strongly convex problems with either affine equality or inequality constraints to a broader class of composite optimization problems with nonsmooth regularizers and it provides a worst-case lower bound on the exponential decay rate. Finally, we use computational experiments to demonstrate that our convergence rate estimate is less conservative than the existing alternatives.
I. INTRODUCTION
Primal-dual gradient flow dynamics belong to a class of Lagrangian-based methods for constrained optimization problems. Among other applications, such dynamics have found use in network utility maximization [1], resource allocation [2], distributed optimization [3], and feedback-based online optimization [4] problems. Stability conditions for various forms of the gradient flow dynamics have been proposed since their introduction in the 1950s [5].

The Lyapunov-based approach has been an effective tool for studying the stability of primal-dual algorithms, starting with the seminal paper of Arrow, Hurwicz, and Uzawa [5]. They utilized a quadratic Lyapunov function to establish the global asymptotic stability of the primal-dual dynamics for strictly convex-concave Lagrangians. This early result was extended to problems in which the Lagrangian is either strictly convex or strictly concave [6]. A simplified Lyapunov function was also proposed for linearly-convex or linearly-concave Lagrangians. For a projected variant of the primal-dual dynamics that could account for inequality constraints, a Krasovskii-based Lyapunov function was combined with LaSalle's invariance principle to show the global asymptotic stability in [1]. The invariance principle was also specialized to discontinuous Carathéodory systems and a quadratic Lyapunov function was used to show the global asymptotic stability of projected primal-dual gradient flow dynamics under globally-strict or locally-strong convexity-concavity assumptions [7], [8]. Additional information about the utility of a Lyapunov-based analysis in optimization can be found in a recent reference [9].

Financial support from the National Science Foundation under awards ECCS-1708906 and ECCS-1809833 is gratefully acknowledged. D. Ding and M. R. Jovanović are with the Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089.

In [10], the theory of proximal operators was combined with the augmented Lagrangian approach to solve optimization problems in which the objective function can be decomposed into the sum of a strongly convex term with a Lipschitz continuous gradient and a convex non-differentiable term. By evaluating the augmented Lagrangian along a certain manifold, a continuously differentiable function of both primal and dual variables was obtained. This function was named the proximal augmented Lagrangian and the theory of integral quadratic constraints (IQCs) in the frequency domain was employed to prove the global exponential stability of the resulting primal-dual dynamics [10]. This method yields an evolution model with a continuous right-hand side even for nonsmooth problems and it avoids the explicit construction of a Lyapunov function. In [11], a quadratic Lyapunov function was used to prove similar properties for a narrower class of problems that involve strongly convex and smooth objective functions with either affine equality or inequality constraints. More recently, this Lyapunov-based result was extended to account for variations in the constraints [12], [13] and the theory of IQCs was used to prove global exponential stability of the differential equations that govern the evolution of proximal gradient and Douglas-Rachford splitting flows [14].

Herein, we utilize a Lyapunov-based approach to establish the global exponential stability of the primal-dual gradient flow dynamics resulting from the proximal augmented Lagrangian.
As mentioned above, this method was introduced in [10] to solve a class of nonsmooth composite optimization problems. When the differentiable part of the objective function is strongly convex with a Lipschitz continuous gradient; the non-differentiable part is proper, lower semi-continuous, and convex; and the matrix in the linear constraint is full row rank, we construct a new quadratic Lyapunov function for the underlying primal-dual dynamics. This Lyapunov function allows us to derive a worst-case lower bound on the exponential decay rate. In contrast to [1], [7], [8], [15], our gradient flow dynamics are projection-free and there are no nonsmooth terms in the Lyapunov function.

We employ the theory of IQCs in the time domain to obtain a quadratic Lyapunov function that establishes the global exponential stability of the primal-dual gradient flow dynamics resulting from the proximal augmented Lagrangian framework for nonsmooth composite optimization. Our Lyapunov function is more general than the one in [11] and it yields less conservative convergence rate estimates. This extends the results of [11] from strongly convex problems with affine equality or inequality constraints to a broader class of optimization problems with nonsmooth regularizers.

The remainder of the paper is organized as follows. In Section II, we provide background material, formulate the nonsmooth composite optimization problem, and describe the proximal augmented Lagrangian as well as the resulting primal-dual gradient flow dynamics. In Section III, we construct a quadratic Lyapunov function for verifying the global exponential stability of the primal-dual dynamics. In Section IV, we use computational experiments to illustrate the utility of our results. In Section V, we close the paper with concluding remarks.

II. PROBLEM FORMULATION AND BACKGROUND
We consider convex composite optimization problems in which the objective function consists of a continuously differentiable term $f$ and a non-differentiable term $g$,

$$\begin{array}{ll} \underset{x,\,z}{\text{minimize}} & f(x) + g(z) \\ \text{subject to} & Tx - z = 0 \end{array} \tag{1}$$

where $T \in \mathbb{R}^{m \times n}$ is a matrix that relates the optimization variables $x \in \mathbb{R}^n$ and $z \in \mathbb{R}^m$.

Assumption 1: Problem (1) is feasible and its minimum is finite.

Assumption 2: The continuously differentiable function $f$ is $m_f$-strongly convex with an $L_f$-Lipschitz continuous gradient and the non-differentiable function $g$ is proper, lower semi-continuous, and convex.

Assumption 3: The matrix $T \in \mathbb{R}^{m \times n}$ has full row rank.

A. Proximal augmented Lagrangian
The proximal operator of the function $g$ is given by [16]

$$\mathbf{prox}_{\mu g}(v) := \underset{x}{\operatorname{argmin}} \left( g(x) + \frac{1}{2\mu} \|x - v\|^2 \right)$$

and the associated value function is the Moreau envelope,

$$M_{\mu g}(v) := g(\mathbf{prox}_{\mu g}(v)) + \frac{1}{2\mu} \|\mathbf{prox}_{\mu g}(v) - v\|^2$$

where $\mu$ is a positive parameter. The Moreau envelope is continuously differentiable, even when $g$ is not, and its gradient is determined by

$$\nabla M_{\mu g}(v) = \frac{1}{\mu} \left( v - \mathbf{prox}_{\mu g}(v) \right).$$

The augmented Lagrangian of the constrained optimization problem (1) is given by

$$\mathcal{L}(x, z; y) = f(x) + g(z) + y^T (Tx - z) + \frac{1}{2\mu} \|Tx - z\|^2$$

where $x \in \mathbb{R}^n$ and $z \in \mathbb{R}^m$ are the primal variables, $y \in \mathbb{R}^m$ is a dual variable, and $\mu$ is a positive parameter. Completion of squares brings $\mathcal{L}(x, z; y)$ into the following form,

$$\mathcal{L}(x, z; y) = f(x) + g(z) + \frac{1}{2\mu} \|z - (Tx + \mu y)\|^2 - \frac{\mu}{2} \|y\|^2.$$

The minimizer of the augmented Lagrangian with respect to $z$ is

$$z^\star_\mu(x; y) = \mathbf{prox}_{\mu g}(Tx + \mu y)$$

where $\mathbf{prox}_{\mu g}$ denotes the proximal operator of the function $g$. Restriction of $\mathcal{L}$ along the manifold determined by $z^\star_\mu(x; y)$ yields the proximal augmented Lagrangian [10],

$$\mathcal{L}_\mu(x; y) := \mathcal{L}(x, z^\star_\mu(x; y); y) = f(x) + M_{\mu g}(Tx + \mu y) - \frac{\mu}{2} \|y\|^2 \tag{2}$$

where $M_{\mu g}$ is the Moreau envelope of the function $g$. Continuous differentiability of the proximal augmented Lagrangian $\mathcal{L}_\mu(x; y)$ with respect to both $x$ and $y$ follows from continuous differentiability of $M_{\mu g}$ and Lipschitz continuity of the gradient of $f$.

B. Examples
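As a numerical illustration of the proximal operator, the Moreau envelope, and the gradient formula from Section II-A, the following Python sketch uses the scalar function $g(z) = |z|$, whose proximal operator is soft-thresholding. The choice of $g$, the function names, the value $\mu = 0.5$, and the test points are assumptions made here for illustration only. The sketch compares the formula $\nabla M_{\mu g}(v) = (v - \mathbf{prox}_{\mu g}(v))/\mu$ with a central finite difference of the envelope.

```python
def prox_abs(v, mu):
    """Proximal operator of g(z) = |z| with parameter mu (soft-thresholding)."""
    sign = 1.0 if v > 0 else -1.0
    return sign * max(abs(v) - mu, 0.0)


def moreau_abs(v, mu):
    """Moreau envelope: g(prox(v)) + (1 / (2 mu)) * (prox(v) - v)^2."""
    p = prox_abs(v, mu)
    return abs(p) + (p - v) ** 2 / (2.0 * mu)


def grad_moreau_abs(v, mu):
    """Gradient of the Moreau envelope: (1 / mu) * (v - prox(v))."""
    return (v - prox_abs(v, mu)) / mu


def max_gradient_error(mu=0.5, h=1e-6):
    """Largest gap between the gradient formula and a central difference."""
    points = [-2.0, -0.7, -0.1, 0.0, 0.3, 1.5]  # illustrative test points
    errors = []
    for v in points:
        fd = (moreau_abs(v + h, mu) - moreau_abs(v - h, mu)) / (2.0 * h)
        errors.append(abs(fd - grad_moreau_abs(v, mu)))
    return max(errors)


if __name__ == "__main__":
    print("max |finite difference - formula| =", max_gradient_error())
```

The envelope is differentiable even though $|z|$ has a kink at the origin, which is why the finite-difference comparison is well defined at $v = 0$.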
We next provide examples of convex optimization problems that can be brought into the form (1). For instance, the problem with linear equality constraints,

$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x) \\ \text{subject to} & Tx = b \end{array} \tag{3}$$

where $b \in \mathbb{R}^m$ is a given vector, can be cast as (1) by choosing $g(z)$ to be an indicator function, $g(z_i) := \{0,\ z_i = b_i;\ \infty,\ \text{otherwise}\}$. In this case, the proximal operator is given by $\mathbf{prox}_{\mu g}(v_i) = b_i$, the associated Moreau envelope is $M_{\mu g}(v_i) = \frac{1}{2\mu}(v_i - b_i)^2$, and the gradient of the Moreau envelope is $\nabla M_{\mu g}(v_i) = (v_i - b_i)/\mu$.

The problem with linear inequality constraints,

$$\begin{array}{ll} \underset{x}{\text{minimize}} & f(x) \\ \text{subject to} & Tx \leq b \end{array} \tag{4}$$

where $b \in \mathbb{R}^m$ is a given vector, can be cast as (1) by choosing $g(z)$ to be an indicator function, $g(z_i) := \{0,\ z_i \leq b_i;\ \infty,\ \text{otherwise}\}$. The proximal operator is $\mathbf{prox}_{\mu g}(v_i) = \min\{v_i, b_i\}$, the associated Moreau envelope is $M_{\mu g}(v_i) = \{\frac{1}{2\mu}(v_i - b_i)^2,\ v_i > b_i;\ 0,\ \text{otherwise}\}$, and the gradient of the Moreau envelope is $\nabla M_{\mu g}(v_i) = \max(0, (v_i - b_i)/\mu)$.

Unconstrained optimization problems with nonsmooth regularizers can also be represented by (1). For example, logistic regression with elastic net regularization [17] takes the form

$$\underset{x}{\text{minimize}} \quad \ell(x) + \|x\|_2^2 + \|x\|_1 \tag{5}$$

where the logistic loss is given by $\ell(x) = \sum_{i=1}^{d} \left( \log(1 + \mathrm{e}^{a_i^T x}) - y_i a_i^T x \right)$, $a_i$ is the feature vector, and $y_i \in \{0, 1\}$ is the corresponding label. Choosing $f(x) := \ell(x) + \|x\|_2^2$, $g(z) := \|z\|_1$, and $T := I$ brings (5) into (1). The proximal operator is the soft-thresholding $\mathbf{prox}_{\mu g}(v_i) = \operatorname{sign}(v_i) \max\{|v_i| - \mu, 0\}$, the associated Moreau envelope is the Huber function $M_{\mu g}(v_i) = \{\frac{1}{2\mu} v_i^2,\ |v_i| \leq \mu;\ |v_i| - \frac{\mu}{2},\ |v_i| \geq \mu\}$, and the gradient of the Moreau envelope is the saturation function $\nabla M_{\mu g}(v_i) = \operatorname{sign}(v_i) \min(|v_i|/\mu, 1)$.

C. Primal-dual gradient flow dynamics
The primal-dual gradient flow dynamics can be used to compute the saddle points of (2),

$$\dot{w} = F(w) \tag{6a}$$

where $w := [\,x^T \ y^T\,]^T$ and

$$F(w) := \begin{bmatrix} -\nabla_x \mathcal{L}_\mu(x; y) \\ \nabla_y \mathcal{L}_\mu(x; y) \end{bmatrix} = \begin{bmatrix} -\left( \nabla f(x) + T^T \nabla M_{\mu g}(Tx + \mu y) \right) \\ \mu \left( \nabla M_{\mu g}(Tx + \mu y) - y \right) \end{bmatrix}. \tag{6b}$$

Let $\bar{w} := [\,\bar{x}^T \ \bar{y}^T\,]^T$ denote the equilibrium points of (6), i.e., the solutions to $F(\bar{w}) = 0$. The Lagrangian of the optimization problem (1) is given by $f(x) + g(z) + y^T(Tx - z)$ and the associated KKT optimality conditions are

$$\begin{aligned} 0 &= \nabla f(x^\star) + T^T y^\star \\ 0 &\in \partial g(z^\star) - y^\star \\ 0 &= T x^\star - z^\star \end{aligned} \tag{7}$$

where $\partial g$ is the subdifferential of $g$. The following lemma establishes the relation between $\bar{w}$ and the optimality conditions (7); see [10] for details.

Lemma 1: Let Assumptions 1 and 2 hold. The equilibrium point $\bar{w} := [\,\bar{x}^T \ \bar{y}^T\,]^T$ of the primal-dual gradient flow dynamics (6) satisfies optimality conditions (7) with $\bar{z} := \mathbf{prox}_{\mu g}(T\bar{x} + \mu \bar{y})$. Moreover, $(\bar{x}, \bar{z})$ is the optimal solution of the nonsmooth composite optimization problem (1).

Under Assumptions 1-3, the global exponential stability of the primal-dual gradient flow dynamics (6) was established in [10] by employing the theory of integral quadratic constraints in the frequency domain. An upper bound on the convergence rate was also obtained but the explicit form of the quadratic Lyapunov function was not provided. Recent reference [11] used a Lyapunov-based approach to show the global exponential stability for a class of problems with a strongly convex and smooth objective function $f$ subject to either affine equality or inequality constraints. In our preliminary work [18], a similar quadratic Lyapunov function was used to prove global exponential stability of the primal-dual gradient flow dynamics (6). In what follows, we employ the theory of IQCs in the time domain to obtain a quadratic Lyapunov function that establishes the global exponential stability of (6) and yields less conservative convergence rate estimates.

III. GLOBAL EXPONENTIAL STABILITY VIA QUADRATIC LYAPUNOV FUNCTION
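Before the analysis, the behavior of the dynamics (6) and the equilibrium characterization in Lemma 1 can be illustrated on a scalar instance of (1). The sketch below assumes $f(x) = \frac{1}{2}(x - 2)^2$, $T = 1$, and $g$ equal to the indicator function of $\{z \leq 1\}$, for which the KKT conditions (7) give $x^\star = 1$ and $y^\star = 1$. The problem data, the value $\mu = 1$, the function names, and the fixed-step fourth-order Runge-Kutta integrator are all illustrative assumptions (the experiments in Section IV use Matlab's ode45 instead).

```python
MU = 1.0  # augmented Lagrangian parameter (illustrative value)
B = 1.0   # upper bound in the constraint x <= B (illustrative value)


def grad_f(x):
    """Gradient of f(x) = 0.5 * (x - 2)^2."""
    return x - 2.0


def grad_moreau(v):
    """Gradient of the Moreau envelope of the indicator of {z <= B}."""
    return max(0.0, (v - B) / MU)


def flow(w):
    """Right-hand side F(w) of the dynamics (6) with T = 1."""
    x, y = w
    gm = grad_moreau(x + MU * y)
    return (-(grad_f(x) + gm), MU * (gm - y))


def rk4_step(w, dt):
    """One classical fourth-order Runge-Kutta step for w' = F(w)."""
    def shifted(k, s):
        return (w[0] + s * k[0], w[1] + s * k[1])

    k1 = flow(w)
    k2 = flow(shifted(k1, dt / 2.0))
    k3 = flow(shifted(k2, dt / 2.0))
    k4 = flow(shifted(k3, dt))
    return (
        w[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
        w[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0,
    )


def simulate(w0=(0.0, 0.0), t_final=20.0, dt=0.01):
    """Integrate the dynamics from w0 and return the final state (x, y)."""
    w = w0
    for _ in range(int(round(t_final / dt))):
        w = rk4_step(w, dt)
    return w


if __name__ == "__main__":
    x, y = simulate()
    print("x(T) =", x, " y(T) =", y)  # both should approach the KKT point (1, 1)
```

In this instance the unconstrained minimizer $x = 2$ violates the constraint, so the trajectory first moves toward it and then settles at the constrained optimum once the term $\nabla M_{\mu g}(Tx + \mu y)$ becomes active.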
In this section, we identify a quadratic Lyapunov function that can be used to establish the global exponential stability of the primal-dual gradient flow dynamics (6) for strongly convex problems (1) with a full row rank matrix $T$ and provide an estimate of the convergence rate.

A. A system-theoretic viewpoint of primal-dual dynamics
Inspired by [19], we view (6) as a feedback interconnection of an LTI system with static nonlinearities; see Fig. 1. These are determined by the gradient of the smooth part of the objective function $\nabla f$ and the proximal operator $\mathbf{prox}_{\mu g}$. Structural properties of nonlinear terms that we exploit in our analysis are specified in Assumption 2.

Fig. 1: Block diagram of primal-dual gradient flow dynamics (6): $G$ is an exponentially stable LTI system in (8a) and $\Delta$ is a static nonlinear map that satisfies quadratic constraint (9).

Let $u = [\,u_1^T \ u_2^T\,]^T$ and $\xi = [\,\xi_1^T \ \xi_2^T\,]^T$, with

$$\begin{aligned} \xi_1 &:= x \\ \xi_2 &:= Tx + \mu y \\ u_1 &:= \Delta_1(\xi_1) = \nabla f(x) - m_f x \\ u_2 &:= \Delta_2(\xi_2) = \mathbf{prox}_{\mu g}(Tx + \mu y). \end{aligned}$$

For strongly convex $f$, the primal-dual dynamics (6) can be cast as an LTI system $G$ in feedback with a nonlinear block $\Delta$, where

$$\begin{aligned} \dot{w} &= Aw + Bu \\ \xi &= Cw \\ u &= \Delta(\xi) \end{aligned} \tag{8a}$$

with

$$A = \begin{bmatrix} -\left( m_f I + \frac{1}{\mu} T^T T \right) & -T^T \\ T & 0 \end{bmatrix}, \quad B = \begin{bmatrix} -I & \frac{1}{\mu} T^T \\ 0 & -I \end{bmatrix}, \quad C = \begin{bmatrix} I & 0 \\ T & \mu I \end{bmatrix}. \tag{8b}$$

The input is given by $u = \Delta(\xi)$ where $\Delta(\xi)$ is a $2 \times 2$ block-diagonal matrix with the diagonal blocks $\Delta_1(\xi_1)$ and $\Delta_2(\xi_2)$. These nonlinearities satisfy the pointwise quadratic inequalities [10]

$$\begin{bmatrix} \xi_i - \bar{\xi}_i \\ u_i - \bar{u}_i \end{bmatrix}^T \begin{bmatrix} 0 & \hat{L}_i I \\ \hat{L}_i I & -2I \end{bmatrix} \begin{bmatrix} \xi_i - \bar{\xi}_i \\ u_i - \bar{u}_i \end{bmatrix} \geq 0, \quad i = 1, 2$$

where $\bar{w} := [\,\bar{x}^T \ \bar{y}^T\,]^T$ is the equilibrium point of system (8), $\bar{\xi}_1 = \bar{x}$, $\bar{\xi}_2 = T\bar{x} + \mu \bar{y}$, $\bar{u}_i = \Delta_i(\bar{\xi}_i)$, $\hat{L}_1 := L_f - m_f$, and $\hat{L}_2 = 1$. This is because $\Delta_1$ is the gradient of the convex function $f(\xi_1) - (m_f/2)\|\xi_1\|^2$ and, thus, it is Lipschitz continuous with parameter $L_f - m_f$ [19, Proposition 5]; and $\Delta_2$ is given by the proximal operator of the function $g$ and, thus, it is firmly non-expansive (i.e., Lipschitz continuous with parameter one) [16].
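The pointwise quadratic inequalities can be checked numerically in the scalar case, where the quadratic form reduces to $2\hat{L}_i \tilde{\xi}_i \tilde{u}_i - 2\tilde{u}_i^2$ with $\tilde{\xi}_i := \xi_i - \bar{\xi}_i$ and $\tilde{u}_i := u_i - \bar{u}_i$. The Python sketch below is a sanity check under stated assumptions, not part of the original analysis: it takes $f(x) = \log(1 + \mathrm{e}^x) + (m_f/2)x^2$, so that $\Delta_1$ is the logistic sigmoid with $\hat{L}_1 = 1/4$, and $g(z) = |z|$ with $\mu = 0.5$, so that $\Delta_2$ is soft-thresholding with $\hat{L}_2 = 1$. Since the inequality holds between any two points, a random second point stands in for the equilibrium.

```python
import math
import random


def sector_value(xi, u, xi_bar, u_bar, l_hat):
    """Scalar quadratic form [xi~; u~]^T [[0, L], [L, -2]] [xi~; u~]."""
    d_xi = xi - xi_bar
    d_u = u - u_bar
    return 2.0 * l_hat * d_xi * d_u - 2.0 * d_u * d_u


def delta_1(x):
    """grad f(x) - m_f x for f(x) = log(1 + e^x) + (m_f / 2) x^2 (sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))


def delta_2(v, mu=0.5):
    """Proximal operator of g(z) = |z| with parameter mu (soft-thresholding)."""
    return math.copysign(max(abs(v) - mu, 0.0), v)


def min_sector_value(delta, l_hat, trials=10000, seed=0):
    """Smallest value of the quadratic form over random pairs of points."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a = rng.uniform(-5.0, 5.0)
        b = rng.uniform(-5.0, 5.0)
        worst = min(worst, sector_value(a, delta(a), b, delta(b), l_hat))
    return worst


if __name__ == "__main__":
    print("Delta_1 (L_hat = 1/4):", min_sector_value(delta_1, 0.25))
    print("Delta_2 (L_hat = 1):  ", min_sector_value(delta_2, 1.0))
```

Both minima should be non-negative up to floating-point rounding; replacing $\hat{L}_1 = 1/4$ by a smaller constant makes the first check fail, which shows that the constraint depends on the Lipschitz parameter.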
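These quadratic constraints can be combined into

$$\begin{bmatrix} \xi - \bar{\xi} \\ u - \bar{u} \end{bmatrix}^T \underbrace{\begin{bmatrix} 0 & \Pi_0 \\ \Pi_0 & -2\Lambda \end{bmatrix}}_{\Pi} \begin{bmatrix} \xi - \bar{\xi} \\ u - \bar{u} \end{bmatrix} \geq 0 \tag{9}$$

where

$$\Pi_0 = \begin{bmatrix} \lambda_1 \hat{L}_1 I & 0 \\ 0 & \lambda_2 I \end{bmatrix}, \quad \Lambda = \begin{bmatrix} \lambda_1 I & 0 \\ 0 & \lambda_2 I \end{bmatrix}$$

and $\lambda_1$, $\lambda_2$ are non-negative scalars.

B. Lyapunov-based analysis for global exponential stability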
For the primal-dual gradient flow dynamics (6) with equilibrium point $\bar{w}$, we propose a quadratic Lyapunov function candidate

$$V(\tilde{w}) = \tilde{w}^T P \tilde{w} \tag{10a}$$

with $\tilde{w} := w - \bar{w}$ and

$$P = \alpha \begin{bmatrix} I & (1/\mu) T^T \\ (1/\mu) T & (1 + m_f/\mu) I + (1/\mu^2) T T^T \end{bmatrix} \tag{10b}$$

where $\alpha$ is a positive parameter, $m_f$ is the strong convexity modulus of the function $f$, $\mu$ is the augmented Lagrangian parameter, and $T$ is the full row rank matrix associated with the linear equality constraint in (1). The matrix $P$ is positive definite and, for $A$ in (8b), we have

$$A^T P + P A = -2\alpha \begin{bmatrix} m_f I & 0 \\ 0 & (1/\mu) T T^T \end{bmatrix} \prec 0. \tag{11}$$

Thus, $A$ is a Hurwitz matrix and the LTI system in Fig. 1 is exponentially stable. Furthermore, the derivative of $V$ along the solutions of (8a) is determined by

$$\dot{V} = \begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix}^T \begin{bmatrix} A^T P + P A & P B \\ B^T P & 0 \end{bmatrix} \begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix} \tag{12a}$$

where $\tilde{u} := u - \bar{u}$, and the substitution of the output equation $\xi = Cw$ in (8a) into (9) yields the quadratic inequality

$$\begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix}^T \begin{bmatrix} 0 & C^T \Pi_0 \\ \Pi_0 C & -2\Lambda \end{bmatrix} \begin{bmatrix} \tilde{w} \\ \tilde{u} \end{bmatrix} \geq 0. \tag{12b}$$

The sufficient condition for the global exponential stability of (8) is obtained by adding (12b) to (12a) and it amounts to the existence of a positive constant $\rho$ such that

$$\begin{bmatrix} -(A^T P + P A + 2\rho P) & -(P B + C^T \Pi_0) \\ -(P B + C^T \Pi_0)^T & 2\Lambda \end{bmatrix} \succ 0. \tag{13}$$

If this condition holds, we have $\dot{V} \leq -2\rho V$. Thus, $V(\tilde{w}(t)) \leq V(\tilde{w}(0))\, \mathrm{e}^{-2\rho t}$ and, since $P \succ 0$,

$$\|\tilde{w}(t)\| \leq \sqrt{\kappa_p}\, \|\tilde{w}(0)\|\, \mathrm{e}^{-\rho t}, \quad \text{for all } t \geq 0$$

where $\kappa_p$ is the condition number of the matrix $P$. Since $\Lambda \succ 0$, the remaining task is to verify the existence of the positive parameters $\alpha$, $\mu$, $\lambda_1$, $\lambda_2$, and $\rho$ such that

$$-(A^T P + P A + 2\rho P) - \frac{1}{2} (P B + C^T \Pi_0) \Lambda^{-1} (P B + C^T \Pi_0)^T \succ 0 \tag{14}$$

which follows from the application of the Schur complement to (13).

We are now ready to prove the global exponential stability of the primal-dual gradient flow dynamics (6) and provide estimates of the convergence rate $\rho$ for $L_f > m_f$. A similar result can be established for $L_f = m_f$.

Theorem 2:
Let Assumptions 1-3 hold, let $L_f > m_f$, and let $\sigma_{\max}(T)$ be the largest singular value of the matrix $T$. Then, the global exponential stability of the primal-dual gradient flow dynamics (6) can be established with Lyapunov function (10) if the augmented Lagrangian parameter satisfies

$$\mu > \max \left( \frac{L_f - m_f}{4},\ \frac{\sigma_{\max}^2(T)}{8 m_f} \left( 1 + \sqrt{1 + \frac{16 m_f^2}{\sigma_{\max}^2(T)}} \right) \right). \tag{15}$$

Proof:
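If (14) holds for $\rho = 0$, the continuity of the left-hand side of (14) with respect to $\rho$ implies the existence of $\rho > 0$ such that (14) holds. For $\rho = 0$, (14) becomes

$$-(A^T P + P A) - \frac{1}{2} (P B + C^T \Pi_0) \Lambda^{-1} (P B + C^T \Pi_0)^T \succ 0 \tag{16}$$

where $A^T P + P A$ is given by (11), and $P B + C^T \Pi_0$ reads

$$\begin{bmatrix} (\lambda_1 \hat{L} - \alpha) I & \lambda_2 T^T \\ -(\alpha/\mu) T & (\mu \lambda_2 - \alpha(1 + m_f/\mu)) I \end{bmatrix}$$

where $\hat{L} := L_f - m_f > 0$. Thus, the matrix $M := (P B + C^T \Pi_0) \Lambda^{-1} (P B + C^T \Pi_0)^T$ is given by

$$M = \begin{bmatrix} M_{11} & M_{21}^T \\ M_{21} & M_{22} \end{bmatrix} \tag{17}$$

where

$$\begin{aligned} M_{11} &= \frac{(\alpha - \lambda_1 \hat{L})^2}{\lambda_1} I + \lambda_2 T^T T \\ M_{22} &= \frac{\alpha^2}{\lambda_1 \mu^2} T T^T + \frac{1}{\lambda_2} \left( \mu \lambda_2 - \alpha \left( 1 + \frac{m_f}{\mu} \right) \right)^2 I \\ M_{21} &= \left( \frac{\alpha (\alpha - \lambda_1 \hat{L})}{\lambda_1 \mu} + \mu \lambda_2 - \alpha \left( 1 + \frac{m_f}{\mu} \right) \right) T \end{aligned}$$

and $\alpha$, $\lambda_1$, $\lambda_2$, and $\mu$ are positive parameters that have to be selected such that (16) holds. Setting $\lambda_1 := \alpha/\hat{L}$ and $\lambda_2 := (\alpha/\mu)(1 + m_f/\mu)$ yields $M_{21} = 0$ and (16) simplifies to

$$\begin{bmatrix} 2\alpha m_f I - \frac{\lambda_2}{2} T^T T & 0 \\ 0 & \frac{\alpha}{\mu} \left( 2 - \frac{\alpha}{2 \lambda_1 \mu} \right) T T^T \end{bmatrix} \succ 0$$

or, equivalently, since $T$ has full row rank,

$$4 \alpha m_f > \lambda_2 \sigma_{\max}^2(T) \quad \text{and} \quad 4 \mu \lambda_1 > \alpha.$$

Combining these two conditions with the above definitions of $\lambda_1$ and $\lambda_2$ yields (15): the second condition gives $\mu > \hat{L}/4$, and the first one gives $4 m_f \mu^2 - \sigma_{\max}^2(T)\, \mu - \sigma_{\max}^2(T)\, m_f > 0$, which holds for $\mu$ larger than the positive root of this quadratic.

We next utilize the choices of parameters $\lambda_1$ and $\lambda_2$ in Theorem 2 to estimate the convergence rate $\rho$.

Proposition 3: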
Let Assumptions 1-3 hold, let $L_f > m_f$, and let $\sigma_{\min}(T)$ and $\sigma_{\max}(T)$ be the smallest and the largest singular values of the matrix $T$. Then, the primal-dual gradient flow dynamics (6) are globally exponentially stable with the rate

$$\rho \geq \rho_0(\mu) := \frac{\sigma_{\min}^2(T)}{2 \left( \mu + m_f + \sigma_{\max}^2(T)/\mu \right)} \tag{18a}$$

if $\mu > \max(L_f - m_f, \hat{\mu})$, where

$$\hat{\mu} := \inf \{ \mu \in [\sigma_{\max}(T), \infty),\ \beta(\mu) < 2 m_f \} \tag{18b}$$

$$\beta(\mu) := \frac{(m_f + \mu)\, \sigma_{\max}^2(T)}{2 \mu^2} + \frac{2 \rho_0(\mu) \left( \mu + 4 \rho_0(\mu) \right)}{\mu}. \tag{18c}$$

Proof:
See Appendix A.

IV. COMPUTATIONAL EXPERIMENTS
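The quantities that enter Theorem 2 and Proposition 3 are inexpensive to evaluate. The Python sketch below computes the right-hand side of (15), the rate bound (18a), the function $\beta(\mu)$ in (18c), and $\hat{\mu}$ in (18b) by bisection, which relies on the monotonicity of $\beta$ established in Appendix A. It assumes that (18a) has $\sigma_{\min}^2(T)$ in the numerator and $\sigma_{\max}^2(T)$ in the denominator; the function names and the numerical values in the usage lines are hypothetical and are not the data used in the experiments of this section.

```python
import math


def mu_threshold(l_f, m_f, s_max):
    """Right-hand side of condition (15) on the parameter mu."""
    first = (l_f - m_f) / 4.0
    second = s_max ** 2 / (8.0 * m_f) * (
        1.0 + math.sqrt(1.0 + 16.0 * m_f ** 2 / s_max ** 2)
    )
    return max(first, second)


def rho0(mu, m_f, s_min, s_max):
    """Lower bound (18a) on the exponential decay rate."""
    return s_min ** 2 / (2.0 * (mu + m_f + s_max ** 2 / mu))


def beta(mu, m_f, s_min, s_max):
    """Function beta(mu) defined in (18c)."""
    r = rho0(mu, m_f, s_min, s_max)
    return (m_f + mu) * s_max ** 2 / (2.0 * mu ** 2) + 2.0 * r * (mu + 4.0 * r) / mu


def mu_hat(m_f, s_min, s_max, tol=1e-12):
    """Infimum in (18b), found by bisection on [s_max, infinity)."""
    lo = s_max
    if beta(lo, m_f, s_min, s_max) <= 2.0 * m_f:
        return lo  # case (ii) in Appendix A
    hi = 2.0 * lo
    while beta(hi, m_f, s_min, s_max) >= 2.0 * m_f:
        hi *= 2.0  # expand until beta(hi) < 2 m_f
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if beta(mid, m_f, s_min, s_max) >= 2.0 * m_f:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    l_f, m_f, s_min, s_max = 2.0, 1.0, 1.0, 1.0  # hypothetical values, T = I
    print("threshold in (15):", mu_threshold(l_f, m_f, s_max))
    print("mu_hat in (18b):  ", mu_hat(m_f, s_min, s_max))
    print("rho_0(mu = 1.5):  ", rho0(1.5, m_f, s_min, s_max))
```

For $T = I$ the two singular values coincide, so the assumption about their placement in (18a) does not affect the numbers in this usage example.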
We next provide an example to demonstrate the merits of our approach. Let us consider optimization problem (1) with

$$f(x) = \frac{1}{2} x^T Q x + q^T x, \qquad g(z) = \begin{cases} 0, & z \leq b \\ \infty, & \text{otherwise} \end{cases} \tag{19}$$

where $x$ and $q$ are $n$-dimensional vectors, $Q \in \mathbb{R}^{n \times n}$ is a positive definite matrix, $T \in \mathbb{R}^{m \times n}$ is a full row rank matrix, and $b \in \mathbb{R}^m$ is a given vector. The gradient of the Moreau envelope is determined by $\nabla M_{\mu g}(v_i) = \max(0, (v_i - b_i)/\mu)$ and $(L_f, m_f)$ are the largest and the smallest eigenvalues of the matrix $Q$, respectively.

We use the Matlab ODE solver ode45 to simulate the primal-dual gradient flow dynamics (6) and set $n = m = 10$, $q = 10 \times \mathrm{randn}(n, 1)$, and $Q = H H^T + K$, where $H = \mathrm{randn}(n, n)$ and $K = \mathrm{diag}(\exp(\mathrm{randn}(n, 1)))$. We choose $b$ to be a vector of all ones, set $T = I$, and report results for $(L_f, m_f)$ = (1. , . ) and $(L_f, m_f)$ = (27. , . ).

Figure 2a demonstrates the exponential convergence of dynamics (6) with $(L_f, m_f)$ = (1. , . ) for different values of $\mu$. We note that the convergence rate decreases when $\mu$ becomes larger than . For a given value of $\mu$ that satisfies Proposition 3, we use formula (18a) to estimate the lower bound on the convergence rate $\rho$. We compare our estimate with [11, Theorem 2] and [18, Theorem 6]. As shown in Fig. 3a, Proposition 3 provides a less conservative estimate of the convergence rate than the existing methods. As Figs. 2b and 3b illustrate, similar observations can be made for a larger condition number, $(L_f, m_f)$ = (27. , . ). Clearly, the increase in condition number reduces the rate of exponential decay and our estimates are less conservative than those provided in the literature.

V. CONCLUDING REMARKS
In this paper, we use a Lyapunov-based approach to establish global exponential stability of the primal-dual gradient flow dynamics resulting from the proximal augmented Lagrangian framework for nonsmooth composite optimization. We provide a worst-case estimate of the exponential decay rate when the differentiable part of the objective function is strongly convex and its gradient is Lipschitz continuous. For a quadratic programming problem, computational experiments are used to show that our estimate of the convergence rate is less conservative compared to the existing literature. Our ongoing work focuses on identifying a quadratic Lyapunov function that can certify the global exponential stability of a second-order primal-dual method for nonsmooth composite optimization [20].

APPENDIX

A. Proof of Proposition 3
We show that (14) holds for $\rho = \rho_0(\mu)$. Substitution of the expressions for $A^T P + P A$ and $(P B + C^T \Pi_0) \Lambda^{-1} (B^T P + \Pi_0 C)$ given by (11) and (17) into (14) yields

$$R = \begin{bmatrix} R_{11} & R_{21}^T \\ R_{21} & R_{22} \end{bmatrix} \succ 0 \tag{20}$$

where

$$\begin{aligned} R_{11} &= 2\alpha (m_f - \rho) I - \frac{(\alpha - \lambda_1 \hat{L})^2}{2 \lambda_1} I - \frac{\lambda_2}{2} T^T T \\ R_{22} &= \left( \frac{2\alpha}{\mu} - \frac{\alpha^2}{2 \lambda_1 \mu^2} \right) T T^T - 2\rho\alpha \left( \left( 1 + \frac{m_f}{\mu} \right) I + \frac{1}{\mu^2} T T^T \right) - \frac{1}{2\lambda_2} \left( \mu \lambda_2 - \alpha \left( 1 + \frac{m_f}{\mu} \right) \right)^2 I \\ R_{21} &= -\frac{\alpha}{\mu} \left( \frac{\alpha - \lambda_1 \hat{L}}{2 \lambda_1} + 2\rho \right) T - \frac{1}{2} \left( \mu \lambda_2 - \alpha \left( 1 + \frac{m_f}{\mu} \right) \right) T. \end{aligned}$$

Here, $\hat{L} := L_f - m_f > 0$, and $\alpha$, $\lambda_1$, $\lambda_2$, and $\mu$ are positive parameters that have to be selected such that (14) holds for $\rho = \rho_0(\mu)$. We set $\lambda_1 := \alpha/\hat{L}$, $\lambda_2 := \alpha(1 + m_f/\mu)/\mu$, and add/subtract $\alpha T T^T/(2\mu)$ to $R_{22}$ to obtain

$$R_{22} = \alpha \left( \frac{1}{\mu} - \frac{\hat{L}}{2\mu^2} \right) T T^T - \frac{\alpha}{2\mu} T T^T + \frac{\alpha}{\mu} T T^T - 2\rho\alpha \left( \left( 1 + \frac{m_f}{\mu} \right) I + \frac{1}{\mu^2} T T^T \right) + \frac{\alpha}{2\mu} T T^T.$$

If $\mu \geq \hat{L}$, then

$$\alpha \left( \frac{1}{\mu} - \frac{\hat{L}}{2\mu^2} \right) T T^T - \frac{\alpha}{2\mu} T T^T \succeq 0. \tag{21}$$

Furthermore, for $\rho = \rho_0$, we have

$$\frac{\alpha}{\mu} T T^T - 2\rho\alpha \left( \left( 1 + \frac{m_f}{\mu} \right) I + \frac{1}{\mu^2} T T^T \right) \succeq \frac{\alpha}{\mu} \sigma_{\min}^2(T) I - 2\rho\alpha \left( 1 + \frac{m_f}{\mu} + \frac{\sigma_{\max}^2(T)}{\mu^2} \right) I = 0. \tag{22}$$

Combining (21) and (22) with the definition of $R_{22}$ yields $R_{22} \succeq \alpha T T^T/(2\mu)$ and the positive definiteness of $R_{22}$ follows from the fact that $T$ is a full row rank matrix.

The application of the Schur complement requires $R_{11} - R_{21}^T R_{22}^{-1} R_{21} \succ 0$. Using $R_{22} \succeq \alpha T T^T/(2\mu)$, this condition is implied by $R_{11} - (2\mu/\alpha) R_{21}^T (T T^T)^{-1} R_{21} \succ 0$.

Fig. 2: Convergence of the primal-dual gradient flow dynamics (6) for problem (19) with (a) $(L_f, m_f)$ = (1. , . ) and (b) $(L_f, m_f)$ = (27. , . ). Each panel shows $\|w(t) - w^\star\|$ versus time (seconds).

Fig. 3: Convergence rate estimates, as a function of $\mu$, resulting from (18a) ( – – ), [18, Theorem 6] ( ··· ), and [11, Theorem 2] ( – - ) for problem (1) with (19) and (a) $(L_f, m_f)$ = (1. , . ); (b) $(L_f, m_f)$ = (27. , . ). Each panel shows $\rho$ versus $\mu$.
For $\beta(\mu)$ and $\hat{\mu}$ given by (18c) and (18b), respectively, if $\mu > \hat{\mu}$, we have

$$R_{11} - \frac{2\mu}{\alpha} R_{21}^T (T T^T)^{-1} R_{21} \succeq \alpha \left( 2 m_f - \left( 2\rho_0 + \frac{\sigma_{\max}^2(T)}{2\mu} \left( 1 + \frac{m_f}{\mu} \right) + \frac{8 \rho_0^2}{\mu} \right) \right) I = \alpha \left( 2 m_f - \beta(\mu) \right) I \succ 0.$$

We now prove the existence of such $\hat{\mu}$. Since $\rho_0(\mu)$ is monotonically decreasing for $\mu \geq \sigma_{\max}(T)$, $\beta(\mu)$ monotonically decreases to zero on the interval $\mu \in [\sigma_{\max}(T), \infty)$. There are two cases: (i) if $\beta(\sigma_{\max}(T)) > 2 m_f$, then $\beta(\bar{\mu}) = 2 m_f$ for some $\bar{\mu} \in [\sigma_{\max}(T), \infty)$ and $\hat{\mu} = \bar{\mu}$; (ii) if $\beta(\sigma_{\max}(T)) \leq 2 m_f$, then $\beta(\mu) < \beta(\sigma_{\max}(T)) \leq 2 m_f$ for all $\mu \in (\sigma_{\max}(T), \infty)$. Thus $\hat{\mu} = \sigma_{\max}(T)$. Therefore, the set in (18b) is nonempty and such $\hat{\mu}$ always exists.

REFERENCES

[1] D. Feijer and F. Paganini, "Stability of primal-dual gradient dynamics and applications to network optimization," Automatica, vol. 46, no. 12, pp. 1974–1981, 2010.
[2] D. Ding and M. R. Jovanović, "A primal-dual Laplacian gradient flow dynamics for distributed resource allocation problems," in Proceedings of the 2018 American Control Conference, 2018, pp. 5316–5320.
[3] J. Wang and N. Elia, "A control perspective for centralized and distributed convex optimization," in Proceedings of the 50th IEEE Conference on Decision and Control, 2011, pp. 3800–3805.
[4] M. Colombino, E. Dall'Anese, and A. Bernstein, "Online optimization as a feedback controller: stability and tracking," IEEE Trans. Control Netw. Syst., 2019, doi:10.1109/TCNS.2019.2906916.
[5] K. J. Arrow, L. Hurwicz, and H. Uzawa, Studies in linear and non-linear programming. Stanford University Press, 1958.
[6] A. Cherukuri, B. Gharesifard, and J. Cortés, "Saddle-point dynamics: conditions for asymptotic stability of saddle points," SIAM J. Control Optim., vol. 55, no. 1, pp. 486–511, 2017.
[7] A. Cherukuri, E. Mallada, and J. Cortés, "Asymptotic convergence of constrained primal-dual dynamics," Syst. Control Lett., vol. 87, pp. 10–15, 2016.
[8] A. Cherukuri, E. Mallada, S. Low, and J. Cortés, "The role of convexity on saddle-point dynamics: Lyapunov function and robustness," IEEE Trans. Automat. Control, vol. 63, no. 8, pp. 2449–2464, 2018.
[9] B. Polyak and P. Shcherbakov, "Lyapunov functions: An optimization theory perspective," IFAC-PapersOnLine, vol. 50, no. 1, pp. 7456–7461, 2017.
[10] N. K. Dhingra, S. Z. Khong, and M. R. Jovanović, "The proximal augmented Lagrangian method for nonsmooth composite optimization," IEEE Trans. Automat. Control, vol. 64, no. 7, pp. 2861–2868, 2019.
[11] G. Qu and N. Li, "On the exponential stability of primal-dual gradient dynamics," IEEE Control Syst. Lett., vol. 3, no. 1, pp. 43–48, 2019.
[12] Y. Tang, G. Qu, and N. Li, "Semi-global exponential stability of primal-dual gradient dynamics for constrained convex optimization," 2019, arXiv:1903.09580.
[13] X. Chen and N. Li, "Exponential stability of primal-dual gradient dynamics with non-strong convexity," 2019, arXiv:1905.00298.
[14] S. Hassan-Moghaddam and M. R. Jovanović, "Proximal gradient flow and Douglas-Rachford splitting dynamics: global exponential stability via integral quadratic constraints," Automatica, 2019, submitted; also arXiv:1908.09043.
[15] J. Cortés and S. K. Niederländer, "Distributed coordination for nonsmooth convex optimization via saddle-point dynamics," J. Nonlinear Sci., pp. 1–26, 2018.
[16] N. Parikh and S. Boyd, "Proximal algorithms," Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127–239, 2014.
[17] H. Zou and T. Hastie, "Regularization and variable selection via the elastic net," J. R. Stat. Soc. B, vol. 67, no. 2, pp. 301–320, 2005.
[18] D. Ding and M. R. Jovanović, "Global exponential stability of primal-dual gradient flow dynamics based on the proximal augmented Lagrangian," in Proceedings of the 2019 American Control Conference, 2019, pp. 3414–3419.
[19] L. Lessard, B. Recht, and A. Packard, "Analysis and design of optimization algorithms via integral quadratic constraints," SIAM J. Optim., vol. 26, no. 1, pp. 57–95, 2016.
[20] N. K. Dhingra, S. Z. Khong, and M. R. Jovanović, "A second order primal-dual method for nonsmooth convex composite optimization,"