A primal-dual dynamical approach to structured convex minimization problems

Abstract

In this paper we propose a primal-dual dynamical approach to the minimization of a structured convex function consisting of a smooth term, a nonsmooth term, and the composition of another nonsmooth term with a linear continuous operator. In this scope we introduce a dynamical system for which we prove that its trajectories asymptotically converge to a saddle point of the Lagrangian of the underlying convex minimization problem as time tends to infinity. In addition, we provide rates for both the violation of the feasibility condition by the ergodic trajectories and the convergence of the objective function along these ergodic trajectories to its minimal value. Explicit time discretization of the dynamical system results in a numerical algorithm which is a combination of the linearized proximal method of multipliers and the proximal ADMM algorithm.



Radu Ioan Boţ ∗ Ernö Robert Csetnek † Szilárd Csaba László ‡

May 22, 2019


Keywords. structured convex minimization, dynamical system, proximal ADMM algorithm, primal-dual algorithm

AMS Subject Classiﬁcation.

∗ University of Vienna, Faculty of Mathematics, Oskar-Morgenstern-Platz 1, A-1090 Vienna, Austria, email: [email protected]. Research partially supported by FWF (Austrian Science Fund), project I 2419-N32.

† University of Vienna, Faculty of Mathematics, Oskar-Morgenstern-Platz 1, A-1090 Vienna, Austria, email: [email protected]. Research supported by FWF (Austrian Science Fund), project P 29809-N32.

‡ Technical University of Cluj-Napoca, Department of Mathematics, Memorandumului 28, Cluj-Napoca, Romania, email: [email protected]. This work was supported by a grant of Ministry of Research and Innovation, CNCS - UEFISCDI, project number PN-III-P1-1.1-TE-2016-0266, within PNCDI III.

For $\mathcal{H}$ and $\mathcal{G}$ real Hilbert spaces, we consider the convex minimization problem

$$\inf_{x \in \mathcal{H}} f(x) + h(x) + g(Ax), \tag{1}$$

where $f : \mathcal{H} \to \overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$ and $g : \mathcal{G} \to \overline{\mathbb{R}}$ are proper, convex and lower semicontinuous functions, $h : \mathcal{H} \to \mathbb{R}$ is a convex and Fréchet differentiable function with $L_h$-Lipschitz continuous gradient ($L_h \geq 0$), i.e. $\|\nabla h(x) - \nabla h(y)\| \leq L_h \|x - y\|$ for every $x, y \in \mathcal{H}$, and $A : \mathcal{H} \to \mathcal{G}$ is a continuous linear operator.

Problem (1) can be rewritten as

$$\inf_{\substack{(x,z) \in \mathcal{H} \times \mathcal{G} \\ Ax - z = 0}} f(x) + h(x) + g(z). \tag{2}$$

Obviously, $x^* \in \mathcal{H}$ is an optimal solution of (1) if and only if $(x^*, z^*) \in \mathcal{H} \times \mathcal{G}$ is an optimal solution of (2) and $Ax^* = z^*$.

Based on this reformulation of problem (1) we define its Lagrangian

$$l : \mathcal{H} \times \mathcal{G} \times \mathcal{G} \to \overline{\mathbb{R}}, \quad l(x, z, y) = f(x) + h(x) + g(z) + \langle y, Ax - z \rangle.$$

An element $(x^*, z^*, y^*) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ is said to be a saddle point of the Lagrangian $l$ if

$$l(x^*, z^*, y) \leq l(x^*, z^*, y^*) \leq l(x, z, y^*) \quad \forall (x, z, y) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}.$$
It is known that $(x^*, z^*, y^*) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ is a saddle point of $l$ if and only if $x^*$ is an optimal solution of (1), $Ax^* = z^*$, and $y^*$ is an optimal solution of the Fenchel dual to problem (1), which reads

$$\sup_{y \in \mathcal{G}} \big( -(f^* \,\square\, h^*)(-A^* y) - g^*(y) \big). \tag{3}$$

In this situation the optimal objective values of (1) and (3) coincide.

In the formulation of (3),

$$f^* : \mathcal{H} \to \overline{\mathbb{R}}, \quad f^*(u) = \sup_{x \in \mathcal{H}} \big( \langle u, x \rangle - f(x) \big),$$

$$h^* : \mathcal{H} \to \overline{\mathbb{R}}, \quad h^*(u) = \sup_{x \in \mathcal{H}} \big( \langle u, x \rangle - h(x) \big),$$

and

$$g^* : \mathcal{G} \to \overline{\mathbb{R}}, \quad g^*(y) = \sup_{z \in \mathcal{G}} \big( \langle y, z \rangle - g(z) \big)$$

denote the conjugate functions of $f$, $h$ and $g$, respectively, and $A^* : \mathcal{G} \to \mathcal{H}$ denotes the adjoint operator of $A$. The infimal convolution $f^* \,\square\, h^* : \mathcal{H} \to \overline{\mathbb{R}}$ of the functions $f^*$ and $h^*$ is defined by

$$(f^* \,\square\, h^*)(x) = \inf_{y \in \mathcal{H}} \big( f^*(y) + h^*(x - y) \big).$$

It is also known that $(x^*, z^*, y^*) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ is a saddle point of the Lagrangian $l$ if and only if it is a solution of the following system of primal-dual optimality conditions

$$\begin{cases} 0 \in \partial f(x) + \nabla h(x) + A^* y \\ Ax = z, \ Ax \in \partial g^*(y). \end{cases}$$

We recall that the convex subdifferential of the function $f : \mathcal{H} \to \overline{\mathbb{R}}$ at $x \in \mathcal{H}$ is defined by

$$\partial f(x) = \{ u \in \mathcal{H} : f(x') - f(x) \geq \langle u, x' - x \rangle \ \forall x' \in \mathcal{H} \}$$

for $f(x) \in \mathbb{R}$, and by $\partial f(x) = \emptyset$ otherwise.

A saddle point of the Lagrangian $l$ exists whenever the primal problem (1) has an optimal solution and the so-called Attouch-Brézis regularity condition

$$0 \in \operatorname{sqri}(\operatorname{dom} g - A(\operatorname{dom} f))$$

holds. Here,

$$\operatorname{sqri} Q := \{ x \in Q : \cup_{\lambda > 0} \lambda (Q - x) \text{ is a closed linear subspace of } \mathcal{G} \}$$

denotes the strong quasi-relative interior of a set $Q \subseteq \mathcal{G}$. We refer the reader to [9, 11, 28] for more insights into the world of regularity conditions and convex duality theory.

Let $S_+(\mathcal{H})$ denote the family of continuous linear operators $U : \mathcal{H} \to \mathcal{H}$ which are self-adjoint and positive semidefinite.
For $U \in S_+(\mathcal{H})$ we introduce the following seminorm on $\mathcal{H}$:

$$\|x\|_U^2 = \langle x, Ux \rangle \quad \forall x \in \mathcal{H}.$$

This introduces on $S_+(\mathcal{H})$ the following partial ordering: for $U_1, U_2 \in S_+(\mathcal{H})$

$$U_1 \succcurlyeq U_2 \ \Leftrightarrow \ \|x\|_{U_1}^2 \geq \|x\|_{U_2}^2 \quad \forall x \in \mathcal{H}.$$

For $\alpha > 0$ we set

$$P_\alpha(\mathcal{H}) = \{ U \in S_+(\mathcal{H}) : U \succcurlyeq \alpha I \},$$

where $I : \mathcal{H} \to \mathcal{H}$, $I(x) = x$, denotes the identity operator on $\mathcal{H}$.

The subject of our investigations in this paper will be the following dynamical system, for which we will show that it asymptotically approaches the set of solutions of the primal-dual pair of optimization problems (1)-(3):

$$\begin{cases} \dot{x}(t) + x(t) \in \big( \partial f + cA^*A + M_1(t) \big)^{-1} \big( M_1(t) x(t) + cA^* z(t) - A^* y(t) - \nabla h(x(t)) \big) \\ \dot{z}(t) + z(t) \in \big( \partial g + cI + M_2(t) \big)^{-1} \big( M_2(t) z(t) + cA(\gamma \dot{x}(t) + x(t)) + y(t) \big) \\ \dot{y}(t) = cA(x(t) + \dot{x}(t)) - c(z(t) + \dot{z}(t)) \\ x(0) = x^0 \in \mathcal{H}, \ z(0) = z^0 \in \mathcal{G}, \ y(0) = y^0 \in \mathcal{G}, \end{cases} \tag{4}$$

where $c > 0$, $\gamma \in [0, 1]$, $M_1 : [0, +\infty) \to S_+(\mathcal{H})$ and $M_2 : [0, +\infty) \to S_+(\mathcal{G})$.

One of the motivations for the study of this dynamical system comes from the fact that, as we will see in Remark 1, it provides through explicit time discretization a numerical algorithm which is a combination of the linearized proximal method of multipliers and the proximal ADMM algorithm.

In the next section we will show the existence and uniqueness of strong global solutions for the dynamical system (4) in the framework of the Cauchy-Lipschitz Theorem. In Section 3 we will prove some technical results, which will play an important role in the asymptotic analysis. In Section 4 we will investigate the asymptotic behaviour of the trajectories as the time tends to infinity. By carrying out a Lyapunov analysis and by relying on the continuous variant of the Opial Lemma, we are able to prove that the trajectories generated by (4) converge asymptotically to a saddle point of the Lagrangian $l$.
Furthermore, we provide convergence rates of $\mathcal{O}(1/t)$ for the violation of the feasibility condition by ergodic trajectories and the convergence of the objective function along these ergodic trajectories to its minimal value.

The approach of optimization problems by dynamical systems has a long tradition. Crandall and Pazy considered in [20] dynamical systems governed by subdifferential operators (and, more generally, by maximally monotone operators) in Hilbert spaces, addressed questions like the existence and uniqueness of solution trajectories, and related the latter to the theory of semi-groups of nonlinear contractions. Brézis [14] studied the asymptotic behaviour of the trajectories for dynamical systems governed by convex subdifferentials, and Bruck carried out in [15] a similar analysis for maximally monotone operators. Dynamical systems defined via resolvent/proximal evaluations of the governing operators have enjoyed much attention in the last years, as they result by explicit time discretization in relaxed versions of standard numerical algorithms, with high flexibility and good numerical performances. Abbas and Attouch introduced in [1] a forward-backward dynamical system, by extending to more general optimization problems an approach proposed by Antipin in [5] and Bolte in [10] on a gradient-projected dynamical system associated to the minimization of a smooth convex function over a convex closed set. Implicit dynamical systems were considered also in [13] in the context of monotone inclusion problems. A dynamical system of forward-backward-forward type was considered in [7], while a dynamical system of Douglas-Rachford type was recently introduced in [21].

It is important to notice that the approaches mentioned above have been introduced in connection with the study of "simple" monotone inclusion and convex minimization problems.
They rely on straightforward splitting strategies and cannot be efficiently used when addressing structured minimization problems, like (1), which need to be addressed from a primal and a dual perspective and thus require tools and techniques from convex duality theory. The dynamical approach we introduce and investigate in this paper is, to our knowledge, the first meant to address structured convex minimization problems in the spirit of the full splitting paradigm.

Remark 1.

The first inclusion in (4) can be equivalently written as

$$0 \in \partial f(\dot{x}(t) + x(t)) + cA^*A(\dot{x}(t) + x(t)) + M_1(t)\dot{x}(t) - \big( cA^* z(t) - A^* y(t) - \nabla h(x(t)) \big) \quad \forall t \in [0, +\infty), \tag{5}$$

while the second one as

$$0 \in \partial g(\dot{z}(t) + z(t)) + c(\dot{z}(t) + z(t)) - cA(\gamma \dot{x}(t) + x(t)) - y(t) + M_2(t)\dot{z}(t) \quad \forall t \in [0, +\infty). \tag{6}$$

The explicit discretization of (5) with respect to the time variable $t$ and constant step size $h_k \equiv 1$ yields the iterative scheme

$$0 \in \frac{1}{c} \partial f(x^{k+1}) + A^*A x^{k+1} + \frac{M_1^k}{c}(x^{k+1} - x^k) - A^* z^k + \frac{A^*}{c} y^k + \frac{1}{c} \nabla h(x^k) \quad \forall k \geq 0.$$

By convex subdifferential calculus, one can easily see that this can be equivalently written for every $k \geq 0$ as

$$0 \in \partial \left( f(x) + \langle x - x^k, \nabla h(x^k) \rangle + \frac{c}{2} \left\| Ax - z^k + \frac{y^k}{c} \right\|^2 + \frac{1}{2} \|x - x^k\|_{M_1^k}^2 \right) \Big|_{x = x^{k+1}}$$

and, further, as

$$x^{k+1} \in \operatorname*{argmin}_{x \in \mathcal{H}} \left( f(x) + \langle x - x^k, \nabla h(x^k) \rangle + \frac{c}{2} \left\| Ax - z^k + \frac{y^k}{c} \right\|^2 + \frac{1}{2} \|x - x^k\|_{M_1^k}^2 \right).$$

Similarly, (6) leads for every $k \geq 0$ to

$$0 \in \partial \left( g(z) + \frac{c}{2} \left\| A(\gamma x^{k+1} + (1 - \gamma) x^k) - z + \frac{y^k}{c} \right\|^2 + \frac{1}{2} \|z - z^k\|_{M_2^k}^2 \right) \Big|_{z = z^{k+1}},$$

which is nothing else than

$$z^{k+1} \in \operatorname*{argmin}_{z \in \mathcal{G}} \left( g(z) + \frac{c}{2} \left\| A(\gamma x^{k+1} + (1 - \gamma) x^k) - z + \frac{y^k}{c} \right\|^2 + \frac{1}{2} \|z - z^k\|_{M_2^k}^2 \right).$$
Here, $(M_1^k)_{k \geq 0}$ and $(M_2^k)_{k \geq 0}$ are two operator sequences in $S_+(\mathcal{H})$ and $S_+(\mathcal{G})$, respectively.

Thus the dynamical system (4) leads through explicit time discretization to a numerical algorithm, which, for a starting point $(x^0, z^0, y^0) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$, generates a sequence $(x^k, z^k, y^k)_{k \geq 0}$ for every $k \geq 0$ as follows:

$$\begin{cases} x^{k+1} \in \operatorname*{argmin}_{x \in \mathcal{H}} \left( f(x) + \langle x - x^k, \nabla h(x^k) \rangle + \frac{c}{2} \left\| Ax - z^k + \frac{y^k}{c} \right\|^2 + \frac{1}{2} \|x - x^k\|_{M_1^k}^2 \right) \\ z^{k+1} \in \operatorname*{argmin}_{z \in \mathcal{G}} \left( g(z) + \frac{c}{2} \left\| A(\gamma x^{k+1} + (1 - \gamma) x^k) - z + \frac{y^k}{c} \right\|^2 + \frac{1}{2} \|z - z^k\|_{M_2^k}^2 \right) \\ y^{k+1} = y^k + c(Ax^{k+1} - z^{k+1}). \end{cases} \tag{7}$$

The algorithm (7) is a combination of the linearized proximal method of multipliers and the proximal ADMM algorithm.

Indeed, in the case when $\gamma = 1$, (7) becomes the proximal ADMM algorithm with variable metrics from [8] (see, also, [12]). If, in addition, $h = 0$ and the operator sequences $(M_1^k)_{k \geq 0}$ and $(M_2^k)_{k \geq 0}$ are constant, then (7) becomes the proximal ADMM algorithm investigated in [25, Section 3.2] (see, also, [23]). It is known that the proximal ADMM algorithm can be seen as a generalization of the full splitting primal-dual algorithms of Chambolle-Pock (see [16]) and Condat-Vu (see [19, 27]).

On the other hand, in the case when $\gamma = 0$, (7) becomes an extension of the linearized proximal method of multipliers of Chen-Teboulle (see [17], [25, Algorithm 1]).

In the following remark we provide a particular choice for the linear maps $M_1$ and $M_2$, which transforms (4) into a dynamical system of primal-dual type formulated in the spirit of the full splitting paradigm.

Remark 2.

For every $t \in [0, +\infty)$, define

$$M_1(t) = \frac{1}{\tau(t)} I - cA^*A \quad \text{and} \quad M_2(t) = 0,$$

where $\tau(t) > 0$ and $c\tau(t)\|A\|^2 \leq 1$ for every $t \in [0, +\infty)$.

Let $t \in [0, +\infty)$ be fixed. In this particular setting, (5) is equivalent to

$$\left( \frac{1}{\tau(t)} I - cA^*A \right) x(t) + cA^* z(t) - A^* y(t) - \nabla h(x(t)) \in \frac{1}{\tau(t)} \dot{x}(t) + \frac{1}{\tau(t)} x(t) + \partial f(\dot{x}(t) + x(t))$$

and further to

$$\dot{x}(t) + x(t) = (I + \tau(t) \partial f)^{-1} \big( (I - c\tau(t) A^*A) x(t) + c\tau(t) A^* z(t) - \tau(t) A^* y(t) - \tau(t) \nabla h(x(t)) \big).$$

In other words,

$$\dot{x}(t) + x(t) = \operatorname{prox}_{\tau(t) f} \big( (I - c\tau(t) A^*A) x(t) + c\tau(t) A^* z(t) - \tau(t) A^* y(t) - \tau(t) \nabla h(x(t)) \big),$$

where

$$\operatorname{prox}_\kappa : \mathcal{H} \to \mathcal{H}, \quad \operatorname{prox}_\kappa(x) = \operatorname*{argmin}_{y \in \mathcal{H}} \left\{ \kappa(y) + \frac{1}{2} \|x - y\|^2 \right\} = (I + \partial \kappa)^{-1}(x),$$

denotes the proximal point operator of a proper, convex and lower semicontinuous function $\kappa : \mathcal{H} \to \overline{\mathbb{R}}$.

On the other hand, relation (6) is equivalent to

$$\dot{y}(t) + y(t) + c(\gamma - 1) A \dot{x}(t) \in \partial g(\dot{z}(t) + z(t)),$$

hence,

$$\dot{z}(t) + z(t) \in \partial g^* \big( \dot{y}(t) + y(t) + c(\gamma - 1) A \dot{x}(t) \big).$$

This is further equivalent to

$$A(\gamma \dot{x}(t) + x(t)) + \frac{1}{c} y(t) \in \frac{1}{c} \dot{y}(t) + \frac{1}{c} y(t) + (\gamma - 1) A \dot{x}(t) + \partial g^* \big( \dot{y}(t) + y(t) + c(\gamma - 1) A \dot{x}(t) \big)$$

and further to

$$\dot{y}(t) + y(t) + c(\gamma - 1) A \dot{x}(t) = (I + c \partial g^*)^{-1} \big( cA(\gamma \dot{x}(t) + x(t)) + y(t) \big).$$

In other words,

$$\dot{y}(t) + y(t) + c(\gamma - 1) A \dot{x}(t) = \operatorname{prox}_{c g^*} \big( cA(\gamma \dot{x}(t) + x(t)) + y(t) \big).$$

Consequently, in this setting the dynamical system (4) reads

$$\begin{cases} \dot{x}(t) + x(t) = \operatorname{prox}_{\tau(t) f} \big( (I - c\tau(t) A^*A) x(t) + c\tau(t) A^* z(t) - \tau(t) A^* y(t) - \tau(t) \nabla h(x(t)) \big) \\ \dot{y}(t) + y(t) + c(\gamma - 1) A \dot{x}(t) = \operatorname{prox}_{c g^*} \big( cA(\gamma \dot{x}(t) + x(t)) + y(t) \big) \\ \dot{y}(t) = cA(x(t) + \dot{x}(t)) - c(z(t) + \dot{z}(t)) \\ x(0) = x^0 \in \mathcal{H}, \ z(0) = z^0 \in \mathcal{G}, \ y(0) = y^0 \in \mathcal{G}. \end{cases} \tag{8}$$

Let us also mention that when $h = 0$ and $\gamma = 1$ the dynamical system (8) reads

$$\begin{cases} \dot{x}(t) + x(t) = \operatorname{prox}_{\tau(t) f} \big( x(t) - \tau(t) A^* ( y(t) + cAx(t) - cz(t) ) \big) \\ \dot{y}(t) + y(t) = \operatorname{prox}_{c g^*} \big( y(t) + cA(\dot{x}(t) + x(t)) \big) \\ \dot{y}(t) = cA(x(t) + \dot{x}(t)) - c(z(t) + \dot{z}(t)) \\ x(0) = x^0 \in \mathcal{H}, \ z(0) = z^0 \in \mathcal{G}, \ y(0) = y^0 \in \mathcal{G}. \end{cases} \tag{9}$$

The explicit time discretization of (9) leads to a numerical algorithm, which, for a starting point $(x^0, z^0, y^0) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$, generates the sequence $(x^k, z^k, y^k)_{k \geq 0}$ for every $k \geq 0$ as follows:

$$\begin{cases} x^{k+1} = \operatorname{prox}_{\tau_k f} \big( x^k - \tau_k A^* ( y^k + cAx^k - cz^k ) \big) \\ y^{k+1} = \operatorname{prox}_{c g^*} ( y^k + cAx^{k+1} ) \\ y^{k+1} = y^k + c(Ax^{k+1} - z^{k+1}). \end{cases} \tag{10}$$

By substituting in the first equation of (10) the term $cAx^k - cz^k$ by $y^k - y^{k-1}$, which is allowed according to the last equation, one can easily see that (10) is equivalent to the following numerical algorithm, which, for a starting point $(x^0, y^0, y^{-1}) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$, $y^0 = y^{-1}$, generates the sequence $(x^k, y^k)_{k \geq 0}$ for every $k \geq 0$ as follows:

$$\begin{cases} x^{k+1} = \operatorname{prox}_{\tau_k f} \big( x^k - \tau_k A^* ( 2y^k - y^{k-1} ) \big) \\ y^{k+1} = \operatorname{prox}_{c g^*} ( y^k + cAx^{k+1} ). \end{cases} \tag{11}$$

For $\tau_k = \tau > 0$ for every $k \geq 0$, (11) is nothing else than the primal-dual algorithm proposed by Chambolle and Pock in [16].
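The iteration (11) can be sketched in a few lines of code. The snippet below is only an illustration under assumptions that are not part of the remark: it takes $\mathcal{H} = \mathcal{G} = \mathbb{R}^2$, $f = \|\cdot\|_1$, $g = \|\cdot\|_2$ (so that $\operatorname{prox}_{cg^*}$ is the projection onto the Euclidean unit ball) and $A(x_1, x_2) = (x_1 - x_2, x_1 + x_2)$. The function names, the starting point and the step sizes are hypothetical choices.

```python
import math

def A(x):
    # Assumed operator A(x1, x2) = (x1 - x2, x1 + x2); its norm is sqrt(2).
    return [x[0] - x[1], x[0] + x[1]]

def A_adj(y):
    # Adjoint of the assumed operator: A*(y1, y2) = (y1 + y2, -y1 + y2).
    return [y[0] + y[1], -y[0] + y[1]]

def prox_l1(v, tau):
    # prox of tau*||.||_1, written as v - tau * proj_{[-1,1]^2}(v / tau).
    return [vi - tau * max(-1.0, min(1.0, vi / tau)) for vi in v]

def proj_unit_ball(y):
    # prox of c*g^* for g = ||.||_2: projection onto the closed unit ball.
    n = math.hypot(y[0], y[1])
    return list(y) if n <= 1.0 else [yi / n for yi in y]

def primal_dual(x0, y0, tau, c, iters):
    # Iteration (11) with constant tau_k = tau and y^{-1} = y^0.
    x, y, y_prev = list(x0), list(y0), list(y0)
    for _ in range(iters):
        y_bar = [2.0 * a - b for a, b in zip(y, y_prev)]   # 2 y^k - y^{k-1}
        g = A_adj(y_bar)
        x = prox_l1([xi - tau * gi for xi, gi in zip(x, g)], tau)
        Ax = A(x)
        y_prev = y
        y = proj_unit_ball([yi + c * ai for yi, ai in zip(y, Ax)])
    return x, y

# Hypothetical starting point and parameters with tau * c * ||A||^2 = 0.5.
x, y = primal_dual([-10.0, 10.0], [-10.0, 10.0], tau=0.5, c=0.5, iters=5000)
```

With these choices $x = (0, 0)$ is the unique minimizer of $\|x\|_1 + \|Ax\|_2$, so the returned primal iterate should be close to the origin, while the dual iterate stays in the unit ball by construction.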

Example 1.

In this example we will illustrate via some numerical experiments the way in which the parameters $\gamma$, $c$ and $\tau(t)$, $t \in [0, +\infty)$, may influence the asymptotic convergence of the primal and dual trajectories. In this scope, we considered the following primal optimization problem

$$\inf_{(x_1, x_2) \in \mathbb{R}^2} \|x\|_1 + \sqrt{(x_1 - x_2)^2 + (x_1 + x_2)^2}, \tag{12}$$

which is in fact problem (1) written in the following particular setting: $\mathcal{H} = \mathcal{G} = \mathbb{R}^2$, $f, g, h : \mathbb{R}^2 \to \mathbb{R}$, $f(x) = \|x\|_1$, $g(x) = \|x\|_2$, $h(x) = 0$ for every $x \in \mathbb{R}^2$, and $A : \mathbb{R}^2 \to \mathbb{R}^2$, $A(x_1, x_2) = (x_1 - x_2, x_1 + x_2)$. One can easily see that $x = (0, 0)$ is the unique optimal solution of (12) and that

$$\sup_{\substack{\|(y_1, y_2)\|_2 \leq 1, \\ |y_1 + y_2| \leq 1, \ |-y_1 + y_2| \leq 1}} 0 \tag{13}$$

is the Fenchel dual problem of (12). This means that every feasible element of (13) is a dual optimal solution.

Figure 1: First row: the primal trajectory $x(t)$ approaching the primal optimal solution $(0, 0)$ for $\tau c = 0.49$ and starting point $x^0 = (-\ldots, \ldots)$. Second row: the dual trajectory $y(t)$ approaching a dual optimal solution for $\tau c = 0.49$ and starting point $y^0 = (-\ldots, \ldots)$. Panels in each row: $\gamma = 0.99$, $\gamma = 0.5$, $\gamma = 0.1$.

We considered the dynamical system (8) attached to the primal-dual pair (12)-(13) with starting points $x^0 = (-\ldots, \ldots)$, $z^0 = Ax^0 = (-\ldots, 0)$ and $y^0 = (-\ldots, 10)$ in the case when $\tau(t) = \tau > 0$ for every $t \in [0, +\infty)$ is a constant function. In order to solve the resulting dynamical system we used the Matlab function ode15s and, to this end, we reformulated it as

$$\begin{cases} \dot{U}(t) = \Gamma(U(t)) \\ U(0) = (x^0, y^0, z^0), \end{cases}$$

where $U(t) = (x(t), y(t), z(t)) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ and

$$\Gamma : \mathcal{H} \times \mathcal{G} \times \mathcal{G} \to \mathcal{H} \times \mathcal{G} \times \mathcal{G}, \quad \Gamma(u_1, u_2, u_3) = (\bar{u}_1, \bar{u}_2, \bar{u}_3),$$

is defined as

$$\begin{cases} \bar{u}_1 = \operatorname{prox}_{\tau f} \big( u_1 - \tau A^* ( u_2 + cAu_1 - cu_3 ) \big) - u_1 \\ \bar{u}_2 = \operatorname{prox}_{c g^*} \big( u_2 + cA(\gamma \bar{u}_1 + u_1) \big) - u_2 - c(\gamma - 1) A \bar{u}_1 \\ \bar{u}_3 = A(u_1 + \bar{u}_1) - u_3 - \frac{1}{c} \bar{u}_2. \end{cases}$$

Figure 2: First row: the primal trajectory $x(t)$ approaching the primal optimal solution $(0, 0)$ for $\tau c = 0.25$ and starting point $x^0 = (-\ldots, \ldots)$. Second row: the dual trajectory $y(t)$ approaching a dual optimal solution for $\tau c = 0.25$ and starting point $y^0 = (-\ldots, \ldots)$. Panels in each row: $\gamma = 0.99$, $\gamma = 0.5$, $\gamma = 0.1$.

Notice that

$$A^*(x_1, x_2) = (x_1 + x_2, -x_1 + x_2) \quad \forall (x_1, x_2) \in \mathbb{R}^2,$$

$$\operatorname{prox}_{\tau f}(x) = x - \tau \operatorname{proj}_{[-1, 1]^2} \left( \frac{1}{\tau} x \right) \quad \forall x \in \mathbb{R}^2,$$

$$\operatorname{prox}_{c g^*}(y) = \begin{cases} y, & \text{if } \|y\| \leq 1, \\ \frac{1}{\|y\|} y, & \text{otherwise}, \end{cases} \quad \forall y \in \mathbb{R}^2,$$

where $\operatorname{proj}_Q$ denotes the projection operator on a convex and closed set $Q \subseteq \mathcal{H}$.

As we will see later in Theorem 12, the asymptotic convergence of the trajectories as the time tends to infinity can be proved when $\tau c \|A\|^2 \leq 1$. Since $\|A\| = \sqrt{2}$, we considered for $\tau c \in (0, \frac{1}{2})$ three different choices, namely, $\tau c = 0.49$, $0.25$ and $0.1$. The primal and the dual trajectories generated by the dynamical system for each of these three choices are represented in Figures 1, 2 and 3, respectively. The first row of each figure represents the primal trajectories $x(t)$ for $\gamma = 0.99$, $0.5$ and $0.1$, while the second row represents the dual trajectories $y(t)$ for the same choices of the parameter $\gamma$.

One can see that the parameter $\gamma$ plays a regularizing role in the dynamical system. Namely, in all three figures, thus somehow independently of the choice of the parameters $\tau$ and $c$, the convergence behaviour of the primal trajectories, which approach the unique primal optimal solution $(0, 0)$, is more stable when $\gamma$ gets closer to 0. For the dual trajectories we can observe the reverse phenomenon. Namely, in all three figures, thus also independently of the choice of the other parameters, the dual trajectories, which approach a dual optimal solution, are more stable when $\gamma$ gets closer to 1.
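To make the reformulation $\dot{U}(t) = \Gamma(U(t))$ of this example concrete, here is a minimal sketch of the vector field $\Gamma$ for $f = \|\cdot\|_1$, $g = \|\cdot\|_2$, $h = 0$ and $A(x_1, x_2) = (x_1 - x_2, x_1 + x_2)$. It is not the Matlab code used for the figures: the stiff solver ode15s is replaced by a plain forward Euler loop purely for illustration, and the function names, the step size `dt` and the separate values of `tau` and `c` are assumptions of this sketch.

```python
import math

def A(x):
    return [x[0] - x[1], x[0] + x[1]]

def A_adj(y):
    return [y[0] + y[1], -y[0] + y[1]]

def prox_tau_f(v, tau):
    # prox_{tau f}(v) = v - tau * proj_{[-1,1]^2}(v / tau) for f = ||.||_1.
    return [vi - tau * max(-1.0, min(1.0, vi / tau)) for vi in v]

def prox_c_gstar(y):
    # Projection onto the Euclidean unit ball (g^* is its indicator function).
    n = math.hypot(y[0], y[1])
    return list(y) if n <= 1.0 else [yi / n for yi in y]

def gamma_field(x, y, z, tau, c, gamma):
    # Right-hand side Gamma(u1, u2, u3) with u1 = x, u2 = y, u3 = z.
    Ax = A(x)
    w = A_adj([yi + c * (ai - zi) for yi, ai, zi in zip(y, Ax, z)])
    p = prox_tau_f([xi - tau * wi for xi, wi in zip(x, w)], tau)
    dx = [pi - xi for pi, xi in zip(p, x)]
    Adx = A(dx)
    q = prox_c_gstar([yi + c * (gamma * di + ai)
                      for yi, di, ai in zip(y, Adx, Ax)])
    dy = [qi - yi - c * (gamma - 1.0) * di for qi, yi, di in zip(q, y, Adx)]
    dz = [ai + di - zi - dyi / c for ai, di, zi, dyi in zip(Ax, Adx, z, dy)]
    return dx, dy, dz

def euler(x, y, z, tau, c, gamma, dt, steps):
    # Forward Euler stand-in for the ODE solver (an assumption of this sketch).
    for _ in range(steps):
        dx, dy, dz = gamma_field(x, y, z, tau, c, gamma)
        x = [a + dt * b for a, b in zip(x, dx)]
        y = [a + dt * b for a, b in zip(y, dy)]
        z = [a + dt * b for a, b in zip(z, dz)]
    return x, y, z
```

A call such as `euler(x0, y0, A(x0), tau, c, gamma, dt=0.01, steps=10000)` integrates up to $t = 100$, the time horizon shown in the figures; any $\tau$ and $c$ with $\tau c \|A\|^2 \leq 1$ fit the setting of the example. The origin $(x, y, z) = (0, 0, 0)$ is an equilibrium of the field, which gives a quick sanity check of the implementation.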

Figure 3: First row: the primal trajectory $x(t)$ approaching the primal optimal solution $(0, 0)$ for $\tau c = 0.1$ and starting point $x^0 = (-\ldots, \ldots)$. Second row: the dual trajectory $y(t)$ approaching a dual optimal solution for $\tau c = 0.1$ and starting point $y^0 = (-\ldots, \ldots)$. Panels in each row: $\gamma = 0.99$, $\gamma = 0.5$, $\gamma = 0.1$.

Notations.

The following two functions will play an important role in the forthcoming analysis:

$$F : [0, +\infty) \times \mathcal{H} \to \overline{\mathbb{R}}, \quad F(t, x) = f(x) + \frac{c}{2} \big( \|Ax\|^2 - \|x\|^2 \big) + \frac{1}{2} \|x\|_{M_1(t)}^2,$$

and

$$G : [0, +\infty) \times \mathcal{G} \to \overline{\mathbb{R}}, \quad G(t, z) = g(z) + \frac{1}{2} \|z\|_{M_2(t)}^2.$$

With these two notations, the dynamical system (4) can be rewritten as

$$\begin{cases} \dot{x}(t) + x(t) \in \operatorname*{argmin}_{x \in \mathcal{H}} \left( F(t, x) + \frac{c}{2} \left\| x - \left( \frac{1}{c} M_1(t) x(t) + A^* z(t) - \frac{1}{c} A^* y(t) - \frac{1}{c} \nabla h(x(t)) \right) \right\|^2 \right) \\ \dot{z}(t) + z(t) = \operatorname*{argmin}_{z \in \mathcal{G}} \left( G(t, z) + \frac{c}{2} \left\| z - \left( \frac{1}{c} M_2(t) z(t) + A(\gamma \dot{x}(t) + x(t)) + \frac{1}{c} y(t) \right) \right\|^2 \right) \\ \dot{y}(t) = cA(x(t) + \dot{x}(t)) - c(z(t) + \dot{z}(t)) \\ x(0) = x^0 \in \mathcal{H}, \ y(0) = y^0 \in \mathcal{G}, \ z(0) = z^0 \in \mathcal{G}. \end{cases} \tag{14}$$

Let $t \in [0, +\infty)$ be fixed. The function $G(t, \cdot)$ is proper, convex and lower semicontinuous, hence $z \mapsto G(t, z) + \frac{c}{2} \|z - v\|^2$ is proper, strongly convex and lower semicontinuous for every $v \in \mathcal{G}$. This allows us to use the equality sign in the second relation of (14). On the other hand, a sufficient condition which guarantees that the function

$$x \mapsto F(t, x) + \frac{c}{2} \left\| x - \left( \frac{1}{c} M_1(t) x(t) + A^* z(t) - \frac{1}{c} A^* y(t) - \frac{1}{c} \nabla h(x(t)) \right) \right\|^2,$$

which is proper and lower semicontinuous, is strongly convex is that there exists $\alpha(t) > 0$ such that $cA^*A + M_1(t) \in P_{\alpha(t)}(\mathcal{H})$. This actually ensures that $x \mapsto F(t, x) + \frac{c}{2} \|x - u\|^2$ is proper, strongly convex and lower semicontinuous for every $u \in \mathcal{H}$.

This means that if the assumption

(Cweak) for every $t \in [0, +\infty)$ there exists $\alpha(t) > 0$ such that $cA^*A + M_1(t) \in P_{\alpha(t)}(\mathcal{H})$

holds, then we can use the equality sign also in the first relation of (14).
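As a quick illustration of (Cweak), consider the particular choice of $M_1$ from Remark 2. The short computation below is a sketch added for orientation and relies only on the definitions given so far.

```latex
% Choice from Remark 2: M_1(t) = (1/\tau(t)) I - c A^*A, with \tau(t) > 0
% and c \tau(t) \|A\|^2 \le 1 (so that M_1(t) is positive semidefinite).
cA^*A + M_1(t) \;=\; cA^*A + \frac{1}{\tau(t)} I - cA^*A \;=\; \frac{1}{\tau(t)} I
\;\succcurlyeq\; \alpha(t) I
\qquad \text{with } \alpha(t) := \frac{1}{\tau(t)} > 0 .

% Operator from Example 1: A(x_1, x_2) = (x_1 - x_2, x_1 + x_2) on \mathbb{R}^2.
A^*A \;=\; \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
           \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
     \;=\; 2I
\quad\Longrightarrow\quad
cA^*A + M_1(t) \;\succcurlyeq\; 2c\, I \quad \text{for every } M_1(t) \in S_+(\mathbb{R}^2).
```

In both cases (Cweak) holds, in the second one even with a constant $\alpha(t) = 2c$ that does not depend on $t$.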
It is easy to see that, if (Cweak) holds, then $\partial f + cA^*A + M_1(t)$ is $\alpha(t)$-strongly monotone for every $t \in [0, +\infty)$. In other words, for every $t \in [0, +\infty)$, all $u, x \in \mathcal{H}$ and all $u^* \in (\partial f + cA^*A + M_1(t))(u)$, $x^* \in (\partial f + cA^*A + M_1(t))(x)$ we have

$$\langle u^* - x^*, u - x \rangle \geq \alpha(t) \|u - x\|^2.$$

Notice that, since $A^*A \in S_+(\mathcal{H})$ and $M_1(t) \in S_+(\mathcal{H})$ for every $t \in [0, +\infty)$, (Cweak) is fulfilled if for every $t \in [0, +\infty)$ there exists $\alpha(t) > 0$ such that

$$M_1(t) \in P_{\alpha(t)}(\mathcal{H}) \tag{15}$$

or if there exists $\alpha > 0$ such that

$$A^*A \in P_\alpha(\mathcal{H}). \tag{16}$$

Notice also that, if $\mathcal{H}$ is a finite dimensional Hilbert space, then (16), which is independent of $t$, is nothing else than that $A^*A$ is positive definite or, equivalently, that $A$ is injective.

Let $S = \{ x \in \mathcal{H} : \|x\| = 1 \}$ be the unit sphere of $\mathcal{H}$. Assumption (Cweak) is fulfilled if and only if $\inf_{x \in S} \|x\|_{cA^*A + M_1(t)}^2 > 0$ for every $t \in [0, +\infty)$. In this case we can take $\alpha(t) := \inf_{x \in S} \|x\|_{cA^*A + M_1(t)}^2$ for every $t \in [0, +\infty)$.

In this section we will investigate the existence and uniqueness of the trajectories generated by (4). We start by recalling the definition of a locally absolutely continuous map.

Definition 1.

A function $x : [0, +\infty) \to \mathcal{H}$ is said to be locally absolutely continuous if it is absolutely continuous on every interval $[0, T]$, $T > 0$; that is, for every $T > 0$ there exists an integrable function $y : [0, T] \to \mathcal{H}$ such that

$$x(t) = x(0) + \int_0^t y(s)\, ds \quad \forall t \in [0, T].$$

Remark 3. (a) Every absolutely continuous function is differentiable almost everywhere, its derivative coincides with its distributional derivative almost everywhere and one can recover the function from its derivative $\dot{x} = y$ by the above integration formula.

(b) Let $T > 0$ and $x : [0, T] \to \mathcal{H}$. The absolute continuity of $x$ is equivalent to (see [2, 6]): for every $\varepsilon > 0$ there exists $\eta > 0$ such that for any finite family of intervals $I_k = (a_k, b_k) \subseteq [0, T]$ the following property holds: for any subfamily of disjoint intervals $I_j$ with $\sum_j |b_j - a_j| < \eta$ it holds $\sum_j \|x(b_j) - x(a_j)\| < \varepsilon$.

From this characterization it is easy to see that, if $x$ is absolutely continuous and $B : \mathcal{H} \to \mathcal{H}$ is $L$-Lipschitz continuous with $L \geq 0$, then $z = B \circ x$ is absolutely continuous, too. This means that $z$ is differentiable almost everywhere and $\|\dot{z}(\cdot)\| \leq L \|\dot{x}(\cdot)\|$ holds almost everywhere.

The following definition specifies which type of solutions we consider in the analysis of the dynamical system (4).

Definition 2. Let $(x^0, z^0, y^0) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$, $c > 0$, $\gamma \in [0, 1]$, and $M_1 : [0, +\infty) \to S_+(\mathcal{H})$ and $M_2 : [0, +\infty) \to S_+(\mathcal{G})$. We say that the function $(x, z, y) : [0, +\infty) \to \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ is a strong global solution of (4) if the following properties are satisfied:

(i) the functions $x, z, y$ are locally absolutely continuous;

(ii) for almost every $t \in [0, +\infty)$

$$\dot{x}(t) + x(t) \in \big( \partial f + cA^*A + M_1(t) \big)^{-1} \big( M_1(t) x(t) + cA^* z(t) - A^* y(t) - \nabla h(x(t)) \big),$$

$$\dot{z}(t) + z(t) \in \big( \partial g + cI + M_2(t) \big)^{-1} \big( M_2(t) z(t) + cA(\gamma \dot{x}(t) + x(t)) + y(t) \big),$$

$$\dot{y}(t) = cA(x(t) + \dot{x}(t)) - c(z(t) + \dot{z}(t));$$

(iii) $x(0) = x^0$, $z(0) = z^0$, and $y(0) = y^0$.
The following results will be useful in the proof of the existence and uniqueness theorem.

Lemma 1.

Assume that (Cweak) holds. Then, for every fixed $t \in [0, +\infty)$, the operator

$$S_t : \mathcal{H} \to \mathcal{H}, \quad S_t(u) = \operatorname*{argmin}_{x \in \mathcal{H}} \left( F(t, x) + \frac{c}{2} \|x - u\|^2 \right),$$

is Lipschitz continuous.

Proof. Let $t \in [0, +\infty)$ be fixed and $u, v \in \mathcal{H}$. By subdifferential calculus we obtain that

$$cu \in \partial f(S_t(u)) + \big( cA^*A + M_1(t) \big)(S_t(u))$$

and

$$cv \in \partial f(S_t(v)) + \big( cA^*A + M_1(t) \big)(S_t(v)).$$

Using that, due to (Cweak), $\partial f + cA^*A + M_1(t)$ is $\alpha(t)$-strongly monotone, we get

$$\alpha(t) \|S_t(u) - S_t(v)\|^2 \leq c \langle u - v, S_t(u) - S_t(v) \rangle.$$

By the Cauchy-Schwarz inequality we obtain

$$\|S_t(u) - S_t(v)\| \leq \frac{c}{\alpha(t)} \|u - v\|,$$

which shows that $S_t$ is Lipschitz continuous with constant $\frac{c}{\alpha(t)}$.

A stronger variant of condition (Cweak) reads

(Cstrong) there exists $\alpha > 0$ such that $cA^*A + M_1(t) \in P_\alpha(\mathcal{H})$ $\forall t \in [0, +\infty)$.

Obviously, if (Cstrong) holds, then (Cweak) holds with $\alpha(t) := \alpha > 0$ for every $t \in [0, +\infty)$. In this case, for every $t \in [0, +\infty)$ the operator $S_t$ in the lemma above is Lipschitz continuous with constant $\frac{c}{\alpha}$.

Now we are going to prove another technical result which will be used in the proof of the main theorem of this section.

Lemma 2. Assume that (Cweak) holds. Let $(x, z, y) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ and consider the maps $R_{(x,z,y)} : [0, +\infty) \to \mathcal{H}$,

$$R_{(x,z,y)}(t) = \operatorname*{argmin}_{u \in \mathcal{H}} \left( F(t, u) + \frac{c}{2} \left\| u - \left( \frac{1}{c} M_1(t) x + A^* z - \frac{1}{c} A^* y - \frac{1}{c} \nabla h(x) \right) \right\|^2 \right) - x,$$

and $Q_{(x,z,y)} : [0, +\infty) \to \mathcal{G}$,

$$Q_{(x,z,y)}(t) = \operatorname*{argmin}_{v \in \mathcal{G}} \left( G(t, v) + \frac{c}{2} \left\| v - \left( \frac{1}{c} M_2(t) z + A \big( \gamma R_{(x,z,y)}(t) + x \big) + \frac{1}{c} y \right) \right\|^2 \right) - z.$$

Then the following statements are true for every $t, r \in [0, +\infty)$:

(i) $\|R_{(x,z,y)}(t) - R_{(x,z,y)}(r)\| \leq \dfrac{\|R_{(x,z,y)}(r)\|}{\alpha(t)} \|M_1(t) - M_1(r)\|$;

(ii) $\|Q_{(x,z,y)}(t) - Q_{(x,z,y)}(r)\| \leq \dfrac{\|Q_{(x,z,y)}(r)\|}{c} \|M_2(t) - M_2(r)\| + \dfrac{\gamma \|A\| \|R_{(x,z,y)}(r)\|}{\alpha(t)} \|M_1(t) - M_1(r)\|$.

Proof. Let $t, r \in [0, +\infty)$ be fixed.

(i) From the definition of $R_{(x,z,y)}$ one has

$$M_1(t) x + cA^* z - A^* y - \nabla h(x) \in \partial f(R_{(x,z,y)}(t) + x) + \big( cA^*A + M_1(t) \big)(R_{(x,z,y)}(t) + x)$$

and

$$M_1(r) x + cA^* z - A^* y - \nabla h(x) \in \partial f(R_{(x,z,y)}(r) + x) + \big( cA^*A + M_1(r) \big)(R_{(x,z,y)}(r) + x),$$

the latter being equivalent to

$$M_1(t)(R_{(x,z,y)}(r) + x) - M_1(r)(R_{(x,z,y)}(r)) + cA^* z - A^* y - \nabla h(x) \in \partial f(R_{(x,z,y)}(r) + x) + \big( cA^*A + M_1(t) \big)(R_{(x,z,y)}(r) + x).$$

Using again that $\partial f + cA^*A + M_1(t)$ is $\alpha(t)$-strongly monotone for every $t \in [0, +\infty)$, we obtain

$$\langle M_1(t)(R_{(x,z,y)}(r)) - M_1(r)(R_{(x,z,y)}(r)), R_{(x,z,y)}(r) - R_{(x,z,y)}(t) \rangle \geq \alpha(t) \|R_{(x,z,y)}(r) - R_{(x,z,y)}(t)\|^2.$$

The conclusion follows via the Cauchy-Schwarz inequality.

(ii) From the definition of $Q_{(x,z,y)}$ one has

$$M_2(t) z + cA(\gamma R_{(x,z,y)}(t) + x) + y \in \partial g(Q_{(x,z,y)}(t) + z) + \big( M_2(t) + cI \big)(Q_{(x,z,y)}(t) + z)$$

and

$$M_2(r) z + cA(\gamma R_{(x,z,y)}(r) + x) + y \in \partial g(Q_{(x,z,y)}(r) + z) + \big( M_2(r) + cI \big)(Q_{(x,z,y)}(r) + z),$$

the latter being equivalent to

$$-M_2(r)(Q_{(x,z,y)}(r)) + M_2(t)(Q_{(x,z,y)}(r) + z) + cA(\gamma R_{(x,z,y)}(r) + x) + y \in \partial g(Q_{(x,z,y)}(r) + z) + \big( M_2(t) + cI \big)(Q_{(x,z,y)}(r) + z).$$

Using that $\partial g + M_2(t) + cI$ is $c$-strongly monotone, we obtain

$$\langle M_2(t)(Q_{(x,z,y)}(r)) - M_2(r)(Q_{(x,z,y)}(r)) + c\gamma A(R_{(x,z,y)}(r) - R_{(x,z,y)}(t)), Q_{(x,z,y)}(r) - Q_{(x,z,y)}(t) \rangle \geq c \|Q_{(x,z,y)}(r) - Q_{(x,z,y)}(t)\|^2.$$

From the Cauchy-Schwarz inequality and (i) it follows

$$\|Q_{(x,z,y)}(r) - Q_{(x,z,y)}(t)\| \leq \frac{\|Q_{(x,z,y)}(r)\|}{c} \|M_2(t) - M_2(r)\| + \frac{\gamma \|A\| \|R_{(x,z,y)}(r)\|}{\alpha(t)} \|M_1(t) - M_1(r)\|.$$

Theorem 3.

Assume that ( Cstrong ) holds, and M ∈ L loc ([0 , + ∞ ) , H ) and M ∈ L loc ([0 , + ∞ ) , G ) ,namely, t −→ (cid:107) M ( t ) (cid:107) and t −→ (cid:107) M ( t ) (cid:107) are integrable on [0 , T ] for every T > . Then, for every starting points ( x , z , y ) ∈ H × G × G ,the dynamical system (4) has a unique strong global solution ( x, z, y ) : [0 , + ∞ ) −→ H × G × G . Proof.

Denoting U ( t ) = ( x ( t ) , z ( t ) , y ( t )), the dynamical system (4) can be rewritten as (cid:26) ˙ U ( t ) = Γ( t, U ( t )) U (0) = ( x , z , y ) , (17)where Γ : [0 , + ∞ ) × H × G × G −→ H × G × G , Γ( t, x, z, y ) = ( u, v, w ) , is deﬁned as u = u ( t, x, z, y ) = argmin a ∈H (cid:32) F ( t, a ) + c (cid:13)(cid:13)(cid:13)(cid:13) a − (cid:18) c M ( t ) x + A ∗ z − c A ∗ y − c ∇ h ( x ) (cid:19)(cid:13)(cid:13)(cid:13)(cid:13) (cid:33) − xv = v ( t, x, z, y ) = argmin b ∈G (cid:32) G ( t, b ) + c (cid:13)(cid:13)(cid:13)(cid:13) b − (cid:18) c M ( t ) z + A ( γu + x ) + 1 c y (cid:19)(cid:13)(cid:13)(cid:13)(cid:13) (cid:33) − z = prox c G ( t, · ) (cid:18) c M ( t ) z + A ( γu + x ) + 1 c y (cid:19) − z,w = w ( t, x, z, y ) = cA ( u + x ) − c ( v + z ) . The existence and uniqueness of a strong global solution follows according to the Cauchy-Lipschitz-Picard Theorem, if we show: (1) that Γ( t, · , · , · ) is L ( t )-Lipschitz continuous for every t ∈ [0 , + ∞ )and the Lipschitz constant as a function of time has the property that L ( · ) ∈ L loc ([0 , + ∞ ) , R ); (2)that Γ( · , x, z, y ) ∈ L loc ([0 , + ∞ ) , H × G × G ) for every ( x, z, y ) ∈ H × G × G .(1) Let t ∈ [0 , + ∞ ) be ﬁxed and consider ( x, z, y ) , ( x, z, y ) ∈ H × G × G . We have (cid:107) Γ( t, x, z, y ) − Γ( t, x, z, y ) (cid:107) = (cid:112) (cid:107) u − u (cid:107) + (cid:107) v − v (cid:107) + (cid:107) w − w (cid:107) , where (see Lemma 1) u − u = argmin a ∈H (cid:32) F ( t, a ) + c (cid:13)(cid:13)(cid:13)(cid:13) a − (cid:18) c M ( t ) x + A ∗ z − c A ∗ y − c ∇ h ( x ) (cid:19)(cid:13)(cid:13)(cid:13)(cid:13) (cid:33) − argmin a ∈H (cid:32) F ( t, a ) + c (cid:13)(cid:13)(cid:13)(cid:13) a − (cid:18) c M ( t ) x + A ∗ z − c A ∗ y − c ∇ h ( x ) (cid:19)(cid:13)(cid:13)(cid:13)(cid:13) (cid:33) + x − x = S t (cid:18) c M ( t ) x + A ∗ z − c A ∗ y − c ∇ h ( x ) (cid:19) − S t (cid:18) c M ( t ) x + A ∗ z − c A ∗ y − c ∇ h ( x ) (cid:19) + x − x. 
Hence,

$$\|u - \bar u\|^2 \le 2 \left\| S_t\left( \frac{M_1(t)}{c}x + A^*z - \frac{A^*}{c}y - \frac{1}{c}\nabla h(x) \right) - S_t\left( \frac{M_1(t)}{c}\bar x + A^*\bar z - \frac{A^*}{c}\bar y - \frac{1}{c}\nabla h(\bar x) \right) \right\|^2 + 2\|x - \bar x\|^2.$$

Using Lemma 1 and taking into account that (Cstrong) is fulfilled, which means that the Lipschitz constant of the operator $S_t$ is $\frac{c}{\alpha}$, it follows

$$\|u - \bar u\|^2 \le \frac{2c^2}{\alpha^2} \left\| \frac{1}{c}M_1(t)(x - \bar x) + A^*(z - \bar z) - \frac{1}{c}A^*(y - \bar y) - \frac{1}{c}(\nabla h(x) - \nabla h(\bar x)) \right\|^2 + 2\|x - \bar x\|^2$$

$$\le \frac{2c^2}{\alpha^2} \left( \frac{4\|M_1(t)\|^2}{c^2}\|x - \bar x\|^2 + 4\|A\|^2\|z - \bar z\|^2 + \frac{4\|A\|^2}{c^2}\|y - \bar y\|^2 + \frac{4}{c^2}\|\nabla h(x) - \nabla h(\bar x)\|^2 \right) + 2\|x - \bar x\|^2$$

$$\le 2\left( \frac{4\|M_1(t)\|^2 + 4L_h^2}{\alpha^2} + 1 \right)\|x - \bar x\|^2 + \frac{8c^2}{\alpha^2}\|A\|^2\|z - \bar z\|^2 + \frac{8}{\alpha^2}\|A\|^2\|y - \bar y\|^2.$$

By taking into account the nonexpansiveness of the proximal operator and that $\gamma \in [0,1]$, we obtain

$$\|v - \bar v\|^2 \le 2\left\| \operatorname{prox}_{\frac{1}{c}G(t,\cdot)}\left( \frac{1}{c}M_2(t)z + A(\gamma u + x) + \frac{1}{c}y \right) - \operatorname{prox}_{\frac{1}{c}G(t,\cdot)}\left( \frac{1}{c}M_2(t)\bar z + A(\gamma \bar u + \bar x) + \frac{1}{c}\bar y \right) \right\|^2 + 2\|z - \bar z\|^2$$

$$\le 2\left\| \frac{1}{c}M_2(t)(z - \bar z) + A(\gamma(u - \bar u) + x - \bar x) + \frac{1}{c}(y - \bar y) \right\|^2 + 2\|z - \bar z\|^2$$

$$\le \frac{8\|M_2(t)\|^2}{c^2}\|z - \bar z\|^2 + 8\gamma^2\|A\|^2\|u - \bar u\|^2 + 8\|A\|^2\|x - \bar x\|^2 + \frac{8}{c^2}\|y - \bar y\|^2 + 2\|z - \bar z\|^2$$

$$\le 8\|A\|^2\left( \frac{8\|M_1(t)\|^2 + 8L_h^2}{\alpha^2} + 3 \right)\|x - \bar x\|^2 + \left( \frac{64c^2}{\alpha^2}\|A\|^4 + \frac{8\|M_2(t)\|^2}{c^2} + 2 \right)\|z - \bar z\|^2 + \left( \frac{64}{\alpha^2}\|A\|^4 + \frac{8}{c^2} \right)\|y - \bar y\|^2.$$
Finally,

$$\|w - \bar w\|^2 = \|cA(u - \bar u + x - \bar x) - c(v - \bar v + z - \bar z)\|^2 \le 4c^2\|A\|^2\|u - \bar u\|^2 + 4c^2\|A\|^2\|x - \bar x\|^2 + 4c^2\|v - \bar v\|^2 + 4c^2\|z - \bar z\|^2$$

$$\le 36c^2\|A\|^2\left( \frac{8\|M_1(t)\|^2 + 8L_h^2}{\alpha^2} + 3 \right)\|x - \bar x\|^2 + 4\left( \frac{72c^4}{\alpha^2}\|A\|^4 + 8\|M_2(t)\|^2 + 3c^2 \right)\|z - \bar z\|^2 + 32\left( \frac{9c^2}{\alpha^2}\|A\|^4 + 1 \right)\|y - \bar y\|^2.$$

Consequently,

$$\|\Gamma(t,x,z,y) - \Gamma(t,\bar x,\bar z,\bar y)\| \le \sqrt{L_1(t)\|x - \bar x\|^2 + L_2(t)\|z - \bar z\|^2 + L_3(t)\|y - \bar y\|^2} \le \sqrt{L_1(t) + L_2(t) + L_3(t)}\,\sqrt{\|x - \bar x\|^2 + \|z - \bar z\|^2 + \|y - \bar y\|^2} = L(t)\|(x,z,y) - (\bar x,\bar z,\bar y)\|,$$

where $L(t) = \sqrt{L_1(t) + L_2(t) + L_3(t)}$ and

$$L_1(t) = 2\left( \frac{4\|M_1(t)\|^2 + 4L_h^2}{\alpha^2} + 1 \right) + 8\|A\|^2\left( \frac{8\|M_1(t)\|^2 + 8L_h^2}{\alpha^2} + 3 \right) + 36c^2\|A\|^2\left( \frac{8\|M_1(t)\|^2 + 8L_h^2}{\alpha^2} + 3 \right),$$

$$L_2(t) = \frac{8c^2}{\alpha^2}\|A\|^2 + \left( \frac{64c^2}{\alpha^2}\|A\|^4 + \frac{8\|M_2(t)\|^2}{c^2} + 2 \right) + 4\left( \frac{72c^4}{\alpha^2}\|A\|^4 + 8\|M_2(t)\|^2 + 3c^2 \right),$$

$$L_3(t) = \frac{8}{\alpha^2}\|A\|^2 + \left( \frac{64}{\alpha^2}\|A\|^4 + \frac{8}{c^2} \right) + 32\left( \frac{9c^2}{\alpha^2}\|A\|^4 + 1 \right),$$

which means that $\Gamma(t,\cdot,\cdot,\cdot)$ is $L(t)$-Lipschitz continuous. Since $M_1 \in L^1_{loc}([0,+\infty), \mathcal{H})$ and $M_2 \in L^1_{loc}([0,+\infty), \mathcal{G})$, it follows that $L(\cdot) \in L^1_{loc}([0,+\infty), \mathbb{R})$.

(2) Now we will show that $\Gamma(\cdot,x,z,y) \in L^1_{loc}([0,+\infty), \mathcal{H} \times \mathcal{G} \times \mathcal{G})$ for every $(x,z,y) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$.

Let $(x,z,y) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ be fixed and $T > 0$. We have

$$\int_0^T \|\Gamma(t,x,z,y)\|\,dt = \int_0^T \sqrt{\|u(t,x,z,y)\|^2 + \|v(t,x,z,y)\|^2 + \|w(t,x,z,y)\|^2}\,dt.$$

By Lemma 2 and taking into account that $\alpha(t) = \alpha > 0$ for every $t \in [0,+\infty)$ and $\gamma \in [0,1]$, we obtain for every $t \in [0,+\infty)$ that

$$\|u(t,x,z,y)\|^2 \le 2\|u(t,x,z,y) - u(0,x,z,y)\|^2 + 2\|u(0,x,z,y)\|^2 \le \frac{2\|u(0,x,z,y)\|^2}{c^2\alpha^2}\|M_1(t) - M_1(0)\|^2 + 2\|u(0,x,z,y)\|^2,$$

$$\|v(t,x,z,y)\|^2 \le 2\|v(t,x,z,y) - v(0,x,z,y)\|^2 + 2\|v(0,x,z,y)\|^2 \le \frac{4\|v(0,x,z,y)\|^2}{c^2}\|M_2(t) - M_2(0)\|^2 + \frac{4\|A\|^2\|u(0,x,z,y)\|^2}{c^2\alpha^2}\|M_1(t) - M_1(0)\|^2 + 2\|v(0,x,z,y)\|^2$$

and

$$\|w(t,x,z,y)\|^2 = c^2\|A(u(t,x,z,y) + x) - (v(t,x,z,y) + z)\|^2 \le 3c^2\left( \|A\|^2\|u(t,x,z,y)\|^2 + \|v(t,x,z,y)\|^2 + \|Ax - z\|^2 \right)$$

$$\le \frac{18\|A\|^2\|u(0,x,z,y)\|^2}{\alpha^2}\|M_1(t) - M_1(0)\|^2 + 12\|v(0,x,z,y)\|^2\|M_2(t) - M_2(0)\|^2 + 3c^2\left( 2\|A\|^2\|u(0,x,z,y)\|^2 + 2\|v(0,x,z,y)\|^2 + \|Ax - z\|^2 \right).$$

Since $M_1 \in L^1_{loc}([0,+\infty), \mathcal{H})$ and $M_2 \in L^1_{loc}([0,+\infty), \mathcal{G})$, it follows that the integral $\int_0^T \|\Gamma(t,x,z,y)\|\,dt$ exists and is finite; in other words, $\Gamma(\cdot,x,z,y) \in L^1_{loc}([0,+\infty), \mathcal{H} \times \mathcal{G} \times \mathcal{G})$.

Consequently, the dynamical system (17) has a unique locally absolutely continuous solution, which means that the dynamical system (4) has a unique strong global solution.
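To make the vector field $\Gamma = (u, v, w)$ of (17) concrete, the following is a minimal numerical sketch on a toy instance. Everything specific to it is an assumption made for illustration and is not taken from the paper: $f = 0$, $h(x) = \frac{1}{2}\|x - p\|^2$, $g = \lambda\|\cdot\|_1$, the constant maps $M_1 = \frac{1}{\tau}I - cA^*A$ and $M_2 = 0$, and the names `vector_field` and `euler_trajectory`. With these choices the argmin defining $u$ has a closed form and $v$ is a soft-thresholding step, so an explicit Euler discretization of (17) with unit step reduces to a linearized proximal ADMM-type iteration.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal map of kappa * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def vector_field(x, z, y, A, p, lam, c, tau, gamma):
    """Evaluate (u, v, w) = Gamma(x, z, y) for the toy instance.

    Illustrative assumptions (not from the paper): f = 0,
    h(x) = 0.5 * ||x - p||^2, g = lam * ||.||_1,
    M_1 = (1/tau) * I - c * A^T A (constant in time), M_2 = 0.
    """
    grad_h = x - p
    # With f = 0 and this M_1, the optimality condition for u is linear in u.
    u = -tau * (grad_h + A.T @ y + c * A.T @ (A @ x - z))
    # v is a proximal step on g / c, shifted by -z.
    v = soft_threshold(A @ (gamma * u + x) + y / c, lam / c) - z
    # w = c A (u + x) - c (v + z), as in the definition of Gamma.
    w = c * (A @ (u + x)) - c * (v + z)
    return u, v, w


def euler_trajectory(A, p, lam, c=1.0, tau=0.25, gamma=1.0, step=1.0, n_steps=500):
    """Explicit Euler discretization of U' = Gamma(U), started at zero."""
    m, n = A.shape
    x, z, y = np.zeros(n), np.zeros(m), np.zeros(m)
    for _ in range(n_steps):
        u, v, w = vector_field(x, z, y, A, p, lam, c, tau, gamma)
        x, z, y = x + step * u, z + step * v, y + step * w
    return x, z, y


if __name__ == "__main__":
    A = np.eye(3)
    p = np.array([2.0, -0.2, -1.0])
    x, z, y = euler_trajectory(A, p, lam=0.5)
    # For A = I the minimizer of 0.5*||x - p||^2 + lam*||x||_1 is soft_threshold(p, lam).
    print(x, soft_threshold(p, 0.5))
```

For $A = I$ the minimizer of the toy problem is the soft-thresholding of $p$, which gives a simple reference value for the computed trajectory. The choice of $\tau$ must keep $M_1$ positive semidefinite, that is, $\frac{1}{\tau} \ge c\|A\|^2$.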

Some technical results

In this section we will prove some technical results which will be useful in the asymptotic analysis of the dynamical system (4). We endow the real linear space $L(\mathcal{H}) := \{A : \mathcal{H} \to \mathcal{H} : A \text{ is linear and continuous}\}$ with the norm $\|A\| = \sup_{\|x\| \le 1}\|Ax\|$. If $A \in L(\mathcal{H})$ is self-adjoint, then it holds (see [29, Lemma 3.2.4 iv)])

$$\|A\| = \sup_{\|x\| \le 1} |\langle Ax, x \rangle|.$$

Definition 3.

We say that the map $M : [0,+\infty) \to L(\mathcal{H})$, $t \mapsto M(t)$, is derivable at $t_0 \in [0,+\infty)$, if the limit

$$\lim_{h \to 0} \frac{M(t_0 + h) - M(t_0)}{h}$$

taken with respect to the norm topology of $L(\mathcal{H})$ exists. In this case we denote by $\dot M(t_0) \in L(\mathcal{H})$ the value of this limit.

If $\dot M(t_0)$ exists for $t_0 \in [0,+\infty)$, then one can easily see that

$$\dot M(t_0)x = \lim_{h \to 0} \frac{M(t_0 + h)x - M(t_0)x}{h} \quad \text{for every } x \in \mathcal{H}.$$

According to Remark 3, if $M$ is locally absolutely continuous, then $\dot M(t)$ exists for almost every $t \in [0,+\infty)$. Assume now that $M(t) \in L(\mathcal{H})$ is self-adjoint for every $t \in [0,+\infty)$ and that $M$ is derivable at $t_0 \in [0,+\infty)$. For all $x, u \in \mathcal{H}$ we have

$$\langle \dot M(t_0)x, u \rangle = \left\langle \lim_{h \to 0} \frac{M(t_0 + h)x - M(t_0)x}{h}, u \right\rangle = \lim_{h \to 0} \left\langle \frac{M(t_0 + h)x - M(t_0)x}{h}, u \right\rangle = \lim_{h \to 0} \left\langle x, \frac{M(t_0 + h)u - M(t_0)u}{h} \right\rangle = \langle x, \dot M(t_0)u \rangle,$$

which shows that $\dot M(t_0)$ is also self-adjoint.

Lemma 4.

Let $M : [0,+\infty) \to L(\mathcal{H})$, $t \mapsto M(t)$, be derivable at $t_0 \in [0,+\infty)$, and let the maps $x, y : [0,+\infty) \to \mathcal{H}$ be also derivable at $t_0$. Then the real function $t \mapsto \langle M(t)x(t), y(t) \rangle$ is derivable at $t_0$ and one has

$$\frac{d}{dt}\langle M(t)x(t), y(t) \rangle \Big|_{t = t_0} = \langle \dot M(t_0)x(t_0), y(t_0) \rangle + \langle M(t_0)\dot x(t_0), y(t_0) \rangle + \langle M(t_0)x(t_0), \dot y(t_0) \rangle.$$

Proof.

We have

$$\frac{d}{dt} M(t)x(t) \Big|_{t = t_0} = \lim_{h \to 0} \frac{M(t_0 + h)x(t_0 + h) - M(t_0)x(t_0)}{h} = \lim_{h \to 0} M(t_0 + h)\left( \frac{x(t_0 + h) - x(t_0)}{h} \right) + \lim_{h \to 0} \frac{M(t_0 + h)x(t_0) - M(t_0)x(t_0)}{h} = M(t_0)\dot x(t_0) + \dot M(t_0)x(t_0).$$

Consequently,

$$\frac{d}{dt}\langle M(t)x(t), y(t) \rangle \Big|_{t = t_0} = \left\langle \frac{d}{dt} M(t)x(t) \Big|_{t = t_0}, y(t_0) \right\rangle + \langle M(t_0)x(t_0), \dot y(t_0) \rangle = \langle M(t_0)\dot x(t_0), y(t_0) \rangle + \langle \dot M(t_0)x(t_0), y(t_0) \rangle + \langle M(t_0)x(t_0), \dot y(t_0) \rangle.$$

The main result of this section follows.
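Before stating it, the product rule of Lemma 4 can be sanity-checked numerically in finite dimensions. The sketch below compares the right-hand side of the formula with a central finite difference of $t \mapsto \langle M(t)x(t), y(t) \rangle$. The particular maps `M`, `x`, `y` and the evaluation point are arbitrary smooth choices made here for illustration; they do not come from the paper.

```python
import numpy as np

# Hypothetical smooth data on R^2, chosen only for illustration.
B = np.array([[2.0, 0.5], [0.5, 1.0]])  # symmetric positive definite


def M(t):
    return (1.0 + np.exp(-t)) * B


def M_dot(t):
    return -np.exp(-t) * B


def x(t):
    return np.array([np.cos(t), t])


def x_dot(t):
    return np.array([-np.sin(t), 1.0])


def y(t):
    return np.array([t ** 2, np.sin(t)])


def y_dot(t):
    return np.array([2.0 * t, np.cos(t)])


def pairing(t):
    """The real function t -> <M(t) x(t), y(t)>."""
    return (M(t) @ x(t)) @ y(t)


def product_rule(t):
    """Right-hand side of the formula in Lemma 4."""
    return ((M_dot(t) @ x(t)) @ y(t)
            + (M(t) @ x_dot(t)) @ y(t)
            + (M(t) @ x(t)) @ y_dot(t))


def central_difference(f, t, h=1e-6):
    """Second-order finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


if __name__ == "__main__":
    t0 = 0.7
    print(product_rule(t0), central_difference(pairing, t0))
```

The two printed numbers should agree up to the finite-difference error.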

Lemma 5.

Assume that (Cstrong) holds and that the maps $M_1 : [0,+\infty) \to S_+(\mathcal{H})$ and $M_2 : [0,+\infty) \to S_+(\mathcal{G})$ are locally absolutely continuous. For a given starting point $(x^0, z^0, y^0) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$, let $(x,z,y) : [0,+\infty) \to \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ be the unique strong global solution of the dynamical system (4). Then $t \mapsto (\dot x(t), \dot z(t), \dot y(t))$ is locally absolutely continuous, hence $(\ddot x(t), \ddot z(t), \ddot y(t))$ exists for almost every $t \in [0,+\infty)$.

In addition, if $\sup_{t \ge 0}\|M_1(t)\| < +\infty$ and $\sup_{t \ge 0}\|M_2(t)\| < +\infty$, then there exists $L > 0$ such that

$$\|\ddot x(t)\| + \|\ddot z(t)\| + \|\ddot y(t)\| \le L\left( \|\dot x(t)\| + \|\dot z(t)\| + \|\dot y(t)\| + \|\dot M_1(t)\|\|\dot x(t)\| + \|\dot M_2(t)\|\|\dot z(t)\| \right)$$

for almost every $t \in [0,+\infty)$.

Proof.

Let $T > 0$ and $t, r \in [0,T]$ be fixed. We have

$$\|\dot U(t) - \dot U(r)\| = \|\Gamma(t,U(t)) - \Gamma(r,U(r))\| \le \|\Gamma(t,U(t)) - \Gamma(t,U(r))\| + \|\Gamma(t,U(r)) - \Gamma(r,U(r))\|$$

$$\le \|u(t,x(t),z(t),y(t)) - u(t,x(r),z(r),y(r))\| + \|v(t,x(t),z(t),y(t)) - v(t,x(r),z(r),y(r))\| + \|w(t,x(t),z(t),y(t)) - w(t,x(r),z(r),y(r))\|$$

$$+ \|u(t,x(r),z(r),y(r)) - u(r,x(r),z(r),y(r))\| + \|v(t,x(r),z(r),y(r)) - v(r,x(r),z(r),y(r))\| + \|w(t,x(r),z(r),y(r)) - w(r,x(r),z(r),y(r))\|.$$

Since

$$u(t,x(t),z(t),y(t)) - u(t,x(r),z(r),y(r)) = S_t\left( \frac{1}{c}M_1(t)x(t) + A^*z(t) - \frac{1}{c}A^*y(t) - \frac{1}{c}\nabla h(x(t)) \right) - S_t\left( \frac{1}{c}M_1(t)x(r) + A^*z(r) - \frac{1}{c}A^*y(r) - \frac{1}{c}\nabla h(x(r)) \right) - x(t) + x(r),$$

according to Lemma 1, we get

$$\|u(t,x(t),z(t),y(t)) - u(t,x(r),z(r),y(r))\| \le \left( \frac{\|M_1(t)\|}{\alpha} + \frac{L_h}{\alpha} + 1 \right)\|x(t) - x(r)\| + \frac{c}{\alpha}\|A\|\|z(t) - z(r)\| + \frac{\|A\|}{\alpha}\|y(t) - y(r)\|.$$

Since $t \mapsto \|M_1(t)\|$ is bounded on $[0,T]$, there exists $L_1 := L_1(T) > 0$ such that

$$\|u(t,x(t),z(t),y(t)) - u(t,x(r),z(r),y(r))\| \le L_1\left( \|x(t) - x(r)\| + \|z(t) - z(r)\| + \|y(t) - y(r)\| \right). \tag{18}$$

Similarly, since

$$v(t,x(t),z(t),y(t)) - v(t,x(r),z(r),y(r)) = \operatorname{prox}_{\frac{1}{c}G(t,\cdot)}\left( \frac{1}{c}M_2(t)z(t) + A(\gamma u(t,x(t),z(t),y(t)) + x(t)) + \frac{1}{c}y(t) \right) - \operatorname{prox}_{\frac{1}{c}G(t,\cdot)}\left( \frac{1}{c}M_2(t)z(r) + A(\gamma u(t,x(r),z(r),y(r)) + x(r)) + \frac{1}{c}y(r) \right) - z(t) + z(r),$$

by the nonexpansiveness of the proximal operator we get

$$\|v(t,x(t),z(t),y(t)) - v(t,x(r),z(r),y(r))\| \le \left( \frac{\|M_2(t)\|}{c} + 1 \right)\|z(t) - z(r)\| + \|A\|\|x(t) - x(r)\| + \frac{1}{c}\|y(t) - y(r)\| + \gamma\|A\|\|u(t,x(t),z(t),y(t)) - u(t,x(r),z(r),y(r))\|.$$

Since $t \mapsto \|M_2(t)\|$ is bounded on $[0,T]$, by taking into consideration (18), one can easily see that there exists $L_2 := L_2(T) > 0$ such that

$$\|v(t,x(t),z(t),y(t)) - v(t,x(r),z(r),y(r))\| \le L_2\left( \|x(t) - x(r)\| + \|z(t) - z(r)\| + \|y(t) - y(r)\| \right). \tag{19}$$

Further, by using (18) and (19), we get

$$\|w(t,x(t),z(t),y(t)) - w(t,x(r),z(r),y(r))\| \le c\|A(u(t,x(t),z(t),y(t)) - u(t,x(r),z(r),y(r)) + x(t) - x(r))\| + c\|v(t,x(t),z(t),y(t)) - v(t,x(r),z(r),y(r)) + z(t) - z(r)\|$$

$$\le c(\|A\|L_1 + \|A\| + L_2)\|x(t) - x(r)\| + c(\|A\|L_1 + L_2 + 1)\|z(t) - z(r)\| + c(\|A\|L_1 + L_2)\|y(t) - y(r)\|.$$
Hence, there exists $L_3 := c(\|A\|L_1 + \|A\| + L_2 + 1) > 0$ such that

$$\|w(t,x(t),z(t),y(t)) - w(t,x(r),z(r),y(r))\| \le L_3\left( \|x(t) - x(r)\| + \|z(t) - z(r)\| + \|y(t) - y(r)\| \right). \tag{20}$$

Using now Lemma 2 (i), we get

$$\|u(t,x(r),z(r),y(r)) - u(r,x(r),z(r),y(r))\| = \|R_{(x(r),z(r),y(r))}(t) - R_{(x(r),z(r),y(r))}(r)\| \le \frac{\|R_{(x(r),z(r),y(r))}(r)\|}{c\alpha}\|M_1(t) - M_1(r)\|. \tag{21}$$

Since $r \mapsto S_r$ and $\nabla h$ are Lipschitz continuous and $x, z, y$ and $M_1$ are absolutely continuous, the map

$$r \mapsto R_{(x(r),z(r),y(r))}(r) = S_r\left( \frac{1}{c}M_1(r)x(r) + A^*z(r) - \frac{1}{c}A^*y(r) - \frac{1}{c}\nabla h(x(r)) \right) - x(r)$$

is bounded on $[0,T]$. Consequently, there exists $L_4 := L_4(T) > 0$ such that

$$\|u(t,x(r),z(r),y(r)) - u(r,x(r),z(r),y(r))\| \le L_4\|M_1(t) - M_1(r)\|. \tag{22}$$

Similarly, using this time Lemma 2 (ii), we get

$$\|v(t,x(r),z(r),y(r)) - v(r,x(r),z(r),y(r))\| = \|Q_{(x(r),z(r),y(r))}(t) - Q_{(x(r),z(r),y(r))}(r)\| \le \frac{\|A\|\|R_{(x(r),z(r),y(r))}(r)\|}{c\alpha}\|M_1(t) - M_1(r)\| + \frac{\|Q_{(x(r),z(r),y(r))}(r)\|}{c}\|M_2(t) - M_2(r)\|. \tag{23}$$

Since the proximal operator is nonexpansive and $x, z, y$ and $M_2$ are absolutely continuous, the map

$$r \mapsto Q_{(x(r),z(r),y(r))}(r) = \operatorname{prox}_{\frac{1}{c}G(r,\cdot)}\left( \frac{1}{c}M_2(r)z(r) + A(\gamma u(r,x(r),z(r),y(r)) + x(r)) + \frac{1}{c}y(r) \right) - z(r)$$

is bounded on $[0,T]$.
Consequently, there exists $L_5 := L_5(T) > 0$ such that

$$\|v(t,x(r),z(r),y(r)) - v(r,x(r),z(r),y(r))\| \le L_5\left( \|M_1(t) - M_1(r)\| + \|M_2(t) - M_2(r)\| \right). \tag{24}$$

Further, by using (22) and (24), we get

$$\|w(t,x(r),z(r),y(r)) - w(r,x(r),z(r),y(r))\| \le c\|A(u(t,x(r),z(r),y(r)) - u(r,x(r),z(r),y(r)))\| + c\|v(t,x(r),z(r),y(r)) - v(r,x(r),z(r),y(r))\| \le c(\|A\|L_4 + L_5)\|M_1(t) - M_1(r)\| + cL_5\|M_2(t) - M_2(r)\|.$$

Consequently, there exists $L_6 := c(\|A\|L_4 + L_5) > 0$ such that

$$\|w(t,x(r),z(r),y(r)) - w(r,x(r),z(r),y(r))\| \le L_6\left( \|M_1(t) - M_1(r)\| + \|M_2(t) - M_2(r)\| \right). \tag{25}$$

Summing the relations (18)-(25) we obtain that there exists $L_7 > 0$ such that

$$\|\dot U(t) - \dot U(r)\| \le L_7\left( \|x(t) - x(r)\| + \|z(t) - z(r)\| + \|y(t) - y(r)\| + \|M_1(t) - M_1(r)\| + \|M_2(t) - M_2(r)\| \right).$$

Let $\varepsilon > 0$. Since the maps $x, z, y, M_1$ and $M_2$ are absolutely continuous on $[0,T]$, there exists $\eta > 0$ such that for any finite family of intervals $I_k = (a_k, b_k) \subseteq [0,T]$ and any subfamily of pairwise disjoint intervals $I_j$ with $\sum_j |b_j - a_j| < \eta$ it holds

$$\sum_j \|x(b_j) - x(a_j)\| < \frac{\varepsilon}{5L_7}, \quad \sum_j \|z(b_j) - z(a_j)\| < \frac{\varepsilon}{5L_7}, \quad \sum_j \|y(b_j) - y(a_j)\| < \frac{\varepsilon}{5L_7},$$

$$\sum_j \|M_1(b_j) - M_1(a_j)\| < \frac{\varepsilon}{5L_7} \quad \text{and} \quad \sum_j \|M_2(b_j) - M_2(a_j)\| < \frac{\varepsilon}{5L_7}.$$

Consequently,

$$\sum_j \|\dot U(b_j) - \dot U(a_j)\| < \varepsilon,$$

hence $\dot U(\cdot) = (\dot x(\cdot), \dot z(\cdot), \dot y(\cdot))$ is absolutely continuous on $[0,T]$. This proves that the second order derivatives $\ddot x, \ddot z, \ddot y$ exist almost everywhere on $[0,+\infty)$.

We come now to the proof of the second statement and assume to this end that $\sup_{t \ge 0}\|M_1(t)\| < +\infty$ and $\sup_{t \ge 0}\|M_2(t)\| < +\infty$. Under these assumptions, $L_1$, $L_2$ and $L_3$ appearing in (18), (19) and (20), respectively, can be taken as global constants, that is, (18)-(20) hold for every $t, r \in [0,+\infty)$.

Since $R_{(x(r),z(r),y(r))}(r) = \dot x(r)$ and $Q_{(x(r),z(r),y(r))}(r) = \dot z(r)$ for every $r \in [0,+\infty)$, from (21) and (23) we get

$$\|u(t,x(r),z(r),y(r)) - u(r,x(r),z(r),y(r))\| \le \frac{\|\dot x(r)\|}{c\alpha}\|M_1(t) - M_1(r)\|$$

and, respectively,

$$\|v(t,x(r),z(r),y(r)) - v(r,x(r),z(r),y(r))\| \le \frac{\|A\|\|\dot x(r)\|}{c\alpha}\|M_1(t) - M_1(r)\| + \frac{\|\dot z(r)\|}{c}\|M_2(t) - M_2(r)\|$$

for every $t, r \in [0,+\infty)$.
Consequently,

$$\|w(t,x(r),z(r),y(r)) - w(r,x(r),z(r),y(r))\| \le c\|A\|\|u(t,x(r),z(r),y(r)) - u(r,x(r),z(r),y(r))\| + c\|v(t,x(r),z(r),y(r)) - v(r,x(r),z(r),y(r))\| \le \frac{2\|A\|}{\alpha}\|\dot x(r)\|\|M_1(t) - M_1(r)\| + \|\dot z(r)\|\|M_2(t) - M_2(r)\|$$

for every $t, r \in [0,+\infty)$. This shows that there exists $L > 0$ such that

$$\|\dot U(t) - \dot U(r)\| \le \frac{L}{\sqrt{3}}\left( \|x(t) - x(r)\| + \|z(t) - z(r)\| + \|y(t) - y(r)\| + \|\dot x(r)\|\|M_1(t) - M_1(r)\| + \|\dot z(r)\|\|M_2(t) - M_2(r)\| \right)$$

for every $t, r \in [0,+\infty)$.

Now we fix $r \in [0,+\infty)$ at which the second derivatives of the trajectories exist and take in the above inequality $t = r + h$ for some $h > 0$. This yields

$$\|\dot x(r+h) - \dot x(r)\| + \|\dot z(r+h) - \dot z(r)\| + \|\dot y(r+h) - \dot y(r)\| \le \sqrt{3}\,\|\dot U(r+h) - \dot U(r)\|$$

$$\le L\left( \|x(r+h) - x(r)\| + \|z(r+h) - z(r)\| + \|y(r+h) - y(r)\| \right) + L\left( \|\dot x(r)\|\|M_1(r+h) - M_1(r)\| + \|\dot z(r)\|\|M_2(r+h) - M_2(r)\| \right).$$

After dividing the above inequality by $h$ and letting $h \to 0$, we obtain

$$\|\ddot x(r)\| + \|\ddot z(r)\| + \|\ddot y(r)\| \le L\left( \|\dot x(r)\| + \|\dot z(r)\| + \|\dot y(r)\| + \|\dot x(r)\|\|\dot M_1(r)\| + \|\dot z(r)\|\|\dot M_2(r)\| \right).$$

This inequality holds for almost every $r \in [0,+\infty)$.

In this section we will address the asymptotic behaviour of the trajectories generated by the dynamical system (4). At the beginning we will recall two results which will play a central role in the asymptotic analysis (see [2, Lemma 5.1] and [2, Lemma 5.2], respectively).

Lemma 6.

Suppose that $A : [0,+\infty) \to \mathbb{R}$ is locally absolutely continuous and bounded from below and that there exists $B \in L^1([0,+\infty), \mathbb{R})$ such that for almost every $t \in [0,+\infty)$

$$\frac{d}{dt}A(t) \le B(t).$$

Then there exists $\lim_{t \to +\infty} A(t) \in \mathbb{R}$.

Lemma 7.

If $1 \le p < \infty$, $1 \le r \le \infty$, $A : [0,+\infty) \to [0,+\infty)$ is locally absolutely continuous, $A \in L^p([0,+\infty), \mathbb{R})$, $B : [0,+\infty) \to \mathbb{R}$, $B \in L^r([0,+\infty), \mathbb{R})$ and for almost every $t \in [0,+\infty)$

$$\frac{d}{dt}A(t) \le B(t),$$

then $\lim_{t \to +\infty} A(t) = 0$.

The first result which we prove in this section is a continuous version of the Opial Lemma formulated in the setting of variable metrics (see [18, Theorem 3.3] for its discrete counterpart).

Lemma 8.

Let $C \subseteq \mathcal{H}$ be a nonempty set and $x : [0,+\infty) \to \mathcal{H}$ a continuous map. Let $M : [0,+\infty) \to S_+(\mathcal{H})$ be such that $M(t_1) \succcurlyeq M(t_2)$ for every $t_1, t_2 \in [0,+\infty)$ with $t_1 \le t_2$ and there exists $\alpha > 0$ with $M(t) \in P_\alpha(\mathcal{H})$ for every $t \in [0,+\infty)$. If the following two conditions are fulfilled

(i) the limit $\lim_{t \to +\infty}\|x(t) - z\|_{M(t)}$ exists for every $z \in C$;

(ii) every weak sequential cluster point of $x(t)$, $t \in [0,+\infty)$, belongs to $C$;

then there exists $x_\infty \in C$ such that $x(t)$, $t \in [0,+\infty)$, converges weakly to $x_\infty$ as $t \to +\infty$.

Proof. Since $C \ne \emptyset$ and $M(t) \in P_\alpha(\mathcal{H})$, by (i) we have that $x$ is bounded, hence it possesses at least one weak sequential cluster point, which belongs to $C$. We show that $x$ has exactly one weak sequential cluster point.

Indeed, let $x_1, x_2$ be two weak sequential cluster points of $x$. For our claim it is enough to show that $x_1 = x_2$. Obviously $x_1, x_2 \in C$ and there exist sequences $(t_n)_{n \ge 0}, (t'_n)_{n \ge 0} \subseteq [0,+\infty)$ with $\lim_{n \to +\infty} t_n = +\infty$ and $\lim_{n \to +\infty} t'_n = +\infty$ such that $(x(t_n))_{n \ge 0}$ converges weakly to $x_1$ and $(x(t'_n))_{n \ge 0}$ converges weakly to $x_2$ as $n \to +\infty$.

Further, since $M(t_1) \succcurlyeq M(t_2)$ for every $t_1, t_2 \in [0,+\infty)$ with $t_1 \le t_2$ and $M(t) \in P_\alpha(\mathcal{H})$ for every $t \in [0,+\infty)$, it follows that for every $z \in \mathcal{H}$ the function $[0,+\infty) \to [0,+\infty)$, $t \mapsto \|z\|^2_{M(t)}$, is decreasing and bounded from below, hence there exists

$$\lim_{t \to +\infty}\|z\|^2_{M(t)} \in \mathbb{R}. \tag{26}$$

Since $x_1, x_2 \in C$, the limits $\lim_{t \to +\infty}\|x(t) - x_1\|^2_{M(t)}$ and $\lim_{t \to +\infty}\|x(t) - x_2\|^2_{M(t)}$ exist. Further, since

$$-\langle x(t), M(t)(x_1 - x_2) \rangle = \frac{1}{2}\left( \|x(t) - x_1\|^2_{M(t)} - \|x(t) - x_2\|^2_{M(t)} - \|x_1\|^2_{M(t)} + \|x_2\|^2_{M(t)} \right)$$

holds for every $t \in [0,+\infty)$, the limit

$$\lambda := \lim_{t \to +\infty}\langle x(t), M(t)(x_1 - x_2) \rangle \in \mathbb{R} \tag{27}$$

exists.

Next we show that the limit

$$\lim_{t \to +\infty} M(t)z \tag{28}$$

exists for every $z \in \mathcal{H}$. To this end we fix $z \in \mathcal{H}$. We will actually show that

$$\lim_{s,t \to +\infty}\|M(t)z - M(s)z\| = 0$$

and the conclusion will follow by the Cauchy criterion.

For $U \in S_+(\mathcal{H})$ we have by the generalized Cauchy-Schwarz inequality that $|\langle Ux, z \rangle|^2 \le \|x\|^2_U\|z\|^2_U$ for every $x, z \in \mathcal{H}$.
Hence, for $t, s \in [0,+\infty)$ with $t \le s$ we have $M(t) - M(s) \in S_+(\mathcal{H})$, therefore

$$\|(M(t) - M(s))z\|^4 = \langle (M(t) - M(s))z, (M(t) - M(s))z \rangle^2 \le \|z\|^2_{M(t) - M(s)}\,\|(M(t) - M(s))z\|^2_{M(t) - M(s)}$$

$$= \|z\|^2_{M(t) - M(s)}\,\langle (M(t) - M(s))^2 z, (M(t) - M(s))z \rangle \le \|z\|^2_{M(t) - M(s)}\,\|M(t) - M(s)\|^3\,\|z\|^2.$$

Since $M(0) \succcurlyeq M(t)$, we have that

$$\|M(t)\| = \sup_{\|x\| = 1}\langle M(t)x, x \rangle \le \sup_{\|x\| = 1}\langle M(0)x, x \rangle \le \|M(0)\|$$

for every $t \in [0,+\infty)$. This shows that $\|M(t) - M(s)\|$, $t, s \in [0,+\infty)$, is bounded. This, together with the fact that $\lim_{s,t \to +\infty}\|z\|_{M(t) - M(s)} = 0$, which follows from (26), implies $\|(M(t) - M(s))z\| \to 0$ as $s, t \to +\infty$. This proves (28).

For every $z \in \mathcal{H}$ let us denote $Mz := \lim_{t \to +\infty} M(t)z$. Since $M(t) \in P_\alpha(\mathcal{H})$ for every $t \in [0,+\infty)$, it holds $M \in P_\alpha(\mathcal{H})$. Since $(x(t_n))_{n \ge 0}$ converges weakly to $x_1$ and $(x(t'_n))_{n \ge 0}$ converges weakly to $x_2$ as $n \to +\infty$, and $M(t_n)(x_1 - x_2) \to M(x_1 - x_2)$ and $M(t'_n)(x_1 - x_2) \to M(x_1 - x_2)$ as $n \to +\infty$, passing to the limit in (27) we get

$$\lim_{n \to +\infty}\langle x(t_n), M(t_n)(x_1 - x_2) \rangle = \langle x_1, M(x_1 - x_2) \rangle = \lambda$$

and

$$\lim_{n \to +\infty}\langle x(t'_n), M(t'_n)(x_1 - x_2) \rangle = \langle x_2, M(x_1 - x_2) \rangle = \lambda.$$

In conclusion,

$$0 = \langle x_1, M(x_1 - x_2) \rangle - \langle x_2, M(x_1 - x_2) \rangle = \|x_1 - x_2\|^2_M \ge \alpha\|x_1 - x_2\|^2,$$

which shows that $x_1 = x_2$.

Remark 4.

If a map $M : [0,+\infty) \to S_+(\mathcal{H})$ satisfies $M(t_1) \succcurlyeq M(t_2)$ for every $t_1, t_2 \in [0,+\infty)$ with $t_1 \le t_2$, we say that $M$ is monotonically decreasing. If $M$ is monotonically decreasing and locally absolutely continuous, then $\dot M(t)$ exists and $\langle \dot M(t)x, x \rangle \le 0$ for every $x \in \mathcal{H}$ and almost every $t \in [0,+\infty)$.

The following result is an adaptation of a result from [3] to our setting.

Proposition 9. (see [3, Proposition 2.4]) In the setting of the optimization problem (1), let $(a_n, a_n^*)_{n \ge 0}$ be a sequence in the graph of $\partial(f + h)$ and $(b_n, b_n^*)_{n \ge 0}$ a sequence in the graph of $\partial g$. Suppose that $a_n$ converges weakly to $x \in \mathcal{H}$, $b_n^*$ converges weakly to $v \in \mathcal{G}$, $a_n^* + A^*b_n^* \to 0$, and $Aa_n - b_n \to 0$ as $n \to +\infty$. Then $\langle a_n, a_n^* \rangle + \langle b_n, b_n^* \rangle \to 0$ as $n \to +\infty$ and

$$v \in \partial g(Ax), \quad -A^*v \in \partial f(x) + \nabla h(x).$$

The theorem which states the asymptotic convergence of the trajectories generated by the dynamical system (4) to a saddle point of the Lagrangian of the problem (1) follows.

Theorem 10.

The proof of the theorem relies on Lemma 8. An important step in the proof will be the derivation of two inequalities of Lyapunov type, namely, (37), in the case when $L_h \ne 0$, and (43), in the case when $L_h = 0$.

Let $(x^*, z^*, y^*) \in \mathcal{H} \times \mathcal{G} \times \mathcal{G}$ be a saddle point of the Lagrangian $l$. Then

$$\begin{cases} 0 \in \partial f(x^*) + \nabla h(x^*) + A^*y^* \\ Ax^* = z^*, \ Ax^* \in \partial g^*(y^*). \end{cases}$$

According to (5) we have for almost every $t \in [0,+\infty)$

$$-cA^*A(\dot x(t) + x(t)) - M_1(t)\dot x(t) + cA^*z(t) - A^*y(t) - \nabla h(x(t)) \in \partial f(\dot x(t) + x(t)), \tag{29}$$

which yields, by taking into account the monotonicity of $\partial f$,

$$\langle -cA^*A(\dot x(t) + x(t)) - M_1(t)\dot x(t) + cA^*z(t) - A^*(y(t) - y^*) - (\nabla h(x(t)) - \nabla h(x^*)), \dot x(t) + x(t) - x^* \rangle \ge 0. \tag{30}$$

Similarly, according to (6) we have for almost every $t \in [0,+\infty)$

$$-c(\dot z(t) + z(t)) + cA(\gamma\dot x(t) + x(t)) - M_2(t)\dot z(t) + y(t) \in \partial g(\dot z(t) + z(t)), \tag{31}$$

which yields, by taking into account the monotonicity of $\partial g$,

$$\langle -c(\dot z(t) + z(t)) + cA(\gamma\dot x(t) + x(t)) - M_2(t)\dot z(t) + (y(t) - y^*), \dot z(t) + z(t) - Ax^* \rangle \ge 0. \tag{32}$$

By using the last equation of (4) we obtain for almost every $t \in [0,+\infty)$

$$\langle -A^*(y(t) - y^*), \dot x(t) + x(t) - x^* \rangle + \langle y(t) - y^*, \dot z(t) + z(t) - Ax^* \rangle = -\langle y(t) - y^*, A(\dot x(t) + x(t)) - Ax^* - (\dot z(t) + z(t)) + Ax^* \rangle$$

$$= -\frac{1}{c}\langle y(t) - y^*, \dot y(t) \rangle = -\frac{1}{2c}\frac{d}{dt}\|y(t) - y^*\|^2. \tag{33}$$

Assume that $L_h > 0$. By using the Baillon-Haddad Theorem we have for almost every $t \in [0,+\infty)$

$$\langle -(\nabla h(x(t)) - \nabla h(x^*)), \dot x(t) + x(t) - x^* \rangle = -\langle \nabla h(x(t)) - \nabla h(x^*), x(t) - x^* \rangle - \langle \nabla h(x(t)) - \nabla h(x^*), \dot x(t) \rangle$$

$$\le -\frac{1}{L_h}\|\nabla h(x(t)) - \nabla h(x^*)\|^2 - \langle \nabla h(x(t)) - \nabla h(x^*), \dot x(t) \rangle = -\frac{1}{L_h}\left( \left\| \nabla h(x(t)) - \nabla h(x^*) + \frac{L_h}{2}\dot x(t) \right\|^2 - \frac{L_h^2}{4}\|\dot x(t)\|^2 \right). \tag{34}$$
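The two ingredients of (34) can be checked numerically on a simple example. The sketch below assumes, purely for illustration, a quadratic $h(x) = \frac{1}{2}\langle Qx, x \rangle$ with $Q$ symmetric positive semidefinite, so that $\nabla h(x) = Qx$ and $L_h$ is the largest eigenvalue of $Q$. It tests the Baillon-Haddad (cocoercivity) inequality on random pairs of points and the completion of the square with the constants $\frac{L_h}{2}$ and $\frac{L_h^2}{4}$. The function names are hypothetical.

```python
import numpy as np


def cocoercivity_gaps(Q, n_samples=200, seed=0):
    """Return <grad h(x) - grad h(x*), x - x*> - (1/L_h) ||grad h(x) - grad h(x*)||^2
    on random pairs, for h(x) = 0.5 * <Qx, x> (illustrative choice of h)."""
    rng = np.random.default_rng(seed)
    L_h = np.linalg.eigvalsh(Q)[-1]  # eigenvalues are returned in ascending order
    n = Q.shape[0]
    gaps = []
    for _ in range(n_samples):
        x, x_star = rng.standard_normal(n), rng.standard_normal(n)
        g = Q @ x - Q @ x_star
        gaps.append(g @ (x - x_star) - (g @ g) / L_h)
    return np.array(gaps)


def square_completion_residual(g, x_dot, L_h):
    """|lhs - rhs| for the last equality in (34), with g = grad h(x) - grad h(x*)."""
    lhs = -(g @ g) / L_h - g @ x_dot
    rhs = -(np.linalg.norm(g + 0.5 * L_h * x_dot) ** 2
            - 0.25 * L_h ** 2 * (x_dot @ x_dot)) / L_h
    return abs(lhs - rhs)


if __name__ == "__main__":
    B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, -1.0]])
    Q = B.T @ B
    print(cocoercivity_gaps(Q).min())
    rng = np.random.default_rng(1)
    print(square_completion_residual(rng.standard_normal(3), rng.standard_normal(3), 2.5))
```

A nonnegative minimal gap (up to rounding) is consistent with cocoercivity, and a residual at rounding level confirms the algebraic identity.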

By summing (30) and (32) and by taking into account (33) and (34) we obtain for almost every $t \in [0,+\infty)$

$$0 \le \langle -cA^*A(\dot x(t) + x(t)) - M_1(t)\dot x(t) + cA^*z(t), \dot x(t) + x(t) - x^* \rangle + \langle -c(\dot z(t) + z(t)) + cAx(t) + c\gamma A\dot x(t) - M_2(t)\dot z(t), \dot z(t) + z(t) - Ax^* \rangle$$

$$- \frac{1}{2c}\frac{d}{dt}\|y(t) - y^*\|^2 - \frac{1}{L_h}\left( \left\| \nabla h(x(t)) - \nabla h(x^*) + \frac{L_h}{2}\dot x(t) \right\|^2 - \frac{L_h^2}{4}\|\dot x(t)\|^2 \right). \tag{35}$$

We have for almost every $t \in [0,+\infty)$

$$\langle -cA^*A(\dot x(t) + x(t)) + cA^*z(t), \dot x(t) + x(t) - x^* \rangle + \langle -c(\dot z(t) + z(t)) + cAx(t) + c\gamma A\dot x(t), \dot z(t) + z(t) - Ax^* \rangle$$

$$= -\frac{1}{c}\|\dot y(t)\|^2 + \langle -c\dot z(t), A(\dot x(t) + x(t) - x^*) \rangle + \langle (\gamma - 1)cA\dot x(t), \dot z(t) + z(t) - Ax^* \rangle$$

$$= -\frac{1}{c}\|\dot y(t)\|^2 + \left\langle -c\dot z(t), \frac{1}{c}\dot y(t) + \dot z(t) + z(t) - Ax^* \right\rangle + \left\langle (\gamma - 1)cA\dot x(t), A(\dot x(t) + x(t)) - \frac{1}{c}\dot y(t) - Ax^* \right\rangle$$

$$= -\frac{1}{c}\|\dot y(t)\|^2 - c\|\dot z(t)\|^2 + (\gamma - 1)c\|A\dot x(t)\|^2 - \langle \dot z(t), \dot y(t) \rangle + (1 - \gamma)\langle A\dot x(t), \dot y(t) \rangle + \frac{c(\gamma - 1)}{2}\frac{d}{dt}\left( \|Ax(t) - Ax^*\|^2 \right) - \frac{c}{2}\frac{d}{dt}\left( \|z(t) - Ax^*\|^2 \right).$$
Since

$$\langle \dot z(t), \dot y(t) \rangle = \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 - \frac{3c}{4}\|\dot z(t)\|^2 - \frac{1}{3c}\|\dot y(t)\|^2$$

and

$$\langle A\dot x(t), \dot y(t) \rangle = -\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 + \frac{3c}{4}\|A\dot x(t)\|^2 + \frac{1}{3c}\|\dot y(t)\|^2,$$

we obtain from above that for almost every $t \in [0,+\infty)$ it holds

$$\langle -cA^*A(\dot x(t) + x(t)) + cA^*z(t), \dot x(t) + x(t) - x^* \rangle + \langle -c(\dot z(t) + z(t)) + cAx(t) + c\gamma A\dot x(t), \dot z(t) + z(t) - Ax^* \rangle$$

$$= -\frac{\gamma + 1}{3c}\|\dot y(t)\|^2 - \frac{c}{4}\|\dot z(t)\|^2 - \frac{(1 - \gamma)c}{4}\|A\dot x(t)\|^2 - \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 - (1 - \gamma)\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2$$

$$+ \frac{c(\gamma - 1)}{2}\frac{d}{dt}\left( \|Ax(t) - Ax^*\|^2 \right) - \frac{c}{2}\frac{d}{dt}\left( \|z(t) - Ax^*\|^2 \right). \tag{36}$$

By using Lemma 4 we observe that for almost every $t \in [0,+\infty)$ it holds

$$\langle -M_1(t)\dot x(t), \dot x(t) + x(t) - x^* \rangle = -\|\dot x(t)\|^2_{M_1(t)} - \langle M_1(t)\dot x(t), x(t) - x^* \rangle = -\|\dot x(t)\|^2_{M_1(t)} + \frac{1}{2}\langle \dot M_1(t)(x(t) - x^*), x(t) - x^* \rangle - \frac{1}{2}\frac{d}{dt}\|x(t) - x^*\|^2_{M_1(t)}$$

and

$$\langle -M_2(t)\dot z(t), \dot z(t) + z(t) - Ax^* \rangle = -\|\dot z(t)\|^2_{M_2(t)} - \langle M_2(t)\dot z(t), z(t) - Ax^* \rangle = -\|\dot z(t)\|^2_{M_2(t)} + \frac{1}{2}\langle \dot M_2(t)(z(t) - Ax^*), z(t) - Ax^* \rangle - \frac{1}{2}\frac{d}{dt}\|z(t) - Ax^*\|^2_{M_2(t)}.$$
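The two completing-the-square identities used to pass to (36) are purely algebraic, so they can be verified numerically with random vectors standing in for $\dot z(t)$, $A\dot x(t)$ and $\dot y(t)$. The sketch below uses the weights $\frac{\sqrt{3c}}{2}$ and $\frac{1}{\sqrt{3c}}$, which are the values consistent with the coefficients $\frac{\gamma+1}{3c}$, $\frac{c}{4}$ and $\frac{(1-\gamma)c}{4}$ in (36) and (37); the function name and the test data are illustrative assumptions.

```python
import numpy as np


def square_identity_residuals(z_dot, Ax_dot, y_dot, c):
    """Residuals of the two identities
        <z', y'>  =  ||a z' + b y'||^2 - a^2 ||z'||^2 - b^2 ||y'||^2,
        <Ax', y'> = -||a Ax' - b y'||^2 + a^2 ||Ax'||^2 + b^2 ||y'||^2,
    with a = sqrt(3c)/2 and b = 1/sqrt(3c), so that 2ab = 1."""
    a = np.sqrt(3.0 * c) / 2.0
    b = 1.0 / np.sqrt(3.0 * c)
    first = (np.linalg.norm(a * z_dot + b * y_dot) ** 2
             - a ** 2 * (z_dot @ z_dot) - b ** 2 * (y_dot @ y_dot)) - z_dot @ y_dot
    second = (-np.linalg.norm(a * Ax_dot - b * y_dot) ** 2
              + a ** 2 * (Ax_dot @ Ax_dot) + b ** 2 * (y_dot @ y_dot)) - Ax_dot @ y_dot
    return abs(first), abs(second)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_dot, Ax_dot, y_dot = (rng.standard_normal(4) for _ in range(3))
    print(square_identity_residuals(z_dot, Ax_dot, y_dot, c=1.7))
```

Both residuals should be at rounding level for any $c > 0$, since $a^2 = \frac{3c}{4}$, $b^2 = \frac{1}{3c}$ and $2ab = 1$.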
By plugging the last two identities and (36) into (35), we obtain for almost every $t \in [0,+\infty)$
\[
\begin{aligned}
0 \le{} & -\frac{\gamma+1}{3c}\|\dot y(t)\|^2 - \frac{c}{4}\|\dot z(t)\|^2 - \frac{(1-\gamma)c}{4}\|A\dot x(t)\|^2 - \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 \\
& - (1-\gamma)\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 + \frac{c(\gamma-1)}{2}\frac{d}{dt}\left( \|Ax(t)-Ax^*\|^2 \right) - \frac{c}{2}\frac{d}{dt}\left( \|z(t)-Ax^*\|^2 \right) \\
& - \|\dot x(t)\|^2_{M_1(t)} + \frac{1}{2}\langle \dot M_1(t)(x(t)-x^*),\ x(t)-x^* \rangle - \frac{1}{2}\frac{d}{dt}\|x(t)-x^*\|^2_{M_1(t)} \\
& - \|\dot z(t)\|^2_{M_2(t)} + \frac{1}{2}\langle \dot M_2(t)(z(t)-Ax^*),\ z(t)-Ax^* \rangle - \frac{1}{2}\frac{d}{dt}\|z(t)-Ax^*\|^2_{M_2(t)} \\
& - \frac{1}{2c}\frac{d}{dt}\|y(t)-y^*\|^2 - \frac{1}{L_h}\left( \left\| \nabla h(x(t)) - \nabla h(x^*) + \frac{L_h}{2}\dot x(t) \right\|^2 - \frac{L_h^2}{4}\|\dot x(t)\|^2 \right).
\end{aligned}
\]
According to Remark 4, $\langle \dot M_1(t)(x(t)-x^*),\ x(t)-x^* \rangle \le 0$ and $\langle \dot M_2(t)(z(t)-Ax^*),\ z(t)-Ax^* \rangle \le 0$ for almost every $t \in [0,+\infty)$.
This means that for almost every $t \in [0,+\infty)$ we have
\[
\begin{aligned}
& \frac{1}{2}\frac{d}{dt}\left( \|x(t)-x^*\|^2_{M_1(t)+c(1-\gamma)A^*A} + \|z(t)-Ax^*\|^2_{M_2(t)+cI} + \frac{1}{c}\|y(t)-y^*\|^2 \right) \\
& + \|\dot x(t)\|^2_{M_1(t)+\frac{(1-\gamma)c}{4}A^*A-\frac{L_h}{4}I} + \|\dot z(t)\|^2_{M_2(t)+\frac{c}{4}I} + \frac{\gamma+1}{3c}\|\dot y(t)\|^2 \\
& + \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 + (1-\gamma)\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 \\
& + \frac{1}{L_h}\left\| \nabla h(x(t)) - \nabla h(x^*) + \frac{L_h}{2}\dot x(t) \right\|^2 \le 0.
\end{aligned}
\tag{37}
\]
From Lemma 6 we have
\[
\lim_{t \to +\infty}\left( \|x(t)-x^*\|^2_{M_1(t)+c(1-\gamma)A^*A} + \|z(t)-Ax^*\|^2_{M_2(t)+cI} + \frac{1}{c}\|y(t)-y^*\|^2 \right) \in \mathbb{R}.
\tag{38}
\]
Let $T > 0$. By integrating (37) on the interval $[0,T]$ we obtain
\[
\begin{aligned}
& \frac{1}{2}\left( \|x(T)-x^*\|^2_{M_1(T)+c(1-\gamma)A^*A} + \|z(T)-z^*\|^2_{M_2(T)+cI} + \frac{1}{c}\|y(T)-y^*\|^2 \right) \\
& + \int_0^T \|\dot x(t)\|^2_{M_1(t)+\frac{(1-\gamma)c}{4}A^*A-\frac{L_h}{4}I}\,dt + \int_0^T \|\dot z(t)\|^2_{M_2(t)+\frac{c}{4}I}\,dt + \frac{\gamma+1}{3c}\int_0^T \|\dot y(t)\|^2\,dt \\
& + \int_0^T \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 dt + (1-\gamma)\int_0^T \left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 dt \\
& + \frac{1}{L_h}\int_0^T \left\| \nabla h(x(t)) - \nabla h(x^*) + \frac{L_h}{2}\dot x(t) \right\|^2 dt \\
\le{} & \frac{1}{2}\left( \|x^0-x^*\|^2_{M_1(0)+c(1-\gamma)A^*A} + \|z^0-z^*\|^2_{M_2(0)+cI} + \frac{1}{c}\|y^0-y^*\|^2 \right).
\end{aligned}
\]
Letting $T$ converge to $+\infty$ we find
\[
\|\dot x(\cdot)\|^2_{M_1(\cdot)+\frac{(1-\gamma)c}{4}A^*A-\frac{L_h}{4}I} \in L^1([0,+\infty),\mathbb{R}),
\tag{39}
\]
\[
\|\dot z(\cdot)\|^2_{M_2(\cdot)+\frac{c}{4}I} \in L^1([0,+\infty),\mathbb{R}), \quad \dot y(\cdot) \in L^2([0,+\infty),\mathcal{G}),
\tag{40}
\]
\[
\frac{\sqrt{3c}}{2}\dot z(\cdot) + \frac{1}{\sqrt{3c}}\dot y(\cdot) \in L^2([0,+\infty),\mathcal{G}), \quad (1-\gamma)\left( \frac{\sqrt{3c}}{2}A\dot x(\cdot) - \frac{1}{\sqrt{3c}}\dot y(\cdot) \right) \in L^2([0,+\infty),\mathcal{G}),
\tag{41}
\]
and, consequently,
\[
\dot z(\cdot),\ (1-\gamma)A\dot x(\cdot) \in L^2([0,+\infty),\mathcal{G}).
\tag{42}
\]
In the case when $L_h = 0$, which corresponds to the situation when $h$ is an affine continuous function, instead of (37) we obtain that for almost every $t \in [0,+\infty)$
\[
\begin{aligned}
& \frac{1}{2}\frac{d}{dt}\left( \|x(t)-x^*\|^2_{M_1(t)+c(1-\gamma)A^*A} + \|z(t)-Ax^*\|^2_{M_2(t)+cI} + \frac{1}{c}\|y(t)-y^*\|^2 \right) \\
& + \|\dot x(t)\|^2_{M_1(t)+\frac{(1-\gamma)c}{4}A^*A} + \|\dot z(t)\|^2_{M_2(t)+\frac{c}{4}I} + \frac{\gamma+1}{3c}\|\dot y(t)\|^2 \\
& + \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 + (1-\gamma)\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 \le 0.
\end{aligned}
\tag{43}
\]
By arguing as above, we obtain also in this case (38), (39)-(41) and (42).

Further, we have that $\dot x(\cdot) \in L^2([0,+\infty),\mathcal{H})$. Indeed, in case (I), when we assume that there exists $\alpha > 0$ such that $M_1(t) + \frac{(1-\gamma)c}{4}A^*A - \frac{L_h}{4}I \in \mathcal{P}_\alpha(\mathcal{H})$ for every $t \in [0,+\infty)$, this follows automatically from (39). In case (II), from $(1-\gamma)A\dot x(\cdot) \in L^2([0,+\infty),\mathcal{G})$ and $\gamma \in [0,1)$ we get
\[
A\dot x(\cdot) \in L^2([0,+\infty),\mathcal{G}).
\]
But, since $A^*A \in \mathcal{P}_\alpha(\mathcal{H})$, it yields $\|A\dot x(t)\|^2 \ge \alpha\|\dot x(t)\|^2$ for almost every $t \in [0,+\infty)$, which means that also in this case
\[
\dot x(\cdot) \in L^2([0,+\infty),\mathcal{H}).
\]
According to Lemma 5, this yields
\[
\ddot x(\cdot) \in L^2([0,+\infty),\mathcal{H}) \quad \text{and} \quad \ddot z(\cdot),\ \ddot y(\cdot) \in L^2([0,+\infty),\mathcal{G}).
\]
Consequently, for almost every $t \in [0,+\infty)$ it holds
\[
\frac{d}{dt}\|\dot x(t)\|^2 = 2\langle \ddot x(t), \dot x(t) \rangle \le \|\ddot x(t)\|^2 + \|\dot x(t)\|^2
\]
and the right-hand side is a function in $L^1([0,+\infty),\mathbb{R})$. Hence, according to Lemma 7,
\[
\lim_{t \to +\infty}\dot x(t) = 0.
\]
Similarly, we obtain that
\[
\lim_{t \to +\infty}\dot z(t) = 0 \quad \text{and} \quad \lim_{t \to +\infty}\dot y(t) = 0.
\]
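The vanishing of the velocities can be observed numerically on a small example. The sketch below is only an illustration under our own assumptions and is not taken from the paper: it considers the special case $M_1 = M_2 = 0$ of the dynamical system, the scalar toy problem $\min_x \frac{1}{2}(x-b)^2 + \lambda|x|$ (that is, $f = 0$, $h(x) = \frac{1}{2}(x-b)^2$ with $L_h = 1$, $g = \lambda|\cdot|$, $A = 1$), and a forward Euler time stepping with a hypothetical step size `dt`. The names `simulate` and `soft_threshold` are ours. For $b = 3$, $\lambda = 1$ the saddle point is $(x^*, z^*, y^*) = (2, 2, 1)$, and with $c = 2$, $\gamma = 0$ one has $A^*A - \frac{L_h}{c(1-\gamma)}I = \frac{1}{2} > 0$.

```python
def soft_threshold(v, kappa):
    """Proximal map of kappa*|.| for a scalar argument."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def simulate(b=3.0, lam=1.0, c=2.0, gamma=0.0, dt=0.1, steps=5000):
    """Forward Euler integration of the system with M1 = M2 = 0 for the
    scalar toy problem min_x 0.5*(x - b)**2 + lam*|x| (illustrative choice:
    f = 0, h(x) = 0.5*(x - b)**2, g = lam*|.|, A = 1).

    Returns the final state (x, z, y) and the largest velocity component
    computed in the last step."""
    x = z = y = 0.0
    speed = float("inf")
    for _ in range(steps):
        # x-equation: xdot + x minimises u*h'(x) + (c/2)*(u - z + y/c)**2 over u
        x_target = z - y / c - (x - b) / c
        xdot = x_target - x
        # z-equation: zdot + z = prox of g/c at gamma*xdot + x + y/c
        z_target = soft_threshold(gamma * xdot + x + y / c, lam / c)
        zdot = z_target - z
        # y-equation: ydot = c*A*(x + xdot) - c*(z + zdot)
        ydot = c * (x_target - z_target)
        speed = max(abs(xdot), abs(zdot), abs(ydot))
        # explicit Euler step (our discretization choice, not the paper's)
        x, z, y = x + dt * xdot, z + dt * zdot, y + dt * ydot
    return x, z, y, speed


if __name__ == "__main__":
    print(simulate())
```

In this toy setting the state approaches $(2, 2, 1)$ while the reported velocity tends to zero, in line with the limits derived above. The step size and the number of steps are ad hoc and would need to be adapted for other problem data.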

We will close the proof of the theorem by showing that the asymptotic convergence of the trajectory follows from Lemma 8. One can easily notice that (38) is nothing else but condition (i) of this lemma when applied in the product space for the trajectory
\[
[0,+\infty) \to \mathcal{H}\times\mathcal{G}\times\mathcal{G}, \quad t \mapsto (x(t), z(t), y(t)),
\]
the monotonically decreasing map
\[
W : [0,+\infty) \to \mathcal{H}\times\mathcal{G}\times\mathcal{G}, \quad W(t) = \left( M_1(t) + c(1-\gamma)A^*A,\ M_2(t) + cI,\ \frac{1}{c}I \right)
\]
and the set $C$ taken as the set of saddle points of the Lagrangian $l$.

Next we will show that also condition (ii) in Lemma 8 is fulfilled, namely, that every weak sequential cluster point of the trajectory $(x(t), z(t), y(t))$, $t \in [0,+\infty)$, is a saddle point of the Lagrangian $l$.

Let $(x, z, y)$ be such a weak sequential cluster point. This means that there exists a sequence $(s_n)_{n \ge 0}$ with $s_n \to +\infty$ such that $(x(s_n), z(s_n), y(s_n))$ converges to $(x, z, y)$ as $n \to +\infty$ in the weak topology of $\mathcal{H}\times\mathcal{G}\times\mathcal{G}$.

From (29) and (31) we get for every $n \ge 0$
\[
-cA^*A(\dot x(s_n)+x(s_n)) - M_1(s_n)\dot x(s_n) + cA^*z(s_n) - A^*y(s_n) - \nabla h(x(s_n)) \in \partial f(\dot x(s_n)+x(s_n))
\]
and
\[
-c(\dot z(s_n)+z(s_n)) + cA(\gamma\dot x(s_n)+x(s_n)) - M_2(s_n)\dot z(s_n) + y(s_n) \in \partial g(\dot z(s_n)+z(s_n)),
\]
respectively. For every $n \ge 0$, let
\[
a_n^* := -cA^*A(\dot x(s_n)+x(s_n)) - M_1(s_n)\dot x(s_n) + cA^*z(s_n) - A^*y(s_n) - \nabla h(x(s_n)) + \nabla h(\dot x(s_n)+x(s_n))
\]
and
\[
a_n := \dot x(s_n)+x(s_n).
\]
Hence, $(a_n, a_n^*)_{n \ge 0} \subseteq \operatorname{Gr}(\partial(f+h))$. Similarly, for every $n \ge 0$, let
\[
b_n^* := -c(\dot z(s_n)+z(s_n)) + cA(\gamma\dot x(s_n)+x(s_n)) - M_2(s_n)\dot z(s_n) + y(s_n)
\]
and
\[
b_n := \dot z(s_n)+z(s_n).
\]
Hence, $(b_n, b_n^*)_{n \ge 0} \subseteq \operatorname{Gr}(\partial g)$.

Since $\lim_{t \to +\infty}\dot x(t) = 0$, $\lim_{t \to +\infty}\dot z(t) = 0$ and $\lim_{t \to +\infty}\dot y(t) = 0$, it follows that $(a_n)_{n \ge 0}$ converges weakly to $x$ as $n \to +\infty$. Furthermore, since $(M_2(s_n))_{n \ge 0}$ is bounded, and
\[
b_n^* = c(\gamma-1)A\dot x(s_n) + \dot y(s_n) - M_2(s_n)\dot z(s_n) + y(s_n) \quad \forall n \ge 0,
\]
it follows that $(b_n^*)_{n \ge 0}$ converges weakly to $y$ as $n \to +\infty$.

From (14) we have
\[
Aa_n - b_n = \frac{1}{c}\dot y(s_n) \to 0 \ (n \to +\infty),
\]
which implies that $Ax = z$. On the other hand, since $\nabla h$ is Lipschitz continuous, we have
\[
\nabla h(\dot x(s_n)+x(s_n)) - \nabla h(x(s_n)) \to 0 \ (n \to +\infty),
\]
hence
\[
\begin{aligned}
\lim_{n \to +\infty}(a_n^* + A^*b_n^*) ={} & \lim_{n \to +\infty}\left( c(\gamma-1)A^*A\dot x(s_n) - cA^*\dot z(s_n) - A^*M_2(s_n)\dot z(s_n) - M_1(s_n)\dot x(s_n) \right) \\
& + \lim_{n \to +\infty}\left( \nabla h(\dot x(s_n)+x(s_n)) - \nabla h(x(s_n)) \right) = 0.
\end{aligned}
\]
Thus, according to Proposition 9, we have
\[
-A^*y - \nabla h(x) \in \partial f(x) \quad \text{and} \quad y \in \partial g(Ax).
\]
Consequently, $(x, z, y)$ is a saddle point of $l$. The conclusion of the theorem follows from Lemma 8.

Next we will address two particular cases of the dynamical system (4). We consider first the case when $M_1(t) = M_2(t) = 0$ for every $t \in [0,+\infty)$; thus, the system (4) becomes
\[
\begin{cases}
\dot x(t) + x(t) \in \operatorname{argmin}_{x \in \mathcal{H}}\left( f(x) + \langle x, \nabla h(x(t)) \rangle + \frac{c}{2}\left\| Ax - z(t) + \frac{1}{c}y(t) \right\|^2 \right) \\
\dot z(t) + z(t) = \operatorname{argmin}_{x \in \mathcal{G}}\left( g(x) + \frac{c}{2}\left\| x - \left( A(\gamma\dot x(t) + x(t)) + \frac{1}{c}y(t) \right) \right\|^2 \right) \\
\dot y(t) = cA(x(t) + \dot x(t)) - c(z(t) + \dot z(t)) \\
x(0) = x^0 \in \mathcal{H},\ z(0) = z^0 \in \mathcal{G},\ y(0) = y^0 \in \mathcal{G},
\end{cases}
\tag{44}
\]
where $c > 0$ and $\gamma \in [0,1]$.

Theorem 11.

In the setting of the optimization problem (1), assume that the set of saddle points of the Lagrangian $l$ is nonempty, $\gamma \in [0,1)$ and that there exists $\alpha > 0$ such that $A^*A - \frac{L_h}{c(1-\gamma)}I \in \mathcal{P}_\alpha(\mathcal{H})$. For an arbitrary starting point $(x^0, z^0, y^0) \in \mathcal{H}\times\mathcal{G}\times\mathcal{G}$, let $(x, z, y) : [0,+\infty) \to \mathcal{H}\times\mathcal{G}\times\mathcal{G}$ be the unique strong global solution of the dynamical system (44). Then the trajectory $(x(t), z(t), y(t))$ converges weakly to a saddle point of $l$ as $t \to +\infty$.

Next we consider the setting from Remark 2 with $M_1(t) = \frac{1}{\tau(t)}I - cA^*A$ and $M_2(t) = 0$, where $\tau(t)$ is such that $c\tau(t)\|A\|^2 \le 1$ for every $t \in [0,+\infty)$. The resulting dynamical system is the primal-dual system (8). The corresponding convergence result follows again as a particular case of Theorem 10.

Theorem 12.

In the setting of the optimization problem (1), assume that the set of saddle points of the Lagrangian $l$ is nonempty, the map $\tau : [0,+\infty) \to (0,+\infty)$ is locally absolutely continuous and monotonically increasing with $c\tau(t)\|A\|^2 \le 1$ and
\[
\frac{4 - \tau(t)L_h}{4\tau(t)}I - \frac{c(3+\gamma)}{4}A^*A \in \mathcal{S}_+(\mathcal{H}) \quad \forall t \in [0,+\infty),
\]
and $\sup_{t \ge 0}\frac{\tau'(t)}{\tau^2(t)} < +\infty$. For an arbitrary starting point $(x^0, z^0, y^0) \in \mathcal{H}\times\mathcal{G}\times\mathcal{G}$, let $(x, z, y) : [0,+\infty) \to \mathcal{H}\times\mathcal{G}\times\mathcal{G}$ be the unique strong global solution of the dynamical system (8). If one of the following assumptions holds:

(I) there exists $\alpha > 0$ such that $\frac{4 - \tau(t)L_h}{4\tau(t)}I - \frac{c(3+\gamma)}{4}A^*A \in \mathcal{P}_\alpha(\mathcal{H})$ for every $t \in [0,+\infty)$;

(II) $\gamma \in [0,1)$ and there exists $\alpha > 0$ such that $A^*A \in \mathcal{P}_\alpha(\mathcal{H})$;

then the trajectory $(x(t), z(t), y(t))$ converges weakly to a saddle point of $l$ as $t \to +\infty$.

Remark 5.

Let $t \in [0,+\infty)$. Notice that the condition $\frac{4 - \tau(t)L_h}{4\tau(t)}I - \frac{c(3+\gamma)}{4}A^*A \in \mathcal{S}_+(\mathcal{H})$ is fulfilled if and only if
\[
\tau(t)\left( \frac{L_h}{4} + \frac{c(3+\gamma)}{4}\|A\|^2 \right) \le 1.
\]
On the other hand, the condition $\frac{4 - \tau(t)L_h}{4\tau(t)}I - \frac{c(3+\gamma)}{4}A^*A \in \mathcal{P}_\alpha(\mathcal{H})$ holds, for $\alpha > 0$, if and only if
\[
\tau(t)\left( \alpha + \frac{L_h}{4} + \frac{c(3+\gamma)}{4}\|A\|^2 \right) \le 1.
\]

For the last result of this paper we go back to the general dynamical system (4) and provide convergence rates for the violation of the feasibility condition by ergodic trajectories and the convergence of the objective function along these ergodic trajectories to its minimal value. The result can be seen as the continuous counterpart of a convergence rate result proved for the ADMM algorithm in [22, Theorem 4.3].
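Returning briefly to Remark 5, the two equivalences stated there can be checked in a few lines. The computation below is our own sketch; it assumes that $T \in \mathcal{P}_\alpha(\mathcal{H})$ means $\langle Tx, x \rangle \ge \alpha\|x\|^2$ for every $x \in \mathcal{H}$, and it uses the shorthand $\beta(t)$ and $\kappa$, which does not appear in the text.

\[
\begin{aligned}
& \beta(t) := \frac{4 - \tau(t)L_h}{4\tau(t)} = \frac{1}{\tau(t)} - \frac{L_h}{4}, \qquad \kappa := \frac{c(3+\gamma)}{4} \ge 0, \\
& \inf_{\|x\| = 1}\left\langle \left( \beta(t)I - \kappa A^*A \right)x, x \right\rangle = \beta(t) - \kappa\sup_{\|x\| = 1}\|Ax\|^2 = \beta(t) - \kappa\|A\|^2, \\
& \beta(t) - \kappa\|A\|^2 \ge \alpha
\iff \frac{1}{\tau(t)} \ge \alpha + \frac{L_h}{4} + \frac{c(3+\gamma)}{4}\|A\|^2
\iff \tau(t)\left( \alpha + \frac{L_h}{4} + \frac{c(3+\gamma)}{4}\|A\|^2 \right) \le 1 .
\end{aligned}
\]

Taking $\alpha = 0$ gives the characterization of membership in $\mathcal{S}_+(\mathcal{H})$, while $\alpha > 0$ gives the one for $\mathcal{P}_\alpha(\mathcal{H})$; the last equivalence uses $\tau(t) > 0$.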

Theorem 13.

In the setting of the optimization problem (1), assume that the set of saddle points of the Lagrangian $l$ is nonempty, the maps $[0,+\infty) \to \mathcal{S}_+(\mathcal{H})$, $t \mapsto M_1(t)$, and $[0,+\infty) \to \mathcal{S}_+(\mathcal{G})$, $t \mapsto M_2(t)$, are locally absolutely continuous and monotonically decreasing,
\[
M_1(t) + \frac{c(1-\gamma)}{4}A^*A - \frac{L_h}{2}I \in \mathcal{S}_+(\mathcal{H}) \quad \forall t \in [0,+\infty),
\]
$\sup_{t \ge 0}\|\dot M_1(t)\| < +\infty$ and $\sup_{t \ge 0}\|\dot M_2(t)\| < +\infty$, and that one of the following conditions holds:

(I) there exists $\alpha > 0$ such that $M_1(t) + \frac{c(1-\gamma)}{4}A^*A - \frac{L_h}{2}I \in \mathcal{P}_\alpha(\mathcal{H})$ for every $t \in [0,+\infty)$;

(II) $\gamma \in [0,1)$ and there exists $\alpha > 0$ such that $A^*A \in \mathcal{P}_\alpha(\mathcal{H})$.

For an arbitrary starting point $(x^0, z^0, y^0) \in \mathcal{H}\times\mathcal{G}\times\mathcal{G}$, let $(x, z, y) : [0,+\infty) \to \mathcal{H}\times\mathcal{G}\times\mathcal{G}$ be the unique strong global solution of the dynamical system (4). Consider further for every $t \in (0,+\infty)$ the ergodic trajectories
\[
\tilde x(t) = \frac{1}{t}\int_0^t (\dot x(s) + x(s))\,ds \quad \text{and} \quad \tilde z(t) = \frac{1}{t}\int_0^t (\dot z(s) + z(s))\,ds.
\]
Then there exists $K \ge 0$ such that for every $t \in (0,+\infty)$
\[
\|A\tilde x(t) - \tilde z(t)\| \le \frac{K}{t}.
\]
In addition, for every $x \in \mathcal{H}$ and every $t \in (0,+\infty)$ such that $(\tilde x(t), \tilde z(t)) \in \operatorname{dom} f \times \operatorname{dom} g$, one has
\[
\left( (f+h)(\tilde x(t)) + g(\tilde z(t)) \right) - \left( (f+h)(x) + g(Ax) \right) \le \frac{\|(x^0, z^0, y^0) - (x, Ax, 0)\|^2_{W(0)}}{2t},
\]
where
\[
W(t) = \left( M_1(t) + c(1-\gamma)A^*A,\ M_2(t) + cI,\ \frac{1}{c}I \right).
\]

Proof.

Let $x \in \mathcal{H}$ be fixed. By using (5), that is
\[
-cA^*A(\dot x(t)+x(t)) - M_1(t)\dot x(t) + cA^*z(t) - A^*y(t) - \nabla h(x(t)) \in \partial f(\dot x(t)+x(t)),
\]
it yields
\[
f(\dot x(t)+x(t)) - f(x) \le \langle cA^*A(\dot x(t)+x(t)) + M_1(t)\dot x(t) - cA^*z(t) + A^*y(t) + \nabla h(x(t)),\ x - (\dot x(t)+x(t)) \rangle
\tag{45}
\]
for almost every $t \in [0,+\infty)$. Similarly, by using (31), that is
\[
-c(\dot z(t)+z(t)) + cA(\gamma\dot x(t)+x(t)) - M_2(t)\dot z(t) + y(t) \in \partial g(\dot z(t)+z(t)),
\]
it yields
\[
g(\dot z(t)+z(t)) - g(Ax) \le \langle c(\dot z(t)+z(t)) - cA(\gamma\dot x(t)+x(t)) + M_2(t)\dot z(t) - y(t),\ Ax - (\dot z(t)+z(t)) \rangle
\tag{46}
\]
for almost every $t \in [0,+\infty)$. Further, by using the convexity of $h$ and the Descent Lemma we obtain for almost every $t \in [0,+\infty)$
\[
\begin{aligned}
& h(x) - h(\dot x(t)+x(t)) - \langle \nabla h(x(t)),\ x - (\dot x(t)+x(t)) \rangle \\
\ge{} & h(x(t)) + \langle \nabla h(x(t)),\ x - x(t) \rangle - h(\dot x(t)+x(t)) - \langle \nabla h(x(t)),\ x - (\dot x(t)+x(t)) \rangle \\
={} & h(x(t)) - h(\dot x(t)+x(t)) + \langle \nabla h(x(t)),\ \dot x(t) \rangle \ge -\frac{L_h}{2}\|\dot x(t)\|^2.
\end{aligned}
\tag{47}
\]
Adding (45) and (47) we obtain for almost every $t \in [0,+\infty)$
\[
(f+h)(\dot x(t)+x(t)) - (f+h)(x) \le \langle cA^*A(\dot x(t)+x(t)) + M_1(t)\dot x(t) - cA^*z(t) + A^*y(t),\ x - (\dot x(t)+x(t)) \rangle + \frac{L_h}{2}\|\dot x(t)\|^2.
\tag{48}
\]
We recall the following four identities from the proof of Theorem 10 (here we actually replace $x^*$ with $x$ and $y^*$ with $0$):
\[
\begin{aligned}
\langle -A^*y(t),\ \dot x(t)+x(t)-x \rangle + \langle y(t),\ \dot z(t)+z(t)-Ax \rangle
&= -\langle y(t),\ A(\dot x(t)+x(t)) - Ax - (\dot z(t)+z(t)) + Ax \rangle \\
&= -\frac{1}{c}\langle y(t), \dot y(t) \rangle = -\frac{1}{2c}\frac{d}{dt}\|y(t)\|^2,
\end{aligned}
\]
which corresponds to (33),
\[
\begin{aligned}
& \langle -cA^*A(\dot x(t)+x(t)) + cA^*z(t),\ \dot x(t)+x(t)-x \rangle + \langle -c(\dot z(t)+z(t)) + cAx(t) + c\gamma A\dot x(t),\ \dot z(t)+z(t)-Ax \rangle \\
={} & -\frac{\gamma+1}{3c}\|\dot y(t)\|^2 - \frac{c}{4}\|\dot z(t)\|^2 - \frac{(1-\gamma)c}{4}\|A\dot x(t)\|^2 - \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 \\
& - (1-\gamma)\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 + \frac{c(\gamma-1)}{2}\frac{d}{dt}\left( \|Ax(t)-Ax\|^2 \right) - \frac{c}{2}\frac{d}{dt}\left( \|z(t)-Ax\|^2 \right),
\end{aligned}
\]
\[
\begin{aligned}
\langle -M_1(t)\dot x(t),\ \dot x(t)+x(t)-x \rangle &= -\|\dot x(t)\|^2_{M_1(t)} - \langle M_1(t)\dot x(t),\ x(t)-x \rangle \\
&= -\|\dot x(t)\|^2_{M_1(t)} + \frac{1}{2}\langle \dot M_1(t)(x(t)-x),\ x(t)-x \rangle - \frac{1}{2}\frac{d}{dt}\|x(t)-x\|^2_{M_1(t)}
\end{aligned}
\]
and
\[
\begin{aligned}
\langle -M_2(t)\dot z(t),\ \dot z(t)+z(t)-Ax \rangle &= -\|\dot z(t)\|^2_{M_2(t)} - \langle M_2(t)\dot z(t),\ z(t)-Ax \rangle \\
&= -\|\dot z(t)\|^2_{M_2(t)} + \frac{1}{2}\langle \dot M_2(t)(z(t)-Ax),\ z(t)-Ax \rangle - \frac{1}{2}\frac{d}{dt}\|z(t)-Ax\|^2_{M_2(t)},
\end{aligned}
\]
which all hold for almost every $t \in [0,+\infty)$.
By adding the four identities, (48) and (46), we obtain for almost every $t \in [0,+\infty)$
\[
\begin{aligned}
& \left( (f+h)(\dot x(t)+x(t)) + g(\dot z(t)+z(t)) \right) - \left( (f+h)(x) + g(Ax) \right) \\
\le{} & -\frac{\gamma+1}{3c}\|\dot y(t)\|^2 - \frac{c}{4}\|\dot z(t)\|^2 - \frac{(1-\gamma)c}{4}\|A\dot x(t)\|^2 - \left\| \frac{\sqrt{3c}}{2}\dot z(t) + \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 \\
& - (1-\gamma)\left\| \frac{\sqrt{3c}}{2}A\dot x(t) - \frac{1}{\sqrt{3c}}\dot y(t) \right\|^2 - \frac{c(1-\gamma)}{2}\frac{d}{dt}\left( \|Ax(t)-Ax\|^2 \right) - \frac{c}{2}\frac{d}{dt}\left( \|z(t)-Ax\|^2 \right) \\
& + \frac{L_h}{2}\|\dot x(t)\|^2 - \frac{1}{2c}\frac{d}{dt}\|y(t)\|^2 \\
& - \|\dot x(t)\|^2_{M_1(t)} + \frac{1}{2}\langle \dot M_1(t)(x(t)-x),\ x(t)-x \rangle - \frac{1}{2}\frac{d}{dt}\|x(t)-x\|^2_{M_1(t)} \\
& - \|\dot z(t)\|^2_{M_2(t)} + \frac{1}{2}\langle \dot M_2(t)(z(t)-Ax),\ z(t)-Ax \rangle - \frac{1}{2}\frac{d}{dt}\|z(t)-Ax\|^2_{M_2(t)}.
\end{aligned}
\]
By neglecting the negative terms (here we use also that $M_1(t) + \frac{c(1-\gamma)}{4}A^*A - \frac{L_h}{2}I \in \mathcal{S}_+(\mathcal{H})$), we obtain for almost every $t \in [0,+\infty)$
\[
\begin{aligned}
& \left( (f+h)(\dot x(t)+x(t)) + g(\dot z(t)+z(t)) \right) - \left( (f+h)(x) + g(Ax) \right) \\
\le{} & -\frac{1}{2}\frac{d}{dt}\left( \|x(t)-x\|^2_{M_1(t)+c(1-\gamma)A^*A} + \|z(t)-Ax\|^2_{M_2(t)+cI} + \frac{1}{c}\|y(t)\|^2 \right) \\
={} & -\frac{1}{2}\frac{d}{dt}\|(x(t), z(t), y(t)) - (x, Ax, 0)\|^2_{W(t)},
\end{aligned}
\tag{49}
\]
where
\[
W(t) = \left( M_1(t) + c(1-\gamma)A^*A,\ M_2(t) + cI,\ \frac{1}{c}I \right).
\]
For $\tilde x(t) = \frac{1}{t}\int_0^t (\dot x(s)+x(s))\,ds$ and $\tilde z(t) = \frac{1}{t}\int_0^t (\dot z(s)+z(s))\,ds$, it holds
\[
A\tilde x(t) - \tilde z(t) = \frac{1}{t}\int_0^t \left( A(\dot x(s)+x(s)) - (\dot z(s)+z(s)) \right) ds = \frac{1}{ct}\int_0^t \dot y(s)\,ds = \frac{y(t) - y^0}{ct} \quad \forall t \in (0,+\infty).
\]
From Theorem 10 it follows that the trajectory $(x(t), z(t), y(t))$, $t \in [0,+\infty)$, converges weakly to a saddle point $(x_\infty, z_\infty, y_\infty)$ of $l$ as $t \to +\infty$. This means that $y(t)$, $t \in [0,+\infty)$, is bounded; thus there exists $K \ge 0$ such that
\[
\|A\tilde x(t) - \tilde z(t)\| \le \frac{K}{t} \quad \forall t \in (0,+\infty).
\]
Let $t \in (0,+\infty)$ be such that $(\tilde x(t), \tilde z(t)) \in \operatorname{dom} f \times \operatorname{dom} g$. By Jensen's inequality in the integral form we have
\[
(f+h)(\tilde x(t)) = (f+h)\left( \frac{1}{t}\int_0^t (\dot x(s)+x(s))\,ds \right) \le \frac{1}{t}\int_0^t (f+h)(\dot x(s)+x(s))\,ds
\]
and
\[
g(\tilde z(t)) = g\left( \frac{1}{t}\int_0^t (\dot z(s)+z(s))\,ds \right) \le \frac{1}{t}\int_0^t g(\dot z(s)+z(s))\,ds,
\]
which, combined with (49), yields
\[
\begin{aligned}
(f+h)(\tilde x(t)) + g(\tilde z(t)) &\le \frac{1}{t}\int_0^t \left( (f+h)(\dot x(s)+x(s)) + g(\dot z(s)+z(s)) \right) ds \\
&\le \frac{1}{t}\int_0^t \left( \left( (f+h)(x) + g(Ax) \right) - \frac{1}{2}\frac{d}{ds}\|(x(s), z(s), y(s)) - (x, Ax, 0)\|^2_{W(s)} \right) ds \\
&= (f+h)(x) + g(Ax) - \frac{1}{2t}\left( \|(x(t), z(t), y(t)) - (x, Ax, 0)\|^2_{W(t)} - \|(x(0), z(0), y(0)) - (x, Ax, 0)\|^2_{W(0)} \right) \\
&\le (f+h)(x) + g(Ax) + \frac{\|(x^0, z^0, y^0) - (x, Ax, 0)\|^2_{W(0)}}{2t}.
\end{aligned}
\]
Hence,
\[
\left( (f+h)(\tilde x(t)) + g(\tilde z(t)) \right) - \left( (f+h)(x) + g(Ax) \right) \le \frac{\|(x^0, z^0, y^0) - (x, Ax, 0)\|^2_{W(0)}}{2t}.
\]

References

[1] B. Abbas, H. Attouch,

Dynamical systems and forward-backward algorithms associated with the sum of a convex subdifferential and a monotone cocoercive operator, Optimization 64(10), 2223–2252, 2015

[2] B. Abbas, H. Attouch, B.F. Svaiter, Newton-like dynamics and forward-backward methods for structured monotone inclusions in Hilbert spaces, Journal of Optimization Theory and its Applications 161(2), 331–360, 2014

[3] A. Alotaibi, P.L. Combettes, N. Shahzad, Solving coupled composite monotone inclusions by successive Fejér approximations of their Kuhn-Tucker set, SIAM Journal on Optimization 24(4), 2076–2095, 2014

[4] F. Alvarez, H. Attouch, J. Bolte, P. Redont, A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics, Journal de Mathématiques Pures et Appliquées (9) 81(8), 747–779, 2002

[5] A.S. Antipin, Minimization of convex functions on convex sets by means of differential equations, (Russian) Differentsial'nye Uravneniya 30(9), 1475–1486, 1994; translation in Differential Equations 30(9), 1365–1375, 1994

[6] H. Attouch, B.F. Svaiter, A continuous dynamical Newton-like approach to solving monotone inclusions, SIAM Journal on Control and Optimization 49(2), 574–598, 2011

[7] S. Banert, R.I. Boț, A forward-backward-forward differential equation and its asymptotic properties, Journal of Convex Analysis 25(2), 371–388, 2018

[8] S. Banert, R.I. Boț, E.R. Csetnek, Fixing and extending some recent results on the ADMM algorithm, arXiv:1612.05057, 2016

[9] H.H. Bauschke, P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, CMS Books in Mathematics, Springer, New York, 2011

[10] J. Bolte, Continuous gradient projection method in Hilbert spaces, Journal of Optimization Theory and its Applications 119(2), 235–259, 2003

[11] R.I. Boț, Conjugate Duality in Convex Optimization, Lecture Notes in Economics and Mathematical Systems, Vol. 637, Springer, Berlin Heidelberg, 2010

[12] R.I. Boț, E.R. Csetnek, ADMM for monotone operators: convergence analysis and rates, Advances in Computational Mathematics 45(1), 327–359, 2019

[13] R.I. Boț, E.R. Csetnek, A dynamical system associated with the fixed points set of a nonexpansive operator, Journal of Dynamics and Differential Equations 29(1), 155–168, 2017

[14] H. Brézis, Opérateurs Maximaux Monotones et Semi-Groupes de Contractions Dans les Espaces de Hilbert, North-Holland Mathematics Studies No. 5, Notas de Matemática (50), North-Holland/Elsevier, New York, 1973

[15] R.E. Bruck, Jr., Asymptotic convergence of nonlinear contraction semigroups in Hilbert space, Journal of Functional Analysis 18, 15–26, 1975

[16] A. Chambolle, T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, Journal of Mathematical Imaging and Vision 40(1), 120–145, 2011

[17] G. Chen, M. Teboulle, A proximal-based decomposition method for convex minimization problems, Mathematical Programming 64, 81–101, 1994

[18] P.L. Combettes, B.C. Vũ, Variable metric quasi-Fejér monotonicity, Nonlinear Analysis 78, 17–31, 2013

[19] L. Condat, A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms, Journal of Optimization Theory and Applications 158(2), 460–479, 2013

[20] M.G. Crandall, A. Pazy, Semi-groups of nonlinear contractions and dissipative sets, Journal of Functional Analysis 3, 376–418, 1969

[21] E.R. Csetnek, Y. Malitsky, M.K. Tam, Shadow Douglas-Rachford Splitting for Monotone Inclusions, arXiv:1903.03393, 2019

[22] Y. Cui, X. Li, D. Sun, K.-C. Toh, On the convergence properties of a majorized alternating direction method of multipliers for linearly constrained convex optimization problems with coupled objective functions, Journal of Optimization Theory and Applications 169, 1013–1041, 2016

[23] M. Fazel, T.K. Pong, D. Sun, P. Tseng, Hankel matrix rank minimization with applications in system identification and realization, SIAM Journal on Matrix Analysis and Applications 34, 946–977, 2013

[24] A. Haraux, Systèmes Dynamiques Dissipatifs et Applications, Recherches en Mathématiques Appliquées 17, Masson, Paris, 1991

[25] R. Shefi, M. Teboulle, Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex optimization, SIAM Journal on Optimization 24(1), 269–297, 2014

[26] E.D. Sontag, Mathematical Control Theory. Deterministic Finite-Dimensional Systems, Texts in Applied Mathematics 6, Springer-Verlag, New York, 1998

[27] B.C. Vũ, A splitting algorithm for dual monotone inclusions involving cocoercive operators, Advances in Computational Mathematics 38(3), 667–681, 2013

[28] C. Zălinescu, Convex Analysis in General Vector Spaces, World Scientific, Singapore, 2002

[29] R. Zimmer,