Solving Composite Monotone Inclusions in Reflexive Banach Spaces by Constructing Best Bregman Approximations from Their Kuhn-Tucker Set
Patrick L. Combettes and Quang Van Nguyen
Sorbonne Universités – UPMC Univ. Paris 06, UMR 7598, Laboratoire Jacques-Louis Lions, F-75005 Paris, France
In memory of Jean Jacques Moreau (1923–2014)
Abstract
We introduce the first operator splitting method for composite monotone inclusions outside of Hilbert spaces. The proposed primal-dual method constructs iteratively the best Bregman approximation to an arbitrary point from the Kuhn-Tucker set of a composite monotone inclusion. Strong convergence is established in reflexive Banach spaces without requiring additional restrictions on the monotone operators or knowledge of the norms of the linear operators involved in the model. The monotone operators are activated via Bregman distance-based resolvent operators. The method is novel even in Euclidean spaces, where it provides an alternative to the usual proximal methods based on the standard distance.
Key words.
Best approximation, Banach space, Bregman distance, duality, Legendre function, monotone operator, operator splitting, primal-dual algorithm.
1 Introduction
Let $\mathcal{X}$ be a reflexive real Banach space with norm $\|\cdot\|$ and let $\langle\cdot,\cdot\rangle$ be the duality pairing between $\mathcal{X}$ and its topological dual $\mathcal{X}^*$. A set-valued operator $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ with graph $\mathrm{gra}\,M=\{(x,x^*)\in\mathcal{X}\times\mathcal{X}^*\mid x^*\in Mx\}$ is monotone if
\[
(\forall(x_1,x_1^*)\in\mathrm{gra}\,M)(\forall(x_2,x_2^*)\in\mathrm{gra}\,M)\quad\langle x_1-x_2,x_1^*-x_2^*\rangle\geq 0,\tag{1.1}
\]
and maximally monotone if, furthermore, there exists no monotone operator from $\mathcal{X}$ to $2^{\mathcal{X}^*}$ the graph of which properly contains $\mathrm{gra}\,M$. Monotone operator theory emerged in the early 1960s as a well-structured branch of nonlinear analysis [24, 29, 30, 41], and it remains very active [9, 10, 35, 42]. One of the main reasons for the success of the theory is that a significant range of problems in areas such as optimization, economics, variational inequalities, partial differential equations, mechanics, signal and image processing, optimal transportation, machine learning, and traffic theory can be reduced to solving inclusions of the type
\[
\text{find } x\in\mathcal{X}\ \text{such that}\ 0\in Mx,\tag{1.2}
\]
where $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ is maximally monotone. Conceptually, this inclusion can be solved via the Bregman proximal point algorithm, special instances of which go back to [22, 25, 36]. To present its general form [7], we need the following definitions, which revolve around the notion of a Bregman distance pioneered in [13].

Definition 1.1 [6, 7] Let $\mathcal{X}$ be a reflexive real Banach space and let $f\colon\mathcal{X}\to\,]-\infty,+\infty]$ be a proper lower semicontinuous convex function, with conjugate $f^*\colon\mathcal{X}^*\to\,]-\infty,+\infty]\colon x^*\mapsto\sup_{x\in\mathcal{X}}(\langle x,x^*\rangle-f(x))$ and Moreau subdifferential [32]
\[
\partial f\colon\mathcal{X}\to 2^{\mathcal{X}^*}\colon x\mapsto\{x^*\in\mathcal{X}^*\mid(\forall y\in\mathcal{X})\ \langle y-x,x^*\rangle+f(x)\leq f(y)\}.\tag{1.3}
\]
Then $f$ is a Legendre function if it is essentially smooth in the sense that $\partial f$ is both locally bounded and single-valued on its domain, and essentially strictly convex in the sense that $\partial f^*$ is locally bounded on its domain and $f$ is strictly convex on every convex subset of $\mathrm{dom}\,\partial f$.
Moreover, $f$ is Gâteaux differentiable on $\mathrm{int\,dom}\,f\neq\varnothing$ and the associated Bregman distance is
\[
D^f\colon\mathcal{X}\times\mathcal{X}\to[0,+\infty]\colon(x,y)\mapsto
\begin{cases}
f(x)-f(y)-\langle x-y,\nabla f(y)\rangle, & \text{if } y\in\mathrm{int\,dom}\,f;\\
+\infty, & \text{otherwise.}
\end{cases}\tag{1.4}
\]
Let $C$ be a closed convex subset of $\mathcal{X}$ such that $C\cap\mathrm{int\,dom}\,f\neq\varnothing$. The Bregman projector onto $C$ induced by $f$ is
\[
P^f_C\colon\mathrm{int\,dom}\,f\to C\cap\mathrm{int\,dom}\,f\colon y\mapsto\operatorname*{argmin}_{x\in C}D^f(x,y).\tag{1.5}
\]
The fact that, for every $y\in\mathrm{int\,dom}\,f$, $P^f_Cy\in\mathrm{int\,dom}\,f$ exists and is unique is established in [6, Corollary 7.9].

It follows from [7, Theorem 5.18] that, under suitable assumptions on $f$ and $M$, given a sequence $(\gamma_n)_{n\in\mathbb{N}}$ in $]0,+\infty[$ such that $\inf_{n\in\mathbb{N}}\gamma_n>0$, the sequence defined by
\[
x_0\in\mathrm{int\,dom}\,f\quad\text{and}\quad(\forall n\in\mathbb{N})\quad x_{n+1}=(\nabla f+\gamma_nM)^{-1}\circ\nabla f(x_n)\tag{1.6}
\]
converges weakly to a solution to (1.2) (in the case when $\mathcal{X}$ is a Hilbert space and $f=\|\cdot\|^2/2$, $(\nabla f+\gamma_nM)^{-1}\circ\nabla f$ reduces to the standard resolvent $J_{\gamma_nM}$ and we obtain the classical result of [34, Theorem 1]). A strongly convergent variant of (1.6) was proposed in [8]. In applications, however, $M$ is typically too complex for (1.6) to be implementable. For instance, given a real Banach space $\mathcal{Y}$, a typical composite model is $M=A+L^*BL$, where $A\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ and $B\colon\mathcal{Y}\to 2^{\mathcal{Y}^*}$ are monotone, and $L\colon\mathcal{X}\to\mathcal{Y}$ is linear and bounded. In Hilbert spaces, if $\mathcal{X}=\mathcal{Y}$ and $L=\mathrm{Id}$, several well-established splitting methods are available to solve (1.2), i.e., to find a zero of $A+B$ using $A$ and $B$ separately at each iteration [9, 27, 28, 37]. Splitting methods for the more versatile composite model $M=A+L^*BL$ in Hilbert spaces were first proposed in [14] (see [1, 11, 12, 19, 20, 38] for subsequent developments). These methods provide in general only weak convergence to an unspecified solution and, in addition, they require knowledge of $\|L\|$ or potentially costly inversions of linear operators.
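To make the Bregman distance (1.4) and the proximal iteration (1.6) concrete, the following minimal sketch works on the real line. Everything specific in it is an illustrative assumption rather than part of the paper: the kernel is the Boltzmann-Shannon entropy $f(x)=x\ln x-x$ on $]0,+\infty[$, the operator is the hypothetical choice $Mx=x-c$ with $c>0$ (so that the unique zero of $M$ is $c$), and the resolvent equation $\ln u+\gamma(u-c)=\ln x_n$ is solved by plain bisection, which is just a convenient stand-in solver.

```python
import math


def entropy(x):
    """Boltzmann-Shannon entropy kernel f(x) = x*ln(x) - x on ]0, +inf[."""
    return x * math.log(x) - x


def grad_entropy(x):
    """Derivative of the entropy kernel: f'(x) = ln(x)."""
    return math.log(x)


def bregman_distance(f, grad_f, x, y):
    """D^f(x, y) = f(x) - f(y) - (x - y) * f'(y), as in (1.4), for y in int dom f."""
    return f(x) - f(y) - (x - y) * grad_f(y)


def bregman_prox_step(x_n, gamma, c, iters=100):
    """One step of (1.6) for the assumed operator M x = x - c.

    Solves ln(u) + gamma * (u - c) = ln(x_n) for u > 0 by bisection.
    The left-hand side is strictly increasing in u and the root lies
    between x_n and c, so [min(x_n, c), max(x_n, c)] is a valid bracket.
    """
    lo, hi = min(x_n, c), max(x_n, c)
    target = math.log(x_n)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.log(mid) + gamma * (mid - c) > target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def bregman_proximal_point(x0, gamma, c, n_iter=60):
    """Iterate (1.6) with a constant parameter gamma_n = gamma."""
    x = x0
    for _ in range(n_iter):
        x = bregman_prox_step(x, gamma, c)
    return x
```

With this kernel the Bregman distance is the Kullback-Leibler divergence on the positive half-line, and the iterates stay in $]0,+\infty[$ without any explicit positivity constraint, which is one practical motivation for replacing the standard distance by a Bregman distance.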
The recent primal-dual method of [3] circumvents these limitations of the hilbertian splitting methods and, in addition, converges to the best approximation to a reference point from the Kuhn-Tucker set relative to the underlying hilbertian distance. The objective of this paper is to extend it to reflexive Banach spaces and to best approximation relative to general Bregman distances. Let us stress that the theory of splitting algorithms in Banach spaces is rather scarce as most hilbertian splitting methods cannot be naturally extended to that setting; in particular, to the best of our knowledge there exists at present no splitting algorithm for finding a zero of $M=A+L^*BL$ outside of Hilbert spaces. By contrast, the geometric primal-dual construction of [3], which consists in projecting a reference point onto successive simple outer approximations to the Kuhn-Tucker set of the inclusion $0\in Ax+L^*BLx$, lends itself to such an extension. Our analysis will borrow tools on Legendre functions and Bregman-based algorithms from [6] and [7], as well as geometric constructs from [3] and [8]. The proposed results will provide not only the first splitting methods for composite inclusions outside of Hilbert spaces, but also new algorithms in Hilbert, and even Euclidean, spaces.

The problem under consideration is the following.
Problem 1.2
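Let $\mathcal{X}$ and $\mathcal{Y}$ be reflexive real Banach spaces such that $\mathcal{X}\neq\{0\}$ and $\mathcal{Y}\neq\{0\}$, let $\boldsymbol{\mathcal{X}}$ be the standard product vector space $\mathcal{X}\times\mathcal{Y}^*$ equipped with the norm $(x,y^*)\mapsto\sqrt{\|x\|^2+\|y^*\|^2}$, and let $\boldsymbol{\mathcal{X}}^*$ be its topological dual, that is, $\mathcal{X}^*\times\mathcal{Y}$ equipped with the norm $(x^*,y)\mapsto\sqrt{\|x^*\|^2+\|y\|^2}$. Let $A\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ and $B\colon\mathcal{Y}\to 2^{\mathcal{Y}^*}$ be maximally monotone, and let $L\colon\mathcal{X}\to\mathcal{Y}$ be linear and bounded. Consider the inclusion problem
\[
\text{find } x\in\mathcal{X}\ \text{such that}\ 0\in Ax+L^*BLx,\tag{1.7}
\]
the dual problem
\[
\text{find } y^*\in\mathcal{Y}^*\ \text{such that}\ 0\in-LA^{-1}(-L^*y^*)+B^{-1}y^*,\tag{1.8}
\]
and let
\[
\boldsymbol{Z}=\{(x,y^*)\in\boldsymbol{\mathcal{X}}\mid-L^*y^*\in Ax\ \text{and}\ Lx\in B^{-1}y^*\}\tag{1.9}
\]
be the associated Kuhn-Tucker set. Let $f\colon\mathcal{X}\to\,]-\infty,+\infty]$ and $g\colon\mathcal{Y}\to\,]-\infty,+\infty]$ be Legendre functions, set
\[
\boldsymbol{f}\colon\boldsymbol{\mathcal{X}}\to\,]-\infty,+\infty]\colon(x,y^*)\mapsto f(x)+g^*(y^*),\tag{1.10}
\]
let $x_0\in\mathrm{int\,dom}\,f$, let $y_0^*\in\mathrm{int\,dom}\,g^*$, and suppose that $\boldsymbol{Z}\cap\mathrm{int\,dom}\,\boldsymbol{f}\neq\varnothing$. The problem is to find the best Bregman approximation $(\bar{x},\bar{y}^*)=P^{\boldsymbol{f}}_{\boldsymbol{Z}}(x_0,y_0^*)$ to $(x_0,y_0^*)$ from $\boldsymbol{Z}$.

Notation.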
The symbols $\rightharpoonup$ and $\to$ denote respectively weak and strong convergence. The set of weak sequential cluster points of a sequence $(x_n)_{n\in\mathbb{N}}$ is denoted by $\mathfrak{W}(x_n)_{n\in\mathbb{N}}$. The closed ball of center $x\in\mathcal{X}$ and radius $\rho\in\,]0,+\infty[$ is denoted by $B(x;\rho)$. Let $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ be a set-valued operator. The domain of $M$ is $\mathrm{dom}\,M=\{x\in\mathcal{X}\mid Mx\neq\varnothing\}$, the range of $M$ is $\mathrm{ran}\,M=\{x^*\in\mathcal{X}^*\mid(\exists x\in\mathcal{X})\ x^*\in Mx\}$, and the set of zeros of $M$ is $\mathrm{zer}\,M=\{x\in\mathcal{X}\mid 0\in Mx\}$. $\Gamma_0(\mathcal{X})$ is the class of all lower semicontinuous convex functions $f\colon\mathcal{X}\to\,]-\infty,+\infty]$ such that $\mathrm{dom}\,f=\{x\in\mathcal{X}\mid f(x)<+\infty\}\neq\varnothing$. Let $f\colon\mathcal{X}\to\,]-\infty,+\infty]$. Then $f$ is coercive if $\lim_{\|x\|\to+\infty}f(x)=+\infty$ and supercoercive if $\lim_{\|x\|\to+\infty}f(x)/\|x\|=+\infty$.

The following proposition revisits and complements some results of [2] and [14] on the properties of the Kuhn-Tucker set in the more general setting of Problem 1.2.
Proposition 2.1
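Consider the setting of Problem 1.2. Then the following hold:

(i) Let $\mathscr{P}$ be the set of solutions to (1.7) and let $\mathscr{D}$ be the set of solutions to (1.8). Then the following hold:
(a) $\boldsymbol{Z}$ is a closed convex subset of $\mathscr{P}\times\mathscr{D}$.
(b) Set $Q_{\mathcal{X}}\colon\boldsymbol{\mathcal{X}}\to\mathcal{X}\colon(x,y^*)\mapsto x$ and $Q_{\mathcal{Y}^*}\colon\boldsymbol{\mathcal{X}}\to\mathcal{Y}^*\colon(x,y^*)\mapsto y^*$. Then $\mathscr{P}=Q_{\mathcal{X}}(\boldsymbol{Z})$ and $\mathscr{D}=Q_{\mathcal{Y}^*}(\boldsymbol{Z})$.
(c) $\mathscr{P}\neq\varnothing\Leftrightarrow\boldsymbol{Z}\neq\varnothing\Leftrightarrow\mathscr{D}\neq\varnothing$.

(ii) For every $\boldsymbol{a}=(a,a^*)\in\mathrm{gra}\,A$ and $\boldsymbol{b}=(b,b^*)\in\mathrm{gra}\,B$, set $\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}=(a^*+L^*b^*,b-La)$, $\eta_{\boldsymbol{a},\boldsymbol{b}}=\langle a,a^*\rangle+\langle b,b^*\rangle$, and $\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}=\{\boldsymbol{x}\in\boldsymbol{\mathcal{X}}\mid\langle\boldsymbol{x},\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}\rangle\leq\eta_{\boldsymbol{a},\boldsymbol{b}}\}$. Then the following hold:
(a) $(\forall\boldsymbol{a}\in\mathrm{gra}\,A)(\forall\boldsymbol{b}\in\mathrm{gra}\,B)$ $[\,\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{0}\Leftrightarrow\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{\mathcal{X}}\Rightarrow(a,b^*)\in\boldsymbol{Z}$ and $\eta_{\boldsymbol{a},\boldsymbol{b}}=0\,]$.
(b) $\boldsymbol{Z}=\bigcap_{\boldsymbol{a}\in\mathrm{gra}\,A}\bigcap_{\boldsymbol{b}\in\mathrm{gra}\,B}\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}$.

(iii) Let $(a_n,a_n^*)_{n\in\mathbb{N}}$ be a sequence in $\mathrm{gra}\,A$, let $(b_n,b_n^*)_{n\in\mathbb{N}}$ be a sequence in $\mathrm{gra}\,B$, and let $(x,y^*)\in\boldsymbol{\mathcal{X}}$. Suppose that $a_n\rightharpoonup x$, $b_n^*\rightharpoonup y^*$, $a_n^*+L^*b_n^*\to 0$, and $La_n-b_n\to 0$. Then $(x,y^*)\in\boldsymbol{Z}$.

Proof. Set $\boldsymbol{M}\colon\boldsymbol{\mathcal{X}}\to 2^{\boldsymbol{\mathcal{X}}^*}\colon(x,y^*)\mapsto Ax\times B^{-1}y^*$ and $\boldsymbol{S}\colon\boldsymbol{\mathcal{X}}\to\boldsymbol{\mathcal{X}}^*\colon(x,y^*)\mapsto(L^*y^*,-Lx)$. Since $A$ and $B^{-1}$ are maximally monotone, so is $\boldsymbol{M}$. On the other hand, $\boldsymbol{S}$ is linear, bounded, and positive since
\[
(\forall(x,y^*)\in\boldsymbol{\mathcal{X}})\quad\langle\boldsymbol{S}(x,y^*),(x,y^*)\rangle=\langle x,L^*y^*\rangle+\langle-Lx,y^*\rangle=0.\tag{2.1}
\]
Thus, it follows from [35, Section 17] that $\boldsymbol{S}$ is maximally monotone with $\mathrm{dom}\,\boldsymbol{S}=\boldsymbol{\mathcal{X}}$. In turn, we derive from [35, Theorem 24.1(a)] that
\[
\boldsymbol{M}+\boldsymbol{S}\ \text{is maximally monotone.}\tag{2.2}
\]
(i)(a): Let $(x,y^*)\in\boldsymbol{\mathcal{X}}$. Then
\[
\begin{aligned}
\boldsymbol{0}\in\boldsymbol{M}(x,y^*)+\boldsymbol{S}(x,y^*)
&\Leftrightarrow 0\in Ax+L^*y^*\ \text{and}\ 0\in B^{-1}y^*-Lx\\
&\Leftrightarrow-L^*y^*\in Ax\ \text{and}\ Lx\in B^{-1}y^*\\
&\Leftrightarrow(x,y^*)\in\boldsymbol{Z}.
\end{aligned}\tag{2.3}
\]
Therefore, we derive from (2.2) and [15, Lemma 1.1(a)] that $\boldsymbol{Z}=\mathrm{zer}(\boldsymbol{M}+\boldsymbol{S})=(\boldsymbol{M}+\boldsymbol{S})^{-1}(\boldsymbol{0})$ is closed and convex.

(i)(b): Let $x\in\mathcal{X}$. Then $x\in\mathscr{P}\Leftrightarrow 0\in Ax+L^*BLx\Leftrightarrow(\exists\,y^*\in\mathcal{Y}^*)\ [-L^*y^*\in Ax$ and $y^*\in BLx]\Leftrightarrow(\exists\,y^*\in\mathcal{Y}^*)\ (x,y^*)\in\boldsymbol{Z}$. Hence $\mathscr{P}\neq\varnothing\Leftrightarrow\boldsymbol{Z}\neq\varnothing$. Likewise, let $y^*\in\mathcal{Y}^*$.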
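Then $y^*\in\mathscr{D}\Leftrightarrow 0\in-LA^{-1}(-L^*y^*)+B^{-1}y^*\Leftrightarrow(\exists\,x\in\mathcal{X})\ [x\in A^{-1}(-L^*y^*)$ and $0\in-Lx+B^{-1}y^*]\Leftrightarrow(\exists\,x\in\mathcal{X})\ [-L^*y^*\in Ax$ and $Lx\in B^{-1}y^*]\Leftrightarrow(\exists\,x\in\mathcal{X})\ (x,y^*)\in\boldsymbol{Z}$.

(i)(c): Clear from (i)(b) (see also [33, Corollary 2.4]).

(ii)(a): Let $\boldsymbol{a}\in\mathrm{gra}\,A$ and $\boldsymbol{b}\in\mathrm{gra}\,B$. Then $\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{0}\Rightarrow[-L^*b^*=a^*\in Aa$ and $La=b\in B^{-1}b^*]\Rightarrow(a,b^*)\in\boldsymbol{Z}$. In addition,
\[
\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{0}\Rightarrow\eta_{\boldsymbol{a},\boldsymbol{b}}=\langle a,a^*\rangle+\langle b,b^*\rangle=\langle a,-L^*b^*\rangle+\langle La,b^*\rangle=-\langle La,b^*\rangle+\langle La,b^*\rangle=0.\tag{2.4}
\]
Thus $\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{0}\Rightarrow\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{\mathcal{X}}$. Conversely, $\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{\mathcal{X}}\Rightarrow\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}=\boldsymbol{0}\Rightarrow\eta_{\boldsymbol{a},\boldsymbol{b}}=0$.

(ii)(b): First, suppose that $\boldsymbol{x}=(x,y^*)\in\bigcap_{\boldsymbol{a}\in\mathrm{gra}\,A}\bigcap_{\boldsymbol{b}\in\mathrm{gra}\,B}\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}$. Then
\[
\begin{aligned}
(\forall\boldsymbol{a}\in\mathrm{gra}\,A)(\forall\boldsymbol{b}\in\mathrm{gra}\,B)\quad
&\langle(a,b^*)-(x,y^*),(a^*,b)-(-L^*y^*,Lx)\rangle\\
&=\langle(a-x,b^*-y^*),(a^*+L^*y^*,b-Lx)\rangle\\
&=\langle a-x,a^*+L^*y^*\rangle+\langle b-Lx,b^*-y^*\rangle\\
&=\eta_{\boldsymbol{a},\boldsymbol{b}}-\langle\boldsymbol{x},\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}\rangle\\
&\geq 0.
\end{aligned}\tag{2.5}
\]
On the other hand, since
\[
\{((a,b^*),(a^*,b))\mid\boldsymbol{a}\in\mathrm{gra}\,A,\ \boldsymbol{b}\in\mathrm{gra}\,B\}=\mathrm{gra}\,\boldsymbol{M},\tag{2.6}
\]
it follows from (2.2) and (2.5) that $((x,y^*),(-L^*y^*,Lx))\in\mathrm{gra}\,\boldsymbol{M}$, i.e., $\boldsymbol{x}\in\boldsymbol{Z}$. Thus
\[
\bigcap_{\boldsymbol{a}\in\mathrm{gra}\,A}\bigcap_{\boldsymbol{b}\in\mathrm{gra}\,B}\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}\subset\boldsymbol{Z}.\tag{2.7}
\]
Conversely, let $\boldsymbol{a}\in\mathrm{gra}\,A$, let $\boldsymbol{b}\in\mathrm{gra}\,B$, and let $(x,y^*)\in\boldsymbol{Z}$. Then $(x,-L^*y^*)\in\mathrm{gra}\,A$ and $(Lx,y^*)\in\mathrm{gra}\,B$. Since $A$ and $B$ are monotone, we obtain
\[
\langle a-x,a^*+L^*y^*\rangle\geq 0\quad\text{and}\quad\langle b-Lx,b^*-y^*\rangle\geq 0.\tag{2.8}
\]
Adding these two inequalities yields
\[
\langle x-a,a^*+L^*y^*\rangle+\langle Lx-b,b^*-y^*\rangle\leq 0\tag{2.9}
\]
and, therefore,
\[
\begin{aligned}
\langle\boldsymbol{x},\boldsymbol{s}^*_{\boldsymbol{a},\boldsymbol{b}}\rangle
&=\langle x,a^*+L^*b^*\rangle+\langle b-La,y^*\rangle\\
&=\langle x,a^*+L^*y^*\rangle+\langle Lx,b^*-y^*\rangle+\langle b-Lx,y^*\rangle+\langle x-a,L^*y^*\rangle\\
&=\langle x-a,a^*+L^*y^*\rangle+\langle a,a^*\rangle+\langle La,y^*\rangle+\langle Lx-b,b^*-y^*\rangle+\langle b,b^*\rangle-\langle b,y^*\rangle\\
&\quad+\langle b-Lx,y^*\rangle+\langle x-a,L^*y^*\rangle\\
&\leq\langle a,a^*\rangle+\langle b,b^*\rangle+\langle La-b,y^*\rangle+\langle b-Lx,y^*\rangle+\langle x-a,L^*y^*\rangle\\
&=\langle a,a^*\rangle+\langle b,b^*\rangle\\
&=\eta_{\boldsymbol{a},\boldsymbol{b}}.
\end{aligned}\tag{2.10}
\]
This implies that $(x,y^*)\in\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}$.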
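Hence, $\boldsymbol{Z}\subset\boldsymbol{H}_{\boldsymbol{a},\boldsymbol{b}}$.

(iii): Set $(\forall n\in\mathbb{N})$ $\boldsymbol{x}_n=(a_n,b_n^*)$ and $\boldsymbol{x}_n^*=(a_n^*+L^*b_n^*,b_n-La_n)$. Then $\boldsymbol{x}_n\rightharpoonup(x,y^*)$, $\boldsymbol{x}_n^*\to\boldsymbol{0}$, and $(\forall n\in\mathbb{N})$ $(\boldsymbol{x}_n,\boldsymbol{x}_n^*)\in\mathrm{gra}(\boldsymbol{M}+\boldsymbol{S})$. However, it follows from (2.2) that $\mathrm{gra}(\boldsymbol{M}+\boldsymbol{S})$ is sequentially closed in $\boldsymbol{\mathcal{X}}^{\text{weak}}\times\boldsymbol{\mathcal{X}}^{*\,\text{strong}}$ [15, Lemma 1.2]. Hence, $\boldsymbol{0}\in(\boldsymbol{M}+\boldsymbol{S})(x,y^*)$, i.e., by (2.3), $(x,y^*)\in\boldsymbol{Z}$.

Proposition 2.2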
Consider the setting of Problem 1.2. Then the following hold:
(i) $\boldsymbol{f}$ is a Legendre function.
(ii) The solution $(\bar{x},\bar{y}^*)$ to Problem 1.2 exists and is unique.

Proof. (i): Since $f$ and $g$ are Legendre functions, so are $f^*$ and $g^*$ [6, Corollary 5.5]. Therefore, it follows from [6, Theorem 5.6(iii)] that $\partial f$ and $\partial g^*$ are single-valued on $\mathrm{dom}\,\partial f=\mathrm{int\,dom}\,f$ and $\mathrm{dom}\,\partial g^*=\mathrm{int\,dom}\,g^*$, respectively. On the other hand, we derive from (1.10) that $\mathrm{dom}\,\boldsymbol{f}=\mathrm{dom}\,f\times\mathrm{dom}\,g^*$ and that
\[
\partial\boldsymbol{f}\colon\boldsymbol{\mathcal{X}}\to 2^{\boldsymbol{\mathcal{X}}^*}\colon(x,y^*)\mapsto\partial f(x)\times\partial g^*(y^*).\tag{2.11}
\]
Thus, $\partial\boldsymbol{f}$ is single-valued on
\[
\mathrm{dom}\,\partial\boldsymbol{f}=\mathrm{dom}\,\partial f\times\mathrm{dom}\,\partial g^*=\mathrm{int\,dom}\,f\times\mathrm{int\,dom}\,g^*=\mathrm{int}(\mathrm{dom}\,f\times\mathrm{dom}\,g^*)=\mathrm{int\,dom}\,\boldsymbol{f}.\tag{2.12}
\]
Likewise, since
\[
\boldsymbol{f}^*\colon\boldsymbol{\mathcal{X}}^*\to\,]-\infty,+\infty]\colon(x^*,y)\mapsto f^*(x^*)+g(y),\tag{2.13}
\]
we deduce that $\partial\boldsymbol{f}^*$ is single-valued on $\mathrm{dom}\,\partial\boldsymbol{f}^*=\mathrm{int\,dom}\,\boldsymbol{f}^*$. Consequently, [6, Theorems 5.4 and 5.6] assert that $\boldsymbol{f}$ is a Legendre function.

(ii): It follows from Proposition 2.1(i)(a) that $\boldsymbol{Z}$ is a closed convex subset of $\boldsymbol{\mathcal{X}}$. Hence, since $\boldsymbol{Z}\cap\mathrm{int\,dom}\,\boldsymbol{f}\neq\varnothing$, we derive from (i) and [6, Corollary 7.9] that $(\bar{x},\bar{y}^*)=P^{\boldsymbol{f}}_{\boldsymbol{Z}}(x_0,y_0^*)$ is uniquely defined.

The approach we present goes back to Haugazeau's algorithm [23, Théorème 3-2] (see also [9, Theorem 29.3]) for projecting a point onto the intersection of closed convex sets in a Hilbert space using the projections onto the individual sets. The method was extended in [18] to minimize certain convex functions over the intersection of closed convex sets in Banach spaces. The adaptation to the problem of finding the best Bregman approximation from a closed convex set was investigated in [8].
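As a point of reference for the hilbertian scheme just mentioned, here is a minimal sketch of a Haugazeau-type iteration in the Euclidean space $\mathbb{R}^N$, where the Bregman kernel is $\|\cdot\|^2/2$ and Bregman projections are ordinary projections. For points $x$, $y$, $z$ it writes $H(x,y)=\{u\mid\langle u-y,x-y\rangle\leq 0\}$ and computes the projection of $x$ onto $H(x,y)\cap H(y,z)$ with the classical closed-form expression for this Euclidean case; that formula, the function names, the cyclic activation of the sets, and the two half-spaces in the usage note are assumptions of this illustration and are not taken from the text above.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def haugazeau_q(x, y, z):
    """Projection of x onto H(x, y) ∩ H(y, z) in R^N (Euclidean case).

    H(x, y) = {u : <u - y, x - y> <= 0} is the half-space onto which
    y is the projection of x. Raises ValueError if the intersection is empty.
    """
    xy = [a - b for a, b in zip(x, y)]
    yz = [a - b for a, b in zip(y, z)]
    pi, mu, nu = dot(xy, yz), dot(xy, xy), dot(yz, yz)
    rho = mu * nu - pi * pi  # nonnegative in exact arithmetic (Cauchy-Schwarz)
    if rho <= 0.0:
        if pi < 0.0:
            raise ValueError("H(x, y) and H(y, z) do not intersect")
        return list(z)
    if pi * nu >= rho:
        t = 1.0 + pi / nu
        return [a + t * (c - b) for a, b, c in zip(x, y, z)]
    s = nu / rho
    return [b + s * (pi * (a - b) + mu * (c - b)) for a, b, c in zip(x, y, z)]


def project_halfspace(u, w, beta):
    """Projection of u onto the half-space {v : <v, w> >= beta}, with w != 0."""
    gap = beta - dot(u, w)
    if gap <= 0.0:
        return list(u)
    t = gap / dot(w, w)
    return [a + t * b for a, b in zip(u, w)]


def haugazeau(x0, projectors, n_iter=50):
    """Approximate the projection of x0 onto an intersection of closed convex sets.

    Each step projects the current iterate onto one set (cyclically) and
    then projects x0 onto the intersection of the two resulting half-spaces.
    """
    x = list(x0)
    for n in range(n_iter):
        z = projectors[n % len(projectors)](x)
        x = haugazeau_q(x0, x, z)
    return x
```

For instance, with the two hypothetical half-spaces $\{v\mid v_1\geq 1\}$ and $\{v\mid v_1+v_2\geq 3\}$ in $\mathbb{R}^2$, one can pass `[lambda u: project_halfspace(u, [1.0, 0.0], 1.0), lambda u: project_halfspace(u, [1.0, 1.0], 3.0)]` as `projectors` and start from the origin. Unlike plain alternating projections, each update is anchored at the reference point `x0`, which is what steers the iterates toward the best approximation rather than toward an arbitrary point of the intersection.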
Definition 2.3 [7, Definition 3.1] and [8, Section 3] Let $\mathcal{X}$ be a reflexive real Banach space, let $f\in\Gamma_0(\mathcal{X})$ be a Legendre function, let $x_0\in\mathrm{int\,dom}\,f$, let $x\in\mathrm{int\,dom}\,f$, and let $y\in\mathrm{int\,dom}\,f$. Then
\[
\begin{aligned}
H^f(x,y)&=\{z\in\mathcal{X}\mid\langle z-y,\nabla f(x)-\nabla f(y)\rangle\leq 0\}\\
&=\{z\in\mathcal{X}\mid D^f(z,y)+D^f(y,x)\leq D^f(z,x)\}
\end{aligned}\tag{2.14}
\]
is the closed affine half-space onto which $y$ is the Bregman projection of $x$ if $x\neq y$. Moreover, if $H^f(x_0,x)\cap H^f(x,y)\cap\mathrm{int\,dom}\,f\neq\varnothing$, then
\[
Q^f(x_0,x,y)=P^f_{H^f(x_0,x)\cap H^f(x,y)}\,x_0.\tag{2.15}
\]

Lemma 2.4 [7, Lemma 3.2]
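Let $\mathcal{X}$ be a reflexive real Banach space, and let $C_1$ and $C_2$ be convex subsets of $\mathcal{X}$ such that $C_1$ is closed and $C_1\cap\mathrm{int}\,C_2\neq\varnothing$. Then $\overline{C_1\cap\mathrm{int}\,C_2}=C_1\cap\overline{C_2}$.

Proposition 2.5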
Let $\mathcal{X}$ be a reflexive real Banach space, let $f\in\Gamma_0(\mathcal{X})$ be a Legendre function, let $C$ be a closed convex subset of $\overline{\mathrm{dom}}\,f$ such that $C\cap\mathrm{int\,dom}\,f\neq\varnothing$, let $x_0\in\mathrm{int\,dom}\,f$, and set $\bar{x}=P^f_Cx_0$. At every iteration $n\in\mathbb{N}$, find $x_{n+1/2}\in\mathrm{int\,dom}\,f$ such that $C\subset H^f(x_n,x_{n+1/2})$ and set
\[
x_{n+1}=Q^f\big(x_0,x_n,x_{n+1/2}\big).\tag{2.16}
\]
Then the following hold:
(i) $(\forall n\in\mathbb{N})$ $C\subset H^f(x_0,x_n)$.
(ii) $(x_n)_{n\in\mathbb{N}}$ is a well-defined bounded sequence in $\mathrm{int\,dom}\,f$.
(iii) Suppose that, for some $n\in\mathbb{N}$, $x_n\in C$. Then $(\forall k\in\mathbb{N})$ $x_{n+k}=\bar{x}$.
(iv) $\sum_{n\in\mathbb{N}}D^f(x_{n+1},x_n)<+\infty$.
(v) $(\forall n\in\mathbb{N})$ $D^f(x_{n+1/2},x_n)\leq D^f(x_{n+1},x_n)$.
(vi) $\sum_{n\in\mathbb{N}}D^f(x_{n+1/2},x_n)<+\infty$.
(vii) $[\,x_n\rightharpoonup\bar{x}$ and $f(x_n)\to f(\bar{x})\,]\Leftrightarrow D^f(x_n,\bar{x})\to 0\Leftrightarrow\mathfrak{W}(x_n)_{n\in\mathbb{N}}\subset C$.

Proof. Item (i) is found in [8, Proof of Proposition 3.3]. The first equivalence in (vii) follows from [8, Proposition 2.2(ii)]. To establish the remaining assertions, set
\[
(\forall n\in\mathbb{N})\quad T_n=P^f_{H^f(x_n,x_{n+1/2})}.\tag{2.17}
\]
Then, for every $n\in\mathbb{N}$, Definition 2.3 yields $T_nx_n=x_{n+1/2}$, [7, Proposition 3.32(ii)(b)] yields $\mathrm{Fix}\,T_n=H^f(x_n,x_{n+1/2})\cap\mathrm{int\,dom}\,f$, and we derive from Lemma 2.4 that
\[
C\subset H^f(x_n,x_{n+1/2})\cap\overline{\mathrm{dom}}\,f=\overline{H^f(x_n,x_{n+1/2})\cap\mathrm{int\,dom}\,f}=\overline{\mathrm{Fix}\,T_n}.\tag{2.18}
\]
On the other hand,
\[
(\forall n\in\mathbb{N})\quad\varnothing\neq C\cap\mathrm{int\,dom}\,f\subset H^f(x_n,x_{n+1/2})\cap\mathrm{int\,dom}\,f=\mathrm{Fix}\,T_n\tag{2.19}
\]
and, therefore, $\bigcap_{n\in\mathbb{N}}\mathrm{Fix}\,T_n\neq\varnothing$. Altogether, [8, Condition 3.2] is satisfied and it follows from [8, Propositions 3.3 and 3.4] and [8, Proof of Proposition 3.4(vii)] that the proof is complete.

Definition 2.6
Let $\mathcal{X}$ be a reflexive real Banach space such that $\mathcal{X}\neq\{0\}$ and let $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$. Then $M$ is coercive if
\[
(\exists\,z\in\mathrm{dom}\,M)\quad\lim_{\|x\|\to+\infty}\frac{\inf\langle x-z,Mx\rangle}{\|x\|}=+\infty,\tag{2.20}
\]
and it is bounded if it maps bounded sets to bounded sets.

Lemma 2.7
Let $\mathcal{X}$ be a reflexive real Banach space such that $\mathcal{X}\neq\{0\}$ and let $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$. Suppose that one of the following holds:
(i) $\mathrm{dom}\,M$ is nonempty and bounded.
(ii) $M$ is uniformly monotone at some point $z\in\mathrm{dom}\,M$ with a supercoercive modulus: there exists a strictly increasing function $\phi\colon[0,+\infty[\,\to[0,+\infty]$ that vanishes only at $0$ such that $\lim_{t\to+\infty}\phi(t)/t=+\infty$ and
\[
(\forall(x,x^*)\in\mathrm{gra}\,M)(\forall z^*\in Mz)\quad\langle x-z,x^*-z^*\rangle\geq\phi(\|x-z\|).\tag{2.21}
\]
(iii) $M=\partial\varphi$, where $\varphi$ is a supercoercive function in $\Gamma_0(\mathcal{X})$.
Then $M$ is coercive.

Proof. (i): Let $x\in\mathcal{X}$ and let $z\in\mathrm{dom}\,M$. Then, if $\|x\|$ is sufficiently large, we have $Mx=\varnothing$ and therefore $\inf\langle x-z,Mx\rangle/\|x\|=+\infty$.

(ii): We have
\[
(\forall(x,x^*)\in\mathrm{gra}\,M)(\forall z^*\in Mz)\quad\langle x-z,x^*-z^*\rangle\geq\phi(\|x-z\|).\tag{2.22}
\]
Hence, for every $x\in\mathrm{dom}\,M$ such that $\|x\|>\|z\|$, we have
\[
(\forall x^*\in Mx)(\forall z^*\in Mz)\quad\frac{\langle x-z,x^*\rangle}{\|x\|}\geq\frac{\phi(\|x-z\|)-\|x-z\|\,\|z^*\|}{\|x\|}=\frac{\|x-z\|}{\|x\|}\left(\frac{\phi(\|x-z\|)}{\|x-z\|}-\|z^*\|\right).\tag{2.23}
\]
Thus,
\[
\lim_{\|x\|\to+\infty}\frac{\inf\langle x-z,Mx\rangle}{\|x\|}=+\infty.\tag{2.24}
\]
(iii): In view of (i), we suppose that $\mathrm{dom}\,M$ is unbounded. Let $z\in\mathrm{dom}\,M$. Then we derive from (1.3) that, for every $x\in\mathrm{dom}\,M\smallsetminus\{0\}$,
\[
(\forall x^*\in Mx)\quad\frac{\varphi(x)-\varphi(z)}{\|x\|}\leq\frac{\langle x-z,x^*\rangle}{\|x\|}.\tag{2.25}
\]
Hence, the supercoercivity of $\varphi$ yields
\[
\lim_{\|x\|\to+\infty}\frac{\inf\langle x-z,Mx\rangle}{\|x\|}=+\infty\tag{2.26}
\]
and $M$ is therefore coercive.

Lemma 2.8
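Let $\mathcal{X}$ be a reflexive real Banach space such that $\mathcal{X}\neq\{0\}$, let $M_1\colon\mathcal{X}\to 2^{\mathcal{X}^*}$, and let $M_2\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ be monotone. Suppose that there exists $z\in\mathrm{dom}\,M_1\cap\mathrm{dom}\,M_2$ such that
\[
\lim_{\|x\|\to+\infty}\frac{\inf\langle x-z,M_1x\rangle}{\|x\|}=+\infty.\tag{2.27}
\]
Then $M_1+M_2$ is coercive.

Proof. Suppose that $x\in(\mathrm{dom}\,M_1\cap\mathrm{dom}\,M_2)\smallsetminus\{0\}$, let $x^*\in(M_1+M_2)x$, and let $z^*\in(M_1+M_2)z$. Then there exist $x_1^*\in M_1x$, $x_2^*\in M_2x$, $z_1^*\in M_1z$, and $z_2^*\in M_2z$ such that $x^*=x_1^*+x_2^*$ and $z^*=z_1^*+z_2^*$. In turn, the monotonicity of $M_2$ yields
\[
\begin{aligned}
\frac{\langle x-z,x^*\rangle}{\|x\|}
&=\frac{\langle x-z,x_1^*-z_1^*\rangle}{\|x\|}+\frac{\langle x-z,x_2^*-z_2^*\rangle}{\|x\|}+\frac{\langle x-z,z^*\rangle}{\|x\|}\\
&\geq\frac{\langle x-z,x_1^*\rangle}{\|x\|}+\frac{\langle x-z,z_2^*\rangle}{\|x\|}\\
&\geq\frac{\langle x-z,x_1^*\rangle-\|x-z\|\,\|z_2^*\|}{\|x\|}
\end{aligned}\tag{2.28}
\]
and (2.27) implies that $M_1+M_2$ is coercive.

Lemma 2.9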
Let $\mathcal{X}$ be a reflexive real Banach space such that $\mathcal{X}\neq\{0\}$, let $M_1\colon\mathcal{X}\to 2^{\mathcal{X}^*}$, let $M_2\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ be monotone, let $(x_n^*)_{n\in\mathbb{N}}$ be a bounded sequence in $\mathcal{X}^*$, and let $(\gamma_n)_{n\in\mathbb{N}}$ be a bounded sequence in $]0,+\infty[$. Suppose that there exists $z\in\mathrm{dom}\,M_1\cap\mathrm{dom}\,M_2$ such that
\[
\lim_{\|x\|\to+\infty}\frac{\inf\langle x-z,M_1x\rangle}{\|x\|}=+\infty,\tag{2.29}
\]
and that
\[
(\forall n\in\mathbb{N})\quad x_n\in(M_1+\gamma_nM_2)^{-1}x_n^*.\tag{2.30}
\]
Then $(x_n)_{n\in\mathbb{N}}$ is bounded.

Proof. Set $\beta=\sup_{n\in\mathbb{N}}\|x_n^*\|$ and $\sigma=\sup_{n\in\mathbb{N}}\gamma_n$. It follows from (2.30) that, for every $n\in\mathbb{N}$, there exist $a_n^*\in M_1x_n$ and $b_n^*\in M_2x_n$ such that $x_n^*=a_n^*+\gamma_nb_n^*$. If $(x_n)_{n\in\mathbb{N}}$ is unbounded, there exists a strictly increasing sequence $(k_n)_{n\in\mathbb{N}}$ in $\mathbb{N}$ such that $0<\|x_{k_n}\|\uparrow+\infty$. Therefore, (2.29) yields
\[
\lim_{n\to+\infty}\frac{\langle x_{k_n}-z,a_{k_n}^*\rangle}{\|x_{k_n}\|}=+\infty.\tag{2.31}
\]
Now let $z^*\in M_2z$. By monotonicity of $M_2$, $(\forall n\in\mathbb{N})$ $\langle x_{k_n}-z,b_{k_n}^*-z^*\rangle\geq 0$. Hence, (2.31) implies that
\[
\begin{aligned}
\beta\left(1+\frac{\|z\|}{\|x_{k_0}\|}\right)
&\geq\beta\left(1+\frac{\|z\|}{\|x_{k_n}\|}\right)\\
&\geq\beta\,\frac{\|x_{k_n}-z\|}{\|x_{k_n}\|}\\
&\geq\frac{\langle x_{k_n}-z,x_{k_n}^*\rangle}{\|x_{k_n}\|}\\
&=\frac{\langle x_{k_n}-z,a_{k_n}^*\rangle}{\|x_{k_n}\|}+\gamma_{k_n}\frac{\langle x_{k_n}-z,b_{k_n}^*-z^*\rangle}{\|x_{k_n}\|}+\gamma_{k_n}\frac{\langle x_{k_n}-z,z^*\rangle}{\|x_{k_n}\|}\\
&\geq\frac{\langle x_{k_n}-z,a_{k_n}^*\rangle}{\|x_{k_n}\|}-\sigma\,\frac{\|x_{k_n}-z\|\,\|z^*\|}{\|x_{k_n}\|}\\
&\geq\frac{\langle x_{k_n}-z,a_{k_n}^*\rangle}{\|x_{k_n}\|}-\sigma\|z^*\|\left(1+\frac{\|z\|}{\|x_{k_0}\|}\right)\\
&\to+\infty,
\end{aligned}\tag{2.32}
\]
and we reach a contradiction.

Corollary 2.10 Let $\mathcal{X}$ be a reflexive real Banach space such that $\mathcal{X}\neq\{0\}$ and let $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ be coercive. Then $M^{-1}$ is bounded.

Proof. Take $M_1=M$ and $M_2=0$ in Lemma 2.9.

Proposition 2.11
Let $\mathcal{X}$ be a reflexive real Banach space such that $\mathcal{X}\neq\{0\}$, let $h\in\Gamma_0(\mathcal{X})$ be essentially smooth, and let $M\colon\mathcal{X}\to 2^{\mathcal{X}^*}$ be such that $\mathrm{dom}\,M\cap\mathrm{int\,dom}\,h\neq\varnothing$. Suppose that one of the following holds:
(i) $\mathrm{dom}\,M\cap\mathrm{int\,dom}\,h$ is bounded.
(ii) There exists $z\in\mathrm{dom}\,M\cap\mathrm{int\,dom}\,h$ such that
\[
\lim_{\|x\|\to+\infty}\frac{\inf\langle x-z,Mx\rangle}{\|x\|}=+\infty.\tag{2.33}
\]
(iii) $M$ is uniformly monotone at a point $z\in\mathrm{dom}\,M\cap\mathrm{int\,dom}\,h$ with a supercoercive modulus.
(iv) $M$ is monotone and $h$ is supercoercive.
(v) $M$ is monotone and $h$ is uniformly convex at a point $z\in\mathrm{dom}\,M\cap\mathrm{int\,dom}\,h$, i.e., there exists an increasing function $\phi\colon[0,+\infty[\,\to[0,+\infty]$ that vanishes only at $0$ such that
\[
(\forall y\in\mathrm{dom}\,h)(\forall\alpha\in\,]0,1[)\quad h(\alpha y+(1-\alpha)z)+\alpha(1-\alpha)\phi(\|y-z\|)\leq\alpha h(y)+(1-\alpha)h(z).\tag{2.34}
\]
Then $\nabla h+M$ is coercive. If, in addition, $M$ is maximally monotone, then $\mathrm{dom}(\nabla h+M)^{-1}=\mathcal{X}^*$.

Proof. We first observe that [35, Theorem 18.7] and [6, Theorem 5.6] imply that $\nabla h$ is maximally monotone and that $\mathrm{dom}\,\nabla h=\mathrm{int\,dom}\,h$.

(i): Lemma 2.7(i).

(ii): It follows from Lemma 2.8 that $\nabla h+M$ is coercive.

(iii): Since $\nabla h+M$ is uniformly monotone at $z$ with a supercoercive modulus, the claim follows from Lemma 2.7(ii).

(iv): Let $z\in\mathrm{dom}\,M\cap\mathrm{int\,dom}\,h=\mathrm{dom}\,M\cap\mathrm{dom}\,\partial h$. Then we derive from (2.26) that
\[
\lim_{\|x\|\to+\infty}\frac{\langle x-z,\nabla h(x)\rangle}{\|x\|}=+\infty.\tag{2.35}
\]
Thus, $\nabla h$ satisfies (2.27) and it follows from Lemma 2.8 that $\nabla h+M$ is coercive.

(v): It follows from [39, Definition 2.2 and Remark 2.8] that $\nabla h$ is uniformly monotone at $z$ with a supercoercive modulus. Hence, $\nabla h+M$ is likewise and Lemma 2.7(ii) implies that $\nabla h+M$ is coercive. Alternatively, this is a special case of (iv).

Finally, suppose that $M$ is maximally monotone. Then [35, Theorem 24.1(a)] asserts that $\nabla h+M$ is maximally monotone. Consequently, since $\nabla h+M$ is coercive, it follows from [42, Corollary II-B.32.35] that $\mathrm{dom}(\nabla h+M)^{-1}=\mathrm{ran}(\nabla h+M)=\mathcal{X}^*$.

Lemma 2.12
Let $\mathcal{X}$ and $\mathcal{Y}$ be real Banach spaces, let $D\subset\mathcal{X}$ be a nonempty open set, and let $C$ be a nonempty bounded convex subset of $D$. Suppose that $T\colon D\to\mathcal{Y}$ is uniformly continuous on $C$ in the sense that
\[
(\forall\varepsilon\in\,]0,+\infty[)(\exists\,\delta\in\,]0,+\infty[)(\forall x\in C)(\forall y\in C)\quad\|x-y\|\leq\delta\ \Rightarrow\ \|Tx-Ty\|\leq\varepsilon.\tag{2.36}
\]
Then $T$ is bounded on $C$.

Proof. In view of (2.36), there exists $\delta\in\,]0,+\infty[$ such that
\[
(\forall x\in C)(\forall y\in C)\quad\|x-y\|\leq\delta\ \Rightarrow\ \|Tx-Ty\|\leq 1.\tag{2.37}
\]
Now fix $z\in C$ and $\rho\in\,]0,+\infty[$ such that $C\subset\{x\in\mathcal{X}\mid\|x-z\|\leq\rho\}$, and take an integer $m\geq\rho/\delta$. Let $x\in C$ and set
\[
(\forall n\in\{0,\ldots,m\})\quad x_n=x+\frac{n}{m}(z-x)\in C.\tag{2.38}
\]
Then, for every $n\in\{0,\ldots,m-1\}$, $\|x_{n+1}-x_n\|=\|z-x\|/m\leq\rho/m\leq\delta$ and (2.37) yields $\|Tx_{n+1}-Tx_n\|\leq 1$. Hence, $\|Tz-Tx\|\leq\sum_{n=0}^{m-1}\|Tx_{n+1}-Tx_n\|\leq m$ and therefore $\|Tx\|\leq\|Tz\|+m$. We conclude that $\sup_{x\in C}\|Tx\|\leq\|Tz\|+m$.

Proposition 2.1(i)(a) asserts that Problem 1.2 reduces to finding the Bregman projection of a reference point $(x_0,y_0^*)$ onto the closed convex subset $\boldsymbol{C}=\boldsymbol{Z}\cap\overline{\mathrm{dom}}\,\boldsymbol{f}$ of $\overline{\mathrm{dom}}\,\boldsymbol{f}$. Our strategy is to employ Proposition 2.5 for this task. The following condition will be used subsequently (see [7, Examples 4.10, 5.11, and 5.13] for special cases).

Condition 3.1 [8, Condition 4.3(ii)] Let $\mathcal{X}$ be a reflexive real Banach space and let $h\in\Gamma_0(\mathcal{X})$ be Gâteaux differentiable on $\mathrm{int\,dom}\,h\neq\varnothing$. For every sequence $(x_n)_{n\in\mathbb{N}}$ in $\mathrm{int\,dom}\,h$ and every bounded sequence $(y_n)_{n\in\mathbb{N}}$ in $\mathrm{int\,dom}\,h$,
\[
D^h(x_n,y_n)\to 0\ \Rightarrow\ x_n-y_n\to 0.\tag{3.1}
\]
We now derive from Proposition 2.5 our best Bregman approximation algorithm to solve Problem 1.2.

Theorem 3.2
Consider the setting of Problem 1.2. Let $h\in\Gamma_0(\mathcal{X})$ and $j\in\Gamma_0(\mathcal{Y})$ be Legendre functions such that $\mathrm{int\,dom}\,f\subset\mathrm{int\,dom}\,h$, $L(\mathrm{int\,dom}\,f)\subset\mathrm{int\,dom}\,j$, and there exist $\varepsilon$ and $\delta$ in $]0,+\infty[$ such that $\nabla h+\varepsilon A$ and $\nabla j+\delta B$ are coercive. Let $\sigma\in[\max\{\varepsilon,\delta\},+\infty[$ and iterate
\[
\begin{array}{l}
\text{for } n=0,1,\ldots\\
\left\lfloor
\begin{array}{l}
(\gamma_n,\mu_n)\in[\varepsilon,\sigma]\times[\delta,\sigma]\\
a_n=(\nabla h+\gamma_nA)^{-1}\big(\nabla h(x_n)-\gamma_nL^*y_n^*\big)\\
a_n^*=\gamma_n^{-1}\big(\nabla h(x_n)-\nabla h(a_n)\big)-L^*y_n^*\\
b_n=(\nabla j+\mu_nB)^{-1}\big(\nabla j(Lx_n)+\mu_ny_n^*\big)\\
b_n^*=\mu_n^{-1}\big(\nabla j(Lx_n)-\nabla j(b_n)\big)+y_n^*\\
\boldsymbol{H}_n=\big\{(x,y^*)\in\boldsymbol{\mathcal{X}}\mid\langle x,a_n^*+L^*b_n^*\rangle+\langle b_n-La_n,y^*\rangle\leq\langle a_n,a_n^*\rangle+\langle b_n,b_n^*\rangle\big\}\\
(x_{n+1/2},y_{n+1/2}^*)=P^{\boldsymbol{f}}_{\boldsymbol{H}_n}(x_n,y_n^*)\\
(x_{n+1},y_{n+1}^*)=Q^{\boldsymbol{f}}\big((x_0,y_0^*),(x_n,y_n^*),(x_{n+1/2},y_{n+1/2}^*)\big).
\end{array}
\right.
\end{array}\tag{3.2}
\]
Then the following hold:

(i) Let $n\in\mathbb{N}$. Then the following are equivalent:
(a) $(x_n,y_n^*)=(\bar{x},\bar{y}^*)$.
(b) $(x_n,y_n^*)\in\boldsymbol{Z}$.
(c) $(x_n,y_n^*)\in\boldsymbol{H}_n$.
(d) $x_n=a_n$ and $y_n^*=b_n^*$.
(e) $La_n=b_n$ and $a_n^*=-L^*b_n^*$.
(f) $\boldsymbol{H}_n=\boldsymbol{\mathcal{X}}$.
(g) $(x_{n+1/2},y_{n+1/2}^*)=(x_n,y_n^*)$.
(h) $(x_{n+1},y_{n+1}^*)=(x_n,y_n^*)$.

(ii) $\sum_{n\in\mathbb{N}}D^f(x_{n+1},x_n)<+\infty$ and $\sum_{n\in\mathbb{N}}D^{g^*}(y_{n+1}^*,y_n^*)<+\infty$.

(iii) $\sum_{n\in\mathbb{N}}D^f(x_{n+1/2},x_n)<+\infty$ and $\sum_{n\in\mathbb{N}}D^{g^*}(y_{n+1/2}^*,y_n^*)<+\infty$.

(iv) Suppose that $f$, $g^*$, $h$, and $j$ satisfy Condition 3.1, and that $\nabla h$ and $\nabla j$ are uniformly continuous on every bounded subset of $\mathrm{int\,dom}\,h$ and $\mathrm{int\,dom}\,j$, respectively. Then $x_n\to\bar{x}$ and $y_n^*\to\bar{y}^*$.

Proof. We apply Proposition 2.5 to
\[
\boldsymbol{C}=\boldsymbol{Z}\cap\overline{\mathrm{dom}}\,\boldsymbol{f}.\tag{3.3}
\]
It follows from Proposition 2.1(i)(a) and our assumptions that $\boldsymbol{C}$ is a closed convex subset of $\overline{\mathrm{dom}}\,\boldsymbol{f}$ and that $\boldsymbol{C}\cap\mathrm{int\,dom}\,\boldsymbol{f}\neq\varnothing$. Moreover, Proposition 2.2(i) asserts that $\boldsymbol{f}$ is a Legendre function. Now let $\gamma\in[\varepsilon,+\infty[$ and let $\mu\in[\delta,+\infty[$. Since $h$ is strictly convex, $\nabla h$ is strictly monotone [40, Theorem 2.4.4(ii)] and $\nabla h+\gamma A$ is likewise. Let $(x^*,x_1)$ and $(x^*,x_2)$ be two elements in $\mathrm{gra}(\nabla h+\gamma A)^{-1}$ such that $x_1\neq x_2$. Then $(x_1,x^*)$ and $(x_2,x^*)$ lie in $\mathrm{gra}(\nabla h+\gamma A)$ and the strict monotonicity of $\nabla h+\gamma A$ implies that
\[
0=\langle x_1-x_2,x^*-x^*\rangle>0,\tag{3.4}
\]
which is impossible. Thus,
\[
(\nabla h+\gamma A)^{-1}\ \text{is at most single-valued.}\tag{3.5}
\]
The same argument shows that
\[
(\nabla j+\mu B)^{-1}\ \text{is at most single-valued.}\tag{3.6}
\]
On the other hand, by assumption, there exists $(x,y^*)\in\boldsymbol{Z}\cap\mathrm{int\,dom}\,\boldsymbol{f}$. It follows from (1.9) that $x\in\mathrm{dom}\,A$ and $Lx\in\mathrm{dom}\,B$. Furthermore, (2.12) yields $x\in\mathrm{int\,dom}\,f$. Therefore
\[
\begin{cases}
x\in\mathrm{dom}\,A\cap\mathrm{int\,dom}\,f\subset\mathrm{dom}\,A\cap\mathrm{int\,dom}\,h\\
Lx\in\mathrm{dom}\,B\cap L(\mathrm{int\,dom}\,f)\subset\mathrm{dom}\,B\cap\mathrm{int\,dom}\,j.
\end{cases}\tag{3.7}
\]
Thus, $\mathrm{dom}\,A\cap\mathrm{int\,dom}\,h\neq\varnothing$ and $\mathrm{dom}\,B\cap\mathrm{int\,dom}\,j\neq\varnothing$. It therefore follows from Lemma 2.8 that
\[
\nabla h+\gamma A=(\nabla h+\varepsilon A)+(\gamma-\varepsilon)A\quad\text{and}\quad\nabla j+\mu B=(\nabla j+\delta B)+(\mu-\delta)B\quad\text{are coercive.}\tag{3.8}
\]
Altogether, (3.5), (3.6), (3.8), and Proposition 2.11 assert that the operators
\[
\begin{cases}
(\nabla h+\gamma A)^{-1}\colon\mathcal{X}^*\to\mathrm{dom}\,A\cap\mathrm{int\,dom}\,h\\
(\nabla j+\mu B)^{-1}\colon\mathcal{Y}^*\to\mathrm{dom}\,B\cap\mathrm{int\,dom}\,j
\end{cases}\tag{3.9}
\]
are well defined and single-valued. Now set
\[
(\forall n\in\mathbb{N})\quad\boldsymbol{x}_n=(x_n,y_n^*)\quad\text{and}\quad\boldsymbol{x}_{n+1/2}=(x_{n+1/2},y_{n+1/2}^*).\tag{3.10}
\]
Since (3.2) yields
\[
(\forall n\in\mathbb{N})\quad(a_n,a_n^*)\in\mathrm{gra}\,A\quad\text{and}\quad(b_n,b_n^*)\in\mathrm{gra}\,B,\tag{3.11}
\]
it follows from (3.3), Proposition 2.1(ii)(b), (3.2), and Definition 2.3 that
\[
(\forall n\in\mathbb{N})\quad\varnothing\neq\boldsymbol{C}\subset\boldsymbol{Z}\subset\boldsymbol{H}_n=H^{\boldsymbol{f}}(\boldsymbol{x}_n,\boldsymbol{x}_{n+1/2}).\tag{3.12}
\]
Hence, appealing to Proposition 2.2(i) and (1.5), we see that
\[
(\forall n\in\mathbb{N})\quad P^{\boldsymbol{f}}_{\boldsymbol{H}_n}\colon\mathrm{int\,dom}\,\boldsymbol{f}\to\boldsymbol{H}_n\cap\mathrm{int\,dom}\,\boldsymbol{f}\tag{3.13}
\]
and that
\[
(\forall n\in\mathbb{N})\quad\boldsymbol{x}_{n+1}=Q^{\boldsymbol{f}}\big(\boldsymbol{x}_0,\boldsymbol{x}_n,\boldsymbol{x}_{n+1/2}\big).\tag{3.14}
\]
Thus, we derive from (3.10), (3.12), and Proposition 2.5(ii) that $(x_n)_{n\in\mathbb{N}}$ and $(y_n^*)_{n\in\mathbb{N}}$ are well-defined sequences in $\mathrm{int\,dom}\,f$ and $\mathrm{int\,dom}\,g^*$, respectively.

(i): We prove the following implications.

(i)(a)⇒(i)(b): Clear.

(i)(b)⇒(i)(a): Proposition 2.5(iii).

(i)(b)⇒(i)(c): Clear by (3.12).

(i)(c)⇒(i)(d): In view of (3.2),
\[
\begin{aligned}
0&\geq\langle x_n,a_n^*+L^*b_n^*\rangle+\langle b_n-La_n,y_n^*\rangle-\langle a_n,a_n^*\rangle-\langle b_n,b_n^*\rangle\\
&=\langle x_n-a_n,a_n^*+L^*y_n^*\rangle+\langle Lx_n-b_n,b_n^*-y_n^*\rangle\\
&=\gamma_n^{-1}\langle x_n-a_n,\nabla h(x_n)-\nabla h(a_n)\rangle+\mu_n^{-1}\langle Lx_n-b_n,\nabla j(Lx_n)-\nabla j(b_n)\rangle.
\end{aligned}\tag{3.15}
\]
Consequently, the strict monotonicity of $\nabla h$ and $\nabla j$ yields
\[
x_n=a_n\quad\text{and}\quad Lx_n=b_n.\tag{3.16}
\]
Furthermore,
\[
b_n^*=\mu_n^{-1}\big(\nabla j(Lx_n)-\nabla j(b_n)\big)+y_n^*=\mu_n^{-1}\big(\nabla j(b_n)-\nabla j(b_n)\big)+y_n^*=y_n^*.\tag{3.17}
\]
(i)(d)⇒(i)(e): We derive from (3.2) that $a_n^*=\gamma_n^{-1}(\nabla h(x_n)-\nabla h(a_n))-L^*y_n^*=-L^*y_n^*=-L^*b_n^*$. On the other hand, since
\[
\langle La_n-b_n,\nabla j(La_n)-\nabla j(b_n)\rangle=\langle Lx_n-b_n,\nabla j(Lx_n)-\nabla j(b_n)\rangle=\mu_n\langle Lx_n-b_n,b_n^*-y_n^*\rangle=0,\tag{3.18}
\]
the strict monotonicity of $\nabla j$ yields $La_n=b_n$.

(i)(e)⇔(i)(f): Proposition 2.1(ii)(a).

(i)(f)⇒(i)(g): Indeed, $\boldsymbol{x}_{n+1/2}=P^{\boldsymbol{f}}_{\boldsymbol{H}_n}\boldsymbol{x}_n=\boldsymbol{x}_n$.

(i)(g)⇒(i)(h): We have
\[
\boldsymbol{x}_{n+1}=Q^{\boldsymbol{f}}(\boldsymbol{x}_0,\boldsymbol{x}_n,\boldsymbol{x}_{n+1/2})=P^{\boldsymbol{f}}_{H^{\boldsymbol{f}}(\boldsymbol{x}_0,\boldsymbol{x}_n)\cap H^{\boldsymbol{f}}(\boldsymbol{x}_n,\boldsymbol{x}_{n+1/2})}\boldsymbol{x}_0=P^{\boldsymbol{f}}_{H^{\boldsymbol{f}}(\boldsymbol{x}_0,\boldsymbol{x}_n)}\boldsymbol{x}_0=\boldsymbol{x}_n.\tag{3.19}
\]
(i)(h)⇒(i)(g): By Proposition 2.5(v), $D^{\boldsymbol{f}}(\boldsymbol{x}_{n+1/2},\boldsymbol{x}_n)\leq D^{\boldsymbol{f}}(\boldsymbol{x}_{n+1},\boldsymbol{x}_n)=0$. Therefore $D^{\boldsymbol{f}}(\boldsymbol{x}_{n+1/2},\boldsymbol{x}_n)=0$ and we derive from [6, Lemma 7.3(vi)] that $\boldsymbol{x}_{n+1/2}=\boldsymbol{x}_n$.

(i)(g)⇒(i)(c): Indeed, $\boldsymbol{x}_n=\boldsymbol{x}_{n+1/2}=P^{\boldsymbol{f}}_{\boldsymbol{H}_n}\boldsymbol{x}_n\in\boldsymbol{H}_n$.

(i)(d)⇒(i)(b): We derive from (3.2) that
\[
\nabla h(x_n)-\gamma_nL^*y_n^*\in\nabla h(a_n)+\gamma_nAa_n=\nabla h(x_n)+\gamma_nAx_n.\tag{3.20}
\]
Hence $-L^*y_n^*\in Ax_n$. Likewise, as in (3.18), we first obtain $Lx_n=b_n$ and then
\[
\nabla j(Lx_n)+\mu_ny_n^*\in\nabla j(b_n)+\mu_nBb_n=\nabla j(Lx_n)+\mu_nB(Lx_n).\tag{3.21}
\]
Thus, $y_n^*\in B(Lx_n)$, i.e., $Lx_n\in B^{-1}y_n^*$. In view of (1.9), the implication is proved.

(ii): Proposition 2.5(iv) yields
\[
\sum_{n\in\mathbb{N}}D^f(x_{n+1},x_n)+\sum_{n\in\mathbb{N}}D^{g^*}(y_{n+1}^*,y_n^*)=\sum_{n\in\mathbb{N}}D^{\boldsymbol{f}}(\boldsymbol{x}_{n+1},\boldsymbol{x}_n)<+\infty.\tag{3.22}
\]
(iii): Proposition 2.5(vi) yields
\[
\sum_{n\in\mathbb{N}}D^f(x_{n+1/2},x_n)+\sum_{n\in\mathbb{N}}D^{g^*}(y_{n+1/2}^*,y_n^*)=\sum_{n\in\mathbb{N}}D^{\boldsymbol{f}}(\boldsymbol{x}_{n+1/2},\boldsymbol{x}_n)<+\infty.\tag{3.23}
\]
(iv): Proposition 2.5(ii) implies that $(\boldsymbol{x}_n)_{n\in\mathbb{N}}$ is a bounded sequence in $\mathrm{int\,dom}\,\boldsymbol{f}$. In turn,
\[
(x_n)_{n\in\mathbb{N}}\in(\mathrm{int\,dom}\,f)^{\mathbb{N}}\quad\text{and}\quad(y_n^*)_{n\in\mathbb{N}}\in(\mathrm{int\,dom}\,g^*)^{\mathbb{N}}\quad\text{are bounded.}\tag{3.24}
\]
On the other hand, by (3.2),
\[
(\forall n\in\mathbb{N})\quad(x_{n+1/2},y_{n+1/2}^*)=\boldsymbol{x}_{n+1/2}=P^{\boldsymbol{f}}_{\boldsymbol{H}_n}\boldsymbol{x}_n\in\boldsymbol{H}_n\tag{3.25}
\]
and
\[
(\forall n\in\mathbb{N})\quad\langle x_{n+1/2},a_n^*+L^*b_n^*\rangle+\langle b_n-La_n,y_{n+1/2}^*\rangle=\langle a_n,a_n^*\rangle+\langle b_n,b_n^*\rangle.\tag{3.26}
\]
Therefore,
\[
\begin{aligned}
(\forall n\in\mathbb{N})\quad
&\|x_n-x_{n+1/2}\|\,\|a_n^*+L^*b_n^*\|+\|b_n-La_n\|\,\|y_n^*-y_{n+1/2}^*\|\\
&\geq\langle x_n-x_{n+1/2},a_n^*+L^*b_n^*\rangle+\langle b_n-La_n,y_n^*-y_{n+1/2}^*\rangle\\
&=\langle x_n,a_n^*+L^*b_n^*\rangle+\langle b_n-La_n,y_n^*\rangle-\langle a_n,a_n^*\rangle-\langle b_n,b_n^*\rangle\\
&=\langle x_n-a_n,a_n^*+L^*y_n^*\rangle+\langle Lx_n-b_n,b_n^*-y_n^*\rangle\\
&=\gamma_n^{-1}\langle x_n-a_n,\nabla h(x_n)-\nabla h(a_n)\rangle+\mu_n^{-1}\langle Lx_n-b_n,\nabla j(Lx_n)-\nabla j(b_n)\rangle\\
&\geq\sigma^{-1}\big(D^h(x_n,a_n)+D^h(a_n,x_n)+D^j(Lx_n,b_n)+D^j(b_n,Lx_n)\big)\\
&\geq\sigma^{-1}\big(D^h(x_n,a_n)+D^j(Lx_n,b_n)\big).
\end{aligned}\tag{3.27}
\]
However, since (iii) yields
\[
D^f(x_{n+1/2},x_n)\to 0\quad\text{and}\quad D^{g^*}(y_{n+1/2}^*,y_n^*)\to 0\tag{3.28}
\]
and since $f$ and $g^*$ satisfy Condition 3.1, (3.1) yields
\[
x_{n+1/2}-x_n\to 0\quad\text{and}\quad y_{n+1/2}^*-y_n^*\to 0.\tag{3.29}
\]
Since $\nabla h$ is uniformly continuous on every bounded subset of $\mathrm{int\,dom}\,h$, Lemma 2.12 asserts that $\nabla h$ is bounded on every bounded subset of $\mathrm{int\,dom}\,h$ and hence, since $\mathrm{int\,dom}\,f\subset\mathrm{int\,dom}\,h$ and $L^*$ is bounded, it follows from (3.24) that $\big(\nabla h(x_n)-\gamma_nL^*y_n^*\big)_{n\in\mathbb{N}}$ is bounded. We therefore deduce from (3.9), (3.2), and Lemma 2.9 that
\[
(a_n)_{n\in\mathbb{N}}\in(\mathrm{int\,dom}\,h)^{\mathbb{N}}\quad\text{is bounded.}\tag{3.30}
\]
Similarly, since $\nabla j$ is uniformly continuous on every bounded subset of $\mathrm{int\,dom}\,j$ and $L(\mathrm{int\,dom}\,f)\subset\mathrm{int\,dom}\,j$, it follows from (3.24) and Lemma 2.12 that $\big(\nabla j(Lx_n)+\mu_ny_n^*\big)_{n\in\mathbb{N}}$ is bounded and hence (3.9), (3.2), and Lemma 2.9 yield
\[
(b_n)_{n\in\mathbb{N}}\in(\mathrm{int\,dom}\,j)^{\mathbb{N}}\quad\text{is bounded.}\tag{3.31}
\]
Thus, $(\nabla h(x_n))_{n\in\mathbb{N}}$, $(\nabla h(a_n))_{n\in\mathbb{N}}$, $(\nabla j(Lx_n))_{n\in\mathbb{N}}$, and $(\nabla j(b_n))_{n\in\mathbb{N}}$ are bounded and we deduce from (3.2) that
\[
(a_n^*)_{n\in\mathbb{N}}\quad\text{and}\quad(b_n^*)_{n\in\mathbb{N}}\quad\text{are bounded.}\tag{3.32}
\]
We therefore derive from (3.27), (3.29), (3.30), and (3.31) that
\[
D^h(x_n,a_n)\to 0\quad\text{and}\quad D^j(Lx_n,b_n)\to 0.\tag{3.33}
\]
Since $h$ and $j$ satisfy Condition 3.1, we get
\[
x_n-a_n\to 0\quad\text{and}\quad Lx_n-b_n\to 0.\tag{3.34}
\]
Therefore, since $\nabla h$ is uniformly continuous on every bounded subset of $\mathrm{int\,dom}\,h$ and $\nabla j$ is uniformly continuous on every bounded subset of $\mathrm{int\,dom}\,j$,
\[
\nabla h(x_n)-\nabla h(a_n)\to 0\quad\text{and}\quad\nabla j(Lx_n)-\nabla j(b_n)\to 0.\tag{3.35}
\]
Hence, using (3.2), we get
\[
a_n^*+L^*y_n^*\to 0\quad\text{and}\quad b_n^*-y_n^*\to 0.\tag{3.36}
\]
Now, let $\boldsymbol{x}=(x,y^*)\in\mathfrak{W}(\boldsymbol{x}_n)_{n\in\mathbb{N}}$, say $\boldsymbol{x}_{k_n}\rightharpoonup\boldsymbol{x}$. Then $x_{k_n}\rightharpoonup x$ and $y_{k_n}^*\rightharpoonup y^*$, and we derive from (3.34) and (3.36) that
\[
\begin{cases}
a_{k_n}\rightharpoonup x\\
b_{k_n}^*\rightharpoonup y^*
\end{cases}
\quad\text{and}\quad
\begin{cases}
La_{k_n}-b_{k_n}\to 0\\
a_{k_n}^*+L^*b_{k_n}^*\to 0.
\end{cases}\tag{3.37}
\]
It therefore follows from (3.11), Proposition 2.1(iii), and (3.24) that $\boldsymbol{x}\in\boldsymbol{Z}\cap\overline{\mathrm{dom}}\,\boldsymbol{f}=\boldsymbol{C}$. Hence, we derive from Proposition 2.5(vii) that
\[
D^f(x_n,\bar{x})+D^{g^*}(y_n^*,\bar{y}^*)=D^{\boldsymbol{f}}(\boldsymbol{x}_n,\bar{\boldsymbol{x}})\to 0,\tag{3.38}
\]
where $\bar{\boldsymbol{x}}=(\bar{x},\bar{y}^*)$. Hence, $D^f(x_n,\bar{x})\to 0$, $D^{g^*}(y_n^*,\bar{y}^*)\to 0$, and, since $f$ and $g^*$ satisfy Condition 3.1, we conclude that $x_n\to\bar{x}$ and $y_n^*\to\bar{y}^*$.

Remark 3.3
We provide a couple of settings that satisfy the assumptions of Theorem 3.2.

(i) In Problem 1.2, suppose that $\mathcal{X}$ and $\mathcal{Y}$ are Hilbert spaces, that $f=\|\cdot\|^2/2$, and that $g=\|\cdot\|^2/2$. Furthermore, in Theorem 3.2, set $h=f$ and $j=g$, and note that, for any $\varepsilon\in\,]0,+\infty[$, $\nabla h+\varepsilon A=\mathrm{Id}+\varepsilon A$ and $\nabla j+\varepsilon B=\mathrm{Id}+\varepsilon B$ are strongly monotone and hence coercive by Lemma 2.7(ii). Then we recover the framework of [3], which has been applied to domain decomposition problems in [4].

(ii) Let $(\Omega_1,\mathcal{F}_1,\mu_1)$ and $(\Omega_2,\mathcal{F}_2,\mu_2)$ be measure spaces, let $p$ and $q$ be in $]1,+\infty[$, and set $p^*=p/(p-1)$ and $q^*=q/(q-1)$. In Problem 1.2, suppose that $\mathcal{X}=L^p(\Omega_1,\mathcal{F}_1,\mu_1)$, $\mathcal{Y}=L^q(\Omega_2,\mathcal{F}_2,\mu_2)$, $f=\|\cdot\|^p/p$, and $g=\|\cdot\|^q/q$. Then $\mathcal{X}^*=L^{p^*}(\Omega_1,\mathcal{F}_1,\mu_1)$, $\mathcal{Y}^*=L^{q^*}(\Omega_2,\mathcal{F}_2,\mu_2)$, and $g^*=\|\cdot\|^{q^*}/q^*$. Moreover, it follows from Clarkson's theorem [17, Theorem II.4.7] that $\mathcal{X}$, $\mathcal{X}^*$, $\mathcal{Y}$, and $\mathcal{Y}^*$ are uniformly convex and uniformly smooth. Hence, we derive from [6, Corollary 5.5 and Example 6.5] that $f$, $g$, and $g^*$ are Legendre functions which are uniformly convex on every bounded set, and which therefore satisfy Condition 3.1 by virtue of [7, Example 4.10(i)]. Now set $h=f$ and $j=g$ in Theorem 3.2. We derive from [17, Theorem II.2.16(i)] that $\nabla h$ and $\nabla j$ are uniformly continuous on every bounded subset of $\mathcal{X}$ and $\mathcal{Y}$, respectively. In addition, $h$ and $j$ are supercoercive and therefore, for any $\varepsilon\in\,]0,+\infty[$, Proposition 2.11(iv) asserts that $\nabla h+\varepsilon A$ and $\nabla j+\varepsilon B$ are coercive. Finally, it follows from [17, Proposition II.4.9] that $\nabla h\colon x\mapsto|x|^{p-1}\operatorname{sign}(x)$ and $\nabla j\colon y\mapsto|y|^{q-1}\operatorname{sign}(y)$.

Remark 3.4
The implementation of algorithm (3.2) requires the evaluation of the operator $(\nabla h+A)^{-1}$. We provide a simple example in the Euclidean plane $\mathcal{X}$ of a maximally monotone operator $A$ for which $(\nabla h+A)^{-1}$ can be computed explicitly, whereas the classical resolvent $(\mathrm{Id}+A)^{-1}$ is difficult to evaluate. Let $\beta\in\,]0,+\infty[$ and let $\psi\colon\mathbb{R}\to\mathbb{R}$ be a Legendre function with a $\beta$-Lipschitz-continuous derivative. Set
\[
A\colon\mathbb{R}^2\to\mathbb{R}^2\colon(\xi_1,\xi_2)\mapsto\big(\beta\xi_1-\psi'(\xi_1)-\xi_2,\ \xi_1+\beta\xi_2-\psi'(\xi_2)\big) \tag{3.39}
\]
and
\[
h\colon\mathbb{R}^2\to\,]-\infty,+\infty]\colon(\xi_1,\xi_2)\mapsto\psi(\xi_1)+\psi(\xi_2). \tag{3.40}
\]
Then it follows from [9, Theorem 18.15] that $A$ is the sum of the gradient of the convex function $(\xi_1,\xi_2)\mapsto\beta\xi_1^2/2-\psi(\xi_1)+\beta\xi_2^2/2-\psi(\xi_2)$ and of the skew linear operator $(\xi_1,\xi_2)\mapsto(-\xi_2,\xi_1)$. Thus, $A$ is a maximally monotone operator [9, Corollary 24.4] which is not the subdifferential of a convex function. In addition, as in Proposition 2.2(i), $h$ is a Legendre function and, since $\nabla h+A\colon(\xi_1,\xi_2)\mapsto(\beta\xi_1-\xi_2,\xi_1+\beta\xi_2)$ is linear,
\[
\big(\forall(\xi_1,\xi_2)\in\mathbb{R}^2\big)\quad(\nabla h+A)^{-1}(\xi_1,\xi_2)=\left(\frac{\beta\xi_1+\xi_2}{1+\beta^2},\ \frac{\beta\xi_2-\xi_1}{1+\beta^2}\right). \tag{3.41}
\]

Remark 3.5
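At every iteration $n$, algorithm (3.2) requires the computation of $\boldsymbol{x}_{n+1/2}=P^{\boldsymbol{f}}_{\boldsymbol{H}_n}(x_n,y_n^*)$ and then of $\boldsymbol{x}_{n+1}=Q^{\boldsymbol{f}}\big((x_0,y_0^*),(x_n,y_n^*),(x_{n+1/2},y^*_{n+1/2})\big)$. Set $\boldsymbol{s}_n^*=(a_n^*+L^*b_n^*,\,b_n-La_n)$, $\eta_n=\langle a_n,a_n^*\rangle+\langle b_n,b_n^*\rangle$, and $\boldsymbol{x}_n=(x_n,y_n^*)$. Then, if $\boldsymbol{x}_n\notin\boldsymbol{H}_n$, $\boldsymbol{x}_{n+1/2}$ is the Bregman projection of $\boldsymbol{x}_n$ onto the closed affine hyperplane $\{\boldsymbol{x}\in\boldsymbol{\mathcal{X}}\mid\langle\boldsymbol{x},\boldsymbol{s}_n^*\rangle=\eta_n\}$. Thus, $\boldsymbol{x}_{n+1/2}$ is the solution of the problem
\[
\underset{\langle\boldsymbol{p},\boldsymbol{s}_n^*\rangle=\eta_n}{\text{minimize}}\ \ \boldsymbol{f}(\boldsymbol{p})-\langle\boldsymbol{p},\nabla\boldsymbol{f}(\boldsymbol{x}_n)\rangle \tag{3.42}
\]
which, using standard first order conditions, is characterized by (see also [5, Remark 6.13] and [16, Lemma 2.2.1])
\[
\begin{cases}
\nabla\boldsymbol{f}(\boldsymbol{x}_{n+1/2})=\nabla\boldsymbol{f}(\boldsymbol{x}_n)-\lambda\boldsymbol{s}_n^*\\
\langle\boldsymbol{x}_{n+1/2},\boldsymbol{s}_n^*\rangle=\eta_n\\
\lambda\in\,]0,+\infty[.
\end{cases} \tag{3.43}
\]
In view of [6, Theorem 5.10], the Lagrange multiplier $\lambda$ is uniquely determined by the equation $\langle\nabla\boldsymbol{f}^*(\nabla\boldsymbol{f}(\boldsymbol{x}_n)-\lambda\boldsymbol{s}_n^*),\boldsymbol{s}_n^*\rangle=\eta_n$. The problem therefore reduces to finding the solution $\lambda$ to this equation in $]0,+\infty[$ and then setting $\boldsymbol{x}_{n+1/2}=\nabla\boldsymbol{f}^*(\nabla\boldsymbol{f}(\boldsymbol{x}_n)-\lambda\boldsymbol{s}_n^*)$. Likewise, it follows from (2.15) that $\boldsymbol{x}_{n+1}$ is the unique solution to the problem
\[
\underset{\substack{\langle\boldsymbol{p}-\boldsymbol{x}_n,\nabla\boldsymbol{f}(\boldsymbol{x}_0)-\nabla\boldsymbol{f}(\boldsymbol{x}_n)\rangle\leq 0\\ \langle\boldsymbol{p}-\boldsymbol{x}_{n+1/2},\nabla\boldsymbol{f}(\boldsymbol{x}_n)-\nabla\boldsymbol{f}(\boldsymbol{x}_{n+1/2})\rangle\leq 0}}{\text{minimize}}\ \ \boldsymbol{f}(\boldsymbol{p})-\langle\boldsymbol{p},\nabla\boldsymbol{f}(\boldsymbol{x}_0)\rangle. \tag{3.44}
\]
Depending on the number of active constraints, this problem boils down to determining up to two Lagrange multipliers in $]0,+\infty[$.

Next, we consider a specialization of Problem 1.2 to multivariate structured minimization.

Problem 3.6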
Let $m$ and $p$ be strictly positive integers, let $(\mathcal{X}_i)_{1\leq i\leq m}$ and $(\mathcal{Y}_k)_{1\leq k\leq p}$ be reflexive real Banach spaces, and let $\boldsymbol{\mathcal{X}}$ be the standard vector product space $\big(\times_{i=1}^m\mathcal{X}_i\big)\times\big(\times_{k=1}^p\mathcal{Y}_k^*\big)$ equipped with the norm
\[
(x,y^*)=\big((x_i)_{1\leq i\leq m},(y_k^*)_{1\leq k\leq p}\big)\mapsto\sqrt{\sum_{i=1}^m\|x_i\|^2+\sum_{k=1}^p\|y_k^*\|^2}. \tag{3.45}
\]
For every $i\in\{1,\ldots,m\}$ and every $k\in\{1,\ldots,p\}$, let $\varphi_i\in\Gamma_0(\mathcal{X}_i)$, let $\psi_k\in\Gamma_0(\mathcal{Y}_k)$, and let $L_{ki}\colon\mathcal{X}_i\to\mathcal{Y}_k$ be linear and bounded. Consider the primal problem
\[
\underset{x_1\in\mathcal{X}_1,\ldots,\,x_m\in\mathcal{X}_m}{\text{minimize}}\ \ \sum_{i=1}^m\varphi_i(x_i)+\sum_{k=1}^p\psi_k\bigg(\sum_{i=1}^mL_{ki}x_i\bigg), \tag{3.46}
\]
the dual problem
\[
\underset{y_1^*\in\mathcal{Y}_1^*,\ldots,\,y_p^*\in\mathcal{Y}_p^*}{\text{minimize}}\ \ \sum_{i=1}^m\varphi_i^*\bigg(-\sum_{k=1}^pL_{ki}^*y_k^*\bigg)+\sum_{k=1}^p\psi_k^*(y_k^*), \tag{3.47}
\]
and let
\[
\boldsymbol{Z}=\bigg\{(x,y^*)\in\boldsymbol{\mathcal{X}}\ \bigg|\ (\forall i\in\{1,\ldots,m\})\ -\sum_{k=1}^pL_{ki}^*y_k^*\in\partial\varphi_i(x_i)\ \text{and}\ (\forall k\in\{1,\ldots,p\})\ \sum_{i=1}^mL_{ki}x_i\in\partial\psi_k^*(y_k^*)\bigg\} \tag{3.48}
\]
be the associated Kuhn-Tucker set. For every $i\in\{1,\ldots,m\}$, let $f_i\in\Gamma_0(\mathcal{X}_i)$ be a Legendre function and let $x_{i,0}\in\operatorname{int\,dom}f_i$. For every $k\in\{1,\ldots,p\}$, let $g_k\in\Gamma_0(\mathcal{Y}_k)$ be a Legendre function and let $y^*_{k,0}\in\operatorname{int\,dom}g_k^*$. Set $x_0=(x_{i,0})_{1\leq i\leq m}$, $y_0^*=(y^*_{k,0})_{1\leq k\leq p}$, and
\[
\boldsymbol{f}\colon\boldsymbol{\mathcal{X}}\to\,]-\infty,+\infty]\colon(x,y^*)\mapsto\sum_{i=1}^mf_i(x_i)+\sum_{k=1}^pg_k^*(y_k^*), \tag{3.49}
\]
and suppose that $\boldsymbol{Z}\cap\operatorname{int\,dom}\boldsymbol{f}\neq\varnothing$. The objective is to find the best Bregman approximation $\big((\overline{x}_i)_{1\leq i\leq m},(\overline{y}_k^*)_{1\leq k\leq p}\big)=P^{\boldsymbol{f}}_{\boldsymbol{Z}}(x_0,y_0^*)$ to $(x_0,y_0^*)$ from $\boldsymbol{Z}$.

We derive from Theorem 3.2 the following convergence result for a splitting algorithm to solve Problem 3.6.

Proposition 3.7
Consider the setting of Problem 3.6. For every $i\in\{1,\ldots,m\}$, let $h_i\in\Gamma_0(\mathcal{X}_i)$ be a Legendre function such that $\operatorname{int\,dom}f_i\subset\operatorname{int\,dom}h_i$ and $h_i+\varepsilon_i\varphi_i$ is supercoercive for some $\varepsilon_i\in\,]0,+\infty[$. For every $k\in\{1,\ldots,p\}$, let $j_k\in\Gamma_0(\mathcal{Y}_k)$ be a Legendre function such that $\sum_{i=1}^mL_{ki}(\operatorname{int\,dom}f_i)\subset\operatorname{int\,dom}j_k$ and $j_k+\delta_k\psi_k$ is supercoercive for some $\delta_k\in\,]0,+\infty[$. Set $\varepsilon=\max_{1\leq i\leq m}\varepsilon_i$ and $\delta=\max_{1\leq k\leq p}\delta_k$, let $\sigma\in[\max\{\varepsilon,\delta\},+\infty[$, and iterate
\[
\begin{array}{l}
\text{for } n=0,1,\ldots\\
\left\lfloor
\begin{array}{l}
(\gamma_n,\mu_n)\in[\varepsilon,\sigma]\times[\delta,\sigma]\\[1mm]
\text{for } i=1,\ldots,m\\
\left\lfloor
\begin{array}{l}
a_{i,n}=(\nabla h_i+\gamma_n\partial\varphi_i)^{-1}\big(\nabla h_i(x_{i,n})-\gamma_n\sum_{k=1}^pL_{ki}^*y^*_{k,n}\big)\\[1mm]
a^*_{i,n}=\gamma_n^{-1}\big(\nabla h_i(x_{i,n})-\nabla h_i(a_{i,n})\big)-\sum_{k=1}^pL_{ki}^*y^*_{k,n}
\end{array}
\right.\\[1mm]
\text{for } k=1,\ldots,p\\
\left\lfloor
\begin{array}{l}
b_{k,n}=(\nabla j_k+\mu_n\partial\psi_k)^{-1}\big(\nabla j_k\big(\sum_{i=1}^mL_{ki}x_{i,n}\big)+\mu_ny^*_{k,n}\big)\\[1mm]
b^*_{k,n}=\mu_n^{-1}\big(\nabla j_k\big(\sum_{i=1}^mL_{ki}x_{i,n}\big)-\nabla j_k(b_{k,n})\big)+y^*_{k,n}\\[1mm]
t_{k,n}=b_{k,n}-\sum_{i=1}^mL_{ki}a_{i,n}
\end{array}
\right.\\[1mm]
\text{for } i=1,\ldots,m\\
\left\lfloor\ s^*_{i,n}=a^*_{i,n}+\sum_{k=1}^pL_{ki}^*b^*_{k,n}\right.\\[1mm]
\eta_n=\sum_{i=1}^m\langle a_{i,n},a^*_{i,n}\rangle+\sum_{k=1}^p\langle b_{k,n},b^*_{k,n}\rangle\\[1mm]
\boldsymbol{H}_n=\big\{(x,y^*)\in\boldsymbol{\mathcal{X}}\ \big|\ \sum_{i=1}^m\langle x_i,s^*_{i,n}\rangle+\sum_{k=1}^p\langle t_{k,n},y^*_k\rangle\leq\eta_n\big\}\\[1mm]
\big(x_{n+1/2},y^*_{n+1/2}\big)=P^{\boldsymbol{f}}_{\boldsymbol{H}_n}(x_n,y_n^*)\\[1mm]
\big(x_{n+1},y^*_{n+1}\big)=Q^{\boldsymbol{f}}\big((x_0,y_0^*),(x_n,y_n^*),(x_{n+1/2},y^*_{n+1/2})\big),
\end{array}
\right.
\end{array} \tag{3.50}
\]
where we use the notation $(\forall n\in\mathbb{N})$ $x_n=(x_{i,n})_{1\leq i\leq m}$ and $y_n^*=(y^*_{k,n})_{1\leq k\leq p}$. Suppose that the following hold:

(i) For every $i\in\{1,\ldots,m\}$, $f_i$ and $h_i$ satisfy Condition 3.1 and $\nabla h_i$ is uniformly continuous on every bounded subset of $\operatorname{int\,dom}h_i$.

(ii) For every $k\in\{1,\ldots,p\}$, $g_k^*$ and $j_k$ satisfy Condition 3.1 and $\nabla j_k$ is uniformly continuous on every bounded subset of $\operatorname{int\,dom}j_k$.

Then
\[
(\forall i\in\{1,\ldots,m\})\quad x_{i,n}\to\overline{x}_i\quad\text{and}\quad(\forall k\in\{1,\ldots,p\})\quad y^*_{k,n}\to\overline{y}^*_k. \tag{3.51}
\]

Proof.
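Denote by $\mathcal{X}$ and $\mathcal{Y}$ the standard vector product spaces $\times_{i=1}^m\mathcal{X}_i$ and $\times_{k=1}^p\mathcal{Y}_k$ equipped with the norms $x=(x_i)_{1\leq i\leq m}\mapsto\sqrt{\sum_{i=1}^m\|x_i\|^2}$ and $y=(y_k)_{1\leq k\leq p}\mapsto\sqrt{\sum_{k=1}^p\|y_k\|^2}$, respectively. Then $\mathcal{X}^*$ is the vector product space $\times_{i=1}^m\mathcal{X}_i^*$ equipped with the norm $x^*\mapsto\sqrt{\sum_{i=1}^m\|x_i^*\|^2}$ and $\mathcal{Y}^*$ is the vector product space $\times_{k=1}^p\mathcal{Y}_k^*$ equipped with the norm $y^*\mapsto\sqrt{\sum_{k=1}^p\|y_k^*\|^2}$. Let us introduce the operators
\[
\begin{cases}
A\colon\mathcal{X}\to 2^{\mathcal{X}^*}\colon x\mapsto\times_{i=1}^m\partial\varphi_i(x_i)\\
B\colon\mathcal{Y}\to 2^{\mathcal{Y}^*}\colon y\mapsto\times_{k=1}^p\partial\psi_k(y_k)\\
L\colon\mathcal{X}\to\mathcal{Y}\colon x\mapsto\big(\sum_{i=1}^mL_{ki}x_i\big)_{1\leq k\leq p}
\end{cases} \tag{3.52}
\]
and the functions
\[
\begin{cases}
f\colon\mathcal{X}\to\,]-\infty,+\infty]\colon x\mapsto\sum_{i=1}^mf_i(x_i)\\
h\colon\mathcal{X}\to\,]-\infty,+\infty]\colon x\mapsto\sum_{i=1}^mh_i(x_i)\\
\varphi\colon\mathcal{X}\to\,]-\infty,+\infty]\colon x\mapsto\sum_{i=1}^m\varphi_i(x_i)\\
g\colon\mathcal{Y}\to\,]-\infty,+\infty]\colon y\mapsto\sum_{k=1}^pg_k(y_k)\\
j\colon\mathcal{Y}\to\,]-\infty,+\infty]\colon y\mapsto\sum_{k=1}^pj_k(y_k).
\end{cases} \tag{3.53}
\]
Then it follows from [40, Theorem 3.1.11] that $A$ and $B$ are maximally monotone. In addition, the adjoint of $L$ is $L^*\colon\mathcal{Y}^*\to\mathcal{X}^*\colon y^*\mapsto\big(\sum_{k=1}^pL_{ki}^*y_k^*\big)_{1\leq i\leq m}$, and, as in Proposition 2.2(i), $f$ and $g$ are Legendre functions. Thus, Problem 3.6 is a special case of Problem 1.2. Furthermore, $h$ and $j$ are Legendre functions,
\[
\operatorname{int\,dom}f=\times_{i=1}^m\operatorname{int\,dom}f_i\subset\times_{i=1}^m\operatorname{int\,dom}h_i=\operatorname{int\,dom}h, \tag{3.54}
\]
and
\[
L\big(\operatorname{int\,dom}f\big)=\times_{k=1}^p\sum_{i=1}^mL_{ki}(\operatorname{int\,dom}f_i)\subset\times_{k=1}^p\operatorname{int\,dom}j_k=\operatorname{int\,dom}j. \tag{3.55}
\]
Next we observe that, for every $i\in\{1,\ldots,m\}$, since $h_i+\varepsilon\varphi_i$ is supercoercive, $(h_i+\varepsilon\varphi_i)^*$ is bounded above on every bounded subset of $\mathcal{X}_i^*$ [6, Theorem 3.3]. As a result, $(h+\varepsilon\varphi)^*\colon x^*\mapsto\sum_{i=1}^m(h_i+\varepsilon\varphi_i)^*(x_i^*)$ is bounded above on every bounded subset of $\mathcal{X}^*$, and it follows from [6, Theorem 3.3] that $h+\varepsilon\varphi$ is supercoercive. In turn since, as in (3.7), $\varnothing\neq\operatorname{dom}A\cap\operatorname{int\,dom}f\subset\operatorname{dom}\varphi\cap\operatorname{int\,dom}f$, we derive from [40, Theorem 2.8.3] and Lemma 2.7(iii) that
\[
\nabla h+\varepsilon A=\nabla h+\varepsilon\partial\varphi=\partial(h+\varepsilon\varphi) \tag{3.56}
\]
is coercive. We show in a similar fashion that $\nabla j+\delta B$ is coercive.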
Now set, for every $n\in\mathbb{N}$, $a_n=(a_{i,n})_{1\leq i\leq m}$, $a_n^*=(a^*_{i,n})_{1\leq i\leq m}$, $b_n=(b_{k,n})_{1\leq k\leq p}$, and $b_n^*=(b^*_{k,n})_{1\leq k\leq p}$. Then, for every $n\in\mathbb{N}$, we have
\[
\begin{aligned}
&(\forall i\in\{1,\ldots,m\})\quad a_{i,n}=(\nabla h_i+\gamma_n\partial\varphi_i)^{-1}\bigg(\nabla h_i(x_{i,n})-\gamma_n\sum_{k=1}^pL_{ki}^*y^*_{k,n}\bigg)\\
&\Leftrightarrow\ (\forall i\in\{1,\ldots,m\})\quad\nabla h_i(x_{i,n})-\gamma_n\sum_{k=1}^pL_{ki}^*y^*_{k,n}\in\nabla h_i(a_{i,n})+\gamma_n\partial\varphi_i(a_{i,n})\\
&\Leftrightarrow\ \nabla h(x_n)-\gamma_nL^*y_n^*\in\nabla h(a_n)+\gamma_nAa_n\\
&\Leftrightarrow\ a_n=(\nabla h+\gamma_nA)^{-1}\big(\nabla h(x_n)-\gamma_nL^*y_n^*\big).
\end{aligned} \tag{3.57}
\]
Likewise,
\[
(\forall n\in\mathbb{N})\quad b_n=(\nabla j+\mu_nB)^{-1}\big(\nabla j(Lx_n)+\mu_ny_n^*\big). \tag{3.58}
\]
Thus, (3.50) is a special case of (3.2). In addition, it follows from our assumptions and (3.53) that $f$, $g^*$, $h$, and $j$ satisfy Condition 3.1, and that $\nabla h$ and $\nabla j$ are uniformly continuous on every bounded subset of $\operatorname{int\,dom}h$ and $\operatorname{int\,dom}j$, respectively. Altogether, the conclusions follow from Theorem 3.2(iv), with $\overline{x}=(\overline{x}_i)_{1\leq i\leq m}$ and $\overline{y}^*=(\overline{y}^*_k)_{1\leq k\leq p}$.

Remark 3.8
In Problem 3.6, suppose that, for every $i\in\{1,\ldots,m\}$ and every $k\in\{1,\ldots,p\}$, $\varphi_i$ and $\psi_k$ are supercoercive Legendre functions satisfying Condition 3.1, that $\nabla\varphi_i$ and $\nabla\psi_k$ are uniformly continuous on every bounded subset of $\operatorname{int\,dom}\varphi_i$ and $\operatorname{int\,dom}\psi_k$, respectively, and that $\sum_{i=1}^mL_{ki}(\operatorname{int\,dom}\varphi_i)\subset\operatorname{int\,dom}\psi_k$. Then, in Proposition 3.7, we can choose, for every $i\in\{1,\ldots,m\}$ and every $k\in\{1,\ldots,p\}$, $h_i=\varphi_i$ and $j_k=\psi_k$, and in (3.50), we obtain
\[
a_{i,n}=\nabla h_i^*\left(\frac{\nabla h_i(x_{i,n})-\gamma_n\sum_{k=1}^pL_{ki}^*y^*_{k,n}}{1+\gamma_n}\right) \tag{3.59}
\]
and
\[
b_{k,n}=\nabla j_k^*\left(\frac{\nabla j_k\big(\sum_{i=1}^mL_{ki}x_{i,n}\big)+\mu_ny^*_{k,n}}{1+\mu_n}\right). \tag{3.60}
\]
For example, suppose that, for every $i\in\{1,\ldots,m\}$ and every $k\in\{1,\ldots,p\}$, $\mathcal{X}_i=\mathbb{R}$, $\mathcal{Y}_k=\mathbb{R}$, and $\varphi_i=h_i$ is the Hellinger-like function, i.e.,
\[
\varphi_i\colon\mathbb{R}\to\,]-\infty,+\infty]\colon x_i\mapsto
\begin{cases}
-\sqrt{1-x_i^2},&\text{if } x_i\in[-1,1];\\
+\infty,&\text{otherwise}.
\end{cases} \tag{3.61}
\]
Then (3.59) becomes
\[
a_{i,n}=\frac{x_{i,n}-\gamma_n\Big(\sum_{k=1}^pL_{ki}^*y^*_{k,n}\Big)\sqrt{1-x_{i,n}^2}}{\sqrt{(1+\gamma_n)^2\big(1-x_{i,n}^2\big)+\Big(x_{i,n}-\gamma_n\Big(\sum_{k=1}^pL_{ki}^*y^*_{k,n}\Big)\sqrt{1-x_{i,n}^2}\Big)^2}}. \tag{3.62}
\]
Furthermore, as shown in the next section, in finite-dimensional spaces, we can remove Condition 3.1 and the assumption on the uniform continuity of $(\nabla\varphi_i)_{1\leq i\leq m}$ and $(\nabla\psi_k)_{1\leq k\leq p}$.

In finite-dimensional spaces, the convergence of algorithm (3.2) can be obtained under more general assumptions. To establish the corresponding results, the following technical facts will be needed.
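As a numerical sanity check on the Hellinger-like example of Remark 3.8, the following minimal sketch evaluates the closed form (3.62) for a single scalar coordinate and compares it with the generic expression (3.59). It assumes the standard facts that $h(x)=-\sqrt{1-x^2}$ has derivative $x/\sqrt{1-x^2}$ on $]-1,1[$ and conjugate $h^*(u)=\sqrt{1+u^2}$. The function names, the scalar `c` (a stand-in for $\sum_{k=1}^pL_{ki}^*y^*_{k,n}$), and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def grad_h(x):
    """Derivative of h(x) = -sqrt(1 - x^2) on the open interval ]-1, 1[."""
    return x / math.sqrt(1.0 - x * x)


def grad_h_conj(u):
    """Derivative of the conjugate h*(u) = sqrt(1 + u^2)."""
    return u / math.sqrt(1.0 + u * u)


def hellinger_step(x, c, gamma):
    """Closed form (3.62); c is a stand-in for sum_k L*_{ki} y*_{k,n}."""
    r = math.sqrt(1.0 - x * x)
    num = x - gamma * c * r
    return num / math.sqrt((1.0 + gamma) ** 2 * (1.0 - x * x) + num * num)


# Hypothetical data, chosen only for illustration.
x, c, gamma = 0.3, -0.7, 1.5

# Closed form (3.62) versus the generic formula (3.59).
a_closed = hellinger_step(x, c, gamma)
a_conj = grad_h_conj((grad_h(x) - gamma * c) / (1.0 + gamma))

# Residual of the defining equation (1 + gamma) * h'(a) = h'(x) - gamma * c.
residual = (1.0 + gamma) * grad_h(a_closed) - (grad_h(x) - gamma * c)

print(a_closed, a_conj, residual)
```

Both evaluations should agree up to rounding, and the output always lies in $]-1,1[$, so the iterate stays in the interior of the domain of $h_i$ without any explicit projection.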
Lemma 4.1
Let $\mathcal{X}$ be a finite-dimensional real Banach space and let $f\in\Gamma_0(\mathcal{X})$ be a Legendre function. Then the following hold:

(i) $f$ and $\nabla f$ are continuous on $\operatorname{int\,dom}f$ [9, Corollaries 8.30(iii), 17.34, and 17.35].

(ii) $\nabla f\colon\operatorname{int\,dom}f\to\operatorname{int\,dom}f^*$ is bijective with inverse $\nabla f^*\colon\operatorname{int\,dom}f^*\to\operatorname{int\,dom}f$ [6, Theorem 5.10].

(iii) Let $x\in\operatorname{int\,dom}f$, let $y\in\overline{\operatorname{dom}}\,f$, and let $(y_n)_{n\in\mathbb{N}}\in(\operatorname{int\,dom}f)^{\mathbb{N}}$. Suppose that $y_n\to y$ and that $(D^f(x,y_n))_{n\in\mathbb{N}}$ is bounded. Then $y\in\operatorname{int\,dom}f$ and $D^f(y,y_n)\to 0$ [5, Theorem 3.8(ii)].

(iv) Let $x\in\operatorname{int\,dom}f$, let $y\in\operatorname{int\,dom}f$, let $(x_n)_{n\in\mathbb{N}}\in(\operatorname{dom}f)^{\mathbb{N}}$, and let $(y_n)_{n\in\mathbb{N}}\in(\operatorname{int\,dom}f)^{\mathbb{N}}$. Suppose that $x_n\to x$, $y_n\to y$, and $D^f(x_n,y_n)\to 0$. Then $x=y$ [5, Theorem 3.9(iii)].

(v) Let $y\in\operatorname{int\,dom}f$. Then $D^f(\cdot,y)$ is coercive [6, Lemma 7.3(v)].

(vi) Let $\{x,y\}\subset\operatorname{int\,dom}f$. Then $D^f(x,y)=D^{f^*}(\nabla f(y),\nabla f(x))$ [6, Lemma 7.3(vii)].

Proposition 4.2
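In Problem 1.2, suppose that $\mathcal{X}$ and $\mathcal{Y}$ are finite-dimensional. Let $h\in\Gamma_0(\mathcal{X})$ and $j\in\Gamma_0(\mathcal{Y})$ be Legendre functions such that $\operatorname{int\,dom}f\subset\operatorname{int\,dom}h$, $L(\operatorname{int\,dom}f)\subset\operatorname{int\,dom}j$, and there exist $\varepsilon$ and $\delta$ in $]0,+\infty[$ such that $\nabla h+\varepsilon A$ and $\nabla j+\delta B$ are coercive. Let $\sigma\in[\max\{\varepsilon,\delta\},+\infty[$ and execute algorithm (3.2). Then $(x_n,y_n^*)\to(\overline{x},\overline{y}^*)$.

Proof. Set $\boldsymbol{C}=\boldsymbol{Z}\cap\overline{\operatorname{dom}}\,\boldsymbol{f}$. We first observe that, as in (3.24),
\[
(x_n)_{n\in\mathbb{N}}\in(\operatorname{int\,dom}f)^{\mathbb{N}}\ \text{and}\ (y_n^*)_{n\in\mathbb{N}}\in(\operatorname{int\,dom}g^*)^{\mathbb{N}}\ \text{are bounded}. \tag{4.1}
\]
In addition, we deduce from (3.10), (3.14), and Proposition 2.5(i) that $\overline{\boldsymbol{x}}=(\overline{x},\overline{y}^*)\in\boldsymbol{C}\subset\bigcap_{n\in\mathbb{N}}H^{\boldsymbol{f}}(\boldsymbol{x}_0,\boldsymbol{x}_n)$, and hence from (2.14) that
\[
(\forall n\in\mathbb{N})\quad D^f(\overline{x},x_n)+D^{g^*}(\overline{y}^*,y_n^*)=D^{\boldsymbol{f}}(\overline{\boldsymbol{x}},\boldsymbol{x}_n)\leq D^{\boldsymbol{f}}(\overline{\boldsymbol{x}},\boldsymbol{x}_0). \tag{4.2}
\]
By virtue of Proposition 2.5(vii) and (4.1), it suffices to show that every cluster point of $(x_n,y_n^*)_{n\in\mathbb{N}}$ belongs to $\boldsymbol{Z}$. To this end, take $x\in\mathcal{X}$, $y^*\in\mathcal{Y}^*$, and a strictly increasing sequence $(k_n)_{n\in\mathbb{N}}$ in $\mathbb{N}$ such that $x_{k_n}\to x$ and $y^*_{k_n}\to y^*$. Then $Lx_{k_n}\to Lx$, $x\in\overline{\operatorname{dom}}\,f$, and $y^*\in\overline{\operatorname{dom}}\,g^*$. Since $\overline{x}\in\operatorname{int\,dom}f$ and since (4.2) implies that $(D^f(\overline{x},x_{k_n}))_{n\in\mathbb{N}}$ is bounded, it follows from Lemma 4.1(iii) that $x\in\operatorname{int\,dom}f$. Analogously, $y^*\in\operatorname{int\,dom}g^*$. In turn, Lemma 4.1(i) asserts that
\[
\nabla f(x_{k_n})\to\nabla f(x)\quad\text{and}\quad\nabla g^*(y^*_{k_n})\to\nabla g^*(y^*). \tag{4.3}
\]
Furthermore, since $\operatorname{int\,dom}f\subset\operatorname{int\,dom}h$ and $L(\operatorname{int\,dom}f)\subset\operatorname{int\,dom}j$, we obtain $x\in\operatorname{int\,dom}h$ and $Lx\in\operatorname{int\,dom}j$. Thus, there exists $\rho\in\,]0,+\infty[$ such that $B(x;\rho)\subset\operatorname{int\,dom}h$ and $B(Lx;\rho)\subset\operatorname{int\,dom}j$. We therefore assume without loss of generality that
\[
(x_{k_n})_{n\in\mathbb{N}}\in B(x;\rho)^{\mathbb{N}}\quad\text{and}\quad(Lx_{k_n})_{n\in\mathbb{N}}\in B(Lx;\rho)^{\mathbb{N}}. \tag{4.4}
\]
In view of Lemma 4.1(i), $h(B(x;\rho))$ and $\nabla h(B(x;\rho))$ are therefore compact, which implies that $(h(x_{k_n}))_{n\in\mathbb{N}}$ and $(\nabla h(x_{k_n}))_{n\in\mathbb{N}}$ are bounded. Hence $(D^h(\overline{x},x_{k_n}))_{n\in\mathbb{N}}$ is bounded and, moreover, it follows from (3.2), (4.1), Lemma 2.9, and (3.9) that $(a_{k_n})_{n\in\mathbb{N}}$ is a bounded sequence in $\operatorname{int\,dom}h$. We show likewise that $(D^j(L\overline{x},Lx_{k_n}))_{n\in\mathbb{N}}$ and $(b_{k_n})_{n\in\mathbb{N}}$ are bounded. Next, since the convexity of $h$ yields
\[
\begin{aligned}
(\forall n\in\mathbb{N})\quad D^h(\overline{x},a_{k_n})&=h(\overline{x})-h(a_{k_n})-\langle\overline{x}-a_{k_n},\nabla h(a_{k_n})\rangle\\
&=h(\overline{x})-h(x_{k_n})-\langle\overline{x}-x_{k_n},\nabla h(x_{k_n})\rangle+\langle\overline{x}-a_{k_n},\nabla h(x_{k_n})-\nabla h(a_{k_n})\rangle\\
&\quad-\big(h(a_{k_n})-h(x_{k_n})-\langle a_{k_n}-x_{k_n},\nabla h(x_{k_n})\rangle\big)\\
&\leq D^h(\overline{x},x_{k_n})+\langle\overline{x}-a_{k_n},\nabla h(x_{k_n})-\nabla h(a_{k_n})\rangle,
\end{aligned} \tag{4.5}
\]
we derive from (3.2) that
\[
\begin{aligned}
(\forall n\in\mathbb{N})\quad\sigma^{-1}D^h(\overline{x},a_{k_n})&\leq\gamma_{k_n}^{-1}D^h(\overline{x},a_{k_n})\\
&\leq\varepsilon^{-1}D^h(\overline{x},x_{k_n})+\big\langle\overline{x}-a_{k_n},\gamma_{k_n}^{-1}\big(\nabla h(x_{k_n})-\nabla h(a_{k_n})\big)\big\rangle\\
&=\varepsilon^{-1}D^h(\overline{x},x_{k_n})+\langle\overline{x}-a_{k_n},a^*_{k_n}+L^*y^*_{k_n}\rangle\\
&=\varepsilon^{-1}D^h(\overline{x},x_{k_n})+\langle\overline{x}-a_{k_n},a^*_{k_n}+L^*\overline{y}^*\rangle+\langle L\overline{x}-La_{k_n},y^*_{k_n}-\overline{y}^*\rangle.
\end{aligned} \tag{4.6}
\]
Similarly,
\[
\begin{aligned}
(\forall n\in\mathbb{N})\quad\sigma^{-1}D^j(L\overline{x},b_{k_n})&\leq\delta^{-1}D^j(L\overline{x},Lx_{k_n})+\langle L\overline{x}-b_{k_n},b^*_{k_n}-y^*_{k_n}\rangle\\
&=\delta^{-1}D^j(L\overline{x},Lx_{k_n})+\langle L\overline{x}-b_{k_n},b^*_{k_n}-\overline{y}^*\rangle+\langle L\overline{x}-b_{k_n},\overline{y}^*-y^*_{k_n}\rangle.
\end{aligned} \tag{4.7}
\]
Since (3.12) entails that
\[
\overline{\boldsymbol{x}}\in\boldsymbol{C}\subset\bigcap_{n\in\mathbb{N}}\boldsymbol{H}_{k_n}=\bigcap_{n\in\mathbb{N}}\big\{(x,y^*)\in\boldsymbol{\mathcal{X}}\ \big|\ \langle x-a_{k_n},a^*_{k_n}+L^*y^*\rangle+\langle Lx-b_{k_n},b^*_{k_n}-y^*\rangle\leq 0\big\}, \tag{4.8}
\]
we deduce from (4.6) and (4.7) that
\[
\begin{aligned}
(\forall n\in\mathbb{N})\quad&\sigma^{-1}\big(D^h(\overline{x},a_{k_n})+D^j(L\overline{x},b_{k_n})\big)\\
&\leq\varepsilon^{-1}D^h(\overline{x},x_{k_n})+\delta^{-1}D^j(L\overline{x},Lx_{k_n})+\langle L\overline{x}-La_{k_n},y^*_{k_n}-\overline{y}^*\rangle+\langle L\overline{x}-b_{k_n},\overline{y}^*-y^*_{k_n}\rangle\\
&=\varepsilon^{-1}D^h(\overline{x},x_{k_n})+\delta^{-1}D^j(L\overline{x},Lx_{k_n})+\langle b_{k_n}-La_{k_n},y^*_{k_n}-\overline{y}^*\rangle\\
&\leq\varepsilon^{-1}D^h(\overline{x},x_{k_n})+\delta^{-1}D^j(L\overline{x},Lx_{k_n})+\big(\|b_{k_n}\|+\|L\|\,\|a_{k_n}\|\big)\big(\|y^*_{k_n}\|+\|\overline{y}^*\|\big).
\end{aligned} \tag{4.9}
\]
Hence, the boundedness of $(a_{k_n})_{n\in\mathbb{N}}$, $(b_{k_n})_{n\in\mathbb{N}}$, $(y^*_{k_n})_{n\in\mathbb{N}}$, $(D^h(\overline{x},x_{k_n}))_{n\in\mathbb{N}}$, and $(D^j(L\overline{x},Lx_{k_n}))_{n\in\mathbb{N}}$ implies that of $(D^h(\overline{x},a_{k_n}))_{n\in\mathbb{N}}$ and $(D^j(L\overline{x},b_{k_n}))_{n\in\mathbb{N}}$. In turn, by Lemma 4.1(vi),
\[
\Big(D^{h^*}\big(\nabla h(a_{k_n}),\nabla h(\overline{x})\big)\Big)_{n\in\mathbb{N}}\ \text{and}\ \Big(D^{j^*}\big(\nabla j(b_{k_n}),\nabla j(L\overline{x})\big)\Big)_{n\in\mathbb{N}}\ \text{are bounded} \tag{4.10}
\]
and, since Lemma 4.1(v) asserts that $D^{h^*}(\cdot,\nabla h(\overline{x}))$ and $D^{j^*}(\cdot,\nabla j(L\overline{x}))$ are coercive, it follows from (4.10) that $(\nabla h(a_{k_n}))_{n\in\mathbb{N}}$ and $(\nabla j(b_{k_n}))_{n\in\mathbb{N}}$ are bounded. Thus, since $(y^*_{k_n})_{n\in\mathbb{N}}$, $(\nabla h(x_{k_n}))_{n\in\mathbb{N}}$, and $(\nabla j(Lx_{k_n}))_{n\in\mathbb{N}}$ are bounded, we infer from (3.2) that
\[
(a^*_{k_n})_{n\in\mathbb{N}}\ \text{and}\ (b^*_{k_n})_{n\in\mathbb{N}}\ \text{are bounded}. \tag{4.11}
\]
On the other hand, since (3.12) yields $\overline{\boldsymbol{x}}\in\boldsymbol{C}\subset\bigcap_{n\in\mathbb{N}}H^{\boldsymbol{f}}(\boldsymbol{x}_{k_n},\boldsymbol{x}_{k_n+1/2})$, (2.14) and (4.2) imply that
\[
(\forall n\in\mathbb{N})\quad D^f(\overline{x},x_{k_n+1/2})+D^{g^*}(\overline{y}^*,y^*_{k_n+1/2})=D^{\boldsymbol{f}}(\overline{\boldsymbol{x}},\boldsymbol{x}_{k_n+1/2})\leq D^{\boldsymbol{f}}(\overline{\boldsymbol{x}},\boldsymbol{x}_{k_n})\leq D^{\boldsymbol{f}}(\overline{\boldsymbol{x}},\boldsymbol{x}_0). \tag{4.12}
\]
Thus, Lemma 4.1(vi) yields
\[
\begin{aligned}
(\forall n\in\mathbb{N})\quad&D^{f^*}\big(\nabla f(x_{k_n+1/2}),\nabla f(\overline{x})\big)+D^{g}\big(\nabla g^*(y^*_{k_n+1/2}),\nabla g^*(\overline{y}^*)\big)\\
&=D^f(\overline{x},x_{k_n+1/2})+D^{g^*}(\overline{y}^*,y^*_{k_n+1/2})\leq D^{\boldsymbol{f}}(\overline{\boldsymbol{x}},\boldsymbol{x}_0)
\end{aligned} \tag{4.13}
\]
and, since $D^{f^*}(\cdot,\nabla f(\overline{x}))$ and $D^{g}(\cdot,\nabla g^*(\overline{y}^*))$ are coercive by Lemma 4.1(v), it follows that
\[
(\nabla f(x_{k_n+1/2}))_{n\in\mathbb{N}}\ \text{and}\ (\nabla g^*(y^*_{k_n+1/2}))_{n\in\mathbb{N}}\ \text{are bounded}. \tag{4.14}
\]
However, as in (3.28), $D^f(x_{k_n+1/2},x_{k_n})\to 0$ and $D^{g^*}(y^*_{k_n+1/2},y^*_{k_n})\to 0$, and it therefore follows from Lemma 4.1(vi) that
\[
D^{f^*}\big(\nabla f(x_{k_n}),\nabla f(x_{k_n+1/2})\big)\to 0\quad\text{and}\quad D^{g}\big(\nabla g^*(y^*_{k_n}),\nabla g^*(y^*_{k_n+1/2})\big)\to 0. \tag{4.15}
\]
In view of Lemma 4.1(iv), we infer from (4.3), (4.14), and (4.15) that there exists a strictly increasing sequence $(p_{k_n})_{n\in\mathbb{N}}$ in $\mathbb{N}$ such that
\[
\nabla f\big(x_{p_{k_n}+1/2}\big)\to\nabla f(x)\quad\text{and}\quad\nabla g^*\big(y^*_{p_{k_n}+1/2}\big)\to\nabla g^*(y^*). \tag{4.16}
\]
Since, by Lemma 4.1(i)–(ii), $(\nabla f)^{-1}=\nabla f^*$ is continuous on $\operatorname{int\,dom}f^*$ and $(\nabla g^*)^{-1}=\nabla g$ is continuous on $\operatorname{int\,dom}g$, we obtain $x_{p_{k_n}+1/2}\to x$ and $y^*_{p_{k_n}+1/2}\to y^*$. Thus,
\[
x_{p_{k_n}+1/2}-x_{p_{k_n}}\to 0\quad\text{and}\quad y^*_{p_{k_n}+1/2}-y^*_{p_{k_n}}\to 0. \tag{4.17}
\]
On the other hand, as in (3.27),
\[
(\forall n\in\mathbb{N})\quad\|x_{p_{k_n}}-x_{p_{k_n}+1/2}\|\,\|a^*_{p_{k_n}}+L^*b^*_{p_{k_n}}\|+\|b_{p_{k_n}}-La_{p_{k_n}}\|\,\|y^*_{p_{k_n}}-y^*_{p_{k_n}+1/2}\|\geq\sigma^{-1}\big(D^h(x_{p_{k_n}},a_{p_{k_n}})+D^j(Lx_{p_{k_n}},b_{p_{k_n}})\big), \tag{4.18}
\]
and hence, since $(a_{p_{k_n}})_{n\in\mathbb{N}}$ and $(b_{p_{k_n}})_{n\in\mathbb{N}}$ are bounded, we deduce from (4.11) and (4.17) that
\[
\begin{cases}
D^h\big(x_{p_{k_n}},a_{p_{k_n}}\big)\to 0\\
D^j\big(Lx_{p_{k_n}},b_{p_{k_n}}\big)\to 0\\
x_{p_{k_n}}\to x\\
Lx_{p_{k_n}}\to Lx\\
(a_{p_{k_n}})_{n\in\mathbb{N}}\ \text{has a cluster point}\\
(b_{p_{k_n}})_{n\in\mathbb{N}}\ \text{has a cluster point}.
\end{cases} \tag{4.19}
\]
Consequently, by dropping to a subsequence if necessary and invoking Lemma 4.1(iv), we get
\[
a_{p_{k_n}}\to x\quad\text{and}\quad b_{p_{k_n}}\to Lx. \tag{4.20}
\]
Hence, using the fact that $\nabla h(x_{p_{k_n}})\to\nabla h(x)$ and $\nabla j(Lx_{p_{k_n}})\to\nabla j(Lx)$, we derive that $\nabla h(x_{p_{k_n}})-\nabla h(a_{p_{k_n}})\to 0$ and $\nabla j(Lx_{p_{k_n}})-\nabla j(b_{p_{k_n}})\to 0$, which, in view of (3.2), yields
\[
a^*_{p_{k_n}}+L^*y^*_{p_{k_n}}\to 0\quad\text{and}\quad b^*_{p_{k_n}}-y^*_{p_{k_n}}\to 0. \tag{4.21}
\]
Thus, since $y^*_{p_{k_n}}\to y^*$, it follows that $a^*_{p_{k_n}}\to-L^*y^*$ and $b^*_{p_{k_n}}\to y^*$. In summary,
\[
\operatorname{gra}A\ni(a_{p_{k_n}},a^*_{p_{k_n}})\to(x,-L^*y^*)\quad\text{and}\quad\operatorname{gra}B\ni(b_{p_{k_n}},b^*_{p_{k_n}})\to(Lx,y^*). \tag{4.22}
\]
Since $\operatorname{gra}A$ and $\operatorname{gra}B$ are closed [9, Proposition 20.33(iii)], we conclude that $(x,-L^*y^*)\in\operatorname{gra}A$ and $(Lx,y^*)\in\operatorname{gra}B$, and therefore that $(x,y^*)\in\boldsymbol{Z}$.

Let us note that, even in Euclidean spaces, it may be easier to evaluate $(\nabla h+\gamma\partial\varphi)^{-1}$ than the usual proximity operator $\operatorname{prox}_{\gamma\varphi}=(\mathrm{Id}+\gamma\partial\varphi)^{-1}$ introduced by Moreau [31], which is based on $h=\|\cdot\|^2/2$.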
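To make this comparison concrete, here is a minimal one-dimensional sketch. It takes $h(\xi)=\xi\ln\xi-\xi$ (an entropy kernel, chosen here purely for illustration) and $\varphi(\xi)=\xi\ln\xi-\omega\xi$ on $]0,+\infty[$. The Bregman resolvent then solves $(1+\gamma)\ln\eta+\gamma(1-\omega)=\xi$ in closed form, whereas the classical proximity operator requires solving the transcendental equation $\eta+\gamma(\ln\eta+1-\omega)=\xi$, done below by bisection. The function names and numerical values are hypothetical.

```python
import math


def bregman_resolvent(xi, gamma, omega):
    """(h' + gamma*phi')^{-1}(xi) for h = xi*ln(xi) - xi, phi = xi*ln(xi) - omega*xi."""
    return math.exp((xi + gamma * (omega - 1.0)) / (1.0 + gamma))


def prox_by_bisection(xi, gamma, omega, iterations=200):
    """(Id + gamma*phi')^{-1}(xi): no elementary closed form, so bisect on ]0, +inf[."""

    def g(eta):
        # Strictly increasing on ]0, +inf[, with limits -inf and +inf.
        return eta + gamma * (math.log(eta) + 1.0 - omega) - xi

    lo, hi = 1.0, 1.0
    while g(lo) > 0.0:   # shrink until the root is bracketed from below
        lo *= 0.5
    while g(hi) < 0.0:   # grow until the root is bracketed from above
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical data, chosen only for illustration.
xi, gamma, omega = 0.4, 2.0, 0.5

eta_bregman = bregman_resolvent(xi, gamma, omega)
eta_prox = prox_by_bisection(xi, gamma, omega)

# Residuals of the two defining equations.
res_bregman = math.log(eta_bregman) + gamma * (math.log(eta_bregman) + 1.0 - omega) - xi
res_prox = eta_prox + gamma * (math.log(eta_prox) + 1.0 - omega) - xi

print(eta_bregman, res_bregman)
print(eta_prox, res_prox)
```

The two operators generally return different points, since they are resolvents with respect to different kernels; the sketch only contrasts the effort needed to evaluate each one.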
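We provide illustrations of such instances in the standard Euclidean space $\mathbb{R}^m$.

Example 4.3 Let $\gamma\in\,]0,+\infty[$, let $\phi\in\Gamma_0(\mathbb{R})$ be such that $\operatorname{dom}\phi\,\cap\,]0,+\infty[\,\neq\varnothing$, and let $\vartheta$ be the Boltzmann-Shannon entropy function, i.e.,
\[
\vartheta\colon\xi\mapsto
\begin{cases}
\xi\ln\xi-\xi,&\text{if }\xi\in\,]0,+\infty[;\\
0,&\text{if }\xi=0;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.23}
\]
Set $\varphi\colon(\xi_i)_{1\leq i\leq m}\mapsto\sum_{i=1}^m\phi(\xi_i)$ and $h\colon(\xi_i)_{1\leq i\leq m}\mapsto\sum_{i=1}^m\vartheta(\xi_i)$. Note that $h$ is a supercoercive Legendre function [5, Sections 5 and 6] and hence Proposition 2.11(iv) asserts that $\nabla h+\gamma\partial\varphi$ is coercive and $\operatorname{dom}(\nabla h+\gamma\partial\varphi)^{-1}=\mathbb{R}^m$. Now let $(\xi_i)_{1\leq i\leq m}\in\mathbb{R}^m$, set $(\eta_i)_{1\leq i\leq m}=(\nabla h+\gamma\partial\varphi)^{-1}(\xi_i)_{1\leq i\leq m}$, let $W$ be the Lambert function [21, 26], i.e., the inverse of $\xi\mapsto\xi e^{\xi}$ on $[0,+\infty[$, and let $i\in\{1,\ldots,m\}$. Then $\eta_i$ can be computed as follows.

(i) Let $\omega\in\mathbb{R}$ and suppose that
\[
\phi\colon\xi\mapsto
\begin{cases}
\xi\ln\xi-\omega\xi,&\text{if }\xi\in\,]0,+\infty[;\\
0,&\text{if }\xi=0;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.24}
\]
Then $\eta_i=e^{(\xi_i+\gamma(\omega-1))/(\gamma+1)}$.

(ii) Let $p\in[1,+\infty[$ and suppose that either $\phi=|\cdot|^p/p$ or
\[
\phi\colon\xi\mapsto
\begin{cases}
\xi^p/p,&\text{if }\xi\in[0,+\infty[;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.25}
\]
Then
\[
\eta_i=
\begin{cases}
\left(\dfrac{W\big(\gamma(p-1)e^{(p-1)\xi_i}\big)}{\gamma(p-1)}\right)^{\frac{1}{p-1}},&\text{if }p>1;\\[3mm]
e^{\xi_i-\gamma},&\text{if }p=1.
\end{cases} \tag{4.26}
\]

(iii) Let $p\in[1,+\infty[$ and suppose that
\[
\phi\colon\xi\mapsto
\begin{cases}
\xi^{-p}/p,&\text{if }\xi\in\,]0,+\infty[;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.27}
\]
Then
\[
\eta_i=\left(\frac{W\big(\gamma(p+1)e^{-(p+1)\xi_i}\big)}{\gamma(p+1)}\right)^{-\frac{1}{p+1}}. \tag{4.28}
\]

(iv) Let $p\in\,]0,1[$ and suppose that
\[
\phi\colon\xi\mapsto
\begin{cases}
-\xi^p/p,&\text{if }\xi\in[0,+\infty[;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.29}
\]
Then
\[
\eta_i=\left(\frac{W\big(\gamma(1-p)e^{(p-1)\xi_i}\big)}{\gamma(1-p)}\right)^{\frac{1}{p-1}}. \tag{4.30}
\]

Example 4.4 Let $\phi\in\Gamma_0(\mathbb{R})$ be such that $\operatorname{dom}\phi\,\cap\,]0,1[\,\neq\varnothing$ and let $\vartheta$ be the Fermi-Dirac entropy, i.e.,
\[
\vartheta\colon\xi\mapsto
\begin{cases}
\xi\ln\xi+(1-\xi)\ln(1-\xi),&\text{if }\xi\in\,]0,1[;\\
0,&\text{if }\xi\in\{0,1\};\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.31}
\]
Set $\varphi\colon(\xi_i)_{1\leq i\leq m}\mapsto\sum_{i=1}^m\phi(\xi_i)$ and $h\colon(\xi_i)_{1\leq i\leq m}\mapsto\sum_{i=1}^m\vartheta(\xi_i)$. Note that $h$ is a Legendre function [5, Sections 5 and 6] and that $\operatorname{int\,dom}h=\,]0,1[^m$ is bounded. Therefore, Proposition 2.11(i) asserts that $\nabla h+\partial\varphi$ is coercive and that $\operatorname{dom}(\nabla h+\partial\varphi)^{-1}=\mathbb{R}^m$. Now let $(\xi_i)_{1\leq i\leq m}\in\mathbb{R}^m$, set $(\eta_i)_{1\leq i\leq m}=(\nabla h+\partial\varphi)^{-1}(\xi_i)_{1\leq i\leq m}$, and let $i\in\{1,\ldots,m\}$. Then $\eta_i$ can be computed as follows.

(i) Let $\omega\in\mathbb{R}$ and suppose that
\[
\phi\colon\xi\mapsto
\begin{cases}
\xi\ln\xi-\omega\xi,&\text{if }\xi\in\,]0,+\infty[;\\
0,&\text{if }\xi=0;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.32}
\]
Then $\eta_i=-e^{\xi_i+\omega-1}/2+\sqrt{e^{2(\xi_i+\omega-1)}/4+e^{\xi_i+\omega-1}}$.

(ii) Suppose that
\[
\phi\colon\xi\mapsto
\begin{cases}
(1-\xi)\ln(1-\xi)+\xi,&\text{if }\xi\in\,]-\infty,1[;\\
1,&\text{if }\xi=1;\\
+\infty,&\text{otherwise}.
\end{cases} \tag{4.33}
\]
Then $\eta_i=1+e^{-\xi_i}/2-\sqrt{e^{-\xi_i}+e^{-2\xi_i}/4}$.

References

[1] M. A. Alghamdi, A. Alotaibi, P. L. Combettes, and N. Shahzad, A primal-dual method of partial inverses for composite inclusions,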
Optim. Lett., vol. 8, pp. 2271–2284, 2014.
[2] A. Alotaibi, P. L. Combettes, and N. Shahzad, Solving coupled composite monotone inclusions by successive Fejér approximations of their Kuhn-Tucker set, SIAM J. Optim., vol. 24, pp. 2076–2095, 2014.
[3] A. Alotaibi, P. L. Combettes, and N. Shahzad, Best approximation from the Kuhn-Tucker set of composite monotone inclusions, Numer. Funct. Anal. Optim., to appear.
[4] H. Attouch, L. M. Briceño-Arias, and P. L. Combettes, A strongly convergent primal-dual method for nonoverlapping domain decomposition, Numer. Math., published online 2015-07-10.
[5] H. H. Bauschke and J. M. Borwein, Legendre functions and the method of random Bregman projections, J. Convex Anal., vol. 4, pp. 27–67, 1997.
[6] H. H. Bauschke, J. M. Borwein, and P. L. Combettes, Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces, Commun. Contemp. Math., vol. 3, pp. 615–647, 2001.
[7] H. H. Bauschke, J. M. Borwein, and P. L. Combettes, Bregman monotone optimization algorithms, SIAM J. Control Optim., vol. 42, pp. 596–636, 2003.
[8] H. H. Bauschke and P. L. Combettes, Construction of best Bregman approximations in reflexive Banach spaces, Proc. Amer. Math. Soc., vol. 131, pp. 3757–3766, 2003.
[9] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York, 2011.
[10] J. M. Borwein, Fifty years of maximal monotonicity, Optim. Lett., vol. 4, pp. 473–490, 2010.
[11] R. I. Boţ, E. R. Csetnek, and A. Heinrich, A primal-dual splitting algorithm for finding zeros of sums of maximal monotone operators, SIAM J. Optim., vol. 23, pp. 2011–2036, 2013.
[12] R. I. Boţ and C. Hendrich, A Douglas-Rachford type primal-dual method for solving inclusions with mixtures of composite and parallel-sum type monotone operators, SIAM J. Optim., vol. 23, pp. 2541–2565, 2013.
[13] L. M. Bregman, The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Comput. Math. Math. Phys., vol. 7, pp. 200–217, 1967.
[14] L. M. Briceño-Arias and P. L. Combettes, A monotone+skew splitting model for composite monotone inclusions in duality, SIAM J. Optim., vol. 21, pp. 1230–1250, 2011.
[15] F. E. Browder, Multi-valued monotone nonlinear mappings and duality mappings in Banach spaces, Trans. Amer. Math. Soc., vol. 118, pp. 338–351, 1965.
[16] Y. Censor and S. A. Zenios, Parallel Optimization: Theory, Algorithms and Applications. Oxford University Press, New York, 1997.
[17] I. Cioranescu, Geometry of Banach spaces, Duality Mappings and Nonlinear Problems. Kluwer, Dordrecht, 1990.
[18] P. L. Combettes, Strong convergence of block-iterative outer approximation methods for convex optimization, SIAM J. Control Optim., vol. 38, pp. 538–565, 2000.
[19] P. L. Combettes, Systems of structured monotone inclusions: Duality, algorithms, and applications, SIAM J. Optim., vol. 23, pp. 2420–2447, 2013.
[20] P. L. Combettes and B. C. Vũ, Variable metric forward-backward splitting with applications to monotone inclusions in duality, Optimization, vol. 63, pp. 1289–1318, 2014.
[21] R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, and D. E. Knuth, On the Lambert W function, Adv. Comput. Math., vol. 5, pp. 329–359, 1996.
[22] J. Eckstein, Nonlinear proximal point algorithms using Bregman functions, with applications to convex programming, Math. Oper. Res., vol. 18, pp. 202–226, 1993.
[23] Y. Haugazeau, Sur les Inéquations Variationnelles et la Minimisation de Fonctionnelles Convexes. Thèse, Université de Paris, Paris, France, 1968.
[24] R. I. Kačurovskiĭ, On monotone operators and convex functionals, Uspekhi Mat. Nauk, vol. 15, pp. 213–215, 1960.
[25] G. Kassay, The proximal points algorithm for reflexive Banach spaces, Studia Univ. Babeş-Bolyai Math., vol. 30, pp. 9–17, 1985.
[26] J. H. Lamberti, Observationes variae in mathesin puram, Acta Helvetica, Physico-Mathematico-Anatomico-Botanico-Medica, vol. 3, pp. 128–168, 1758.
[27] P. L. Lions and B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM J. Numer. Anal., vol. 16, pp. 964–979, 1979.
[28] B. Mercier, Topics in Finite Element Solution of Elliptic Problems (Lectures on Mathematics, no. 63). Tata Institute of Fundamental Research, Bombay, 1979.
[29] G. J. Minty, On the maximal domain of a "monotone" function, Michigan Math. J., vol. 8, pp. 135–137, 1961.
[30] G. J. Minty, Monotone (nonlinear) operators in Hilbert space, Duke Math. J., vol. 29, pp. 341–346, 1962.
[31] J. J. Moreau, Fonctions convexes duales et points proximaux dans un espace hilbertien, C. R. Acad. Sci. Paris Sér. A Math., vol. 255, pp. 2897–2899, 1962.
[32] J. J. Moreau, Fonctionnelles sous-différentiables, C. R. Acad. Sci. Paris Sér. A Math., vol. 257, pp. 4117–4119, 1963.
[33] T. Pennanen, Dualization of generalized equations of maximal monotone type, SIAM J. Optim., vol. 10, pp. 809–835, 2000.
[34] R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim., vol. 14, pp. 877–898, 1976.
[35] S. Simons, From Hahn-Banach to Monotonicity, Lecture Notes in Math. 1693, Springer-Verlag, New York, 2008.
[36] M. Teboulle, Entropic proximal mappings with applications to nonlinear programming, Math. Oper. Res., vol. 17, pp. 670–690, 1992.
[37] P. Tseng, A modified forward-backward splitting method for maximal monotone mappings, SIAM J. Control Optim., vol. 38, pp. 431–446, 2000.
[38] B. C. Vũ, A splitting algorithm for dual monotone inclusions involving cocoercive operators, Adv. Comput. Math., vol. 38, pp. 667–681, 2013.
[39] C. Zălinescu, On uniformly convex functions, J. Math. Anal. Appl., vol. 95, pp. 344–374, 1983.
[40] C. Zălinescu, Convex Analysis in General Vector Spaces. World Scientific Publishing, River Edge, NJ, 2002.
[41] E. H. Zarantonello, Solving functional equations by contractive averaging, Tech. Report 160, Math. Res. Center U.S. Army, Madison, University of Wisconsin, June 1960.
[42] E. Zeidler, Nonlinear Functional Analysis and Its Applications, vols. I–V. Springer-Verlag, New York, 1985–1993.