Affine nonexpansive operators, Attouch-Théra duality and the Douglas-Rachford algorithm
Heinz H. Bauschke∗, Brett Lukens† and Walaa M. Moursi‡

March 30, 2016
In tribute to Michel Théra on his 70th birthday
Abstract
The Douglas-Rachford splitting algorithm was originally proposed in 1956 to solve a system of linear equations arising from the discretization of a partial differential equation. In 1979, Lions and Mercier brought forward a very powerful extension of this method suitable to solve optimization problems. In this paper, we revisit the original affine setting. We provide a powerful convergence result for finding a zero of the sum of two maximally monotone affine relations. As a by-product of our analysis, we obtain results concerning the convergence of iterates of affine nonexpansive mappings as well as Attouch-Théra duality. Numerous examples are presented.
Primary 47H05, 47H09, 49M27; Secondary 49M29, 49N15, 90C25.
Keywords: affine mapping, Attouch-Théra duality, Douglas-Rachford algorithm, linear convergence, maximally monotone operator, nonexpansive mapping, paramonotone operator, strong convergence, Toeplitz matrix, tridiagonal matrix.

∗ Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail: [email protected].
† [email protected].
‡ Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada, and Mansoura University, Faculty of Science, Mathematics Department, Mansoura 35516, Egypt. E-mail: [email protected].

1 Introduction
Throughout this paper, X is a real Hilbert space with inner product ⟨·, ·⟩ and induced norm ‖·‖. A central problem in optimization is to

find x ∈ X such that 0 ∈ (A + B)x, (1)

where A and B are maximally monotone operators on X; see, e.g., [7], [14], [15], [17], [18], [34], [35], [33], [39], [40], and the references therein. As Lions and Mercier observed in their landmark paper [29], one may iteratively solve the sum problem (1) by the celebrated Douglas-Rachford splitting algorithm (see also [24]). This algorithm proceeds by iterating the operator T = Id − J_A + J_B R_A; the sequence (J_A T^n x)_{n∈N} converges to a solution of (1) (see Section 5 for details). The Douglas-Rachford algorithm was originally proposed in 1956 by Douglas and Rachford [21]. It can be viewed as a method for solving a system of linear equations where the underlying coefficient matrix is positive definite. The far-reaching extension to optimization provided by Lions and Mercier [29] is not at all obvious (for the sake of completeness, we sketch this connection in the Appendix).

In this paper, we concentrate on the affine setting. In the original setting considered by Douglas and Rachford, the operators A and B correspond to positive definite matrices. We extend this result in various directions. Indeed, we obtain strong convergence in possibly infinite-dimensional Hilbert space; the operators A and B may be affine maximally monotone relations; and we also identify the limit. The remainder of this paper is organized as follows. In Section 2, we provide several results which will be useful in the derivation of the main results. A new characterization of strongly convergent iterations of affine nonexpansive operators (Theorem 3.3) is presented in Section 3. We also discuss when the convergence is linear. In Section 4, we obtain new results, formulated using the Douglas-Rachford operator, on the relative geometry of the primal and dual (in the sense of Attouch-Théra duality) solutions to (1). The main algorithmic result (Theorem 5.1) is derived in Section 5. It provides precise information on the behaviour of the Douglas-Rachford algorithm in the affine case. Numerous examples are presented in Section 6, where we also pay attention to tridiagonal Toeplitz matrices and Kronecker products. In the Appendix, we sketch the connection between the historical Douglas-Rachford algorithm and the powerful extension provided by Lions and Mercier.

Finally, the notation we employ is quite standard and follows largely [7]. Let C be a nonempty closed convex subset of X. We use N_C and P_C to denote the normal cone operator and the projector associated with C, respectively. Let Y be a Banach space. We shall use B(Y) to denote the set of bounded linear operators on Y. Let L ∈ B(Y). The operator norm of L is ‖L‖ = sup_{‖y‖≤1} ‖Ly‖. Further notation is developed as necessary during the course of this paper.

2 Auxiliary results
In this section, we collect various results that will be useful in the sequel. Suppose that T : X → X. Then T is nonexpansive if

(∀x ∈ X)(∀y ∈ X) ‖Tx − Ty‖ ≤ ‖x − y‖; (2)

T is firmly nonexpansive if

(∀x ∈ X)(∀y ∈ X) ‖Tx − Ty‖² + ‖(Id − T)x − (Id − T)y‖² ≤ ‖x − y‖²; (3)

and T is asymptotically regular if

(∀x ∈ X) T^n x − T^{n+1} x → 0. (4)
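Resolvents supply a rich family of firmly nonexpansive maps: for a maximally monotone operator A, the resolvent J_A = (Id + A)^{−1} satisfies (3) (this standard fact is recalled in Section 4). The following sketch checks inequality (3) numerically for the resolvent of a monotone 2×2 matrix; the matrix A and the sample points are arbitrary illustrative choices.

```python
# Numerically check the firm nonexpansiveness inequality (3) for T = J_A,
# where J_A = (Id + A)^(-1) and A is a monotone 2x2 matrix
# (A + A^T = diag(4, 2) is positive semidefinite).
# The matrix A and the sample points are arbitrary illustrative choices.

def matvec(M, v):
    return (M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1])

def inv2(M):
    a, b, c, d = M[0][0], M[0][1], M[1][0], M[1][1]
    det = a*d - b*c
    return ((d/det, -b/det), (-c/det, a/det))

def sqnorm(v):
    return v[0]*v[0] + v[1]*v[1]

A  = ((2.0, 1.0), (-1.0, 1.0))              # monotone matrix
JA = inv2(((3.0, 1.0), (-1.0, 2.0)))        # (Id + A)^(-1)

pts = [(1.0, -2.0), (0.5, 3.0), (-4.0, 1.5), (2.0, 2.0)]
violations = 0
for x in pts:
    for y in pts:
        Tx, Ty = matvec(JA, x), matvec(JA, y)
        lhs = sqnorm((Tx[0]-Ty[0], Tx[1]-Ty[1])) + sqnorm(
            ((x[0]-Tx[0]) - (y[0]-Ty[0]), (x[1]-Tx[1]) - (y[1]-Ty[1])))
        rhs = sqnorm((x[0]-y[0], x[1]-y[1]))
        if lhs > rhs + 1e-12:                # inequality (3) violated?
            violations += 1
assert violations == 0
```

Of course, such spot checks prove nothing; they merely illustrate the inequality on a handful of pairs.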
Fact 2.1.
Let T : X → X. Then

T firmly nonexpansive and Fix T ≠ ∅ ⇒ T asymptotically regular. (5)

Proof. See [16, Corollary 1.1] or [7, Corollary 5.16(ii)]. ∎
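Fact 2.1 is the engine behind the convergence theory of Section 5: the Douglas-Rachford operator T = Id − J_A + J_B R_A from the introduction is firmly nonexpansive (Fact 4.3 below) and has fixed points whenever zer(A + B) ≠ ∅, hence it is asymptotically regular. A numerical sketch on R² with the affine monotone operators Ax = M₁x − q and Bx = M₂x, whose sum vanishes at x* = (M₁ + M₂)^{−1}q; all concrete matrices and the starting point are illustrative choices.

```python
# Douglas-Rachford iteration T = Id - J_A + J_B R_A for the affine monotone
# operators A(x) = M1 x - q and B(x) = M2 x on R^2.  The zero of A + B is
# x* = (M1 + M2)^(-1) q.  All matrices and the starting point are
# illustrative choices.

def matvec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def inv2(M):
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [[M[1][1]/det, -M[0][1]/det], [-M[1][0]/det, M[0][0]/det]]

M1, M2, q = [[2.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 2.0]], [1.0, 1.0]
JA_lin = inv2([[3.0, 0.0], [0.0, 2.0]])     # (Id + M1)^(-1)
JB_lin = inv2([[2.0, 1.0], [1.0, 3.0]])     # (Id + M2)^(-1)

def JA(x):  # J_A x = (Id + M1)^(-1) (x + q)
    return matvec(JA_lin, [x[0] + q[0], x[1] + q[1]])

def T(x):   # T x = x - J_A x + J_B (2 J_A x - x)
    ja = JA(x)
    jb = matvec(JB_lin, [2*ja[0] - x[0], 2*ja[1] - x[1]])
    return [x[0] - ja[0] + jb[0], x[1] - ja[1] + jb[1]]

x = [5.0, -3.0]
for _ in range(200):
    x = T(x)

Tx = T(x)
diff = [Tx[0] - x[0], Tx[1] - x[1]]         # asymptotic regularity: T^n x - T^(n+1) x -> 0
shadow = JA(x)                              # shadow sequence J_A T^n x
xstar = matvec(inv2([[3.0, 1.0], [1.0, 3.0]]), q)
assert abs(diff[0]) < 1e-10 and abs(diff[1]) < 1e-10
assert abs(shadow[0] - xstar[0]) < 1e-10 and abs(shadow[1] - xstar[1]) < 1e-10
```

Here the shadow sequence converges to the unique zero of A + B, in line with the main result (Theorem 5.1) below.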
Fact 2.2.
Let L : X → X be linear and nonexpansive, and let x ∈ X. Then

L^n x → P_{Fix L} x ⇔ L^n x − L^{n+1} x → 0. (6)

Proof. See [2, Proposition 4], [3, Theorem 1.1], [8, Theorem 2.2] or [7, Proposition 5.27]. (We mention in passing that in [2, Proposition 4] the author proved the result for general odd nonexpansive mappings in Hilbert spaces, and in [3, Theorem 1.1] the authors generalized the result to Banach spaces.) ∎
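To illustrate Fact 2.2, consider L = (1/2)(Id + S) on R³, where S is the cyclic coordinate shift: L is linear and nonexpansive, Fix L is the line of constant vectors, and P_{Fix L}x replaces each coordinate by the average. The operator and the starting vector are arbitrary illustrative choices.

```python
# Fact 2.2 numerically: for the nonexpansive linear map L = (Id + S)/2 on R^3,
# with S the cyclic shift, the powers L^n x converge to P_{Fix L} x
# (coordinatewise average) and the differences L^n x - L^(n+1) x vanish.
# The starting vector is an arbitrary choice.

def L(v):
    s = (v[2], v[0], v[1])                   # cyclic shift S
    return tuple((v[i] + s[i]) / 2 for i in range(3))

x = (9.0, -1.0, 4.0)
mean = sum(x) / 3                            # P_{Fix L} x = (mean, mean, mean)

v = x
for _ in range(200):
    v = L(v)
w = L(v)

assert all(abs(v[i] - mean) < 1e-10 for i in range(3))   # L^n x -> P_{Fix L} x
assert all(abs(v[i] - w[i]) < 1e-10 for i in range(3))   # asymptotic regularity
```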
Definition 2.3.
Let Y be a real Banach space, let (y_n)_{n∈N} be a sequence in Y, and let y_∞ ∈ Y. Then (y_n)_{n∈N} converges to y_∞, denoted y_n → y_∞, if ‖y_n − y_∞‖ → 0; (y_n)_{n∈N} converges µ-linearly to y_∞ if µ ∈ [0, 1[ and there exists M ≥ 0 such that

(∀n ∈ N) ‖y_n − y_∞‖ ≤ M µ^n; (7)

and (y_n)_{n∈N} converges linearly to y_∞ if there exist µ ∈ [0, 1[ and M ≥ 0 such that (7) holds. (By [10, Remark 3.7], this is equivalent to (∃M > 0)(∃N ∈ N)(∀n ≥ N) ‖y_n − y_∞‖ ≤ M µ^n.)

Example 2.4 (convergence vs. pointwise convergence of bounded linear operators). Let Y be a real Banach space, let (L_n)_{n∈N} be a sequence in B(Y), and let L_∞ ∈ B(Y). Then one says: (i) (L_n)_{n∈N} converges, or converges uniformly, to L_∞ if L_n → L_∞ (in B(Y)); (ii) (L_n)_{n∈N} converges pointwise to L_∞ if (∀y ∈ Y) L_n y → L_∞ y (in Y).

Remark 2.5. It is easy to see that convergence of a sequence of bounded linear operators implies pointwise convergence; however, the converse is not true (see, e.g., [27, Example 4.9-2]).
Lemma 2.6.
Let Y be a real Banach space, let (L_n)_{n∈N} be a sequence in B(Y), let L_∞ ∈ B(Y), and let µ ∈ ]0, 1[. Then

(∀y ∈ Y) L_n y → L_∞ y µ-linearly (in Y) ⇔ L_n → L_∞ µ-linearly (in B(Y)). (8)

Proof. Let y ∈ Y. "⇒": Because L_n y → L_∞ y µ-linearly, there exists M_y ≥ 0 such that (∀n ∈ N) ‖(L_n − L_∞)y‖ ≤ µ^n M_y; equivalently,

‖((L_n − L_∞)/µ^n)y‖ = ‖(L_n − L_∞)y‖/µ^n ≤ M_y. (9)

It follows from the Uniform Boundedness Principle (see, e.g., [27, 4.7-3]) applied to the sequence ((L_n − L_∞)/µ^n)_{n∈N} that (∃M ≥ 0)(∀n ∈ N) ‖(L_n − L_∞)/µ^n‖ ≤ M; equivalently, ‖L_n − L_∞‖ ≤ M µ^n, as required. "⇐": Since L_n → L_∞ µ-linearly, we have (∃M ≥ 0)(∀n ∈ N) ‖L_n − L_∞‖ ≤ M µ^n. Therefore, (∀n ∈ N) ‖L_n y − L_∞ y‖ ≤ ‖L_n − L_∞‖ ‖y‖ ≤ M ‖y‖ µ^n. ∎

Lemma 2.7. Suppose that X is finite-dimensional, let (L_n)_{n∈N} be a sequence of linear nonexpansive operators on X, and let L_∞ : X → X. Then the following are equivalent:

(i) (∀x ∈ X) L_n x → L_∞ x.
(ii) L_n → L_∞ pointwise (in X), and L_∞ is linear and nonexpansive.
(iii) L_n → L_∞ (in B(X)).

Proof. The implications "(i) ⇒ (ii)" and "(iii) ⇒ (i)" are easy to verify. "(ii) ⇒ (iii)": Suppose that (x_n)_{n∈N} is a sequence in X such that (∀n ∈ N) ‖x_n‖ = 1 and

‖L_n − L_∞‖ − ‖L_n x_n − L_∞ x_n‖ → 0. (10)

After passing to a subsequence if necessary, we can and do assume that x_n → x_∞. Since L_∞ and (L_n)_{n∈N} are linear and nonexpansive, we have ‖L_∞‖ ≤ 1 and (∀n ∈ N) ‖L_n‖ ≤ 1. Using the triangle inequality, we have

‖L_n x_n − L_∞ x_n‖ = ‖(L_n − L_∞)(x_n − x_∞) + (L_n − L_∞)x_∞‖ ≤ ‖L_n − L_∞‖ ‖x_n − x_∞‖ + ‖(L_n − L_∞)x_∞‖ ≤ 2‖x_n − x_∞‖ + ‖(L_n − L_∞)x_∞‖ → 0 + 0 = 0.

Now combine with (10). ∎
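In the setting of Lemma 2.7, one can watch the operator norms directly. The sketch below uses L = block-diag(1, 0.9·R(θ)) on R³, with R(θ) a planar rotation: Fix L = R × {0} × {0}, P_{Fix L} = diag(1, 0, 0), and ‖L^n − P_{Fix L}‖ = 0.9^n, so the powers converge (0.9)-linearly in B(X). The damping factor 0.9 and the angle θ are arbitrary choices.

```python
# Operator-norm linear convergence of powers: L = block-diag(1, 0.9 R(theta))
# on R^3.  On the plane {0} x R^2 the map L^n - P_{Fix L} acts as 0.9^n times
# a rotation (an isometry), so ||L^n - P_{Fix L}|| = 0.9^n.
# The angle theta is an arbitrary choice.
import math

theta = 0.7
c, s = math.cos(theta), math.sin(theta)

def L(v):
    return (v[0], 0.9*(c*v[1] - s*v[2]), 0.9*(s*v[1] + c*v[2]))

def power(v, n):
    for _ in range(n):
        v = L(v)
    return v

for n in (1, 5, 20):
    v = power((0.0, 1.0, 0.0), n)            # unit vector in the rotated plane
    err = math.hypot(v[1], v[2])             # = ||(L^n - P_{Fix L}) e2||
    assert abs(err - 0.9**n) < 1e-12
```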
Corollary 2.8.
Suppose that X is finite-dimensional, let L : X → X be linear, and let L_∞ : X → X be such that L^n → L_∞ pointwise. Then L^n → L_∞ linearly.

Proof. Combine Lemma 2.7 and [5, Theorem 2.12(i)]. ∎

3 Iterating an affine nonexpansive operator
We begin with a simple yet useful result.
Theorem 3.1.
Let L : X → X be linear, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Let x ∈ X. Then the following hold:

(i) b ∈ ran(Id − L).
(ii) (∀n ∈ N) T^n x = L^n x + ∑_{k=0}^{n−1} L^k b.

Proof. (i): Fix T ≠ ∅ ⇔ (∃y ∈ X) y = Ly + b ⇔ b ∈ ran(Id − L). (ii): We prove this by induction (see also [11, Theorem 3.2(ii)]). The case n = 0 is trivial. Now suppose that for some n ∈ N,

T^n x = L^n x + ∑_{k=0}^{n−1} L^k b. (11)

Then T^{n+1} x = T(T^n x) = T(L^n x + ∑_{k=0}^{n−1} L^k b) = L(L^n x + ∑_{k=0}^{n−1} L^k b) + b = L^{n+1} x + ∑_{k=0}^{n} L^k b. ∎

Let S be a nonempty closed convex subset of X and let w ∈ X. We recall the following useful translation formula (see, e.g., [7, Proposition 3.17]):

(∀x ∈ X) P_{w+S} x = w + P_S(x − w). (12)

Lemma 3.2.
Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then there exists a point a ∈ X such that b = a − La and

(∀x ∈ X) Tx = L(x − a) + a. (13)

Moreover, the following hold:

(i) Fix T = a + Fix L.
(ii) (∀x ∈ X) P_{Fix T} x = a + P_{Fix L}(x − a) = P_{(Fix L)^⊥} a + P_{Fix L} x.
(iii) (∀n ∈ N)(∀x ∈ X) T^n x = a + L^n(x − a).

Proof. The existence of a and (13) follow from Theorem 3.1 and the linearity of L. (i): Let y ∈ X. Then y ∈ Fix T ⇔ y − a ∈ Fix L ⇔ y ∈ a + Fix L. (ii): The first identity follows from combining (i) and (12). It follows from, e.g., [7, Corollary 3.22(ii)] that a + P_{Fix L}(x − a) = a + P_{Fix L} x − P_{Fix L} a = P_{(Fix L)^⊥} a + P_{Fix L} x. (iii): By telescoping, we have

∑_{k=0}^{n−1} L^k b = ∑_{k=0}^{n−1} L^k(a − La) = a − L^n a. (14)

Consequently, Theorem 3.1(ii) and (14) yield T^n x = L^n x + a − L^n a = a + L^n(x − a). ∎

The following result extends Fact 2.2 from the linear to the affine case.
Theorem 3.3.
Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then the following are equivalent:

(i) L is asymptotically regular.
(ii) L^n → P_{Fix L} pointwise.
(iii) T^n → P_{Fix T} pointwise.
(iv) T is asymptotically regular.

Proof. Let x ∈ X. "(i) ⇔ (ii)": This is Fact 2.2. "(ii) ⇒ (iii)": In view of Lemma 3.2(iii)&(ii), we have T^n x = L^n(x − a) + a → P_{Fix L}(x − a) + a = P_{Fix T} x. "(iii) ⇒ (iv)": T^n x − T^{n+1} x → P_{Fix T} x − P_{Fix T} x = 0. "(iv) ⇒ (i)": Using Lemma 3.2(iii), we have L^n x − L^{n+1} x = T^n(x + a) − T^{n+1}(x + a) → 0. ∎

We now turn to linear convergence.
Lemma 3.4.
Suppose that X is finite-dimensional, and let L : X → X be linear and nonexpansive. Then the following are equivalent:

(i) L is asymptotically regular.
(ii) L^n → P_{Fix L} pointwise (in X).
(iii) L^n → P_{Fix L} (in B(X)).
(iv) L^n → P_{Fix L} linearly pointwise (in X).
(v) L^n → P_{Fix L} linearly (in B(X)).

Proof. "(i) ⇔ (ii)": This follows from Fact 2.2. "(ii) ⇔ (iii)": Combine Lemma 2.7 and Fact 2.2. "(iii) ⇒ (v)": Apply Corollary 2.8 with L_∞ replaced by P_{Fix L}. "(v) ⇒ (iii)": This is obvious. "(iv) ⇔ (v)": Apply Lemma 2.6 to the sequence (L^n)_{n∈N} and use Fact 2.2. ∎

Theorem 3.5.
Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and let µ ∈ ]0, 1[. Then the following are equivalent:

(i) T^n → P_{Fix T} µ-linearly pointwise (in X).
(ii) L^n → P_{Fix L} µ-linearly pointwise (in X).
(iii) L^n → P_{Fix L} µ-linearly (in B(X)).

Proof. "(i) ⇔ (ii)": It follows from Lemma 3.2(iii)&(ii) that T^n x − P_{Fix T} x = a + L^n(x − a) − (a + P_{Fix L}(x − a)) = L^n(x − a) − P_{Fix L}(x − a) → 0, by Fact 2.2. "(ii) ⇔ (iii)": Combine Lemma 2.6 and Fact 2.2. ∎

Corollary 3.6. Suppose that X is finite-dimensional. Let L : X → X be linear, nonexpansive and asymptotically regular, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then T^n → P_{Fix T} pointwise linearly.

Proof. It follows from Fact 2.2 that L^n → P_{Fix L} pointwise. Consequently, by Corollary 2.8, L^n → P_{Fix L} linearly. Now apply Theorem 3.5. ∎

4 Attouch-Théra duality

Recall that a possibly set-valued operator A : X ⇒ X is monotone if for any two points (x, u) and (y, v) in the graph of A, denoted gra A, we have ⟨x − y, u − v⟩ ≥ 0; A is maximally monotone if there is no proper extension of gra A that preserves the monotonicity of A. The resolvent of A, denoted by J_A, is defined by J_A = (Id + A)^{−1}, while the reflected resolvent of A is R_A = 2J_A − Id. It is well known that for a maximally monotone operator A : X ⇒ X, the resolvent J_A is firmly nonexpansive and R_A is nonexpansive (see, e.g., [7, Corollary 23.10(i)&(ii)]). Given A : X ⇒ X, we also set A^⊖ = (−Id) ◦ A ◦ (−Id) and A^{−⊖} = (A^{−1})^⊖ = (A^⊖)^{−1}.

In the following, we assume that A : X ⇒ X and B : X ⇒ X are maximally monotone. The Attouch-Théra (see [1]) dual pair to the primal pair (A, B) is the pair (A^{−1}, B^{−⊖}). The primal problem associated with (A, B) is to

find x ∈ X such that 0 ∈ Ax + Bx, (15)

and its Attouch-Théra dual problem is to

find x ∈ X such that 0 ∈ A^{−1}x + B^{−⊖}x. (16)

We shall use Z and K to denote the sets of primal and dual solutions of (15) and (16), respectively, i.e.,

Z = Z(A, B) = (A + B)^{−1}(0)   and   K = K(A, B) = (A^{−1} + B^{−⊖})^{−1}(0). (17)

The Douglas-Rachford operator for the ordered pair (A, B) (see [29]) is defined by

T_DR = T_DR(A, B) = Id − J_A + J_B R_A = (1/2)(Id + R_B R_A). (18)

We recall that C : X ⇒ X is paramonotone if it is monotone and for all (x, u) and (y, v) in gra C we have

(x, u) ∈ gra C, (y, v) ∈ gra C, ⟨x − y, u − v⟩ = 0 ⇒ {(x, v), (y, u)} ⊆ gra C. (19)

For a detailed discussion of paramonotone operators, we refer the reader to [26].

Example 4.1. Let f : X → ]−∞, +∞] be proper, convex and lower semicontinuous. Then ∂f is paramonotone by [26, Proposition 2.2] (or by [7, Example 22.3(i)]).

Example 4.2. Suppose that X = R² and that A : R² → R² : (x, y) ↦ (y, −x). Then one can easily verify that A and −A are maximally monotone but not paramonotone by [26, Section 3] (or [13, Theorem 4.9]).
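The failure of paramonotonicity in Example 4.2 can be witnessed numerically: the rotator satisfies ⟨u − v, Au − Av⟩ = 0 for all u and v, yet Au ≠ Av in general; for a single-valued monotone map, paramonotonicity would force Au = Av whenever this inner product vanishes. The sample points below are arbitrary choices.

```python
# Example 4.2 numerically: A(x, y) = (y, -x) is monotone with
# <u - v, Au - Av> = 0 for ALL u, v; were A paramonotone (and single-valued),
# this would force Au = Av.  The sample points are arbitrary choices.

def A(p):
    return (p[1], -p[0])

def inner(p, q):
    return p[0]*q[0] + p[1]*q[1]

pts = [(1.0, 2.0), (-3.0, 0.5), (0.0, 0.0), (4.0, -1.0)]
for u in pts:
    for v in pts:
        d  = (u[0] - v[0], u[1] - v[1])
        Ad = (A(u)[0] - A(v)[0], A(u)[1] - A(v)[1])
        assert abs(inner(d, Ad)) < 1e-12   # monotone, with equality everywhere

u, v = (1.0, 2.0), (4.0, -1.0)
assert A(u) != A(v)                        # ...yet A is not constant: paramonotonicity fails
```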
Fact 4.3.
The following hold:

(i) T_DR is firmly nonexpansive.
(ii) zer(A + B) = J_A(Fix T_DR).

If A and B are paramonotone, then we additionally have:

(iii) Fix T_DR = Z + K.
(iv) (K − K) ⊥ (Z − Z).

Proof. (i): See [29, Lemma 1], [23, Corollary 4.2.1], or [7, Proposition 4.21(ii)]. (ii): See [19, Lemma 2.6(iii)] or [7, Proposition 25.1(ii)]. (iii): See [6, Corollary 5.5(iii)]. (iv): See [6, Corollary 5.5(iv)]. ∎

Lemma 4.4.
Suppose that A and B are paramonotone. Let k ∈ K be such that (∀z ∈ Z) J_A(z + k) = P_Z(z + k). Then k ∈ (Z − Z)^⊥.

Proof. By Fact 4.3(iii), Fix T_DR = Z + K. Let z₁ and z₂ be in Z. It follows from [6, Theorem 4.5] that (∀z ∈ Z) J_A(z + k) = z. Therefore,

(∀i ∈ {1, 2}) z_i + k ∈ Fix T_DR and z_i = J_A(z_i + k) = P_Z(z_i + k). (20)

Furthermore, the Projection Theorem (see, e.g., [7, Theorem 3.14]) yields

⟨k, z₁ − z₂⟩ = ⟨z₁ + k − z₁, z₁ − z₂⟩ = ⟨z₁ + k − P_Z(z₁ + k), P_Z(z₁ + k) − z₂⟩ ≥ 0. (21)

On the other hand, interchanging the roles of z₁ and z₂ yields ⟨k, z₂ − z₁⟩ ≥ 0. Altogether, ⟨k, z₁ − z₂⟩ = 0. ∎

The next result relates the Douglas-Rachford operator to orthogonality properties of primal and dual solutions.
Theorem 4.5.
Suppose that A and B are paramonotone. Then the following are equivalent:

(i) J_A P_{Fix T_DR} = P_Z.
(ii) J_A|_{Fix T_DR} = P_Z|_{Fix T_DR}.
(iii) K ⊥ (Z − Z).

Proof. "(i) ⇒ (ii)": This is obvious. "(ii) ⇒ (iii)": Let k ∈ K and let z ∈ Z. Then Fix T_DR = Z + K by Fact 4.3(iii); hence, z + k ∈ Fix T_DR. Therefore J_A(z + k) = P_Z(z + k). Now apply Lemma 4.4. "(iii) ⇒ (i)": This follows from [6, Theorem 6.7(ii)]. ∎

Corollary 4.6.
Let U be a closed affine subspace of X, suppose that A = N_U and that B is paramonotone, and suppose that Z ≠ ∅. Then the following hold:

(i) Z = U ∩ B^{−1}((par U)^⊥) ⊆ U.
(ii) (∀z ∈ Z) K = (−Bz) ∩ (par U)^⊥ ⊆ (par U)^⊥.
(iii) K ⊥ (Z − Z).
(iv) J_A P_{Fix T_DR} = P_U P_{Fix T_DR} = P_Z.

Proof. Since A = N_U = ∂ι_U, it is paramonotone by Example 4.1. (i): Let x ∈ X. Then x ∈ Z ⇔ 0 ∈ Ax + Bx ⇔ [x ∈ U and 0 ∈ (par U)^⊥ + Bx] ⇔ [x ∈ U and there exists y ∈ X such that y ∈ (par U)^⊥ and y ∈ Bx] ⇔ [x ∈ U and there exists y ∈ X such that x ∈ B^{−1}y and y ∈ (par U)^⊥] ⇔ x ∈ U ∩ B^{−1}((par U)^⊥). (ii): Let z ∈ Z. Applying [6, Remark 5.4] to (A^{−1}, B^{−⊖}) yields K = (−Bz) ∩ (Az) = (−Bz) ∩ (par U)^⊥. (iii): By (i), Z − Z ⊆ U − U = par U. Now use (ii). (iv): Combine (iii) and Theorem 4.5. ∎

Using [6, Proposition 2.10], we have

zer A ∩ zer B ≠ ∅ ⇔ 0 ∈ K. (22)

Theorem 4.7.
Suppose that A and B are paramonotone and that zer A ∩ zer B ≠ ∅. Then the following hold:

(i) Z = (zer A) ∩ (zer B) and 0 ∈ K.
(ii) J_A P_{Fix T_DR} = P_Z.
(iii) K ⊥ (Z − Z).

If, in addition, A or B is single-valued, then we also have:

(iv) K = {0}.
(v) Fix T_DR = (zer A) ∩ (zer B).

Proof. (i): Since zer A ∩ zer B ≠ ∅, it follows from (22) that 0 ∈ K. Now apply [6, Remark 5.4] to get Z = A^{−1}(0) ∩ B^{−1}(0) = (zer A) ∩ (zer B). (ii): This is [6, Corollary 6.8]. (iii): Combine (22) and Fact 4.3(iv). (iv): Let C ∈ {A, B} be single-valued. Using (i), we have Z ⊆ zer C. Suppose that C = A and let z ∈ Z. We use [6, Remark 5.4] applied to (A^{−1}, B^{−⊖}) to learn that K = (Az) ∩ (−Bz). Therefore {0} ⊆ K ⊆ Az ⊆ A(zer A) = {0}. A similar argument applies if C = B. (v): Combine Fact 4.3(iii) with (i)&(iv). ∎

(Recall that J_{N_C} = P_C for a nonempty closed convex subset C of X by, e.g., [7, Example 23.4]; for a closed affine subspace U of X, par U denotes the parallel space of U, defined by par U = U − U.)

Remark 4.8. The conclusion of Theorem 4.7(i) generalizes the setting of convex feasibility problems. Indeed, suppose that A = N_U and B = N_V, where U and V are nonempty closed convex subsets of X such that U ∩ V ≠ ∅. Then Z = U ∩ V = zer A ∩ zer B.

The assumptions that A and B are paramonotone are critical in the conclusion of Theorem 4.7(i), as we illustrate now.

Example 4.9.
Suppose that X = R², that U = R × {0}, that A = N_U, and that B : R² → R² : (x, y) ↦ (−y, x) is the counterclockwise rotator in the plane by π/2. Then one verifies that zer A = U, zer B = {(0, 0)}, and Z = zer(A + B) = U; however,

(zer A) ∩ (zer B) = {(0, 0)} ≠ U = Z.

Note that A is paramonotone by Example 4.1, while B is not paramonotone by Example 4.2.
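Running the Douglas-Rachford iteration of Section 5 on Example 4.9 makes the role of paramonotonicity concrete: the shadow sequence J_A T^n_DR x still reaches a point of Z = U, but not the projection P_Z x. Here J_A = P_U and J_B = (Id + B)^{−1}; the starting point is an arbitrary choice.

```python
# Douglas-Rachford for Example 4.9: A = N_U with U = R x {0}, B the rotator.
# J_A = P_U and J_B = (Id + B)^(-1) = (1/2) [[1, 1], [-1, 1]].  The shadow
# J_A T^n x converges to a point of Z = U, but NOT to P_Z x, since B is not
# paramonotone.  The starting point is an arbitrary choice.

def JA(p):                                  # projection onto U = R x {0}
    return (p[0], 0.0)

def JB(p):                                  # (Id + B)^(-1)
    return (0.5*(p[0] + p[1]), 0.5*(-p[0] + p[1]))

def T(p):                                   # T = Id - J_A + J_B R_A
    ja = JA(p)
    r = (2*ja[0] - p[0], 2*ja[1] - p[1])    # R_A p
    jb = JB(r)
    return (p[0] - ja[0] + jb[0], p[1] - ja[1] + jb[1])

x0 = (3.0, 1.0)
p = x0
for _ in range(100):
    p = T(p)

shadow = JA(p)
assert abs(shadow[1]) < 1e-12                         # the limit lies in Z = U
assert abs(shadow[0] - (x0[0] - x0[1])/2) < 1e-12     # limit = ((x - y)/2, 0)
assert abs(shadow[0] - x0[0]) > 1.0                   # ...which is not P_Z x0 = (3, 0)
```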
In view of (22) and Theorem 4.7(ii), when A and B are paramonotone, we have the implication 0 ∈ K ⇒ J_A P_{Fix T_DR} = P_Z. However, the converse implication is not true, as we show in the next example.

Example 4.10. Suppose that a ∈ X ∖ {0}, that A = Id − a, and that B = Id. Then Z = {(1/2)a}, (A^{−1}, B^{−⊖}) = (Id + a, Id), hence K = {−(1/2)a}, Z − Z = {0}, and therefore K ⊥ (Z − Z), which implies that J_A P_{Fix T_DR} = P_Z by Theorem 4.5, but 0 ∉ K.

If neither A nor B is single-valued, then the conclusion of Theorem 4.7(iv)&(v) may fail, as we now illustrate.

Example 4.11.
Suppose that X = R², that U = R × {0}, that V = ball((0, 1); 1), that A = N_U, and that B = N_V. (Here ball(u; r) denotes the closed ball in X centred at u with radius r, and R₊ = [0, +∞[ denotes the set of nonnegative real numbers.) By [6, Example 2.7], Z = U ∩ V = {(0, 0)} and K = N_{U−V}(0) = R₊ · (0, 1) ≠ {(0, 0)}. Therefore

Fix T_DR = R₊ · (0, 1) ≠ {(0, 0)} = U ∩ V = zer A ∩ zer B.

Recall that Passty's parallel sum (see, e.g., [32] or [7, Section 24.4]) is defined by

A □ B = (A^{−1} + B^{−1})^{−1}. (23)

In view of (17) and (23), one readily verifies that

K = (A □ B^⊖)(0). (24)

Lemma 4.12. Suppose that B : X ⇒ X is a linear relation, i.e., gra B is a linear subspace of X × X. Then the following hold:

(i) B^⊖ = B and B^{−⊖} = B^{−1}.
(ii) (A^{−1}, B^{−⊖}) = (A^{−1}, B^{−1}).
(iii) K = (A □ B)(0).

Proof. This is straightforward from the definitions. ∎

Let f : X → ]−∞, +∞] be proper, convex and lower semicontinuous. In the following, we make use of the well-known identity (see, e.g., [7, Corollary 16.24]):

(∂f)^{−1} = ∂f*. (25)

Here f* denotes the convex conjugate (a.k.a. Fenchel conjugate) of f, defined by f* : X → ]−∞, +∞] : u ↦ sup_{x∈X}(⟨x, u⟩ − f(x)). Given f : X → ]−∞, +∞], we set f^∨ : X → ]−∞, +∞] : x ↦ f(−x); when f is proper, Argmin f = {x ∈ X | f(x) = inf f(X)} denotes its set of minimizers. For a nonempty subset S of X, the strong relative interior of S, denoted by sri S, is the interior with respect to the closed affine hull of S. Finally, the infimal convolution of f and g is f □ g : X → [−∞, +∞] : x ↦ inf_{y∈X}(f(y) + g(x − y)).

Corollary 4.13 (subdifferential operators). Let f : X → ]−∞, +∞] and g : X → ]−∞, +∞] be proper, convex and lower semicontinuous. Suppose that A = ∂f and that B = ∂g. Then the following hold:

(i) Z = (∂f* □ ∂g*)(0).
(ii) K = (∂f □ ∂g^∨)(0).
(iii) Suppose that Argmin f ∩ Argmin g ≠ ∅. Then Z = ∂f*(0) ∩ ∂g*(0).
(iv) Suppose that 0 ∈ sri(dom f − dom g). Then Z = ∂(f* □ g*)(0).
(v) Suppose that 0 ∈ sri(dom f* + dom g*). Then K = ∂(f □ g^∨)(0).

Proof. Note that A and B are paramonotone by Example 4.1. (i): Using (25) and (23), we have Z = (A + B)^{−1}(0) = (((∂f)^{−1})^{−1} + ((∂g)^{−1})^{−1})^{−1}(0) = ((∂f*)^{−1} + (∂g*)^{−1})^{−1}(0) = (∂f* □ ∂g*)(0). (ii): Observe that (∂g)^{−⊖} = ((∂g)^⊖)^{−1} = (∂g^∨)^{−1}. Therefore, using (23) and (24), we have K = ((∂f)^{−1} + ((∂g)^⊖)^{−1})^{−1}(0) = ((∂f)^{−1} + (∂g^∨)^{−1})^{−1}(0) = (∂f □ ∂g^∨)(0). (iii): Using Theorem 4.7(i), Fermat's rule (see, e.g., [7, Theorem 16.2]) and (25), we have Z = (zer A) ∩ (zer B) = Argmin f ∩ Argmin g = (∂f)^{−1}(0) ∩ (∂g)^{−1}(0) = ∂f*(0) ∩ ∂g*(0). (iv): Combine (i) and [7, Proposition 24.27] applied to the functions f* and g*. (v): Combine (ii) and [7, Proposition 24.27] applied to the functions f and g^∨. ∎

5 The Douglas-Rachford algorithm in the affine case
In this section, we assume that A : X ⇒ X and B : X ⇒ X are maximally monotone and affine, and that

Z = {x ∈ X | 0 ∈ Ax + Bx} ≠ ∅. (26)

Since the resolvents J_A and J_B are affine (see [9, Theorem 2.1(xix)]), so is T_DR.

Theorem 5.1.
Let x ∈ X. Then the following hold:

(i) T^n_DR x → P_{Fix T_DR} x.
(ii) Suppose that A and B are paramonotone such that K ⊥ (Z − Z) (as is the case when A and B are paramonotone and (zer A) ∩ (zer B) ≠ ∅; see Theorem 4.7(iii)). Then J_A T^n_DR x → P_Z x.
(iii) Suppose that X is finite-dimensional. Then T^n_DR x → P_{Fix T_DR} x linearly and J_A T^n_DR x → J_A P_{Fix T_DR} x linearly.

Proof. (i): In view of Fact 4.3(ii) and (26), we have Fix T_DR ≠ ∅. Moreover, Fact 4.3(i) and Fact 2.1 imply that T_DR is asymptotically regular. It follows from Theorem 3.3 that (i) holds. (ii): Use (i) and Theorem 4.5. (iii): The linear convergence of (T^n_DR x)_{n∈N} follows from Corollary 3.6. The linear convergence of (J_A T^n_DR x)_{n∈N} is a direct consequence of the linear convergence of (T^n_DR x)_{n∈N} and the fact that J_A is (firmly) nonexpansive. ∎

Remark 5.2. Theorem 5.1 generalizes the convergence results for the original Douglas-Rachford algorithm [21] from particular symmetric matrices/affine operators on a finite-dimensional space to general affine relations defined on possibly infinite-dimensional spaces, while keeping strong and linear convergence of the iterates of the governing sequence (T^n_DR x)_{n∈N} and identifying the limit to be P_{Fix T_DR} x. Paramonotonicity coupled with common zeros yields convergence of the shadow sequence (J_A T^n_DR x)_{n∈N} to P_Z x.

Suppose that U and V are nonempty closed convex subsets of X. Then

T_{U,V} = T_DR(N_U, N_V) = Id − P_U + P_V(2P_U − Id). (27)

Proposition 5.3.
Suppose that U and V are closed linear subspaces of X. Let w ∈ X. Then w + U and w + V are closed affine subspaces of X, (w + U) ∩ (w + V) ≠ ∅, and

(∀n ∈ N) T^n_{w+U, w+V} = T^n_{U,V}(· − w) + w. (28)

(Recall that A : X ⇒ X is an affine relation if gra A is an affine subspace of X × X, i.e., a translation of a linear subspace of X × X; for further information on affine relations, we refer the reader to [12].)

Proof. Let x ∈ X. We proceed by induction. The case n = 0 is clear. Consider the case n = 1, i.e.,

T_{w+U, w+V} = T_{U,V}(· − w) + w. (29)

Indeed, using (12),

T_{w+U, w+V} x = (Id − P_{w+U} + P_{w+V}(2P_{w+U} − Id))x
= x − w − P_U(x − w) + w + P_V(2P_{w+U} x − x − w)
= x − w − P_U(x − w) + w + P_V(2w + 2P_U(x − w) − x − w)
= (x − w) − P_U(x − w) + P_V(2P_U(x − w) − (x − w)) + w
= (Id − P_U + P_V R_U)(x − w) + w = T_{U,V}(x − w) + w.

We now assume that (28) holds for some n ∈ N. Applying (29) with x replaced by T^n_{w+U, w+V} x yields

T^{n+1}_{w+U, w+V} x = T_{w+U, w+V}(T^n_{w+U, w+V} x) = T_{U,V}(T^n_{w+U, w+V} x − w) + w = T_{U,V}(T^n_{U,V}(x − w) + w − w) + w = T^{n+1}_{U,V}(x − w) + w; (30)

hence (28) holds for all n ∈ N. ∎

Example 5.4 (Douglas-Rachford in the affine feasibility case; see also [4, Corollary 4.5]). Suppose that U and V are closed linear subspaces of X. Let w ∈ X and let x ∈ X. Suppose that A = N_{w+U} and that B = N_{w+V}. Then T_{w+U, w+V} x = Lx + b, where L = T_{U,V} and b = w − T_{U,V} w. Moreover,

T^n_{w+U, w+V} x → P_{Fix T_{w+U, w+V}} x (31)

and

J_A T^n_{w+U, w+V} x = P_{w+U} T^n_{w+U, w+V} x → P_Z x = P_{(w+U) ∩ (w+V)} x. (32)

Finally, if U + V is closed (as is the case when X is finite-dimensional), then the convergence is linear with rate c_F(U, V) < 1, where c_F(U, V) is the cosine of the Friedrichs angle between U and V.

Proof. Using (28) with n = 1, we have

T_{w+U, w+V} = T_{U,V}(· − w) + w = T_{U,V} + w − T_{U,V} w. (33)

Hence L = T_{U,V} and b = w − T_{U,V} w, as claimed. To obtain (31) and (32), use Theorem 5.1(i) and Theorem 5.1(ii), respectively. The claim about the linear rate follows by combining [4, Corollary 4.4] and Theorem 3.5 with T replaced by T_{w+U, w+V}, L replaced by T_{U,V} and b replaced by w − T_{U,V} w. ∎

Remark 5.5. When X is infinite-dimensional, it is possible to construct an example (see [4, Section 6]) of two closed linear subspaces U and V where c_F(U, V) = 1, and the rate of convergence of T_DR is not linear. (For closed affine subspaces U and V of X, the cosine of the Friedrichs angle is c_F(U, V) = sup { |⟨u, v⟩| : u ∈ par U ∩ W^⊥ ∩ ball(0; 1), v ∈ par V ∩ W^⊥ ∩ ball(0; 1) } < 1, where W = par U ∩ par V.)

Example 5.6.
Suppose that X = R², that

A = (  0  −1
       1   0 )   and that   B = N_{{0}×R}. (34)

Then T_DR : R² → R² : (x, y) ↦ ((x − y)/2)·(1, −1), Fix T_DR = R · (1, −1), Z = {0} × R, K = R × {0}, hence K ⊥ (Z − Z), and

(∀(x, y) ∈ R²)(∀n ≥ 1) T^n_DR(x, y) = T_DR(x, y) = ((x − y)/2, (y − x)/2) ∈ Fix T_DR;

however,

(∀(x, y) ∈ R² with x + y ≠ 0)(∀n ≥ 1) J_A T^n_DR(x, y) = (0, (y − x)/2) ≠ (0, y) = P_Z(x, y). (35)

Note that A is not paramonotone by Example 4.2.

Proof. We have

J_A = (Id + A)^{−1} = ( 1  −1
                        1   1 )^{−1} = (1/2)(  1  1
                                              −1  1 ), (36)

and

R_A = 2J_A − Id = (  0  1
                    −1  0 ). (37)

Moreover, by [7, Example 23.4],

J_B = P_{{0}×R} = ( 0  0
                    0  1 ). (38)

Consequently,

T_DR = Id − J_A + J_B R_A = (1/2)(  1  −1
                                   −1   1 ), (39)

i.e.,

T_DR : R² → R² : (x, y) ↦ ((x − y)/2)(1, −1). (40)

Now let (x, y) ∈ R². Then (x, y) ∈ Fix T_DR ⇔ (x, y) = ((x − y)/2, (y − x)/2) ⇔ x + y = 0; hence Fix T_DR = R · (1, −1), as claimed. It follows from [19, Lemma 2.6(iii)] that Z = J_A(Fix T_DR) = R · J_A(1, −1) = R · (0, −1) = {0} × R, as claimed. Now let (x, y) ∈ R². By (40), we have T_DR(x, y) = ((x − y)/2)(1, −1) ∈ Fix T_DR; hence (∀n ≥ 1) T^n_DR(x, y) = T_DR(x, y) = ((x − y)/2)(1, −1). Therefore, (∀n ≥ 1) J_A T^n_DR(x, y) = J_A T_DR(x, y) = J_A(((x − y)/2)(1, −1)) = (0, (y − x)/2) ≠ (0, y) = P_Z(x, y) whenever x + y ≠ 0. ∎

The next example illustrates that the assumption K ⊥ (Z − Z) is critical for the conclusion in Theorem 5.1(ii).

Example 5.7 (when K is not orthogonal to Z − Z). Let u ∈ X ∖ {0}. Suppose that A : X → X : x ↦ u and B : X → X : x ↦ −u. Then A and B are paramonotone, A + B ≡ 0 and therefore Z = X. Moreover, by [6, Remark 5.4], (∀z ∈ Z = X) K = (Az) ∩ (−Bz) = {u}, which is not orthogonal to Z − Z = X. Note that Fix T_DR = Z + K = X + {u} = X and J_A : X → X : x ↦ x − u. Consequently,

(∀x ∈ X)(∀n ∈ N) J_A T^n_DR x = J_A P_{Fix T_DR} x = J_A x = x − u ≠ x = P_Z x. (41)

Proposition 5.8 (parallel splitting). Let m ∈ {2, 3, …}, and let B_i : X ⇒ X be maximally monotone and affine, i ∈ {1, 2, …, m}, such that zer(∑_{i=1}^m B_i) ≠ ∅. Set ∆ = {(x, …, x) ∈ X^m | x ∈ X}, set A = N_∆, set B = ×_{i=1}^m B_i, set T = T_DR(A, B), let j : X → X^m : x ↦ (x, x, …, x), and let e : X^m → X : (x₁, x₂, …, x_m) ↦ (1/m)∑_{i=1}^m x_i. Let x ∈ X^m. Then ∆^⊥ = {(u₁, …, u_m) ∈ X^m | ∑_{i=1}^m u_i = 0},

Z = Z(A, B) = j(zer(∑_{i=1}^m B_i)) ⊆ ∆   and   K = K(A, B) = (−B(Z)) ∩ ∆^⊥ ⊆ ∆^⊥. (42)

Moreover, the following hold:

(i) T^n x → P_{Fix T} x.
(ii) Suppose that X is finite-dimensional. Then T^n x → P_{Fix T} x linearly and J_A T^n x = P_∆ T^n x → P_∆ P_{Fix T} x linearly.
(iii) Suppose that B_i : X ⇒ X, i ∈ {1, 2, …, m}, are paramonotone. Then B is paramonotone and J_A T^n x = P_∆ T^n x → P_Z x. Consequently, e(J_A T^n x) = e(P_∆ T^n x) → e(P_Z x) ∈ zer(∑_{i=1}^m B_i).

Proof. The claim about ∆^⊥ and the first identity in (42) follow from [7, Proposition 25.5(i)&(vi)], whereas the second identity in (42) follows from Corollary 4.6(ii) applied to (A, B). (i): Apply Theorem 5.1(i) to (A, B). (ii): Apply Theorem 5.1(iii) to (A, B). (iii): Let (x, u) and (y, v) be in gra B, say (x_i, u_i) and (y_i, v_i) are in gra B_i, i ∈ {1, …, m}. On the one hand, ⟨x − y, u − v⟩ = ∑_{i=1}^m ⟨x_i − y_i, u_i − v_i⟩. On the other hand, since each B_i is monotone, we have (∀i ∈ {1, …, m}) ⟨x_i − y_i, u_i − v_i⟩ ≥ 0; hence ⟨x − y, u − v⟩ = 0 forces (∀i ∈ {1, …, m}) ⟨x_i − y_i, u_i − v_i⟩ = 0. Now use the paramonotonicity of each B_i to deduce that (x_i, v_i) and (y_i, u_i) are in gra B_i, i ∈ {1, …, m}; equivalently, (x, v) and (y, u) are in gra B. Finally, apply Corollary 4.6(iv). ∎

6 Examples

In this section, we present examples of monotone operators that are partly motivated by applications in partial differential equations; see, e.g., [25] and [37]. Let M ∈ R^{n×n}. Then we have the following equivalences:

M is monotone ⇔ M + M^T is positive semidefinite (43a)
⇔ the eigenvalues of M + M^T lie in R₊. (43b)

Lemma 6.1.
Let M = (cid:18) α βγ δ (cid:19) ∈ R × . (44) Then M is monotone if and only if α ≥ , δ ≥ and αδ ≥ ( β + γ ) .Proof . Indeed, the principal minors of M + M T are 2 α , 2 δ and 4 αδ − ( β + γ ) ; by, e.g., [30,(7.6.12) on page 566]. (cid:4) Note that if M = M T , then M is monotone if and only if the eigenvalues of M lie in R + . If M = M T , then some information about the location of the (possibly complex)eigenvalues of M is available: Lemma 6.2.
Let $M \in \mathbb{R}^{n \times n}$ be monotone, and let $\{\lambda_k\}_{k=1}^{n}$ denote the eigenvalues of $M$. Then $\operatorname{Re}(\lambda_k) \ge 0$ for every $k \in \{1, \dots, n\}$.

Proof. Write $\lambda = \alpha + i\beta$, where $\alpha$ and $\beta$ belong to $\mathbb{R}$ and $i = \sqrt{-1}$, and suppose that $\lambda$ is an eigenvalue of $M$ with (nonzero) eigenvector $w = u + iv$, where $u$ and $v$ are in $\mathbb{R}^{n}$. Then $(M - \lambda \operatorname{Id})w = 0 \Leftrightarrow ((M - \alpha \operatorname{Id}) - i\beta \operatorname{Id})(u + iv) = 0 \Leftrightarrow (M - \alpha \operatorname{Id})u + \beta v = 0$ and $(M - \alpha \operatorname{Id})v - \beta u = 0$. Hence
$$\langle u, (M - \alpha \operatorname{Id})u\rangle + \beta \langle u, v\rangle = 0, \tag{45a}$$
$$\langle v, (M - \alpha \operatorname{Id})v\rangle - \beta \langle v, u\rangle = 0. \tag{45b}$$
Adding (45) yields $\langle u, (M - \alpha \operatorname{Id})u\rangle + \langle v, (M - \alpha \operatorname{Id})v\rangle = 0$; equivalently, $\langle u, Mu\rangle + \langle v, Mv\rangle - \alpha\|u\|^2 - \alpha\|v\|^2 = 0$. Solving for $\alpha$ yields
$$\operatorname{Re}(\lambda) = \alpha = \frac{\langle u, Mu\rangle + \langle v, Mv\rangle}{\|u\|^2 + \|v\|^2} \ge 0, \tag{46}$$
as claimed. ∎
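Lemmas 6.1 and 6.2 are easy to test numerically in the $2 \times 2$ case, where the eigenvalues follow from the quadratic formula. The following Python sketch is an added illustration (it is not part of the original paper, and the function names are ours):

```python
import cmath

# Minimal check of Lemmas 6.1 and 6.2 for M = [[a, b], [c, d]].
def is_monotone_2x2(a, b, c, d):
    # Lemma 6.1: monotone iff a >= 0, d >= 0 and 4ad >= (b + c)^2.
    return a >= 0 and d >= 0 and 4 * a * d >= (b + c) ** 2

def eigenvalues_2x2(a, b, c, d):
    # Roots of the characteristic polynomial t^2 - (a + d) t + (ad - bc).
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

M = (1.0, -3.0, 1.0, 1.0)         # M + M^T = [[2, -2], [-2, 2]] is PSD
assert is_monotone_2x2(*M)
lam1, lam2 = eigenvalues_2x2(*M)  # complex pair 1 +/- i*sqrt(3)
assert lam1.real >= 0 and lam2.real >= 0       # Lemma 6.2

# The converse fails: [[1, 3], [0, 1]] has eigenvalue 1 (so Re >= 0),
# yet 4*1*1 < (3 + 0)^2, so it is not monotone (cf. Example 6.3 below).
assert not is_monotone_2x2(1.0, 3.0, 0.0, 1.0)
```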
The converse of Lemma 6.2 is not true in general, as we demonstrate in the following example.
Example 6.3.
Let $\xi \in \mathbb{R} \smallsetminus [-2, 2]$, and set
$$M = \begin{pmatrix} 1 & \xi \\ 0 & 1 \end{pmatrix}. \tag{47}$$
Then $M$ has $1$ as its only eigenvalue (with multiplicity $2$), $M$ is not monotone by Lemma 6.1, and $M$ is not symmetric.

(Here and elsewhere, $\mathbb{C}$ is the set of complex numbers and, given $z \in \mathbb{C}$, we use $\operatorname{Re}(z)$ to refer to the real part of $z$.)

Proposition 6.4. Consider the tridiagonal Toeplitz matrix
$$M = \begin{pmatrix} \beta & \gamma & & \\ \alpha & \beta & \ddots & \\ & \ddots & \ddots & \gamma \\ & & \alpha & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{48}$$
Then $M$ is monotone if and only if $\beta \ge |\alpha + \gamma| \cos(\pi/(n+1))$.

Proof. Note that
$$\tfrac{1}{2}\big(M + M^{\mathsf{T}}\big) = \begin{pmatrix} \beta & \tfrac{1}{2}(\alpha + \gamma) & & \\ \tfrac{1}{2}(\alpha + \gamma) & \beta & \ddots & \\ & \ddots & \ddots & \tfrac{1}{2}(\alpha + \gamma) \\ & & \tfrac{1}{2}(\alpha + \gamma) & \beta \end{pmatrix}. \tag{49}$$
By (43a), $M$ is monotone $\Leftrightarrow$ $\tfrac{1}{2}(M + M^{\mathsf{T}})$ is positive semidefinite. If $\alpha + \gamma = 0$, then $\tfrac{1}{2}(M + M^{\mathsf{T}}) = \beta \operatorname{Id}$ and therefore $\tfrac{1}{2}(M + M^{\mathsf{T}})$ is positive semidefinite $\Leftrightarrow$ $\beta \ge 0 = |\alpha + \gamma| \cos(\pi/(n+1))$. Now suppose that $\alpha + \gamma \neq 0$. It follows from [30, Example 7.2.5] that the eigenvalues of $\tfrac{1}{2}(M + M^{\mathsf{T}})$ are
$$\lambda_k = \beta + (\alpha + \gamma) \cos\Big(\frac{k\pi}{n+1}\Big), \tag{50}$$
where $k \in \{1, \dots, n\}$. Consequently, $\{\lambda_k\}_{k=1}^{n} \subseteq \mathbb{R}_+ \Leftrightarrow \beta \ge |(\alpha + \gamma) \cos(\pi/(n+1))|$. Therefore, the characterization of monotonicity of $M$ follows from (43b). ∎

Proposition 6.5.
Let
$$M = \begin{pmatrix} \beta & \gamma & & \\ \alpha & \beta & \ddots & \\ & \ddots & \ddots & \gamma \\ & & \alpha & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{51}$$
Then exactly one of the following holds:

(i) $\alpha\gamma = 0$ and $\det(M) = \beta^n$. Consequently, $M$ is invertible $\Leftrightarrow \beta \neq 0$, in which case
$$[M^{-1}]_{i,j} = (-\alpha)^{\max\{i-j,\,0\}}\, (-\gamma)^{\max\{j-i,\,0\}}\, \beta^{\min\{j-i,\,i-j\}-1}. \tag{52}$$

(ii) $\alpha\gamma \neq 0$. Set $r = \frac{1}{2\alpha}\big({-\beta} + \sqrt{\beta^2 - 4\alpha\gamma}\big)$, set $s = \frac{1}{2\alpha}\big({-\beta} - \sqrt{\beta^2 - 4\alpha\gamma}\big)$, and set $\Lambda = \big\{\beta + 2\gamma\sqrt{\alpha/\gamma}\,\cos(k\pi/(n+1)) \;\big|\; k \in \{1, 2, \dots, n\}\big\}$. Then $rs = \gamma/\alpha \neq 0$. Moreover, $M$ is invertible $\Leftrightarrow 0 \notin \Lambda$ (in the special case when $\beta = 0$, this is equivalent to saying that $M$ is invertible $\Leftrightarrow$ $n$ is even), in which case
$$r \neq s \;\Rightarrow\; [M^{-1}]_{i,j} = -\frac{\alpha^{i-1}\big(r^{\min\{i,j\}} - s^{\min\{i,j\}}\big)\big(r^{n+1}s^{\max\{i,j\}} - r^{\max\{i,j\}}s^{n+1}\big)}{\gamma^{i}\,(r-s)\big(r^{n+1} - s^{n+1}\big)}, \tag{53a}$$
$$r = s \;\Rightarrow\; [M^{-1}]_{i,j} = -\frac{\min\{i,j\}\,\big(n+1-\max\{i,j\}\big)\, r^{\,j-i-1}}{\alpha\,(n+1)}. \tag{53b}$$
Alternatively, define the recurrence relations
$$u_0 = 0, \quad u_1 = 1, \quad u_k = -\tfrac{1}{\gamma}\big(\alpha u_{k-2} + \beta u_{k-1}\big), \quad k \ge 2; \tag{54a}$$
$$v_{n+1} = 0, \quad v_n = 1, \quad v_k = -\tfrac{1}{\alpha}\big(\beta v_{k+1} + \gamma v_{k+2}\big), \quad k \le n-1. \tag{54b}$$
Then
$$[M^{-1}]_{i,j} = -\frac{u_{\min\{i,j\}}\, v_{\max\{i,j\}}}{\alpha\, v_0}\Big(\frac{\gamma}{\alpha}\Big)^{j-1}. \tag{55}$$

Proof. (i): $\alpha\gamma = 0 \Leftrightarrow \alpha = 0$ or $\gamma = 0$, in which case $M$ is a (lower or upper) triangular matrix. Hence $\det(M) = \beta^n$, and the characterization follows. The formula in (52) is easily verified. (ii): Note that $0 \in \{r, s\} \Leftrightarrow \beta \in \{\pm\sqrt{\beta^2 - 4\alpha\gamma}\} \Leftrightarrow \beta^2 = \beta^2 - 4\alpha\gamma \Leftrightarrow \alpha\gamma = 0$. Hence $rs \neq 0$. Moreover, it follows from [30, Example 7.2.5] that $\Lambda$ is the set of eigenvalues of $M$; therefore, $M$ is invertible $\Leftrightarrow 0 \notin \Lambda$. The formulae (53) follow from [38, Remark 2 on page 110]. The recurrence formulae defined in (54) and [38, Theorem 2] yield (55). ∎

Remark 6.6. Concerning Proposition 6.5, it follows from [36, Section 2 on page 44] that we also have the alternative formulae
$$r \neq s \;\Rightarrow\; [M^{-1}]_{i,j} = \begin{cases} -\dfrac{1}{\gamma}\cdot\dfrac{s^{-i} - r^{-i}}{s^{-1} - r^{-1}}\cdot\dfrac{s^{-(n+1-j)} - r^{-(n+1-j)}}{s^{-(n+1)} - r^{-(n+1)}}, & j \ge i; \\[2mm] -\dfrac{1}{\alpha}\cdot\dfrac{s^{j} - r^{j}}{s - r}\cdot\dfrac{s^{\,n-i+1} - r^{\,n-i+1}}{s^{\,n+1} - r^{\,n+1}}, & j \le i, \end{cases} \tag{56a}$$
$$r = s \;\Rightarrow\; [M^{-1}]_{i,j} = \begin{cases} -\dfrac{i}{\gamma}\Big(1 - \dfrac{j}{n+1}\Big)\, r^{\,j-i+1}, & j \ge i; \\[2mm] -\dfrac{j}{\alpha}\Big(1 - \dfrac{i}{n+1}\Big)\, r^{\,j-i-1}, & j \le i. \end{cases} \tag{56b}$$
Using the binomial expansion, (56a), and a somewhat tedious calculation which we omit here, one can show that $[M^{-1}]_{i,j}$ is equal to
$$-\,2\,(2\alpha)^{\max\{i-j,0\}}(2\gamma)^{\max\{j-i,0\}}\, \frac{\Bigg(\displaystyle\sum_{m=0}^{\lceil \min\{i,j\}/2\rceil - 1}\binom{\min\{i,j\}}{2m+1}(-\beta)^{\min\{i,j\}-(2m+1)}\big(\beta^2 - 4\alpha\gamma\big)^{m}\Bigg)\Bigg(\displaystyle\sum_{m=0}^{\lceil (n+1-\max\{i,j\})/2\rceil - 1}\binom{n-\max\{i,j\}+1}{2m+1}(-\beta)^{n-\max\{i,j\}-2m}\big(\beta^2 - 4\alpha\gamma\big)^{m}\Bigg)}{\displaystyle\sum_{m=0}^{\lceil (n+1)/2\rceil - 1}\binom{n+1}{2m+1}(-\beta)^{n-2m}\big(\beta^2 - 4\alpha\gamma\big)^{m}} \tag{57}$$
provided that $r \neq s$.

Example 6.7. Let $\beta \ge 2$, set
$$M = \begin{pmatrix} \beta & -1 & & \\ -1 & \beta & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}, \tag{58}$$
set $r = \tfrac{1}{2}\big(\beta + \sqrt{\beta^2 - 4}\big)$ and set $s = \tfrac{1}{2}\big(\beta - \sqrt{\beta^2 - 4}\big)$. Then $M$ is monotone and invertible. Moreover,
$$r \neq s \;\Rightarrow\; [M^{-1}]_{i,j} = \frac{\big(r^{\min\{i,j\}} - s^{\min\{i,j\}}\big)\big(r^{n+1}s^{\max\{i,j\}} - r^{\max\{i,j\}}s^{n+1}\big)}{(r-s)\big(r^{n+1} - s^{n+1}\big)}, \tag{59a}$$
$$r = s \;\Rightarrow\; [M^{-1}]_{i,j} = \frac{\min\{i,j\}\,\big(n+1-\max\{i,j\}\big)}{n+1}. \tag{59b}$$
Alternatively, define the recurrence relations
$$u_0 = 0, \quad u_1 = 1, \quad u_k = \beta u_{k-1} - u_{k-2}, \quad k \ge 2, \tag{60a}$$
$$v_{n+1} = 0, \quad v_n = 1, \quad v_k = \beta v_{k+1} - v_{k+2}, \quad k \le n-1. \tag{60b}$$
Then
$$[M^{-1}]_{i,j} = \frac{u_{\min\{i,j\}}\, v_{\max\{i,j\}}}{v_0}. \tag{61}$$

Proof. The monotonicity of $M$ follows from Proposition 6.4 by noting that $\beta \ge 2 > 2\cos(\pi/(n+1))$. The same argument implies that
$$0 \notin \Lambda = \Big\{\beta - 2\cos\Big(\frac{k\pi}{n+1}\Big) \;\Big|\; k \in \{1, 2, \dots, n\}\Big\}. \tag{62}$$
Hence $M$ is invertible by Proposition 6.5(ii). Note that $\beta = 2 \Leftrightarrow \beta^2 - 4 = 0 \Leftrightarrow r = s = 1$. ∎

Let $M_1 = [\alpha_{i,j}]_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$ and $M_2 = [\beta_{i,j}]_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$. Recall that the Kronecker product of $M_1$ and $M_2$ (see, e.g., [28, page 407] or [30, Exercise 7.6.10]) is defined by the block matrix
$$M_1 \otimes M_2 = [\alpha_{i,j} M_2] \in \mathbb{R}^{n^2 \times n^2}. \tag{63}$$

Lemma 6.8.
Let $M_1$ and $M_2$ be symmetric matrices in $\mathbb{R}^{n \times n}$. Then $M_1 \otimes M_2 \in \mathbb{R}^{n^2 \times n^2}$ is symmetric.

Proof. Using [30, Exercise 7.8.11(a)] or [28, Proposition 1(e) on page 408] we have $(M_1 \otimes M_2)^{\mathsf{T}} = M_1^{\mathsf{T}} \otimes M_2^{\mathsf{T}} = M_1 \otimes M_2$. ∎

The following fact is very useful in the proofs of the upcoming results.

Fact 6.9.
Let $M_1$ and $M_2$ be in $\mathbb{R}^{n \times n}$, with eigenvalues $\{\lambda_k \mid k \in \{1, \dots, n\}\}$ and $\{\mu_k \mid k \in \{1, \dots, n\}\}$, respectively. Then the eigenvalues of $M_1 \otimes M_2$ are $\{\lambda_j \mu_k \mid j, k \in \{1, \dots, n\}\}$.

Proof. See [28, Corollary 1 on page 412] or [30, Exercise 7.8.11(b)]. ∎

Corollary 6.10.
Let $M_1$ and $M_2$ in $\mathbb{R}^{n \times n}$ be monotone and such that $M_1$ or $M_2$ is symmetric. Then $M_1 \otimes M_2$ is monotone.

Proof. According to (43), it suffices to show that all the eigenvalues of $M_1 \otimes M_2 + (M_1 \otimes M_2)^{\mathsf{T}}$ are nonnegative. Suppose first that $M_1$ is symmetric. Then using [28, Proposition 1(e)&(c)] we have $M_1 \otimes M_2 + (M_1 \otimes M_2)^{\mathsf{T}} = M_1 \otimes M_2 + M_1 \otimes M_2^{\mathsf{T}} = M_1 \otimes (M_2 + M_2^{\mathsf{T}})$. Since $M_2$ is monotone, it follows from (43) that all the eigenvalues of $M_2 + M_2^{\mathsf{T}}$ are nonnegative. Now apply Fact 6.9 to $M_1$ and $M_2 + M_2^{\mathsf{T}}$ to learn that all the eigenvalues of $M_1 \otimes M_2 + (M_1 \otimes M_2)^{\mathsf{T}}$ are nonnegative; hence $M_1 \otimes M_2$ is monotone by (43b). A similar argument applies if $M_2$ is symmetric. ∎

Note that the assumption that at least one matrix is symmetric is critical in Corollary 6.10, as we show in the next example.
Example 6.11.
Suppose that
$$M = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{64}$$
Then $M$ is monotone, with eigenvalues $\{\pm i\}$, but not symmetric. However,
$$M \otimes M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{65}$$
is a symmetric matrix with eigenvalues $\{\pm 1\}$ by Fact 6.9. Therefore, $M \otimes M$ is not monotone by (43).

Proposition 6.12.
Let $M \in \mathbb{R}^{n \times n}$ be symmetric. Then $\operatorname{Id} \otimes M$ is monotone $\Leftrightarrow$ $M \otimes \operatorname{Id}$ is monotone $\Leftrightarrow$ $M$ is monotone, in which case we have
$$J_{\operatorname{Id}_n \otimes M} = \operatorname{Id}_n \otimes J_M \quad\text{and}\quad J_{M \otimes \operatorname{Id}_n} = J_M \otimes \operatorname{Id}_n. \tag{66}$$

Proof. In view of Fact 6.9, the sets of eigenvalues of $\operatorname{Id} \otimes M$, $M \otimes \operatorname{Id}$, and $M$ coincide. It follows from Lemma 6.8 that $\operatorname{Id} \otimes M$ and $M \otimes \operatorname{Id}$ are symmetric. Now apply (43b) and use the monotonicity of $M$. To prove (66), we use [28, Proposition 1(c) on page 408] to learn that $\operatorname{Id}_{n^2} + \operatorname{Id}_n \otimes M = \operatorname{Id}_n \otimes \operatorname{Id}_n + \operatorname{Id}_n \otimes M = \operatorname{Id}_n \otimes (\operatorname{Id}_n + M)$. Therefore, by [28, Corollary 1(b) on page 408] we have
$$J_{\operatorname{Id}_n \otimes M} = (\operatorname{Id}_{n^2} + \operatorname{Id}_n \otimes M)^{-1} = \big(\operatorname{Id}_n \otimes (\operatorname{Id}_n + M)\big)^{-1} = \operatorname{Id}_n \otimes (\operatorname{Id}_n + M)^{-1} = \operatorname{Id}_n \otimes J_M. \tag{67}$$
The other identity in (66) is proved similarly. ∎

Corollary 6.13.
Let $\beta \in \mathbb{R}$. Set
$$M[\beta] = \begin{pmatrix} \beta & -1 & & \\ -1 & \beta & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}, \tag{68}$$
and let $M_\rightarrow$ and $M_\uparrow$ be block matrices in $\mathbb{R}^{n^2 \times n^2}$ defined by
$$M_\rightarrow = \begin{pmatrix} M[\beta] & 0_n & \cdots & 0_n \\ 0_n & M[\beta] & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0_n \\ 0_n & \cdots & 0_n & M[\beta] \end{pmatrix} \quad\text{and}\quad M_\uparrow = \begin{pmatrix} \beta \operatorname{Id}_n & -\operatorname{Id}_n & & \\ -\operatorname{Id}_n & \beta \operatorname{Id}_n & \ddots & \\ & \ddots & \ddots & -\operatorname{Id}_n \\ & & -\operatorname{Id}_n & \beta \operatorname{Id}_n \end{pmatrix}. \tag{69}$$
Then $M_\rightarrow = \operatorname{Id}_n \otimes M[\beta]$ and $M_\uparrow = M[\beta] \otimes \operatorname{Id}_n$. Moreover, $M_\rightarrow$ is monotone $\Leftrightarrow$ $M_\uparrow$ is monotone $\Leftrightarrow$ $M[\beta]$ is monotone $\Leftrightarrow$ $\beta \ge 2\cos(\pi/(n+1))$, in which case
$$J_{M_\rightarrow} = \operatorname{Id}_n \otimes M[\beta+1]^{-1} \quad\text{and}\quad J_{M_\uparrow} = M[\beta+1]^{-1} \otimes \operatorname{Id}_n. \tag{70}$$

Proof. It is straightforward to verify that $M_\rightarrow = \operatorname{Id}_n \otimes M[\beta]$ and $M_\uparrow = M[\beta] \otimes \operatorname{Id}_n$. It follows from Proposition 6.4 that $M[\beta]$ is monotone $\Leftrightarrow \beta \ge 2\cos\big(\frac{\pi}{n+1}\big)$. Now combine with Proposition 6.12. To prove (70), note that $\operatorname{Id}_n + M[\beta] = M[\beta+1]$, and therefore $J_{M[\beta]} = M[\beta+1]^{-1}$. The conclusion follows by applying (66). ∎

The above matrices play a key role in the original design of the Douglas-Rachford algorithm; see the Appendix for details.
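The resolvent identity (70) can be checked numerically by combining it with the inverse recurrences (60)-(61) of Example 6.7. The following pure-Python sketch is an added illustration (not from the original text; the helper names and the choices $n = 3$, $\beta = 2$ are ours):

```python
# Verify J_{M->} = Id (x) M[beta+1]^{-1} for n = 3, beta = 2, where
# M[beta+1] = tridiag(-1, beta+1, -1) is inverted via (60)-(61).
def inv_tridiag(beta, n):
    u = [0.0, 1.0]                        # u_0 = 0, u_1 = 1       (60a)
    for _ in range(2, n + 1):
        u.append(beta * u[-1] - u[-2])    # u_k = beta*u_{k-1} - u_{k-2}
    v = [0.0] * (n + 2)                   # v_{n+1} = 0, v_n = 1   (60b)
    v[n] = 1.0
    for k in range(n - 1, -1, -1):
        v[k] = beta * v[k + 1] - v[k + 2]
    return [[u[min(i, j)] * v[max(i, j)] / v[0]                  # (61)
             for j in range(1, n + 1)] for i in range(1, n + 1)]

def kron(A, B):
    # Kronecker product (63) of square matrices given as nested lists.
    m = len(B)
    return [[A[i // m][j // m] * B[i % m][j % m]
             for j in range(len(A) * m)] for i in range(len(A) * m)]

n, beta = 3, 2.0
Mb  = [[beta if i == j else -1.0 if abs(i - j) == 1 else 0.0
        for j in range(n)] for i in range(n)]                    # M[beta]
I_n = [[float(i == j) for j in range(n)] for i in range(n)]
M_right = kron(I_n, Mb)                                          # M-> = Id (x) M[beta]
J = kron(I_n, inv_tridiag(beta + 1.0, n))                        # claimed J_{M->}
# (Id + M->) @ J should be the identity of order n^2.
N = n * n
for i in range(N):
    for j in range(N):
        entry = sum((float(i == k) + M_right[i][k]) * J[k][j] for k in range(N))
        assert abs(entry - float(i == j)) < 1e-9
```

Note that `inv_tridiag(3.0, 3)` reproduces the matrix $J_M$ displayed in (87) of the Appendix.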
Proposition 6.14.
Let $n \in \{2, 3, \dots\}$, let $M \in \mathbb{R}^{n \times n}$ and consider the block matrix
$$\mathbf{M} = \begin{pmatrix} M & -\operatorname{Id}_n & & \\ -\operatorname{Id}_n & M & \ddots & \\ & \ddots & \ddots & -\operatorname{Id}_n \\ & & -\operatorname{Id}_n & M \end{pmatrix} \in \mathbb{R}^{n^2 \times n^2}. \tag{71}$$
Let $\mathbf{x} = (x_1, x_2, \dots, x_n) \in \mathbb{R}^{n^2}$, where $x_i \in \mathbb{R}^{n}$, $i \in \{1, 2, \dots, n\}$. Then
$$\langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle = \langle x_1, (M - \operatorname{Id})x_1\rangle + \sum_{k=2}^{n-1}\langle x_k, (M - 2\operatorname{Id})x_k\rangle + \langle x_n, (M - \operatorname{Id})x_n\rangle + \sum_{i=1}^{n-1}\|x_i - x_{i+1}\|^2. \tag{72}$$
Moreover, the following hold:

(i) Suppose that $n = 2$. Then $M - \operatorname{Id}$ is monotone $\Leftrightarrow$ $\mathbf{M}$ is monotone.

(ii) $M - 2\operatorname{Id}$ is monotone $\Rightarrow$ $\mathbf{M}$ is monotone.

(iii) $\mathbf{M}$ is monotone $\Rightarrow$ $M - (2 - \tfrac{2}{n})\operatorname{Id}$ is monotone $\Rightarrow$ $M$ is monotone.

Proof. We have
$$\begin{aligned} \langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle &= \big\langle (x_1, x_2, \dots, x_n), (Mx_1 - x_2,\; {-x_1} + Mx_2 - x_3,\; \dots,\; {-x_{n-1}} + Mx_n)\big\rangle \\ &= \sum_{k=1}^{n}\langle x_k, Mx_k\rangle - 2\sum_{i=1}^{n-1}\langle x_i, x_{i+1}\rangle \\ &= \sum_{k=1}^{n}\langle x_k, Mx_k\rangle - \|x_1\|^2 - 2\sum_{k=2}^{n-1}\|x_k\|^2 - \|x_n\|^2 + \sum_{i=1}^{n-1}\|x_i - x_{i+1}\|^2, \end{aligned}$$
where we used the identity $2\langle x_i, x_{i+1}\rangle = \|x_i\|^2 + \|x_{i+1}\|^2 - \|x_i - x_{i+1}\|^2$; rearranging yields (72). (i): "$\Rightarrow$": Apply (72) with $n = 2$ and note that the middle sum is empty. "$\Leftarrow$": Let $y \in \mathbb{R}^2$. Applying (72) to the point $\mathbf{x} = (y, y) \in \mathbb{R}^4$, we get $0 \le \langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle = 2\langle y, (M - \operatorname{Id})y\rangle$. (ii): This is clear from (72), since then $M - \operatorname{Id} = (M - 2\operatorname{Id}) + \operatorname{Id}$ is monotone as well. (iii): Let $y \in \mathbb{R}^{n}$. Applying (72) to the point $\mathbf{x} = (y, y, \dots, y) \in \mathbb{R}^{n^2}$ yields $0 \le \langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle = 2\langle y, (M - \operatorname{Id})y\rangle + (n-2)\langle y, (M - 2\operatorname{Id})y\rangle = \langle y, (nM - (2n-2)\operatorname{Id})y\rangle$. Therefore, $M - (2 - \tfrac{2}{n})\operatorname{Id}$ is monotone. In turn, $M = \big(M - (2 - \tfrac{2}{n})\operatorname{Id}\big) + (2 - \tfrac{2}{n})\operatorname{Id}$ is the sum of two monotone matrices and hence monotone. ∎
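Identity (72) is also easy to confirm numerically. The following self-contained Python sketch is an added illustration (not from the original text; the helper names and the random test data are ours) and evaluates both sides of (72) for $n = 3$:

```python
import random

# Check identity (72) for a random M and x = (x_1, ..., x_n), n = 3.
random.seed(0)
n = 3
M = [[random.uniform(-2, 2) for _ in range(n)] for _ in range(n)]
x = [[random.uniform(-2, 2) for _ in range(n)] for _ in range(n)]

dot = lambda a, b: sum(p * q for p, q in zip(a, b))
matvec = lambda A, v: [dot(row, v) for row in A]

# Left-hand side: <x, Mx> with (Mx)_k = -x_{k-1} + M x_k - x_{k+1}.
lhs = 0.0
for k in range(n):
    Mx_k = matvec(M, x[k])
    if k > 0:
        Mx_k = [a - b for a, b in zip(Mx_k, x[k - 1])]
    if k < n - 1:
        Mx_k = [a - b for a, b in zip(Mx_k, x[k + 1])]
    lhs += dot(x[k], Mx_k)

# Right-hand side of (72); shift(v, t) computes (M - t*Id)v.
shift = lambda v, t: [a - t * b for a, b in zip(matvec(M, v), v)]
rhs = dot(x[0], shift(x[0], 1)) + dot(x[n - 1], shift(x[n - 1], 1))
rhs += sum(dot(x[k], shift(x[k], 2)) for k in range(1, n - 1))
rhs += sum(dot([a - b for a, b in zip(x[i], x[i + 1])],
               [a - b for a, b in zip(x[i], x[i + 1])]) for i in range(n - 1))
assert abs(lhs - rhs) < 1e-9
```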
The converse of Proposition 6.14(ii) is not true in general, as we illustrate now.
Example 6.15.
Set
$$M = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \tag{73}$$
and let $\mathbf{M}$ be as defined in Proposition 6.14. Then one verifies easily that $\mathbf{M}$ is monotone while $M - 2\operatorname{Id}$ is not.

We now show that the converses of the implications in Proposition 6.14(iii) are not true in general.

Example 6.16. Set $n = 2$, set $M = \tfrac{1}{2}\operatorname{Id} \in \mathbb{R}^{2 \times 2}$, and let $\mathbf{M}$ be as defined in Proposition 6.14. Then $M$ is monotone but $M - (2 - \tfrac{2}{2})\operatorname{Id} = -\tfrac{1}{2}\operatorname{Id}$ is not monotone, and
$$\mathbf{M} = \begin{pmatrix} \tfrac{1}{2} & 0 & -1 & 0 \\ 0 & \tfrac{1}{2} & 0 & -1 \\ -1 & 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 0 & \tfrac{1}{2} \end{pmatrix}. \tag{74}$$
Note that $\mathbf{M}$ is symmetric and has eigenvalues $\{-\tfrac{1}{2}, \tfrac{3}{2}\}$; hence $\mathbf{M}$ is not monotone by (43).

Acknowledgments
HHB was partially supported by the Natural Sciences and Engineering Research Council of Canada and by the Canada Research Chair Program.
References

[1] H. Attouch and M. Théra, A general duality principle for the sum of two operators, Journal of Convex Analysis 3 (1996), 1–24.
[2] Comptes rendus de l'Académie des Sciences.
[3] Houston Journal of Mathematics.
[4] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan and X. Wang, The rate of linear convergence of the Douglas-Rachford algorithm for subspaces is the cosine of the Friedrichs angle, Journal of Approximation Theory 185 (2014), 63–79.
[5] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan and X. Wang, Optimal rates of convergence of matrices with applications, to appear in Numerical Algorithms. DOI 10.1007/s11075-015-0085-4.
[6] H.H. Bauschke, R.I. Boţ, W.L. Hare and W.M. Moursi, Attouch-Théra duality revisited: paramonotonicity and operator splitting, Journal of Approximation Theory 164 (2012), 1065–1084.
[7] H.H. Bauschke and P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2011.
[8] H.H. Bauschke, F. Deutsch, H. Hundal and S.-H. Park, Accelerating the convergence of the method of alternating projections, Transactions of the AMS 355 (2003), 3433–3461.
[9] H.H. Bauschke, S.M. Moffat and X. Wang, Firmly nonexpansive mappings and maximally monotone operators: correspondence and duality, Set-Valued and Variational Analysis 20 (2012), 131–153.
[10] H.H. Bauschke, D.R. Luke, H.M. Phan and X. Wang, Restricted normal cones and the method of alternating projections: applications, Set-Valued and Variational Analysis 21 (2013), 475–501.
[11] H.H. Bauschke and W.M. Moursi, On the Douglas-Rachford algorithm for two (not necessarily intersecting) affine subspaces, to appear in SIAM Journal on Optimization.
[12] H.H. Bauschke, X. Wang and L. Yao, On Borwein-Wiersma decompositions of monotone linear relations, SIAM Journal on Optimization 20 (2010), 2636–2652.
[13] H.H. Bauschke, X. Wang and L. Yao, Rectangularity and paramonotonicity of maximally monotone operators, Optimization 63 (2014), 487–504.
[14] J.M. Borwein, Fifty years of maximal monotonicity, Optimization Letters 4 (2010), 473–490.
[15] H. Brézis, Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert, North-Holland/Elsevier, 1973.
[16] R.E. Bruck and S. Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces, Houston Journal of Mathematics 3 (1977), 459–470.
[17] R.S. Burachik and A.N. Iusem, Set-Valued Mappings and Enlargements of Monotone Operators, Springer-Verlag, 2008.
[18] P.L. Combettes, The convex feasibility problem in image recovery, Advances in Imaging and Electron Physics 95 (1996), 155–270.
[19] P.L. Combettes, Solving monotone inclusions via compositions of nonexpansive averaged operators, Optimization 53 (2004), 475–504.
[20] D.W. Peaceman and H.H. Rachford, The numerical solution of parabolic and elliptic differential equations, Journal of SIAM 3 (1955), 28–41.
[21] J. Douglas and H.H. Rachford, On the numerical solution of heat conduction problems in two and three space variables, Transactions of the AMS 82 (1956), 421–439.
[22] J. Douglas, On the numerical integration of $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial u}{\partial t}$ by implicit methods, Journal of SIAM 3 (1955), 42–65.
[23] J. Eckstein, Splitting Methods for Monotone Operators with Applications to Parallel Optimization, Ph.D. thesis, MIT, 1989.
[24] J. Eckstein and D.P. Bertsekas, On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming 55 (1992), 293–318.
[25] R. Glowinski, Variational Methods for the Numerical Solution of Nonlinear Elliptic Problems, SIAM, 2015.
[26] A.N. Iusem, On some properties of paramonotone operators, Journal of Convex Analysis 5 (1998), 269–278.
[27] E. Kreyszig, Introductory Functional Analysis with Applications, Wiley, 1989.
[28] P. Lancaster and M. Tismenetsky, The Theory of Matrices with Applications, Academic Press, 1985.
[29] P.L. Lions and B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM Journal on Numerical Analysis 16 (1979), 964–979.
[30] C.D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, 2000.
[31] W.E. Milne, Numerical Solutions of Differential Equations, New York, 1953.
[32] G.B. Passty, The parallel sum of nonlinear monotone operators, Nonlinear Analysis 10 (1986), 215–227.
[33] R.T. Rockafellar and R.J-B. Wets, Variational Analysis, corrected 3rd printing, Springer-Verlag, 2009.
[34] S. Simons, Minimax and Monotonicity, Springer-Verlag, 1998.
[35] S. Simons, From Hahn-Banach to Monotonicity, Springer-Verlag, 2008.
[36] T. Torii, Inversion of tridiagonal matrices and the stability of tridiagonal systems of linear equations, Information Processing in Japan 6 (1966), 41–46.
[37] Handbook of Numerical Analysis.
[38] Linear Algebra and its Applications.
[39] E. Zeidler, Nonlinear Functional Analysis and Its Applications II/A: Linear Monotone Operators, Springer-Verlag, 1990.
[40] E. Zeidler, Nonlinear Functional Analysis and Its Applications II/B: Nonlinear Monotone Operators, Springer-Verlag, 1990.
Appendix
In this section we briefly show the connection between the original Douglas-Rachford algorithm introduced in [21] (see also [22], [31] and [20] for variations of this method) to solve certain types of heat equations and the general algorithm introduced by Lions and Mercier in [29] (see also [19]).

Suppose that $\Omega$ is a bounded square region in $\mathbb{R}^2$. Consider the Dirichlet problem for the Poisson equation: Given $f$ and $g$, find $u\colon \Omega \to \mathbb{R}$ such that
$$\Delta u = f \text{ on } \Omega \quad\text{and}\quad u = g \text{ on } \operatorname{bdry} \Omega, \tag{75}$$
where $\Delta = \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ is the Laplace operator and $\operatorname{bdry} \Omega$ denotes the boundary of $\Omega$. Discretizing $u$ followed by converting it into a "long vector" $y$ (see [30, Example 7.6.2 & Problem 7.6.9]), we obtain the system of linear equations
$$L_\rightarrow y + L_\uparrow y = -b. \tag{76}$$
Here $L_\rightarrow$ and $L_\uparrow$ denote the horizontal (respectively vertical) positive definite discretization of the negative Laplacian over a square mesh with $n^2$ points at equally spaced intervals (see [30, Problem 7.6.10]). We have
$$L_\rightarrow = \operatorname{Id} \otimes M \quad\text{and}\quad L_\uparrow = M \otimes \operatorname{Id}, \tag{77}$$
where
$$M = \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{78}$$
To see the connection to monotone operators, set $A = L_\rightarrow$ and $B\colon y \mapsto L_\uparrow y + b$. Then $A$ and $B$ are affine and strictly monotone. The problem then reduces to
$$\text{find } y \in \mathbb{R}^{n^2} \text{ such that } Ay + By = 0, \tag{79}$$
and the algorithm proposed by Douglas and Rachford in [21] becomes
$$y_{n+1/2} + Ay_n + By_{n+1/2} - y_n = 0, \tag{80a}$$
$$y_{n+1} - y_{n+1/2} - Ay_n + Ay_{n+1} = 0. \tag{80b}$$
Consequently,
$$\text{(80a)} \Leftrightarrow (\operatorname{Id} + B)(y_{n+1/2}) = (\operatorname{Id} - A)y_n \Leftrightarrow y_{n+1/2} = J_B(\operatorname{Id} - A)y_n, \tag{81a}$$
$$\text{(80b)} \Leftrightarrow (\operatorname{Id} + A)y_{n+1} = Ay_n + y_{n+1/2} \Leftrightarrow y_{n+1} = J_A(Ay_n + y_{n+1/2}). \tag{81b}$$
Substituting (81a) into (81b) to eliminate $y_{n+1/2}$ yields
$$y_{n+1} = J_A\big(Ay_n + J_B(\operatorname{Id} - A)y_n\big). \tag{82}$$
To proceed further, we must show that
$$(\operatorname{Id} - A)J_A = R_A \tag{83a}$$
and
$$AJ_A = \operatorname{Id} - J_A. \tag{83b}$$
Indeed, note that $\operatorname{Id} - A = 2\operatorname{Id} - (\operatorname{Id} + A)$; therefore, multiplying by $J_A = (\operatorname{Id} + A)^{-1}$ from the right yields $(\operatorname{Id} - A)J_A = \big(2\operatorname{Id} - (\operatorname{Id} + A)\big)J_A = 2J_A - \operatorname{Id} = R_A$. Hence $J_A - AJ_A = 2J_A - \operatorname{Id}$; equivalently, $AJ_A = \operatorname{Id} - J_A$. Now consider the change of variable
$$(\forall n \in \mathbb{N})\quad x_n = (\operatorname{Id} + A)y_n, \tag{84}$$
which is equivalent to $y_n = J_A x_n$. Substituting (82) into (84), and using (83), yields
$$\begin{aligned} x_{n+1} &= (\operatorname{Id} + A)y_{n+1} = (\operatorname{Id} + A)J_A\big(Ay_n + J_B(\operatorname{Id} - A)y_n\big) = Ay_n + J_B(\operatorname{Id} - A)y_n \\ &= AJ_A x_n + J_B(\operatorname{Id} - A)J_A x_n = x_n - J_A x_n + J_B R_A x_n = (\operatorname{Id} - J_A + J_B R_A)x_n, \end{aligned} \tag{85}$$
which is the Douglas-Rachford update formula (18).

We point out that $J_A = J_{L_\rightarrow}$ and, using [7, Proposition 23.15(ii)], we have $J_B = J_{L_\uparrow + b} = J_{L_\uparrow} - J_{L_\uparrow} b$. To calculate $J_A$ and $J_B$, apply Corollary 6.13 to get
$$J_A = \operatorname{Id}_n \otimes J_M \quad\text{and}\quad J_B = J_M \otimes \operatorname{Id}_n - (J_M \otimes \operatorname{Id}_n)(b). \tag{86}$$
For instance, when $n = 3$, the above calculations yield
$$J_M = \frac{1}{21}\begin{pmatrix} 8 & 3 & 1 \\ 3 & 9 & 3 \\ 1 & 3 & 8 \end{pmatrix}, \tag{87}$$
$$\operatorname{Id} \otimes J_M = \begin{pmatrix} J_M & 0_3 & 0_3 \\ 0_3 & J_M & 0_3 \\ 0_3 & 0_3 & J_M \end{pmatrix}, \tag{88}$$
and
$$J_M \otimes \operatorname{Id} = \frac{1}{21}\begin{pmatrix} 8\operatorname{Id}_3 & 3\operatorname{Id}_3 & \operatorname{Id}_3 \\ 3\operatorname{Id}_3 & 9\operatorname{Id}_3 & 3\operatorname{Id}_3 \\ \operatorname{Id}_3 & 3\operatorname{Id}_3 & 8\operatorname{Id}_3 \end{pmatrix}. \tag{89}$$
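To close, here is a small self-contained Python sketch, added as an illustration (the mesh size $n = 3$, the right-hand side $b$, and all helper names are our own choices, not from the paper). It assembles the discretization (77)-(78), reproduces $J_M$ from (87), runs the Douglas-Rachford update (85), and checks that $J_A x_k$ approaches the solution of (76):

```python
# Douglas-Rachford iteration (85) on the discretized Poisson system (76),
# with n = 3, using only pure-Python dense linear algebra.
def kron(A, B):
    m = len(B)
    return [[A[i // m][j // m] * B[i % m][j % m]
             for j in range(len(A) * m)] for i in range(len(A) * m)]

def inverse(A):
    # Gauss-Jordan inversion (adequate for these small, well-conditioned matrices).
    N = len(A)
    aug = [row[:] + [float(i == j) for j in range(N)] for i, row in enumerate(A)]
    for col in range(N):
        piv = max(range(col, N), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(N):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[N:] for row in aug]

matvec = lambda A, v: [sum(a * x for a, x in zip(row, v)) for row in A]

n = 3
M = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]                       # (78)
I = [[float(i == j) for j in range(n)] for i in range(n)]
L_right, L_up = kron(I, M), kron(M, I)                            # (77)
b = [float(k % 4 - 1) for k in range(n * n)]                      # sample data

JM = inverse([[I[i][j] + M[i][j] for j in range(n)] for i in range(n)])  # (87)
J_A = kron(I, JM)                                                 # (86)
J_up = kron(JM, I)                                                # J_{L^}
J_B = lambda x: [p - q for p, q in zip(matvec(J_up, x), matvec(J_up, b))]

x = [0.0] * (n * n)
for _ in range(500):                  # x_{k+1} = x_k - J_A x_k + J_B R_A x_k
    Jx = matvec(J_A, x)
    RAx = [2 * p - q for p, q in zip(Jx, x)]
    x = [xi - ji + bi for xi, ji, bi in zip(x, Jx, J_B(RAx))]

y = matvec(J_A, x)                    # y_k = J_A x_k converges to the solution
residual = [p + q + r for p, q, r in zip(matvec(L_right, y), matvec(L_up, y), b)]
assert max(abs(t) for t in residual) < 1e-8
```

Since $L_\rightarrow$ and $L_\uparrow$ are positive definite, the affine map in (85) is a contraction here, so a few hundred iterations drive the residual of (76) to numerical zero.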