Affine nonexpansive operators, Attouch-Théra duality and the Douglas-Rachford algorithm
Heinz H. Bauschke∗, Brett Lukens† and Walaa M. Moursi‡

March 30, 2016
In tribute to Michel Théra on his 70th birthday
Abstract
The Douglas-Rachford splitting algorithm was originally proposed in 1956 to solve a system of linear equations arising from the discretization of a partial differential equation. In 1979, Lions and Mercier brought forward a very powerful extension of this method suitable to solve optimization problems. In this paper, we revisit the original affine setting. We provide a powerful convergence result for finding a zero of the sum of two maximally monotone affine relations. As a by-product of our analysis, we obtain results concerning the convergence of iterates of affine nonexpansive mappings as well as Attouch-Théra duality. Numerous examples are presented.
Primary 47H05, 47H09, 49M27; Secondary 49M29, 49N15, 90C25.
Keywords: affine mapping, Attouch-Théra duality, Douglas-Rachford algorithm, linear convergence, maximally monotone operator, nonexpansive mapping, paramonotone operator, strong convergence, Toeplitz matrix, tridiagonal matrix.

∗ Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail: [email protected].
† [email protected].
‡ Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada, and Mansoura University, Faculty of Science, Mathematics Department, Mansoura 35516, Egypt. E-mail: [email protected].

1 Introduction
Throughout this paper, X is a real Hilbert space with inner product ⟨·, ·⟩ and induced norm ‖·‖. A central problem in optimization is to

find x ∈ X such that 0 ∈ (A + B)x, (1)

where A and B are maximally monotone operators on X; see, e.g., [7], [14], [15], [17], [18], [34], [35], [33], [39], [40], and the references therein. As Lions and Mercier observed in their landmark paper [29], one may iteratively solve the sum problem (1) by the celebrated Douglas-Rachford splitting algorithm (see also [24]). This algorithm proceeds by iterating the operator T = Id − J_A + J_B R_A; the sequence (J_A T^n x)_{n∈N} converges to a solution of (1) (see Section 5 for details). The Douglas-Rachford algorithm was originally proposed in 1956 by Douglas and Rachford [21]. It can be viewed as a method for solving a system of linear equations where the underlying coefficient matrix is positive definite. The far-reaching extension to optimization provided by Lions and Mercier [29] is not at all obvious (for the sake of completeness, we sketch this connection in the Appendix).

In this paper, we concentrate on the affine setting. In the original setting considered by Douglas and Rachford, the operators A and B correspond to positive definite matrices. We extend this result in various directions. Indeed, we obtain strong convergence in possibly infinite-dimensional Hilbert space; the operators A and B may be affine maximally monotone relations; and we also identify the limit. The remainder of this paper is organized as follows. In Section 2, we provide several results which will be useful in the derivation of the main results. A new characterization of strongly convergent iterations of affine nonexpansive operators (Theorem 3.3) is presented in Section 3. We also discuss when the convergence is linear. In Section 4, we obtain new results, formulated using the Douglas-Rachford operator, on the relative geometry of the primal and dual (in the sense of Attouch-Théra duality) solutions to (1). The main algorithmic result (Theorem 5.1) is derived in Section 5. It provides precise information on the behaviour of the Douglas-Rachford algorithm in the affine case. Numerous examples are presented in Section 6, where we also pay attention to tridiagonal Toeplitz matrices and Kronecker products. In the Appendix, we sketch the connection between the historical Douglas-Rachford algorithm and the powerful extension provided by Lions and Mercier.

Finally, the notation we employ is quite standard and follows largely [7]. Let C be a nonempty closed convex subset of X. We use N_C and P_C to denote the normal cone operator and the projector associated with C, respectively. Let Y be a Banach space. We shall use B(Y) to denote the set of bounded linear operators on Y. Let L ∈ B(Y). The operator norm of L is ‖L‖ = sup_{‖y‖≤1} ‖Ly‖. Further notation is developed as necessary during the course of this paper.

2 Auxiliary results
In this section, we collect various results that will be useful in the sequel. Suppose that T : X → X. Then T is nonexpansive if

(∀x ∈ X)(∀y ∈ X) ‖Tx − Ty‖ ≤ ‖x − y‖; (2)

T is firmly nonexpansive if

(∀x ∈ X)(∀y ∈ X) ‖Tx − Ty‖² + ‖(Id − T)x − (Id − T)y‖² ≤ ‖x − y‖²; (3)

and T is asymptotically regular if

(∀x ∈ X) T^n x − T^{n+1} x → 0. (4)
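Resolvents supply a rich family of firmly nonexpansive maps: for a maximally monotone operator A, the resolvent J_A = (Id + A)^{−1} satisfies (3) (this standard fact is recalled in Section 4). The following sketch checks inequality (3) numerically for the resolvent of a monotone 2×2 matrix; the matrix A and the sample points are arbitrary illustrative choices.

```python
# Numerically check the firm nonexpansiveness inequality (3) for T = J_A,
# where J_A = (Id + A)^(-1) and A is a monotone 2x2 matrix
# (A + A^T = diag(4, 2) is positive semidefinite).
# The matrix A and the sample points are arbitrary illustrative choices.

def matvec(M, v):
    return (M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1])

def inv2(M):
    a, b, c, d = M[0][0], M[0][1], M[1][0], M[1][1]
    det = a*d - b*c
    return ((d/det, -b/det), (-c/det, a/det))

def sqnorm(v):
    return v[0]*v[0] + v[1]*v[1]

A  = ((2.0, 1.0), (-1.0, 1.0))              # monotone matrix
JA = inv2(((3.0, 1.0), (-1.0, 2.0)))        # (Id + A)^(-1)

pts = [(1.0, -2.0), (0.5, 3.0), (-4.0, 1.5), (2.0, 2.0)]
violations = 0
for x in pts:
    for y in pts:
        Tx, Ty = matvec(JA, x), matvec(JA, y)
        lhs = sqnorm((Tx[0]-Ty[0], Tx[1]-Ty[1])) + sqnorm(
            ((x[0]-Tx[0]) - (y[0]-Ty[0]), (x[1]-Tx[1]) - (y[1]-Ty[1])))
        rhs = sqnorm((x[0]-y[0], x[1]-y[1]))
        if lhs > rhs + 1e-12:                # inequality (3) violated?
            violations += 1
assert violations == 0
```

Of course, such spot checks prove nothing; they merely illustrate the inequality on a handful of pairs.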
Fact 2.1.
Let T : X → X. Then

T firmly nonexpansive and Fix T ≠ ∅ ⇒ T asymptotically regular. (5)

Proof. See [16, Corollary 1.1] or [7, Corollary 5.16(ii)]. ∎
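Fact 2.1 is the engine behind the convergence theory of Section 5: the Douglas-Rachford operator T = Id − J_A + J_B R_A from the introduction is firmly nonexpansive (Fact 4.3 below) and has fixed points whenever zer(A + B) ≠ ∅, hence it is asymptotically regular. A numerical sketch on R² with the affine monotone operators Ax = M₁x − q and Bx = M₂x, whose sum vanishes at x* = (M₁ + M₂)^{−1}q; all concrete matrices and the starting point are illustrative choices.

```python
# Douglas-Rachford iteration T = Id - J_A + J_B R_A for the affine monotone
# operators A(x) = M1 x - q and B(x) = M2 x on R^2.  The zero of A + B is
# x* = (M1 + M2)^(-1) q.  All matrices and the starting point are
# illustrative choices.

def matvec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def inv2(M):
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [[M[1][1]/det, -M[0][1]/det], [-M[1][0]/det, M[0][0]/det]]

M1, M2, q = [[2.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, 2.0]], [1.0, 1.0]
JA_lin = inv2([[3.0, 0.0], [0.0, 2.0]])     # (Id + M1)^(-1)
JB_lin = inv2([[2.0, 1.0], [1.0, 3.0]])     # (Id + M2)^(-1)

def JA(x):  # J_A x = (Id + M1)^(-1) (x + q)
    return matvec(JA_lin, [x[0] + q[0], x[1] + q[1]])

def T(x):   # T x = x - J_A x + J_B (2 J_A x - x)
    ja = JA(x)
    jb = matvec(JB_lin, [2*ja[0] - x[0], 2*ja[1] - x[1]])
    return [x[0] - ja[0] + jb[0], x[1] - ja[1] + jb[1]]

x = [5.0, -3.0]
for _ in range(200):
    x = T(x)

Tx = T(x)
diff = [Tx[0] - x[0], Tx[1] - x[1]]         # asymptotic regularity: T^n x - T^(n+1) x -> 0
shadow = JA(x)                              # shadow sequence J_A T^n x
xstar = matvec(inv2([[3.0, 1.0], [1.0, 3.0]]), q)
assert abs(diff[0]) < 1e-10 and abs(diff[1]) < 1e-10
assert abs(shadow[0] - xstar[0]) < 1e-10 and abs(shadow[1] - xstar[1]) < 1e-10
```

Here the shadow sequence converges to the unique zero of A + B, in line with the main result (Theorem 5.1) below.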
Fact 2.2.
Let L : X → X be linear and nonexpansive, and let x ∈ X. Then

L^n x → P_{Fix L} x ⇔ L^n x − L^{n+1} x → 0. (6)

Proof. See [2, Proposition 4], [3, Theorem 1.1], [8, Theorem 2.2] or [7, Proposition 5.27]. (We mention in passing that in [2, Proposition 4] the author proved the result for general odd nonexpansive mappings in Hilbert spaces, and in [3, Theorem 1.1] the authors generalized the result to Banach spaces.) ∎
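To illustrate Fact 2.2, consider L = (1/2)(Id + S) on R³, where S is the cyclic coordinate shift: L is linear and nonexpansive, Fix L is the line of constant vectors, and P_{Fix L}x replaces each coordinate by the average. The operator and the starting vector are arbitrary illustrative choices.

```python
# Fact 2.2 numerically: for the nonexpansive linear map L = (Id + S)/2 on R^3,
# with S the cyclic shift, the powers L^n x converge to P_{Fix L} x
# (coordinatewise average) and the differences L^n x - L^(n+1) x vanish.
# The starting vector is an arbitrary choice.

def L(v):
    s = (v[2], v[0], v[1])                   # cyclic shift S
    return tuple((v[i] + s[i]) / 2 for i in range(3))

x = (9.0, -1.0, 4.0)
mean = sum(x) / 3                            # P_{Fix L} x = (mean, mean, mean)

v = x
for _ in range(200):
    v = L(v)
w = L(v)

assert all(abs(v[i] - mean) < 1e-10 for i in range(3))   # L^n x -> P_{Fix L} x
assert all(abs(v[i] - w[i]) < 1e-10 for i in range(3))   # asymptotic regularity
```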
Definition 2.3.
Let Y be a real Banach space, let (y_n)_{n∈N} be a sequence in Y, and let y_∞ ∈ Y. Then (y_n)_{n∈N} converges to y_∞, denoted y_n → y_∞, if ‖y_n − y_∞‖ → 0; (y_n)_{n∈N} converges µ-linearly to y_∞ if µ ∈ [0, 1[ and there exists M ≥ 0 such that

(∀n ∈ N) ‖y_n − y_∞‖ ≤ M µ^n; (7)

and (y_n)_{n∈N} converges linearly to y_∞ if there exist µ ∈ [0, 1[ and M ≥ 0 such that (7) holds. (By [10, Remark 3.7], this is equivalent to (∃M > 0)(∃N ∈ N)(∀n ≥ N) ‖y_n − y_∞‖ ≤ M µ^n.)

Example 2.4 (convergence vs. pointwise convergence of bounded linear operators). Let Y be a real Banach space, let (L_n)_{n∈N} be a sequence in B(Y), and let L_∞ ∈ B(Y). Then one says: (i) (L_n)_{n∈N} converges, or converges uniformly, to L_∞ if L_n → L_∞ (in B(Y)); (ii) (L_n)_{n∈N} converges pointwise to L_∞ if (∀y ∈ Y) L_n y → L_∞ y (in Y).

Remark 2.5. It is easy to see that convergence of a sequence of bounded linear operators implies pointwise convergence; however, the converse is not true (see, e.g., [27, Example 4.9-2]).
Lemma 2.6.
Let Y be a real Banach space, let (L_n)_{n∈N} be a sequence in B(Y), let L_∞ ∈ B(Y), and let µ ∈ ]0, 1[. Then

(∀y ∈ Y) L_n y → L_∞ y µ-linearly (in Y) ⇔ L_n → L_∞ µ-linearly (in B(Y)). (8)

Proof. Let y ∈ Y. "⇒": Because L_n y → L_∞ y µ-linearly, there exists M_y ≥ 0 such that (∀n ∈ N) ‖(L_n − L_∞)y‖ ≤ µ^n M_y; equivalently,

‖((L_n − L_∞)/µ^n)y‖ = ‖(L_n − L_∞)y‖/µ^n ≤ M_y. (9)

It follows from the Uniform Boundedness Principle (see, e.g., [27, 4.7-3]) applied to the sequence ((L_n − L_∞)/µ^n)_{n∈N} that (∃M ≥ 0)(∀n ∈ N) ‖(L_n − L_∞)/µ^n‖ ≤ M; equivalently, ‖L_n − L_∞‖ ≤ M µ^n, as required. "⇐": Since L_n → L_∞ µ-linearly, we have (∃M ≥ 0)(∀n ∈ N) ‖L_n − L_∞‖ ≤ M µ^n. Therefore, (∀n ∈ N) ‖L_n y − L_∞ y‖ ≤ ‖L_n − L_∞‖ ‖y‖ ≤ M ‖y‖ µ^n. ∎

Lemma 2.7. Suppose that X is finite-dimensional, let (L_n)_{n∈N} be a sequence of linear nonexpansive operators on X, and let L_∞ : X → X. Then the following are equivalent:

(i) (∀x ∈ X) L_n x → L_∞ x.
(ii) L_n → L_∞ pointwise (in X), and L_∞ is linear and nonexpansive.
(iii) L_n → L_∞ (in B(X)).

Proof. The implications "(i) ⇒ (ii)" and "(iii) ⇒ (i)" are easy to verify. "(ii) ⇒ (iii)": Suppose that (x_n)_{n∈N} is a sequence in X such that (∀n ∈ N) ‖x_n‖ = 1 and

‖L_n − L_∞‖ − ‖L_n x_n − L_∞ x_n‖ → 0. (10)

After passing to a subsequence if necessary, we can and do assume that x_n → x_∞. Since L_∞ and (L_n)_{n∈N} are linear and nonexpansive, we have ‖L_∞‖ ≤ 1 and (∀n ∈ N) ‖L_n‖ ≤ 1. Using the triangle inequality, we have

‖L_n x_n − L_∞ x_n‖ = ‖(L_n − L_∞)(x_n − x_∞) + (L_n − L_∞)x_∞‖ ≤ ‖L_n − L_∞‖ ‖x_n − x_∞‖ + ‖(L_n − L_∞)x_∞‖ ≤ 2‖x_n − x_∞‖ + ‖(L_n − L_∞)x_∞‖ → 0 + 0 = 0.

Now combine with (10). ∎
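In the setting of Lemma 2.7, one can watch the operator norms directly. The sketch below uses L = block-diag(1, 0.9·R(θ)) on R³, with R(θ) a planar rotation: Fix L = R × {0} × {0}, P_{Fix L} = diag(1, 0, 0), and ‖L^n − P_{Fix L}‖ = 0.9^n, so the powers converge (0.9)-linearly in B(X). The damping factor 0.9 and the angle θ are arbitrary choices.

```python
# Operator-norm linear convergence of powers: L = block-diag(1, 0.9 R(theta))
# on R^3.  On the plane {0} x R^2 the map L^n - P_{Fix L} acts as 0.9^n times
# a rotation (an isometry), so ||L^n - P_{Fix L}|| = 0.9^n.
# The angle theta is an arbitrary choice.
import math

theta = 0.7
c, s = math.cos(theta), math.sin(theta)

def L(v):
    return (v[0], 0.9*(c*v[1] - s*v[2]), 0.9*(s*v[1] + c*v[2]))

def power(v, n):
    for _ in range(n):
        v = L(v)
    return v

for n in (1, 5, 20):
    v = power((0.0, 1.0, 0.0), n)            # unit vector in the rotated plane
    err = math.hypot(v[1], v[2])             # = ||(L^n - P_{Fix L}) e2||
    assert abs(err - 0.9**n) < 1e-12
```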
Corollary 2.8.
Suppose that X is finite-dimensional, let L : X → X be linear, and let L_∞ : X → X be such that L^n → L_∞ pointwise. Then L^n → L_∞ linearly.

Proof. Combine Lemma 2.7 and [5, Theorem 2.12(i)]. ∎

3 Iterating an affine nonexpansive operator
We begin with a simple yet useful result.
Theorem 3.1.
Let L : X → X be linear, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Let x ∈ X. Then the following hold:

(i) b ∈ ran(Id − L).
(ii) (∀n ∈ N) T^n x = L^n x + ∑_{k=0}^{n−1} L^k b.

Proof. (i): Fix T ≠ ∅ ⇔ (∃y ∈ X) y = Ly + b ⇔ b ∈ ran(Id − L). (ii): We prove this by induction (see also [11, Theorem 3.2(ii)]). The case n = 0 is trivial. Now suppose that for some n ∈ N,

T^n x = L^n x + ∑_{k=0}^{n−1} L^k b. (11)

Then T^{n+1} x = T(T^n x) = T(L^n x + ∑_{k=0}^{n−1} L^k b) = L(L^n x + ∑_{k=0}^{n−1} L^k b) + b = L^{n+1} x + ∑_{k=0}^{n} L^k b. ∎

Let S be a nonempty closed convex subset of X and let w ∈ X. We recall the following useful translation formula (see, e.g., [7, Proposition 3.17]):

(∀x ∈ X) P_{w+S} x = w + P_S(x − w). (12)

Lemma 3.2.
Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then there exists a point a ∈ X such that b = a − La and

(∀x ∈ X) Tx = L(x − a) + a. (13)

Moreover, the following hold:

(i) Fix T = a + Fix L.
(ii) (∀x ∈ X) P_{Fix T} x = a + P_{Fix L}(x − a) = P_{(Fix L)^⊥} a + P_{Fix L} x.
(iii) (∀n ∈ N)(∀x ∈ X) T^n x = a + L^n(x − a).

Proof. The existence of a and (13) follow from Theorem 3.1 and the linearity of L. (i): Let y ∈ X. Then y ∈ Fix T ⇔ y − a ∈ Fix L ⇔ y ∈ a + Fix L. (ii): The first identity follows from combining (i) and (12). It follows from, e.g., [7, Corollary 3.22(ii)] that a + P_{Fix L}(x − a) = a + P_{Fix L} x − P_{Fix L} a = P_{(Fix L)^⊥} a + P_{Fix L} x. (iii): By telescoping, we have

∑_{k=0}^{n−1} L^k b = ∑_{k=0}^{n−1} L^k(a − La) = a − L^n a. (14)

Consequently, Theorem 3.1(ii) and (14) yield T^n x = L^n x + a − L^n a = a + L^n(x − a). ∎

The following result extends Fact 2.2 from the linear to the affine case.
Theorem 3.3.
Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then the following are equivalent:

(i) L is asymptotically regular.
(ii) L^n → P_{Fix L} pointwise.
(iii) T^n → P_{Fix T} pointwise.
(iv) T is asymptotically regular.

Proof. Let x ∈ X. "(i) ⇔ (ii)": This is Fact 2.2. "(ii) ⇒ (iii)": In view of Lemma 3.2(iii)&(ii), we have T^n x = L^n(x − a) + a → P_{Fix L}(x − a) + a = P_{Fix T} x. "(iii) ⇒ (iv)": T^n x − T^{n+1} x → P_{Fix T} x − P_{Fix T} x = 0. "(iv) ⇒ (i)": Using Lemma 3.2(iii), we have L^n x − L^{n+1} x = T^n(x + a) − T^{n+1}(x + a) → 0. ∎

We now turn to linear convergence.
Lemma 3.4.
Suppose that X is finite-dimensional, and let L : X → X be linear and nonexpansive. Then the following are equivalent:

(i) L is asymptotically regular.
(ii) L^n → P_{Fix L} pointwise (in X).
(iii) L^n → P_{Fix L} (in B(X)).
(iv) L^n → P_{Fix L} linearly pointwise (in X).
(v) L^n → P_{Fix L} linearly (in B(X)).

Proof. "(i) ⇔ (ii)": This follows from Fact 2.2. "(ii) ⇔ (iii)": Combine Lemma 2.7 and Fact 2.2. "(iii) ⇒ (v)": Apply Corollary 2.8 with L_∞ replaced by P_{Fix L}. "(v) ⇒ (iii)": This is obvious. "(iv) ⇔ (v)": Apply Lemma 2.6 to the sequence (L^n)_{n∈N} and use Fact 2.2. ∎

Theorem 3.5.
Let L : X → X be linear and nonexpansive, let b ∈ X, set T : X → X : x ↦ Lx + b, and let µ ∈ ]0, 1[. Then the following are equivalent:

(i) T^n → P_{Fix T} µ-linearly pointwise (in X).
(ii) L^n → P_{Fix L} µ-linearly pointwise (in X).
(iii) L^n → P_{Fix L} µ-linearly (in B(X)).

Proof. "(i) ⇔ (ii)": It follows from Lemma 3.2(iii)&(ii) that T^n x − P_{Fix T} x = a + L^n(x − a) − (a + P_{Fix L}(x − a)) = L^n(x − a) − P_{Fix L}(x − a) → 0, by Fact 2.2. "(ii) ⇔ (iii)": Combine Lemma 2.6 and Fact 2.2. ∎

Corollary 3.6. Suppose that X is finite-dimensional. Let L : X → X be linear, nonexpansive and asymptotically regular, let b ∈ X, set T : X → X : x ↦ Lx + b, and suppose that Fix T ≠ ∅. Then T^n → P_{Fix T} pointwise linearly.

Proof. It follows from Fact 2.2 that L^n → P_{Fix L} pointwise. Consequently, by Corollary 2.8, L^n → P_{Fix L} linearly. Now apply Theorem 3.5. ∎

4 Attouch-Théra duality

Recall that a possibly set-valued operator A : X ⇒ X is monotone if for any two points (x, u) and (y, v) in the graph of A, denoted gra A, we have ⟨x − y, u − v⟩ ≥ 0; A is maximally monotone if there is no proper extension of gra A that preserves the monotonicity of A. The resolvent of A, denoted by J_A, is defined by J_A = (Id + A)^{−1}, while the reflected resolvent of A is R_A = 2J_A − Id. It is well known that for a maximally monotone operator A : X ⇒ X, the resolvent J_A is firmly nonexpansive and R_A is nonexpansive (see, e.g., [7, Corollary 23.10(i)&(ii)]). Given A : X ⇒ X, we also set A^⊖ = (−Id) ◦ A ◦ (−Id) and A^{−⊖} = (A^{−1})^⊖ = (A^⊖)^{−1}.

In the following, we assume that A : X ⇒ X and B : X ⇒ X are maximally monotone. The Attouch-Théra (see [1]) dual pair to the primal pair (A, B) is the pair (A^{−1}, B^{−⊖}). The primal problem associated with (A, B) is to

find x ∈ X such that 0 ∈ Ax + Bx, (15)

and its Attouch-Théra dual problem is to

find x ∈ X such that 0 ∈ A^{−1}x + B^{−⊖}x. (16)

We shall use Z and K to denote the sets of primal and dual solutions of (15) and (16), respectively, i.e.,

Z = Z(A, B) = (A + B)^{−1}(0)   and   K = K(A, B) = (A^{−1} + B^{−⊖})^{−1}(0). (17)

The Douglas-Rachford operator for the ordered pair (A, B) (see [29]) is defined by

T_DR = T_DR(A, B) = Id − J_A + J_B R_A = (1/2)(Id + R_B R_A). (18)

We recall that C : X ⇒ X is paramonotone if it is monotone and for all (x, u) and (y, v) in gra C we have

(x, u) ∈ gra C, (y, v) ∈ gra C, ⟨x − y, u − v⟩ = 0 ⇒ {(x, v), (y, u)} ⊆ gra C. (19)

For a detailed discussion of paramonotone operators, we refer the reader to [26].

Example 4.1. Let f : X → ]−∞, +∞] be proper, convex and lower semicontinuous. Then ∂f is paramonotone by [26, Proposition 2.2] (or by [7, Example 22.3(i)]).

Example 4.2. Suppose that X = R² and that A : R² → R² : (x, y) ↦ (y, −x). Then one can easily verify that A and −A are maximally monotone but not paramonotone by [26, Section 3] (or [13, Theorem 4.9]).
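The failure of paramonotonicity in Example 4.2 can be witnessed numerically: the rotator satisfies ⟨u − v, Au − Av⟩ = 0 for all u and v, yet Au ≠ Av in general; for a single-valued monotone map, paramonotonicity would force Au = Av whenever this inner product vanishes. The sample points below are arbitrary choices.

```python
# Example 4.2 numerically: A(x, y) = (y, -x) is monotone with
# <u - v, Au - Av> = 0 for ALL u, v; were A paramonotone (and single-valued),
# this would force Au = Av.  The sample points are arbitrary choices.

def A(p):
    return (p[1], -p[0])

def inner(p, q):
    return p[0]*q[0] + p[1]*q[1]

pts = [(1.0, 2.0), (-3.0, 0.5), (0.0, 0.0), (4.0, -1.0)]
for u in pts:
    for v in pts:
        d  = (u[0] - v[0], u[1] - v[1])
        Ad = (A(u)[0] - A(v)[0], A(u)[1] - A(v)[1])
        assert abs(inner(d, Ad)) < 1e-12   # monotone, with equality everywhere

u, v = (1.0, 2.0), (4.0, -1.0)
assert A(u) != A(v)                        # ...yet A is not constant: paramonotonicity fails
```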
Fact 4.3.
The following hold:

(i) T_DR is firmly nonexpansive.
(ii) zer(A + B) = J_A(Fix T_DR).

If A and B are paramonotone, then we additionally have:

(iii) Fix T_DR = Z + K.
(iv) (K − K) ⊥ (Z − Z).

Proof. (i): See [29, Lemma 1], [23, Corollary 4.2.1], or [7, Proposition 4.21(ii)]. (ii): See [19, Lemma 2.6(iii)] or [7, Proposition 25.1(ii)]. (iii): See [6, Corollary 5.5(iii)]. (iv): See [6, Corollary 5.5(iv)]. ∎

Lemma 4.4.
Suppose that A and B are paramonotone. Let k ∈ K be such that (∀z ∈ Z) J_A(z + k) = P_Z(z + k). Then k ∈ (Z − Z)^⊥.

Proof. By Fact 4.3(iii), Fix T_DR = Z + K. Let z₁ and z₂ be in Z. It follows from [6, Theorem 4.5] that (∀z ∈ Z) J_A(z + k) = z. Therefore,

(∀i ∈ {1, 2}) z_i + k ∈ Fix T_DR and z_i = J_A(z_i + k) = P_Z(z_i + k). (20)

Furthermore, the Projection Theorem (see, e.g., [7, Theorem 3.14]) yields

⟨k, z₁ − z₂⟩ = ⟨z₁ + k − z₁, z₁ − z₂⟩ = ⟨z₁ + k − P_Z(z₁ + k), P_Z(z₁ + k) − z₂⟩ ≥ 0. (21)

On the other hand, interchanging the roles of z₁ and z₂ yields ⟨k, z₂ − z₁⟩ ≥ 0. Altogether, ⟨k, z₁ − z₂⟩ = 0. ∎

The next result relates the Douglas-Rachford operator to orthogonality properties of primal and dual solutions.
Theorem 4.5.
Suppose that A and B are paramonotone. Then the following are equivalent:

(i) J_A P_{Fix T_DR} = P_Z.
(ii) J_A|_{Fix T_DR} = P_Z|_{Fix T_DR}.
(iii) K ⊥ (Z − Z).

Proof. "(i) ⇒ (ii)": This is obvious. "(ii) ⇒ (iii)": Let k ∈ K and let z ∈ Z. Then Fix T_DR = Z + K by Fact 4.3(iii); hence, z + k ∈ Fix T_DR. Therefore J_A(z + k) = P_Z(z + k). Now apply Lemma 4.4. "(iii) ⇒ (i)": This follows from [6, Theorem 6.7(ii)]. ∎

Corollary 4.6.
Let U be a closed affine subspace of X, suppose that A = N_U and that B is paramonotone, and suppose that Z ≠ ∅. Then the following hold:

(i) Z = U ∩ B^{−1}((par U)^⊥) ⊆ U.
(ii) (∀z ∈ Z) K = (−Bz) ∩ (par U)^⊥ ⊆ (par U)^⊥.
(iii) K ⊥ (Z − Z).
(iv) J_A P_{Fix T_DR} = P_U P_{Fix T_DR} = P_Z.

Proof. Since A = N_U = ∂ι_U, it is paramonotone by Example 4.1. (i): Let x ∈ X. Then x ∈ Z ⇔ 0 ∈ Ax + Bx ⇔ [x ∈ U and 0 ∈ (par U)^⊥ + Bx] ⇔ [x ∈ U and there exists y ∈ X such that y ∈ (par U)^⊥ and y ∈ Bx] ⇔ [x ∈ U and there exists y ∈ X such that x ∈ B^{−1}y and y ∈ (par U)^⊥] ⇔ x ∈ U ∩ B^{−1}((par U)^⊥). (ii): Let z ∈ Z. Applying [6, Remark 5.4] to (A^{−1}, B^{−⊖}) yields K = (−Bz) ∩ (Az) = (−Bz) ∩ (par U)^⊥. (iii): By (i), Z − Z ⊆ U − U = par U. Now use (ii). (iv): Combine (iii) and Theorem 4.5. ∎

Using [6, Proposition 2.10], we have

zer A ∩ zer B ≠ ∅ ⇔ 0 ∈ K. (22)

Theorem 4.7.
Suppose that A and B are paramonotone and that zer A ∩ zer B ≠ ∅. Then the following hold:

(i) Z = (zer A) ∩ (zer B) and 0 ∈ K.
(ii) J_A P_{Fix T_DR} = P_Z.
(iii) K ⊥ (Z − Z).

If, in addition, A or B is single-valued, then we also have:

(iv) K = {0}.
(v) Fix T_DR = (zer A) ∩ (zer B).

Proof. (i): Since zer A ∩ zer B ≠ ∅, it follows from (22) that 0 ∈ K. Now apply [6, Remark 5.4] to get Z = A^{−1}(0) ∩ B^{−1}(0) = (zer A) ∩ (zer B). (ii): This is [6, Corollary 6.8]. (iii): Combine (22) and Fact 4.3(iv). (iv): Let C ∈ {A, B} be single-valued. Using (i), we have Z ⊆ zer C. Suppose that C = A and let z ∈ Z. We use [6, Remark 5.4] applied to (A^{−1}, B^{−⊖}) to learn that K = (Az) ∩ (−Bz). Therefore {0} ⊆ K ⊆ Az ⊆ A(zer A) = {0}. A similar argument applies if C = B. (v): Combine Fact 4.3(iii) with (i)&(iv). ∎

(Recall that J_{N_C} = P_C for a nonempty closed convex subset C of X by, e.g., [7, Example 23.4]; for a closed affine subspace U of X, par U denotes the parallel space of U, defined by par U = U − U.)

Remark 4.8. The conclusion of Theorem 4.7(i) generalizes the setting of convex feasibility problems. Indeed, suppose that A = N_U and B = N_V, where U and V are nonempty closed convex subsets of X such that U ∩ V ≠ ∅. Then Z = U ∩ V = zer A ∩ zer B.

The assumptions that A and B are paramonotone are critical in the conclusion of Theorem 4.7(i), as we illustrate now.

Example 4.9.
Suppose that X = R², that U = R × {0}, that A = N_U, and that B : R² → R² : (x, y) ↦ (−y, x) is the counterclockwise rotator in the plane by π/2. Then one verifies that zer A = U, zer B = {(0, 0)}, and Z = zer(A + B) = U; however,

(zer A) ∩ (zer B) = {(0, 0)} ≠ U = Z.

Note that A is paramonotone by Example 4.1, while B is not paramonotone by Example 4.2.
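Running the Douglas-Rachford iteration of Section 5 on Example 4.9 makes the role of paramonotonicity concrete: the shadow sequence J_A T^n_DR x still reaches a point of Z = U, but not the projection P_Z x. Here J_A = P_U and J_B = (Id + B)^{−1}; the starting point is an arbitrary choice.

```python
# Douglas-Rachford for Example 4.9: A = N_U with U = R x {0}, B the rotator.
# J_A = P_U and J_B = (Id + B)^(-1) = (1/2) [[1, 1], [-1, 1]].  The shadow
# J_A T^n x converges to a point of Z = U, but NOT to P_Z x, since B is not
# paramonotone.  The starting point is an arbitrary choice.

def JA(p):                                  # projection onto U = R x {0}
    return (p[0], 0.0)

def JB(p):                                  # (Id + B)^(-1)
    return (0.5*(p[0] + p[1]), 0.5*(-p[0] + p[1]))

def T(p):                                   # T = Id - J_A + J_B R_A
    ja = JA(p)
    r = (2*ja[0] - p[0], 2*ja[1] - p[1])    # R_A p
    jb = JB(r)
    return (p[0] - ja[0] + jb[0], p[1] - ja[1] + jb[1])

x0 = (3.0, 1.0)
p = x0
for _ in range(100):
    p = T(p)

shadow = JA(p)
assert abs(shadow[1]) < 1e-12                         # the limit lies in Z = U
assert abs(shadow[0] - (x0[0] - x0[1])/2) < 1e-12     # limit = ((x - y)/2, 0)
assert abs(shadow[0] - x0[0]) > 1.0                   # ...which is not P_Z x0 = (3, 0)
```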
In view of (22) and Theorem 4.7(ii), when A and B are paramonotone, we have the implication 0 ∈ K ⇒ J_A P_{Fix T_DR} = P_Z. However, the converse implication is not true, as we show in the next example.

Example 4.10. Suppose that a ∈ X ∖ {0}, that A = Id − a, and that B = Id. Then Z = {(1/2)a}, (A^{−1}, B^{−⊖}) = (Id + a, Id), hence K = {−(1/2)a}, Z − Z = {0}, and therefore K ⊥ (Z − Z), which implies that J_A P_{Fix T_DR} = P_Z by Theorem 4.5, but 0 ∉ K.

If neither A nor B is single-valued, then the conclusion of Theorem 4.7(iv)&(v) may fail, as we now illustrate.

Example 4.11.
Suppose that X = R², that U = R × {0}, that V = ball((0, 1); 1), that A = N_U, and that B = N_V. (Here ball(u; r) denotes the closed ball in X centred at u with radius r, and R₊ = [0, +∞[ denotes the set of nonnegative real numbers.) By [6, Example 2.7], Z = U ∩ V = {(0, 0)} and K = N_{U−V}(0) = R₊ · (0, 1) ≠ {(0, 0)}. Therefore

Fix T_DR = R₊ · (0, 1) ≠ {(0, 0)} = U ∩ V = zer A ∩ zer B.

Recall that Passty's parallel sum (see, e.g., [32] or [7, Section 24.4]) is defined by

A □ B = (A^{−1} + B^{−1})^{−1}. (23)

In view of (17) and (23), one readily verifies that

K = (A □ B^⊖)(0). (24)

Lemma 4.12. Suppose that B : X ⇒ X is a linear relation, i.e., gra B is a linear subspace of X × X. Then the following hold:

(i) B^⊖ = B and B^{−⊖} = B^{−1}.
(ii) (A^{−1}, B^{−⊖}) = (A^{−1}, B^{−1}).
(iii) K = (A □ B)(0).

Proof. This is straightforward from the definitions. ∎

Let f : X → ]−∞, +∞] be proper, convex and lower semicontinuous. In the following, we make use of the well-known identity (see, e.g., [7, Corollary 16.24]):

(∂f)^{−1} = ∂f*. (25)

Here f* denotes the convex conjugate (a.k.a. Fenchel conjugate) of f, defined by f* : X → ]−∞, +∞] : u ↦ sup_{x∈X}(⟨x, u⟩ − f(x)). Given f : X → ]−∞, +∞], we set f^∨ : X → ]−∞, +∞] : x ↦ f(−x); when f is proper, Argmin f = {x ∈ X | f(x) = inf f(X)} denotes its set of minimizers. For a nonempty subset S of X, the strong relative interior of S, denoted by sri S, is the interior with respect to the closed affine hull of S. Finally, the infimal convolution of f and g is f □ g : X → [−∞, +∞] : x ↦ inf_{y∈X}(f(y) + g(x − y)).

Corollary 4.13 (subdifferential operators). Let f : X → ]−∞, +∞] and g : X → ]−∞, +∞] be proper, convex and lower semicontinuous. Suppose that A = ∂f and that B = ∂g. Then the following hold:

(i) Z = (∂f* □ ∂g*)(0).
(ii) K = (∂f □ ∂g^∨)(0).
(iii) Suppose that Argmin f ∩ Argmin g ≠ ∅. Then Z = ∂f*(0) ∩ ∂g*(0).
(iv) Suppose that 0 ∈ sri(dom f − dom g). Then Z = ∂(f* □ g*)(0).
(v) Suppose that 0 ∈ sri(dom f* + dom g*). Then K = ∂(f □ g^∨)(0).

Proof. Note that A and B are paramonotone by Example 4.1. (i): Using (25) and (23), we have Z = (A + B)^{−1}(0) = (((∂f)^{−1})^{−1} + ((∂g)^{−1})^{−1})^{−1}(0) = ((∂f*)^{−1} + (∂g*)^{−1})^{−1}(0) = (∂f* □ ∂g*)(0). (ii): Observe that (∂g)^{−⊖} = ((∂g)^⊖)^{−1} = (∂g^∨)^{−1}. Therefore, using (23) and (24), we have K = ((∂f)^{−1} + ((∂g)^⊖)^{−1})^{−1}(0) = ((∂f)^{−1} + (∂g^∨)^{−1})^{−1}(0) = (∂f □ ∂g^∨)(0). (iii): Using Theorem 4.7(i), Fermat's rule (see, e.g., [7, Theorem 16.2]) and (25), we have Z = (zer A) ∩ (zer B) = Argmin f ∩ Argmin g = (∂f)^{−1}(0) ∩ (∂g)^{−1}(0) = ∂f*(0) ∩ ∂g*(0). (iv): Combine (i) and [7, Proposition 24.27] applied to the functions f* and g*. (v): Combine (ii) and [7, Proposition 24.27] applied to the functions f and g^∨. ∎

5 The Douglas-Rachford algorithm in the affine case
In this section, we assume that A : X ⇒ X and B : X ⇒ X are maximally monotone and affine, and that

Z = {x ∈ X | 0 ∈ Ax + Bx} ≠ ∅. (26)

Since the resolvents J_A and J_B are affine (see [9, Theorem 2.1(xix)]), so is T_DR.

Theorem 5.1.
Let x ∈ X. Then the following hold:

(i) T^n_DR x → P_{Fix T_DR} x.
(ii) Suppose that A and B are paramonotone such that K ⊥ (Z − Z) (as is the case when A and B are paramonotone and (zer A) ∩ (zer B) ≠ ∅; see Theorem 4.7(iii)). Then J_A T^n_DR x → P_Z x.
(iii) Suppose that X is finite-dimensional. Then T^n_DR x → P_{Fix T_DR} x linearly and J_A T^n_DR x → J_A P_{Fix T_DR} x linearly.

Proof. (i): In view of Fact 4.3(ii) and (26), we have Fix T_DR ≠ ∅. Moreover, Fact 4.3(i) and Fact 2.1 imply that T_DR is asymptotically regular. It follows from Theorem 3.3 that (i) holds. (ii): Use (i) and Theorem 4.5. (iii): The linear convergence of (T^n_DR x)_{n∈N} follows from Corollary 3.6. The linear convergence of (J_A T^n_DR x)_{n∈N} is a direct consequence of the linear convergence of (T^n_DR x)_{n∈N} and the fact that J_A is (firmly) nonexpansive. ∎

Remark 5.2. Theorem 5.1 generalizes the convergence results for the original Douglas-Rachford algorithm [21] from particular symmetric matrices/affine operators on a finite-dimensional space to general affine relations defined on possibly infinite-dimensional spaces, while keeping strong and linear convergence of the iterates of the governing sequence (T^n_DR x)_{n∈N} and identifying the limit to be P_{Fix T_DR} x. Paramonotonicity coupled with common zeros yields convergence of the shadow sequence (J_A T^n_DR x)_{n∈N} to P_Z x.

Suppose that U and V are nonempty closed convex subsets of X. Then

T_{U,V} = T_DR(N_U, N_V) = Id − P_U + P_V(2P_U − Id). (27)

Proposition 5.3.
Suppose that U and V are closed linear subspaces of X. Let w ∈ X. Then w + U and w + V are closed affine subspaces of X, (w + U) ∩ (w + V) ≠ ∅, and

(∀n ∈ N) T^n_{w+U, w+V} = T^n_{U,V}(· − w) + w. (28)

(Recall that A : X ⇒ X is an affine relation if gra A is an affine subspace of X × X, i.e., a translation of a linear subspace of X × X; for further information on affine relations, we refer the reader to [12].)

Proof. Let x ∈ X. We proceed by induction. The case n = 0 is clear. Consider the case n = 1, i.e.,

T_{w+U, w+V} = T_{U,V}(· − w) + w. (29)

Indeed, using (12),

T_{w+U, w+V} x = (Id − P_{w+U} + P_{w+V}(2P_{w+U} − Id))x
= x − w − P_U(x − w) + w + P_V(2P_{w+U} x − x − w)
= x − w − P_U(x − w) + w + P_V(2w + 2P_U(x − w) − x − w)
= (x − w) − P_U(x − w) + P_V(2P_U(x − w) − (x − w)) + w
= (Id − P_U + P_V R_U)(x − w) + w = T_{U,V}(x − w) + w.

We now assume that (28) holds for some n ∈ N. Applying (29) with x replaced by T^n_{w+U, w+V} x yields

T^{n+1}_{w+U, w+V} x = T_{w+U, w+V}(T^n_{w+U, w+V} x) = T_{U,V}(T^n_{w+U, w+V} x − w) + w = T_{U,V}(T^n_{U,V}(x − w) + w − w) + w = T^{n+1}_{U,V}(x − w) + w; (30)

hence (28) holds for all n ∈ N. ∎

Example 5.4 (Douglas-Rachford in the affine feasibility case; see also [4, Corollary 4.5]). Suppose that U and V are closed linear subspaces of X. Let w ∈ X and let x ∈ X. Suppose that A = N_{w+U} and that B = N_{w+V}. Then T_{w+U, w+V} x = Lx + b, where L = T_{U,V} and b = w − T_{U,V} w. Moreover,

T^n_{w+U, w+V} x → P_{Fix T_{w+U, w+V}} x (31)

and

J_A T^n_{w+U, w+V} x = P_{w+U} T^n_{w+U, w+V} x → P_Z x = P_{(w+U) ∩ (w+V)} x. (32)

Finally, if U + V is closed (as is the case when X is finite-dimensional), then the convergence is linear with rate c_F(U, V) < 1, where c_F(U, V) is the cosine of the Friedrichs angle between U and V.

Proof. Using (28) with n = 1, we have

T_{w+U, w+V} = T_{U,V}(· − w) + w = T_{U,V} + w − T_{U,V} w. (33)

Hence L = T_{U,V} and b = w − T_{U,V} w, as claimed. To obtain (31) and (32), use Theorem 5.1(i) and Theorem 5.1(ii), respectively. The claim about the linear rate follows by combining [4, Corollary 4.4] and Theorem 3.5 with T replaced by T_{w+U, w+V}, L replaced by T_{U,V} and b replaced by w − T_{U,V} w. ∎

Remark 5.5. When X is infinite-dimensional, it is possible to construct an example (see [4, Section 6]) of two closed linear subspaces U and V where c_F(U, V) = 1, and the rate of convergence of T_DR is not linear. (For closed affine subspaces U and V of X, the cosine of the Friedrichs angle is c_F(U, V) = sup { |⟨u, v⟩| : u ∈ par U ∩ W^⊥ ∩ ball(0; 1), v ∈ par V ∩ W^⊥ ∩ ball(0; 1) } < 1, where W = par U ∩ par V.)

Example 5.6.
Suppose that X = R², that

A = (  0  −1
       1   0 )   and that   B = N_{{0}×R}. (34)

Then T_DR : R² → R² : (x, y) ↦ ((x − y)/2)·(1, −1), Fix T_DR = R · (1, −1), Z = {0} × R, K = R × {0}, hence K ⊥ (Z − Z), and

(∀(x, y) ∈ R²)(∀n ≥ 1) T^n_DR(x, y) = T_DR(x, y) = ((x − y)/2, (y − x)/2) ∈ Fix T_DR;

however,

(∀(x, y) ∈ R² with x + y ≠ 0)(∀n ≥ 1) J_A T^n_DR(x, y) = (0, (y − x)/2) ≠ (0, y) = P_Z(x, y). (35)

Note that A is not paramonotone by Example 4.2.

Proof. We have

J_A = (Id + A)^{−1} = ( 1  −1
                        1   1 )^{−1} = (1/2)(  1  1
                                              −1  1 ), (36)

and

R_A = 2J_A − Id = (  0  1
                    −1  0 ). (37)

Moreover, by [7, Example 23.4],

J_B = P_{{0}×R} = ( 0  0
                    0  1 ). (38)

Consequently,

T_DR = Id − J_A + J_B R_A = (1/2)(  1  −1
                                   −1   1 ), (39)

i.e.,

T_DR : R² → R² : (x, y) ↦ ((x − y)/2)(1, −1). (40)

Now let (x, y) ∈ R². Then (x, y) ∈ Fix T_DR ⇔ (x, y) = ((x − y)/2, (y − x)/2) ⇔ x + y = 0; hence Fix T_DR = R · (1, −1), as claimed. It follows from [19, Lemma 2.6(iii)] that Z = J_A(Fix T_DR) = R · J_A(1, −1) = R · (0, −1) = {0} × R, as claimed. Now let (x, y) ∈ R². By (40), we have T_DR(x, y) = ((x − y)/2)(1, −1) ∈ Fix T_DR; hence (∀n ≥ 1) T^n_DR(x, y) = T_DR(x, y) = ((x − y)/2)(1, −1). Therefore, (∀n ≥ 1) J_A T^n_DR(x, y) = J_A T_DR(x, y) = J_A(((x − y)/2)(1, −1)) = (0, (y − x)/2) ≠ (0, y) = P_Z(x, y) whenever x + y ≠ 0. ∎

The next example illustrates that the assumption K ⊥ (Z − Z) is critical for the conclusion in Theorem 5.1(ii).

Example 5.7 (when K is not orthogonal to Z − Z). Let u ∈ X ∖ {0}. Suppose that A : X → X : x ↦ u and B : X → X : x ↦ −u. Then A and B are paramonotone, A + B ≡ 0 and therefore Z = X. Moreover, by [6, Remark 5.4], (∀z ∈ Z = X) K = (Az) ∩ (−Bz) = {u}, which is not orthogonal to Z − Z = X. Note that Fix T_DR = Z + K = X + {u} = X and J_A : X → X : x ↦ x − u. Consequently,

(∀x ∈ X)(∀n ∈ N) J_A T^n_DR x = J_A P_{Fix T_DR} x = J_A x = x − u ≠ x = P_Z x. (41)

Proposition 5.8 (parallel splitting). Let m ∈ {2, 3, …}, and let B_i : X ⇒ X be maximally monotone and affine, i ∈ {1, 2, …, m}, such that zer(∑_{i=1}^m B_i) ≠ ∅. Set ∆ = {(x, …, x) ∈ X^m | x ∈ X}, set A = N_∆, set B = ×_{i=1}^m B_i, set T = T_DR(A, B), let j : X → X^m : x ↦ (x, x, …, x), and let e : X^m → X : (x₁, x₂, …, x_m) ↦ (1/m)∑_{i=1}^m x_i. Let x ∈ X^m. Then ∆^⊥ = {(u₁, …, u_m) ∈ X^m | ∑_{i=1}^m u_i = 0},

Z = Z(A, B) = j(zer(∑_{i=1}^m B_i)) ⊆ ∆   and   K = K(A, B) = (−B(Z)) ∩ ∆^⊥ ⊆ ∆^⊥. (42)

Moreover, the following hold:

(i) T^n x → P_{Fix T} x.
(ii) Suppose that X is finite-dimensional. Then T^n x → P_{Fix T} x linearly and J_A T^n x = P_∆ T^n x → P_∆ P_{Fix T} x linearly.
(iii) Suppose that B_i : X ⇒ X, i ∈ {1, 2, …, m}, are paramonotone. Then B is paramonotone and J_A T^n x = P_∆ T^n x → P_Z x. Consequently, e(J_A T^n x) = e(P_∆ T^n x) → e(P_Z x) ∈ zer(∑_{i=1}^m B_i).

Proof. The claim about ∆^⊥ and the first identity in (42) follow from [7, Proposition 25.5(i)&(vi)], whereas the second identity in (42) follows from Corollary 4.6(ii) applied to (A, B). (i): Apply Theorem 5.1(i) to (A, B). (ii): Apply Theorem 5.1(iii) to (A, B). (iii): Let (x, u) and (y, v) be in gra B, say (x_i, u_i) and (y_i, v_i) are in gra B_i, i ∈ {1, …, m}. On the one hand, ⟨x − y, u − v⟩ = ∑_{i=1}^m ⟨x_i − y_i, u_i − v_i⟩. On the other hand, since each B_i is monotone, we have (∀i ∈ {1, …, m}) ⟨x_i − y_i, u_i − v_i⟩ ≥ 0; hence ⟨x − y, u − v⟩ = 0 forces (∀i ∈ {1, …, m}) ⟨x_i − y_i, u_i − v_i⟩ = 0. Now use the paramonotonicity of each B_i to deduce that (x_i, v_i) and (y_i, u_i) are in gra B_i, i ∈ {1, …, m}; equivalently, (x, v) and (y, u) are in gra B. Finally, apply Corollary 4.6(iv). ∎

6 Examples

In this section, we present examples of monotone operators that are partly motivated by applications in partial differential equations; see, e.g., [25] and [37]. Let M ∈ R^{n×n}. Then we have the following equivalences:

M is monotone ⇔ M + M^T is positive semidefinite (43a)
⇔ the eigenvalues of M + M^T lie in R₊. (43b)

Lemma 6.1.
Let M = (cid:18) α βγ δ (cid:19) ∈ R × . (44) Then M is monotone if and only if α ≥ , δ ≥ and αδ ≥ ( β + γ ) .Proof . Indeed, the principal minors of M + M T are 2 α , 2 δ and 4 αδ − ( β + γ ) ; by, e.g., [30,(7.6.12) on page 566]. (cid:4) Note that if M = M T , then M is monotone if and only if the eigenvalues of M lie in R + . If M = M T , then some information about the location of the (possibly complex)eigenvalues of M is available: Lemma 6.2.
Let $M \in \mathbb{R}^{n \times n}$ be monotone, and let $\{\lambda_k\}_{k=1}^{n}$ denote the eigenvalues of $M$. Then $\operatorname{Re}(\lambda_k) \ge 0$ for every $k \in \{1, \dots, n\}$.

Proof. Write $\lambda = \alpha + i\beta$, where $\alpha$ and $\beta$ belong to $\mathbb{R}$ and $i = \sqrt{-1}$, and suppose that $\lambda$ is an eigenvalue of $M$ with (nonzero) eigenvector $w = u + iv$, where $u$ and $v$ are in $\mathbb{R}^{n}$. Then $(M - \lambda \operatorname{Id})w = 0 \Leftrightarrow ((M - \alpha \operatorname{Id}) - i\beta \operatorname{Id})(u + iv) = 0 \Leftrightarrow (M - \alpha \operatorname{Id})u + \beta v = 0$ and $(M - \alpha \operatorname{Id})v - \beta u = 0$. Hence
$$\langle u, (M - \alpha \operatorname{Id})u\rangle + \beta \langle u, v\rangle = 0, \tag{45a}$$
$$\langle v, (M - \alpha \operatorname{Id})v\rangle - \beta \langle v, u\rangle = 0. \tag{45b}$$
Adding (45) yields $\langle u, (M - \alpha \operatorname{Id})u\rangle + \langle v, (M - \alpha \operatorname{Id})v\rangle = 0$; equivalently, $\langle u, Mu\rangle + \langle v, Mv\rangle - \alpha\|u\|^2 - \alpha\|v\|^2 = 0$. Solving for $\alpha$ yields
$$\operatorname{Re}(\lambda) = \alpha = \frac{\langle u, Mu\rangle + \langle v, Mv\rangle}{\|u\|^2 + \|v\|^2} \ge 0, \tag{46}$$
as claimed. ∎
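Lemmas 6.1 and 6.2 are easy to test numerically in the $2 \times 2$ case, where the eigenvalues follow from the quadratic formula. The following Python sketch is an added illustration (it is not part of the original paper, and the function names are ours):

```python
import cmath

# Minimal check of Lemmas 6.1 and 6.2 for M = [[a, b], [c, d]].
def is_monotone_2x2(a, b, c, d):
    # Lemma 6.1: monotone iff a >= 0, d >= 0 and 4ad >= (b + c)^2.
    return a >= 0 and d >= 0 and 4 * a * d >= (b + c) ** 2

def eigenvalues_2x2(a, b, c, d):
    # Roots of the characteristic polynomial t^2 - (a + d) t + (ad - bc).
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

M = (1.0, -3.0, 1.0, 1.0)         # M + M^T = [[2, -2], [-2, 2]] is PSD
assert is_monotone_2x2(*M)
lam1, lam2 = eigenvalues_2x2(*M)  # complex pair 1 +/- i*sqrt(3)
assert lam1.real >= 0 and lam2.real >= 0       # Lemma 6.2

# The converse fails: [[1, 3], [0, 1]] has eigenvalue 1 (so Re >= 0),
# yet 4*1*1 < (3 + 0)^2, so it is not monotone (cf. Example 6.3 below).
assert not is_monotone_2x2(1.0, 3.0, 0.0, 1.0)
```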
The converse of Lemma 6.2 is not true in general, as we demonstrate in the following example.
Example 6.3.
Let $\xi \in \mathbb{R} \smallsetminus [-2, 2]$, and set
$$M = \begin{pmatrix} 1 & \xi \\ 0 & 1 \end{pmatrix}. \tag{47}$$
Then $M$ has $1$ as its only eigenvalue (with multiplicity $2$), $M$ is not monotone by Lemma 6.1, and $M$ is not symmetric.

(Here and elsewhere, $\mathbb{C}$ is the set of complex numbers and, given $z \in \mathbb{C}$, we use $\operatorname{Re}(z)$ to refer to the real part of $z$.)

Proposition 6.4. Consider the tridiagonal Toeplitz matrix
$$M = \begin{pmatrix} \beta & \gamma & & \\ \alpha & \beta & \ddots & \\ & \ddots & \ddots & \gamma \\ & & \alpha & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{48}$$
Then $M$ is monotone if and only if $\beta \ge |\alpha + \gamma| \cos(\pi/(n+1))$.

Proof. Note that
$$\tfrac{1}{2}\big(M + M^{\mathsf{T}}\big) = \begin{pmatrix} \beta & \tfrac{1}{2}(\alpha + \gamma) & & \\ \tfrac{1}{2}(\alpha + \gamma) & \beta & \ddots & \\ & \ddots & \ddots & \tfrac{1}{2}(\alpha + \gamma) \\ & & \tfrac{1}{2}(\alpha + \gamma) & \beta \end{pmatrix}. \tag{49}$$
By (43a), $M$ is monotone $\Leftrightarrow$ $\tfrac{1}{2}(M + M^{\mathsf{T}})$ is positive semidefinite. If $\alpha + \gamma = 0$, then $\tfrac{1}{2}(M + M^{\mathsf{T}}) = \beta \operatorname{Id}$ and therefore $\tfrac{1}{2}(M + M^{\mathsf{T}})$ is positive semidefinite $\Leftrightarrow$ $\beta \ge 0 = |\alpha + \gamma| \cos(\pi/(n+1))$. Now suppose that $\alpha + \gamma \neq 0$. It follows from [30, Example 7.2.5] that the eigenvalues of $\tfrac{1}{2}(M + M^{\mathsf{T}})$ are
$$\lambda_k = \beta + (\alpha + \gamma) \cos\Big(\frac{k\pi}{n+1}\Big), \tag{50}$$
where $k \in \{1, \dots, n\}$. Consequently, $\{\lambda_k\}_{k=1}^{n} \subseteq \mathbb{R}_+ \Leftrightarrow \beta \ge |(\alpha + \gamma) \cos(\pi/(n+1))|$. Therefore, the characterization of monotonicity of $M$ follows from (43b). ∎

Proposition 6.5.
Let
$$M = \begin{pmatrix} \beta & \gamma & & \\ \alpha & \beta & \ddots & \\ & \ddots & \ddots & \gamma \\ & & \alpha & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{51}$$
Then exactly one of the following holds:

(i) $\alpha\gamma = 0$ and $\det(M) = \beta^n$. Consequently, $M$ is invertible $\Leftrightarrow \beta \neq 0$, in which case
$$[M^{-1}]_{i,j} = (-\alpha)^{\max\{i-j,\,0\}}\, (-\gamma)^{\max\{j-i,\,0\}}\, \beta^{\min\{j-i,\,i-j\}-1}. \tag{52}$$

(ii) $\alpha\gamma \neq 0$. Set $r = \frac{1}{2\alpha}\big({-\beta} + \sqrt{\beta^2 - 4\alpha\gamma}\big)$, set $s = \frac{1}{2\alpha}\big({-\beta} - \sqrt{\beta^2 - 4\alpha\gamma}\big)$, and set $\Lambda = \big\{\beta + 2\gamma\sqrt{\alpha/\gamma}\,\cos(k\pi/(n+1)) \;\big|\; k \in \{1, 2, \dots, n\}\big\}$. Then $rs = \gamma/\alpha \neq 0$. Moreover, $M$ is invertible $\Leftrightarrow 0 \notin \Lambda$ (in the special case when $\beta = 0$, this is equivalent to saying that $M$ is invertible $\Leftrightarrow$ $n$ is even), in which case
$$r \neq s \;\Rightarrow\; [M^{-1}]_{i,j} = -\frac{\alpha^{i-1}\big(r^{\min\{i,j\}} - s^{\min\{i,j\}}\big)\big(r^{n+1}s^{\max\{i,j\}} - r^{\max\{i,j\}}s^{n+1}\big)}{\gamma^{i}\,(r-s)\big(r^{n+1} - s^{n+1}\big)}, \tag{53a}$$
$$r = s \;\Rightarrow\; [M^{-1}]_{i,j} = -\frac{\min\{i,j\}\,\big(n+1-\max\{i,j\}\big)\, r^{\,j-i-1}}{\alpha\,(n+1)}. \tag{53b}$$
Alternatively, define the recurrence relations
$$u_0 = 0, \quad u_1 = 1, \quad u_k = -\tfrac{1}{\gamma}\big(\alpha u_{k-2} + \beta u_{k-1}\big), \quad k \ge 2; \tag{54a}$$
$$v_{n+1} = 0, \quad v_n = 1, \quad v_k = -\tfrac{1}{\alpha}\big(\beta v_{k+1} + \gamma v_{k+2}\big), \quad k \le n-1. \tag{54b}$$
Then
$$[M^{-1}]_{i,j} = -\frac{u_{\min\{i,j\}}\, v_{\max\{i,j\}}}{\alpha\, v_0}\Big(\frac{\gamma}{\alpha}\Big)^{j-1}. \tag{55}$$

Proof. (i): $\alpha\gamma = 0 \Leftrightarrow \alpha = 0$ or $\gamma = 0$, in which case $M$ is a (lower or upper) triangular matrix. Hence $\det(M) = \beta^n$, and the characterization follows. The formula in (52) is easily verified. (ii): Note that $0 \in \{r, s\} \Leftrightarrow \beta \in \{\pm\sqrt{\beta^2 - 4\alpha\gamma}\} \Leftrightarrow \beta^2 = \beta^2 - 4\alpha\gamma \Leftrightarrow \alpha\gamma = 0$. Hence $rs \neq 0$. Moreover, it follows from [30, Example 7.2.5] that $\Lambda$ is the set of eigenvalues of $M$; therefore, $M$ is invertible $\Leftrightarrow 0 \notin \Lambda$. The formulae (53) follow from [38, Remark 2 on page 110]. The recurrence formulae defined in (54) and [38, Theorem 2] yield (55). ∎

Remark 6.6. Concerning Proposition 6.5, it follows from [36, Section 2 on page 44] that we also have the alternative formulae
$$r \neq s \;\Rightarrow\; [M^{-1}]_{i,j} = \begin{cases} -\dfrac{1}{\gamma}\cdot\dfrac{s^{-i} - r^{-i}}{s^{-1} - r^{-1}}\cdot\dfrac{s^{-(n+1-j)} - r^{-(n+1-j)}}{s^{-(n+1)} - r^{-(n+1)}}, & j \ge i; \\[2mm] -\dfrac{1}{\alpha}\cdot\dfrac{s^{j} - r^{j}}{s - r}\cdot\dfrac{s^{\,n-i+1} - r^{\,n-i+1}}{s^{\,n+1} - r^{\,n+1}}, & j \le i, \end{cases} \tag{56a}$$
$$r = s \;\Rightarrow\; [M^{-1}]_{i,j} = \begin{cases} -\dfrac{i}{\gamma}\Big(1 - \dfrac{j}{n+1}\Big)\, r^{\,j-i+1}, & j \ge i; \\[2mm] -\dfrac{j}{\alpha}\Big(1 - \dfrac{i}{n+1}\Big)\, r^{\,j-i-1}, & j \le i. \end{cases} \tag{56b}$$
Using the binomial expansion, (56a), and a somewhat tedious calculation which we omit here, one can show that $[M^{-1}]_{i,j}$ is equal to
$$-\,2\,(2\alpha)^{\max\{i-j,0\}}(2\gamma)^{\max\{j-i,0\}}\, \frac{\Bigg(\displaystyle\sum_{m=0}^{\lceil \min\{i,j\}/2\rceil - 1}\binom{\min\{i,j\}}{2m+1}(-\beta)^{\min\{i,j\}-(2m+1)}\big(\beta^2 - 4\alpha\gamma\big)^{m}\Bigg)\Bigg(\displaystyle\sum_{m=0}^{\lceil (n+1-\max\{i,j\})/2\rceil - 1}\binom{n-\max\{i,j\}+1}{2m+1}(-\beta)^{n-\max\{i,j\}-2m}\big(\beta^2 - 4\alpha\gamma\big)^{m}\Bigg)}{\displaystyle\sum_{m=0}^{\lceil (n+1)/2\rceil - 1}\binom{n+1}{2m+1}(-\beta)^{n-2m}\big(\beta^2 - 4\alpha\gamma\big)^{m}} \tag{57}$$
provided that $r \neq s$.

Example 6.7. Let $\beta \ge 2$, set
$$M = \begin{pmatrix} \beta & -1 & & \\ -1 & \beta & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}, \tag{58}$$
set $r = \tfrac{1}{2}\big(\beta + \sqrt{\beta^2 - 4}\big)$ and set $s = \tfrac{1}{2}\big(\beta - \sqrt{\beta^2 - 4}\big)$. Then $M$ is monotone and invertible. Moreover,
$$r \neq s \;\Rightarrow\; [M^{-1}]_{i,j} = \frac{\big(r^{\min\{i,j\}} - s^{\min\{i,j\}}\big)\big(r^{n+1}s^{\max\{i,j\}} - r^{\max\{i,j\}}s^{n+1}\big)}{(r-s)\big(r^{n+1} - s^{n+1}\big)}, \tag{59a}$$
$$r = s \;\Rightarrow\; [M^{-1}]_{i,j} = \frac{\min\{i,j\}\,\big(n+1-\max\{i,j\}\big)}{n+1}. \tag{59b}$$
Alternatively, define the recurrence relations
$$u_0 = 0, \quad u_1 = 1, \quad u_k = \beta u_{k-1} - u_{k-2}, \quad k \ge 2, \tag{60a}$$
$$v_{n+1} = 0, \quad v_n = 1, \quad v_k = \beta v_{k+1} - v_{k+2}, \quad k \le n-1. \tag{60b}$$
Then
$$[M^{-1}]_{i,j} = \frac{u_{\min\{i,j\}}\, v_{\max\{i,j\}}}{v_0}. \tag{61}$$

Proof. The monotonicity of $M$ follows from Proposition 6.4 by noting that $\beta \ge 2 > 2\cos(\pi/(n+1))$. The same argument implies that
$$0 \notin \Lambda = \Big\{\beta - 2\cos\Big(\frac{k\pi}{n+1}\Big) \;\Big|\; k \in \{1, 2, \dots, n\}\Big\}. \tag{62}$$
Hence $M$ is invertible by Proposition 6.5(ii). Note that $\beta = 2 \Leftrightarrow \beta^2 - 4 = 0 \Leftrightarrow r = s = 1$. ∎

Let $M_1 = [\alpha_{i,j}]_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$ and $M_2 = [\beta_{i,j}]_{i,j=1}^{n} \in \mathbb{R}^{n \times n}$. Recall that the Kronecker product of $M_1$ and $M_2$ (see, e.g., [28, page 407] or [30, Exercise 7.6.10]) is defined by the block matrix
$$M_1 \otimes M_2 = [\alpha_{i,j} M_2] \in \mathbb{R}^{n^2 \times n^2}. \tag{63}$$

Lemma 6.8.
Let $M_1$ and $M_2$ be symmetric matrices in $\mathbb{R}^{n \times n}$. Then $M_1 \otimes M_2 \in \mathbb{R}^{n^2 \times n^2}$ is symmetric.

Proof. Using [30, Exercise 7.8.11(a)] or [28, Proposition 1(e) on page 408] we have $(M_1 \otimes M_2)^{\mathsf{T}} = M_1^{\mathsf{T}} \otimes M_2^{\mathsf{T}} = M_1 \otimes M_2$. ∎

The following fact is very useful in the proofs of the upcoming results.

Fact 6.9.
Let $M_1$ and $M_2$ be in $\mathbb{R}^{n \times n}$, with eigenvalues $\{\lambda_k \mid k \in \{1, \dots, n\}\}$ and $\{\mu_k \mid k \in \{1, \dots, n\}\}$, respectively. Then the eigenvalues of $M_1 \otimes M_2$ are $\{\lambda_j \mu_k \mid j, k \in \{1, \dots, n\}\}$.

Proof. See [28, Corollary 1 on page 412] or [30, Exercise 7.8.11(b)]. ∎

Corollary 6.10.
Let $M_1$ and $M_2$ in $\mathbb{R}^{n \times n}$ be monotone and such that $M_1$ or $M_2$ is symmetric. Then $M_1 \otimes M_2$ is monotone.

Proof. According to (43), it suffices to show that all the eigenvalues of $M_1 \otimes M_2 + (M_1 \otimes M_2)^{\mathsf{T}}$ are nonnegative. Suppose first that $M_1$ is symmetric. Then using [28, Proposition 1(e)&(c)] we have $M_1 \otimes M_2 + (M_1 \otimes M_2)^{\mathsf{T}} = M_1 \otimes M_2 + M_1 \otimes M_2^{\mathsf{T}} = M_1 \otimes (M_2 + M_2^{\mathsf{T}})$. Since $M_2$ is monotone, it follows from (43) that all the eigenvalues of $M_2 + M_2^{\mathsf{T}}$ are nonnegative. Now apply Fact 6.9 to $M_1$ and $M_2 + M_2^{\mathsf{T}}$ to learn that all the eigenvalues of $M_1 \otimes M_2 + (M_1 \otimes M_2)^{\mathsf{T}}$ are nonnegative; hence $M_1 \otimes M_2$ is monotone by (43b). A similar argument applies if $M_2$ is symmetric. ∎

Note that the assumption that at least one matrix is symmetric is critical in Corollary 6.10, as we show in the next example.
Example 6.11.
Suppose that
$$M = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{64}$$
Then $M$ is monotone, with eigenvalues $\{\pm i\}$, but not symmetric. However,
$$M \otimes M = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{65}$$
is a symmetric matrix with eigenvalues $\{\pm 1\}$ by Fact 6.9. Therefore, $M \otimes M$ is not monotone by (43).

Proposition 6.12.
Let $M \in \mathbb{R}^{n \times n}$ be symmetric. Then $\operatorname{Id} \otimes M$ is monotone $\Leftrightarrow$ $M \otimes \operatorname{Id}$ is monotone $\Leftrightarrow$ $M$ is monotone, in which case we have
$$J_{\operatorname{Id}_n \otimes M} = \operatorname{Id}_n \otimes J_M \quad\text{and}\quad J_{M \otimes \operatorname{Id}_n} = J_M \otimes \operatorname{Id}_n. \tag{66}$$

Proof. In view of Fact 6.9, the sets of eigenvalues of $\operatorname{Id} \otimes M$, $M \otimes \operatorname{Id}$, and $M$ coincide. It follows from Lemma 6.8 that $\operatorname{Id} \otimes M$ and $M \otimes \operatorname{Id}$ are symmetric. Now apply (43b) and use the monotonicity of $M$. To prove (66), we use [28, Proposition 1(c) on page 408] to learn that $\operatorname{Id}_{n^2} + \operatorname{Id}_n \otimes M = \operatorname{Id}_n \otimes \operatorname{Id}_n + \operatorname{Id}_n \otimes M = \operatorname{Id}_n \otimes (\operatorname{Id}_n + M)$. Therefore, by [28, Corollary 1(b) on page 408] we have
$$J_{\operatorname{Id}_n \otimes M} = (\operatorname{Id}_{n^2} + \operatorname{Id}_n \otimes M)^{-1} = \big(\operatorname{Id}_n \otimes (\operatorname{Id}_n + M)\big)^{-1} = \operatorname{Id}_n \otimes (\operatorname{Id}_n + M)^{-1} = \operatorname{Id}_n \otimes J_M. \tag{67}$$
The other identity in (66) is proved similarly. ∎

Corollary 6.13.
Let $\beta \in \mathbb{R}$. Set
$$M[\beta] = \begin{pmatrix} \beta & -1 & & \\ -1 & \beta & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & \beta \end{pmatrix} \in \mathbb{R}^{n \times n}, \tag{68}$$
and let $M_\rightarrow$ and $M_\uparrow$ be block matrices in $\mathbb{R}^{n^2 \times n^2}$ defined by
$$M_\rightarrow = \begin{pmatrix} M[\beta] & 0_n & \cdots & 0_n \\ 0_n & M[\beta] & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0_n \\ 0_n & \cdots & 0_n & M[\beta] \end{pmatrix} \quad\text{and}\quad M_\uparrow = \begin{pmatrix} \beta \operatorname{Id}_n & -\operatorname{Id}_n & & \\ -\operatorname{Id}_n & \beta \operatorname{Id}_n & \ddots & \\ & \ddots & \ddots & -\operatorname{Id}_n \\ & & -\operatorname{Id}_n & \beta \operatorname{Id}_n \end{pmatrix}. \tag{69}$$
Then $M_\rightarrow = \operatorname{Id}_n \otimes M[\beta]$ and $M_\uparrow = M[\beta] \otimes \operatorname{Id}_n$. Moreover, $M_\rightarrow$ is monotone $\Leftrightarrow$ $M_\uparrow$ is monotone $\Leftrightarrow$ $M[\beta]$ is monotone $\Leftrightarrow$ $\beta \ge 2\cos(\pi/(n+1))$, in which case
$$J_{M_\rightarrow} = \operatorname{Id}_n \otimes M[\beta+1]^{-1} \quad\text{and}\quad J_{M_\uparrow} = M[\beta+1]^{-1} \otimes \operatorname{Id}_n. \tag{70}$$

Proof. It is straightforward to verify that $M_\rightarrow = \operatorname{Id}_n \otimes M[\beta]$ and $M_\uparrow = M[\beta] \otimes \operatorname{Id}_n$. It follows from Proposition 6.4 that $M[\beta]$ is monotone $\Leftrightarrow \beta \ge 2\cos\big(\frac{\pi}{n+1}\big)$. Now combine with Proposition 6.12. To prove (70), note that $\operatorname{Id}_n + M[\beta] = M[\beta+1]$, and therefore $J_{M[\beta]} = M[\beta+1]^{-1}$. The conclusion follows by applying (66). ∎

The above matrices play a key role in the original design of the Douglas-Rachford algorithm; see the Appendix for details.
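The resolvent identity (70) can be checked numerically by combining it with the inverse recurrences (60)-(61) of Example 6.7. The following pure-Python sketch is an added illustration (not from the original text; the helper names and the choices $n = 3$, $\beta = 2$ are ours):

```python
# Verify J_{M->} = Id (x) M[beta+1]^{-1} for n = 3, beta = 2, where
# M[beta+1] = tridiag(-1, beta+1, -1) is inverted via (60)-(61).
def inv_tridiag(beta, n):
    u = [0.0, 1.0]                        # u_0 = 0, u_1 = 1       (60a)
    for _ in range(2, n + 1):
        u.append(beta * u[-1] - u[-2])    # u_k = beta*u_{k-1} - u_{k-2}
    v = [0.0] * (n + 2)                   # v_{n+1} = 0, v_n = 1   (60b)
    v[n] = 1.0
    for k in range(n - 1, -1, -1):
        v[k] = beta * v[k + 1] - v[k + 2]
    return [[u[min(i, j)] * v[max(i, j)] / v[0]                  # (61)
             for j in range(1, n + 1)] for i in range(1, n + 1)]

def kron(A, B):
    # Kronecker product (63) of square matrices given as nested lists.
    m = len(B)
    return [[A[i // m][j // m] * B[i % m][j % m]
             for j in range(len(A) * m)] for i in range(len(A) * m)]

n, beta = 3, 2.0
Mb  = [[beta if i == j else -1.0 if abs(i - j) == 1 else 0.0
        for j in range(n)] for i in range(n)]                    # M[beta]
I_n = [[float(i == j) for j in range(n)] for i in range(n)]
M_right = kron(I_n, Mb)                                          # M-> = Id (x) M[beta]
J = kron(I_n, inv_tridiag(beta + 1.0, n))                        # claimed J_{M->}
# (Id + M->) @ J should be the identity of order n^2.
N = n * n
for i in range(N):
    for j in range(N):
        entry = sum((float(i == k) + M_right[i][k]) * J[k][j] for k in range(N))
        assert abs(entry - float(i == j)) < 1e-9
```

Note that `inv_tridiag(3.0, 3)` reproduces the matrix $J_M$ displayed in (87) of the Appendix.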
Proposition 6.14.
Let $n \in \{2, 3, \dots\}$, let $M \in \mathbb{R}^{n \times n}$ and consider the block matrix
$$\mathbf{M} = \begin{pmatrix} M & -\operatorname{Id}_n & & \\ -\operatorname{Id}_n & M & \ddots & \\ & \ddots & \ddots & -\operatorname{Id}_n \\ & & -\operatorname{Id}_n & M \end{pmatrix} \in \mathbb{R}^{n^2 \times n^2}. \tag{71}$$
Let $\mathbf{x} = (x_1, x_2, \dots, x_n) \in \mathbb{R}^{n^2}$, where $x_i \in \mathbb{R}^{n}$, $i \in \{1, 2, \dots, n\}$. Then
$$\langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle = \langle x_1, (M - \operatorname{Id})x_1\rangle + \sum_{k=2}^{n-1}\langle x_k, (M - 2\operatorname{Id})x_k\rangle + \langle x_n, (M - \operatorname{Id})x_n\rangle + \sum_{i=1}^{n-1}\|x_i - x_{i+1}\|^2. \tag{72}$$
Moreover, the following hold:

(i) Suppose that $n = 2$. Then $M - \operatorname{Id}$ is monotone $\Leftrightarrow$ $\mathbf{M}$ is monotone.

(ii) $M - 2\operatorname{Id}$ is monotone $\Rightarrow$ $\mathbf{M}$ is monotone.

(iii) $\mathbf{M}$ is monotone $\Rightarrow$ $M - (2 - \tfrac{2}{n})\operatorname{Id}$ is monotone $\Rightarrow$ $M$ is monotone.

Proof. We have
$$\begin{aligned} \langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle &= \big\langle (x_1, x_2, \dots, x_n), (Mx_1 - x_2,\; {-x_1} + Mx_2 - x_3,\; \dots,\; {-x_{n-1}} + Mx_n)\big\rangle \\ &= \sum_{k=1}^{n}\langle x_k, Mx_k\rangle - 2\sum_{i=1}^{n-1}\langle x_i, x_{i+1}\rangle \\ &= \sum_{k=1}^{n}\langle x_k, Mx_k\rangle - \|x_1\|^2 - 2\sum_{k=2}^{n-1}\|x_k\|^2 - \|x_n\|^2 + \sum_{i=1}^{n-1}\|x_i - x_{i+1}\|^2, \end{aligned}$$
where we used the identity $2\langle x_i, x_{i+1}\rangle = \|x_i\|^2 + \|x_{i+1}\|^2 - \|x_i - x_{i+1}\|^2$; rearranging yields (72). (i): "$\Rightarrow$": Apply (72) with $n = 2$ and note that the middle sum is empty. "$\Leftarrow$": Let $y \in \mathbb{R}^2$. Applying (72) to the point $\mathbf{x} = (y, y) \in \mathbb{R}^4$, we get $0 \le \langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle = 2\langle y, (M - \operatorname{Id})y\rangle$. (ii): This is clear from (72), since then $M - \operatorname{Id} = (M - 2\operatorname{Id}) + \operatorname{Id}$ is monotone as well. (iii): Let $y \in \mathbb{R}^{n}$. Applying (72) to the point $\mathbf{x} = (y, y, \dots, y) \in \mathbb{R}^{n^2}$ yields $0 \le \langle \mathbf{x}, \mathbf{M}\mathbf{x}\rangle = 2\langle y, (M - \operatorname{Id})y\rangle + (n-2)\langle y, (M - 2\operatorname{Id})y\rangle = \langle y, (nM - (2n-2)\operatorname{Id})y\rangle$. Therefore, $M - (2 - \tfrac{2}{n})\operatorname{Id}$ is monotone. In turn, $M = \big(M - (2 - \tfrac{2}{n})\operatorname{Id}\big) + (2 - \tfrac{2}{n})\operatorname{Id}$ is the sum of two monotone matrices and hence monotone. ∎
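Identity (72) is also easy to confirm numerically. The following self-contained Python sketch is an added illustration (not from the original text; the helper names and the random test data are ours) and evaluates both sides of (72) for $n = 3$:

```python
import random

# Check identity (72) for a random M and x = (x_1, ..., x_n), n = 3.
random.seed(0)
n = 3
M = [[random.uniform(-2, 2) for _ in range(n)] for _ in range(n)]
x = [[random.uniform(-2, 2) for _ in range(n)] for _ in range(n)]

dot = lambda a, b: sum(p * q for p, q in zip(a, b))
matvec = lambda A, v: [dot(row, v) for row in A]

# Left-hand side: <x, Mx> with (Mx)_k = -x_{k-1} + M x_k - x_{k+1}.
lhs = 0.0
for k in range(n):
    Mx_k = matvec(M, x[k])
    if k > 0:
        Mx_k = [a - b for a, b in zip(Mx_k, x[k - 1])]
    if k < n - 1:
        Mx_k = [a - b for a, b in zip(Mx_k, x[k + 1])]
    lhs += dot(x[k], Mx_k)

# Right-hand side of (72); shift(v, t) computes (M - t*Id)v.
shift = lambda v, t: [a - t * b for a, b in zip(matvec(M, v), v)]
rhs = dot(x[0], shift(x[0], 1)) + dot(x[n - 1], shift(x[n - 1], 1))
rhs += sum(dot(x[k], shift(x[k], 2)) for k in range(1, n - 1))
rhs += sum(dot([a - b for a, b in zip(x[i], x[i + 1])],
               [a - b for a, b in zip(x[i], x[i + 1])]) for i in range(n - 1))
assert abs(lhs - rhs) < 1e-9
```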
The converse of Proposition 6.14(ii) is not true in general, as we illustrate now.
Example 6.15.
Set
$$M = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \tag{73}$$
and let $\mathbf{M}$ be as defined in Proposition 6.14. Then one verifies easily that $\mathbf{M}$ is monotone while $M - 2\operatorname{Id}$ is not.

We now show that the converses of the implications in Proposition 6.14(iii) are not true in general.

Example 6.16. Set $n = 2$, set $M = \tfrac{1}{2}\operatorname{Id} \in \mathbb{R}^{2 \times 2}$, and let $\mathbf{M}$ be as defined in Proposition 6.14. Then $M$ is monotone but $M - (2 - \tfrac{2}{2})\operatorname{Id} = -\tfrac{1}{2}\operatorname{Id}$ is not monotone, and
$$\mathbf{M} = \begin{pmatrix} \tfrac{1}{2} & 0 & -1 & 0 \\ 0 & \tfrac{1}{2} & 0 & -1 \\ -1 & 0 & \tfrac{1}{2} & 0 \\ 0 & -1 & 0 & \tfrac{1}{2} \end{pmatrix}. \tag{74}$$
Note that $\mathbf{M}$ is symmetric and has eigenvalues $\{-\tfrac{1}{2}, \tfrac{3}{2}\}$; hence $\mathbf{M}$ is not monotone by (43).

Acknowledgments
HHB was partially supported by the Natural Sciences and Engineering Research Council of Canada and by the Canada Research Chair Program.
References

[1] H. Attouch and M. Théra, A general duality principle for the sum of two operators, Journal of Convex Analysis 3 (1996), 1–24.
[2] Comptes rendus de l'Académie des Sciences.
[3] Houston Journal of Mathematics.
[4] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan and X. Wang, The rate of linear convergence of the Douglas-Rachford algorithm for subspaces is the cosine of the Friedrichs angle, Journal of Approximation Theory 185 (2014), 63–79.
[5] H.H. Bauschke, J.Y. Bello Cruz, T.T.A. Nghia, H.M. Phan and X. Wang, Optimal rates of convergence of matrices with applications, to appear in Numerical Algorithms. DOI 10.1007/s11075-015-0085-4.
[6] H.H. Bauschke, R.I. Boţ, W.L. Hare and W.M. Moursi, Attouch-Théra duality revisited: paramonotonicity and operator splitting, Journal of Approximation Theory 164 (2012), 1065–1084.
[7] H.H. Bauschke and P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2011.
[8] H.H. Bauschke, F. Deutsch, H. Hundal and S.-H. Park, Accelerating the convergence of the method of alternating projections, Transactions of the AMS 355 (2003), 3433–3461.
[9] H.H. Bauschke, S.M. Moffat and X. Wang, Firmly nonexpansive mappings and maximally monotone operators: correspondence and duality, Set-Valued and Variational Analysis 20 (2012), 131–153.
[10] H.H. Bauschke, D.R. Luke, H.M. Phan and X. Wang, Restricted normal cones and the method of alternating projections: applications, Set-Valued and Variational Analysis 21 (2013), 475–501.
[11] H.H. Bauschke and W.M. Moursi, On the Douglas-Rachford algorithm for two (not necessarily intersecting) affine subspaces, to appear in SIAM Journal on Optimization.
[12] H.H. Bauschke, X. Wang and L. Yao, On Borwein-Wiersma decompositions of monotone linear relations, SIAM Journal on Optimization 20 (2010), 2636–2652.
[13] H.H. Bauschke, X. Wang and L. Yao, Rectangularity and paramonotonicity of maximally monotone operators, Optimization 63 (2014), 487–504.
[14] J.M. Borwein, Fifty years of maximal monotonicity, Optimization Letters 4 (2010), 473–490.
[15] H. Brézis, Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert, North-Holland/Elsevier, 1973.
[16] R.E. Bruck and S. Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces, Houston Journal of Mathematics 3 (1977), 459–470.
[17] R.S. Burachik and A.N. Iusem, Set-Valued Mappings and Enlargements of Monotone Operators, Springer-Verlag, 2008.
[18] P.L. Combettes, The convex feasibility problem in image recovery, Advances in Imaging and Electron Physics 95 (1996), 155–270.
[19] P.L. Combettes, Solving monotone inclusions via compositions of nonexpansive averaged operators, Optimization 53 (2004), 475–504.
[20] D.W. Peaceman and H.H. Rachford, The numerical solution of parabolic and elliptic differential equations, Journal of SIAM 3 (1955), 28–41.
[21] J. Douglas and H.H. Rachford, On the numerical solution of heat conduction problems in two and three space variables, Transactions of the AMS 82 (1956), 421–439.
[22] J. Douglas, On the numerical integration of $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial u}{\partial t}$ by implicit methods, Journal of SIAM 3 (1955), 42–65.
[23] J. Eckstein, Splitting Methods for Monotone Operators with Applications to Parallel Optimization, Ph.D. thesis, MIT, 1989.
[24] J. Eckstein and D.P. Bertsekas, On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming 55 (1992), 293–318.
[25] R. Glowinski, Variational Methods for the Numerical Solution of Nonlinear Elliptic Problems, SIAM, 2015.
[26] A.N. Iusem, On some properties of paramonotone operators, Journal of Convex Analysis 5 (1998), 269–278.
[27] E. Kreyszig, Introductory Functional Analysis with Applications, Wiley, 1989.
[28] P. Lancaster and M. Tismenetsky, The Theory of Matrices with Applications, Academic Press, 1985.
[29] P.L. Lions and B. Mercier, Splitting algorithms for the sum of two nonlinear operators, SIAM Journal on Numerical Analysis 16 (1979), 964–979.
[30] C.D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM, 2000.
[31] W.E. Milne, Numerical Solutions of Differential Equations, New York, 1953.
[32] G.B. Passty, The parallel sum of nonlinear monotone operators, Nonlinear Analysis 10 (1986), 215–227.
[33] R.T. Rockafellar and R.J-B. Wets, Variational Analysis, corrected 3rd printing, Springer-Verlag, 2009.
[34] S. Simons, Minimax and Monotonicity, Springer-Verlag, 1998.
[35] S. Simons, From Hahn-Banach to Monotonicity, Springer-Verlag, 2008.
[36] T. Torii, Inversion of tridiagonal matrices and the stability of tridiagonal systems of linear equations, Information Processing in Japan 6 (1966), 41–46.
[37] Handbook of Numerical Analysis.
[38] Linear Algebra and its Applications.
[39] E. Zeidler, Nonlinear Functional Analysis and Its Applications II/A: Linear Monotone Operators, Springer-Verlag, 1990.
[40] E. Zeidler, Nonlinear Functional Analysis and Its Applications II/B: Nonlinear Monotone Operators, Springer-Verlag, 1990.
Appendix
In this section we briefly show the connection between the original Douglas-Rachford algorithm introduced in [21] (see also [22], [31] and [20] for variations of this method) to solve certain types of heat equations and the general algorithm introduced by Lions and Mercier in [29] (see also [19]).

Suppose that $\Omega$ is a bounded square region in $\mathbb{R}^2$. Consider the Dirichlet problem for the Poisson equation: Given $f$ and $g$, find $u\colon \Omega \to \mathbb{R}$ such that
$$\Delta u = f \text{ on } \Omega \quad\text{and}\quad u = g \text{ on } \operatorname{bdry} \Omega, \tag{75}$$
where $\Delta = \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ is the Laplace operator and $\operatorname{bdry} \Omega$ denotes the boundary of $\Omega$. Discretizing $u$ followed by converting it into a "long vector" $y$ (see [30, Example 7.6.2 & Problem 7.6.9]), we obtain the system of linear equations
$$L_\rightarrow y + L_\uparrow y = -b. \tag{76}$$
Here $L_\rightarrow$ and $L_\uparrow$ denote the horizontal (respectively vertical) positive definite discretization of the negative Laplacian over a square mesh with $n^2$ points at equally spaced intervals (see [30, Problem 7.6.10]). We have
$$L_\rightarrow = \operatorname{Id} \otimes M \quad\text{and}\quad L_\uparrow = M \otimes \operatorname{Id}, \tag{77}$$
where
$$M = \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix} \in \mathbb{R}^{n \times n}. \tag{78}$$
To see the connection to monotone operators, set $A = L_\rightarrow$ and $B\colon y \mapsto L_\uparrow y + b$. Then $A$ and $B$ are affine and strictly monotone. The problem then reduces to
$$\text{find } y \in \mathbb{R}^{n^2} \text{ such that } Ay + By = 0, \tag{79}$$
and the algorithm proposed by Douglas and Rachford in [21] becomes
$$y_{n+1/2} + Ay_n + By_{n+1/2} - y_n = 0, \tag{80a}$$
$$y_{n+1} - y_{n+1/2} - Ay_n + Ay_{n+1} = 0. \tag{80b}$$
Consequently,
$$\text{(80a)} \Leftrightarrow (\operatorname{Id} + B)(y_{n+1/2}) = (\operatorname{Id} - A)y_n \Leftrightarrow y_{n+1/2} = J_B(\operatorname{Id} - A)y_n, \tag{81a}$$
$$\text{(80b)} \Leftrightarrow (\operatorname{Id} + A)y_{n+1} = Ay_n + y_{n+1/2} \Leftrightarrow y_{n+1} = J_A(Ay_n + y_{n+1/2}). \tag{81b}$$
Substituting (81a) into (81b) to eliminate $y_{n+1/2}$ yields
$$y_{n+1} = J_A\big(Ay_n + J_B(\operatorname{Id} - A)y_n\big). \tag{82}$$
To proceed further, we must show that
$$(\operatorname{Id} - A)J_A = R_A \tag{83a}$$
and
$$AJ_A = \operatorname{Id} - J_A. \tag{83b}$$
Indeed, note that $\operatorname{Id} - A = 2\operatorname{Id} - (\operatorname{Id} + A)$; therefore, multiplying by $J_A = (\operatorname{Id} + A)^{-1}$ from the right yields $(\operatorname{Id} - A)J_A = \big(2\operatorname{Id} - (\operatorname{Id} + A)\big)J_A = 2J_A - \operatorname{Id} = R_A$. Hence $J_A - AJ_A = 2J_A - \operatorname{Id}$; equivalently, $AJ_A = \operatorname{Id} - J_A$. Now consider the change of variable
$$(\forall n \in \mathbb{N})\quad x_n = (\operatorname{Id} + A)y_n, \tag{84}$$
which is equivalent to $y_n = J_A x_n$. Substituting (82) into (84), and using (83), yields
$$\begin{aligned} x_{n+1} &= (\operatorname{Id} + A)y_{n+1} = (\operatorname{Id} + A)J_A\big(Ay_n + J_B(\operatorname{Id} - A)y_n\big) = Ay_n + J_B(\operatorname{Id} - A)y_n \\ &= AJ_A x_n + J_B(\operatorname{Id} - A)J_A x_n = x_n - J_A x_n + J_B R_A x_n = (\operatorname{Id} - J_A + J_B R_A)x_n, \end{aligned} \tag{85}$$
which is the Douglas-Rachford update formula (18).

We point out that $J_A = J_{L_\rightarrow}$ and, using [7, Proposition 23.15(ii)], we have $J_B = J_{L_\uparrow + b} = J_{L_\uparrow} - J_{L_\uparrow} b$. To calculate $J_A$ and $J_B$, apply Corollary 6.13 to get
$$J_A = \operatorname{Id}_n \otimes J_M \quad\text{and}\quad J_B = J_M \otimes \operatorname{Id}_n - (J_M \otimes \operatorname{Id}_n)(b). \tag{86}$$
For instance, when $n = 3$, the above calculations yield
$$J_M = \frac{1}{21}\begin{pmatrix} 8 & 3 & 1 \\ 3 & 9 & 3 \\ 1 & 3 & 8 \end{pmatrix}, \tag{87}$$
$$\operatorname{Id} \otimes J_M = \begin{pmatrix} J_M & 0_3 & 0_3 \\ 0_3 & J_M & 0_3 \\ 0_3 & 0_3 & J_M \end{pmatrix}, \tag{88}$$
and
$$J_M \otimes \operatorname{Id} = \frac{1}{21}\begin{pmatrix} 8\operatorname{Id}_3 & 3\operatorname{Id}_3 & \operatorname{Id}_3 \\ 3\operatorname{Id}_3 & 9\operatorname{Id}_3 & 3\operatorname{Id}_3 \\ \operatorname{Id}_3 & 3\operatorname{Id}_3 & 8\operatorname{Id}_3 \end{pmatrix}. \tag{89}$$
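To close, here is a small self-contained Python sketch, added as an illustration (the mesh size $n = 3$, the right-hand side $b$, and all helper names are our own choices, not from the paper). It assembles the discretization (77)-(78), reproduces $J_M$ from (87), runs the Douglas-Rachford update (85), and checks that $J_A x_k$ approaches the solution of (76):

```python
# Douglas-Rachford iteration (85) on the discretized Poisson system (76),
# with n = 3, using only pure-Python dense linear algebra.
def kron(A, B):
    m = len(B)
    return [[A[i // m][j // m] * B[i % m][j % m]
             for j in range(len(A) * m)] for i in range(len(A) * m)]

def inverse(A):
    # Gauss-Jordan inversion (adequate for these small, well-conditioned matrices).
    N = len(A)
    aug = [row[:] + [float(i == j) for j in range(N)] for i, row in enumerate(A)]
    for col in range(N):
        piv = max(range(col, N), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(N):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[N:] for row in aug]

matvec = lambda A, v: [sum(a * x for a, x in zip(row, v)) for row in A]

n = 3
M = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]                       # (78)
I = [[float(i == j) for j in range(n)] for i in range(n)]
L_right, L_up = kron(I, M), kron(M, I)                            # (77)
b = [float(k % 4 - 1) for k in range(n * n)]                      # sample data

JM = inverse([[I[i][j] + M[i][j] for j in range(n)] for i in range(n)])  # (87)
J_A = kron(I, JM)                                                 # (86)
J_up = kron(JM, I)                                                # J_{L^}
J_B = lambda x: [p - q for p, q in zip(matvec(J_up, x), matvec(J_up, b))]

x = [0.0] * (n * n)
for _ in range(500):                  # x_{k+1} = x_k - J_A x_k + J_B R_A x_k
    Jx = matvec(J_A, x)
    RAx = [2 * p - q for p, q in zip(Jx, x)]
    x = [xi - ji + bi for xi, ji, bi in zip(x, Jx, J_B(RAx))]

y = matvec(J_A, x)                    # y_k = J_A x_k converges to the solution
residual = [p + q + r for p, q, r in zip(matvec(L_right, y), matvec(L_up, y), b)]
assert max(abs(t) for t in residual) < 1e-8
```

Since $L_\rightarrow$ and $L_\uparrow$ are positive definite, the affine map in (85) is a contraction here, so a few hundred iterations drive the residual of (76) to numerical zero.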