Symbolic computation with monotone operators
Florian Lauster · D. Russell Luke · Matthew K. Tam
July 19, 2018
Dedicated to the memory of Jonathan Michael Borwein
Abstract
We consider a class of monotone operators which are appropriate for symbolic representation and manipulation within a computer algebra system. Various structural properties of the class (e.g., closure under taking inverses, resolvents) are investigated as well as the role played by maximal monotonicity within the class. In particular, we show that there is a natural correspondence between our class of monotone operators and the subdifferentials of convex functions belonging to a class of convex functions deemed suitable for symbolic computation of Fenchel conjugates which were previously studied by Bauschke & von Mohrenschildt and by Borwein & Hamilton. A number of illustrative examples utilizing the introduced class of operators are provided including computation of proximity operators, recovery of a convex penalty function associated with the hard thresholding operator, and computation of superexpectations, superdistributions and superquantiles with specialization to risk measures.
Keywords monotone operator, symbolic computation, experimental mathematics
Mathematics Subject Classification (2010)
F. Lauster · D. R. Luke · M. K. Tam: Institut für Num. und Angew. Mathematik, Universität Göttingen, 37083 Göttingen, Germany.

The Fenchel conjugate and the subdifferential of a function are two objects of fundamental importance in convex analysis. For this reason, software libraries or packages which have the ability to compute and manipulate such objects easily are a valuable addition to the convex analyst's toolkit. In the spirit of experimental mathematics [5,2,3,4], such software also enables researchers to use the machinery of convex analysis to test ideas and look for patterns. There are also potential pedagogical uses if one believes, as we do, that nonsmooth analysis could and should be a part of the traditional "calculus" canon taught to high school and beginning Bachelor's level students.

There are at least two possible paradigms for computation which can be followed for the development of such a software library, namely, computations can be done numerically or symbolically. Roughly speaking, the former involves numerical evaluation of the object under consideration on a grid of points in the ambient space whilst the latter involves the manipulation of objects through symbolic expressions with a Computer Algebra System (CAS). We focus on the second approach, the "symbolic paradigm". For further details regarding numerical convex analysis, we refer the reader to [15,16,1].

It is not too difficult to imagine that there are convex functions which, if not impossible, are too complex to represent and manipulate symbolically. Nevertheless, by restricting oneself to a suitable class of convex functions, a great deal can still be accomplished. Such a framework for symbolic convex analysis was proposed by Bauschke & von Mohrenschildt [7,8] for functions on the real line and an extension which could handle a many-dimensional setting was later proposed by Borwein & Hamilton [14,10].
By "suitable class" we mean a class of functions which are representable by an appropriate data-structure, are closed under operations such as Fenchel conjugation and are sufficiently generic so as to capture many important examples.

As the subdifferential of a convex function is a monotone operator, it is natural to ask what class of monotone operators is suitable for symbolic computation as well as their relationship to the "suitable class" of convex functions studied in [7,8,14,10]. To the best of our knowledge, little has been done in this direction with the aforementioned works not focusing on the structure of the underlying monotone operators directly. In this work, we propose and study such a class of monotone operators which are suitable for implementation within a CAS. Among our main results, we prove that there is a natural correspondence between our class of monotone operators and the subdifferentials of the functions studied by Bauschke & von Mohrenschildt (Theorem 3.1). We show that a consequence of this is that the class is closed under addition, scalar multiplication, taking inverses and taking resolvents (Proposition 3.3). We demonstrate the application of this class of operators on several illustrative examples in Section 4.
Our notation and terminology are standard and can be found, for instance, in [11] and [22]. Since this work concerns computer implementations, we restrict our attention to $\mathbb{R}^n$ equipped with the standard dot-product, denoted $\langle\cdot,\cdot\rangle$. For quick reference, we list some well-known facts from convex analysis that will be important later.

The (effective) domain of a function $f:\mathbb{R}^n\to[-\infty,+\infty]$ is the set $\operatorname{dom} f := \{x\in\mathbb{R}^n : |f(x)| < +\infty\}$. We will be interested in proper (not everywhere infinite and nowhere equal to $-\infty$), lower semi-continuous (lsc), convex functions. The subdifferential of a convex function is the set-valued mapping $\partial f:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ given by
$$\partial f(\bar{x}) := \begin{cases} \{\phi\in\mathbb{R}^n : \langle\phi, x-\bar{x}\rangle \le f(x)-f(\bar{x}) \text{ for all } x\in\mathbb{R}^n\} & \bar{x}\in\operatorname{dom} f,\\ \emptyset & \bar{x}\notin\operatorname{dom} f. \end{cases}$$
The Fenchel conjugate of $f$ is the function $f^*:\mathbb{R}^n\to[-\infty,+\infty]$ defined by
$$f^*(y) := \sup_{x\in\mathbb{R}^n}\{\langle y,x\rangle - f(x)\}.$$
The subdifferentials of a function and its Fenchel conjugate are inversely related.
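To make the definition of the Fenchel conjugate concrete, the following is a minimal sketch in Python of the numerical paradigm mentioned in the introduction: the supremum is replaced by a maximum over a finite grid. The function name `conjugate_on_grid` and the grid `GRID` are hypothetical choices made for this illustration and are not part of SCAT or of the authors' code.

```python
def conjugate_on_grid(f, xs, y):
    """Approximate f*(y) = sup_x { y*x - f(x) } by a maximum over the grid xs."""
    return max(y * x - f(x) for x in xs)


# Hypothetical test grid on [-5, 5] with spacing 1e-3 (an assumption of this sketch).
GRID = [i / 1000.0 for i in range(-5000, 5001)]


def half_square(x):
    """f(x) = x^2 / 2, which coincides with its own Fenchel conjugate."""
    return 0.5 * x * x


if __name__ == "__main__":
    # Compare the grid approximation with the exact conjugate y^2 / 2.
    for y in (-2.0, 0.0, 1.5):
        print(y, conjugate_on_grid(half_square, GRID, y), 0.5 * y * y)
    # For f = |.| the exact conjugate is +infinity whenever |y| > 1, but the
    # grid value stays finite because the grid is bounded.
    print(conjugate_on_grid(abs, GRID, 2.0))
```

The last line illustrates why a bounded grid cannot detect that a conjugate is infinite, which is one motivation for manipulating conjugates symbolically instead.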
Fact 2.1 ([11, Prop. 4.4.5])
Let $f:\mathbb{R}^n\to(-\infty,+\infty]$ be a function with $\bar{x}\in\operatorname{dom} f$. If $v\in\partial f(\bar{x})$ then $\bar{x}\in\partial f^*(v)$. Conversely, if $f$ is a convex function which is lsc at $\bar{x}$ and $\bar{x}\in\partial f^*(v)$, then $v\in\partial f(\bar{x})$.

Let $T:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ be a set-valued map. The domain of $T$ is the set $\operatorname{dom} T := \{x\in\mathbb{R}^n : Tx\neq\emptyset\}$, and the graph of $T$ is the set $\operatorname{gph} T := \{(x,y)\in\mathbb{R}^n\times\mathbb{R}^n : y\in Tx\}$. Recall that $T$ is monotone if
$$\left.\begin{array}{l}(x,x^+)\in\operatorname{gph} T\\ (y,y^+)\in\operatorname{gph} T\end{array}\right\}\implies \langle x-y, x^+-y^+\rangle\ge 0. \tag{1}$$
If the inequality in (1) is strict whenever $x\neq y$, then $T$ is said to be strictly monotone. If $T$ is monotone and there exists no monotone operator whose graph properly contains the graph of $T$, then $T$ is said to be maximal monotone.

Next we recall a notion stronger than monotonicity. An operator $T:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ is said to be cyclically monotone if, for every $n\ge 2$, it holds that
$$\left.\begin{array}{l}(x_1,x_1^+)\in\operatorname{gph} T\\ \quad\vdots\\ (x_n,x_n^+)\in\operatorname{gph} T\\ x_{n+1}=x_1\end{array}\right\}\implies \sum_{i=1}^{n}\langle x_i-x_{i+1}, x_i^+\rangle\ge 0.$$
Analogously, if $T$ is cyclically monotone and there exists no cyclically monotone operator whose graph properly contains the graph of $T$, then $T$ is said to be maximal cyclically monotone.

The family of maximal cyclically monotone operators can be characterized as the subdifferentials of proper, lsc, convex functions.

Fact 2.2 (Rockafellar [21])
Let $T:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$. Then $T$ is maximal cyclically monotone if and only if there exists a proper, lsc, convex function $f:\mathbb{R}^n\to(-\infty,+\infty]$ such that $T=\partial f$.

Regarding Fact 2.2, we remark that cyclic monotonicity is stronger than mere monotonicity. However, on the real line (i.e., $\operatorname{dom} T=\mathbb{R}$), the two notions coincide [6, Th. 22.18].

In general, the sum of two maximal monotone operators need not be maximal unless an appropriate constraint qualification is satisfied. The following fact gives one such qualification.
Fact 2.3 (Maximal monotonicity of sums [6, Th. 24.3])
Let $T_1,T_2:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ be maximal monotone such that
$$\operatorname{cone}(\operatorname{dom} T_1-\operatorname{dom} T_2)=\operatorname{cl}\operatorname{span}(\operatorname{dom} T_1-\operatorname{dom} T_2). \tag{2}$$
Then $T_1+T_2$ is maximal monotone. In particular, (2) holds whenever $0\in\operatorname{int}(\operatorname{dom} T_1-\operatorname{dom} T_2)$.

Let $S$ be a subset of $\mathbb{R}^n$. Recall that $S$ is said to be an $m$-dimensional simplex if there exists a set of $m+1$ affinely independent points whose convex hull equals $S$. We say that $S$ is locally simplicial, in the sense of [20], if for each $x\in S$ there exists a finite collection of simplices $S_1,\dots,S_m$ contained in $S$ such that, for some neighborhood $U$ of $x$,
$$U\cap(S_1\cup\dots\cup S_m)=U\cap S.$$
Examples of locally simplicial sets are line segments, polyhedral convex sets, and relatively open convex sets.
Fact 2.4 (Continuity on the effective domain [20, Th. 10.2])
Let $f$ be a convex function on $\mathbb{R}^n$, and let $S$ be any locally simplicial subset of $\operatorname{dom} f$. Then $f$ is upper semi-continuous relative to $S$. In particular, if $f$ is lsc, then $f$ is continuous relative to $S$.

As a consequence of Fact 2.4 and the fact that every convex subset of $\mathbb{R}$ is an interval (and hence simplicial), it follows that every convex lsc function on $\mathbb{R}$ is necessarily continuous on its domain.

Before turning our attention to monotone operators, we recall first the class of convex functions originally studied in [7,8,14,10]. The Maple package Symbolic Convex Analysis Toolbox (SCAT) [10] implements precisely this class of functions. Since the definition of the class is recursive in the dimension of the underlying domain, it is essential to understand the one-dimensional case first.

Recall that a function $f:\mathbb{R}^n\to[-\infty,+\infty]$ is strictly convex if
$$f(\lambda x+(1-\lambda)y)<\lambda f(x)+(1-\lambda)f(y)$$
for all $x,y\in\operatorname{dom} f$ with $x\neq y$ and for all $\lambda\in(0,1)$.

Definition 3.1 ($\mathcal{F}$-functions [7,8]) For a set of finitely many points $A=\{a_i\}_{i=0}^{m}$ satisfying
$$a_0=-\infty<a_1<\dots<a_{m-1}<a_m=+\infty, \tag{3}$$
we say a function $f:\mathbb{R}\to(-\infty,+\infty]$ belongs to $\mathcal{F}(A)$ if:
(a) $f$ is closed and convex;
(b) $f$ is continuous on its effective domain; and
(c) the restriction of $f$ to the interval $(a_i,a_{i+1})$ is either (i) affine, (ii) strictly convex and differentiable, or (iii) identically equal to $+\infty$.
The class of functions $\mathcal{F}$ is the union of $\mathcal{F}(A)$ over all finite sets of points $A$ satisfying (3).

Recalling that a function is closed (i.e., its epigraph is a closed set) if and only if it is lsc, we observe that Condition (a) is equivalent to requiring that functions in $\mathcal{F}$ be either proper, lsc and convex, or identically equal to $+\infty$. Moreover, as a convex function is continuous on the relative interior of its domain [20, Th. 10.1], the only place where Condition (b) can play a role is at boundary points of the effective domain.

Remark 3.1
The presentation of Definition 3.1 given here differs slightly from the version given in [7,8] in that we introduce the set $\mathcal{F}$ through the union of the sets $\mathcal{F}(A)$ rather than directly.

One of the most important properties of $\mathcal{F}$-functions is that the class is closed under Fenchel conjugation. This ensures that a data-structure designed to represent functions belonging to $\mathcal{F}$ is also able to represent their conjugates. This closure property was noted in [8] without proof. We shall return to this topic later, where our soon-to-be-introduced class of monotone operators is used to furnish a convenient proof. Another important property of the subdifferentials of $\mathcal{F}$-functions is that they may be expressed explicitly in terms of their gradient, when this exists.

Proposition 3.1 (Computing $\mathcal{F}$-subdifferentials [10]) Suppose $f\in\mathcal{F}(A)$ for $A=\{a_i\}_{i=0}^{m}$, and let $f|_i$ (resp. $f'|_i$) denote the restriction of $f$ (resp. $f'$) to the interval $(a_i,a_{i+1})$. Then $\partial f$ can be piecewise defined according to the following three cases.
(a) If $x\notin\operatorname{dom} f$ then $\partial f(x)=\emptyset$;
(b) If $x\in\operatorname{int}(\operatorname{dom} f)$ then
$$\partial f(x)=\Big[\lim_{y\uparrow x}f'(y),\ \lim_{y\downarrow x}f'(y)\Big];$$
(c) If $x\in\operatorname{dom} f\setminus\operatorname{int}(\operatorname{dom} f)$ then $x=a_i$ for some $i\in\{1,2,\dots,m-1\}$. In this case
$$\partial f(a_i)=\begin{cases}(-\infty,+\infty) & \text{if } f|_{i-1}=\infty=f|_i,\\ \big(-\infty,\ \lim_{y\downarrow a_i}f|_i'(y)\big] & \text{if } f|_{i-1}=\infty\neq f|_i,\\ \big[\lim_{y\uparrow a_i}f|_{i-1}'(y),\ +\infty\big) & \text{if } f|_{i-1}\neq\infty=f|_i.\end{cases}$$

As we have already seen, the subdifferential of a proper, lsc, convex function is always a maximal (cyclically) monotone operator (Fact 2.2). Thus, in light of the above proposition, we collect some of the finer monotonicity properties of the subdifferentials of $\mathcal{F}$-functions. The following lemma will simplify the proof of Proposition 3.2.

Lemma 3.1
Let $f\in\mathcal{F}(A)$ be a proper function where $A=\{a_i\}_{i=0}^{m}$. Then the restriction of $\partial f$ to the interval $(a_i,a_{i+1})$ is either (i) single-valued and constant, (ii) single-valued, continuous and strictly monotone, or (iii) identically equal to the empty-set.

Proof Consider the restriction of $f$ to the open interval $(a_i,a_{i+1})$. We distinguish three cases based on Definition 3.1(c). (i) If $f$ is affine on $(a_i,a_{i+1})$ then $f'$ is single-valued and constant. (ii) If $f$ is differentiable on $(a_i,a_{i+1})$ then $f'$ is continuous on $(a_i,a_{i+1})$ [20, Thm. 25.5.1], and if $f$ is strictly convex on $(a_i,a_{i+1})$ then $f'$ is strictly monotone on $(a_i,a_{i+1})$, by [11, Exer. 2.1.14] and [22, Thm. 12.17]. (iii) Otherwise, by virtue of belonging to $\mathcal{F}$, $f$ must be identically equal to $+\infty$ on $(a_i,a_{i+1})$ and, by definition, its subdifferential is identically equal to the empty-set on $(a_i,a_{i+1})$. $\square$
Proposition 3.2 (Structure of $\mathcal{F}$-subdifferentials) Let $f\in\mathcal{F}(A)$ be a proper function where $A=\{a_i\}_{i=0}^{m}$. The following assertions hold.
(a) The restriction of $\partial f$ to the interval $(a_i,a_{i+1})$ is either (i) single-valued and constant, (ii) single-valued, continuous and strictly monotone, or (iii) identically equal to the empty-set.
(b) For any $x\in\operatorname{dom} f\setminus\operatorname{int}(\operatorname{dom} f)$, $x=a_i$ for some $i\in\{1,\dots,m-1\}$ and
$$\begin{cases}\partial f(a_i)=(-\infty,+\infty) & \text{if } f|_{i-1}=\infty=f|_i,\\ \max\partial f(a_i)=\lim_{x\downarrow a_i}f|_i'(x) & \text{if } f|_{i-1}=\infty\neq f|_i,\\ \min\partial f(a_i)=\lim_{x\uparrow a_i}f|_{i-1}'(x) & \text{if } f|_{i-1}\neq\infty=f|_i;\end{cases}$$
where, by convention, $\min\emptyset=+\infty$ and $\max\emptyset=-\infty$.

Proof (a): Follows by applying Lemma 3.1 to each interval $(a_i,a_{i+1})$. (b): This is a direct consequence of Proposition 3.1(c). $\square$

Motivated by Propositions 3.1 and 3.2, we define the following class of monotone operators. We shall show that, in particular, it contains all subdifferentials of $\mathcal{F}$-functions.

Definition 3.2 ($\mathcal{T}$-operators) For a set of finitely many points $B=\{b_i\}_{i=0}^{l}$ satisfying
$$b_0=-\infty<b_1<\dots<b_{l-1}<b_l=+\infty, \tag{4}$$
we say a set-valued operator $T:\mathbb{R}\rightrightarrows\mathbb{R}$ belongs to $\mathcal{T}(B)$ if there exists a maximal monotone extension $\widetilde{T}$ of $T$ such that the restriction of $\widetilde{T}$ to each interval $(b_i,b_{i+1})$ is either
(i) single-valued and constant;
(ii) single-valued, continuous and strictly monotone; or
(iii) identically equal to the empty-set.
The class of operators $\mathcal{T}$ is the union of $\mathcal{T}(B)$ over all finite sets of points $B$ satisfying (4).

The proposition which soon follows establishes that the class of $\mathcal{T}$-operators is well-suited for symbolic manipulation. Moreover, as a monotone operator can have only countably many discontinuities in its domain, the restriction to monotone operators possessing at most finitely many discontinuities is still quite general.
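Definition 3.2 suggests a simple data-structure: a sorted list of breakpoints together with one rule per open interval. The following Python sketch illustrates this idea under stated assumptions. The class `PiecewiseOperator`, its fields and the helper `is_monotone_on_samples` are hypothetical names invented for this illustration; they are not the SCAT representation, and the monotonicity check only samples finitely many points, so it can refute monotonicity but not prove it.

```python
import math


class PiecewiseOperator:
    """Hypothetical container for an operator on the real line given by
    breakpoints b_0 = -inf < b_1 < ... < b_l = +inf, one callable per open
    interval (b_i, b_{i+1}) (None meaning the empty set), and a closed
    interval (lo, hi) of values at each finite breakpoint in the domain."""

    def __init__(self, breakpoints, pieces, at_breakpoints):
        assert len(pieces) == len(breakpoints) - 1
        self.b = breakpoints
        self.pieces = pieces
        self.at = at_breakpoints

    def __call__(self, x):
        """Return T(x) as a closed interval (lo, hi), or None if T(x) is empty."""
        if x in self.at:
            return self.at[x]
        for i, piece in enumerate(self.pieces):
            if self.b[i] < x < self.b[i + 1]:
                if piece is None:
                    return None
                value = piece(x)
                return (value, value)
        return None


def is_monotone_on_samples(T, xs):
    """Test inequality (1) on sample points. On the real line it reduces to:
    x < y implies max T(x) <= min T(y); consecutive samples suffice."""
    vals = [(x, T(x)) for x in sorted(xs)]
    vals = [(x, v) for x, v in vals if v is not None]
    return all(v1[1] <= v2[0]
               for (x1, v1), (x2, v2) in zip(vals, vals[1:]) if x1 < x2)


# The subdifferential of |.|: constant -1, constant +1, and [-1, 1] at 0.
SIGN = PiecewiseOperator([-math.inf, 0.0, math.inf],
                         [lambda x: -1.0, lambda x: 1.0],
                         {0.0: (-1.0, 1.0)})

if __name__ == "__main__":
    print(SIGN(-3.0), SIGN(0.0), SIGN(2.0))
    print(is_monotone_on_samples(SIGN, [-2.0, -1.0, 0.0, 1.0, 2.0]))
```

Storing intervals at breakpoints separately from the open pieces mirrors the split between the open intervals $(b_i,b_{i+1})$ and the points of $B$ in the definition.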
Remark 3.2 ($\mathcal{T}$-operators at points of discontinuity) Let $T\in\mathcal{T}$ with maximal monotone extension $\widetilde{T}\in\mathcal{T}(B)$ where $B=\{b_i\}_{i=0}^{l}$. From the definition of $\mathcal{T}$-operators, the only possible points of discontinuity of $T$ are the points in $B$. At a point $b_i\in\operatorname{int}(\operatorname{dom}\widetilde{T})$ for some $i\in\{1,\dots,l-1\}$, the restriction of $\widetilde{T}$ to either of the open intervals $(b_{i-1},b_i)$ and $(b_i,b_{i+1})$ is continuous and hence, by monotonicity, both of the limits $\lim_{x\uparrow b_i}\widetilde{T}(x)$ and $\lim_{x\downarrow b_i}\widetilde{T}(x)$ are finite. We therefore have that
$$T(b_i)\subseteq\widetilde{T}(b_i)=\Big[\lim_{x\uparrow b_i}\widetilde{T}(x),\ \lim_{x\downarrow b_i}\widetilde{T}(x)\Big],$$
where the equality holds due to outer semi-continuity of $\widetilde{T}$ [12].

The following proposition shows that all of the most important closure properties hold for the class of $\mathcal{T}$-operators.

Proposition 3.3 (Properties of $\mathcal{T}$-operators) The following assertions hold.
(a) If $T\in\mathcal{T}$ and $\lambda\ge 0$ then $\lambda T\in\mathcal{T}$.
(b) If $T_1,T_2\in\mathcal{T}$ then $T_1+T_2\in\mathcal{T}$.
(c) If $T\in\mathcal{T}$ then $T^{-1}\in\mathcal{T}$.
(d) If $T\in\mathcal{T}$ and $\lambda>0$ then $(I+\lambda T)^{-1}\in\mathcal{T}$.
(e) If $T_1,T_2\in\mathcal{T}$ then $(T_1^{-1}+T_2^{-1})^{-1}\in\mathcal{T}$.

Proof (a): Let $\widetilde{T}\in\mathcal{T}$ be a maximal monotone extension of $T$ and $\lambda\ge 0$. Then $\lambda\widetilde{T}\in\mathcal{T}$ and, moreover, $\lambda\widetilde{T}$ is a maximal monotone extension of $\lambda T$.

(b): Define $T:=T_1+T_2$ and let $\widetilde{T}_1$ and $\widetilde{T}_2$ denote maximal monotone extensions respectively of $T_1$ and $T_2$ contained in $\mathcal{T}$. Setting $\widetilde{T}:=\widetilde{T}_1+\widetilde{T}_2$, we therefore have that
$$\operatorname{dom}\widetilde{T}=\operatorname{dom}\widetilde{T}_1\cap\operatorname{dom}\widetilde{T}_2\supseteq\operatorname{dom}T_1\cap\operatorname{dom}T_2=\operatorname{dom}T. \tag{5}$$
We now distinguish three cases based on $\operatorname{dom}\widetilde{T}$.
(i) Suppose $\operatorname{dom}\widetilde{T}=\emptyset$. Then, using (5), it follows that $\operatorname{dom}T=\emptyset$. As the empty relation is trivially contained in $\mathcal{T}$, we have that $T\in\mathcal{T}$.
(ii) Suppose $\operatorname{dom}\widetilde{T}\neq\emptyset$ but $\operatorname{int}(\operatorname{dom}\widetilde{T})=\emptyset$. Since $\operatorname{dom}\widetilde{T}$ is the intersection of two convex sets, $\operatorname{dom}\widetilde{T}_1$ and $\operatorname{dom}\widetilde{T}_2$, it follows that $\operatorname{dom}\widetilde{T}$ is a singleton, say, $\operatorname{dom}\widetilde{T}=\{x_0\}$. In this case, the operator
$$x\mapsto\begin{cases}(-\infty,+\infty) & \text{if } x=x_0,\\ \emptyset & \text{otherwise}\end{cases}$$
defines a maximal monotone extension of $\widetilde{T}$, and hence also defines a maximal monotone extension of $T$, which is contained in $\mathcal{T}$.
(iii) Suppose $\operatorname{int}(\operatorname{dom}\widetilde{T})\neq\emptyset$. Then $0\in\operatorname{int}(\operatorname{dom}\widetilde{T}_1-\operatorname{dom}\widetilde{T}_2)$ and hence, by Fact 2.3, the extension $\widetilde{T}$ is maximal monotone. Let $\{b_i\}_{i=0}^{l}$ denote the union of the sets of breakpoints for $\widetilde{T}_1$ and $\widetilde{T}_2$, provided by Definition 3.2. To see that $\widetilde{T}\in\mathcal{T}$, observe that the restriction of $\widetilde{T}$ to each open interval $(b_i,b_{i+1})$ is either single-valued and constant, single-valued, continuous and strictly monotone, or identically equal to the empty-set.

(c): Let $\widetilde{T}\in\mathcal{T}(B)$ be a maximal monotone extension of $T$. Since $\widetilde{T}^{-1}$ is a maximal monotone extension of $T^{-1}$, it suffices to show that $\widetilde{T}^{-1}\in\mathcal{T}$.
To this end, observe that
$$\operatorname{dom}\widetilde{T}^{-1}=\operatorname{range}\widetilde{T}=\Big(\bigcup_{i=0}^{l-1}\widetilde{T}((b_i,b_{i+1}))\Big)\cup\Big(\bigcup_{i=1}^{l-1}\widetilde{T}(b_i)\Big), \tag{6}$$
where we denote $\widetilde{T}((b_i,b_{i+1})):=\{y\in\widetilde{T}(x) : x\in(b_i,b_{i+1})\}$. Both $\widetilde{T}$ and $\widetilde{T}^{-1}$ are maximal monotone and closed convex-valued [22, Exerc. 12.8]. To show that $\widetilde{T}^{-1}\in\mathcal{T}$, it suffices to show that, on each piece of its domain specified by (6) which is not a singleton, $\widetilde{T}^{-1}$ is single-valued and either constant, or strictly monotone and hence continuous by maximal monotonicity. To see this, we distinguish the following cases, using the fact that $\widetilde{T}\in\mathcal{T}$.
(i) Consider a piece in (6) of the form $\widetilde{T}((b_i,b_{i+1}))$. There are two possibilities:
  (I) $\widetilde{T}$ is single-valued and constant with value $c$ on $(b_i,b_{i+1})$: In this case $\widetilde{T}^{-1}(c)$ is a closed interval containing $(b_i,b_{i+1})$.
  (II) $\widetilde{T}$ is single-valued, continuous and strictly monotone on $(b_i,b_{i+1})$: In this case, $\widetilde{T}^{-1}$ is single-valued, continuous and strictly monotone on $U=\widetilde{T}((b_i,b_{i+1}))$ and, moreover, $U$ is an interval [23, Th. 5.11.14].
(ii) Next consider a piece in (6) of the form $\widetilde{T}(b_i)$ where $b_i\in\operatorname{dom}\widetilde{T}$. Then $b_i\in\widetilde{T}^{-1}(y)$ for all $y\in\widetilde{T}(b_i)$, where $\widetilde{T}(b_i)$ is a closed convex set due to the maximal monotonicity of $\widetilde{T}$. If $\operatorname{int}\widetilde{T}(b_i)=\emptyset$ then $\widetilde{T}(b_i)$ is a singleton and there is nothing further to prove. Suppose, then, that the open interval $U:=\operatorname{int}\widetilde{T}(b_i)$ is non-empty. There are two possibilities.
  (I) $\widetilde{T}^{-1}$ is single-valued on $U$: Then $\widetilde{T}^{-1}(y)=\{b_i\}$ for all $y\in U$.
  (II) $\widetilde{T}^{-1}$ is multi-valued on $U$: Then there exist points $y_0\in U$ and $x_0\neq b_i$ such that $x_0\in\widetilde{T}^{-1}(y_0)$; note that $y_0>\inf\widetilde{T}(b_i)$ since $y_0\in U$. For convenience, we assume that $x_0<b_i$; an analogous argument applies when $x_0>b_i$. Since $\widetilde{T}^{-1}$ is closed and convex-valued, $[x_0,b_i]\subseteq\widetilde{T}^{-1}(y_0)$ and hence $y_0\in\widetilde{T}(x)$ for all $x\in[x_0,b_i]$. Since $\widetilde{T}\in\mathcal{T}$, it must be single-valued on $(b_{i-1},b_i)$, hence $\widetilde{T}x=\{y_0\}$ for all $x\in(b_{i-1},b_i)$. Since $\widetilde{T}$ is maximal monotone, it must hold that $y_0=\inf\widetilde{T}(b_i)$, which is a contradiction.
We conclude, therefore, that $\widetilde{T}^{-1}(y)=\{b_i\}$ for all $y\in\widetilde{T}(b_i)$.
Cases (i) and (ii) together imply that $\widetilde{T}^{-1}\in\mathcal{T}$ which completes the proof of (c).

(d): By (a) it follows that $\lambda T\in\mathcal{T}$. Noting that the identity operator $I$ is a maximal monotone operator contained in $\mathcal{T}$ with $\operatorname{dom}I=\mathbb{R}$, (b) implies that $I+\lambda T\in\mathcal{T}$. The result now follows from (c).

(e): This follows immediately from parts (b) and (c). $\square$

Example 3.1 (Examples of $\mathcal{T}$-operators) Proposition 3.3(d) ensures that the resolvent of any $\mathcal{T}$-operator belongs to $\mathcal{T}$. In particular, $\mathcal{T}$ contains all proximity mappings of $\mathcal{F}$-functions. In other words, if $f\in\mathcal{F}$ and $\lambda>0$ then
$$\operatorname{prox}_{\lambda f}:=\operatorname*{arg\,min}_{y\in\mathbb{R}}\Big\{f(y)+\frac{1}{2\lambda}\|\cdot-y\|^2\Big\}=(I+\lambda\partial f)^{-1}\in\mathcal{T}.$$
In particular, by considering the indicator functions contained in $\mathcal{F}$, we see that $\mathcal{T}$ contains all projection operators onto closed, convex subsets of $\mathbb{R}$.

Remark 3.3 (Maximal monotone extensions of $T\in\mathcal{T}$) When defining the class of $\mathcal{T}$-operators in Definition 3.2, one might have instead required an operator $T\in\mathcal{T}$ to be maximal monotone itself, rather than merely to possess a maximal monotone extension in $\mathcal{T}$. This approach has a significant shortcoming in that the empty relation is no longer in $\mathcal{T}$. Consequently, Proposition 3.3(b) no longer holds, as can be seen by considering the sum of two maximal monotone operators whose domains do not intersect.

The following theorem summarizes the connection between $\mathcal{F}$-functions and $\mathcal{T}$-operators.
Theorem 3.1 If $f\in\mathcal{F}$ is proper then $\partial f$ is maximal monotone and belongs to $\mathcal{T}$. Conversely, if $T\in\mathcal{T}$ is maximal monotone then there exists a proper, lsc, convex function $f$ such that $T=\partial f$ and, moreover, any such function belongs to $\mathcal{F}$.

Proof If $f\in\mathcal{F}$ is a proper function then the fact that $\partial f$ is maximal monotone and belongs to $\mathcal{T}$ was already proven in Proposition 3.2.

Conversely, let $T\in\mathcal{T}$ be a maximal monotone operator. By Fact 2.2, there exists at least one proper, lsc, convex function with subdifferential equal to $T$. Let $f$ denote any such function (which already satisfies Definition 3.1(a)). By Fact 2.4, $f$ is continuous on $\operatorname{dom}f$, that is, Definition 3.1(b) is satisfied. Finally, to show that $f$ satisfies Definition 3.1(c), first recall that a convex function is differentiable at a point in its domain if and only if its subdifferential is a singleton at the same point [11, Th. 2.2.1]. It follows that $f$ can be non-differentiable only if $T$ is multi-valued, which happens at most at finitely many points. Consider the restriction of the function $f$ to an open interval on which it is differentiable. Then, as $T\in\mathcal{T}$, we have that $f'$ is either constant or strictly monotone on this interval. If $f'$ is constant then $f$ is affine. Otherwise $f'$ is strictly monotone and hence $f$ is strictly convex [22, Th. 2.13]. $\square$

Note that Theorem 3.1 provides a pathway to symbolically computing a maximal monotone extension of a monotone operator $T\in\mathcal{T}$. First find a function $f\in\mathcal{F}$ such that $\operatorname{gph}T\subseteq\operatorname{gph}\partial f$. A maximal extension of $T$ is then given by $\partial f$.

We now return to the question of closure of $\mathcal{F}$-functions under the operation of Fenchel conjugation. We offer the following proof, which utilizes our class of monotone operators.

Proposition 3.4 ($\mathcal{F}$ is closed under Fenchel conjugation) If $f\in\mathcal{F}$ is proper then $f^*\in\mathcal{F}$.

Proof By Theorem 3.1, $\partial f\in\mathcal{T}$. Combining Fact 2.1 with Proposition 3.3(c) shows that $\partial f^*=(\partial f)^{-1}\in\mathcal{T}$.
Using Theorem 3.1 a second time yields $f^*\in\mathcal{F}$. $\square$

Assumption (c) of Definition 3.1 is crucial for obtaining closedness of the family $\mathcal{F}$ under Fenchel conjugation. Specifically, the strict convexity assumption on non-affine pieces of the domain cannot be removed and replaced with mere differentiability. In fact, the following counter-example shows that this is the case even for infinitely differentiable convex functions.

Example 3.2 (Necessity of finitely affine pieces)
Consider the convex function (see Figure 1) constructed from (unnormalized) $C^\infty$ mollifying functions as follows:
$$f(x):=\int_0^x\int_0^y h(z)\,dz\,dy \tag{7}$$
with
$$h(x):=\begin{cases}\psi\big(2^{2n+1}x-1\big) & \text{for } x\in[2^{-(2n+1)},2^{-2n}]\ (n\in\mathbb{N}),\\ 0 & \text{for } x\in(2^{-(2n+2)},2^{-(2n+1)})\ (n\in\mathbb{N}),\\ 0 & \text{for } x\in(-\infty,0]\cup(1,\infty),\end{cases}$$
where $\psi$ denotes the mollifying function given by
$$\psi(x):=\begin{cases}\exp\Big(-\dfrac{1}{1-(2x-1)^2}\Big) & x\in(0,1),\\ 0 & \text{otherwise.}\end{cases}$$

[Fig. 1: Construction of the convex function $f$ in (7). The panels show $f'$ and $h=f''$.]

The function $h$ is nonnegative and infinitely differentiable on $\mathbb{R}\setminus\{0\}$ so that $\int_0^y h(z)\,dz$ is continuous and nondecreasing. It follows that $f$ is convex on $\mathbb{R}$ and infinitely differentiable on $\mathbb{R}\setminus\{0\}$ and hence satisfies properties (a)-(b) of Definition 3.1. The function $f$, however, does not satisfy (c) of Definition 3.1 as it is affine on every interval $(2^{-(2n+2)},2^{-(2n+1)})$ for $n\in\mathbb{N}$ fixed, with slope $a_n$ given by
$$a_n=\int_0^{2^{-(2n+1)}}h(z)\,dz=\sum_{j=n+1}^{\infty}\int_{\mathbb{R}}\psi\big(2^{2j+1}x-1\big)\,dx=\sum_{j=n+1}^{\infty}2^{-(2j+1)}\int_{\mathbb{R}}\psi(y)\,dy.$$
Now, since $f$ is affine on infinitely many intervals in $[0,1/2]$, its subdifferential $\partial f$ is constant and singleton on infinitely many intervals in $[0,1/2]$ with value on these intervals given by $\{a_n\}_{n\in\mathbb{N}}$. It follows that $\partial f^*=(\partial f)^{-1}$ is multi-valued at each point in $\{a_n\}_{n\in\mathbb{N}}$, of which there are infinitely many, and therefore $f^*$ cannot be in $\mathcal{F}$. $\square$

In this section we detail a number of computational examples and applications which utilize the class of monotone operators introduced above. We perform our symbolic computations in
Maple making use of the data-structures provided by the
Symbolic Convex Analysis Toolkit (SCAT) developed by Borwein & Hamilton [10]. We shall also make use of an additional function, shown in Figure 2, for computing the inverse of a monotone operator. The source code for the examples which follow as well as the SCAT library are available online at:
http://vaopt.math.uni-goettingen.de/software.php

Although we consider the one-dimensional setting, it is worth noting that, as recognized by [7], separable convex functions on $\mathbb{R}^n$ can still be handled. Recall that a convex function $f:\mathbb{R}^n\to(-\infty,+\infty]$ is separable if there exist convex functions $f_j:\mathbb{R}\to(-\infty,+\infty]$ such that $f(x)=\sum_{j=1}^{n}f_j(x_j)$. For such a function,
$$\partial f(x)=\partial f_1(x_1)\times\dots\times\partial f_n(x_n),\qquad f^*(y)=\sum_{j=1}^{n}f_j^*(y_j).$$
In this way, we may also consider monotone operators $T:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ such that
$$T(x)=T_1(x_1)\times\dots\times T_n(x_n)$$
where each $T_j:\mathbb{R}\rightrightarrows\mathbb{R}$ is a monotone operator. We give examples in which this kind of structure arises in Sections 4.1 and 4.2.

with(SCAT):
Fig. 2: Inversion of a maximal cyclic monotone operator using Fenchel conjugation.

4.1 Explicit Formula for Proximity Operators

Recall that the proximity operator of a proper, lsc, convex function $f:\mathbb{R}^n\to(-\infty,+\infty]$ with parameter $\lambda>0$ is given by
$$\operatorname{prox}_{\lambda f}:=\operatorname*{arg\,min}_{y\in\mathbb{R}^n}\Big\{f(y)+\frac{1}{2\lambda}\|\cdot-y\|^2\Big\}.$$
Proximity operators are the building blocks of many iterative algorithms in optimization and thus it is important, in practice, that they can be efficiently computed. One such possibility is to find an explicit formula for the proximity operator.

We compute explicit forms for some classical proximity operators using the framework of $\mathcal{T}$-operators. Our approach exploits the formula
$$\operatorname{prox}_{\lambda f}=(I+\lambda\partial f)^{-1}. \tag{8}$$
Given a function $f\in\mathcal{F}$, we symbolically compute its subdifferential using the SCAT package.
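Formula (8) can also be checked numerically: $y\in(I+\lambda\partial f)^{-1}(x)$ holds exactly when $(x-y)/\lambda\in\partial f(y)$. The following Python sketch, which is independent of SCAT, tests this inclusion for two candidate closed forms, soft thresholding for $f=|\cdot|$ and clamping for the indicator function of $[a,b]$. All function names are hypothetical and chosen for this illustration only; the subdifferentials are written out by hand following Proposition 3.1.

```python
import math


def subdiff_abs(y):
    """Subdifferential of |.| at y, returned as a closed interval (lo, hi)."""
    if y > 0:
        return (1.0, 1.0)
    if y < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)


def normal_cone(a, b):
    """Subdifferential of the indicator function of [a, b], assuming a < b."""
    def cone(y):
        if y < a or y > b:
            return None                 # empty outside the domain
        if y == a:
            return (-math.inf, 0.0)
        if y == b:
            return (0.0, math.inf)
        return (0.0, 0.0)
    return cone


def in_resolvent(x, y, lam, T, tol=1e-12):
    """Check y in (I + lam*T)^{-1}(x), that is, (x - y)/lam in T(y)."""
    image = T(y)
    if image is None:
        return False
    lo, hi = image
    v = (x - y) / lam
    return lo - tol <= v <= hi + tol


def soft_threshold(x, lam):
    """Candidate closed form for the proximity operator of lam*|.|."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def project(x, a, b):
    """Candidate closed form for the projector onto [a, b]."""
    return min(max(x, a), b)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.3, 2.0):
        print(x, in_resolvent(x, soft_threshold(x, 0.5), 0.5, subdiff_abs))
```

A numerical check of this kind only samples finitely many points; the symbolic computation described in this section is what yields a formula valid for every argument.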
Using (8) and Proposition 3.3, we deduce that $\operatorname{prox}_{\lambda f}\in\mathcal{T}$.

Two example computations of proximity operators are given in Figures 3 and 4. Furthermore, it is worth noting that, thanks to our theory in Section 3, this computation actually proves the resulting formula for the proximity function.

with(SCAT):
$$f:=\begin{cases}-x & x<0\\ 0 & x=0\\ x & x>0\end{cases}$$
sdf := SubDiff(f):
$$\operatorname{prox}_{\lambda f}=\begin{cases}\{\lambda+y\} & y<-\lambda\\ \{0\} & y=-\lambda\\ \{0\} & (-\lambda<y)\text{ and }(y<\lambda)\\ \{0\} & y=\lambda\\ \{-\lambda+y\} & \lambda<y\end{cases}$$
Fig. 3: Computation of the proximity function of $f=\|\cdot\|_1$.

with(SCAT):
$$f:=\begin{cases}\infty & x<a\\ 0 & x=a\\ 0 & (a<x)\text{ and }(x<b)\\ 0 & x=b\\ \infty & b<x\end{cases}$$
sdf := SubDiff(f):
$$P_{[a,b]}=\begin{cases}\{a\} & y<a\\ \{a\} & y=a\\ \{y\} & (a<y)\text{ and }(y<b)\\ \{b\} & y=b\\ \{b\} & b<y\end{cases}$$
Fig. 4: Computation of the proximity function of $f=\iota_{[a,b]}$ (i.e., projector onto $[a,b]$).

4.2 Recovery of Penalty Functions

Given a monotone operator $T:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$, we consider the problem of finding a so-called penalty function $f:\mathbb{R}^n\to(-\infty,+\infty]$, that is, a function $f$ whose proximity operator can be identified with $T$. Precisely, find a function $f$ such that
$$\operatorname{gph}T\subseteq\operatorname{gph}(\operatorname{prox}_{\lambda f}).$$
We shall focus on the case in which $\lambda=1$ as it covers all the technicalities of the general case and thus we define $\operatorname{prox}_f:=\operatorname{prox}_{1f}$. The same problem was previously studied in [9], without symbolic computational tools.

The following proposition and its proof shall form the basis of our approach.

Proposition 4.1 (Recovery of penalty functions)
Let $T:\mathbb{R}^n\rightrightarrows\mathbb{R}^n$ be a maximal cyclically monotone operator. There exists a proper, lsc function $f$ with $f+\frac12\|\cdot\|^2$ convex such that $T=\operatorname{prox}_f$. Furthermore, if $T\in\mathcal{T}$ then there exists an $f$ such that $f+\frac12\|\cdot\|^2$ belongs to $\mathcal{F}$.

Proof By Fact 2.2, there exists a proper, lsc, convex function $h:\mathbb{R}^n\to(-\infty,+\infty]$ such that $T=\partial h$, and in particular, if $T\in\mathcal{T}$ then Theorem 3.1 ensures that we may choose $h\in\mathcal{F}$. Using Fermat's rule [22, Th. 10.1], we deduce that
$$\begin{aligned}T(x)=\partial h(x)=(\partial h^*)^{-1}(x)&=\{y\in\mathbb{R}^n : x\in\partial h^*(y)\}\\&=\operatorname*{arg\,min}_{y\in\mathbb{R}^n}\{h^*(y)-\langle x,y\rangle\}\\&=\operatorname*{arg\,min}_{y\in\mathbb{R}^n}\Big\{\Big(h^*(y)-\frac12\|y\|^2\Big)+\frac12\|x-y\|^2\Big\}.\end{aligned}$$
We therefore have $T=\operatorname{prox}_f$ where $f:=h^*-\frac12\|\cdot\|^2$. The fact that $f$ is proper and lsc with $f+\frac12\|\cdot\|^2$ convex follows since $h^*$ is proper, lsc and convex. In particular, if $h\in\mathcal{F}$ then $f+\frac12\|\cdot\|^2=h^*\in\mathcal{F}$. $\square$

We are now ready to state our strategy for reconstruction of the penalty function associated with the monotone operator $T\in\mathcal{T}$.
(i) Find a maximal cyclically monotone extension $\widetilde{T}\in\mathcal{T}$ of $T$.
(ii) Find a function $h\in\mathcal{F}$ such that $\partial h=\widetilde{T}$.
(iii) Compute the Fenchel conjugate $h^*$ of $h$.
(iv) The penalty function $f$ can now be given as $f:=h^*-\frac12\|\cdot\|^2$.
We note that the existence of a function $h$ in Step (ii) is guaranteed by Theorem 3.1, and that $h$ can be obtained by integrating any selection of $T$ [17, Prop. 1.6.1]. Step (iii) can be performed for $\mathcal{F}$-functions within the SCAT package and Step (iv) is clearly straightforward. Thus the only potentially difficult computation arises in Step (i); but this can sometimes be dealt with satisfactorily as we shall now demonstrate.

Example 4.1 (Hidden convexity of the hard thresholding operator)
The hard thresholding operator $H_\alpha:\mathbb{R}\to\mathbb{R}$ for parameter $\alpha>0$ is defined by
$$H_\alpha(x):=\begin{cases}x & \text{if } |x|>\alpha,\\ 0 & \text{otherwise.}\end{cases}$$
$H_\alpha$ is usually viewed as a selection of the (set-valued) proximity operator of the $\ell_0$-functional; a non-convex object. More precisely,
$$\operatorname{prox}_{\frac{\alpha^2}{2}\|\cdot\|_0}(x)=\operatorname*{arg\,min}_{y\in\mathbb{R}}\Big\{\frac{\alpha^2}{2}\|y\|_0+\frac12|x-y|^2\Big\}=\begin{cases}x & \text{if } |x|>\alpha,\\ \{0,x\} & \text{if } |x|=\alpha,\\ 0 & \text{otherwise,}\end{cases}$$
and hence $\operatorname{gph}H_\alpha\subseteq\operatorname{gph}\operatorname{prox}_{\frac{\alpha^2}{2}\|\cdot\|_0}$. Whilst both $H_\alpha$ and $\operatorname{prox}_{\frac{\alpha^2}{2}\|\cdot\|_0}$ are monotone operators, neither is maximal. Nevertheless, on account of having full domain, their unique maximal monotone extension can be given by
$$T(x):=\begin{cases}x & \text{if } |x|>\alpha,\\ [0,x] & \text{if } |x|=\alpha,\\ 0 & \text{otherwise,}\end{cases}$$
where $[0,x]$ is understood as the segment between $0$ and $x$. The operator $T$ belongs to $\mathcal{T}$.

We are now in a position to recover the penalty function $f$ associated with $T$.

with(SCAT):
$$H:=\begin{cases}\{x\} & x<-\alpha\\ \{0,x\} & x=-\alpha\\ \{0\} & (-\alpha<x)\text{ and }(x<\alpha)\\ \{0,x\} & x=\alpha\\ \{x\} & \alpha<x\end{cases}$$
Conjh := Conj(Integ(H),y):
f = factor(simplify(Conjh-1/2*y^2));
$$f=\begin{cases}0 & y<-\alpha\\ \frac12(\alpha-y)(\alpha+y) & y=-\alpha\\ -\frac12(\alpha+y)^2 & (-\alpha<y)\text{ and }(y<0)\\ -\frac12\alpha^2 & y=0\\ -\frac12(\alpha-y)^2 & (0<y)\text{ and }(y<\alpha)\\ \frac12(\alpha-y)(\alpha+y) & y=\alpha\\ 0 & \alpha<y\end{cases}$$
Fig. 5: Recovery of a convex penalty associated with $H_\alpha$.

A closer look at the result from Figure 5 shows that the penalty function $f$ may be expressed more concisely in the form
$$f(y)=\begin{cases}0 & |y|>\alpha,\\ -\frac12(|y|-\alpha)^2 & |y|\le\alpha.\end{cases} \tag{9}$$
It is also worth noting that the entire procedure can be reversed to give a proof that the hard thresholding operator is monotone! More precisely, one should symbolically compute the proximity function of $f$ in (9) using the method in Section 4.1. If the result is an extension of the original operator, $H_\alpha$, then it is necessarily monotone (as the subdifferential of a proper, lsc, convex function).

The class $\mathcal{T}$ is well-suited for direct symbolic calculation of superexpectations, superdistributions and superquantiles as developed in [18,13,19].
The superexpectation function, $E_X(x)$, of the random variable $X$ at level $x$ is defined as

\[
E_X(x) := \mathbb{E}[\max\{x, X\}] = \int_{-\infty}^{\infty} \max\{x, x'\}\, dF_X(x') = \int_0^1 \max\{x, Q_X(p)\}\, dp, \tag{10}
\]

where $F_X:\mathbb{R} \to [0,1]$ is the cumulative distribution function of the random variable $X$ and $Q_X:(0,1) \to (-\infty,+\infty)$ is the quantile function; these are defined respectively as

\[
F_X(x) := \operatorname{prob}\{X \le x\}, \tag{11}
\]
\[
Q_X(p) := \min\{x \mid F_X(x) \ge p\} \quad (p \in (0,1)). \tag{12}
\]

The function $F_X$ is nondecreasing and right-continuous on $(-\infty,+\infty)$ with

\[
\lim_{x \to -\infty} F_X(x) = 0, \qquad \lim_{x \to +\infty} F_X(x) = 1.
\]

The maximal monotone extension of the distribution function $F_X$, denoted $\widetilde{F}_X$, is called the superdistribution and is also generated by taking the subdifferential of the superexpectation function [19, Th. 1], which is a finite convex function on $(-\infty,\infty)$ with

\[
\operatorname{gph} \widetilde{F}_X = \operatorname{gph} \partial E_X, \qquad F_X(x) = E_X^{\prime +}(x), \tag{13}
\]

where

\[
E_X^{\prime +}(x) := \lim_{y \downarrow x} \frac{E_X(y) - E_X(x)}{y - x}.
\]

The superquantile function, denoted $\widetilde{Q}_X$, is the maximal monotone extension of the quantile function and satisfies the inverse relationship [19, Th. 2]

\[
\left( \operatorname{gph} \widetilde{F}_X \right)^{-1} = \operatorname{gph} \partial E_X^* = \operatorname{gph} \widetilde{Q}_X, \quad \text{and} \quad Q_X(p) = E_X^{* \prime -}(p).
\]

These objects are therefore amenable to symbolic convex analysis, via subdifferentials of the superexpectation function $E_X$ or its Fenchel conjugate. This provides a symbolic route to working with coherent risk measures such as conditional value-at-risk.

To demonstrate this approach, we symbolically derive an example which appears in [19].

Example 4.2 (Exponential distributions)
Let $X$ be exponentially distributed with parameter $\lambda > 0$. That is, $X$ has cumulative distribution function $F_X(x) = 1 - \exp(-\lambda x)$ for $x \ge 0$. Figure 6 shows the symbolic computation of the superexpectation function, the superdistribution function, and the superquantile function of $X$. Note that, to compute the superexpectation associated with $F_X$, we have made use of the fact that

\[
\lim_{x \to \infty} [E_X(x) - x] = 0,
\]

which was proven as part of [19, Th. 1]. In this way, it is not necessary to compute the potentially tricky 'max' in the definition (10) directly.

    with(SCAT):
    F := 1-exp(-lambda*x):

\[
Q := -\frac{\ln(1-p)}{\lambda}
\]

    superQ := factor(1/(1-p)*int(subs(p=t,Q),t=p..1));

\[
superQ := -\frac{\ln(1-p) - 1}{\lambda}
\]

    Assume(lambda>0):
    F := convert(piecewise(x >= 0,F), SD,x);

\[
F := \begin{cases} \{0\} & x < 0 \\ \{0\} & x = 0 \\ \{1 - e^{-\lambda x}\} & 0 < x \end{cases}
\]

\[
E := \begin{cases} \frac{1}{\lambda} & x < 0 \\ \frac{1}{\lambda} & x = 0 \\ \frac{\lambda x + e^{-\lambda x}}{\lambda} & 0 < x \end{cases}
\]

    conjE := conjE(E,p,x);

\[
conjE := \begin{cases} \infty & p < 0 \\ -\frac{1}{\lambda} & p = 0 \\ \frac{(1-p)(\ln(1-p) - 1)}{\lambda} & (0 < p) \text{ and } (p < 1) \\ 0 & p = 1 \\ \infty & 1 < p \end{cases}
\]

Fig. 6: Computation of super-functions for the exponential distribution.
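The closed forms appearing in Figure 6 can also be cross-checked numerically, independently of SCAT. The Python sketch below compares the closed-form superexpectation of the exponential distribution, $E_X(x) = 1/\lambda$ for $x \le 0$ and $E_X(x) = x + e^{-\lambda x}/\lambda$ for $x > 0$, against a midpoint-rule approximation of the quantile integral in (10), and does the same for the expression `superQ`. All function names and the number of quadrature nodes are illustrative assumptions rather than part of any package, and the midpoint rule is chosen only because it never evaluates the quantile function at the singular endpoint $p = 1$.

```python
import math


def quantile(p, lam):
    """Quantile function Q_X(p) = -ln(1 - p) / lam of the exponential law."""
    return -math.log(1.0 - p) / lam


def superexpectation_closed(x, lam):
    """Closed form of E_X(x) for the exponential distribution."""
    if x <= 0.0:
        return 1.0 / lam
    return x + math.exp(-lam * x) / lam


def superexpectation_quadrature(x, lam, n=100_000):
    """Midpoint-rule approximation of the integral of max{x, Q_X(p)} over (0, 1)."""
    h = 1.0 / n
    return h * sum(max(x, quantile((i + 0.5) * h, lam)) for i in range(n))


def superquantile_closed(p, lam):
    """Closed form (1 - ln(1 - p)) / lam, i.e. the expression superQ."""
    return (1.0 - math.log(1.0 - p)) / lam


def superquantile_quadrature(p, lam, n=100_000):
    """Midpoint-rule approximation of 1/(1-p) times the integral of Q_X over (p, 1)."""
    h = (1.0 - p) / n
    total = sum(quantile(p + (i + 0.5) * h, lam) for i in range(n))
    return h * total / (1.0 - p)


if __name__ == "__main__":
    lam = 2.0
    for x in (-1.0, 0.0, 0.7):
        print(x, superexpectation_closed(x, lam), superexpectation_quadrature(x, lam))
    for p in (0.1, 0.5, 0.9):
        print(p, superquantile_closed(p, lam), superquantile_quadrature(p, lam))
```

Because the integrand has a logarithmic singularity at $p = 1$, the quadrature converges only at first order in the step size, so agreement to a few decimal places is all that should be expected from this check.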
Acknowledgements
DRL was supported in part by Deutsche Forschungsgemeinschaft Collaborative Research Center SFB755. MKT was supported by Deutsche Forschungsgemeinschaft RTG2088.
References
1. B. Gardiner, J.K., Lucet, Y.: Conjugate of convex piecewise linear-quadratic bivariate functions. Comput. Optim. Appl., 249–272 (2014). DOI 10.1007/s10589-013-9622-z
2. Bailey, D.H., Borwein, J.M.: Mathematics by Experiment: Plausible Reasoning in the 21st century. A K Peters Ltd, Natick, MA (2003)
3. Bailey, D.H., Borwein, J.M.: Experimental Mathematics: examples, methods and implications. Notices Amer. Math. Soc. (5), 502–514 (2005)
4. Bailey, D.H., Borwein, J.M., Calkin, N.J., Girgensohn, R., Luke, D.R., Moll, V.H.: Experimental Mathematics in Action. A K Peters Ltd, Natick, MA (2007)
5. Bailey, D.H., Borwein, J.M., Girgensohn, R.: Experimentation in Mathematics: Computational Paths to Discovery. A K Peters Ltd, Natick, MA (2003)
6. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer Science & Business Media (2011)
7. Bauschke, H.H., von Mohrenschildt, M.: Fenchel conjugates and subdifferentials in Maple. Tech. rep. (1997)
8. Bauschke, H.H., von Mohrenschildt, M.: Symbolic computation of Fenchel conjugates. ACM Commun. Comput. Algebra (1), 18–28 (2006)
9. Bayram, I.: Penalty functions derived from monotone mappings. IEEE Signal Process. Lett. (3), 264–268 (2015)
10. Borwein, J.M., Hamilton, C.H.: Symbolic Fenchel conjugation. Math. Program. (1-2), 17–35 (2009)
11. Borwein, J.M., Vanderwerff, J.D.: Convex Functions: Constructions, Characterizations and Counterexamples, vol. 32. Cambridge University Press, Cambridge (2010)
12. Burachik, R., Iusem, A.N.: Set-valued mappings and enlargements of monotone operators, vol. 8. Springer Science & Business Media (2007)
13. Dentcheva, D., Martinez, G.: Two-stage stochastic optimization problems with stochastic ordering constraints on the recourse. Eur. J. Oper. Res., 1–8 (2012)
14. Hamilton, C.H.: Symbolic convex analysis. Master's thesis, Simon Fraser University (2005)
15. Lucet, Y.: Faster than the fast Legendre transform, the linear-time Legendre transform. Numer. Algorithms (2), 171–185 (1997)
16. Lucet, Y.: What shape is your conjugate? A survey of computational convex analysis and its applications. SIAM Rev. (3), 505–542 (2010)
17. Niculescu, C., Persson, L.E.: Convex functions and their applications: a contemporary approach. Springer Science & Business Media (2006)
18. Ogryczak, W., Ruszczyński, A.: Dual stochastic dominance and related mean-risk models. SIAM J. Optim., 60–78 (2002)
19. Rockafellar, R., Royset, J.: Random variables, monotone relations, and convex analysis. Math. Program. (2014). DOI 10.1007/s10107-014-0801-1
20. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton, NJ (1970)
21. Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pacific J. Math. 33