A Fresh Geometrical Look at the General S-Procedure
Michel De Lara ∗ and Jean-Baptiste Hiriart-Urruty † February 12, 2021
Abstract
We revisit the S-procedure for general functions with “geometrical glasses”. We thus delineate a necessary condition, and almost a sufficient condition, to have the S-procedure valid. Everything is expressed in terms of convexity of augmented sets (convex hulls, conical hulls) of images built from the data functions.
Keywords. S-lemma, Convexity of image sets, Separation of convex sets, Theorem of alternatives.
Introduction
The so-called S-procedure takes its roots in Automatic Control Theory; an excellent survey paper on its origin and developments is [4]. In the field of Optimization, the subject has also been studied thoroughly, beginning with quadratic data and continuing with general functions. As a result, papers concerning the S-procedure abound; fortunately, survey papers appear from time to time which allow one to take stock of what has been done and what needs to be done; two examples are [1] and [6]. With that in mind, for the convenience of the reader who is not necessarily “immersed” in the subject, we recall in Section 1 some of the main known results.

∗ CERMICS, Ecole des Ponts, Marne-la-Vallée, France.
† Institut de Mathématiques, Université Paul Sabatier, Toulouse, France.

[...] $(\mathbb{R}_+)^q$, or taking the conical hull), an assumption weaker than the mere convexity of the image itself.

The S-procedure is intimately linked with the validity of a duality result in a certain mathematical optimization problem (see a recent overview of that in [9]); this was already the main motivation in Fradkov's paper ([3]). But this aspect is not broached here.

Our approach is essentially geometrical; the necessary/sufficient conditions that we develop are expressed in terms of convexity of sets. As expected in such contexts, the main mathematical tool used is the separation of convex sets by hyperplanes (in finite-dimensional vector spaces). Our main results (Theorem 2, Theorem 3) have similarities with some in Fradkov's old paper [3]; they could well have appeared there, since both the method and the indications given in some of its remarks lead to them. To a certain extent, our note is a revisit and an extension of Section 1 in [3].
1. The S-procedure for quadratic functions
We recall here some basic results on the S-procedure when only quadratic functions are involved.

Let $Q_0, Q_1, \dots, Q_p$ be $1+p$ real $n \times n$ symmetric matrices, let $c_0, c_1, \dots, c_p \in \mathbb{R}^n$, let $d_0, d_1, \dots, d_p \in \mathbb{R}$, and let $q_i(\cdot)$ be the associated quadratic functions
$$x \in \mathbb{R}^n \mapsto q_i(x) = \tfrac{1}{2}\langle Q_i x, x\rangle + \langle c_i, x\rangle + d_i,$$
where $\langle\cdot,\cdot\rangle$ denotes the scalar product on $\mathbb{R}^n$. When $c_i = 0$ and $d_i = 0$, one speaks of quadratic form $q_i$ instead of quadratic function. When $Q_i = 0$, one speaks of linear (or affine) function, and of linear form when, moreover, $d_i = 0$.

What is called S-procedure in Automatic Control Theory is the relationship between

$(I)$ $\big(q_i(x) \ge 0 \text{ for all } i = 1, 2, \dots, p\big) \Rightarrow \big(q_0(x) \ge 0\big)$

and

$(C)$ There exist $\alpha_1 \ge 0, \dots, \alpha_p \ge 0$ such that $q_0(x) - \sum_{i=1}^p \alpha_i q_i(x) \ge 0$ for all $x \in \mathbb{R}^n$.

The implication $[(C) \Rightarrow (I)]$ is trivial. The issue is therefore the converse implication; we say that the S-procedure is valid (or favorable, or lossless) when this converse $[(I) \Rightarrow (C)]$ holds true, that is to say when the two statements $(I)$ and $(C)$ are equivalent. The equivalence may be used in its negative form, i.e. $[(\text{not } I) \Leftrightarrow (\text{not } C)]$, whose essential content is $[(\text{not } C) \Rightarrow (\text{not } I)]$.

Let us recall some important cases when the S-procedure is known to be valid:
- When $p = 1$, provided that there exists $x_0$ such that $q_1(x_0) > 0$.
- When all the involved functions $q_i$ are linear forms. In that case, this is just the Minkowski-Farkas lemma (in its homogeneous form). Indeed, to have $\langle a_0, x\rangle - \sum_{i=1}^p \alpha_i \langle a_i, x\rangle \ge 0$ for all $x \in \mathbb{R}^n$ amounts to having $a_0 = \sum_{i=1}^p \alpha_i a_i$.
- When all the functions $q_i$ involved are linear functions. In that case, this is again the Minkowski-Farkas lemma (non-homogeneous form). Indeed, $\big(\langle a_i, x\rangle - b_i \ge 0 \text{ for all } i = 1, 2, \dots, p\big) \Rightarrow \big(\langle a_0, x\rangle - b_0 \ge 0\big)$ is equivalent to: there exist $\alpha_1 \ge 0, \dots, \alpha_p \ge 0$ such that $a_0 = \sum_{i=1}^p \alpha_i a_i$ and $b_0 - \sum_{i=1}^p \alpha_i b_i \le 0$.

The S-procedure for general functions

Let $f_0, f_1, \dots, f_p : \mathbb{R}^n \to \mathbb{R}$ be $1+p$ (general) functions. For such a collection of functions, we mimic the S-procedure presented for quadratic functions. The objective is to have the equivalence between the two next assertions:

$(I)$ $\big(f_i(x) \ge 0 \text{ for all } i = 1, 2, \dots, p\big) \Rightarrow \big(f_0(x) \ge 0\big)$

$(C)$ There exist $\alpha_1 \ge 0, \dots, \alpha_p \ge 0$ such that $f_0(x) - \sum_{i=1}^p \alpha_i f_i(x) \ge 0$ for all $x \in \mathbb{R}^n$.

Sometimes, the expected result is written in the following “alternative theorem” form, with

$(\text{not } I)$ The system of inequalities $\big(f_i(x) \ge 0 \text{ for all } i = 1, 2, \dots, p\big)$ and $\big(f_0(x) < 0\big)$ has a solution $x \in \mathbb{R}^n$.

The valid S-procedure then reads: exactly one of the two statements $(\text{not } I)$ and $(C)$ is true.

For real-valued functions $\varphi_1, \varphi_2, \dots, \varphi_k$ defined on $\mathbb{R}^n$, we use the standard notation $\mathrm{Im}(\varphi_1, \varphi_2, \dots, \varphi_k)$ for the image set $\{(\varphi_1(x), \varphi_2(x), \dots, \varphi_k(x)) : x \in \mathbb{R}^n\}$.

The main result in this subsection is as follows:

Theorem 1. Suppose that:
- There exists $x_0$ such that $f_i(x_0) > 0$ for all $i = 1, 2, \dots, p$, and
- The epi-image $\mathrm{Im}(f_0, -f_1, -f_2, \dots, -f_p) + (\mathbb{R}_+)^{p+1}$ is convex (we therefore say that the mapping $(f_0, -f_1, -f_2, \dots, -f_p)$ is epi-convex).
Then the S-procedure is valid, that is to say: $(I)$ and $(C)$ are equivalent.

The first assumption, namely that there exists $x_0$ such that $f_i(x_0) > 0$ for all $i = 1, 2, \dots, p$, is common in Optimization; it is a Slater-type assumption. We refer to it hereafter as $(S)$.

A general remark.
Suppose that $\mathrm{Im}(g_0, g_1, g_2, \dots, g_p)$ is convex. Then $\mathrm{Im}(g_0, -g_1, -g_2, \dots, -g_p)$ is also convex (as the image of the previous set under the linear mapping $(u_0, u_1, u_2, \dots, u_p) \mapsto (u_0, -u_1, -u_2, \dots, -u_p)$). Hence, $\mathrm{Im}(g_0, -g_1, -g_2, \dots, -g_p) + (\mathbb{R}_+)^{p+1}$, sum of two convex sets, is convex.

Footnote. From J.-B. Hiriart-Urruty, A remark on the general S-procedure. Unpublished technical note (2020).

Example 1. Suppose that $g_0, g_1, g_2, \dots, g_p$ are all convex functions. Then $\mathrm{Im}(g_0, g_1, g_2, \dots, g_p)$ is not necessarily convex, but $\mathrm{Im}(g_0, g_1, g_2, \dots, g_p) + (\mathbb{R}_+)^{p+1}$ is convex, as is easily seen from the basic definition of convexity of the $g_i$'s. As a result, it follows from the main theorem above that the S-procedure is valid whenever $f_0$ is convex and $f_1, f_2, \dots, f_p$ are concave; we therefore recover a classical result in convex minimization (with convex inequalities).

Example 2 (from [7, Example 3. ...]). Epi-convex but not convex images.
Let $q_0$ and $q_1$ be defined on $\mathbb{R}^2$ as follows:
$$q_0(x, y) = 2x^2 - y^2, \qquad q_1(x, y) = x + y.$$
Then, $\mathrm{Im}(q_0, q_1) = \{(u, v) \in \mathbb{R}^2 : u \ge -2v^2\}$ is not convex. However, the epi-image $F = \mathrm{Im}(q_0, -q_1) + (\mathbb{R}_+)^2 = \mathbb{R}^2$ is convex.

Example 3.
Indeed a lot of effort has been made by authors to detect (rather strong) assumptions ensuring that an image set like $\mathrm{Im}(g_0, g_1, g_2, \dots, g_p)$ is convex, especially with quadratic $g_i$'s (see [2, ...]). [...] $\mathrm{Im}(q_0, q_1) + (\mathbb{R}_+)^2$ is convex for any pair of quadratic functions $(q_0, q_1)$ ([2, assertion (b) in Theorem 4. ...]).

[...] $(I)$ and $(C)$

In this subsection, we intend to provide an exact geometrical characterization of the statement $(I)$ and a “close to exact” geometrical characterization of the statement $(C)$. For that purpose, we posit:
$$\mathcal{K} = -\big(\mathbb{R}^*_+ \times (\mathbb{R}_+)^p\big) \quad \text{(a polyhedral convex cone in } \mathbb{R}^{p+1}\text{)};$$
$$\mathcal{F} = \mathrm{Im}(f_0, -f_1, -f_2, \dots, -f_p) \quad \text{(an image set in } \mathbb{R}^{p+1}\text{, from the data)}.$$
Given a set $S$, we denote by $\mathrm{co}\,S$ its convex hull, and by $\mathrm{cone}\,S$ its convex conical hull, that is to say
$$\mathrm{cone}\,S = \Big\{ \sum_{i=1}^k \lambda_i u_i : k \text{ positive integer, } \lambda_i > 0 \text{ and } u_i \in S \text{ for all } i \Big\}.$$
To link the two definitions, we clearly have that $\mathrm{cone}\,S = \mathbb{R}^*_+\, \mathrm{co}\,S = \mathrm{co}\,(\mathbb{R}^*_+ S)$.

Theorem 2.
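We have the following:
$$(I) \text{ holds true} \iff \mathcal{F} \cap \mathcal{K} = \emptyset \iff \mathbb{R}^*_+ \mathcal{F} \cap \mathcal{K} = \emptyset \qquad (1)$$
$$(C) \text{ holds true} \implies \mathrm{co}\,\mathcal{F} \cap \mathcal{K} = \emptyset \iff \mathrm{cone}\,\mathcal{F} \cap \mathcal{K} = \emptyset \qquad (2)$$
$$\big(\mathrm{cone}\,\mathcal{F} \cap \mathcal{K} = \emptyset \text{ and } (S)\big) \iff \big(\mathrm{co}\,\mathcal{F} \cap \mathcal{K} = \emptyset \text{ and } (S)\big) \implies (C) \text{ holds true}. \qquad (3)$$

In short:
- A geometrical equivalent form of $(I)$ is $\mathcal{F} \cap \mathcal{K} = \emptyset$ or $\mathbb{R}^*_+ \mathcal{F} \cap \mathcal{K} = \emptyset$.
- Provided the (slight) Slater-type assumption $(S)$ is satisfied on the $f_i$'s, a geometrical equivalent form of $(C)$ is either $\mathrm{co}\,\mathcal{F} \cap \mathcal{K} = \emptyset$ or $\mathrm{cone}\,\mathcal{F} \cap \mathcal{K} = \emptyset$.

Proof of Theorem 2.

- For the first equivalence in (1), it is perhaps easier to consider $(\text{not } I)$. To have $(\text{not } I)$ means that there exists $x \in \mathbb{R}^n$ such that $f_i(x) \ge 0$ for all $i = 1, \dots, p$, and $f_0(x) < 0$. This exactly expresses that $\mathcal{F} \cap \mathcal{K} \ne \emptyset$. The second equivalence in (1) is clear from the relation $\mathbb{R}^*_+ \mathcal{K} = \mathcal{K}$.

- To prove the first implication in (2), we use the notation $\langle\cdot,\cdot\rangle$ for the usual inner product in $\mathbb{R}^{p+1} = \mathbb{R} \times \mathbb{R}^p$; thus $\langle \alpha, z\rangle = \alpha_0 z_0 + \alpha_1 z_1 + \dots + \alpha_p z_p$ whenever $\alpha = (\alpha_0, \alpha_1, \dots, \alpha_p) \in \mathbb{R} \times \mathbb{R}^p$ and $z = (z_0, z_1, \dots, z_p) \in \mathbb{R} \times \mathbb{R}^p$. By definition of the statement $(C)$ itself, there exists $\alpha = (\alpha_0, \alpha_1, \dots, \alpha_p) \in -\mathcal{K}$ (that is to say $\alpha_0 > 0$ and $\alpha_i \ge 0$ for all $i = 1, \dots, p$) such that $\langle \alpha, z\rangle \ge 0$ for all $z = (z_0, z_1, \dots, z_p) \in \mathcal{F}$. Clearly, this is equivalent to
$$\langle \alpha, z\rangle \ge 0 \text{ for all } z = (z_0, z_1, \dots, z_p) \in \mathrm{co}\,\mathcal{F}. \qquad (4)$$
We prove by contradiction that $\mathrm{co}\,\mathcal{F} \cap \mathcal{K}$ is empty. Suppose therefore that there exists some $\beta = (\beta_0, \beta_1, \dots, \beta_p)$ lying in $\mathrm{co}\,\mathcal{F} \cap \mathcal{K}$. Then, according to the inequality (4) just above, we get that
$$\langle \alpha, \beta\rangle = \alpha_0 \beta_0 + \sum_{i=1}^p \alpha_i \beta_i \ge 0. \qquad (5)$$
But, by definition of $\mathcal{K}$, we have $\beta_0 < 0$ and $\beta_i \le 0$ for all $i = 1, \dots, p$. Thus, recalling the signs of the $\alpha_i$'s, one gets $\langle \alpha, \beta\rangle < 0$, which contradicts (5).
As for the equivalence in the second part of (2), it is clear from the following observations: $\mathrm{cone}\,S = \mathbb{R}^*_+\, \mathrm{co}\,S$ and $\mathbb{R}^*_+ \mathcal{K} = \mathcal{K}$.

- We are now going to prove that $\big(\mathrm{co}\,\mathcal{F} \cap \mathcal{K} = \emptyset \text{ and } (S)\big) \Rightarrow (C)$. As expected in such a context, the proof is based on a separation theorem for convex sets. Because the two convex sets $\mathrm{co}\,\mathcal{F}$ and $\mathcal{K}$ in $\mathbb{R}^{p+1}$ do not intersect, one can separate them properly: there exists $\alpha^* = (\alpha^*_0, \alpha^*_1, \alpha^*_2, \dots, \alpha^*_p) \ne 0$ in $\mathbb{R} \times \mathbb{R}^p = \mathbb{R}^{p+1}$ such that
$$\sup_{b \in \mathcal{K}} \langle \alpha^*, b\rangle \le \inf_{z \in \mathcal{F}} \langle \alpha^*, z\rangle = \inf_{z \in \mathrm{co}\,\mathcal{F}} \langle \alpha^*, z\rangle, \qquad (6)$$
$$\inf_{b \in \mathcal{K}} \langle \alpha^*, b\rangle < \sup_{z \in \mathcal{F}} \langle \alpha^*, z\rangle. \qquad (7)$$
The second property (7) is not needed here, due to the nonemptiness of the interior of $\mathcal{K}$.
Due to the specific structure of $\mathcal{K}$, we deduce from (6) that $\alpha^*_i \ge 0$ for all $i = 0, 1, 2, \dots, p$ and, further, $\sup_{b \in \mathcal{K}} \langle \alpha^*, b\rangle = 0$. Now, what is in the right-hand side of (6) is just $\inf_{x \in \mathbb{R}^n} \big[\alpha^*_0 f_0(x) - \sum_{i=1}^p \alpha^*_i f_i(x)\big]$. We therefore have proved that
$$\alpha^*_0 f_0(x) - \sum_{i=1}^p \alpha^*_i f_i(x) \ge 0 \text{ for all } x \in \mathbb{R}^n. \qquad (8)$$
We claim that $\alpha^*_0 > 0$. If not, we would have $-\sum_{i=1}^p \alpha^*_i f_i(x) \ge 0$ for all $x \in \mathbb{R}^n$, which contradicts our Slater-type assumption $(S)$: $f_i(x_0) > 0$ for all $i = 1, 2, \dots, p$, and $\alpha^*_i \ge 0$ for all $i = 1, 2, \dots, p$ (and one of them is $> 0$). Dividing (8) by $\alpha^*_0 > 0$ then yields $(C)$. $\square$

We are now in a position to provide a rather general geometrical condition ensuring the validity of the S-procedure.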
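Before that, the characterization (1) and the statement $(C)$ can be checked numerically on a toy instance. The sketch below is only an illustration under assumptions that are not in the paper: the data $f_0(x) = 2x^2 - 1$ and $f_1(x) = x^2 - 1$ (with $n = 1$, $p = 1$), the helper names, and the finite grid are all hypothetical choices, and a grid test can only suggest, not prove, that $\mathcal{F} \cap \mathcal{K} = \emptyset$.

```python
# Hypothetical toy data (not from the paper): n = 1, p = 1.
def f0(x):
    return 2.0 * x * x - 1.0

def f1(x):
    return x * x - 1.0

def in_K(z):
    """Membership in K = -(R*_+ x R_+): z0 < 0 and z1 <= 0."""
    return z[0] < 0.0 and z[1] <= 0.0

def image_point(x):
    """Point of F = Im(f0, -f1) generated by x."""
    return (f0(x), -f1(x))

def statement_I_on_grid(xs):
    """Sampled version of (I): no sampled point of F falls in K."""
    return not any(in_K(image_point(x)) for x in xs)

def statement_C_on_grid(alpha, xs, tol=1e-12):
    """Sampled version of (C) for a given multiplier alpha >= 0."""
    return alpha >= 0.0 and all(f0(x) - alpha * f1(x) >= -tol for x in xs)

xs = [i / 100.0 for i in range(-500, 501)]  # grid on [-5, 5]
holds_I = statement_I_on_grid(xs)       # f1 >= 0 forces |x| >= 1, where f0 >= 1
holds_C = statement_C_on_grid(1.0, xs)  # f0 - 1*f1 = x^2 >= 0
fails_C = statement_C_on_grid(3.0, xs)  # f0 - 3*f1 = 2 - x^2 < 0 for large |x|
```

On this instance the Slater-type assumption $(S)$ holds (for example $f_1(2) = 3 > 0$), and the multiplier $\alpha_1 = 1$ certifies $(C)$ exactly, since $f_0 - f_1 = x^2$; the multiplier $\alpha_1 = 3$ does not.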
Theorem 3. Assume the Slater-type condition $(S)$, and suppose there exists a set $Z \subset (\mathbb{R}_+)^{p+1}$ containing $0$ such that $\mathbb{R}_+(\mathcal{F} + Z)$ is convex. Then the S-procedure is valid, that is to say: $(I)$ implies $(C)$ (hence $(I)$ and $(C)$ are equivalent).

The set $Z$ plays the role of a “convexifier” of the extended image-set $\mathbb{R}_+ \mathcal{F}$. Let us see how this assumption covers the three following known cases:
- (The most stringent one.) When the image set $\mathcal{F}$ itself is convex; the assumed condition is satisfied with $Z = \{0\}$.
- The epi-convex case (see Theorem 1). It suffices to take $Z = (\mathbb{R}_+)^{p+1}$ to fulfill the proposed assumption. Here, instead of considering $\mathcal{F}$ solely, one takes its so-called “upper set” $\mathcal{F} + (\mathbb{R}_+)^{p+1}$.
- The “conical convex” case, i.e. when $\mathbb{R}_+ \mathcal{F}$ is convex; again the considered assumption is verified with $Z = \{0\}$.

Proof of Theorem 3.

First step. We start from the assumption $(I)$ in its equivalent form $\mathcal{F} \cap \mathcal{K} = \emptyset$ (see (1) in Theorem 2). We make it a bit more general by observing that $(\mathcal{F} + Z) \cap \mathcal{K} = \emptyset$ for every set $Z$ contained in $(\mathbb{R}_+)^{p+1}$. This is easy to check, as $Z$ is contained in a cone placed “oppositely” to $\mathcal{K}$: if $f + z \in \mathcal{K}$ with $f \in \mathcal{F}$ and $z \in Z$, then $f = (f + z) - z \in \mathcal{K} - (\mathbb{R}_+)^{p+1} \subset \mathcal{K}$, which contradicts $\mathcal{F} \cap \mathcal{K} = \emptyset$. We even go further by observing that $\mathbb{R}^*_+(\mathcal{F} + Z) \cap \mathcal{K} = \emptyset$, since $\mathcal{K}$ is a cone. Finally, because $0 \notin \mathcal{K}$, we summarize the result of this first step in:
$$\mathbb{R}_+(\mathcal{F} + Z) \cap \mathcal{K} = \emptyset. \qquad (9)$$

Second step. Since $0 \in Z$, we have $\mathcal{F} \subset \mathcal{F} + Z$; hence $\mathcal{F} \subset \mathbb{R}_+(\mathcal{F} + Z)$. By the assumed convexity of $\mathbb{R}_+(\mathcal{F} + Z)$, we get
$$\mathrm{co}\,\mathcal{F} \subset \mathbb{R}_+(\mathcal{F} + Z). \qquad (10)$$

Final step. We infer from (9) and (10) that $\mathrm{co}\,\mathcal{F} \cap \mathcal{K} = \emptyset$. It remains to apply the result (3) in Theorem 2 to get the desired conclusion $(C)$. $\square$

Conclusion
We have expressed all the ingredients of the general S-procedure in purely geometrical forms, as this was initiated in the seminal paper by Fradkov ([3, pages 248-[...]]). [...]

References

1. K. Derinkuyu and M. C. Pinar, On the S-procedure and some variants. Math. Methods Oper. Res. 64, n° [...].
2. F. Flores-Bazan and F. Opazo, Characterizing the convexity of joint-range for a pair of inhomogeneous quadratic functions and strong duality. Minimax Theory and its Applications, Vol. 1, n° [...].
3. A. L. Fradkov, Duality theorems for certain nonconvex extremal problems. Siberian Math. Journal 14 (1973), 247-[...].
4. S. V. Gusev and A. L. Likhtarnikov, Kalman-Popov-Yakubovich lemma and the S-procedure: a historical survey. Automation and Remote Control, Vol. 67, n° 11 (2006), 1768-[...].
5. J.-B. Hiriart-Urruty and M. Torki, Permanently going back and forth between the “quadratic world” and the “convexity world” in optimization. J. of Applied Math. and Optimization 45 (2002), 169-[...].
6. I. Polik and T. Terlaky, A survey of the S-lemma. SIAM Review, Vol. 49, n° [...].
7. B. T. Polyak, Convexity of quadratic transformations and its use in control and optimization. J. of Optimization Theory and Applications, Vol. 99, n° [...].
8. M. Ramana and A. J. Goldman, Quadratic maps with convex images. Rutcor Research Report 36-94 (October 1994).
9. M. Teboulle, Nonconvex quadratic optimization: a guided detour. Talk in Montpellier (September 2009), and