How to Convexify the Intersection of a Second Order Cone and a Nonconvex Quadratic
Mathematical Programming manuscript No. (will be inserted by the editor)
Samuel Burer · Fatma Kılınç-Karzan
Submitted on June 10, 2014; Revised on June 6, 2015 and May 19, 2016
Abstract
A recent series of papers has examined the extension of disjunctive-programming techniques to mixed-integer second-order-cone programming. For example, it has been shown, by several authors using different techniques, that the convex hull of the intersection of an ellipsoid, $\mathcal{E}$, and a split disjunction, $(l - x_j)(x_j - u) \le 0$ with $l < u$, equals the intersection of $\mathcal{E}$ with an additional second-order-cone representable (SOCr) set. In this paper, we study more general intersections of the form $\mathcal{K} \cap \mathcal{Q}$ and $\mathcal{K} \cap \mathcal{Q} \cap H$, where $\mathcal{K}$ is a SOCr cone, $\mathcal{Q}$ is a nonconvex cone defined by a single homogeneous quadratic, and $H$ is an affine hyperplane. Under several easy-to-verify conditions, we derive simple, computable convex relaxations $\mathcal{K} \cap \mathcal{S}$ and $\mathcal{K} \cap \mathcal{S} \cap H$, where $\mathcal{S}$ is a SOCr cone. Under further conditions, we prove that these two sets capture precisely the corresponding conic/convex hulls. Our approach unifies and extends previous results, and we illustrate its applicability and generality with many examples.

Keywords: convex hull, disjunctive programming, mixed-integer linear programming, mixed-integer nonlinear programming, mixed-integer quadratic programming, nonconvex quadratic programming, second-order-cone programming, trust-region subproblem.
Mathematics Subject Classification:

S. Burer: Department of Management Sciences, University of Iowa, Iowa City, IA, 52242-1994, USA.
F. Kılınç-Karzan: Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.

1 Introduction

In this paper, we study nonconvex intersections of the form $\mathcal{K} \cap \mathcal{Q}$ and $\mathcal{K} \cap \mathcal{Q} \cap H$, where the cone $\mathcal{K}$ is second-order-cone representable (SOCr), $\mathcal{Q}$ is a nonconvex cone defined by a single homogeneous quadratic, and $H$ is an affine hyperplane. Our goal is to develop tight convex relaxations of these sets and to characterize the conic/convex hulls whenever possible. We are motivated by recent research on Mixed Integer Conic Programs (MICPs), though our results here enjoy wider applicability to nonconvex quadratic programs.

Prior to the study of MICPs in recent years, cutting plane theory has been fundamental in the development of efficient and powerful solvers for Mixed Integer Linear Programs (MILPs). In this theory, one considers a convex relaxation of the problem, e.g., its continuous relaxation, and then enforces integrality restrictions to eliminate regions containing no integer feasible points, so-called lattice-free sets. The complement of a valid two-term linear disjunction, say $x_j \le l \vee x_j \ge u$, is a simple form of a lattice-free set. The additional inequalities required to describe the convex hull of such a disjunction are known as disjunctive cuts. Such a disjunctive point of view was introduced by Balas [6] in the context of MILPs, and it has since been studied extensively in mixed integer linear and nonlinear optimization [7,8,17,18,20,22,33,48,49], complementarity [29,31,43,51] and other nonconvex optimization problems [11,17]. In the case of MILPs, several well-known classes of cuts such as Chvátal-Gomory, lift-and-project, mixed-integer rounding (MIR), split, and intersection cuts are known to be special types of disjunctive cuts. Stubbs and Mehrotra [50] and Ceria and Soares [20] extended cutting plane theory from MILP to convex mixed integer problems.
These works were followed by several papers [15,24,25,33,53] that investigated linear-outer-approximation based approaches, as well as others that extended specific classes of inequalities, such as Chvátal-Gomory cuts [19] for MICPs and MIR cuts [5] for SOC-based MICPs.

Recently there has been growing interest in developing closed-form expressions for convex inequalities that fully describe the convex hull of a disjunctive set involving an SOC. In this vein, Günlük and Linderoth [27] studied a simple set involving an SOC in $\mathbb{R}^3$ and a single binary variable and showed that the resulting convex hull is characterized by adding a single SOCr constraint. For general SOCs in $\mathbb{R}^n$, this line of work was furthered by Dadush et al. [23], who derived cuts for ellipsoids based on parallel two-term disjunctions, that is, split disjunctions. Modaresi et al. [40] extended this by studying intersection cuts for the SOC and all of its cross-sections (i.e., all conic sections), based on split disjunctions as well as a number of other lattice-free sets such as ellipsoids and paraboloids. A theoretical and computational comparison of intersection cuts from [40] with extended formulations and conic MIR inequalities from [5] is given in [39]. Taking a different approach, Andersen and Jensen [2] derived an SOC constraint describing the convex hull of a split disjunction applied to an SOC. Belotti et al. [12] studied families of quadratic surfaces having fixed intersections with two given hyperplanes, and in [13], they identified a procedure for constructing two-term disjunctive cuts when the sets defined by the disjunctions are bounded and disjoint. Kılınç-Karzan [34] introduced and examined minimal valid linear inequalities for general conic sets with a disjunctive structure, and under a mild technical assumption, established that they are sufficient to describe the resulting closed convex hulls.
For general two-term disjunctions on regular (closed, convex, pointed with nonempty interior) cones, Kılınç-Karzan and Yıldız [36] studied the structure of tight minimal valid linear inequalities. In the particular case of SOCs, based on conic duality, a class of convex valid inequalities that is sufficient to describe the convex hull was derived in [36], along with the conditions for SOCr representability of these inequalities as well as for the sufficiency of a single inequality from this class. This work was recently extended in Yıldız and Cornuéjols [55] to all cross-sections of the SOC that can be covered by the same assumptions of [36]. Bienstock and Michalka [14] studied the characterization and separation of valid linear inequalities that convexify the epigraph of a convex, differentiable function whose domain is restricted to the complement of a convex set defined by linear or convex quadratic inequalities. Although all of these authors take different approaches, their results are comparable, for example, in the case of analyzing split disjunctions of the SOC or its cross-sections. We remark also that these methods convexify in the space of the original variables, i.e., they do not involve lifting. For additional convexification approaches for nonconvex quadratic programming, which convexify in the lifted space of products $x_i x_j$ of variables, we refer the reader to [4,9,16,17,52], for example.

In this paper, our main contributions can be summarized as follows (see Section 3 and Theorem 1 in particular). First, we derive a simple, computable convex relaxation $\mathcal{K} \cap \mathcal{S}$ of $\mathcal{K} \cap \mathcal{Q}$, where $\mathcal{S}$ is an additional SOCr cone. This also provides the convex relaxation $\mathcal{K} \cap \mathcal{S} \cap H \supseteq \mathcal{K} \cap \mathcal{Q} \cap H$. The derivation relies on several easy-to-verify conditions (see Section 3.2). Second, we identify stronger conditions guaranteeing moreover that $\mathcal{K} \cap \mathcal{S} = \mathrm{cl.conic.hull}(\mathcal{K} \cap \mathcal{Q})$ and $\mathcal{K} \cap \mathcal{S} \cap H = \mathrm{cl.conv.hull}(\mathcal{K} \cap \mathcal{Q} \cap H)$, where cl indicates the closure, conic.hull indicates the conic hull, and conv.hull indicates the convex hull. Our approach unifies and significantly extends previous results. In particular, in contrast to the existing literature on cuts based on lattice-free sets, here we allow a general $\mathcal{Q}$ without making an assumption that $\mathbb{R}^n \setminus \mathcal{Q}$ is convex. We illustrate the applicability and generality of our approach with many examples and explicitly contrast our work with the existing literature.

Our approach can be seen as a variation of the following basic, yet general, idea of conic aggregation to generate valid inequalities. Suppose that $f_0 = f_0(x)$ is convex, while $f_1 = f_1(x)$ is nonconvex, and suppose we are interested in the closed convex hull of the set $Q := \{x : f_0 \le 0, f_1 \le 0\}$. For any $0 \le t \le 1$, the inequality $f_t := (1 - t) f_0 + t f_1 \le 0$ is valid for $Q$, but $f_t$ is generally nonconvex. Hence, it is natural to seek values of $t$ such that the function $f_t$ is convex for all $x$. One might even conjecture that some particular convex $f_s$ with $0 \le s \le 1$ guarantees $\mathrm{cl.conv.hull}(Q) = \{x : f_0 \le 0, f_s \le 0\}$. However, it is known that this approach cannot generally achieve the convex hull even when $f_0, f_1$ are quadratic functions; see [40]. Such aggregation techniques to obtain convex under-estimators have also been explored in the global-optimization literature, albeit without explicit results on the resulting convex hull descriptions (see [1] for example).

In this paper, we follow a similar approach in spirit, but instead of determining $0 \le t \le 1$ guaranteeing the convexity of $f_t$ for all $x$, we only require "almost" convexity, that is, the function $f_t$ is required to be convex on $\{x : f_0 \le 0\}$. This weakened requirement is crucial. In particular, it allows us to obtain convex hulls for many cases where $\{x : f_0 \le 0\}$ is SOCr and $f_1$ is a nonconvex quadratic, and we recover all of the known results regarding two-term disjunctions cited above (see Section 5). We note that using quite different techniques and under completely different assumptions, a similar idea of aggregation for quadratic functions has been explored in [13,40] as well. Specifically, our weakened requirement is in contrast to the developments in [40], which explicitly require the function $f_t$ to be convex everywhere. Also, our general $\mathcal{Q}$ allows us to study general nonconvex quadratics $f_1$ as opposed to the specific ones arising from two-term disjunctions studied in [13]. As a practical and technical matter, instead of working directly with convex functions in this paper, we work in the equivalent realm of convex sets, in particular SOCr cones.
Section 2 discusses in detail the features of SOCr cones required for our analysis.

Compared to the previous literature on MICPs, our work here is broader in that we study a general nonconvex cone $\mathcal{Q}$ defined by a single homogeneous quadratic function. As a result, we assume neither the underlying matrix defining the homogeneous quadratic $\mathcal{Q}$ to be of rank at most 2 nor $\mathbb{R}^n \setminus \mathcal{Q}$ to be convex. This is in contrast to a key underlying assumption used in the literature. Specifically, the majority of the earlier literature on MICPs focuses on specific lattice-free sets, e.g., all of the works [2,5,13,23,36,55] focus on either split or two-term disjunctions on SOCs or their cross-sections. In the case of two-term disjunctions, the matrix defining the homogeneous quadratic for $\mathcal{Q}$ is of rank at most 2; moreover, the complement of any two-term disjunction is a convex set. Even though nonconvex quadratics $\mathcal{Q}$ with rank higher than 2 are considered in [40], this is done, unlike for our general $\mathcal{Q}$, under the assumption that the complement of the nonconvex quadratic defines a convex set. Our general $\mathcal{Q}$ allows for a unified framework and works under weaker assumptions. In Sections 3.3 and 5 and the Online Supplement, we illustrate and highlight these features of our approach and contrast it with the existing literature through a series of examples. Bienstock and Michalka [14] also consider more general $\mathcal{Q}$ under the assumption that $\mathbb{R}^n \setminus \mathcal{Q}$ is convex, but their approach is quite different from ours. Whereas [14] relies on polynomial-time procedures for separating and tilting valid linear inequalities, we directly give the convex hull description. Moreover, in contrast to [14], our study of the general nonconvex quadratic cone $\mathcal{Q}$ allows its complement $\mathbb{R}^n \setminus \mathcal{Q}$ to be nonconvex as well.

We remark that our convexification tools for general nonconvex quadratics have potential applications beyond MICPs, for example in the nonconvex quadratic programming domain.
We also can, for example, characterize: the convex hull of the deletion of an arbitrary ball from another ball; and the convex hull of the deletion of an arbitrary ellipsoid from another ellipsoid sharing the same center. In addition, we can use our results to solve the classical trust region subproblem [21] using SOC optimization, complementing previous approaches relying on nonlinear [26,42] or semidefinite programming [47]. Section 6 discusses these examples.

Another useful feature of our approach is that we clearly distinguish the conditions guaranteeing validity of our relaxation from those ensuring sufficiency. In [2,13,23,40], validity and sufficiency are intertwined, making it difficult to construct convex relaxations when their conditions are only partly satisfied. Furthermore, our derivation of the convex relaxation is efficiently computable and relies on conditions that are easily verifiable. Finally, our conditions regarding the cross-sections (that is, intersection with the affine hyperplane $H$) are applicable for general cones other than SOCs.

We would like to stress that the inequality describing the SOCr set $\mathcal{S}$ is efficiently computable. In other words, given the sets $\mathcal{K} \cap \mathcal{Q}$ and $\mathcal{K} \cap \mathcal{Q} \cap H$, one can verify in polynomial time the required conditions and then calculate in polynomial time the inequality for $\mathcal{S}$ to form the relaxations $\mathcal{K} \cap \mathcal{S}$ and $\mathcal{K} \cap \mathcal{S} \cap H$. The core operations include calculating eigenvalues/eigenvectors for several symmetric and non-symmetric matrices and solving a two-constraint semidefinite program. The computation can also be streamlined in cases when any special structure of $\mathcal{K}$ and $\mathcal{Q}$ is known ahead of time.

The paper is structured as follows. Section 2 discusses the details of SOCr cones, and Section 3 states our conditions and main theorem. In Section 3.2, we provide a detailed discussion and pseudocode for verifying our conditions and computing the resulting SOC-based relaxation $\mathcal{S}$. Section 3.3 then provides a low-dimensional example with figures and comparisons with existing literature. We provide more examples with corresponding figures and comparisons in the Online Supplement accompanying this article. In Section 4, we prove the main theorem, and then in Sections 5 and 6, we discuss and prove many interesting general examples covered by our theory. Section 7 concludes the paper with a few final remarks. Our notation is mostly standard. We will define any particular notation upon its first use.

Our analysis in this paper is based on the concept of SOCr (second-order-cone representable) cones. In this section, we define and introduce the basic properties of such sets.

A cone $\mathcal{F}^+ \subseteq \mathbb{R}^n$ is said to be second-order-cone representable (or SOCr) if there exist a matrix $0 \ne B \in \mathbb{R}^{n \times (n-1)}$ and a vector $b \in \mathbb{R}^n$ such that the nonzero columns of $B$ are linearly independent, $b \notin \mathrm{Range}(B)$, and

$$\mathcal{F}^+ = \{ x : \|B^T x\| \le b^T x \}, \tag{1}$$

where $\|\cdot\|$ denotes the usual Euclidean norm. The negative of $\mathcal{F}^+$ is also SOCr:

$$\mathcal{F}^- := -\mathcal{F}^+ = \{ x : \|B^T x\| \le -b^T x \}. \tag{2}$$

Defining $A := BB^T - bb^T$, the union $\mathcal{F}^+ \cup \mathcal{F}^-$ corresponds to the homogeneous quadratic inequality $x^T A x \le 0$:

$$\mathcal{F} := \mathcal{F}^+ \cup \mathcal{F}^- = \{ x : \|B^T x\|^2 \le (b^T x)^2 \} = \{ x : x^T A x \le 0 \}. \tag{3}$$

We also define

$$\mathrm{int}(\mathcal{F}^+) := \{ x : \|B^T x\| < b^T x \}, \quad \mathrm{bd}(\mathcal{F}^+) := \{ x : \|B^T x\| = b^T x \}, \quad \mathrm{apex}(\mathcal{F}^+) := \{ x : B^T x = 0,\ b^T x = 0 \}.$$

We next study properties of $\mathcal{F}, \mathcal{F}^+, \mathcal{F}^-$ such as their representations and uniqueness thereof. On a related note, Mahajan and Munson [38] have also studied sets associated with nonconvex quadratics with a single negative eigenvalue but from a more computational point of view. The following proposition establishes some important features of SOCr cones:

Proposition 1
Let $\mathcal{F}^+$ be SOCr as in (1), and define $A := BB^T - bb^T$. Then $\mathrm{apex}(\mathcal{F}^+) = \mathrm{null}(A)$, $A$ has at least one positive eigenvalue, and $A$ has exactly one negative eigenvalue. As a consequence, $\mathrm{int}(\mathcal{F}^+) \ne \emptyset$.

Proof For any $x$, we have the equation

$$Ax = (BB^T - bb^T)x = B(B^T x) - b(b^T x). \tag{4}$$

So $x \in \mathrm{apex}(\mathcal{F}^+)$ implies $x \in \mathrm{null}(A)$. The converse also holds by (4) because, by definition, the nonzero columns of $B$ are independent and $b \notin \mathrm{Range}(B)$. Hence, $\mathrm{apex}(\mathcal{F}^+) = \mathrm{null}(A)$.

The equation $A = BB^T - bb^T$, with $0 \ne BB^T \succeq 0$, $bb^T \succeq 0$, and $b \notin \mathrm{Range}(B)$, implies that $A$ has at least one positive eigenvalue and at most one negative eigenvalue. Because $b \notin \mathrm{Range}(B)$, we can write $b = x + y$ such that $x \in \mathrm{Range}(B)$, $0 \ne y \in \mathrm{null}(B^T)$, and $x^T y = 0$. Then

$$y^T A y = y^T (BB^T - bb^T) y = 0 - (b^T y)^2 = -\|y\|^4 < 0,$$

showing that $A$ has exactly one negative eigenvalue, and so $\mathrm{int}(\mathcal{F}^+)$ contains either $y$ or $-y$. □

We define analogous sets $\mathrm{int}(\mathcal{F}^-)$, $\mathrm{bd}(\mathcal{F}^-)$, and $\mathrm{apex}(\mathcal{F}^-)$ for $\mathcal{F}^-$. In addition:

$$\mathrm{int}(\mathcal{F}) := \{ x : x^T A x < 0 \} = \mathrm{int}(\mathcal{F}^+) \cup \mathrm{int}(\mathcal{F}^-), \quad \mathrm{bd}(\mathcal{F}) := \{ x : x^T A x = 0 \} = \mathrm{bd}(\mathcal{F}^+) \cup \mathrm{bd}(\mathcal{F}^-).$$

Similarly, we have $\mathrm{apex}(\mathcal{F}^-) = \mathrm{null}(A) = \mathrm{apex}(\mathcal{F}^+)$, and if $A$ has exactly one negative eigenvalue, then $\mathrm{int}(\mathcal{F}^-) \ne \emptyset$ and $\mathrm{int}(\mathcal{F}) \ne \emptyset$.

When considered as a pair of sets $\{\mathcal{F}^+, \mathcal{F}^-\}$, it is possible that another choice $(\bar B, \bar b)$ in place of $(B, b)$ leads to the same pair and hence to the same $\mathcal{F}$. For example, $(\bar B, \bar b) = (-B, -b)$ simply switches the roles of $\mathcal{F}^+$ and $\mathcal{F}^-$, but $\mathcal{F}$ does not change. However, we prove next that the matrix $A$ representing $\mathcal{F}$ is essentially invariant up to positive scaling. As a corollary, any alternative $(\bar B, \bar b)$ yields $A = \rho(\bar B \bar B^T - \bar b \bar b^T)$ for some $\rho > 0$, i.e., $A$ is essentially invariant with respect to its $(B, b)$ representation.
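As a quick numerical illustration of Proposition 1, the following minimal sketch builds $A = BB^T - bb^T$ from a pair $(B, b)$ and counts the signs of its eigenvalues. The helper names `socr_matrix` and `inertia`, the tolerance, and the two small test pairs are assumptions chosen purely for illustration; they do not come from the paper.

```python
import numpy as np

def socr_matrix(B, b):
    """Return A = B B^T - b b^T for the cone {x : ||B^T x|| <= b^T x}."""
    return B @ B.T - np.outer(b, b)

def inertia(A, tol=1e-9):
    """Return the numbers of (negative, zero, positive) eigenvalues of symmetric A."""
    w = np.linalg.eigvalsh(A)
    return (int(np.sum(w < -tol)),
            int(np.sum(np.abs(w) <= tol)),
            int(np.sum(w > tol)))

# Hypothetical example 1: the standard second-order cone in R^3,
# {x : ||(x1, x2)|| <= x3}, written in the form (1).
B_std = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
b_std = np.array([0.0, 0.0, 1.0])
A_std = socr_matrix(B_std, b_std)

# Hypothetical example 2: B has a zero column, so the apex is a line.
B_deg = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
b_deg = np.array([0.0, 0.0, 1.0])
A_deg = socr_matrix(B_deg, b_deg)

print(inertia(A_std))  # exactly one negative eigenvalue is expected
print(inertia(A_deg))  # one negative eigenvalue plus a zero eigenvalue

# The apex {x : B^T x = 0, b^T x = 0} of example 2 is spanned by e2,
# and e2 also lies in null(A_deg), in line with Proposition 1.
e2 = np.array([0.0, 1.0, 0.0])
print(np.allclose(A_deg @ e2, 0.0))
```

In both examples the count of negative eigenvalues is one, and in the degenerate example the zero eigenvalue corresponds to the apex direction.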
Proposition 2
Let $A, \bar A$ be two $n \times n$ symmetric matrices such that $\{ x \in \mathbb{R}^n : x^T A x \le 0 \} = \{ x \in \mathbb{R}^n : x^T \bar A x \le 0 \}$. Suppose that $A$ satisfies $\lambda_{\min}(A) < 0 < \lambda_{\max}(A)$. Then there exists $\rho > 0$ such that $\bar A = \rho A$.

Proof Since $\lambda_{\min}(A) < 0$, there exists $\bar x \in \mathbb{R}^n$ such that $\bar x^T A \bar x < 0$. Because $x^T A x \le 0 \Leftrightarrow x^T \bar A x \le 0$, there exists no $x$ such that $x^T A x \le 0$ and $x^T (-\bar A) x < 0$. Then, by the S-lemma (see Theorem 2.2 in [45], for example), there exists $\lambda_1 \ge 0$ such that $-\bar A + \lambda_1 A \succeq 0$. Switching the roles of $A$ and $\bar A$, a similar argument implies the existence of $\lambda_2 \ge 0$ such that $-A + \lambda_2 \bar A \succeq 0$. Note $\lambda_2 > 0$; otherwise, $A$ would be negative semidefinite, contradicting $\lambda_{\max}(A) > 0$. Likewise, $\lambda_1 > 0$. Hence,

$$A \preceq \lambda_2 \bar A \preceq \lambda_1 \lambda_2 A \iff (1 - \lambda_1 \lambda_2) A \preceq 0.$$

Since $\lambda_{\min}(A) < 0 < \lambda_{\max}(A)$, we conclude $\lambda_1 \lambda_2 = 1$, which in turn implies $A = \lambda_2 \bar A$, that is, $\bar A = \rho A$ with $\rho = \lambda_1 > 0$, as claimed. □

Corollary 1
Let $\{\mathcal{F}^+, \mathcal{F}^-\}$ be SOCr sets as in (1) and (2), and define $A := BB^T - bb^T$. Let $(\bar B, \bar b)$ be another choice in place of $(B, b)$ leading to the same pair $\{\mathcal{F}^+, \mathcal{F}^-\}$. Then $A = \rho(\bar B \bar B^T - \bar b \bar b^T)$ for some $\rho > 0$.

We can reverse the discussion thus far to start from a symmetric matrix $A$ with at least one positive eigenvalue and a single negative eigenvalue and define associated SOCr cones $\mathcal{F}^+$ and $\mathcal{F}^-$. Indeed, given such an $A$, let $Q \,\mathrm{Diag}(\lambda)\, Q^T$ be a spectral decomposition of $A$ such that $\lambda_1 < 0$. Let $q_j$ be the $j$-th column of $Q$, and define

$$B := \begin{pmatrix} \lambda_2^{1/2} q_2 & \cdots & \lambda_n^{1/2} q_n \end{pmatrix} \in \mathbb{R}^{n \times (n-1)}, \qquad b := (-\lambda_1)^{1/2} q_1 \in \mathbb{R}^n. \tag{5}$$

Note that the nonzero columns of $B$ are linearly independent and $b \notin \mathrm{Range}(B)$. Then $A = BB^T - bb^T$, and $\mathcal{F} = \mathcal{F}^+ \cup \mathcal{F}^-$ can be defined as in (1)–(3). An important observation is that, as a collection of sets, $\{\mathcal{F}^+, \mathcal{F}^-\}$ is independent of the choice of spectral decomposition.

Proposition 3
Let $A$ be a given symmetric matrix with at least one positive eigenvalue and a single negative eigenvalue, and let $A = Q \,\mathrm{Diag}(\lambda)\, Q^T$ be a spectral decomposition such that $\lambda_1 < 0$. Define the SOCr sets $\{\mathcal{F}^+, \mathcal{F}^-\}$ according to (1) and (2), where $(B, b)$ is given by (5). Similarly, let $\{\bar{\mathcal{F}}^+, \bar{\mathcal{F}}^-\}$ be defined by an alternative spectral decomposition $A = \bar Q \,\mathrm{Diag}(\bar\lambda)\, \bar Q^T$. Then $\{\bar{\mathcal{F}}^+, \bar{\mathcal{F}}^-\} = \{\mathcal{F}^+, \mathcal{F}^-\}$.

Proof
Let $(\bar B, \bar b)$ be given by the alternative spectral decomposition. Because $A$ has a single negative eigenvalue, $\bar b = b$ or $\bar b = -b$. In addition, we claim $\|\bar B^T x\| = \|B^T x\|$ for all $x$. This holds because $\bar B \bar B^T = BB^T$ is the positive semidefinite part of $A$. This proves the result. □

To resolve the ambiguity inherent in Proposition 3, one could choose a specific $\bar x \in \mathrm{int}(\mathcal{F})$, which exists by Proposition 1, and enforce the convention that, for any spectral decomposition, $\mathcal{F}^+$ is chosen to contain $\bar x$. This simply amounts to flipping the sign of $b$ so that $b^T \bar x > 0$.

In Section 3.1, we state our main theorem (Theorem 1) and the conditions upon which it is based. The proof of Theorem 1 is delayed until Section 4. In Section 3.2, we discuss computational details related to our conditions and Theorem 1.

3.1 The result

To begin, let $A_0$ be a symmetric matrix satisfying the following:

Condition 1
$A_0$ has at least one positive eigenvalue and exactly one negative eigenvalue.

As described in Section 2, we may define SOCr cones $\mathcal{F}_0 = \mathcal{F}_0^+ \cup \mathcal{F}_0^-$ based on $A_0$. We also introduce a symmetric matrix $A_1$ and define the cone $\mathcal{F}_1 := \{ x : x^T A_1 x \le 0 \}$ in analogy with $\mathcal{F}_0$. However, we do not assume that $A_1$ has exactly one negative eigenvalue, so $\mathcal{F}_1$ does not necessarily decompose into two SOCr cones.

We investigate the set $\mathcal{F}_0^+ \cap \mathcal{F}_1$, which has been expressed as $\mathcal{K} \cap \mathcal{Q}$ in the Introduction. In particular, we would like to develop strong convex relaxations of $\mathcal{F}_0^+ \cap \mathcal{F}_1$ and, whenever possible, characterize its closed conic hull. We focus on the full-dimensional case, and so we assume:

Condition 2
There exists $\bar x \in \mathrm{int}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$.

Note that $\mathrm{int}(\mathcal{F}_0^+ \cap \mathcal{F}_1) = \mathrm{int}(\mathcal{F}_0^+) \cap \mathrm{int}(\mathcal{F}_1)$, and so Condition 2 is equivalent to

$$\bar x^T A_0 \bar x < 0 \quad \text{and} \quad \bar x^T A_1 \bar x < 0. \tag{6}$$

In particular, this implies $A_1$ has at least one negative eigenvalue.

The first part of Theorem 1 below establishes that $\mathrm{cl.conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$ is contained within the convex intersection of $\mathcal{F}_0^+$ with a second set of the same type, i.e., one that is SOCr. In addition to Conditions 1 and 2, we require the following condition, which handles the singularity of $A_0$ carefully via several cases:

Condition 3
Either (i) $A_0$ is nonsingular, (ii) $A_0$ is singular and $A_1$ is positive definite on $\mathrm{null}(A_0)$, or (iii) $A_0$ is singular and $A_1$ is negative definite on $\mathrm{null}(A_0)$.

Conditions 1–3 will ensure (see Proposition 4 in Section 4.1) the existence of a maximal $s \in [0, 1]$ such that

$$A_t := (1 - t) A_0 + t A_1$$

has a single negative eigenvalue for all $t \in [0, s]$, $A_t$ is invertible for all $t \in (0, s)$, and $A_s$ is singular, that is, $\mathrm{null}(A_s)$ is non-trivial. (Actually, $A_s$ may be nonsingular when $s$ equals 1, but this is a small detail.) Indeed, we define $s$ formally as follows. Let $\mathcal{T} := \{ t \in \mathbb{R} : A_t \text{ is singular} \}$. Then

$$s := \begin{cases} \min(\mathcal{T} \cap (0, 1]) & \text{under Condition 3(i) or 3(ii),} \\ 0 & \text{under Condition 3(iii).} \end{cases} \tag{7}$$

With $s$ given by (7), we can then define, for all $A_t$ with $t \in [0, s]$, SOCr sets $\mathcal{F}_t = \mathcal{F}_t^+ \cup \mathcal{F}_t^-$ as described in Section 2. Furthermore, for $\bar x$ of Condition 2, noting that

$$\bar x^T A_t \bar x = (1 - t)\, \bar x^T A_0 \bar x + t\, \bar x^T A_1 \bar x < 0$$

by (6), we adopt the convention of Section 2 that $\bar x \in \mathcal{F}_t^+$ for all such $t$. Then Theorem 1 asserts that $\mathrm{cl.conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$ is contained in $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$. We remark that while $\mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s$ (no "+" superscript on $\mathcal{F}_s$) follows trivially from the definition of $\mathcal{F}_s$, strengthening the inclusion to $\mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+$ (with the "+" superscript) is nontrivial.

The second part of Theorem 1 provides an additional condition under which $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$ actually equals the closed conic hull. The required condition is:

Condition 4
When $s < 1$, $\mathrm{apex}(\mathcal{F}_s^+) \cap \mathrm{int}(\mathcal{F}_1) \ne \emptyset$.

While Condition 4 may appear quite strong, we will actually show (see Lemma 3 in Section 4) that Conditions 1–3 and the definition of $s$ already ensure $\mathrm{apex}(\mathcal{F}_s^+) \subseteq \mathcal{F}_1$. So Condition 4 is a type of regularity condition guaranteeing that the set $\mathrm{apex}(\mathcal{F}_s^+) = \mathrm{null}(A_s)$ is not restricted to the boundary of $\mathcal{F}_1$.

We also include in Theorem 1 a specialization for the case when $\mathcal{F}_0^+ \cap \mathcal{F}_1$ is intersected with an affine hyperplane $H^1$, which has been expressed as $\mathcal{K} \cap \mathcal{Q} \cap H$ in the Introduction. For this, let $h \in \mathbb{R}^n$ be given, and define the hyperplanes

$$H^1 := \{ x : h^T x = 1 \}, \tag{8}$$

$$H^0 := \{ x : h^T x = 0 \}. \tag{9}$$

We introduce an additional condition related to $H^0$:

Condition 5
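When $s < 1$, $\mathrm{apex}(\mathcal{F}_s^+) \cap \mathrm{int}(\mathcal{F}_1) \cap H^0 \ne \emptyset$ or $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap H^0 \subseteq \mathcal{F}_1$.

We now state the main theorem of the paper. See Section 4 for its proof.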
Theorem 1
Suppose Conditions 1–3 are satisfied, and let $s$ be defined by (7). Then $\mathrm{cl.conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1) \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+$, and equality holds under Condition 4. Moreover, Conditions 1–5 imply $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap H^1 = \mathrm{cl.conv.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap H^1)$.

3.2 Computational details

We now discuss the computation of the convex relaxation $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$ of the nonconvex cone $\mathcal{F}_0^+ \cap \mathcal{F}_1$. For the purposes of computation, we assume that $\mathcal{F}_0^+ \cap \mathcal{F}_1$ is described as

$$\mathcal{F}_0^+ \cap \mathcal{F}_1 = \{ x \in \mathbb{R}^n : \|B_0^T x\| \le b_0^T x,\ x^T A_1 x \le 0 \},$$

where $B_0$ is nonzero, $0 \ne b_0 \notin \mathrm{Range}(B_0)$, and $A_0 = B_0 B_0^T - b_0 b_0^T$ in accordance with (5). In particular, $\mathcal{F}_0^+$ is given in its direct SOC form. Our goal is to calculate $\mathcal{F}_s^+$ in terms of its SOC form $\|B_s^T x\| \le b_s^T x$, to which we will refer as the SOC cut.

Before one can apply Theorem 1 to generate the cut, Conditions 1–3 must be verified. By construction, Condition 1 is satisfied, and verifying Condition 3(i) is easy. Conditions 3(ii) and 3(iii) are also easy to verify by computing the eigenvalues of $Z_0^T A_1 Z_0$, where $Z_0$ is a matrix whose columns span $\mathrm{null}(A_0)$. Due to (6) and the fact that $\mathcal{F}_0$ and $\mathcal{F}_1$ are cones, verifying Condition 2 is equivalent to checking the feasibility of the following quadratic equations in the original variables $x \in \mathbb{R}^n$ and the auxiliary "squared slack" variables $s, t \in \mathbb{R}$:

$$x^T A_0 x + s^2 = -1, \qquad x^T A_1 x + t^2 = -1.$$

Let us define the underlying symmetric $(n + 2) \times (n + 2)$ matrices for these quadratics as $\hat A_0$ and $\hat A_1$. Since there are only two quadratic equations with symmetric matrices, by [10, Corollary 13.2], checking Condition 2 is equivalent to checking the feasibility of the following linear semidefinite system, which can be done easily in practice:

$$Y \succeq 0, \qquad \mathrm{trace}(\hat A_0 Y) = -1, \qquad \mathrm{trace}(\hat A_1 Y) = -1. \tag{10}$$

See also [44] for a similar result.

This equivalence of Condition 2 and the feasibility of system (10) relies on the fact that every extreme point of (10) is a rank-1 matrix, and such extreme points can be calculated in polynomial time [44].
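The checks just described can be sketched in a few lines of NumPy. The sketch below classifies Condition 3 through the eigenvalues of $Z_0^T A_1 Z_0$ and, given a point satisfying (6), builds a rank-1 matrix that is feasible for (10); it does not solve the semidefinite system itself. The function names, the tolerance, and the scaling step are assumptions made for illustration, and the numerical data is intended to match the example of Section 3.3 as we read it.

```python
import numpy as np

def null_space_basis(A, tol=1e-9):
    """Orthonormal basis of null(A) for symmetric A (possibly zero columns)."""
    w, Q = np.linalg.eigh(A)
    return Q[:, np.abs(w) <= tol]

def condition3_case(A0, A1, tol=1e-9):
    """Return 'i', 'ii', or 'iii' for Condition 3, or None if it fails."""
    Z0 = null_space_basis(A0, tol)
    if Z0.shape[1] == 0:
        return "i"                      # A0 is nonsingular
    w = np.linalg.eigvalsh(Z0.T @ A1 @ Z0)
    if w.min() > tol:
        return "ii"                     # A1 positive definite on null(A0)
    if w.max() < -tol:
        return "iii"                    # A1 negative definite on null(A0)
    return None

def lift(A, slot):
    """Embed A in an (n+2)x(n+2) matrix with a 1 for one squared slack."""
    n = A.shape[0]
    A_hat = np.zeros((n + 2, n + 2))
    A_hat[:n, :n] = A
    A_hat[n + slot, n + slot] = 1.0
    return A_hat

def rank_one_point(A0, A1, x_bar):
    """Build a rank-1 Y feasible for (10) from x_bar satisfying (6)."""
    q0, q1 = x_bar @ A0 @ x_bar, x_bar @ A1 @ x_bar
    if q0 >= 0 or q1 >= 0:
        raise ValueError("x_bar does not satisfy (6)")
    c2 = 1.0 / min(-q0, -q1)            # scale so both forms are <= -1
    slack0 = np.sqrt(max(0.0, -1.0 - c2 * q0))
    slack1 = np.sqrt(max(0.0, -1.0 - c2 * q1))
    z = np.concatenate([np.sqrt(c2) * x_bar, [slack0, slack1]])
    return np.outer(z, z)

# Illustrative data: homogenized unit ball and one nonconvex quadratic.
A0 = np.diag([1.0, 1.0, 1.0, -1.0])
A1 = np.array([[-1.0, 0.0, 0.0, -0.5],
               [0.0, -1.0, 0.0, -0.25],
               [0.0, 0.0, 0.5, 0.0],
               [-0.5, -0.25, 0.0, 0.0]])
x_bar = np.array([0.5, 0.0, 0.0, 1.0])

Y = rank_one_point(A0, A1, x_bar)
print(condition3_case(A0, A1))
print(np.trace(lift(A0, 0) @ Y), np.trace(lift(A1, 1) @ Y))
```

The construction only illustrates one direction of the equivalence: a point satisfying (6) can be rescaled and padded with slacks to give a feasible rank-1 $Y$. Recovering $\bar x$ from a solution of (10) requires a semidefinite solver, which is outside this sketch.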
Extreme points can also be generated reliably (albeit heuristically) in practice to calculate an interior point $\bar x \in \mathrm{int}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$. One can simply minimize over (10) the objective $\mathrm{trace}((I + R) Y)$, where $I$ is the identity matrix and $R$ is a random matrix, small enough so that $I + R$ remains positive definite. The objective $\mathrm{trace}((I + R) Y)$ is bounded over (10), and hence an optimal solution occurs at an extreme point. The random nature of the objective also makes it highly likely that the optimal solution is unique, in which case the optimal $Y^*$ must be rank-1. Then $\bar x$ can easily be extracted from the rank-1 factorization of $Y^*$. Note that in certain specific cases $\bar x$ might be known ahead of time or could be computed right away by some other means.

Once Conditions 1–3 have been verified, we are then ready to calculate $s$ according to its definition (7). If Condition 3(iii) holds, we simply set $s = 0$. For Conditions 3(i) and 3(ii), we need to calculate $\mathcal{T}$, the set of scalars $t$ such that $A_t := (1 - t) A_0 + t A_1$ is singular. Let us first consider Condition 3(i), which is the simpler case. The following calculation with $t \ne 0$ shows that the elements of $\mathcal{T}$ are in bijective correspondence with the real eigenvalues of $A_0^{-1} A_1$:

$$A_t \text{ is singular} \iff \exists\, x \ne 0 \text{ s.t. } A_t x = 0 \iff \exists\, x \ne 0 \text{ s.t. } A_0^{-1} A_1 x = -\left( \tfrac{1 - t}{t} \right) x \iff -\left( \tfrac{1 - t}{t} \right) \text{ is an eigenvalue of } A_0^{-1} A_1.$$

So to calculate $\mathcal{T}$, we calculate the real eigenvalues $\mathcal{E}$ of $A_0^{-1} A_1$, and then calculate $\mathcal{T} = \{ (1 - e)^{-1} : e \in \mathcal{E} \}$, where by convention $0^{-1} = \infty$. In particular, $|\mathcal{T}|$ is finite.

When Condition 3(ii) holds, we calculate $\mathcal{T}$ in a slightly different manner. We will show in Section 4 (see Lemma 1 in particular) that, even though $A_0$ is singular, $A_\epsilon$ is nonsingular for all $\epsilon > 0$ sufficiently small. Such an $A_\epsilon$ could be calculated by systematically testing values of $\epsilon$ near 0, for example.
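The eigenvalue computation for the nonsingular case, Condition 3(i), can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: `singular_parameters` and `compute_s` are hypothetical helper names, the tolerance used to decide which eigenvalues are real is an assumption, and the data is intended to match the example of Section 3.3 as we read it.

```python
import numpy as np

def singular_parameters(A0, A1, tol=1e-9):
    """Return the finite t with A_t = (1 - t) A0 + t A1 singular (A0 invertible)."""
    # Eigenvalues of A0^{-1} A1, computed without forming the inverse explicitly.
    eigs = np.linalg.eigvals(np.linalg.solve(A0, A1))
    real = eigs[np.abs(eigs.imag) <= tol].real
    # t = 1 / (1 - e); e = 1 corresponds to t = infinity and is skipped.
    return sorted(1.0 / (1.0 - e) for e in real if abs(1.0 - e) > tol)

def compute_s(A0, A1, tol=1e-9):
    """Return min(T intersected with (0, 1]), or None if that set is empty."""
    candidates = [t for t in singular_parameters(A0, A1, tol)
                  if tol < t <= 1.0 + tol]
    return min(candidates) if candidates else None

# Illustrative data: homogenized unit ball and one nonconvex quadratic.
A0 = np.diag([1.0, 1.0, 1.0, -1.0])
A1 = np.array([[-1.0, 0.0, 0.0, -0.5],
               [0.0, -1.0, 0.0, -0.25],
               [0.0, 0.0, 0.5, 0.0],
               [-0.5, -0.25, 0.0, 0.0]])

T = singular_parameters(A0, A1)
s = compute_s(A0, A1)
A_s = (1.0 - s) * A0 + s * A1
print(T, s)
print(np.linalg.matrix_rank(A_s))  # A_s should be rank deficient
```

Complex eigenvalues of $A_0^{-1} A_1$ are discarded because they do not correspond to real values of $t$. The sketch returns `None` when no singular parameter lies in $(0, 1]$ and leaves the handling of that case to the caller.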
Then we can apply the procedureof the previous paragraph to calculate the set T of all ¯ t such that (1 − ¯ t ) A (cid:15) + ¯ tA issingular. Then one can check that T is calculated by the following affine transformation: T = { (1 − (cid:15) )¯ t + (cid:15) : ¯ t ∈ T } .Once T is computed, we can easily calculate s = min( T ∩ (0 , A s := (1 − s ) A + sA and calculate ( B s , b s ) according to (5).Then our cut is (cid:107) B Ts x (cid:107) ≤ b Ts x with only one final provision. We must check the signof b Ts ¯ x , where ¯ x ∈ int( F +0 ∩ F ) has been calculated previously. If b Ts ¯ x ≥
0, then thecut is as stated; if b Ts ¯ x <
0, then the cut is as stated but b s is first replaced by − b s .We summarize the preceding discussion by the pseudocode in Algorithm 1. Whilethis algorithm is quite general, it is also important to point out that it can be stream-lined if one already knows the structure of (cid:107) B T x (cid:107) ≤ b T x and x T A x ≤
0. For example, one may already know that $A_0$ is invertible, in which case it would be unnecessary to calculate the spectral decomposition of $A_0$ in Algorithm 1. In addition, for many of the specific cases that we consider in Sections 5 and 6, we can explicitly point out the corresponding value of $s$ without even relying on the computation of the set $T$. Because of space considerations, we do not include these closed-form expressions for $s$ and the corresponding computations.

Finally, we mention briefly the computability of Conditions 4 and 5, which are not necessary for the validity of the cut but can establish its sufficiency. Given $s < 1$, Condition 4 can be checked by computing $Z_s^T A_1 Z_s$, where $Z_s$ has columns spanning $\mathrm{null}(A_s)$. We know $Z_s^T A_1 Z_s \preceq 0$ because $\mathrm{apex}(\mathcal{F}_s^+) \subseteq \mathcal{F}_1$ (see Lemma 3 in Section 4), and then Condition 4 holds as long as $Z_s^T A_1 Z_s \ne 0$. On the other hand, it seems challenging to verify Condition 5 in general. However, in Sections 5 and 6, we will show that it can be verified in many examples of interest.

Algorithm 1 Calculate Cut (see also Section 3.2)
Input: Inequalities $\|B_0^T x\| \le b_0^T x$ and $x^T A_1 x \le 0$.
Output: Valid cut $\|B_s^T x\| \le b_s^T x$.
  Calculate $A_0 = B_0 B_0^T - b_0 b_0^T$ and a spectral decomposition $Q_0 \mathrm{Diag}(\lambda_0) Q_0^T$. Let $Z_0$ be the submatrix of $Q_0$ of zero eigenvectors (possibly empty).
  Minimize $\mathrm{trace}((I + R)Y)$ over (10). If infeasible, then STOP. Otherwise, extract $\bar{x} \in \mathrm{int}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$ from $Y^*$.
  if $Z_0$ is empty then
    Calculate the set $\mathcal{E}$ of real eigenvalues of $A_0^{-1} A_1$. Set $T = \{(1 - e)^{-1} : e \in \mathcal{E}\}$. Set $s = \min(T \cap (0, 1])$.
  else if $Z_0^T A_1 Z_0 \succ 0$ then
    Determine $\epsilon > 0$ such that $A_\epsilon = (1 - \epsilon) A_0 + \epsilon A_1$ is invertible. Calculate the set $\bar{\mathcal{E}}$ of real eigenvalues of $A_\epsilon^{-1} A_1$. Set $\bar{T} = \{(1 - \bar{e})^{-1} : \bar{e} \in \bar{\mathcal{E}}\}$. Set $T = \{(1 - \epsilon)\bar{t} + \epsilon : \bar{t} \in \bar{T}\}$. Set $s = \min(T \cap (0, 1])$.
  else if $Z_0^T A_1 Z_0 \prec 0$ then
    Set $s = 0$.
  else
    STOP.
  end if
  Calculate $A_s = B_s B_s^T - b_s b_s^T$ and a spectral decomposition $Q_s \mathrm{Diag}(\lambda_s) Q_s^T$. Let $(B_s, b_s)$ be given by (5). If $b_s^T \bar{x} < 0$, replace $b_s$ by $-b_s$.

3.3 An ellipsoid and a nonconvex quadratic

In $\mathbb{R}^3$, consider the intersection of the unit ball defined by $y_1^2 + y_2^2 + y_3^2 \le 1$ and the nonconvex set defined by $-y_1^2 - y_2^2 + \tfrac12 y_3^2 \le y_1 + \tfrac12 y_2$. By homogenizing via $x = \binom{y}{x_4}$ with $x_4 = 1$, we can represent the intersection as $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1$ with
\[
A_0 := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad
A_1 := \begin{pmatrix} -1 & 0 & 0 & -\tfrac12 \\ 0 & -1 & 0 & -\tfrac14 \\ 0 & 0 & \tfrac12 & 0 \\ -\tfrac12 & -\tfrac14 & 0 & 0 \end{pmatrix}, \quad
\mathcal{H}^1 := \{x : x_4 = 1\}.
\]
Conditions 1 and 3(i) are straightforward to verify, and Condition 2 is satisfied with $\bar{x} = (\tfrac12; 0; 0; 1)$, for example. We can also calculate $s = \tfrac12$ from (7). Then
\[
A_s = \frac18 \begin{pmatrix} 0 & 0 & 0 & -2 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 6 & 0 \\ -2 & -1 & 0 & -4 \end{pmatrix}, \quad
\mathcal{F}_s = \left\{ x : 3x_3^2 \le 2x_1x_4 + x_2x_4 + 2x_4^2 \right\}.
\]
The negative eigenvalue of $A_s$ is $\lambda_s := -\tfrac58$ with corresponding eigenvector $q_s := (2; 1; 0; 5)$, and so, in accordance with Section 2, we have that $\mathcal{F}_s^+$ equals all $x \in \mathcal{F}_s$ satisfying $b_s^T x \ge 0$, where $b_s := (-\lambda_s)^{1/2} q_s = \sqrt{5/8}\,(2; 1; 0; 5)$. In other words,
\[
\mathcal{F}_s^+ := \left\{ x : \begin{array}{l} 3x_3^2 \le 2x_1x_4 + x_2x_4 + 2x_4^2 \\ 2x_1 + x_2 + 5x_4 \ge 0 \end{array} \right\}.
\]
Note that $\bar{x} \in \mathcal{F}_s^+$. In addition, $\mathrm{apex}(\mathcal{F}_s^+) = \mathrm{null}(A_s) = \mathrm{span}\{d\}$, where $d = (1; -2; 0; 0)$. Clearly, $d \in \mathcal{H}^0$ and $d^T A_1 d < 0$, which verifies Conditions 4 and 5 simultaneously. Setting $x_4 = 1$ and returning to the original variables $y$, we see
\[
\left\{ y : \begin{array}{l} y_1^2 + y_2^2 + y_3^2 \le 1 \\ 3y_3^2 \le 2y_1 + y_2 + 2 \end{array} \right\}
= \mathrm{cl.conv.hull} \left\{ y : \begin{array}{l} y_1^2 + y_2^2 + y_3^2 \le 1 \\ -y_1^2 - y_2^2 + \tfrac12 y_3^2 \le y_1 + \tfrac12 y_2 \end{array} \right\},
\]
where the now redundant constraint $2y_1 + y_2 \ge -5$ has been dropped. Figure 1 depicts the original set, $\mathcal{F}_s^+ \cap \mathcal{H}^1$, and the closed convex hull.

Fig. 1 An ellipsoid and a nonconvex quadratic: (a) $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1$; (b) $\mathcal{F}_s^+ \cap \mathcal{H}^1$; (c) $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^1$

Of the earlier, related approaches, this example can be handled by [40] only. In particular, [2,13,23,35,36,55] cannot handle this example because they deal with only split or two-term disjunctions but cannot cover general nonconvex quadratics. The approach of [14] is based on eliminating a convex region from a convex epigraphical set, but this example removes a nonconvex region (specifically, $\mathbb{R}^n \setminus \mathcal{F}_1$). So [14] cannot handle this example either.

In actuality, the results of [40] do not handle this example explicitly since the authors only state results for: the removal of a paraboloid or an ellipsoid from a paraboloid; or the removal of an ellipsoid (or an ellipsoidal cylinder) from another ellipsoid with a common center. However, in this particular example, the function obtained from the aggregation technique described in [40] is convex on all of $\mathbb{R}^3$. Therefore, their global convexity requirement on the aggregated function is satisfied for this example.

In this section, we build the proof of Theorem 1, and we provide important insights along the way. The key results are Propositions 5–7, which state
\[
\begin{array}{c}
\mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \subseteq \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1) \\
\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^1 \subseteq \mathrm{conv.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1),
\end{array}
\]
where $s$ is given by (7). In each line here, the first containment depends only on Conditions 1–3, which proves the first part of Theorem 1. On the other hand, the second containments require Condition 4 and Conditions 4–5, respectively. Then the second part of Theorem 1 follows by simply taking the closed conic hull and the closed convex hull, respectively, and noting that $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$ and $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^1$ are already closed and convex.

4.1 The interval $[0, s]$

Our next result, Lemma 1, is quite technical but critically important. For example, it establishes that the line of matrices $\{A_t\}$ contains at least one invertible matrix not equal to $A_1$. As discussed in Section 3, this proves that the set $T$ used in the definition (7) of $s$ is finite and easily computable. The lemma also provides additional insight into the definition of $s$. Specifically, the lemma clarifies the role of Condition 3 in (7).
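As a hedged numerical companion to the line of matrices $\{A_t\}$, the following Python sketch revisits the example of Section 3.3 with exact rational arithmetic. The matrices `A0` and `A1` are assumptions of the sketch, written down from the two quadratics of that example, and the helper names (`pencil`, `matvec`, `quad`, `det`) are hypothetical. It checks that $A_t$ is singular at $t = 1/2$ and nonsingular on a sample grid in $(0, 1/2)$ (a sanity check, not a proof), and that $d = (1; -2; 0; 0)$ lies in $\mathrm{null}(A_{1/2})$ with $d^T A_1 d < 0$.

```python
from fractions import Fraction as Fr

# Assumed data: homogenized unit ball (A0) and nonconvex quadratic (A1)
# for the Section 3.3 example; the entries are assumptions of this sketch.
A0 = [[Fr(1), Fr(0), Fr(0), Fr(0)],
      [Fr(0), Fr(1), Fr(0), Fr(0)],
      [Fr(0), Fr(0), Fr(1), Fr(0)],
      [Fr(0), Fr(0), Fr(0), Fr(-1)]]
A1 = [[Fr(-1), Fr(0), Fr(0), Fr(-1, 2)],
      [Fr(0), Fr(-1), Fr(0), Fr(-1, 4)],
      [Fr(0), Fr(0), Fr(1, 2), Fr(0)],
      [Fr(-1, 2), Fr(-1, 4), Fr(0), Fr(0)]]


def pencil(t):
    """Return A_t = (1 - t) * A0 + t * A1 with exact rational entries."""
    return [[(1 - t) * A0[i][j] + t * A1[i][j] for j in range(4)]
            for i in range(4)]


def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def quad(M, v):
    """Quadratic form v^T M v."""
    return sum(vi * wi for vi, wi in zip(v, matvec(M, v)))


def det(M):
    """Determinant via exact Gaussian elimination with row swaps."""
    M = [row[:] for row in M]
    n, value = len(M), Fr(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fr(0)          # no pivot in this column: singular
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            value = -value        # a row swap flips the sign
        value *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return value


s = Fr(1, 2)
A_s = pencil(s)
d = [1, -2, 0, 0]
q = [2, 1, 0, 5]

singular_at_s = det(A_s) == 0
nonsingular_before_s = all(det(pencil(Fr(k, 100))) != 0 for k in range(1, 50))
d_in_null = matvec(A_s, d) == [0, 0, 0, 0]
q_is_eigvec = matvec(A_s, q) == [Fr(-5, 8) * qi for qi in q]
```

The grid test only samples $t \in \{0.01, \dots, 0.49\}$; a full computation of $T$ would instead use the real eigenvalues of $A_0^{-1} A_1$ as in Algorithm 1.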
Lemma 1 For $\epsilon > 0$ small, consider $A_\epsilon$ and $A_{-\epsilon}$. Relative to Condition 3:
– if (i) holds, then $A_\epsilon$ and $A_{-\epsilon}$ are each invertible with one negative eigenvalue;
– if (ii) holds, then only $A_\epsilon$ is invertible with one negative eigenvalue;
– if (iii) holds, then only $A_{-\epsilon}$ is invertible with one negative eigenvalue.

Since the proof of Lemma 1 is involved, we delay it until the end of this subsection. If Condition 3(i) or 3(ii) holds, then Lemma 1 shows that the interval $(0, \epsilon)$ contains invertible $A_t$, each with exactly one negative eigenvalue, and (7) takes $s$ to be the largest $\epsilon$ with this property. By continuity, $A_s$ is singular (when $s < 1$) but still retains exactly one negative eigenvalue, a necessary condition for defining $\mathcal{F}_s^+$ in Theorem 1. On the other hand, if Condition 3(iii) holds, then $A_0$ is singular and no $\epsilon > 0$ has this property, yet $s = 0$ is still the natural "right-hand limit" of invertible $A_{-\epsilon}$, each with exactly one negative eigenvalue. This will be all that is required for Theorem 1.

With Lemma 1 in hand, we can prove the following key result, which sets up the remainder of this section. The proof of Lemma 1 follows afterwards.

Proposition 4 Suppose Conditions 1–3 hold. For all $t \in [0, s]$, $A_t$ has exactly one negative eigenvalue. In addition, $A_t$ is nonsingular for all $t \in (0, s)$, and if $s < 1$, then $A_s$ is singular.

Proof Condition 2 implies (6), and so $\bar{x}^T A_t \bar{x} = (1 - t)\, \bar{x}^T A_0 \bar{x} + t\, \bar{x}^T A_1 \bar{x} < 0$ for all $t \in [0, 1]$. So each $A_t$ has at least one negative eigenvalue. Also, the definition of $s$ ensures that all $A_t$ for $t \in (0, s)$ are nonsingular and that $A_s$ is singular when $s < 1$.

Suppose for contradiction that some $A_t$ with $t \in [0, s]$ has two negative eigenvalues. Then by Condition 1 and the facts that the entries of $A_t$ are affine functions of $t$ and the eigenvalues depend continuously on the matrix entries [28, Section 2.4.9], there exists some $0 \le r < t \le s$ with at least one zero eigenvalue, i.e., with $A_r$ singular. From the definition of $s$, we deduce that $r = 0$ and $A_\epsilon$ has two negative eigenvalues for $\epsilon > 0$ small. Note that $s > 0$ in this situation, so Condition 3(iii) does not hold. However, we then encounter a contradiction with Lemma 1, which states that $A_\epsilon$ has exactly one negative eigenvalue. $\square$

Proof (of Lemma 1) The lemma holds under Condition 3(i) since $A_0$ is invertible with exactly one negative eigenvalue and the eigenvalues are continuous in $\epsilon$.

Suppose Condition 3(ii) holds. Let $V$ be the subspace spanned by the zero and positive eigenvectors of $A_0$, and consider
\[
\theta := \inf \{ x^T A_0 x : x^T (A_0 - A_1) x = 1,\ x \in V \}.
\]
Clearly $\theta \ge 0$, and we claim $\theta > 0$. If $\theta = 0$, then there exists $\{x^k\} \subseteq V$ with $(x^k)^T A_0 x^k \to 0$ and $(x^k)^T (A_0 - A_1) x^k = 1$ for all $k$. If $\{x^k\}$ is bounded, then passing to a subsequence if necessary, we have $x^k \to \hat{x}$ such that $\hat{x}^T A_0 \hat{x} = 0$ and $\hat{x}^T (A_0 - A_1) \hat{x} = 1$, which implies $\hat{x}^T A_1 \hat{x} = -1$, a contradiction of Condition 3(ii). On the other hand, if $\{x^k\}$ is unbounded, then the sequence $d^k := x^k / \|x^k\|$ is bounded, and passing to a subsequence if necessary, we see that $d^k \to \hat{d}$ with $\|\hat{d}\| = 1$, $\hat{d}^T A_0 \hat{d} = 0$ and $\hat{d}^T (A_0 - A_1) \hat{d} = 0$. This implies $\hat{d}^T A_1 \hat{d} = 0$, violating Condition 3(ii). So $\theta > 0$.

Now fix $0 < \epsilon \le \theta/2$, and take any nonzero $x \in V$. Note that
\[
x^T A_\epsilon x = (1 - \epsilon)\, x^T A_0 x + \epsilon\, x^T A_1 x = x^T A_0 x - \epsilon\, x^T (A_0 - A_1) x. \tag{11}
\]
We wish to show $x^T A_\epsilon x > 0$, and so we consider three subcases. First, if $x^T (A_0 - A_1) x = 0$, then it must hold that $x^T A_0 x > 0$. If not, then $x^T A_1 x = 0$ also, violating Condition 3(ii). So $x^T A_\epsilon x = x^T A_0 x > 0$. Second, if $x^T (A_0 - A_1) x < 0$, then because $x \in V$ we have $x^T A_\epsilon x > 0$. Third, if $x^T (A_0 - A_1) x > 0$, then we may assume without loss of generality by scaling that $x^T (A_0 - A_1) x = 1$, in which case $x^T A_\epsilon x \ge \theta - \epsilon > 0$.

So $A_\epsilon$ is positive definite on a subspace of dimension $n - 1$, which implies that $A_\epsilon$ has at least $n - 1$ positive eigenvalues. In addition, $A_\epsilon$ has at least one negative eigenvalue because $\bar{x}^T A_\epsilon \bar{x} < 0$. Hence $A_\epsilon$ is invertible with exactly one negative eigenvalue, as claimed.

By repeating a very similar argument for vectors $x \in W$, the subspace spanned by the negative and zero eigenvectors of $A_0$ (note that $W$ is at least two-dimensional because Condition 3(ii) holds), and once again using the relation (11), we can show that $A_{-\epsilon}$ has at least two negative eigenvalues, as claimed.

Finally, suppose Condition 3(iii) holds and define
\[
\begin{aligned}
\bar{A}_\epsilon &:= \tfrac{1}{1 + 2\epsilon} A_{-\epsilon} = \tfrac{1}{1 + 2\epsilon} \bigl( (1 + \epsilon) A_0 - \epsilon A_1 \bigr) = \tfrac{1 + \epsilon}{1 + 2\epsilon} A_0 + \tfrac{\epsilon}{1 + 2\epsilon} (-A_1) \\
\bar{A}_{-\epsilon} &:= \tfrac{1}{1 - 2\epsilon} A_{\epsilon} = \tfrac{1}{1 - 2\epsilon} \bigl( (1 - \epsilon) A_0 + \epsilon A_1 \bigr) = \tfrac{1 - \epsilon}{1 - 2\epsilon} A_0 + \tfrac{-\epsilon}{1 - 2\epsilon} (-A_1).
\end{aligned}
\]
Then $\bar{A}_\epsilon$ and $\bar{A}_{-\epsilon}$ are on the line generated by $A_0$ and $-A_1$ such that $-A_1$ is positive definite on the null space of $A_0$. Applying the previous case for Condition 3(ii), we see that only $\bar{A}_\epsilon$ is invertible with a single negative eigenvalue. This proves the result. $\square$

4.2 $\mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+$

For each $t \in [0, s]$, Proposition 4 allows us to define analogs $\mathcal{F}_t = \mathcal{F}_t^+ \cup \mathcal{F}_t^-$ as described in Section 2 based on any spectral decomposition $A_t = Q_t \mathrm{Diag}(\lambda_t) Q_t^T$. It is an important technical point, however, that in this paper we require $\lambda_t$ and $Q_t$ to be defined continuously in $t$.
While it is well known that the vector of eigenvalues $\lambda_t$ can be defined continuously, it is also known that, if the eigenvalues are ordered, say, such that $[\lambda_t]_1 \le \cdots \le [\lambda_t]_n$ for all $t$, then the corresponding eigenvectors, i.e., the ordered columns of $Q_t$, cannot be defined continuously in general. On the other hand, if one drops the requirement that the eigenvalues in $\lambda_t$ stay ordered, then the following result of Rellich [46] (see also [32]) guarantees that $\lambda_t$ and $Q_t$ can be constructed continuously (in fact, analytically) in $t$:

Theorem 2 (Rellich [46])
Because $A_t$ is analytic in the single parameter $t$, there exist spectral decompositions $A_t = Q_t \mathrm{Diag}(\lambda_t) Q_t^T$ such that $\lambda_t$ and $Q_t$ are analytic in $t$.

So we define $\mathcal{F}_t^+$ and $\mathcal{F}_t^-$ using continuous spectral decompositions provided by Theorem 2:
\[
\begin{aligned}
\mathcal{F}_t^+ &:= \{ x : \|B_t^T x\| \le b_t^T x \} \\
\mathcal{F}_t^- &:= \{ x : \|B_t^T x\| \le -b_t^T x \},
\end{aligned}
\]
where $B_t$ and $b_t$ such that $A_t = B_t B_t^T - b_t b_t^T$ are derived from the spectral decomposition as described in Section 2. Recall from Proposition 3 that, for each $t$, a different spectral decomposition could flip the roles of $\mathcal{F}_t^+$ and $\mathcal{F}_t^-$, but we now observe that Theorem 2 and Condition 2 together guarantee that each $\mathcal{F}_t^+$ contains $\bar{x}$ from Condition 2. In this sense, every $\mathcal{F}_t^+$ has the same "orientation." Our observation is enabled by a lemma that will be independently helpful in subsequent analysis.

Lemma 2
Suppose Conditions 1–3 hold. Given $t \in [0, s]$, suppose some nonzero $x \in \mathcal{F}_t^+$ satisfies $b_t^T x = 0$. Then $t = 0$ or $t = s$.

Proof Since $x^T A_t x \le 0$ and $b_t^T x = 0$, we have $0 = (b_t^T x)^2 \ge \|B_t^T x\|^2$, which implies $A_t x = (B_t B_t^T - b_t b_t^T) x = B_t (B_t^T x) - b_t (b_t^T x) = 0$. Because $x \ne 0$, $A_t$ is singular. By Proposition 4, this implies $t = 0$ or $t = s$. $\square$

Observation 1
Suppose Conditions 1–3 hold. Let $\bar{x} \in \mathrm{int}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$. Then for all $t \in [0, s]$, $\bar{x} \in \mathcal{F}_t^+$.

Proof Condition 2 implies $b_0^T \bar{x} > 0$. Let $t \in (0, s]$ be fixed. Since $\bar{x}^T A_t \bar{x} < 0$, either $\bar{x} \in \mathcal{F}_t^+$ or $\bar{x} \in \mathcal{F}_t^-$. Suppose for contradiction that $\bar{x} \in \mathcal{F}_t^-$, i.e., $b_t^T \bar{x} < 0$. Then the continuity of $b_t$ by Theorem 2 implies the existence of $r \in (0, t)$ such that $b_r^T \bar{x} = 0$. Because $\bar{x}^T A_r \bar{x} < 0$, we have $\bar{x} \in \mathcal{F}_r^+$. By Lemma 2, this implies $r = 0$ or $r = s$, a contradiction. $\square$

In particular, Observation 1 implies that our discussion in Section 3 on choosing $\bar{x} \in \mathcal{F}_t^+$ to facilitate the statement of Theorem 1 is indeed consistent with the discussion here.

The primary result of this subsection, that $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$ is a valid convex relaxation of $\mathcal{F}_0^+ \cap \mathcal{F}_1$, is given below.

Proposition 5
Suppose Conditions 1–3 hold. Then $\mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+$.

Proof If $s = 0$, the result is trivial. So assume $s > 0$. In particular, Condition 3(i) or 3(ii) holds. Let $x \in \mathcal{F}_0^+ \cap \mathcal{F}_1$, that is, $x^T A_0 x \le 0$, $b_0^T x \ge 0$, and $x^T A_1 x \le 0$. We would like to show $x \in \mathcal{F}_0^+ \cap \mathcal{F}_s^+$. So we need $x^T A_s x \le 0$ and $b_s^T x \ge 0$. The first inequality holds because $x^T A_s x = (1 - s)\, x^T A_0 x + s\, x^T A_1 x \le 0$. Now suppose for contradiction that $b_s^T x < 0$. In particular, $x \ne 0$. Then by the continuity of $b_t$ via Theorem 2, there exists $0 \le r < s$ such that $b_r^T x = 0$. Since $x^T A_r x \le 0$, we have $x \in \mathcal{F}_r^+$, and Lemma 2 implies $r = 0$. So Condition 3(ii) holds. However, $x \in \mathcal{F}_1$ also, contradicting that $A_1$ is positive definite on $\mathrm{null}(A_0)$. $\square$

4.3 $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \subseteq \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$

Proposition 5 in the preceding subsection establishes that $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$ is a valid convex relaxation of $\mathcal{F}_0^+ \cap \mathcal{F}_1$ under Conditions 1–3. We now show that, in essence, the reverse inclusion holds under Condition 4 (see Proposition 6). Indeed, when $s = 1$, we clearly have $\mathcal{F}_0^+ \cap \mathcal{F}_1^+ \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$. So the true case of interest is $s < 1$. (Nevertheless, our results are stated to cover both $s < 1$ and $s = 1$ simultaneously.)

As mentioned in Section 3, Condition 4 is a type of regularity condition in light of Lemma 3 next. The proof of Proposition 6 also relies on Lemma 3.

Lemma 3
Suppose Conditions 1–3 hold. Then $\mathrm{apex}(\mathcal{F}_s^+) \subseteq \mathcal{F}_1$.

Proof
By Proposition 1, the claimed result is equivalent to $\mathrm{null}(A_s) \subseteq \mathcal{F}_1$. Let $d \in \mathrm{null}(A_s)$. If $s = 1$, then $d^T A_1 d = 0$, i.e., $d \in \mathrm{bd}(\mathcal{F}_1) \subseteq \mathcal{F}_1$, as desired. If $s = 0$, then Condition 3(iii) holds, that is, $A_0$ is singular and $A_1$ is negative definite on $\mathrm{null}(A_0)$. Then $d \in \mathrm{null}(A_0)$ implies $d^T A_1 d \le 0$, as desired.

So assume $s \in (0, 1)$. If $d \notin \mathrm{int}(\mathcal{F}_0)$, that is, $d^T A_0 d \ge 0$, then the equation $0 = (1 - s)\, d^T A_0 d + s\, d^T A_1 d$ implies $d^T A_1 d \le 0$, as desired.

We have thus reduced to the case $s \in (0, 1)$ and $d \in \mathrm{int}(\mathcal{F}_0)$, and we proceed to derive a contradiction. Without loss of generality, assume that $d \in \mathrm{int}(\mathcal{F}_0^+)$ and $-d \in \mathrm{int}(\mathcal{F}_0^-)$. We know $-d \in \mathrm{null}(A_s) = \mathrm{apex}(\mathcal{F}_s^+) \subseteq \mathcal{F}_s^+$. In total, we have $-d \in \mathcal{F}_s^+ \cap \mathrm{int}(\mathcal{F}_0^-)$. We claim that, in fact, $\mathcal{F}_t^+ \cap \mathrm{int}(\mathcal{F}_0^-) \ne \emptyset$ as $t \to s$.

Note that $\mathcal{F}_t^+$ is a full-dimensional set because $\bar{x}^T A_t \bar{x} < 0$ by Condition 2. Also, $\mathcal{F}_t^+$ is defined by the intersection of a homogeneous quadratic $x^T A_t x \le 0$ and a linear constraint $b_t^T x \ge 0$, and $(A_t, b_t) \to (A_s, b_s)$ as $t \to s$. Then the boundary of $\mathcal{F}_t^+$ converges to the boundary of $\mathcal{F}_s^+$ as $t \to s$. Since $\mathcal{F}_t^+$ is a full-dimensional, convex set (in fact SOC), $\mathcal{F}_t^+$ then converges as a set to $\mathcal{F}_s^+$ as $t \to s$. So there exists a sequence $y_t \in \mathcal{F}_t^+$ converging to $-d$. In particular, $\mathcal{F}_t^+ \cap \mathrm{int}(\mathcal{F}_0^-) \ne \emptyset$ for $t \to s$.

We can now achieve the desired contradiction. For $t < s$, let $x \in \mathcal{F}_t^+ \cap \mathrm{int}(\mathcal{F}_0^-)$. Then $x^T A_0 x \le 0$, $b_0^T x < 0$ and $x^T A_t x \le 0$, $b_t^T x \ge 0$. It follows that $x^T A_r x \le 0$, $b_r^T x = 0$ for some $0 < r \le t < s$. Hence, Lemma 2 implies $r = 0$ or $r = s$, a contradiction. $\square$

Proposition 6
Suppose Conditions 1–4 hold. Then $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \subseteq \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$.

Proof First, suppose $s = 1$. Then the result follows because $\mathcal{F}_0^+ \cap \mathcal{F}_1^+ \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_1 \subseteq \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$. So assume $s \in [0, 1)$. Let $x \in \mathcal{F}_0^+ \cap \mathcal{F}_s^+$, that is, $x^T A_0 x \le 0$, $b_0^T x \ge 0$, $x^T A_s x \le 0$, and $b_s^T x \ge 0$. If $x^T A_1 x \le 0$, we are done. So assume $x^T A_1 x > 0$.

By Condition 4, there exists $d \in \mathrm{null}(A_s)$ such that $d^T A_1 d < 0$. In addition, $d$ is necessarily perpendicular to the negative eigenvector $b_s$. For all $\epsilon \in \mathbb{R}$, consider the affine line of points given by $x_\epsilon := x + \epsilon d$. We have
\[
\left.
\begin{array}{l}
x_\epsilon^T A_s x_\epsilon = (x + \epsilon d)^T A_s (x + \epsilon d) = x^T A_s x \le 0 \\
b_s^T x_\epsilon = b_s^T (x + \epsilon d) = b_s^T x \ge 0
\end{array}
\right\} \implies x_\epsilon \in \mathcal{F}_s^+.
\]
Note that $x_\epsilon^T A_1 x_\epsilon = x^T A_1 x + 2\epsilon\, d^T A_1 x + \epsilon^2\, d^T A_1 d$. Then $x_\epsilon^T A_1 x_\epsilon$ defines a quadratic function of $\epsilon$ and its roots are given by
\[
\epsilon_\pm = \frac{-d^T A_1 x \pm \sqrt{(d^T A_1 x)^2 - (x^T A_1 x)(d^T A_1 d)}}{d^T A_1 d}.
\]
Since $x^T A_1 x > 0$ and $d^T A_1 d < 0$, the square root of the discriminant is greater than $|d^T A_1 x|$. Hence, one of the roots will be positive and the other one will be negative. Because the denominator $d^T A_1 d$ is negative, there exist $l := \epsilon_+ < 0 < \epsilon_- =: u$ such that $x_l^T A_1 x_l = x_u^T A_1 x_u = 0$, i.e., $x_l, x_u \in \mathcal{F}_1$. Then $s < 1$ and $x_l^T A_s x_l \le 0$ imply $x_l^T A_0 x_l \le 0$, and hence $x_l \in \mathcal{F}_0$. Similarly, $x_u^T A_0 x_u \le 0$, and hence $x_u \in \mathcal{F}_0$. We will prove in the next paragraph that both $x_l$ and $x_u$ are in $\mathcal{F}_0^+$, which will establish the result because then $x_l, x_u \in \mathcal{F}_0^+ \cap \mathcal{F}_1$ and $x$ is a convex combination of $x_l$ and $x_u$.

Suppose that at least one of the two points $x_l$ or $x_u$ is not a member of $\mathcal{F}_0^+$. Without loss of generality, say $x_l \notin \mathcal{F}_0^+$. Then $x_l \in \mathcal{F}_0^-$ with $-b_0^T x_l > 0$. Similar to Proposition 5, we can prove $\mathcal{F}_0^- \cap \mathcal{F}_1 \subseteq \mathcal{F}_0^- \cap \mathcal{F}_s^-$, and so $x_l \in \mathcal{F}_0^- \cap \mathcal{F}_s^-$. Then $x_l \in \mathcal{F}_s^+ \cap \mathcal{F}_s^-$, which implies $b_s^T x_l = 0$ and $B_s^T x_l = 0$, which in turn implies $A_s x_l = 0$, i.e., $x_l \in \mathrm{null}(A_s)$. Then $x + l\,d = x_l \in \mathrm{null}(A_s)$ implies $x \in \mathrm{null}(A_s)$ also. Then $x \in \mathcal{F}_1$ by Lemma 3, but this contradicts the earlier assumption that $x^T A_1 x > 0$. $\square$

In what follows, $\mathcal{H}^1$ and $\mathcal{H}^0$ are defined according to (8) and (9), where $h \in \mathbb{R}^n$. Also define
\[
\mathcal{H}^+ := \{ x : h^T x \ge 0 \}.
\]
Our first task is to prove the analog of Propositions 5–6 under intersection with $\mathcal{H}^+$. Specifically, we wish to show that the inclusions
\[
\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^+ \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^+ \subseteq \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^+) \tag{12}
\]
hold under Conditions 1–5. As Condition 5 consists of two parts, we break the proof into two corresponding parts (Lemma 4 and Corollary 2). Note that Condition 5 only applies when $s < 1$, although results are stated covering both $s < 1$ and $s = 1$ simultaneously.

Lemma 4
Suppose Conditions 1–4 and the first part of Condition 5 hold. Then (12) holds.

Proof
Proposition 5 implies that $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^+ \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^+$. Moreover, we can repeat the proof of Proposition 6, intersecting with $\mathcal{H}^+$ along the way. However, we require one key modification in the proof of Proposition 6.

Let $x \in \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^+$ with $x^T A_1 x > 0$. Then, mimicking the proof of Proposition 6 for $s \in [0, 1)$ and $d \in \mathrm{apex}(\mathcal{F}_s^+) \cap \mathrm{int}(\mathcal{F}_1)$ from Condition 4, $x \in \{ x_\epsilon := x + \epsilon d : \epsilon \in \mathbb{R} \} \subseteq \mathcal{F}_s^+$. Moreover, $x$ is a strict convex combination of points $x_l, x_u \in \mathcal{F}_0^+ \cap \mathcal{F}_1$, where $x_l, x_u$ are as defined in the proof of Proposition 6. Hence, the entire closed interval from $x_l$ to $x_u$ is contained in $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$.

Under the first part of Condition 5, if there exists $d \in \mathrm{apex}(\mathcal{F}_s^+) \cap \mathrm{int}(\mathcal{F}_1) \cap \mathcal{H}^0$, then $h^T d = 0$ and this particular $d$ can be used to show that $x_l, x_u$ identified in the proof of Proposition 6 also satisfy $h^T x_l = h^T (x + l\,d) = h^T x \ge 0$ (because $x \in \mathcal{H}^+$) and $h^T x_u = h^T (x + u\,d) = h^T x \ge 0$, i.e., $x_l, x_u \in \mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^+$. Then this implies $x \in \mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^+)$, as desired. $\square$

Regarding the second part of Condition 5, we prove Corollary 2 using the following more general lemma involving cones that are not necessarily SOCr:
Lemma 5
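Let $\mathcal{G}_0$, $\mathcal{G}_1$, and $\mathcal{G}_s$ be cones such that $\mathcal{G}_0$, $\mathcal{G}_s$ are convex, $\mathcal{G}_0 \cap \mathcal{G}_1 \subseteq \mathcal{G}_0 \cap \mathcal{G}_s \subseteq \mathrm{conic.hull}(\mathcal{G}_0 \cap \mathcal{G}_1)$ and $\mathcal{G}_0 \cap \mathcal{G}_s \cap \mathcal{H}^0 \subseteq \mathcal{G}_1$. Then
\[
\mathcal{G}_0 \cap \mathcal{G}_1 \cap \mathcal{H}^+ \subseteq \mathcal{G}_0 \cap \mathcal{G}_s \cap \mathcal{H}^+ \subseteq \mathrm{conic.hull}(\mathcal{G}_0 \cap \mathcal{G}_1 \cap \mathcal{H}^+).
\]

Proof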
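For notational convenience, define $\mathcal{G}_{01} := \mathcal{G}_0 \cap \mathcal{G}_1$ and $\mathcal{G}_{0s} := \mathcal{G}_0 \cap \mathcal{G}_s$. We clearly have $\mathcal{G}_{01} \cap \mathcal{H}^+ \subseteq \mathcal{G}_{0s} \cap \mathcal{H}^+ \subseteq \mathrm{conic.hull}(\mathcal{G}_{01}) \cap \mathcal{H}^+$. We will show $\mathcal{G}_{0s} \cap \mathcal{H}^+ \subseteq \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+)$. Consider $x \in \mathcal{G}_{0s} \cap \mathcal{H}^+$. Either $h^T x = 0$ or $h^T x > 0$.

If $h^T x = 0$, then $x \in \mathcal{G}_{0s} \cap \mathcal{H}^0 \subseteq \mathcal{G}_1$ by the premise of the lemma. Thus $x \in \mathcal{G}_{0s} \cap \mathcal{H}^+ \cap \mathcal{G}_1 \subseteq \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+)$, as desired.

When $h^T x > 0$, because $\mathcal{G}_{0s} \subseteq \mathrm{conic.hull}(\mathcal{G}_{01})$, we know that $x$ can be expressed as a finite sum $x = \sum_k \lambda_k x_k$, where each $x_k \in \mathcal{G}_{01} \subseteq \mathcal{G}_{0s}$ and $\lambda_k > 0$. Define $I := \{ k : h^T x_k \ge 0 \}$ and $J := \{ k : h^T x_k < 0 \}$. If $J = \emptyset$, then we are done as we have shown $x \in \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+)$. If not, then for all $j \in J$, let $y_j$ be a strict conic combination of $x$ and $x_j$ such that $y_j \in \mathcal{H}^0$. In particular, there exist $\alpha_j \ge 0$ and $\beta_j > 0$ such that $y_j = \alpha_j x + \beta_j x_j$. Note also that $y_j \in \mathcal{G}_{0s}$ because $\mathcal{G}_{0s}$ is convex and $x, x_j \in \mathcal{G}_{0s}$. Then $y_j \in \mathcal{G}_{0s} \cap \mathcal{H}^0 \subseteq \mathcal{G}_1$. As a result, for all $j \in J$, we have $y_j \in \mathcal{G}_{01} \cap \mathcal{H}^+$. Rewriting $x$ as
\[
x = \sum_{i \in I} \lambda_i x_i + \sum_{j \in J} \frac{\lambda_j}{\beta_j} \bigl( y_j - \alpha_j x \bigr)
\iff
\Bigl( 1 + \sum_{j \in J} \frac{\lambda_j \alpha_j}{\beta_j} \Bigr) x = \sum_{i \in I} \lambda_i x_i + \sum_{j \in J} \frac{\lambda_j}{\beta_j} y_j,
\]
we conclude that $x$ is a conic combination of points in $\mathcal{G}_{01} \cap \mathcal{H}^+$, as desired. $\square$

Corollary 2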
Suppose Conditions 1–4 and the second part of Condition 5 hold. Then (12) holds.

Proof
Apply Lemma 5 with $\mathcal{G}_0 := \mathcal{F}_0^+$, $\mathcal{G}_1 := \mathcal{F}_1$, and $\mathcal{G}_s := \mathcal{F}_s^+$. Propositions 5–6 and the second part of Condition 5 ensure that the hypotheses of Lemma 5 are met. Then the result follows. $\square$

Even though our goal in this subsection is Proposition 7, which involves intersection with the hyperplane $\mathcal{H}^1$, we remark that Lemmas 4–5 can help us investigate intersections with homogeneous halfspaces $\mathcal{H}^+$ for SOCr cones (Lemma 4) or more general cones (Lemma 5). Further, by iteratively applying Lemmas 4–5, we can consider intersections with multiple halfspaces, say, $\mathcal{H}_1^+, \ldots, \mathcal{H}_m^+$.

Given Lemma 4 and Corollary 2, we are now ready to prove our main result for this subsection, Proposition 7, which establishes the second part of Theorem 1. It requires the following simple lemmas, which are applicable to general sets and cones:

Lemma 6
Let $S$ be any set, and let $\mathrm{rec.cone}(S)$ be its recession cone. Then $\mathrm{conv.hull}(S) + \mathrm{conic.hull}(\mathrm{rec.cone}(S)) = \mathrm{conv.hull}(S)$.

Proof The containment $\supseteq$ is clear. Now let $x + y$ be in the left-hand side such that
\[
x = \sum_k \lambda_k x_k, \quad x_k \in S, \quad \lambda_k > 0, \quad \sum_k \lambda_k = 1,
\]
and
\[
y = \sum_j \rho_j y_j, \quad y_j \in \mathrm{rec.cone}(S), \quad \rho_j > 0.
\]
Without loss of generality, we may assume the number of $x_k$'s equals the number of $y_j$'s by splitting some $\lambda_k x_k$ or some $\rho_j y_j$ as necessary. Then
\[
x + y = \sum_k (\lambda_k x_k + \rho_k y_k) = \sum_k \lambda_k (x_k + \lambda_k^{-1} \rho_k y_k) \in \mathrm{conv.hull}(S). \qquad \square
\]

Lemma 7
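Let $\mathcal{G}_{01}$ and $\mathcal{G}_{0s}$ be cones (not necessarily convex) such that $\mathcal{G}_{01} \cap \mathcal{H}^+ \subseteq \mathcal{G}_{0s} \cap \mathcal{H}^+ \subseteq \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+)$. Then $\mathcal{G}_{01} \cap \mathcal{H}^1 \subseteq \mathcal{G}_{0s} \cap \mathcal{H}^1 \subseteq \mathrm{conv.hull}(\mathcal{G}_{01} \cap \mathcal{H}^1)$.

Proof We have $\mathcal{G}_{01} \cap \mathcal{H}^1 \subseteq \mathcal{G}_{0s} \cap \mathcal{H}^1 \subseteq \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+) \cap \mathcal{H}^1$. We claim further that
\[
\mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+) \cap \mathcal{H}^1 \subseteq \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^0) + \mathrm{conv.hull}(\mathcal{G}_{01} \cap \mathcal{H}^1). \tag{13}
\]
Then applying Lemma 6 with $S := \mathcal{G}_{01} \cap \mathcal{H}^1$ and $\mathrm{rec.cone}(S) = \mathcal{G}_{01} \cap \mathcal{H}^0$, we see that $\mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+) \cap \mathcal{H}^1 \subseteq \mathrm{conv.hull}(\mathcal{G}_{01} \cap \mathcal{H}^1)$, which proves the lemma.

To prove the claim (13), let $x \in \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+) \cap \mathcal{H}^1$. Then $h^T x = 1$ and
\[
x = \sum_k \lambda_k x_k, \quad x_k \in \mathcal{G}_{01} \cap \mathcal{H}^+, \quad \lambda_k > 0,
\]
which may further be separated as
\[
x = \underbrace{\sum_{k : h^T x_k > 0} \lambda_k x_k}_{=: y} + \underbrace{\sum_{k : h^T x_k = 0} \lambda_k x_k}_{=: r} = y + r.
\]
Note that $r \in \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^0)$, and so it suffices to show $y \in \mathrm{conv.hull}(\mathcal{G}_{01} \cap \mathcal{H}^1)$. Rewrite $y$ as
\[
y = \sum_{k : h^T x_k > 0} \lambda_k x_k = \sum_{k : h^T x_k > 0} \underbrace{(\lambda_k \cdot h^T x_k)}_{=: \tilde{\lambda}_k} \underbrace{(x_k / h^T x_k)}_{=: \tilde{x}_k} =: \sum_{k : h^T x_k > 0} \tilde{\lambda}_k \tilde{x}_k.
\]
By construction, each $\tilde{x}_k \in \mathcal{G}_{01} \cap \mathcal{H}^1$. Moreover, each $\tilde{\lambda}_k$ is positive and
\[
\sum_{k : h^T x_k > 0} \tilde{\lambda}_k = \sum_{k : h^T x_k > 0} \lambda_k \cdot h^T x_k = h^T y = h^T (x - r) = 1 - 0 = 1,
\]
since $x \in \mathcal{H}^1$. So $y \in \mathrm{conv.hull}(\mathcal{G}_{01} \cap \mathcal{H}^1)$. $\square$

Proposition 7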
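Suppose Conditions 1–5 hold. Then
\[
\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1 \subseteq \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^1 \subseteq \mathrm{conv.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1).
\]

Proof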
Define $\mathcal{G}_{01} := \mathcal{F}_0^+ \cap \mathcal{F}_1$ and $\mathcal{G}_{0s} := \mathcal{F}_0^+ \cap \mathcal{F}_s^+$. Lemma 4 and Corollary 2 imply $\mathcal{G}_{01} \cap \mathcal{H}^+ \subseteq \mathcal{G}_{0s} \cap \mathcal{H}^+ \subseteq \mathrm{conic.hull}(\mathcal{G}_{01} \cap \mathcal{H}^+)$, which is exactly (12). Then Lemma 7 implies the result. $\square$

As with Lemma 5, we have stated Lemma 7 in terms of general cones, extending beyond just SOCr cones. In particular, in future research, these results may allow the derivation of conic and convex hulls for intersections with more general cones.
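The rescaling step in the proof of Lemma 7 can be illustrated on a toy instance. The sketch below is only an illustration under stated assumptions: the cone is taken to be the nonnegative orthant in $\mathbb{R}^3$, $h = (0; 0; 1)$, and the points and weights are made up for the example; the function name `split_and_rescale` is hypothetical. It splits a conic combination $x = \sum_k \lambda_k x_k$ with $h^T x = 1$ into a part $r$ with $h^T r = 0$ and a convex combination of rescaled points $\tilde{x}_k = x_k / h^T x_k$ with weights $\tilde{\lambda}_k = \lambda_k\, h^T x_k$.

```python
from fractions import Fraction as Fr


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def split_and_rescale(points, weights, h):
    """Split x = sum_k w_k x_k (all h^T x_k >= 0) as in the proof of Lemma 7.

    Returns (terms, r): `terms` is a list of (weight, point) pairs whose
    points satisfy h^T point = 1, and `r` collects the terms with h^T x_k = 0.
    """
    r = [Fr(0)] * len(h)
    terms = []
    for w, x in zip(weights, points):
        hx = dot(h, x)
        if hx < 0:
            raise ValueError("every point must satisfy h^T x >= 0")
        if hx == 0:
            r = [ri + w * xi for ri, xi in zip(r, x)]
        else:
            # lambda~_k = lambda_k * h^T x_k and x~_k = x_k / h^T x_k
            terms.append((w * hx, [xi / hx for xi in x]))
    return terms, r


# Toy data (assumed): three points of the nonnegative orthant and h = e_3.
h = [Fr(0), Fr(0), Fr(1)]
points = [[Fr(1), Fr(0), Fr(2)],
          [Fr(0), Fr(3), Fr(3)],
          [Fr(1), Fr(1), Fr(0)]]
weights = [Fr(1, 4), Fr(1, 6), Fr(2)]
x = [sum(w * p[i] for w, p in zip(weights, points)) for i in range(3)]

terms, r = split_and_rescale(points, weights, h)
y = [sum(w * p[i] for w, p in terms) for i in range(3)]
```

In this instance the rescaled weights sum to $h^T x = 1$, so $y$ is a convex combination of points on the hyperplane $h^T x = 1$, and $x = y + r$ as in the decomposition used to establish (13).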
In this section (specifically Sections 5.1–5.4), we consider the intersection of the canonical second-order cone
\[
\mathcal{K} := \{ x : \|\tilde{x}\| \le x_n \}, \quad \text{where } \tilde{x} = (x_1; \ldots; x_{n-1}),
\]
and a two-term linear disjunction defined by $c_1^T x \ge d_1 \vee c_2^T x \ge d_2$. Without loss of generality, we take $d_1, d_2 \in \{0, \pm 1\}$ with $d_1 \ge d_2$, and we work with the following condition:

Condition 6
The disjunctive sets $\mathcal{K}_1 := \mathcal{K} \cap \{ x : c_1^T x \ge d_1 \}$ and $\mathcal{K}_2 := \mathcal{K} \cap \{ x : c_2^T x \ge d_2 \}$ are non-intersecting except possibly on their boundaries, i.e.,
\[
\mathcal{K}_1 \cap \mathcal{K}_2 \subseteq \left\{ x \in \mathcal{K} : \begin{array}{l} c_1^T x = d_1 \\ c_2^T x = d_2 \end{array} \right\}.
\]

This condition ensures that, on $\mathcal{K}$, the disjunction $c_1^T x \ge d_1 \vee c_2^T x \ge d_2$ is equivalent to the quadratic inequality $(c_1^T x - d_1)(c_2^T x - d_2) \le 0$. Condition 6 is satisfied, for example, when the disjunction is a proper split, i.e., $c_1 \parallel c_2$ with $c_1^T c_2 < 0$, $\mathcal{K}_1 \cup \mathcal{K}_2 \ne \mathcal{K}$, and $d_1 = d_2$. (In this case of a split disjunction, if $d_1 \ne d_2$, then it can be shown that the closed conic hull of $\mathcal{K}_1 \cup \mathcal{K}_2$ is just $\mathcal{K}$.)

Because $d_1, d_2 \in \{0, \pm 1\}$ with $d_1 \ge d_2$, we can break our analysis into the following three cases with a total of six subcases:
(a) $d_1 = d_2 = 0$, covering subcase $(d_1, d_2) = (0, 0)$;
(b) $d_1 = d_2$ nonzero, covering subcases $(d_1, d_2) \in \{(-1, -1), (1, 1)\}$;
(c) $d_1 > d_2$, covering subcases $(d_1, d_2) \in \{(0, -1), (1, -1), (1, 0)\}$.

Case (a) is the homogeneous case, in which we take $A_0 = J := \mathrm{Diag}(1, \ldots, 1, -1)$ and $A_1 = c_1 c_2^T + c_2 c_1^T$ to match our set of interest $\mathcal{K} \cap \mathcal{F}_1$. Note that $\mathcal{K} = \mathcal{F}_0^+$ in this case. For the non-homogeneous cases (b) and (c), we can homogenize via $y = \binom{x}{x_{n+1}}$ with $h^T y = x_{n+1} = 1$. Defining
\[
A_0 := \begin{pmatrix} J & 0 \\ 0 & 0 \end{pmatrix}, \quad
A_1 := \begin{pmatrix} c_1 c_2^T + c_2 c_1^T & -d_2 c_1 - d_1 c_2 \\ -d_2 c_1^T - d_1 c_2^T & 2 d_1 d_2 \end{pmatrix},
\]
we then wish to examine $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1$.

In fact, by the results in [36, Section 5.2], case (c) implies that $\mathrm{cl.conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$ cannot in general be captured by two conic inequalities, making it unlikely that our desired equality $\mathrm{cl.conv.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1) = \mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^1$ will hold in general. So we will focus on cases (a) and (b). Nevertheless, we include some comments on case (c) in Section 5.4.

Later on, in Section 5.3, we will also revisit Condition 6 to show that it is unnecessary in some sense. Precisely, even when Condition 6 does not hold, we can derive a related convex valid inequality, which, together with $\mathcal{F}_0^+$, gives the complete convex hull description. This inequality precisely matches the one already described in [36], but it does not have an SOC form.

In contrast to Sections 5.1–5.4, Section 5.5 examines two-term disjunctions on conic sections of $\mathcal{K}$, i.e., intersections of $\mathcal{K}$ with a hyperplane.

5.1 The case (a) of $d_1 = d_2 = 0$

As discussed above, we have $A_0 := J$ and $A_1 := c_1 c_2^T + c_2 c_1^T$. If either $c_i \in \mathcal{K}$, then the corresponding side of the disjunction $\mathcal{K}_i$ simply equals $\mathcal{K}$, so the conic hull is $\mathcal{K}$. In addition, if either $c_i \in \mathrm{int}(-\mathcal{K})$, then $\mathcal{K}_i = \{0\}$, so the conic hull equals the other $\mathcal{K}_j$. Hence, we assume both $c_i \notin \mathcal{K} \cup \mathrm{int}(-\mathcal{K})$, i.e., $\|\tilde{c}_i\| \ge |c_{i,n}|$, where $c_i = \binom{\tilde{c}_i}{c_{i,n}}$. Since the example in Section 4 of the Online Supplement violates Condition 4 with $\|\tilde{c}_i\| = |c_{i,n}|$ for one of the $c_i$, we further assume that both $\|\tilde{c}_i\| > |c_{i,n}|$.

Conditions 1 and 3(i) are easily verified. In particular, $s > 0$. Condition 2 describes the full-dimensional case of interest. It remains to verify Condition 4. (Note that Condition 4 is only relevant when $s < 1$.) So assume $s < 1$, and given nonzero $z \in \mathrm{null}(A_s)$, we will show
\[
z^T A_1 z = 2 (c_1^T z)(c_2^T z) < 0,
\]
verifying Condition 4. We already know from Lemma 3 that $z^T A_1 z \le 0$. So it remains to show that both $c_1^T z$ and $c_2^T z$ are nonzero.

Since $z \in \mathrm{null}(A_s)$, we know $\bigl(\tfrac{1-s}{s}\bigr) A_0 z = -A_1 z$, i.e.,
\[
\Bigl(\frac{1-s}{s}\Bigr) \begin{pmatrix} \tilde{z} \\ -z_n \end{pmatrix} = -c_1 (c_2^T z) - c_2 (c_1^T z). \tag{14}
\]
Note that $c_1^T z = \binom{\tilde{c}_1}{-c_{1,n}}^T \binom{\tilde{z}}{-z_n}$, so multiplying both sides of equation (14) with $\binom{\tilde{c}_1}{-c_{1,n}}^T$ and rearranging terms, we obtain
\[
\Bigl[ \frac{1-s}{s} + \tilde{c}_1^T \tilde{c}_2 - c_{1,n} c_{2,n} \Bigr] (c_1^T z) = \bigl( c_{1,n}^2 - \|\tilde{c}_1\|^2 \bigr) (c_2^T z).
\]
Similarly, using $\binom{\tilde{c}_2}{-c_{2,n}}^T$, we obtain:
\[
\Bigl[ \frac{1-s}{s} + \tilde{c}_1^T \tilde{c}_2 - c_{1,n} c_{2,n} \Bigr] (c_2^T z) = \bigl( c_{2,n}^2 - \|\tilde{c}_2\|^2 \bigr) (c_1^T z).
\]
The inequalities $\|\tilde{c}_1\| > |c_{1,n}|$ and $\|\tilde{c}_2\| > |c_{2,n}|$ thus imply $c_1^T z \ne 0 \Leftrightarrow c_2^T z \ne 0$. Moreover, $c_1^T z$ and $c_2^T z$ cannot both be 0; otherwise, $z$ would be 0 by (14).

Note that [35,36] give an infinite family of valid inequalities in this setup but do not prove the sufficiency of a single inequality from this family. In this case, the sufficiency proof for a single inequality from this family was given recently in [55]. None of the other papers [2,23,40] are relevant here because they consider only split disjunctions, not general two-term disjunctions. Because of the boundedness assumption used in [13], [13] is not applicable here either. Similar to the example in Section 1 of the Online Supplement, as long as the disjunction can be viewed as removing a convex set, we can try to apply [14] to this case by considering the SOC as the epigraph of the norm $\|\tilde{x}\|$.
However, the authors' special conditions for polynomial-time separability such as differentiability or growth rate are not satisfied; see Theorem IV therein.

5.2 The case (b) of nonzero $d_1 = d_2$

In [36], it was shown that $c_1 - c_2 \in \pm\mathcal{K}$ implies one of the sets $\mathcal{K}_i$ defining the disjunction is contained in the other $\mathcal{K}_j$, and thus the desired closed convex hull trivially equals $\mathcal{K}_j$. So we assume $c_1 - c_2 \notin \pm\mathcal{K}$, i.e., $\|\tilde{c}_1 - \tilde{c}_2\|^2 > (c_{1,n} - c_{2,n})^2$, where $c_i = \binom{\tilde{c}_i}{c_{i,n}}$. Defining $\sigma = d_1 = d_2$, we have
\[
A_0 := \begin{pmatrix} J & 0 \\ 0 & 0 \end{pmatrix}, \quad
A_1 := \begin{pmatrix} c_1 c_2^T + c_2 c_1^T & -\sigma (c_1 + c_2) \\ -\sigma (c_1 + c_2)^T & 2 \end{pmatrix}.
\]
Conditions 1 and 3(ii) are easily verified, and Condition 2 describes the full-dimensional case of interest. It remains to verify Conditions 4 and 5. So assume $s < 1$, and note $s > 0$ because Condition 3(ii) holds. Given $z^+ \in \mathbb{R}^{n+1}$, write $z^+ = \binom{z}{z_{n+1}}$ and $z = \binom{\tilde{z}}{z_n} \in \mathbb{R}^n$. Suppose $z^+ \ne 0$. Then
\[
\begin{aligned}
z^+ \in \mathrm{null}(A_s)
&\iff \Bigl(\frac{1-s}{s}\Bigr) A_0 z^+ = -A_1 z^+ \\
&\iff \Bigl(\frac{1-s}{s}\Bigr) A_0 z^+ = -\binom{c_1}{-\sigma}\binom{c_2}{-\sigma}^T z^+ - \binom{c_2}{-\sigma}\binom{c_1}{-\sigma}^T z^+ =: \alpha \binom{c_1}{-\sigma} + \beta \binom{c_2}{-\sigma}.
\end{aligned}
\]
Since the last component of $A_0 z^+$ is zero, we must have $\beta = -\alpha$. We claim $\alpha \ne 0$. Assume for contradiction that $\alpha = 0$. Then $z = 0$, but $z_{n+1} \ne 0$ as $z^+$ is nonzero. On the other hand, because $z^+ \in \mathrm{null}(A_s)$, Lemma 3 implies $0 \ge (z^+)^T A_1 z^+ = 2 z_{n+1}^2$, a contradiction. So indeed $\alpha \ne 0$.

Because $z^+ \in \mathrm{null}(A_s)$ and $s \in (0, 1)$, the equation
\[
0 = (z^+)^T A_s z^+ = (1 - s)(z^+)^T A_0 z^+ + s (z^+)^T A_1 z^+
\]
implies Condition 4 holds if and only if $(z^+)^T A_0 z^+ > 0$. From the previous paragraph, we have $\bigl(\tfrac{1-s}{s}\bigr) A_0 z^+ = \alpha \binom{c_1 - c_2}{0}$ with $\alpha \ne 0$. Then
\[
\Bigl(\frac{1-s}{s}\Bigr)^2 (z^+)^T A_0 z^+ =
\begin{pmatrix} \alpha (\tilde{c}_1 - \tilde{c}_2) \\ -\alpha (c_{1,n} - c_{2,n}) \\ \bigl(\tfrac{1-s}{s}\bigr) z_{n+1} \end{pmatrix}^T
\begin{pmatrix} \alpha (\tilde{c}_1 - \tilde{c}_2) \\ \alpha (c_{1,n} - c_{2,n}) \\ 0 \end{pmatrix}
= \alpha^2 \bigl( \|\tilde{c}_1 - \tilde{c}_2\|^2 - (c_{1,n} - c_{2,n})^2 \bigr) > 0,
\]
as desired.

However, it seems difficult to verify Condition 5 generally. For example, consider its second part $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^0 \subseteq \mathcal{F}_1$. In the current context, we have $\mathcal{F}_0^+ \cap \mathcal{H}^0 = \mathcal{K} \times \{0\}$, and it is unclear if its intersection with $\mathcal{F}_s^+$ would be contained in $\mathcal{F}_1$. Letting $\binom{\hat{h}}{0} \in \mathcal{F}_s^+$ with $\hat{h} \in \mathcal{K}$, we would have to check the following:
\[
0 \ge \binom{\hat{h}}{0}^T A_s \binom{\hat{h}}{0} = (1 - s)\, \hat{h}^T J \hat{h} + 2s\, (c_1^T \hat{h})(c_2^T \hat{h})
\implies \binom{\hat{h}}{0} \in \mathcal{F}_1.
\]
If $\hat{h}$ were in the interior of $\mathcal{K}$, then $\hat{h}^T J \hat{h} < 0$, which would allow $(c_1^T \hat{h})(c_2^T \hat{h}) > 0$, so that $\binom{\hat{h}}{0} \in \mathcal{F}_1$ would not be achieved. So it seems Condition 5 will hold under additional conditions only.

One such set of conditions ensuring Condition 5 is as follows: there exist $\beta_1, \beta_2 \ge 0$ such that $\beta_1 c_1 + c_2 \in -\mathcal{K}$ and $\beta_2 c_1 + c_2 \in \mathcal{K}$. These hold, for example, for split disjunctions, i.e., when $c_2$ is a negative multiple of $c_1$. To prove Condition 5, take $\hat{h} \in \mathcal{K}$. Then $c_1^T \hat{h} \ge 0$ implies $c_2^T \hat{h} = -\beta_1 c_1^T \hat{h} + (\beta_1 c_1 + c_2)^T \hat{h} \le 0$, and similarly $c_1^T \hat{h} \le 0$ implies $c_2^T \hat{h} \ge 0$. Then overall $\hat{h} \in \mathcal{K}$ implies $(c_1^T \hat{h})(c_2^T \hat{h}) \le 0$. In the context of the previous paragraph, this ensures $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^0 \subseteq \mathcal{F}_0^+ \cap \mathcal{H}^0 \subseteq \mathcal{F}_1$, thus verifying Condition 5.

Note that [35,36] cover this case. In the case of split disjunctions with $d_1 = d_2 = 1$, these results are also presented in [2,40]. Whenever the boundedness assumption of [13] is satisfied, one can use their result as well, but the papers [23,55] are not relevant here. Similar to the previous subsection, [14] is limited in its application to this case.

5.3 Revisiting Condition 6

For the cases $d_1 = d_2$ of Sections 5.1 and 5.2, we know that $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$ is a valid convex relaxation of $\mathcal{F}_0^+ \cap \mathcal{F}_1$ under Conditions 1–3 and 6. The same holds for the cross-sections: $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap \mathcal{H}^1$ is a relaxation of $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1$. Because Condition 3(i) is verified in the case of $d_1 = d_2 = 0$ and Condition 3(ii) is verified in the case of nonzero $d_1 = d_2$, we have $s > 0$. However, when Condition 6 is violated, it may be possible that $\mathcal{F}_s^+$ is invalid for points simultaneously satisfying both sides of the disjunction, i.e., points $x$ with $c_1^T x \ge d_1$ and $c_2^T x \ge d_2$. This is because such points can violate the quadratic $(c_1^T x - d_1)(c_2^T x - d_2) \le 0$ from which $\mathcal{F}_s^+$ is derived. In such cases, the set $\mathcal{F}_s^+$ should be relaxed somehow.

Recall that, by definition, $\mathcal{F}_s^+ = \{ x : x^T A_s x \le 0,\ b_s^T x \ge 0 \}$. Let us examine the inequality $x^T A_s x \le 0$, which can be rewritten as
\[
\begin{aligned}
& 0 \ge (1 - s)\, x^T J x + 2s\, (c_1^T x - d_1)(c_2^T x - d_2) \\
\iff\ & 0 \ge 2(1 - s)\, x^T J x + s \Bigl( [(c_1^T x - d_1) + (c_2^T x - d_2)]^2 - [(c_1^T x - d_1) - (c_2^T x - d_2)]^2 \Bigr) \\
\iff\ & s\, [(c_1 - c_2)^T x - (d_1 - d_2)]^2 - 2(1 - s)\, x^T J x \ge s\, [(c_1 + c_2)^T x - (d_1 + d_2)]^2.
\end{aligned}
\]
Note that the left-hand side of the third inequality is nonnegative for any $x \in \mathcal{K}$ since $x^T J x \le 0$. Therefore, for $x \in \mathcal{K}$, the inequality $x^T A_s x \le 0$ is equivalent to
\[
\sqrt{ \bigl[ (c_1 - c_2)^T x - (d_1 - d_2) \bigr]^2 - 2 \Bigl(\frac{1-s}{s}\Bigr) x^T J x } \ \ge\ |(c_1 + c_2)^T x - (d_1 + d_2)|. \tag{15}
\]
An immediate relaxation of (15) is
\[
\sqrt{ \bigl[ (c_1 - c_2)^T x - (d_1 - d_2) \bigr]^2 - 2 \Bigl(\frac{1-s}{s}\Bigr) x^T J x } \ \ge\ (d_1 + d_2) - (c_1 + c_2)^T x \tag{16}
\]
since $|(c_1 + c_2)^T x - (d_1 + d_2)| \ge (d_1 + d_2) - (c_1 + c_2)^T x$. Note also that (16) is clearly valid for any $x$ satisfying $c_1^T x \ge d_1$ and $c_2^T x \ge d_2$ since the two sides of the inequality have different signs in this case. In total, the set
\[
\mathcal{G}_s^+ := \{ x : \text{(16) holds},\ b_s^T x \ge 0 \}
\]
is a valid relaxation when Condition 6 does not hold. Although not obvious, it follows from [36] that (16) is a convex inequality. In that paper, (16) was encountered from a different viewpoint, and its convexity was established directly, even though it does not admit an SOC representation. So in fact $\mathcal{G}_s^+$ is convex.

Now let us assume that Condition 4 holds as well so that $\mathcal{F}_s^+$ captures the conic hull of the intersection of $\mathcal{F}_0^+$ and $(c_1^T x - d_1)(c_2^T x - d_2) \le 0$. We claim that $\mathcal{F}_0^+ \cap \mathcal{G}_s^+$ captures the conic hull when Condition 6 does not hold. (A similar claim will also hold when Condition 5 holds for the further intersection with $\mathcal{H}^1$.) So let $\hat{x} \in \mathcal{F}_0^+ \cap \mathcal{G}_s^+$ be given. If (15) happens to hold also, then $\hat{x}^T A_s \hat{x} \le 0 \Rightarrow \hat{x} \in \mathcal{F}_s^+$. Then $\hat{x}$ is already in the closed convex hull given by $(c_1^T x - d_1)(c_2^T x - d_2) \le 0$. On the other hand, if (15) does not hold, then (16) implies $(c_1 + c_2)^T \hat{x} > d_1 + d_2$. So either $c_1^T \hat{x} > d_1$ or $c_2^T \hat{x} > d_2$. Whichever the case, $\hat{x}$ satisfies the disjunction. Therefore $\hat{x}$ is in the closed convex hull, which gives the desired conclusion.

We remark that, despite their different forms, (16) and the inequality defining $\mathcal{F}_s^+$ both originate from $x^T A_s x \le 0$ and coincide on $\mathrm{conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1) \setminus (\mathcal{F}_0^+ \cap \mathcal{F}_1)$, i.e., the points added due to the convexification process. Moreover, (16) can be interpreted as adding all of the recessive directions $\{ d \in \mathcal{K} : c_1^T d \ge 0,\ c_2^T d \ge 0 \}$ of the disjunction to the set $\mathcal{F}_0^+ \cap \mathcal{F}_s^+$. Finally, the analysis in [36] shows in addition that the linear inequality $b_s^T x \ge 0$ is in fact redundant for $\mathcal{G}_s^+$.

Note that [35,36] cover this case. Because the resulting convex hull is not conic representable, [13] is not applicable in this case. The papers [23,55] are not relevant here, and none of the other papers [2,40] cover this case because they focus on split disjunctions only. As in the previous two subsections, [14] is limited in its application.

5.4 The case (c) of $d_1 > d_2$

As mentioned above, the results of [36] ensure that $\mathrm{cl.conic.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1)$ requires more than two conic inequalities, making it highly likely that the closed convex hull of $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap \mathcal{H}^1$ requires more than two also. In other words, our theory would not apply in this case in general. So we ask: which conditions are violated in this case?

Let us first consider when $d_1 d_2 = 0$, which covers two subcases. Then A := (cid:18) J
00 0 (cid:19) , A := (cid:18) c c T + c c T − d c − d c − d c T − d c T (cid:19) , and it is clear that Condition 3 is not satisfied.Now consider the remaining subcase when ( d , d ) = (1 , − A := (cid:18) J
00 0 (cid:19) , A := (cid:18) c c T + c c T c − c c T − c T − (cid:19) . Condition 1 holds, and Condition 2 is the full-dimensional case of interest. Condition3(iii) holds as well, so s = 0. Then Condition 4 requires v T A v <
0, where v =(0; . . . ; 0; 1), which is true. On the other hand, Condition 5 might fail. In fact, theexample in Section 5 of the Online Supplement provides just such an instance. Thisbeing said, the same stronger condition discussed in Section 5.2 can be seen to implyCondition 5, that is, when there exists β , β ≥ β c + c ∈ −K and β c + c ∈ K . This covers the case of split disjunctions, for example.Of course, even when all conditions do not hold, just Conditions 1-3, which holdwhen d d = −
1, are enough to ensure the valid relaxations F +0 ∩F + s and F +0 ∩F + s ∩ H .However, these relaxations may not be sufficient to describe the conic and convex hulls.If necessary, another way to generate valid conic inequalities when d > d is asfollows. Instead of the original disjunction, consider the weakened disjunction c T x ≥ d ∨ c T x ≥ d , where d replaces d in the first term. Clearly any point satisfying theoriginal disjunction will also satisfy the new disjunction. Therefore any valid inequalityfor the new disjunction will also be valid for the original one. In Sections 5.1 and 5.2, wehave discussed the conditions under which Conditions 1-5 are satisfied when d = d .Even if the new disjunction violates Condition 6, as long as the original disjunctionsatisfies Condition 6, the resulting inequalities from this approach will be valid.Regarding the existing literature, the conclusions at the end of Section 5.3 alsoapply here.5.5 Conic sectionsLet ρ T x ≥ d ∨ ρ T x ≥ d be a disjunction on a cross-section K∩ H of the second-ordercone, where H = { x : h T x = 1 } . We work with an analogous of Condition 6: Condition 7
The disjunctive sets $\mathcal{K}_1 := \mathcal{K} \cap H^1 \cap \{ x : \rho_1^T x \ge d_1 \}$ and $\mathcal{K}_2 := \mathcal{K} \cap H^1 \cap \{ x : \rho_2^T x \ge d_2 \}$ are non-intersecting except possibly on their boundaries, e.g.,
\[
\mathcal{K}_1 \cap \mathcal{K}_2 \subseteq \left\{ x \in \mathcal{K} \cap H^1 : \rho_1^T x = d_1,\ \rho_2^T x = d_2 \right\} .
\]

We would like to characterize the convex hull of the disjunction, which is the same as the convex hull of the disjunction $(\rho_1 - d_1 h)^T x \ge 0 \vee (\rho_2 - d_2 h)^T x \ge 0$ on $\mathcal{K} \cap H^1$. Defining $c_1 := \rho_1 - d_1 h$, $c_2 := \rho_2 - d_2 h$, $A_0 := J$, and $A_1 := c_1 c_2^T + c_2 c_1^T$, our goal is to characterize $\mathrm{cl.conv.hull}(\mathcal{K} \cap \mathcal{F}_1 \cap H^1)$. This is quite similar to the analysis in Section 5.1 except that here we also must verify Condition 5.

Conditions 1 and 3(i) are easily verified, and Condition 2 describes the full-dimensional case of interest. Following the development in Section 5.1, we can verify Condition 4 when $\|\tilde\rho_1 - d_1 \tilde h\| > |\rho_{1,n} - d_1 h_n|$ and $\|\tilde\rho_2 - d_2 \tilde h\| > |\rho_{2,n} - d_2 h_n|$, and otherwise the convex hull is easy to determine. For Condition 5, we consider the cases of ellipsoids, paraboloids, and hyperboloids separately.

Ellipsoids are characterized by $h \in \mathrm{int}(\mathcal{K})$, and so $\mathcal{K} \cap H^0 = \{0\}$. Thus $\mathcal{K} \cap \mathcal{F}_s^+ \cap H^0 = \{0\} \subseteq \mathcal{F}_1$, easily verifying Condition 5. On the other hand, paraboloids are characterized by $0 \ne h \in \mathrm{bd}(\mathcal{K})$, and in this case, $\mathcal{K} \cap H^0 = \mathrm{cone}\{\hat h\}$, where $\hat h := -Jh = \binom{-\tilde h}{h_n}$. Thus, to verify Condition 5, it suffices to show $\hat h \in \mathcal{F}_s^+ \Rightarrow \hat h \in \mathcal{F}_1$. Indeed $\hat h \in \mathcal{F}_s^+$ implies
\[
0 \ge \hat h^T A_s \hat h = (1-s)\, \hat h^T J \hat h + s\, \hat h^T A_1 \hat h = s\, \hat h^T A_1 \hat h
\]
because $h \in \mathrm{bd}(\mathcal{K})$ ensures $\hat h^T J \hat h = 0$. So $\hat h \in \mathcal{F}_1$.

It remains only to verify Condition 5 for hyperboloids, which are characterized by $h \notin \pm\mathcal{K}$, i.e., $h = \binom{\tilde h}{h_n}$ satisfies $\|\tilde h\| > |h_n|$. However, it seems difficult to verify Condition 5 generally. Still, we note that $\hat h \in H^0$ implies
\[
\hat h^T A_1 \hat h = 2 (c_1^T \hat h)(c_2^T \hat h) = 2 (\rho_1^T \hat h - d_1 h^T \hat h)(\rho_2^T \hat h - d_2 h^T \hat h) = 2 (\rho_1^T \hat h)(\rho_2^T \hat h) .
\]
Then Condition 5 would hold, for example, when $\rho_1$ and $\rho_2$ satisfy the following, which is identical to the conditions discussed in Sections 5.2 and 5.4: there exist $\beta_1, \beta_2 \ge 0$ such that $\beta_1 \rho_1 + \rho_2 \in -\mathcal{K}$ and $\beta_2 \rho_1 + \rho_2 \in \mathcal{K}$. This covers the case of split disjunctions, for example.

We remark that our analysis in this subsection covers all of the various cases of split disjunctions found in [40] and more. In particular, we handle ellipsoids and paraboloids for all possible general two-term disjunctions (including the non-disjoint ones). On the other hand, the cases we can cover for hyperboloids are a subset of those recently given in [55]. Note that [23] covers only split disjunctions on ellipsoids. [13] covers two-term disjunctions on ellipsoids and certain specific two-term disjunctions on paraboloids and hyperboloids satisfying their disjointness and boundedness assumptions. None of the papers [2,35,36] are relevant here. Finally, when the disjunction corresponds to the deletion of a convex set, the paper [14] applies to the cases for ellipsoids and paraboloids because those sets can be viewed as epigraphs of strictly convex quadratics.

In this section, we examine the case of (nearly) general quadratics intersected with conic sections of the SOC. For simplicity of presentation, we will employ affine transformations of the sets $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap H^1$ of interest. It is clear that our theory is not affected by affine transformations.

Consider the set
\[
\left\{ y \in \mathbb{R}^n : \ y^T y \le 1, \ \ y^T Q y + 2 g^T y + f \le 0 \right\},
\]
where $\lambda_{\min}[Q] < 0$. Note that if $\lambda_{\min}[Q] \ge 0$, then the set is already convex. Allowing an affine transformation, this set models the intersection of any ellipsoid with a general quadratic inequality. We can model this set in our framework by homogenizing $x = \binom{y}{x_{n+1}}$ and taking
\[
A_0 := \begin{pmatrix} I & 0 \\ 0^T & -1 \end{pmatrix}, \qquad
A_1 := \begin{pmatrix} Q & g \\ g^T & f \end{pmatrix}, \qquad
H^1 := \{ x : x_{n+1} = 1 \} .
\]
We would like to compute $\mathrm{cl.conv.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap H^1)$.

Conditions 1 and 3(i) are clear, and Condition 2 describes the full-dimensional case of interest. When $s < 1$, Condition 5 is satisfied because, in this case, $\mathcal{F}_0^+ \cap H^0 = \{0\}$, making the containment $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap H^0 \subseteq \mathcal{F}_1$ trivial. In Sections 6.1.1 and 6.1.2 below, we break the analysis of verifying Condition 4 into two subcases that we are able to handle: (i) when $\lambda_{\min}[Q]$ has multiplicity $k \ge 2$; and (ii) when $\lambda_{\min}[Q] \le -f$ and $g = 0$.

Subcase (i) covers, for example, the situation of deleting the interior of an arbitrary ball from the unit ball. Indeed, consider
\[
\left\{ x \in \mathbb{R}^n : \ x^T x \le 1, \ \ (x - c)^T (x - c) \ge r^2 \right\},
\]
where $c \in \mathbb{R}^n$ and $r > 0$, which corresponds to $(Q, g, f) = (-I, c, r^2 - c^T c)$. On the other hand, subcase (ii) can handle, for example, the deletion of the interior of an arbitrary ellipsoid from the unit ball, as long as that ellipsoid shares the origin as its center. In other words, the portion to delete is defined by $x^T E x < r^2$, for some $E \succ 0$ and $r > 0$, and we take $(Q, g, f) = (-E, 0, r^2)$. Note that $\lambda_{\min}[Q] \le -f \Leftrightarrow \lambda_{\max}[E] \ge r^2$, which occurs if and only if the deleted ellipsoid contains a point on the boundary of the unit ball. This is the most interesting case because, if the deleted ellipsoid were either completely inside or outside the unit ball, then the convex hull would simply be the unit ball itself. The subcase (ii) was also studied in Corollary 9 of [40] and in [14]. Moreover, none of the other papers [2,13,23,35,36,55] can handle this case.

6.1.1 $\lambda_{\min}[Q]$ has multiplicity $k \ge 2$

Define $B_t := (1-t) I + t Q$ to be the top-left $n \times n$ corner of $A_t$. Since $\lambda_{\min}[B_1] < 0$ has multiplicity $k \ge 2$, there exists $r \in (0,1)$ such that: (i) $B_r \succeq 0$; (ii) $\lambda_{\min}[B_r] = 0$ with multiplicity $k$; (iii) $B_t \succ 0$ for all $t < r$. We claim that $s = r$ as a consequence of the interlacing of eigenvalues with respect to $A_t$ and $B_t$. Indeed, let $\lambda^t_{n+1} := \lambda_{\min}[A_t]$ and $\lambda^t_n$ denote the two smallest eigenvalues of $A_t$, and let $\rho^t_n$ and $\rho^t_{n-1}$ denote the analogous eigenvalues of $B_t$. It is well known that $\lambda^t_{n+1} \le \rho^t_n \le \lambda^t_n \le \rho^t_{n-1}$. When $t < r$, we have $\lambda^t_{n+1} < 0 < \rho^t_n \le \lambda^t_n$, and when $t = r$, we have $\lambda^r_{n+1} < 0 \le \lambda^r_n \le 0$, which proves $s = r$.

Since $\dim(\mathrm{null}(B_s)) = k \ge 2$ and $\dim(\{g\}^\perp) = n - 1$, there exists $0 \ne z \in \mathrm{null}(B_s)$ such that $g^T z = 0$. We can show that $\binom{z}{0} \in \mathrm{null}(A_s)$:
\[
A_s \begin{pmatrix} z \\ 0 \end{pmatrix}
= \begin{pmatrix} B_s & s\, g \\ s\, g^T & (1-s)(-1) + s f \end{pmatrix} \begin{pmatrix} z \\ 0 \end{pmatrix}
= \begin{pmatrix} B_s z \\ s\, g^T z \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix} .
\]
Moreover, $\binom{z}{0}^T A_1 \binom{z}{0} = z^T B_1 z = z^T Q z < 0$ because $z \in \mathrm{null}(B_s)$ if and only if $z$ is an eigenvector of $B_1 = Q$ corresponding to $\lambda_{\min}[Q]$. This verifies Condition 4.

6.1.2 $\lambda_{\min}[Q] \le -f$ and $g = 0$

The argument is similar to the preceding subcase in Section 6.1.1. Note that
\[
A_t = \begin{pmatrix} (1-t) I + t Q & 0 \\ 0^T & (1-t)(-1) + t f \end{pmatrix}
=: \begin{pmatrix} B_t & 0 \\ 0^T & \beta_t \end{pmatrix}
\]
is block diagonal, so that the singularity of $A_t$ is determined by the singularity of $B_t$ and $\beta_t$. $B_t$ is first singular when $t = 1/(1 - \lambda_{\min}[Q])$, while $\beta_t$ is first singular when $t = 1/(1+f)$ (assuming $f > 0$; if not, then $\beta_t$ is never singular). Then
\[
\frac{1}{1 - \lambda_{\min}[Q]} \le \frac{1}{1+f} \quad \Longleftrightarrow \quad \lambda_{\min}[Q] \le -f,
\]
which holds by assumption. So $B_t$ is singular before $\beta_t$, leading to $s = 1/(1 - \lambda_{\min}[Q])$. Let $0 \ne z \in \mathrm{null}(B_s)$. Then, we have $Qz = -\left(\tfrac{1-s}{s}\right) z$, and thus, $\binom{z}{0} \in \mathrm{null}(A_s)$ with $\binom{z}{0}^T A_1 \binom{z}{0} = z^T B_1 z = z^T Q z < 0$. Condition 4 is hence verified.

6.2 The trust-region subproblem

We show in this subsection that our methodology can be used to solve the trust-region subproblem (TRS)
\[
\min_{\tilde y \in \mathbb{R}^{n-1}} \left\{ \tilde y^T \tilde Q \tilde y + 2 \tilde g^T \tilde y \ : \ \tilde y^T \tilde y \le 1 \right\}, \tag{17}
\]
where $\lambda_{\min}[\tilde Q] < 0$. Without loss of generality, we assume that $\tilde Q$ is diagonal with $\tilde Q_{(n-1)(n-1)} = \lambda_{\min}[\tilde Q]$ after applying an orthogonal transformation that does not change the feasible set.

Our intention is not necessarily to argue that the TRS should be solved numerically with our approach, although this is an interesting question left as future work. Our goal is to illustrate that the well-known problem (17) can be handled by our machinery. We also believe that the corresponding SOCP formulation for the TRS, as opposed to its usual SDP formulation, is independently interesting. Our transformations to follow require simply two eigenvalue decompositions, and the resulting SOCP can be solved by interior point solvers very efficiently. We note that none of the previous papers, in particular [2,13,23,35,36,40,55], have given a transformation of the TRS into an SOC optimization problem before. We recently became aware that an SOC-based reformulation of the TRS was also given in Jeyakumar and Li [30]; our approach parallels their developments from a different, convexification-based perspective.

We first argue that (17) is equivalent to a trust-region subproblem
\[
\min_{y \in \mathbb{R}^n} \left\{ y^T Q y + 2 g^T y \ : \ y^T y \le 1 \right\} \tag{18}
\]
in the $n$-dimensional variable $y := \binom{\tilde y}{y_n}$. Indeed, define
\[
Q := \begin{pmatrix} \tilde Q & 0 \\ 0^T & \lambda_{\min}[\tilde Q] \end{pmatrix}, \qquad
g := \begin{pmatrix} \tilde g \\ 0 \end{pmatrix},
\]
and note that $\lambda_{\min}[Q]$ has multiplicity at least 2. The following proposition shows that (18) is equivalent to (17).

Proposition 8
There exists an optimal solution of (18) with $y_n = 0$. In particular, the optimal values of (17) and (18) are equal.

Proof. Let $\bar y$ be an optimal solution of (18). Then $(\bar y_{n-1}; \bar y_n)$ is an optimal solution of the two-dimensional trust-region subproblem
\[
\min_{y_{n-1},\, y_n} \left\{ |\lambda_{\min}[\tilde Q]| \left( -y_{n-1}^2 - y_n^2 \right) + 2 \tilde g_{n-1} y_{n-1} \ : \ y_{n-1}^2 + y_n^2 \le \epsilon \right\},
\]
where $\epsilon := 1 - (\bar y_1^2 + \cdots + \bar y_{n-2}^2)$. Since we are minimizing a concave function over the ellipsoid, at least one optimal solution will be on the boundary of this set. In particular, whenever $\tilde g_{n-1} > 0$, the solution $\binom{y_{n-1}}{y_n} = \binom{-\sqrt{\epsilon}}{0}$ is optimal, and when $\tilde g_{n-1} \le 0$, the solution $\binom{y_{n-1}}{y_n} = \binom{\sqrt{\epsilon}}{0}$ is optimal. Thus, this problem has at least one optimal solution with $y_n = 0$. Hence, $\bar y_n$ can be taken as 0. $\square$

With the proposition in hand, we now focus on the solution of (18).

A typical approach to solve (18) is to introduce an auxiliary variable $x_{n+2}$ (where we reserve the variable $x_{n+1}$ for later homogenization) and to recast the problem as
\[
\min \left\{ x_{n+2} \ : \ y^T y \le 1, \ \ y^T Q y + 2 g^T y \le x_{n+2} \right\} .
\]
If one can compute the closed convex hull of this feasible set, then (18) is solvable by simply minimizing $x_{n+2}$ over the convex hull. We can represent this approach in our framework by taking $x = (y; x_{n+1}; x_{n+2})$, homogenizing via $x_{n+1} = 1$, and defining
\[
A_0 := \begin{pmatrix} I & 0 & 0 \\ 0^T & -1 & 0 \\ 0^T & 0 & 0 \end{pmatrix}, \qquad
A_1 := \begin{pmatrix} Q & g & 0 \\ g^T & 0 & -\tfrac12 \\ 0^T & -\tfrac12 & 0 \end{pmatrix}, \qquad
H^1 := \{ x \in \mathbb{R}^{n+2} : x_{n+1} = 1 \} .
\]
Clearly, Conditions 1 and 2 are satisfied. However, no part of Condition 3 is satisfied. So we require a different approach.

Since $y = 0$ is feasible for (18), its optimal value is nonpositive. (In fact, it is negative since $Q$ has a negative eigenvector, so that $y = 0$ is not a local minimizer.) Hence, (18) is equivalent to
\[
v := \min \left\{ -x_{n+2}^2 \ : \ y^T y \le 1, \ \ y^T Q y + 2 g^T y \le -x_{n+2}^2 \right\}, \tag{19}
\]
which can be solved in stages: first, minimize $x_{n+2}$ over the feasible set of (19) (let $l$ be the minimal value); second, separately maximize $x_{n+2}$ over the same set (let $u$ be the maximal value); and finally take $v = \min\{ -l^2, -u^2 \}$. If one can compute the closed convex hull of the feasible set of (19), then $l$ and $u$ can be computed easily.

To represent the feasible set of (19) in our framework, we define $x = (y; x_{n+1}; x_{n+2})$ and take
\[
A_0 := \begin{pmatrix} I & 0 & 0 \\ 0^T & -1 & 0 \\ 0^T & 0 & 0 \end{pmatrix}, \qquad
A_1 := \begin{pmatrix} Q & g & 0 \\ g^T & 0 & 0 \\ 0^T & 0 & 1 \end{pmatrix}, \qquad
H^1 := \{ x \in \mathbb{R}^{n+2} : x_{n+1} = 1 \} .
\]
Clearly, Conditions 1 and 2 are satisfied, and Condition 3(ii) is now satisfied. For Conditions 4 and 5, we note that $A_t$ has a block structure such that $s$ equals the smallest positive $t$ such that
\[
B_t := (1-t) \begin{pmatrix} I & 0 \\ 0^T & -1 \end{pmatrix} + t \begin{pmatrix} Q & g \\ g^T & 0 \end{pmatrix}
\]
is singular.
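The characterization of $s$ as the smallest positive $t$ at which $B_t$ becomes singular can be explored numerically. The following is a minimal sketch, not part of the original development. It assumes a small hypothetical instance ($\tilde Q = \mathrm{Diag}(1,-2)$, $\tilde g = (0.5; -0.3)$, padded as in (18)), and it assumes that $B_t$ has exactly one negative eigenvalue for $t < s$, as the interlacing argument of Section 6.1.1 suggests, so that $s$ can be located by bisection on the sign of the second-smallest eigenvalue. All function names are illustrative.

```python
import numpy as np


def build_B(t, Q, g):
    """B_t = (1 - t) * Diag(I, -1) + t * [[Q, g], [g^T, 0]]."""
    n = Q.shape[0]
    M0 = np.diag(np.r_[np.ones(n), -1.0])
    M1 = np.block([[Q, g.reshape(-1, 1)],
                   [g.reshape(1, -1), np.zeros((1, 1))]])
    return (1.0 - t) * M0 + t * M1


def second_smallest_eig(t, Q, g):
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(build_B(t, Q, g))[1]


def first_singular_t(Q, g, iters=80):
    """Bisection on the sign of the second-smallest eigenvalue of B_t over [0, 1].

    Assumption: that eigenvalue is positive before the first singular t and
    nonpositive afterwards (true for the hypothetical instance below).
    """
    lo, hi = 0.0, 1.0
    if not (second_smallest_eig(lo, Q, g) > 0 > second_smallest_eig(hi, Q, g)):
        raise ValueError("expected a sign change on [0, 1]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if second_smallest_eig(mid, Q, g) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def near_null_vector(M):
    """Unit eigenvector of the symmetric matrix M for the eigenvalue closest to 0."""
    w, V = np.linalg.eigh(M)
    return V[:, np.argmin(np.abs(w))]


def example_instance():
    """Hypothetical data: Q~ = Diag(1, -2), g~ = (0.5, -0.3), padded as in (18)."""
    Q = np.diag([1.0, -2.0, -2.0])  # lambda_min[Q] = -2 with multiplicity 2
    g = np.array([0.5, -0.3, 0.0])
    return Q, g
```

For this instance, `first_singular_t(*example_instance())` should return a value close to $1/(1 - \lambda_{\min}[Q]) = 1/3$, and `near_null_vector(build_B(s, Q, g))` should return a vector $z$ with last entry (numerically) zero and $z^T B_1 z < 0$. Bisection is used here only because it is simple; it is one possible way to locate $s$, not the procedure of the paper.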
Using an argument similar to Section 6.1.1 and exploiting the fact that $\lambda_{\min}[Q]$ has multiplicity at least 2, we can compute $s$ such that there exists $0 \ne z \in \mathrm{null}(B_s) \subseteq \mathbb{R}^{n+1}$ with $z^T B_1 z < 0$ and $z_{n+1} = 0$. By appending an extra 0 entry, this $z$ can be easily extended to $z \in \mathbb{R}^{n+2}$ with $z^T A_1 z < 0$ and $z \in H^0$. This simultaneously verifies Conditions 4 and 5.

6.3 Paraboloids

Consider the set
\[
\left\{ y = \begin{pmatrix} \tilde y \\ y_n \end{pmatrix} \in \mathbb{R}^n : \ \tilde y^T \tilde y \le y_n, \ \ \tilde y^T \tilde Q \tilde y + 2 g^T y + f \le 0 \right\},
\]
where $\lambda := \lambda_{\min}[\tilde Q] < 0$ and $2 g_n \le -\lambda$. After an affine transformation, this models the intersection of a paraboloid with any quadratic inequality that is strictly linear in $y_n$, i.e., no quadratic terms involve $y_n$. Note that if $\lambda_{\min}[\tilde Q] \ge 0$, then the set is already convex. The reason for the upper bound on $2 g_n$ will become evident shortly.

Writing $g := \binom{\tilde g}{g_n}$, we can model this situation with $x = \binom{y}{x_{n+1}}$ and
\[
A_0 := \begin{pmatrix} I & 0 & 0 \\ 0^T & 0 & -\tfrac12 \\ 0^T & -\tfrac12 & 0 \end{pmatrix}, \qquad
A_1 := \begin{pmatrix} \tilde Q & 0 & \tilde g \\ 0^T & 0 & g_n \\ \tilde g^T & g_n & f \end{pmatrix}, \qquad
H^1 := \{ x : x_{n+1} = 1 \},
\]
and we would like to compute $\mathrm{cl.conv.hull}(\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap H^1)$. Conditions 1 and 3(i) are clear, and Condition 2 describes the full-dimensional case of interest. So it remains to verify Conditions 4 and 5.

Define
\[
B_t := \begin{pmatrix} (1-t) I + t \tilde Q & 0 \\ 0^T & 0 \end{pmatrix}
\]
to be the top-left $n \times n$ corner of $A_t$, and define $r := 1/(1 - \lambda)$. Due to its structure, $B_t$ is positive semidefinite for all $t \le r$. Moreover, $B_t$ has exactly one zero eigenvalue for $t < r$, and $B_r$ has at least two zero eigenvalues. Those two zero eigenvalues ensure that $A_r$ is singular by the interlacing of eigenvalues of $A_t$ and $B_t$ (similar to Section 6.1.1). So $s \le r$.

We claim that in fact $s = r$. Let $t < r$, and consider the following system for $\mathrm{null}(A_t)$:
\[
\begin{pmatrix}
(1-t) I + t \tilde Q & 0 & t \tilde g \\
0^T & 0 & (1-t)(-\tfrac12) + t g_n \\
t \tilde g^T & (1-t)(-\tfrac12) + t g_n & t f
\end{pmatrix}
\begin{pmatrix} \tilde z \\ z_n \\ z_{n+1} \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} .
\]
Note that $2 g_n \le -\lambda$ and $0 \le t < r$ imply
\[
2 \left[ (1-t)(-\tfrac12) + t g_n \right] = t (1 + 2 g_n) - 1 \le t (1 - \lambda) - 1 < r (1 - \lambda) - 1 = 0, \tag{20}
\]
which implies $z_{n+1} = 0$. This in turn implies $\tilde z = 0$ because $(1-t) I + t \tilde Q \succ 0$ for $t < r$. Finally, $z_n = 0$ again due to (20). So we conclude that $t < r$ implies $\mathrm{null}(A_t) = \{0\}$. Hence, $s = r$.

We next write
\[
A_s = \begin{pmatrix} B_s & g_s \\ g_s^T & s f \end{pmatrix} .
\]
Since $\dim(\mathrm{null}(B_s)) \ge 2$ and $\dim(\{g_s\}^\perp) = n - 1$, there exists $0 \ne z \in \mathrm{null}(B_s)$ such that $g_s^T z = 0$. From the structure of $B_s$, we have $z = \binom{\tilde z}{z_n}$, where $\tilde z$ is a negative eigenvector of $\tilde Q$. We claim that $\binom{z}{0} \in \mathrm{null}(A_s)$. Indeed:
\[
A_s \begin{pmatrix} z \\ 0 \end{pmatrix}
= \begin{pmatrix} B_s & g_s \\ g_s^T & s f \end{pmatrix} \begin{pmatrix} z \\ 0 \end{pmatrix}
= \begin{pmatrix} B_s z \\ g_s^T z \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix} .
\]
Moreover, $\binom{z}{0}^T A_1 \binom{z}{0} = z^T B_1 z = \tilde z^T \tilde Q \tilde z < 0$. This verifies Conditions 4 and 5.

We remark that Corollary 8 in [40] studies the closed convex hull of the set
\[
\left\{ y = \begin{pmatrix} \tilde y \\ y_n \end{pmatrix} \in \mathbb{R}^n : \ \|\tilde A (\tilde y - \tilde c)\|^2 \le y_n, \ \ \|\tilde D (\tilde y - \tilde d)\|^2 \ge -\gamma y_n + q \right\},
\]
where $\tilde A \in \mathbb{R}^{(n-1) \times (n-1)}$ is an invertible matrix, $\tilde c, \tilde d \in \mathbb{R}^{n-1}$ and $\gamma \ge 0$. This situation is covered by our theory here. The paper [14] also applies to this case, but none of the other papers [2,13,23,35,36,55] are relevant here.
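The verification of Conditions 4 and 5 for paraboloids can be checked numerically on a small instance. The sketch below is illustrative only and rests on assumptions that are not in the original text: the data ($\tilde Q = \mathrm{Diag}(1,-1)$, $\tilde g = (0.3; 0.2)$, $g_n = 0.25$, $f = -0.1$) are hypothetical, the function names are invented for this example, and the data are assumed generic enough that the computed $z$ has $\tilde z \ne 0$. It builds $A_0$ and $A_1$ as defined above, sets $s = r = 1/(1-\lambda)$, constructs $z \in \mathrm{null}(B_s)$ with $g_s^T z = 0$, and checks on a grid that $A_t$ is nonsingular for $t < s$.

```python
import numpy as np


def paraboloid_matrices(Q_tilde, g_tilde, g_n, f):
    """A_0 and A_1 for the paraboloid model, in the variable x = (y~; y_n; x_{n+1})."""
    m = Q_tilde.shape[0]  # m = n - 1
    A0 = np.zeros((m + 2, m + 2))
    A0[:m, :m] = np.eye(m)
    A0[m, m + 1] = A0[m + 1, m] = -0.5
    A1 = np.zeros((m + 2, m + 2))
    A1[:m, :m] = Q_tilde
    A1[:m, m + 1] = g_tilde
    A1[m + 1, :m] = g_tilde
    A1[m, m + 1] = A1[m + 1, m] = g_n
    A1[m + 1, m + 1] = f
    return A0, A1


def null_space(M, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of a 2-D array M, via SVD."""
    _, sv, Vt = np.linalg.svd(M)
    rank = int(np.sum(sv > tol))
    return Vt[rank:].T


def condition4_vector(Q_tilde, g_tilde, g_n, f):
    """Return (s, A0, A1, x) with x = (z; 0), z in null(B_s) and g_s^T z = 0."""
    lam = np.linalg.eigvalsh(Q_tilde)[0]
    assert lam < 0 and 2 * g_n <= -lam, "assumptions of Section 6.3"
    s = 1.0 / (1.0 - lam)
    A0, A1 = paraboloid_matrices(Q_tilde, g_tilde, g_n, f)
    A_s = (1.0 - s) * A0 + s * A1
    n = A_s.shape[0] - 1
    B_s, g_s = A_s[:n, :n], A_s[:n, n]
    N = null_space(B_s)                        # n x k with k >= 2
    w = null_space((g_s @ N).reshape(1, -1))   # coefficients with g_s^T (N w) = 0
    z = N @ w[:, 0]
    return s, A0, A1, np.append(z, 0.0)


def nonsingular_before(A0, A1, s, num=50, tol=1e-6):
    """Grid check that A_t = (1 - t) A0 + t A1 is nonsingular for 0 <= t < s."""
    for t in np.linspace(0.0, s, num, endpoint=False):
        eigs = np.linalg.eigvalsh((1.0 - t) * A0 + t * A1)
        if np.min(np.abs(eigs)) <= tol:
            return False
    return True
```

A call such as `condition4_vector(np.diag([1.0, -1.0]), np.array([0.3, 0.2]), 0.25, -0.1)` returns $s = 1/2$ for this instance, together with a vector $x = \binom{z}{0}$ for which $A_s x \approx 0$ and $x^T A_1 x < 0$ can be confirmed directly. The grid check is only a numerical sanity test of the claim $s = r$; it does not replace the argument based on (20).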
This paper provides basic convexity results regarding the intersection of a second-order-cone representable set and a nonconvex quadratic. Although several results have appeared in the prior literature, we unify and extend these by introducing a simple, computable technique for aggregating (with nonnegative weights) the inequalities defining the two intersected sets. The underlying conditions of our theory can be checked easily in many cases of interest.

Beyond the examples detailed in this paper, our technique can be used in other ways. Consider, for example, a general quadratically constrained quadratic program, whose objective has been linearized without loss of generality. If the constraints include an ellipsoid constraint, then our techniques can be used to generate valid SOC inequalities for the convex hull of the feasible region by pairing each nonconvex quadratic constraint with the ellipsoid constraint one by one. The theoretical and practical strength of this technique is of interest for future research, and the techniques in [3,37] could provide a good point of comparison.

In addition, it would be interesting to investigate whether our techniques could be extended to produce valid inequalities or explicit convex hull descriptions for intersections involving multiple second-order cones or multiple nonconvex quadratics. After our initial June 2014 submission of this paper, a similar aggregation idea was explored in [41] in November 2014 by using the results from [54]. We note that, as opposed to our emphasis on the computability of SOCr relaxations, these recent results rely on numerical algorithms to compute such relaxations and on further topological conditions for verifying their sufficiency.

Acknowledgments
The authors wish to thank the Associate Editor and anonymous referees for their constructive feedback, which improved the presentation of the material in this paper. The research of the second author is supported in part by NSF grant CMMI 1454548.
References
1. C. Adjiman, S. Dallwig, C. Floudas, and A. Neumaier. A global optimization method, α-BB, for general twice-differentiable constrained NLPs - I. Theoretical advances. Computers & Chemical Engineering, 22(9):1137–1158, 1998.
2. K. Andersen and A. N. Jensen. Intersection cuts for mixed integer conic quadratic sets. In Proceedings of IPCO 2013, volume 7801 of Lecture Notes in Computer Science, pages 37–48, Valparaiso, Chile, March 2013.
3. I. P. Androulakis, C. D. Maranas, and C. A. Floudas. αBB: a global optimization method for general constrained nonconvex problems. Journal of Global Optimization, 7(4):337–363, 1995. State of the art in global optimization: computational methods and applications (Princeton, NJ, 1995).
4. K. M. Anstreicher and S. Burer. Computable representations for convex hulls of low-dimensional quadratic forms. Mathematical Programming, 124(1-2):33–43, 2010.
5. A. Atamtürk and V. Narayanan. Conic mixed-integer rounding cuts. Mathematical Programming, 122(1):1–20, 2010.
6. E. Balas. Intersection cuts - a new type of cutting planes for integer programming. Operations Research, 19:19–39, 1971.
7. E. Balas. Disjunctive programming. Annals of Discrete Mathematics, 5:3–51, 1979.
8. E. Balas, S. Ceria, and G. Cornuéjols. A lift-and-project cutting plane algorithm for mixed 0-1 programs. Mathematical Programming, 58:295–324, 1993.
9. X. Bao, N. V. Sahinidis, and M. Tawarmalani. Semidefinite relaxations for quadratically constrained quadratic programming: A review and comparisons. Mathematical Programming, 129(1):129–157, 2011.
10. A. Barvinok. A course in convexity, volume 54. American Mathematical Soc., 2002.
11. P. Belotti. Disjunctive cuts for nonconvex MINLP. In J. Lee and S. Leyffer, editors, Mixed Integer Nonlinear Programming, volume 154 of The IMA Volumes in Mathematics and its Applications, pages 117–144. Springer, New York, NY, 2012.
12. P. Belotti, J. Góez, I. Pólik, T. Ralphs, and T. Terlaky. On families of quadratic surfaces having fixed intersections with two hyperplanes. Discrete Applied Mathematics, 161(16):2778–2793, 2013.
13. P. Belotti, J. C. Góez, I. Pólik, T. K. Ralphs, and T. Terlaky. A conic representation of the convex hull of disjunctive sets and conic cuts for integer second order cone optimization. In M. Al-Baali, L. Grandinetti, and A. Purnama, editors, Numerical Analysis and Optimization, volume 134 of Springer Proceedings in Mathematics and Statistics, pages 1–35. Springer, 2014.
14. D. Bienstock and A. Michalka. Cutting-planes for optimization of convex functions over nonconvex sets. SIAM Journal on Optimization, 24(2):643–677, 2014.
15. P. Bonami. Lift-and-project cuts for mixed integer convex programs. In O. Günlük and G. J. Woeginger, editors, Proceedings of the 15th IPCO Conference, volume 6655 of Lecture Notes in Computer Science, pages 52–64, New York, NY, 2011. Springer.
16. S. Burer and K. M. Anstreicher. Second-order-cone constraints for extended trust-region subproblems. SIAM Journal on Optimization, 23(1):432–451, 2013.
17. S. Burer and A. Saxena. The MILP road to MIQCP. In Mixed Integer Nonlinear Programming, pages 373–405. Springer, 2012.
18. F. Cadoux. Computing deep facet-defining disjunctive cuts for mixed-integer programming. Mathematical Programming, 122(2):197–223, 2010.
19. M. Çezik and G. Iyengar. Cuts for mixed 0-1 conic programming. Mathematical Programming, 104(1):179–202, 2005.
20. S. Ceria and J. Soares. Convex programming for disjunctive convex optimization. Mathematical Programming, 86(3):595–614, 1999.
21. A. R. Conn, N. I. M. Gould, and P. L. Toint. Trust-Region Methods. MPS/SIAM Series on Optimization. SIAM, Philadelphia, PA, 2000.
22. G. Cornuéjols and C. Lemaréchal. A convex-analysis perspective on disjunctive cuts. Mathematical Programming, 106(3):567–586, 2006.
23. D. Dadush, S. S. Dey, and J. P. Vielma. The split closure of a strictly convex body. Operations Research Letters, 39:121–126, 2011.
24. S. Drewes. Mixed Integer Second Order Cone Programming. PhD thesis, Technische Universität Darmstadt, 2009.
25. S. Drewes and S. Pokutta. Cutting-planes for weakly-coupled 0/1 second order cone programs. Electronic Notes in Discrete Mathematics, 36:735–742, 2010.
26. N. I. M. Gould, S. Lucidi, M. Roma, and P. L. Toint. Solving the trust-region subproblem using the Lanczos method. SIAM Journal on Optimization, 9(2):504–525, 1999.
27. O. Günlük and J. Linderoth. Perspective reformulations of mixed integer nonlinear programs with indicator variables. Mathematical Programming, 124(1-2):183–205, 2010.
28. R. A. Horn and C. R. Johnson. Matrix analysis. Cambridge University Press, 2013.
29. J. Hu, J. E. Mitchell, J.-S. Pang, K. P. Bennett, and G. Kunapuli. On the global solution of linear programs with linear complementarity constraints. SIAM J. Optim., 19(1):445–471, 2008.
30. V. Jeyakumar and G. Y. Li. Trust-region problems with linear inequality constraints: Exact SDP relaxation, global optimality and robust optimization. Mathematical Programming, 147(1):171–206, 2013.
31. J. J. Júdice, H. Sherali, I. M. Ribeiro, and A. M. Faustino. A complementarity-based partitioning and disjunctive cut algorithm for mathematical programming problems with equilibrium constraints. Journal of Global Optimization, 136:89–114, 2006.
32. T. Kato. Perturbation theory for linear operators. Springer-Verlag, Berlin-New York, second edition, 1976. Grundlehren der Mathematischen Wissenschaften, Band 132.
33. M. Kılınç, J. Linderoth, and J. Luedtke. Effective separation of disjunctive cuts for convex mixed integer nonlinear programs. Technical report, 2010.
34. F. Kılınç-Karzan. On minimal inequalities for mixed integer conic programs. Mathematics of Operations Research, 41(2):477–510, 2016.
35. F. Kılınç-Karzan and S. Yıldız. Two-term disjunctions on the second-order cone. In J. Lee and J. Vygen, editors, IPCO, volume 8494 of Lecture Notes in Computer Science, pages 345–356. Springer, 2014.
36. F. Kılınç-Karzan and S. Yıldız. Two-term disjunctions on the second-order cone. Mathematical Programming, 154(1):463–491, 2015.
37. S. Kim and M. Kojima. Second order cone programming relaxation of nonconvex quadratic optimization problems. Optimization Methods and Software, 15(3-4):201–224, 2001.
38. A. Mahajan and T. Munson. Exploiting second-order cone structure for global optimization. Technical report, October 2010. ANL/MCS-P1801-1010, Argonne National Laboratory.
39. S. Modaresi, M. R. Kılınç, and J. P. Vielma. Split cuts and extended formulations for mixed integer conic quadratic programming. Operations Research Letters, 43(1):10–15, 2015.
40. S. Modaresi, M. R. Kılınç, and J. P. Vielma. Intersection cuts for nonlinear integer programming: Convexification techniques for structured sets. Mathematical Programming, 155(1):575–611, 2016.
41. S. Modaresi and J. Vielma. Convex hull of two quadratic or a conic quadratic and a quadratic inequality. Technical report, November 2014.
42. J. J. Moré and D. C. Sorensen. Computing a trust region step. SIAM Journal on Scientific and Statistical Computing, 4(3):553–572, 1983.
43. T. T. Nguyen, M. Tawarmalani, and J.-P. P. Richard. Convexification techniques for linear complementarity constraints. In O. Günlük and G. J. Woeginger, editors, IPCO, volume 6655 of Lecture Notes in Computer Science, pages 336–348. Springer, 2011.
44. G. Pataki. On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Mathematics of Operations Research, 23(2):339–358, 1998.
45. I. Pólik and T. Terlaky. A survey of the S-lemma. SIAM Rev., 49(3):371–418 (electronic), 2007.
46. F. Rellich. Perturbation theory of eigenvalue problems. Assisted by J. Berkowitz. With a preface by Jacob T. Schwartz. Gordon and Breach Science Publishers, New York-London-Paris, 1969.
47. F. Rendl and H. Wolkowicz. A semidefinite framework for trust region subproblems with applications to large scale minimization. Mathematical Programming, 77(2):273–299, 1997.
48. A. Saxena, P. Bonami, and J. Lee. Disjunctive cuts for non-convex mixed integer quadratically constrained programs. In A. Lodi, A. Panconesi, and G. Rinaldi, editors, IPCO, volume 5035 of Lecture Notes in Computer Science, pages 17–33. Springer, 2008.
49. H. Sherali and C. Shetty. Optimization with disjunctive constraints. Lectures on Econ. Math. Systems, 181, 1980.
50. R. A. Stubbs and S. Mehrotra. A branch-and-cut method for 0-1 mixed convex programming. Mathematical Programming, 86(3):515–532, 1999.
51. M. Tawarmalani, J. Richard, and K. Chung. Strong valid inequalities for orthogonal disjunctions and bilinear covering sets. Mathematical Programming, 124(1-2):481–512, 2010.
52. M. Tawarmalani, J.-P. P. Richard, and C. Xiong. Explicit convex and concave envelopes through polyhedral subdivisions. Mathematical Programming, 138(1-2):531–577, 2013.
53. J. P. Vielma, S. Ahmed, and G. L. Nemhauser. A lifted linear programming branch-and-bound algorithm for mixed-integer conic quadratic programs. INFORMS Journal on Computing, 20(3):438–450, 2008.
54. U. Yıldıran. Convex hull of two quadratic constraints is an LMI set. IMA J. Math. Control Inf., 26:417–450, 2009.
55. S. Yıldız and G. Cornuéjols. Disjunctive cuts for cross-sections of the second-order cone. Operations Research Letters, 43(4):432–437, 2015.
Online Supplement: Low-Dimensional Examples
Samuel Burer
Department of Management Sciences, University of Iowa, Iowa City, IA, 52242-1994, USA.

Fatma Kılınç-Karzan
Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA, 15213, USA.

In this Online Supplement, we illustrate Theorem 1 of the main article with several low-dimensional examples and discuss which of the earlier approaches [2,13,14,23,35,36,40,55] cannot replicate these examples. Section 5 of the main article is devoted to the important case for which the dimension $n$ is arbitrary, $\mathcal{F}_0^+$ is the second-order cone, and $\mathcal{F}_1$ represents a two-term linear disjunction $c_1^T x \ge d_1 \vee c_2^T x \ge d_2$. Section 6 of the main article investigates cases in which $\mathcal{F}_1$ is given by a (nearly) general quadratic inequality.

In $\mathbb{R}^3$, consider the intersection of the canonical second-order cone, defined by $\|(y_1; y_2)\| \le y_3$, and a specific linear disjunction, defined by $y_1 \le -1 \vee y_1 \ge 1$, which is a proper split. By homogenizing via $x = \binom{y}{x_4}$ with $x_4 = 1$ and noting that the disjunction is equivalent to $y_1^2 \ge 1 \Leftrightarrow y_1^2 \ge x_4^2$, we can represent the intersection as $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap H^1$ with
\[
A_0 := \mathrm{Diag}(1, 1, -1, 0), \qquad A_1 := \mathrm{Diag}(-1, 0, 0, 1), \qquad H^1 := \{ x : x_4 = 1 \} .
\]
Note that $A_t = \mathrm{Diag}(1 - 2t,\ 1 - t,\ -1 + t,\ t)$. Conditions 1 and 3(ii) are easily verified, and Condition 2 holds with $\bar x := (2; 0; 3; 1)$, for example.

In this case, $s = \tfrac12$, $A_s = \mathrm{Diag}(0, \tfrac12, -\tfrac12, \tfrac12)$, $\mathcal{F}_s = \{ x : x_2^2 + x_4^2 \le x_3^2 \}$, and $\mathcal{F}_s^+ = \{ x : \|(x_2; x_4)\| \le x_3 \}$, which contains $\bar x$. Note that $\mathrm{apex}(\mathcal{F}_s^+) = \mathrm{null}(A_s) = \mathrm{span}\{d\}$, where $d := (1; 0; 0; 0)$. It is easy to check that $d \in H^0$ with $d^T A_1 d < 0$, and so Conditions 4 and 5 are simultaneously verified.

So, in the original variable $y$, the explicit convex hull is given by
\[
\left\{ y : \ \|(y_1; y_2)\| \le y_3, \ \ \|(y_2; 1)\| \le y_3 \right\}
= \mathrm{cl.conv.hull} \left\{ y : \ \|(y_1; y_2)\| \le y_3, \ \ y_1 \le -1 \vee y_1 \ge 1 \right\} .
\]
Figure 1 depicts the original intersection, $\mathcal{F}_s^+ \cap H^1$, and the closed convex hull.

Papers [2,35,36,40] can handle this example, and in fact they can handle all split disjunctions on SOCs. On the other hand, [13] cannot handle this example because of their boundedness assumption on the sides of the disjunction. Because this example concerns a disjunction on the SOC itself, not a disjunction on a cross-section of the SOC, the papers [23,55] are not relevant here. In order to apply the results from [14], we need to consider the SOC $\|(y_1; y_2)\| \le y_3$ as the epigraph of the convex norm $\|(y_1; y_2)\|$. However, this viewpoint does not satisfy the special conditions for polynomial-time separability, such as differentiability or growth rate, in that paper; see Theorem IV therein.

[Figure 1 panels: (a) $\mathcal{F}_0^+ \cap \mathcal{F}_1 \cap H^1$; (b) $\mathcal{F}_s^+ \cap H^1$; (c) $\mathcal{F}_0^+ \cap \mathcal{F}_s^+ \cap H^1$]
Fig. 1
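A proper split of the second-order cone

In $\mathbb{R}^3$, consider the intersection of the paraboloid defined by $y_1^2 + y_2^2 \le y_3$ and the "two-sided" second-order cone disjunction defined by $y_1^2 + y_3^2 \le y_2^2$. One side has $y_2 \ge 0$, while the other has $y_2 \le 0$. By homogenizing via $x = \binom{y}{x_4}$ with $x_4 = 1$, we can represent the intersection as $F_0^+ \cap F_1 \cap H^1$ with

$$A_0 := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -\tfrac12 \\ 0 & 0 & -\tfrac12 & 0 \end{pmatrix}, \quad A_1 := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad H^1 := \{ x : x_4 = 1 \}.$$

Conditions 1 and 3(i) are straightforward to verify, and Condition 2 is satisfied with x̄ = (0; √ ; √ ; 1), for example. We can also calculate $s = \tfrac12$ from (7). Then

$$A_s = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \tfrac12 & -\tfrac14 \\ 0 & 0 & -\tfrac14 & 0 \end{pmatrix}, \quad F_s = \left\{ x : x_1^2 + \tfrac12 x_3^2 \le \tfrac12 x_3 x_4 \right\}.$$

The negative eigenvalue of $A_s$ is $\lambda_s := (1 - \sqrt{2})/4$ with corresponding eigenvector $q_s := (0; 0; \sqrt{2} - 1; 1)$, and so, in accordance with Section 2, we have that $F_s^+$ equals all $x \in F_s$ satisfying $b_s^T x \ge 0$, where

$$b_s := (-\lambda_s)^{1/2} q_s = \tfrac12 \sqrt{\sqrt{2} - 1}\; (0; 0; \sqrt{2} - 1; 1).$$

Scaling $b_s$ by a positive constant, we thus have

$$F_s^+ := \left\{ x : \begin{array}{l} x_1^2 + \tfrac12 x_3^2 \le \tfrac12 x_3 x_4 \\ (\sqrt{2} - 1) x_3 + x_4 \ge 0 \end{array} \right\}.$$

Note that $\bar x \in F_s^+$. In addition, $\mathrm{apex}(F_s^+) = \mathrm{null}(A_s) = \mathrm{span}\{d\}$, where $d = (0; 1; 0; 0)$. Clearly, $d \in H^0$ and $d^T A_1 d < 0$, which verifies Conditions 4 and 5 simultaneously.

Setting $x_4 = 1$ and returning to the original variable $y$, we see

$$\left\{ y : \begin{array}{l} y_1^2 + y_2^2 \le y_3 \\ y_1^2 + \tfrac12 y_3^2 \le \tfrac12 y_3 \end{array} \right\} = \mathrm{cl.conv.hull} \left\{ y : \begin{array}{l} y_1^2 + y_2^2 \le y_3 \\ y_1^2 + y_3^2 \le y_2^2 \end{array} \right\},$$

where the now redundant constraint $(\sqrt{2} - 1) y_3 + 1 \ge 0$ has been dropped. Figure 2 depicts the original intersection, $F_s^+ \cap H^1$, and the closed convex hull.

(a) $F_0^+ \cap F_1 \cap H^1$ (b) $F_s^+ \cap H^1$ (c) $F_0^+ \cap F_s^+ \cap H^1$
Fig. 2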
A paraboloid and a second-order-cone disjunction
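As a numerical sanity check of the pencil computations in the two examples above, the following Python sketch may be helpful. It is only an illustration under stated assumptions: the matrices are our reading of the homogenized quadratics in the two examples, the helper names `pencil`, `null_space`, and `first_singular_on_grid` are hypothetical, and the coarse grid search merely stands in for formula (7) of the main article rather than implementing it.

```python
import numpy as np

def pencil(A0, A1, t):
    """Return A_t = (1 - t) * A0 + t * A1."""
    return (1.0 - t) * A0 + t * A1

def null_space(A, tol=1e-10):
    """Orthonormal basis of null(A) for a symmetric matrix A."""
    w, V = np.linalg.eigh(A)
    return V[:, np.abs(w) < tol]

def first_singular_on_grid(A0, A1, grid, tol=1e-10):
    """First grid point t at which A_t is numerically singular (None if none)."""
    for t in grid:
        if np.min(np.abs(np.linalg.eigvalsh(pencil(A0, A1, t)))) < tol:
            return t
    return None

# Assumed data, proper split: ||(x1; x2)|| <= x3 and x1^2 >= x4^2.
A0_split = np.diag([1.0, 1.0, -1.0, 0.0])
A1_split = np.diag([-1.0, 0.0, 0.0, 1.0])

# Assumed data, paraboloid: x1^2 + x2^2 <= x3*x4 and x1^2 + x3^2 <= x2^2.
A0_par = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, -0.5],
                   [0, 0, -0.5, 0]], dtype=float)
A1_par = np.diag([1.0, -1.0, 1.0, 0.0])

grid = np.arange(1, 101) / 100.0  # t = 0.01, 0.02, ..., 1.00
s_split = first_singular_on_grid(A0_split, A1_split, grid)
s_par = first_singular_on_grid(A0_par, A1_par, grid)

As_split = pencil(A0_split, A1_split, s_split)
As_par = pencil(A0_par, A1_par, s_par)

# Apex directions: null(A_s) is one-dimensional in both examples.
d_split = null_space(As_split)[:, 0]
d_par = null_space(As_par)[:, 0]

# Negative eigenvalue of A_s in the paraboloid example and its eigenvector.
lam_s = np.linalg.eigvalsh(As_par)[0]
q_s = np.array([0.0, 0.0, np.sqrt(2.0) - 1.0, 1.0])

# Interior point used for Condition 2 in the proper-split example.
x_bar = np.array([2.0, 0.0, 3.0, 1.0])

print("s (split):", s_split, " s (paraboloid):", s_par)
print("lambda_s:", lam_s)
print("d^T A1 d:", d_split @ A1_split @ d_split, d_par @ A1_par @ d_par)
```

The grid search is deliberately crude; it only confirms that no singular $A_t$ occurs on the sampled points before the reported value of $s$. An exact computation would use the generalized eigenvalues of the pair $(A_0, A_1)$.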
Of the earlier, related approaches, this example can be handled by [40] only. In particular, [2,13,23,35,36,55] cannot handle this example because they deal only with split or two-term disjunctions and cannot cover general nonconvex quadratics. The approach of [14] is based on eliminating a convex region from a convex epigraphical set, but this example removes a nonconvex region (specifically, $\mathbb{R}^n \setminus F_1$). So [14] cannot handle this example either.

In actuality, the results of [40] do not handle this example explicitly, since the authors state results only for the removal of a paraboloid or an ellipsoid from a paraboloid, or the removal of an ellipsoid (or an ellipsoidal cylinder) from another ellipsoid with a common center. However, in this particular example, the function obtained from the aggregation technique described in [40] is convex on all of $\mathbb{R}^3$. Therefore, their global convexity requirement on the aggregated function is satisfied for this example.

In $\mathbb{R}^2$, consider the intersection of the canonical second-order cone defined by $|y_1| \le y_2$ and the set defined by the quadratic $y_1 (y_2 - 1) \le 0$. By homogenizing via $x = \binom{y}{x_3}$ with $x_3 = 1$, we can represent the set as $F_0^+ \cap F_1 \cap H^1$ with

$$A_0 := \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad A_1 := \frac12 \begin{pmatrix} 0 & 1 & -1 \\ 1 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad H^1 := \{ x : x_3 = 1 \}.$$

While Conditions 1 and 2 hold, Condition 3 does not hold because $A_0$ is singular and $A_1$ is zero on the null space $\mathrm{span}\{(0; 0; 1)\}$ of $A_0$. Figure 3 depicts $F_0^+ \cap F_1 \cap H^1$ and $F_0^+ \cap F_1$.

In this example, even though Condition 3 is violated, we still have the trivial convex relaxation given by $\mathrm{cl.conv.hull}(F_0^+ \cap F_1 \cap H^1) \subseteq F_0^+ \cap H^1$. Of course, this trivial convex relaxation is not sufficient.

(a) $F_0^+ \cap F_1 \cap H^1$ (b) $F_0^+ \cap F_1$
Fig. 3
An example violating Condition 3
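The failure of Condition 3 in this example can be checked mechanically. The sketch below is illustrative only and rests on assumptions: the function `condition3_case` is a hypothetical helper, the three cases encode our reading of Condition 3 as it is used in this supplement ((i) $A_0$ nonsingular, (ii) or (iii) $A_1$ positive or negative definite on $\mathrm{null}(A_0)$), the matrices are our reading of the homogenized quadratics, and the witness point for Condition 2 is chosen here for illustration and does not come from the text.

```python
import numpy as np

def condition3_case(A0, A1, tol=1e-10):
    """Return 'i', 'ii', or 'iii' for the case of Condition 3 that holds, else None.

    Assumed reading of Condition 3:
      (i)   A0 is nonsingular;
      (ii)  A0 is singular and A1 is positive definite on null(A0);
      (iii) A0 is singular and A1 is negative definite on null(A0).
    """
    w, V = np.linalg.eigh(A0)
    N = V[:, np.abs(w) < tol]              # orthonormal basis of null(A0)
    if N.shape[1] == 0:
        return "i"
    mu = np.linalg.eigvalsh(N.T @ A1 @ N)  # spectrum of A1 restricted to null(A0)
    if np.all(mu > tol):
        return "ii"
    if np.all(mu < -tol):
        return "iii"
    return None

# Assumed data for this example: |x1| <= x2 and x1 * (x2 - x3) <= 0.
A0 = np.diag([1.0, -1.0, 0.0])
A1 = 0.5 * np.array([[0, 1, -1],
                     [1, 0, 0],
                     [-1, 0, 0]], dtype=float)
case_ex3 = condition3_case(A0, A1)

# For contrast, the proper-split example (assumed data) falls under case (ii).
case_split = condition3_case(np.diag([1.0, 1.0, -1.0, 0.0]),
                             np.diag([-1.0, 0.0, 0.0, 1.0]))

# Illustrative witness for Condition 2 (our own choice, not from the text).
x_bar = np.array([-1.0, 2.0, 1.0])
cond2_holds = bool(x_bar @ A0 @ x_bar < 0 and x_bar @ A1 @ x_bar < 0)

print("Condition 3 case (this example):", case_ex3)
print("Condition 3 case (proper split):", case_split)
print("Condition 2 witness valid:", cond2_holds)
```

Here `None` signals that $A_1$ is neither positive nor negative definite on $\mathrm{null}(A_0)$, which matches the observation that $A_1$ vanishes on $\mathrm{span}\{(0; 0; 1)\}$.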
The papers [2,13,23,35,36,55] also cannot handle this example because they deal only with split or two-term disjunctions, which are not general enough to cover general nonconvex quadratics. Moreover, the complement of $F_1$ is a nonconvex region, so neither of the approaches from [14,40] related to excluding convex sets is applicable in this case.

In $\mathbb{R}^2$, consider the intersection of the second-order cone defined by $|x_1| \le x_2$ and the two-term linear disjunction defined by $x_1 \le 0 \vee x_2 \le x_1$. Note that, in the second-order cone, $x_2 \le x_1$ implies $x_1 = x_2$. So one side of the disjunction is contained in the boundary of the second-order cone. We also note that, in the second-order cone, the disjunction is equivalent to the quadratic $x_1 (x_2 - x_1) \le 0$. Thus, to compute the closed conic hull of the intersection of the cone and the disjunction, we define

$$A_0 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad A_1 := \begin{pmatrix} -2 & 1 \\ 1 & 0 \end{pmatrix},$$

so that $x^T A_1 x = 2 x_1 (x_2 - x_1)$, and we wish to calculate $\mathrm{cl.conic.hull}(F_0^+ \cap F_1)$.

Conditions 1, 2, and 3(i) are easily verified, and the eigenvalues of $A_0^{-1} A_1$ are both $-1$. Thus $s = \tfrac12$ by (7), and so

$$A_s = \frac12 \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}.$$

Also, $\mathrm{null}(A_s)$ is spanned by $d = (1; 1)$, and yet $d^T A_1 d = 0$, which violates Condition 4. Note that $A_s = -\tfrac12 \binom{-1}{1} \binom{-1}{1}^T$, and so $F_s^+ = \{ x : x_2 \ge x_1 \}$. Figure 4 depicts $F_0^+ \cap F_1$, $F_s^+$, and $F_0^+ \cap F_s^+$. Since Conditions 1–3 are satisfied, we know that $\mathrm{cl.conic.hull}(F_0^+ \cap F_1) \subseteq F_0^+ \cap F_s^+$, and it is evident from the figures that, in this particular example, equality holds. This simply indicates that the results of Theorem 1 may still hold even when Condition 4 is violated.

(a) $F_0^+ \cap F_1$ (b) $F_s^+$ (c) $F_0^+ \cap F_s^+$
Fig. 4 An example violating Condition 4

The approach of [2] can handle only split disjunctions on SOCs and thus is not applicable here. This is also the case for that portion of the approach from [40] associated with split disjunctions. Moreover, [35,36] cannot handle this two-term disjunction because of their strict feasibility assumption on both sides of the sets defined by the disjunction. Also, [13] cannot handle this example because of their boundedness assumption on both of the sets defined by the disjunction. In addition, $\mathbb{R}^2 \setminus F_1$ defines a nonconvex region; therefore neither of the approaches from [14,40] related to excluding convex sets is applicable in this case. Since this example concerns a disjunction on the SOC itself but not on a cross-section of an SOC, [23,55] are not relevant here.

In $\mathbb{R}^2$, consider the intersection of the second-order cone defined by $|y_1| \le y_2$ and the two-term linear disjunction defined by $y_1 \ge 2 \vee y_2 \le 1$. Note that, in the second-order cone, the disjunction is equivalent to the quadratic $(y_1 - 2)(1 - y_2) \le 0$. Thus, to compute the closed convex hull of the intersection of the cone and the disjunction, we define $x = \binom{y}{x_3}$ and

$$A_0 := \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad A_1 := \frac12 \begin{pmatrix} 0 & -1 & 1 \\ -1 & 0 & 2 \\ 1 & 2 & -4 \end{pmatrix}, \quad H^1 := \{ x : x_3 = 1 \},$$

and we wish to calculate $\mathrm{cl.conv.hull}(F_0^+ \cap F_1 \cap H^1)$.

Conditions 1, 2, and 3(iii) are easily verified, and so $s = 0$ with $\mathrm{null}(A_s)$ spanned by $d = (0; 0; 1)$. Then Condition 4 is clearly satisfied. However, $d_3 \ne 0$, that is, $d \notin H^0$, and so the first option for Condition 5 is not satisfied. The second option is the containment $F_0^+ \cap F_s^+ \cap H^0 \subseteq F_1$, which simplifies to $F_0^+ \cap H^0 \subseteq F_1$ in this case. This is also not true because the point $x = (-1; 2; 0) \in F_0^+ \cap H^0$ but $x \notin F_1$, since $x^T A_1 x = -x_1 x_2 = 2 > 0$.

Figure 5 depicts this example. Note that the inequality $y_1 \ge -1$ is valid for the convex hull of $F_0^+ \cap F_1 \cap H^1$. In addition, $F_0^+ \cap F_s^+ = \mathrm{cl.conic.hull}(F_0^+ \cap F_1)$ because Conditions 1–4 are satisfied. However, the projection $F_0^+ \cap F_s^+ \cap H^1$ is not the desired convex hull since, for example, it violates $y_1 \ge -1$.

(a) $F_0^+ \cap F_1 \cap H^1$ (b) $F_0^+ \cap F_1$ (c) $F_s^+ = F_0^+ \cap F_s^+$ (d) $F_0^+ \cap F_s^+ \cap H^1$
Fig. 5
An example violating Condition 5. Note that ss