Sharp and fast bounds for the Celis-Dennis-Tapia problem
L. Consolini, M. Locatelli
Dipartimento di Ingegneria e Architettura, Università di Parma, Parco Area delle Scienze, 181/A, Parma, Italy
Abstract
In the Celis-Dennis-Tapia (CDT) problem a quadratic function is minimized over a region defined by two strictly convex quadratic constraints. In this paper we re-derive a necessary and sufficient condition for the exactness of the dual Lagrangian bound (equivalent to the Shor relaxation bound in this case). Starting from such condition we propose to strengthen the dual Lagrangian bound by adding one or two linear cuts to the Lagrangian relaxation. Such cuts are obtained from supporting hyperplanes of one of the two constraints. Thus, they are redundant for the original problem but they are not for the Lagrangian relaxation. The computational experiments show that the new bounds are effective and require limited computing times. In particular, one of the proposed bounds is able to solve all but one of the 212 hard instances of the CDT problem presented in [7].
Keywords
CDT problem, Dual Lagrangian Bound, Linear Cuts.
The Celis-Dennis-Tapia problem (CDT problem in what follows) is defined as follows:

p⋆ = min x⊤Qx + q⊤x
     s.t. x⊤x ≤ 1
          x⊤Ax + a⊤x ≤ a0,   (1)

where Q, A ∈ R^{n×n}, q, a ∈ R^n, a0 ∈ R, while A is assumed to be positive definite. We will denote by

E = { x ∈ R^n : x⊤Ax + a⊤x ≤ a0 }

the ellipsoid defined by the second constraint, and by ∂E its border. The CDT problem was originally proposed in [8] and has attracted a lot of attention in the last two decades. For some special cases a convex reformulation is available. For instance, in [17] it is shown that a semidefinite reformulation is possible when no linear terms are present, i.e., when q = a = 0. However, up to now no tractable convex reformulation of general CDT problems has been proposed in the literature. In spite of that, recently three different works [5, 9, 13] independently proved that the CDT problem is solvable in polynomial time. More precisely, in [9, 13] polynomial solvability is proved by identifying all KKT points through the solution of a bivariate polynomial system with polynomials of degree at most 2n. The two unknowns are the Lagrange multipliers of the two quadratic constraints. Instead, in [5] an approach based on the solution of a sequence of feasibility problems for systems of quadratic inequalities is proposed. The systems are solved by a polynomial-time algorithm based on Barvinok's construction [3]. Though polynomial, all these approaches are quite computationally demanding and the degree of the polynomial is quite large. Conditions guaranteeing that the classical Shor SDP relaxation or, equivalently in this case, the dual Lagrangian bound is exact are discussed in [1, 4]. In particular, in [1] a necessary and sufficient condition is presented.
It is shown that lack of exactness is related to the existence of KKT points with the same Lagrange multipliers but two distinct primal solutions, both active at one of the two constraints but one violating and the other one fulfilling the other constraint. In [6] necessary and sufficient conditions for local and global optimality are discussed based on copositivity. A trajectory following method to solve the CDT problem has been discussed in [17], while different branch-and-bound solvers are tested in [10]. Recently, different papers proposed valid bounds for the CDT problem. In [7] the Shor relaxation bound is strengthened by adding all RLT constraints obtained by supporting hyperplanes of the two ellipsoids. By fixing the supporting hyperplane for one ellipsoid, the RLT constraints obtained with all the supporting hyperplanes of the other can be condensed into a single SOC-RLT constraint. Varying the supporting hyperplane of the first ellipsoid gives rise to an infinite number of SOC-RLT constraints which, however, can be separated in polynomial time. The addition of these constraints does not allow to close the duality gap, but it is computationally shown that many instances which are not solved via the SDP bound are solved with the addition of these SOC-RLT cuts. The authors generate 1000 random test instances for each n = 5, 10, 20, following a procedure described in [11] to generate trust-region problems with one local and nonglobal minimizer. The proposed bound based on SOC-RLT cuts allows to solve most instances except for 212 (38 for n = 5, 70 for n = 10, and 104 for n = 20). Such unsolved instances are considered as hard ones in subsequent works. In [16] lifted-RLT cuts are introduced and it is shown that the new constraints allow to derive an exact bound for n = 2 but also to improve the bounds of [7] over the hard instances for n > 2.

We first make the following assumption:

ℓ_a = min { x⊤Ax + a⊤x : x⊤x ≤ 1 } < a0,   (2)

i.e., the feasible region of (1) has a nonempty interior. Note that the assumption can be checked in polynomial time by the solution of a trust region problem. We denote by X ⊆ R^n a set such that E ⊆ X, i.e., it contains the ellipsoid defined by the second constraint. For each λ ≥ 0, let us consider the Lagrangian relaxation:

p(λ) = min x⊤(Q + λA)x + (q + λa)⊤x − λa0
       s.t. x ∈ X, x⊤x ≤ 1.   (3)

For each λ ≥ 0, the optimal value p(λ) of this problem is a lower bound for problem (1). If X = R^n, this is the standard Lagrangian relaxation of problem (1) and it can be solved efficiently since it is a trust region problem. Now, let X⋆(λ) be the set of optimal solutions of (3), i.e.,

X⋆(λ) = arg min { x⊤(Q + λA)x + (q + λa)⊤x : x ∈ X, x⊤x ≤ 1 }.

The following theorem, whose proof is simple, will be central to all the results presented in this paper.
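Before stating it, note that for X = R^n evaluating p(λ) amounts to a trust region subproblem. The sketch below is our own minimal illustration (the names `trs` and `p_lagrangian` are not from the paper): it solves the subproblem via an eigendecomposition and a bisection on the multiplier of the ball constraint, and does not treat the degenerate "hard case" exactly.

```python
import numpy as np

def trs(B, b):
    """Global minimum of x^T B x + b^T x over the unit ball ||x|| <= 1.

    Eigenvalue-based sketch of the trust-region subproblem: the stationarity
    condition is (2B + 2*mu*I) x = -b with mu >= max(0, -lambda_min(B)).
    The degenerate "hard case" is not handled exactly.
    """
    w, V = np.linalg.eigh((B + B.T) / 2)   # w ascending, V orthonormal
    c = V.T @ b

    def x_of(mu):                          # stationary point in the eigenbasis
        return -c / np.maximum(2 * (w + mu), 1e-300)

    mu = max(0.0, -w[0]) + 1e-12
    if np.linalg.norm(x_of(mu)) > 1:       # solution on the boundary:
        lo, hi = mu, mu + 1.0              # bisect (||x(mu)|| decreases in mu)
        while np.linalg.norm(x_of(hi)) > 1:
            hi *= 2
        for _ in range(200):
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if np.linalg.norm(x_of(mid)) > 1 else (lo, mid)
        mu = hi
    x = V @ x_of(mu)
    return x, x @ B @ x + b @ x

def p_lagrangian(Q, q, A, a, a0, lam):
    """Evaluate the Lagrangian bound p(lambda) of (3) with X = R^n."""
    x, val = trs(Q + lam * A, q + lam * a)
    return val - lam * a0, x
```

Each evaluation of p(λ) thus costs one symmetric eigendecomposition plus a scalar bisection.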
Theorem 2.1
For each y_λ1 ∈ X⋆(λ1) and each y_λ2 ∈ X⋆(λ2) with λ1 < λ2, it holds that

y_λ2⊤ A y_λ2 + a⊤ y_λ2 ≤ y_λ1⊤ A y_λ1 + a⊤ y_λ1.

Proof.
Let us assume by contradiction that there exist y_λ1 ∈ X⋆(λ1) and y_λ2 ∈ X⋆(λ2) such that

y_λ2⊤ A y_λ2 + a⊤ y_λ2 > y_λ1⊤ A y_λ1 + a⊤ y_λ1.   (4)

In view of y_λ1 ∈ X⋆(λ1) we have that

y_λ2⊤ Q y_λ2 + q⊤ y_λ2 + λ1 (y_λ2⊤ A y_λ2 + a⊤ y_λ2) ≥ y_λ1⊤ Q y_λ1 + q⊤ y_λ1 + λ1 (y_λ1⊤ A y_λ1 + a⊤ y_λ1).

Moreover, in view of (4) and of λ2 > λ1 we have that

(λ2 − λ1)(y_λ2⊤ A y_λ2 + a⊤ y_λ2) > (λ2 − λ1)(y_λ1⊤ A y_λ1 + a⊤ y_λ1).

By summing up the two inequalities, we have

y_λ2⊤ Q y_λ2 + q⊤ y_λ2 + λ2 (y_λ2⊤ A y_λ2 + a⊤ y_λ2) > y_λ1⊤ Q y_λ1 + q⊤ y_λ1 + λ2 (y_λ1⊤ A y_λ1 + a⊤ y_λ1),

which contradicts y_λ2 ∈ X⋆(λ2). □

It is worthwhile to remark that the theorem is valid for any possible region X ⊇ E, not only for X = R^n. Of course, for regions different from X = R^n, the problem to be solved in order to evaluate p(λ) as defined in (3) (a trust region problem when X = R^n) may be a difficult one. Throughout this section we will always assume that X = R^n. This assumption will be relaxed in the following sections. Note that an optimal solution of (3) with λ = 0, i.e., a point y_0 ∈ X⋆(0), such that y_0 ∈ E, would be feasible and optimal also for the original problem (1). Therefore, in what follows we assume that y_0 ∉ E. The following lemma states that we can always choose λ large enough so that y_λ ∈ X⋆(λ) is feasible for (1).

Lemma 2.1
If

λ > λ̂ = ( max_{x∈X: x⊤x≤1} x⊤Qx + q⊤x − min_{x∈X: x⊤x≤1} x⊤Qx + q⊤x ) / (a0 − ℓ_a),   (5)

where ℓ_a is defined in (2), then each y_λ ∈ X⋆(λ) is feasible for (1).

Proof.
Let z be an optimal solution of problem (2), i.e., z⊤Az + a⊤z = ℓ_a. In view of y_λ ∈ X⋆(λ), we must have:

y_λ⊤ Q y_λ + q⊤ y_λ + λ (y_λ⊤ A y_λ + a⊤ y_λ) ≤ z⊤Qz + q⊤z + λ (z⊤Az + a⊤z) = z⊤Qz + q⊤z + λ ℓ_a.

This can also be rewritten as

z⊤Qz + q⊤z − (y_λ⊤ Q y_λ + q⊤ y_λ) ≥ λ (y_λ⊤ A y_λ + a⊤ y_λ − ℓ_a).

Now, let us assume by contradiction that y_λ⊤ A y_λ + a⊤ y_λ > a0. Then, in view of (5) it also holds that

z⊤Qz + q⊤z − (y_λ⊤ Q y_λ + q⊤ y_λ) > max_{x∈X: x⊤x≤1} x⊤Qx + q⊤x − min_{x∈X: x⊤x≤1} x⊤Qx + q⊤x,

which is not possible. □

Now, let us define the following one-dimensional function

h(λ) = min_{y∈X⋆(λ)} y⊤Ay + a⊤y.

In view of Theorem 2.1, function h is nonincreasing but not necessarily continuous. Algorithm 1, based on a binary search through the different λ values, is able to return the solution of the dual Lagrangian problem, i.e., the maximum of function p(λ) and, in some cases, even the solution of problem (1). The algorithm starts with an initial interval of λ values [λ_min, λ_max] = [0, λ̂], and at each iteration halves such interval by first evaluating h at λ = (λ_max + λ_min)/2, and then setting λ_min = λ, if h(λ) > a0, and λ_max = λ, otherwise. We first make the following observation, which establishes a lower bound for the value of the left-hand side of the second constraint of the CDT problem (1) at an optimal solution of this problem.

Set λ_min = 0, λ_max = λ̂
while λ_max − λ_min > ε do
    Set λ = (λ_max + λ_min)/2
    Let y_λ be an optimal solution of (3) and compute h(λ)
    if h(λ) > a0 then
        Set λ_min = λ
    else
        Set λ_max = λ
    end if
end while
return y_λmax, p(λ)

Algorithm 1:
Binary search algorithm for the solution of the dual Lagrangian problem for (1).
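Algorithm 1 can be sketched in a few lines of Python. This is our own illustration (all names are ours): the inner trust region solve is a simplified eigenvalue routine that ignores the hard case, and h(λ) is evaluated at the single computed minimizer, i.e., we assume X⋆(λ) is a singleton at the tested λ values.

```python
import numpy as np

def trs(B, b):
    # Global minimum of x'Bx + b'x over ||x|| <= 1 (eigenvalue sketch, hard case ignored).
    w, V = np.linalg.eigh((B + B.T) / 2)
    c = V.T @ b
    x_of = lambda mu: -c / np.maximum(2 * (w + mu), 1e-300)
    mu = max(0.0, -w[0]) + 1e-12
    if np.linalg.norm(x_of(mu)) > 1:
        lo, hi = mu, mu + 1.0
        while np.linalg.norm(x_of(hi)) > 1:
            hi *= 2
        for _ in range(200):
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if np.linalg.norm(x_of(mid)) > 1 else (lo, mid)
        mu = hi
    x = V @ x_of(mu)
    return x, x @ B @ x + b @ x

def dual_lagrangian_bound(Q, q, A, a, a0, lam_hat, eps=1e-9):
    """Algorithm 1: bisection on lambda in [0, lam_hat]."""
    lam_lo, lam_hi = 0.0, lam_hat
    best, y = -np.inf, None
    while lam_hi - lam_lo > eps:
        lam = (lam_lo + lam_hi) / 2
        y, val = trs(Q + lam * A, q + lam * a)
        best = max(best, val - lam * a0)   # p(lambda) is a valid lower bound
        if y @ A @ y + a @ y > a0:         # h(lambda) > a0: need a larger lambda
            lam_lo = lam
        else:
            lam_hi = lam
    return best, y
```

By weak duality every tested p(λ) is a lower bound for (1), so keeping the running maximum `best` is safe even when h jumps over a0.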
Observation 2.1
For each value λ_max > 0, the optimal solution x⋆ of (1) is such that

x⋆⊤ A x⋆ + a⊤ x⋆ ≥ y_λmax⊤ A y_λmax + a⊤ y_λmax.   (6)

Proof.
By optimality, x⋆⊤ Q x⋆ + q⊤ x⋆ ≤ y_λmax⊤ Q y_λmax + q⊤ y_λmax. If (6) does not hold, then

x⋆⊤ Q x⋆ + q⊤ x⋆ + λ_max (x⋆⊤ A x⋆ + a⊤ x⋆) < y_λmax⊤ Q y_λmax + q⊤ y_λmax + λ_max (y_λmax⊤ A y_λmax + a⊤ y_λmax),

which contradicts y_λmax ∈ X⋆(λ_max). □

Now, let us assume that function h is continuous. Then, since h(λ_min) > a0, h(λ_max) ≤ a0, and λ_max − λ_min → 0, by continuity we must have that h(λ_min), h(λ_max) → a0, from which it immediately follows that y_λmax → x⋆ and p(λ) → p⋆. But under which conditions does continuity of h hold? We observe that discontinuity at some λ is caused by the fact that problem (3) at that value λ has got multiple solutions. This is also strictly related to the necessary and sufficient condition under which the dual Lagrangian bound is exact discussed in [1]. In particular, exactness does not hold if and only if function h is discontinuous and does not attain the value a0, i.e., there exists λ̄ such that

lim_{λ→λ̄+} h(λ) < a0 < lim_{λ→λ̄−} h(λ).

In this case p(λ̄) is the lower bound for problem (1) given by the dual Lagrangian relaxation.

Example 2.1
Let us consider the following example taken from [7]:

Q = ( − − ),  A = ( ),  q = (1 1)⊤,  a = (0 0)⊤,  a0 = 2.

Such an instance has optimal value − , attained at the points ( √ , − √ ) and ( − √ , √ ). By running Algorithm 1 we have, for λ̄ = 1,

lim_{λ→λ̄+} h(λ) ≈ . < a0 < . ≈ lim_{λ→λ̄−} h(λ).

The optimal solution of (3) for λ = 1 which violates the second constraint is x_λ̄ = (− . , . ). The lower bound is p(1) = − . . It is the point x in Figure 1, displayed as a circle, while the optimal solution in the interior of E is z, denoted by ×.

Figure 1: Optimal solutions of the dual Lagrangian bound outside E (x) and in the interior of E (z), denoted by ◦ and ×, respectively. The continuous red curve is the border of the unit ball, while the dotted blue curve is the border of the ellipsoid E.

In the next sections we will try to improve the dual Lagrangian bound (or the equivalent SDP bound) by adding linear cuts, i.e., by introducing regions X defined by one or two linear cuts.

We assume that the dual Lagrangian relaxation is not exact, i.e., as previously stated, there exists λ̄ such that

lim_{λ→λ̄+} h(λ) < a0 < lim_{λ→λ̄−} h(λ),

and p(λ̄) is the dual Lagrangian bound. Now we show how this bound can be improved. We first observe that the optimal value of problem (1) does not change if we add constraints which are implied by the second one. In particular, given any x̄ at which the second constraint is active, i.e., any x̄ ∈ ∂E, one can add the following linear constraint (supporting hyperplane at x̄):

(2Ax̄ + a)⊤(x − x̄) ≤ 0,   (7)

without modifying the feasible region of problem (1). In particular, if we denote by x_λ̄ the optimal solution of (3) such that x_λ̄ ∉ E, then we can set x̄ equal to the projection of x_λ̄ over ∂E, thus guaranteeing that (7) is violated by x_λ̄. We can also redefine the Lagrangian problem by adding constraint (7) to the unit ball constraint, ending up with the following problem:

min x⊤(Q + λA)x + (q + λa)⊤x − λa0
s.t. x⊤x ≤ 1
     (2Ax̄ + a)⊤(x − x̄) ≤ 0.   (8)

Note that this is exactly problem (3) where X = { x : (2Ax̄ + a)⊤(x − x̄) ≤ 0 }. Theorem 2.1 still holds true and function h can be re-defined by taking into account the additional constraint. However, if the additional inequality cuts one of the two optimal solutions attained at λ̄, namely the one which violates the second constraint, then we can run again Algorithm 1, decreasing the value λ_max and increasing the lower bound for (1) by solving at each iteration problem (8) rather than problem (3). Note that problem (8) can be solved in polynomial time according to the results proved in [7, 14]. But we also discuss an alternative way to implement Algorithm 1, namely Algorithm 2, which will turn out to be useful in what follows and requires to solve a trust region problem at each iteration. For λ = λ̄, after the addition of the linear cut, a unique optimal solution exists, lying in the interior of E and, consequently, also the linear constraint in (8) is not active at it, since E is a subset of the region defined by the linear cut. For λ values smaller than but close to λ̄, the optimal solution must be a local and nonglobal optimal solution of the trust region problem (3) with X = R^n. Indeed, the globally optimal solution of this trust region problem always violates the second constraint in (1) for all λ < λ̄. At this point we can run Algorithm 2. For each tested λ value in the while loop we can first check whether a local and nonglobal optimal solution of problem (3) exists, by exploiting the necessary and sufficient condition stated in [15].
If such optimal solution does not exist, then we should set λ_min = λ in Algorithm 2. Otherwise, if it exists, we denote it by z1(λ) and we denote by f1 its objective function value. Then, we consider the optimal solution of (8) where the linear constraint is imposed to be active. The resulting problem is converted into a trust region problem, after the change of variable x = x̄ + Vw, where V ∈ R^{n×(n−1)} is a matrix whose columns form a basis for the null space of the vector 2Ax̄ + a. The resulting (trust region) problem is:

min_{w∈R^{n−1}} w⊤V⊤(Q + λA)Vw + [2x̄⊤(Q + λA) + (q + λa)⊤]Vw + ℓ(x̄, λ)
s.t. ‖x̄ + Vw‖ ≤ 1,   (9)

where ℓ(x̄, λ) = x̄⊤(Q + λA)x̄ + (q + λa)⊤x̄ − λa0 is constant with respect to the vector of variables w. Let w⋆ be the optimal solution of problem (9). Then, we set z2(λ) = x̄ + Vw⋆ and denote by f2 its objective function value. Finally, f2 is compared with f1. If the former is larger than the latter and z1(λ) ∈ E, then we set λ_max = λ, otherwise we set λ_min = λ. Algorithm 2 will stop at some λ̃ < λ̄. As before, if h is continuous at λ̃, then the bound is exact. Otherwise, we will have again two optimal solutions, one in the interior of E and the other outside E, namely the vector v returned by Algorithm 2. We illustrate all this on Example 2.1.

Example 3.1
The optimal solution of (3) for λ = 1 which violates the second constraint is x_λ̄ = (− . , . ). The lower bound is p(1) = − . . After the addition of the linear inequality (7) obtained with x̄ equal to the projection of x_λ̄ over the boundary of the second constraint, we can run again Algorithm 1 and we get to λ̄ ≈ . and p(λ̄) ≈ − . , which improves the previous lower bound. In Figure 2 we show the linear cut and the two new optimal solutions outside and in the interior of E (x and z, respectively). In the same figure we also display the previous pair of optimal solutions in order to show the progress of the algorithm.
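The change of variable x = x̄ + Vw behind problem (9) can be sketched numerically as follows. This is our own illustration (`reduced_trs_data` and all names are ours), assuming Q and A symmetric as in (1).

```python
import numpy as np
from scipy.linalg import null_space

def reduced_trs_data(Q, q, A, a, a0, xbar, lam):
    """Build the data (B, b, ell, V) of the reduced trust-region problem (9).

    The linear cut (2*A*xbar + a)^T (x - xbar) <= 0 is imposed as an equality
    via x = xbar + V w, with the columns of V an orthonormal basis of the
    null space of (2*A*xbar + a)^T.
    """
    c = 2 * A @ xbar + a
    V = null_space(c.reshape(1, -1))            # n x (n-1), orthonormal columns
    H = Q + lam * A                              # quadratic part at this lambda
    B = V.T @ H @ V                              # reduced Hessian
    b = (2 * xbar @ H + (q + lam * a)) @ V       # reduced linear term
    ell = xbar @ H @ xbar + (q + lam * a) @ xbar - lam * a0   # constant l(xbar, lam)
    return B, b, ell, V
```

For any w, the point x = xbar + V @ w satisfies the cut with equality, and the objective of (8) at x equals w⊤Bw + b⊤w + ell; since V has orthonormal columns, the constraint ‖x̄ + Vw‖ ≤ 1 is again a ball constraint in w.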
Figure 2: First linear cut and the two optimal solutions lying outside E (x) and in the interior of E (z), denoted by ◦ and ×, respectively.

We propose two different ways to further improve the bound. The first one is still based on the addition of a single linear cut but tries to select in some optimal way the point x̄ employed in (8). The second one is based on the addition of a further linear cut.

Set Lb = −∞
Set λ_min = 0, λ_max = λ̄
Let V be a matrix whose columns form a basis for the null space of the vector 2Ax̄ + a
while λ_max − λ_min > ε do
    Set λ = (λ_max + λ_min)/2
    if Problem (3) has no local and nonglobal minimizer then
        Set λ_min = λ
    else
        Let z1(λ) be the local and nonglobal minimizer of problem (3)
        Set f1 = z1(λ)⊤(Q + λA)z1(λ) + (q + λa)⊤z1(λ)
        Let z2(λ) = x̄ + Vw⋆, where w⋆ is the global minimizer of problem (9)
        Set f2 = z2(λ)⊤(Q + λA)z2(λ) + (q + λa)⊤z2(λ)
        if f2 < f1 or z1(λ) ∉ E then
            Set λ_min = λ, v = z2(λ)
        else
            Set λ_max = λ
            if min{f1, f2} > Lb then
                Set Lb = min{f1, f2}
            end if
        end if
    end if
end while
return v, Lb, λ

Algorithm 2: Bound improvement through the additional linear inequality (7).

4.1 Searching for an 'optimal' point x̄

In the previous section we proposed to set x̄ equal to the projection of x_λ̄ over ∂E. However, this choice is not guaranteed to be the one which leads to the best possible bound. We would like to compute the best possible bound which can be obtained by adding a single inequality defined by a supporting hyperplane of E. This is equal to

max_{λ ≥ 0, x̄ ∈ ∂E}  min { x⊤(Q + λA)x + (q + λa)⊤x − λa0 : x⊤x ≤ 1, (2Ax̄ + a)⊤(x − x̄) ≤ 0 }.   (10)

However, we need to establish whether the problem above can be efficiently solved and, more generally, how to perturb a given point x̄ in such a way that the bound can be improved. To this end we observe the following. For some x̄ let λ(x̄) be the value λ computed by Algorithm 2 when (8) is solved. Assuming that we are not in the lucky case when the algorithm returns an exact bound, i.e., h is continuous at λ(x̄), for such λ value there are two optimal solutions of (8). Let v be the optimal solution which violates the second constraint, i.e., v ∉ E (this is the point returned by Algorithm 2). Note that the linear constraint is active at v, i.e.,

(2Ax̄ + a)⊤(v − x̄) = 0.   (11)

Then, we should update x̄ into x̃, in such a way that x̃ ∈ ∂E and, moreover,

(2Ax̃ + a)⊤(v − x̃) = (2Ax̃ + a)⊤v + a⊤x̃ − 2a0 > 0,

i.e., v should also violate the new linear inequality, so that a single optimal solution, belonging to the interior of E, exists at the current λ value and, thus, we can decrease the λ value and increase the lower bound. Then, we should choose a displacement vector d such that

(x̄ + d)⊤A(x̄ + d) + a⊤(x̄ + d) = a0,
[2A(x̄ + d) + a]⊤(v − x̄ − d) > 0,

or, equivalently (expanding and using x̄ ∈ ∂E together with (11)),

2x̄⊤Ad + d⊤Ad + a⊤d = 0,
2(v − x̄)⊤Ad − 2d⊤Ad − 2x̄⊤Ad − a⊤d > 0.

By summing the equality and the inequality, we can rewrite the condition above as follows:

2x̄⊤Ad + d⊤Ad + a⊤d = 0,
2(v − x̄)⊤Ad − d⊤Ad > 0,   (12)

where, ignoring the equality, the left-hand side of the inequality is maximized for d = v − x̄. This suggests to search for the point x̃ through a line search along x̄ + η(v − x̄), η > 0. However, by (11) the direction ∆x = v − x̄ is tangent to ∂E at x̄, so such a displacement leaves ∂E and we cannot simply move x̄ along ∆x: we need to drive back the resulting point over ∂E. We do this by a correction step along the direction z_x̄ = −(2Ax̄ + a). Thus, we consider a displacement

d′ = η(v − x̄) − γ(2Ax̄ + a) = η∆x + γz_x̄.
We denote by γ_{η,x̄,v} the smallest correction needed to reach ∂E, i.e.,

γ_{η,x̄,v} = min { γ > 0 : x̄ + η(v − x̄) − γ(2Ax̄ + a) ∈ ∂E }.   (13)

Note that such γ value may not be defined, but it certainly is if η is small enough. Now we are ready to prove the following theorem, which establishes the existence of a displacement d fulfilling (12).

Theorem 4.1
Let d_{η,x̄,v} = η∆x + γ_{η,x̄,v} z_x̄. Then, for η small enough, (12) holds.

Proof.
Let us substitute d_{η,x̄,v} into (12), writing γ for γ_{η,x̄,v}:

2ηx̄⊤A∆x + 2γx̄⊤Az_x̄ + η²∆x⊤A∆x + 2ηγ∆x⊤Az_x̄ + γ²z_x̄⊤Az_x̄ + ηa⊤∆x + γa⊤z_x̄ = 0,
2η∆x⊤A∆x + 2γ∆x⊤Az_x̄ − η²∆x⊤A∆x − 2ηγ∆x⊤Az_x̄ − γ²z_x̄⊤Az_x̄ > 0.   (14)

Notice that, in view of (11),

2ηx̄⊤A∆x + ηa⊤∆x = η(2Ax̄ + a)⊤∆x = 0,

while

2γx̄⊤Az_x̄ + γa⊤z_x̄ = γ(2Ax̄ + a)⊤z_x̄ = −γ‖2Ax̄ + a‖²,

so that we can rewrite the equality in (14) as

−γ‖2Ax̄ + a‖² + η²∆x⊤A∆x + γ²z_x̄⊤Az_x̄ + 2ηγ∆x⊤Az_x̄ = 0.

Now, let us assume that γ = O(η^r), for r < 2. For suitable constants c1, c2 > 0, the left-hand side above is of the form

−γ‖2Ax̄ + a‖² + η²∆x⊤A∆x + c1η^{2r} + c2η^{r+1},

which cannot be equal to 0 for η small enough. Thus, we must have γ = O(η²). If we substitute this in the left-hand side of the inequality in (14), we see that, for suitable constants c1, c2, c3 > 0, it is at least

2η∆x⊤A∆x − c1η² − η²∆x⊤A∆x − c2η³ − c3η⁴,

which is larger than 0 for η small enough. □

This theorem suggests Algorithm 3 for the refinement of the bound. In line 2, z is initialized with the input point x̄ of Algorithm 2, while z⋆ is initialized with the output point v returned by the same algorithm. The outer while loop of the algorithm (lines 3-16) is repeated until the bound is improved by at least a tolerance value tol. Inside this loop, in line 4 the initial step size η = 1 is set and a new incumbent y ∈ ∂E is computed. The inner while loop (lines 6-14) computes the step size: until the optimal value of problem (9) with x̄ = y and λ = λ̄ is not larger than the current lower bound Lb, we need to decrease the step size and recompute a new incumbent y (lines 9-10), while as soon as the optimal value becomes larger than Lb, we update z (line 12), i.e., a new linear inequality (7) with x̄ = z is computed, which allows to improve the bound. At line 15 we run Algorithm 2 with the new input point and update z⋆, Lb, λ̄ with the output of this algorithm.

In Figure 3 we display the original point v for our example (denoted by ◦) (here η = 1), the direction of the correction step (dotted line), the corrected point (denoted by ×), and, finally, the new linear cut (dashed line). Now we apply Algorithm 3 to our example.

Set Lb_old = −∞
Set z = x̄, z⋆ = v
while Lb − Lb_old > tol do
    Set η = 1 and y = z + η(z⋆ − z) − γ_{η,z,z⋆}(2Az + a) ∈ ∂E
    Set stop = 0
    while stop = 0 do
        Solve problem (9) with x̄ = y and λ = λ̄ and let opt be its optimal value
        if opt ≤ Lb then
            Set η = η/2 and
            y = z + η(z⋆ − z) − γ_{η,z,z⋆}(2Az + a) ∈ ∂E
        else
            Set stop = 1; Set z = y
        end if
    end while
    Set Lb_old = Lb, then run Algorithm 2 with input z ∈ ∂E, λ̄ and update z⋆, Lb, λ̄ with its output
end while
return Lb

Algorithm 3:
Bound improvement through the search for an 'optimal' linear inequality.
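The incumbent update y = z + η(z⋆ − z) − γ_{η,z,z⋆}(2Az + a) requires the correction γ of (13); since ∂E is a quadric, γ is the smallest positive root of a scalar quadratic equation. A sketch (names are ours, not the paper's):

```python
import numpy as np

def gamma_correction(A, a, a0, xbar, v, eta):
    """Smallest gamma > 0 such that xbar + eta*(v - xbar) - gamma*(2*A*xbar + a)
    lies on the boundary of E (definition (13)); returns None when no positive
    root exists.
    """
    c = 2 * A @ xbar + a                  # correction direction (up to sign)
    y = xbar + eta * (v - xbar)           # point after the line-search step
    # boundary condition p(g)^T A p(g) + a^T p(g) = a0 with p(g) = y - g*c:
    alpha = c @ A @ c                     # g^2 coefficient (positive, A is PD)
    beta = -(2 * y @ A @ c + a @ c)       # g coefficient
    delta = y @ A @ y + a @ y - a0        # constant term
    disc = beta**2 - 4 * alpha * delta
    if disc < 0:
        return None
    roots = [(-beta - np.sqrt(disc)) / (2 * alpha),
             (-beta + np.sqrt(disc)) / (2 * alpha)]
    pos = [g for g in roots if g > 0]
    return min(pos) if pos else None
```

A negative discriminant corresponds exactly to the case, noted after (13), in which γ_{η,x̄,v} is not defined for the chosen η.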
Example 4.1
We have that z is initialized with (− . , . ) and Lb with − . . During the execution of Algorithm 3, z and Lb are updated as indicated in Table 1. The final value for λ is . .

Table 1: Values of z and Lb at each iteration of Algorithm 3.

It is also instructive to understand when the correction is unable to further improve the bound. This is illustrated in Figure 4. At the last iteration of Algorithm 3, when z corresponds to the one at the last line of Table 1, the supporting line of E at z is the dashed line in the figure and problem (9) has two optimal solutions (the circles in the figure). Then, it is not possible to improve the bound, since a correction in the direction of one of the two optimal solutions is able to exclude such optimal solution but not the other one. Also note that, by including also the optimal solution in the interior of E (denoted by a cross in the figure), we have three distinct optimal solutions of problem (8). Then, we are unable to make any progress when solving problem (3) when we have two distinct optimal solutions, while when solving problem (8) that happens when we have three distinct optimal solutions.

Figure 3: Original point v (◦), direction of the correction step (dotted line), corrected point (×), and the new linear cut (dashed line).

Interestingly, the best bound obtained in the example is exactly the one obtained for the same problem by the approach proposed in [7], based on the addition of SOC-RLT constraints. Thus, we conjecture that the bound in [7] is equal to the best possible bound which can be obtained by adding a single linear inequality, i.e., bound (10).

Another possible way to improve the bound is by adding a further linear cut to (8), for instance, the one obtained through the projection over ∂E of the optimal solution v ∉ E returned by Algorithm 2. Let x̃ be such projection. Then, we define the following problem:

min x⊤(Q + λA)x + (q + λa)⊤x − λa0
s.t. x⊤x ≤ 1
     (2Ax̄ + a)⊤(x − x̄) ≤ 0
     (2Ax̃ + a)⊤(x − x̃) ≤ 0,   (15)

which is equivalent to problem (3) where

X = { x : (2Ax̄ + a)⊤(x − x̄) ≤ 0, (2Ax̃ + a)⊤(x − x̃) ≤ 0 }.

Again Theorem 2.1 still holds true and function h can be re-defined in the same way. However, a convex reformulation as the one proposed in [7, 14] for problem (8) is not available in this case (unless the two linear inequalities do not intersect in the interior of the unit ball).

Figure 4: Situation at the last iteration of Algorithm 3: it is not possible to cut both optimal solutions of problem (9) (the circles in the figure) by perturbing the current supporting line in either direction.

But in this case the alternative procedure discussed in Section 3 turns out to be useful. We still have that for λ values smaller than but close to λ_max a unique optimal solution exists, belonging to the interior of E, so that also the linear constraints in (15) are not active at it. Thus, such optimal solution must be a local and nonglobal minimizer of problem (3) (recall once again that the globally optimal solution of this problem always violates the second constraint in (1) for all λ < λ̄). As before, for each tested λ value in the while loop we can first check whether a local and nonglobal optimal solution of problem (3) exists, by exploiting the necessary and sufficient condition stated in [15]. If such optimal solution does not exist, then we should set λ_min = λ in Algorithm 2.
Otherwise, if it exists, we compare its value with the value of the optimal solution of (15) where at least one of the two linear constraints is active, i.e., we need to solve the following problem:

min x⊤(Q + λA)x + (q + λa)⊤x − λa0
s.t. x⊤x ≤ 1
     (2Ax̄ + a)⊤(x − x̄) ≤ 0
     (2Ax̃ + a)⊤(x − x̃) ≤ 0
     [(2Ax̄ + a)⊤(x − x̄)] [(2Ax̃ + a)⊤(x − x̃)] = 0.   (16)

A convex reformulation of this problem has been proposed in [17]. Alternatively, one can solve two distinct problems, each imposing that one of the two linear inequalities is active. Each of these problems can be converted into a trust region one with an additional linear inequality, which can be solved in polynomial time through the already mentioned convex reformulation proposed in [7, 14]. As before, the optimal value of this problem is compared with the value of the local and nonglobal minimizer. If the former is larger than the latter and the local and nonglobal minimizer belongs to E, then we set λ_max = λ, otherwise we set λ_min = λ. Stated in another way, we still apply Algorithm 2, with the only difference that now z2(λ) is the solution of problem (16), while z1(λ) is still the local and nonglobal minimizer of the trust region problem (if it exists). We illustrate all this on Example 2.1.

Example 4.2
We add a second linear cut obtained through the projection over ∂E of the optimal solution of problem (8), with λ = 0. , outside E. This leads to a further improvement with λ̄ ≈ . and p(λ̄) ≈ − . , which almost closes the gap. In Figure 5 we show the two linear cuts and the two new optimal solutions outside and in the interior of E (x and z, respectively). Again, we also report the previous pairs of optimal solutions in order to show the progress.

Figure 5: Two linear cuts and the two optimal solutions outside E (x) and in the interior of E (z), denoted by ◦ and ×, respectively.

Of course, also in this case 'optimal' x̄ and x̃ could be searched for. One can combine the contents presented in Section 4.1 and in the current section, by using a technique similar to the one described in the former section to choose an optimal pair of points x̄ and x̃. In particular, once we have computed the bound, we have, as usual, one optimal solution in the interior of E and the other outside E. We denote the latter by v and we observe that at least one of the two linear cuts is active at this point. Then, if only the first cut is active at v, we perturb such cut as indicated in Theorem 4.1, in order to produce a new cut violated by v. If only the second cut is active, then we do the same with such cut. Finally, if both are active we select one of the two cuts and perturb it. After the perturbation, we run again Algorithm 2, and we repeat this procedure until there is a significant reduction of the bound.

Example 4.3
In our example, this refinement is finally able to close the gap and return the exact optimal value − . In Figure 6 we report the result of the first perturbation of the linear cuts. Since only the second linear cut is active at x, in this case the second linear cut is slightly perturbed and becomes equivalent to the tangent to E at the optimal solution ( −√ / , √ / ) of the original problem (1). It is interesting to note that the new optimal solution outside E, indicated by x, lies in a different region with respect to the previous ones and is further from ∂E than they are (the reduction of λ reduces the penalization of points outside E). Such solution is cut by the new linear inequality, obtained by a (not so small) perturbation of the first linear cut, displayed in Figure 7, together with the two new optimal solutions (x and z), now corresponding to the two optimal solutions of problem (1).

Figure 6: Perturbation of the second linear cut and the two new optimal solutions outside E (x) and in the interior of E (z), denoted by ◦ and ×, respectively.

Figure 7: Perturbation of the first linear cut and the two optimal solutions outside E (x) and in the interior of E (z), denoted by ◦ and ×, respectively.

In this section we report the computational results for the proposed bounds over the set of hard instances selected from the random ones generated in [7] and inspired by [11]. More precisely, in [7] 1000 random instances were generated for each size n = 5, 10, 20. Some of these instances have been declared hard ones, namely those for which the bound obtained by adding SOC-RLT constraints was not exact. In particular, these are 38 instances with n = 5, 70 instances with n = 10, and 104 instances with n = 20. Such instances have been made available in GAMS, AMPL, and
COCONUT formats in [10]. We tested our bounds on such instances. All tests have been performed on an Intel Core i7 8550U with 16GB of RAM. All bounds have been coded in
MATLAB. We computed the following bounds:
• LbDual, the dual Lagrangian bound computed through Algorithm 1;
• LbOneCut, the bound obtained by adding a single linear cut and computed through Algorithm 2;
• LbOneOpt, the bound obtained by 'optimizing' the added linear cut as indicated in Algorithm 3;
• LbTwoCut, the bound obtained by adding two linear cuts;
LbTwoOpt the bound obtained by selecting two ’optimal’ cuts.According to what done in [2, 7, 16], an instance is considered to be solved when the rel-ative gap between the lower bound, say LB , and the upper bound, say U B , is not larger than10 − , i.e., U B − LB | U B | ≤ − . We set
UB equal to the lowest value obtained by running, after the addition of the first linear cut, two local searches for the original problem (1): one from the optimal solution of (8) belonging to the interior of E, with the λ value at the end of Algorithm 2, and the other from the optimal solution of the same problem outside E. In Tables 2-4 we report the average and maximum relative gaps for each bound, and the average and maximum computing times, for n = 5, 10,
20, respectively. Moreover, the average computing time for bound
LbTwoOpt is computed only over the instances (87 overall, as we will see) which are not solved by bound
LbTwoCut, while for bounds LbTwoCut and LbTwoOpt the average gap is taken over the instances which were not solved by these bounds. We remark that the bound
LbTwoCut is computed by adding the first cut as in bound
LbOneCut, and then adding a further linear cut through the projection of the optimal solution outside E obtained when computing bound LbOneCut. We could as well choose the 'optimized' cut computed by bound
LbOneOpt as the first cut for bound
LbTwoCut, but we observed that with this choice no improvement over LbOneOpt is obtained. This is related to what was already observed in Example 4.1: bound LbOneOpt cannot be improved any further when there are (at least) two optimal solutions outside E (besides the one in the interior of E). Thus, the second cut is able to remove one of these optimal solutions but not the other, so that the bound cannot be improved. Similarly, for bound LbTwoOpt the two initial cuts are the ones computed for bound
LbTwoCut.

As for the computing times, we observe that these are lower than those reported in [16] for the bound obtained with the addition of SOC-RLT cuts (around 4 s for an instance with n = 20) and for the bound obtained by adding lifted-RLT cuts (around 92 s for an instance with n = 20). They are also lower than those reported in [2] for the bound obtained by adding KSOC cuts (up to 2 s for n = 20 instances). In general, the proposed bounds are very cheap. Only for two instances with n = 20 did LbTwoOpt require times above 1 s (around 1.5 s in both cases). Usually the computing times are (largely) below 1 s. Both the dual Lagrangian bound and the bound obtained by a single linear cut are quite cheap but have poorer performance in terms of relative gap. The bound obtained by Algorithm 3, with an optimized linear cut, is better than the two previous ones in terms of gap but is also more expensive (although still cheap). The bound
LbTwoCut offers a good trade-off between quality and computing time. But a more careful choice of the two linear cuts improves the quality without compromising the computing times. This is confirmed by the results reported for
LbTwoOpt. Although this bound is more expensive than the others, the additional search for 'optimal' linear cuts further increases the quality of the bound. In Table 5 we report the number of solved instances for
LbTwoCut and
LbTwoOpt. According to what is reported in [2], the total number of unsolved instances out of the 212 hard instances is equal to: 133 for the bound proposed in [16] (18 with n = 5, 49 with n = 10, and 66 with n = 20); 85 for the bound proposed in [2] (18 with n = 5, 22 with n = 10, and 45 with n = 20); 56 by considering the best bound between the one in [16] and the one in [2] (10 with n = 5, 15 with n = 10, and 31 with n = 20). For bound LbTwoCut the total number of unsolved instances reduces to 87 (24, 29 and 34 for n = 5, n = 10, and n = 20, respectively). Finally, for bound LbTwoOpt we have the remarkable outcome that there is just one unsolved instance. For the sake of correctness, we should warn that the value
UB in [2, 16] is not computed by running two local searches as done in this paper. It is instead computed from the final solution of the relaxed problem, so that it could be slightly worse, which may justify the larger number of unsolved instances. All the same, the quality of the proposed bounds appears to be quite good.

Bound      Average relative gap (%)   Max relative gap (%)   Average time   Max time
LbDual
LbOneCut
LbOneOpt
LbTwoCut
LbTwoOpt
Table 2: Results for n = 5.

Bound      Average relative gap (%)   Max relative gap (%)   Average time   Max time
LbDual
LbOneCut
LbOneOpt
LbTwoCut
LbTwoOpt
Table 3: Results for n = 10.

Bound      Average relative gap (%)   Max relative gap (%)   Average time   Max time
LbDual
LbOneCut
LbOneOpt
LbTwoCut
LbTwoOpt
Table 4: Results for n = 20.

As a final experiment, we investigate the behaviour of bound
LbTwoOpt over the hardest instances with n = 20.

Bound      n = 5 (out of 38)   n = 10 (out of 70)   n = 20 (out of 104)
LbTwoCut   14                  41                   70
LbTwoOpt   38                  70                   103
Table 5: Number of solved instances for the bounds LbTwoCut and LbTwoOpt.

Among such instances we included, besides the single one for which the relative error is above 10^−, also those (five more overall) for which the relative error is below 10^− but above 10^−. For all these instances, at the last iteration we recorded the following objective function values, corresponding to values of local minimizers of problem (15), which certainly include the global minimizer(s) of that problem:

• the value at the optimal solution of problem (15) belonging to the interior of E;

• the value at the globally optimal solution of the trust region problem obtained by fixing, in problem (15), the first linear cut to an equality, in case such solution fulfills the second linear cut, or, alternatively, the value at the local nonglobal solution of the same problem, in case such a solution exists and fulfills the second linear cut (if the global minimizer does not fulfill the second linear cut, and the local nonglobal minimizer does not exist or does not fulfill the second linear cut, then the value is left undefined);

• the same value as above, but after fixing the second linear cut to an equality in problem (15);

• the value at the globally optimal solution of the trust region problem obtained by fixing both cuts to equalities in problem (15).

Note that two of the four values must be equal. In particular, one of the two equal values is always the first one, attained in the interior of E. But for the six hardest instances we observed that at least three of the above values are equal or very close to each other (and also lower than the UB value). Moreover, for the single instance for which the relative error is above 10^−, it turns out that all four values are very close to each other and all of them are lower than the UB value. These situations can be illustrated through Figures 8a-8c. We remark that such figures are not derived from a real instance, but they are employed to illustrate in 2-D the situations described above with three or four similar values, which turn out to be challenging for bound
LbTwoOpt. The cases with three solutions with equal (or, at least, very close) objective function values are illustrated in Figures 8a and 8b. As usual, the point in the interior of E is denoted by ×, while the others (outside E) are denoted by ◦. We notice that in these cases it is not possible to remove all the solutions outside E by perturbing a single linear cut. Indeed, in both cases the perturbation of a single linear cut is able to remove just one of the two optimal solutions outside E. But it is possible to remove both by perturbing both linear cuts. Instead, Figure 8c illustrates the case where there are four solutions with equal (or, again, very close) values. In this case even the perturbation of both linear cuts is unable to remove all three solutions outside E. This is indeed the situation occurring with the single instance for which the relative error is above 10^−. Note that this situation is, for the case of two linear cuts, the equivalent of the one discussed in Example 4.1 for the case of a single linear cut. The only way to remove all three solutions outside E is through the addition of a further linear cut, but, of course, this leads to a more complex problem with one trust region constraint and three linear inequalities.

Conclusions
In this paper we discussed the CDT problem. First, we re-derived a necessary and sufficient condition for the exactness of the Shor relaxation and of the equivalent dual Lagrangian bound. The condition is based on the existence of multiple solutions of a Lagrangian relaxation. Based on this condition, we proposed to strengthen the dual Lagrangian bound by adding one or two linear cuts. These cuts are based on supporting hyperplanes of one of the two quadratic constraints and are, thus, redundant for the original CDT problem (1). However, the cuts are not redundant for the Lagrangian relaxation, and their addition allows us to improve the bound. We ran different computational experiments over the 212 hard test instances selected from the three thousand randomly generated in [7], reporting gaps and computing times. We have shown that the bounds are computationally cheap and quite effective. In particular, one of them, based on the addition of two linear cuts, is able to solve all but one of the hard instances. We have also investigated when the bounds are unable to make further progress. It turns out that the dual Lagrangian bound is not exact when we arrive at a Lagrangian relaxation with two optimal solutions; for the dual Lagrangian bound with an additional linear cut the same occurs when we arrive at a Lagrangian relaxation with three optimal solutions, while for the dual Lagrangian bound with two additional linear cuts it occurs when we arrive at a Lagrangian relaxation with four optimal solutions. An interesting topic for future research could be that of establishing the relations between the bounds proposed in this work and those presented in the recent literature. Moreover, it would also be interesting to develop procedures able to generate CDT instances for which the bound
LbTwoOpt is unable to return the optimal value.
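The dual Lagrangian bound that the proposed cuts strengthen can be illustrated on a toy instance. The sketch below (in Python, rather than the MATLAB used for the experiments) evaluates the Lagrangian dual function of problem (1) in closed form whenever the Lagrangian Hessian is positive definite, maximizes it over a coarse grid of multipliers, and compares it with a brute-force feasible upper bound. The instance data (Q, q, A, a, a0), the unit-ball radius, and the grid ranges are illustrative assumptions; the grid search is a crude stand-in for Algorithm 1, which is not reproduced here.

```python
import numpy as np

# Toy CDT instance (assumed data, for illustration only):
#   min x'Qx + q'x   s.t.   x'x <= 1,   x'Ax + a'x <= a0
Q = np.array([[0.0, 1.0], [1.0, -1.0]])   # indefinite objective
q = np.array([0.5, -0.3])
A = np.array([[2.0, 0.0], [0.0, 1.0]])    # positive definite, as required
a = np.array([0.2, 0.1])
a0 = 1.0

def dual_value(l1, l2):
    """Lagrangian dual function d(l1, l2); -inf when the Hessian is not PD."""
    H = Q + l1 * np.eye(2) + l2 * A       # Hessian of the Lagrangian
    c = q + l2 * a                        # linear term of the Lagrangian
    if np.linalg.eigvalsh(H)[0] <= 1e-9:  # dual function is -inf otherwise
        return -np.inf
    x = np.linalg.solve(H, -0.5 * c)      # unconstrained minimizer of L(x)
    return x @ H @ x + c @ x - l1 - l2 * a0

# coarse grid search over the two multipliers (a sketch, not Algorithm 1)
grid = np.linspace(0.0, 5.0, 101)
lb = max(dual_value(l1, l2) for l1 in grid for l2 in grid)

# brute-force upper bound: best objective value over feasible grid points
ts = np.linspace(-1.0, 1.0, 201)
ub = np.inf
for x1 in ts:
    for x2 in ts:
        x = np.array([x1, x2])
        if x @ x <= 1.0 and x @ A @ x + a @ x <= a0:
            ub = min(ub, x @ Q @ x + q @ x)

# relative gap used to decide whether an instance counts as solved
rel_gap = (ub - lb) / abs(ub)
print(lb, ub, rel_gap)
```

By weak duality every grid evaluation of the dual function is a valid lower bound, so lb never exceeds the feasible upper bound ub; whether the gap closes depends on the instance, which is exactly the question the linear cuts address.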
References

[1] W. Ai, S. Zhang, Strong duality for the CDT subproblem: A necessary and sufficient condition, SIAM J. Optim., 19(4), 1735–1756 (2009)
[2] K.M. Anstreicher, Kronecker product constraints with an application to the two-trust-region subproblem, SIAM J. Optim., 27(1), 368–378 (2017)
[3] A.I. Barvinok, Feasibility testing for systems of real quadratic equations, Discrete & Computational Geometry, 10, 1–13 (1993)
[4] A. Beck, Y.C. Eldar, Strong duality in nonconvex quadratic optimization with two quadratic constraints, SIAM J. Optim., 17, 844–860 (2006)
[5] D. Bienstock, A note on polynomial solvability of the CDT problem, SIAM J. Optim., 26, 488–498 (2016)
[6] I.M. Bomze, M.L. Overton, Narrowing the difficulty gap for the Celis-Dennis-Tapia problem, Mathematical Programming, 151, 459–476 (2015)
[7] S. Burer, K.M. Anstreicher, Second-order-cone constraints for extended trust-region subproblems, SIAM J. Optim., 23(1), 432–451 (2013)
[8] M.R. Celis, J.E. Dennis, R.A. Tapia, A trust region strategy for nonlinear equality constrained optimization. In: Boggs, P.T., Byrd, R.H., Schnabel, R.B. (eds.) Numerical Optimization 1984, SIAM, Philadelphia (1985)
[9] L. Consolini, M. Locatelli, On the complexity of quadratic programming with two quadratic constraints, Mathematical Programming, 164, 91–128 (2017)
[10] T. Montanher, A. Neumaier, F. Domes, A computational study of global optimization solvers on two trust region subproblems, Journal of Global Optimization, 71, 915–934 (2018)
[11] J.M. Martinez, Local minimizers of quadratic functions on Euclidean balls and spheres, SIAM J. Optim., 4(1), 159–176 (1994)
[12] D.C. Sorensen, Newton's method with a model trust region modification, SIAM J. Numer. Anal., 19, 409–426 (1982)
[13] S. Sakaue, Y. Nakatsukasa, A. Takeda, S. Iwata, Solving generalized CDT problems via two-parameter eigenvalues, SIAM J. Optim., 26, 1669–1694 (2016)
[14] J.F. Sturm, S. Zhang, On cones of nonnegative quadratic functions, Mathematics of Operations Research, 28(2), 246–267 (2003)
[15] J. Wang, Y. Xia, Closing the gap between necessary and sufficient conditions for local nonglobal minimizer of trust region subproblems, SIAM J. Optim., 30(3), 1980–1995 (2020)
[16] B. Yang, S. Burer, A two-variable approach to the two-trust-region subproblem, SIAM J. Optim., 26(1), 661–680 (2016)
[17] Y. Ye, S. Zhang, New results on quadratic minimization, SIAM J. Optim., 14(1), 245–267 (2003)
[18] J. Yuan, M. Wang, W. Ai, T. Shuai, New results on narrowing the duality gap of the extended Celis-Dennis-Tapia problem, SIAM J. Optim., 27(2), 890–909 (2017)

Figure 8: (a) Three optimal solutions, none with both linear cuts active. (b) Three optimal solutions, one with both linear cuts active. (c) Four optimal solutions.