Face Dimensions of General-Purpose Cutting Planes for Mixed-Integer Linear Programs
aa r X i v : . [ c s . D M ] N ov Face Dimensions of General-Purpose Cutting Planes forMixed-Integer Linear Programs
Matthias Walter ∗ November 13, 2020
Abstract
Cutting planes are a key ingredient to successfully solve mixed-integer linear programs. Forspecific problems, their strength is often theoretically assessed by showing that they are facet-defining for the corresponding mixed-integer hull. In this paper we experimentally investigatethe dimensions of faces induced by general-purpose cutting planes generated by a state-of-the-art solver. Therefore, we relate the dimension of each cutting plane to its impact in abranch-and-bound algorithm.
We consider the mixed-integer programmax c ⊺ x (1a)s.t. Ax ≤ b (1b) x i ∈ Z ∀ i ∈ I (1c)for a matrix A ∈ R m × n , vectors c ∈ R n and b ∈ R m and a subset I ⊆ { , , . . . , n } of integervariables. Let P := conv { x ∈ R n : x satisfies (1b) and (1c) } denote the corresponding mixed-integer hull . A cutting plane for (1) is an inequality a ⊺ x ≤ β that is valid for P (and possiblyinvalid for some point computed during branch-and-cut). Such a valid inequality induces the face F := { x ∈ P : a ⊺ x = β } of P . For more background on polyhedra we refer to [10].To tackle specific problems, one often tries to find facet-defining inequalities, which are inequal-ities whose induced face F satisfies dim( F ) = dim( P ) −
1. This is justified by the fact that anysystem Cx ≤ d with P = { x : Cx ≤ d } has to contain an inequality that induces F . Since thedimension of a face can vary between − x ∈ P satisfies a ⊺ x = β ) and dim( P ) (all x ∈ P satisfy a ⊺ x = β ), the following hypothesis is a reasonable generalization of facetness as a strengthindicator. Hypothesis 1.
The practical strength of an inequality correlates with the dimension of its inducedface.
There is no unified notion of the practical strength of an inequality, but we will later define onethat is related to its impact in a branch-and-bound algorithm. The main goal of this paper is tocomputationally test this hypothesis for general-purpose cutting planes used in MIP solvers.
Outline.
In Section 2 we present the algorithm we used to compute dimensions. Section 3 isdedicated to the score we used to assess a cutting plane’s impact, and in Section 4 we present ourfindings, in particular regarding Hypothesis 1. ∗ Department of Applied Mathematics, University of Twente, The Netherlands, [email protected] Computing the dimension of a face
This section is concerned about how to effectively compute the dimension of P or of one of itsfaces F induced by some valid inequality a ⊺ x ≤ β . In our experiments we will consider instanceswith several hundreds variables, and hence the enumeration of all vertices of P or F is typicallyimpossible. Instead, we use an oracle-based approach. An optimization oracle for P is a black-boxsubroutine that can solve any linear program over the polyhedron P , i.e., for any given w ∈ R n itcan solve max w ⊺ x s.t. x ∈ P. (2)In case (2) is feasible and bounded, the oracle shall return an optimal solution, and in case it isfeasible and unbounded, it shall return an unbounded direction. Note that for such an oracle weneither require all vertices of P nor all valid (irredundant) inequalities. Indeed, we can use anyMIP solver and apply it to problem (1) with c := w .We now discuss how one can compute dim( P ) by only accessing an optimization oracle. To keepthe presentation simple we assume that P is bounded, although our implementation can handleunbounded polyhedra as well. The algorithm maintains a set X ⊆ P of affinely independent pointsand a system Dx = e of valid equations, where D has full row-rank r . Hence, dim( X ) ≤ dim( P ) ≤ n − r holds throughout and the algorithm works by either increasing dim( X ) or r in every iteration.The details are provided in Algorithm 1. Algorithm 1:
Affine hull of a polytope via optimization oracle
Input:
Optimization oracle for a polytope ∅ 6 = P ⊆ R n . Output:
Affine basis X of aff( P ), non-redundant system Dx = e withaff( P ) = { x ∈ R n : Dx = e } . Initialize X := ∅ and Dx = e empty. while | X | + 1 < n − | rows( D ) | do Compute a direction vector d := aff( X ) ⊥ \ span(rows( D )). Query the oracle to maximize d ⊺ x over x ∈ P and let x + := arg max { d ⊺ x : x ∈ P } . Query the oracle to maximize − d ⊺ x over x ∈ P and let x − := arg min { d ⊺ x : x ∈ P } . if d ⊺ x + = d ⊺ x − then Add equation d ⊺ x = d ⊺ x + to system Dx = e . if X = ∅ then set X := { x + } . end else if X = ∅ then set X := { x + , x − } . else Let γ := d ⊺ x for some x ∈ X . if d ⊺ x + = γ then set X := X ∪ { x + } . else set X := X ∪ { x − } . end end return ( X, Dx = e ) . Proposition 2.
For an optimization oracle for a non-empty polytope P ⊆ R n , Algorithm 1 requires n oracle queries to compute a set X ⊆ P with | X | = dim( P )+1 and a system Dx = e of n − dim( P ) equations satisfying aff( P ) = aff( X ) = { x ∈ R n : Dx = e } . Proof.
Since every point x ∈ X was computed by the optimization oracle, we have X ⊆ P .Moreover, Dx = e is valid for P since for each equation d ⊺ x = γ , min { d ⊺ x : x ∈ P } = γ =max { d ⊺ x : x ∈ P } holds. We now prove by induction on the number of iterations that in everyiteration of the algorithm, the points x ∈ X are affinely independent and that the rows of D arelinearly independent.Initially, this is invariant holds because X = ∅ and D has no rows. Whenever an equation d ⊺ x = d ⊺ x + is added to Dx = e in step 7, the vector d is linearly independent to the rows of D (see step 3). If in step 8, X is initialized with a single point, it is clearly affinely independent.2imilarly, the two points in step 10 also form an affinely independent set since d ⊺ x + > d ⊺ x − implies x + = x − . Suppose X is augmented by ¯ x := x + in step 13 or by ¯ x := x − in step 14. Note that bythe choice of d in step 3, d ⊺ x = γ holds for all x ∈ X . Due to d ⊺ ¯ x = γ , X ∪ { ¯ x } remains affinelyindependent.After the first iteration, we have | X | + | rows( C ) | = 2, and in every further iteration, thisquantity increases by 1. Hence, the algorithm requires n iterations, each of which performs 2oracle queries. The result follows.For a proof that the number 2 n of oracle queries is optimal we refer to [11]. Implementation details.
We now describe some details of the implementation within the soft-ware framework
IPO [12]. In the description of Algorithm 1 we did not specify step 3 precisely.While one may enforce the requirement d / ∈ span(rows( D )) via d ∈ rows( D ) ⊥ , it turned out thatthis is numerically less stable than first computing a basis of aff( X ) ⊥ and selecting a basis element d that is not in the span of D ’s rows. Moreover, we can take d ’s sparsity and other numericalproperties into account. Sparsity can speed-up the overall computation since very sparse objectivevectors are sometimes easier to optimize for a MIP solver. In theory, for all x ∈ X , the products d ⊺ x have the same value. However, due to floating-point arithmetic the computed values maydiffer. It turned out that preferring directions d for which the range of these products is smallhelps to avoid numerical difficulties.In our application we compute the dimension of P and of several of its faces. We can exploitthis by caching all points x ∈ P returned by an optimization oracle in a set ¯ X . Then, for eachface F induced by an inequality a ⊺ x ≤ β we can then compute the set ¯ X F := { x ∈ ¯ X : a ⊺ x = β } .Before querying the oracle we can then check the set ¯ X F for a point with sufficiently large objectivevalue, which saves two calls to the MIP solvers.Let ¯ Dx = ¯ e be the equation system returned by Algorithm 1 for P . Now, for a face F inducedby inequality a ⊺ x ≤ β we can initialize Dx = e in the algorithm by ¯ Dx = ¯ e . Moreover, if a ⊺ x = β is not implied by Dx = e we add this equation to Dx = e as well.Since the algorithm is implemented in floating-point arithmetic, errors can occur which may leadto wrong dimension results. We checked the results using an exact arithmetic implementation ofAlgorithm 1 for easier instances (with less than 200 variables). For these, the computed dimensionvaried by at most 2 from the true dimension.In a first implementation, our code frequently reported dimension −
1, and it turned out thatoften the right-hand side was only slightly larger than needed to make the inequality supporting.Thus, for each inequality a ⊺ x ≤ β , normalized to || a || = 1, we computed β true := max { a ⊺ x : x ∈ P } by a single oracle query. Whenever we observed β < β true − − , we considered the cut a ⊺ x ≤ β as invalid (indicated by the symbol (cid:7) ). If β > β true + 10 − , we declare the cut to benon-supporting. Otherwise, we replace β by β true before running Algorithm 1. In this section we introduce our cut impact measure for indicating, for a given cutting plane a ⊺ x ≤ β , how useful its addition in the context of branch-and-cut is. The main goal of solvingan LP at a branch-and-bound node is to determine a dual bound of the current subproblem. Ifthis bound is at most the value of the best feasible solution known so far, then the subproblemcan be discarded. Thus, we consider the value of such a bound (after adding a certain cut) inrelation to the problem’s optimum z ⋆ := max { c ⊺ x : Ax ≤ b, x i ∈ Z ∀ i ∈ I } and the value z LP := max { c ⊺ x : Ax ≤ b } of the LP relaxation.Our first approach was to just evaluate the dual bound obtained from the LP relaxation Ax ≤ b augmented by a ⊺ x ≤ β . However, adding a single inequality often does not cut off the optimalface of the LP relaxation, which means that the bound does not change. In a second attemptwe tried to evaluate the dual bound of the LP relaxation augmented by a random selection of k cutting planes. However, the variance of the resulting cut impact measure was very large evenafter averaging over 10.000 such selections. 3s a consequence, we discarded cut impact measures based on the combined impact of severalcutting planes. Instead we carried out the following steps for given cutting planes a ⊺ x ≤ β , a ⊺ x ≤ β , . . . , a ⊺ k x ≤ β k :1. Compute the optimum z ⋆ and optimum solution x ⋆ ∈ P .2. Compute z LP .3. For i = 0 , , , . . . , k , solvemax c ⊺ x s.t. Ax ≤ b, x j ∈ Z ∀ j ∈ I and a ⊺ i x ≤ β i if i ≥ x ⋆ as an initial incumbent, with presolve, cutting planes and heuristics disabled andwhere a time limit of 60 seconds is enforced.4. Let N denote the minimum number of branch-and-bound nodes used in any these k + 1 runs.5. For i = 0 , , . . . , k , let z i be the dual bound obtained in run i when stopping after N branch-and-bound nodes. For i ≥
1, the closed gap of cut i is defined as ( z i − z LP ) / ( z ⋆ − z LP ). For i = 0, this yields the closed gap without cuts . Remarks.
The closed gap is essentially the dual bound, normalized such that a value of 0 meansno bound improvement over the root LP bound without cuts and a value of 1 means that theinstance was solved to optimality. The effective limit of N branch-and-bound nodes was introducedsuch that all runs reach this limit. This circumvents the question of how to compare runs in whichthe problem was solved to optimality with those that could not solve it. For all runs, presolveand domain propagation were disabled due to our focus on the branch-and-bound algorithm itself.To avoid interaction with heuristics, the latter were disabled, but the optimal solution x ⋆ wasprovided. In order to test Hypothesis 1 we considered the 65 instances from the MIPLIB 3 [9]. For each ofthem we computed the dimension of the mixed-integer hull P , imposing a time limit of 10 minutes .Moreover, we ran the state-of-the-art solver SCIP [3] and collected all cutting planes generated inthe root node, including those that were discarded by
SCIP ’s cut selection routine . For each ofthe cuts we computed the dimension of its induced face (see Section 2) as well as the closed gapafter processing N (as defined in Section 3) branch-and-bound nodes.While for instances air03 , air05 , nw04 , no cuts were generated by SCIP , our implementationof Algorithm 1 ran into numerical difficulties during the computation of dim( P ) for set1ch . More-over, dim( P ) could not be computed within 10 minutes for instances air04 , arki001 , cap6000 , dano3mip , danoint , dsbmip , fast0507 , gesa2 0 , gesa3 0 , l152lav , mas74 , misc06 , mitre , mkc , mod010 , mod011 , pk1 , pp08aCUTS , pp08a , qnet1 , rentacar , rout and swath .We evaluated the remaining 37 instances, whose characteristic data is shown in Table 1. Wedistinguish how many cuts SCIP found by which cut separation method, in particular to investigatewhether certain separation routines tend to generate cuts with low- or high-dimensional faces.We verified some of the invalid cutting planes manually, i.e., checked that these cuts are gen-erated by
SCIP and that there exists a feasible solution that is indeed cut off. The most likelyreason for their occurrence is that
SCIP performed dual reductions although we disabled them viacorresponding parameters . All experiments were carried out on a single core of an Intel Core i3 CPU running at 2.10 GHz with 8 GB RAM. We disabled presolve, domain propagation, dual reductions, symmetry, and restarts. We set misc/allowweakdualreds and misc/allowstrongdualreds to false . /
0, timeouts / , invalid cuts (cid:7) ), dimension of the mixed-integer hull,and number N of branch-and-bound nodes (see Section 3).– Lifted extended weight inequalities [14, 8, 13]– Complemented Mixed-Integer Rounding (c-MIR) inequalities [7, 14]– { , / } -Chv´atal-Gomory inequalities [5, 2]– Strengthened Chv´atal-Gomory inequalities [6]– Lifted flow-cover inequalities [4]– Multi-commodity flow inequalities [1] Instance Cuts Analyzed by class Failed Dim. B&Btotal / / (cid:7) of P nodes bell3a 25 25 121 53075bell5 53 36 1 11 5 97 7815blend2 8 7 1 245 597dcmulti 172 137 14 21 467 773egout 125 97 1 24 3 41 1enigma 82 59 1 17 5 3 1fiber 533 127 139 7 4 9 207 33 6 1 946 23395fixnet6 772 612 83 42 35 779 53900flugpl 42 42 9 1753gen 28 17 8 3 540 21gesa2 470 16 419 2 6 27 1176 21114gesa3 259 194 2 6 15 38 4 1104 729gt2 143 92 3 39 8 1 188 1harp2 1028 659 210 9 4 112 5 28 1 1300 13748khb05250 122 78 5 39 1229 425lseu 125 35 85 2 3 89 3495markshare1 107 104 3 50 318260markshare2 84 79 3 2 60 286347mas76 225 204 1 20 151 211381misc03 634 3 283 33 311 4 116 13misc07 808 286 34 440 36 12 204 22839mod008 441 119 272 3 47 319 1158modglob 268 186 3 1 78 327 112516noswot 164 151 11 2 120 160344p0033 94 14 40 10 30 27 127p0201 263 12 152 14 78 7 139 21p0282 904 368 441 6 15 1 4 69 282 57p0548 577 159 213 10 16 62 117 520 921p2756 1000 82 96 22 48 55 26 671 2716 15149qiu 63 51 2 10 709 10406qnet1 162 95 10 47 6 4 1233 5rgn 278 168 65 45 160 1691seymour 6246 656 53 5528 9 0 1255 9stein27 886 517 9 360 27 3673stein45 1613 1221 10 382 45 45371vpm1 281 129 32 117 3 288 27881vpm2 353 251 1 30 63 8 286 172271 p2756 (too many failures, see Table 1), blend2 (only 7 cuts analyzed), enigma (dim( P ) = 3 is very small), and for markshare1 , markshare2 and noswot (all cuts were ineffective). Moreover, we present several plots in the Appendix A sincethese are similar to those of other instances.The plots show the dimension of the cuts (horizontal axis, rounded to 19 groups) togetherwith their closed gap (vertical axis, 14 groups) according to Section 3. Each circle correspondsto a nonempty set of cuts, where the segments depict the respective cut classes (see Table 1) andtheir color depicts the number k of cuts, where the largest occurring number L is specified in thecaption. The colors are red (0 . L < k ≤ L ), orange (0 . L < k ≤ . L ), yellow (0 . L < k ≤ . L ),green (0 . L < k ≤ . L ), turquoise (0 . L < k ≤ . L ) and blue (1 ≤ k < . L ). For instance,the circle for fixnet6 containing a red and a turquoise segment subsumes cutting planes with facedimensions between 730 and 777, and closed gap of approximately 0 .
45. As the legend next to theplot indicates, this circle represents k ∈ [109 , k ′ ∈ [13 ,
36] multi-commodityflow cuts. The dashed horizontal line indicates the closed gap without cuts (see Section 3).0 194 389 583 7780 . . . fixnet6 : L = 121 cuts Segment colors for fixnet6 : between 1 and 12 cutsbetween 13 and 36 cutsbetween 37 and 60 cutsbetween 61 and 84 cutsbetween 85 and 108 cutsbetween 109 and 121 cuts0 28 57 86 11500 . misc03 : L = 103 cuts 0 46 93 140 1870 . . . . gt2 : L = 48 cutsThe first three plots already highlight that the results are very heterogeneous: while the facesof the strongest cuts in fixnet6 have a high dimension, the strongest ones for misc03 are noteven supporting. Even when considering non-supporting cuts as outliers, the dimension does notindicate practical strength, as the plot for gt2 shows. A quick look at the other plots lets usconclude that Hypothesis 1 is false — at least for the strength measure from Section 3.6 324 649 974 12990 . . . harp2 : L = 186 cuts 0 79 159 238 3180 . . . mod008 : L = 19 cutsThe dashed line for harp2 shows that adding a single cut does not necessarily help in branch-and-bound, which may be due to side-effects such as different branching decisions. For someinstances, such as mod008 , the cuts’ face dimensions are well distributed. In contrast to this, someinstances exhibit only very few distinct dimension values, e.g., only non-supporting cuts for qiu .Interestingly, the 6246 cuts for seymour induce only empty faces as well as faces with dimensionsbetween 1218 and 1254. This can partially be explained via the cut classes. On the one hand, allgenerated strengthened Chv´atal-Gomory cuts are non-supporting. On the other hand, some of thec-MIR cuts and { , / } -cuts are non-supporting while others induce faces of very high dimension.0 177 354 531 7080 . . . qiu : L = 10 cuts 0 313 627 940 12540.060.080.100.12 seymour : L = 2703 cutsIn general we don’t see an indication that cuts from certain classes induce higher dimensionalfaces than others. At first glance, such a pattern is apparent for misc07 (at dimensions 130–160),however one has to keep in mind that these blue segments constitute a minority of the cuts. Inline with that, the majority of the cuts for fiber is concentrated around dimension 900 with aclosed gap similar to that without cuts.0 50 101 152 2030 . . misc07 : L = 114 cuts 0 236 472 708 9450 . . . fiber : L = 39 cuts7espite the heterogeneity of the results, one observation is common to many instances: thedistribution of the face dimensions is biased towards − ∅ [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) [ % , % ) % ∞ Relative dimension F r e q u e n c y [ % ] Figure 1: Distribution of relative dimensions over the 37 instances from Table 1 except for p2756 (avoiding a bias due to many failures dimension computations). A cut of dimension k in an instancehaving ℓ cuts and with dim( P ) = d contributes 1 / (36 ℓ ) to its bar. For k = − k = d ), thisis ∅ (resp. ∞ ), and it is k/ ( d −
1) otherwise.The first bar in Fig. 1 indicates that for an instance chosen uniformly at random among theones we considered and then a randomly chosen cut for this instance, this cut is non-supportingwith probability greater than 35 %. This is remarkably high and thus we conclude this paperby proposing to investigate means to (heuristically) test for such a situation, with the goal ofstrengthening a non-supporting cutting plane by a reduction its right-hand side.
Acknowledgments.
We thank R. Hoeksma and M. Uetz as well as the
SCIP development team,in particular A. Gleixner, C. Hojny and M. Pfetsch for valuable suggestions on the computationalexperiments and their presentation. 8 eferences [1] Tobias Achterberg and Christian Raack. The MCF-separator: detecting and exploiting multi-commodity flow structures in MIPs.
Mathematical Programming Computation , 2(2):125–165,2010.[2] Alberto Caprara and Matteo Fischetti. {
0, 1/2 } -Chv´atal-Gomory cuts. Mathematical Pro-gramming , 74(3):221–235, 1996.[3] Gerald Gamrath, Daniel Anderson, Ksenia Bestuzheva, Wei-Kun Chen, Leon Eifler, MaximeGasse, Patrick Gemander, Ambros Gleixner, Leona Gottwald, Katrin Halbig, Gregor Hen-del, Christopher Hojny, Thorsten Koch, Pierre Le Bodic, Stephen J. Maher, Frederic Matter,Matthias Miltenberger, Erik M¨uhmer, Benjamin M¨uller, Marc E. Pfetsch, Franziska Schl¨osser,Felipe Serrano, Yuji Shinano, Christine Tawfik, Stefan Vigerske, Fabian Wegscheider, DieterWeninger, and Jakob Witzig. The SCIP Optimization Suite 7.0. Technical report, Optimiza-tion Online, March 2020.[4] Zonghao Gu, George L. Nemhauser, and Martin W. P. Savelsbergh. Lifted flow cover inequal-ities for mixed 0-1 integer programs.
Mathematical Programming , 85(3):439–467, 1999.[5] Arie M.C.A. Koster, Adrian Zymolka, and Manuel Kutschka. Algorithms to Separate { , } -Chv´atal-Gomory Cuts. Algorithmica , 55(2):375–391, 2009.[6] Adam N. Letchford and Andrea Lodi. Strengthening Chv´atal–Gomory cuts and Gomoryfractional cuts.
Operations Research Letters , 30(2):74–82, 2002.[7] Hugues Marchand and Laurence A. Wolsey. Aggregation and Mixed Integer Rounding toSolve MIPs.
Operations Research , 49(3):363–371, 2001.[8] Alexander Martin. Integer programs with block structure, 1999.[9] Cassandra M. Mczeal, Martin W. P. Savelsbergh, and Robert E. Bixby. An updated mixedinteger programming library: MIPLIB 3.0.
Optima , 58:12–15, 1998.[10] Alexander Schrijver.
Theory of Linear and Integer Programming . John Wiley & Sons, Inc.,New York, NY, USA, 1986.[11] Matthias Walter.
Investigating Polyhedra by Oracles and Analyzing Simple Extensions ofPolytopes . PhD thesis, Otto-von-Guericke-Universit¨at Magdeburg, 2016.[12] Matthias Walter. IPO – Investigating Polyhedra by Oracles, 2016. Software available at:bitbucket.org/matthias-walter/ipo/.[13] Robert Weismantel. On the 0/1 knapsack polytope.
Mathematical Programming , 77(3):49–68,1997.[14] Kati Wolter. Implementation of Cutting Plane Separators for Mixed Integer Programs. Mas-ter’s thesis, Technische Universit¨at Berlin, 2006.9
Additional plots