Knowledge engineering mixed-integer linear programming: constraint typology
A PREPRINT
Vicky Mak-Hau
School of Information Technology, Deakin University, Waurn Ponds, VIC 3216, Australia [email protected]
John Yearwood
School of Information Technology, Deakin University, Waurn Ponds, VIC 3216, Australia [email protected]
William Moran
Electrical and Electronic Engineering, The University of Melbourne, Parkville, VIC 3010, Australia [email protected]
March 1, 2021

ABSTRACT
In this paper, we investigate the constraint typology of mixed-integer linear programming (MILP) formulations. MILP is a commonly used mathematical programming technique for modelling and solving real-life scheduling, routing, planning, resource allocation, and timetabling optimization problems, providing optimized business solutions for industry sectors such as manufacturing, agriculture, defence, healthcare, medicine, energy, finance, and transportation. Despite the numerous real-life Combinatorial Optimization Problems found and solved, and the millions yet to be discovered and formulated, the number of types of constraints (the building blocks of an MILP) is relatively small. In the search for a suitable machine-readable knowledge representation for MILPs, we propose an optimization modelling tree, built upon an MILP ontology, that can be used as guidance for automated systems to elicit an MILP model from end-users on their combinatorial business optimization problems. Our ultimate aim is to develop a machine-readable knowledge representation for MILP that allows us to map from an end-user’s natural language description of the business optimization problem to an MILP formal specification.

Keywords Mixed Integer Linear Programming · Constraint Typology · Knowledge Representation

Combinatorial Optimization Problems (COPs) arise in many real-life applications such as scheduling [11, 16], planning [1, 4], resource allocation [7, 10], routing [9, 15], and timetabling [5, 18]. See, for example, [3, 6, 17] for more examples of mathematical programming applied to real-life business COPs where millions or even billions of dollars were saved.

There are several commonly employed solution approaches for COPs. The two main branches are exact methods and heuristic methods. Mathematical programming is an exact method that can provide proven optimal solutions, and even when it fails to produce an optimal solution within a predetermined time and memory limit, it can still provide a proven optimality gap. Heuristic approaches (such as trial-and-error, simulation, learning, meta-heuristics, or custom-made problem-specific heuristics), on the other hand, carry no solution guarantee. Meta-heuristics or learning methods, with parameters properly tuned, may be able to provide reasonably good solutions within a much shorter time. Exact algorithms will always be preferred in applications where a proven optimal solution matters. For instance, in Kidney Exchange Optimization, an increment of one unit in the objective function means that one more kidney transplant can be carried out, which undoubtedly has a significant impact on the health outcome of a patient with kidney failure.

When solving a COP, mathematical programming-based exact algorithms essentially implement an exhaustive tree search with smart pruning strategies. In the case of the Integer Programming (IP) family of methods, the theoretical basis is algebra, whereas in the case of Constraint Programming (CP), the theoretical basis is logical inference. CP and IP each have their strengths and weaknesses. The IP family of methods includes Pure Integer Programming, where all decision variables are integers; Binary Integer Programming, where all decision variables are binary; and Mixed-Integer Linear Programming (MILP), where some decision variables are continuous and the rest are binary or general integers. MILP can also model some nonlinear terms (e.g., quadratic, bilinear, and piecewise linear terms), and is therefore a very practical technique for modelling and solving real-life COPs. This is evidenced by the fact that, in the history of the Franz Edelman Award, 20% of the finalists applied the IP family of methods [6], and that the top two algorithms in the 2nd Nurse Rostering Competition are MILP-based methods [2].

The formal mathematical specification of an MILP problem is given as follows:

$$\{\min, \max\} \; \{\, c \cdot x + d \cdot y \;:\; A x + B y \le f,\; x \in \mathbb{Z}^n_+,\; y \in \mathbb{Q}^p_+ \,\},$$

with $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_p)$ the decision variables; $c$ and $d$ the cost coefficients, for $c$ an $n$-vector of $\mathbb{Q}$ and $d$ a $p$-vector of $\mathbb{Q}$; $A \in \mathbb{Z}^{m \times n}$ and $B \in \mathbb{Q}^{m \times p}$ the constraint coefficient matrices; and $f$ an $m$-vector of $\mathbb{Q}$.
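
To make the specification concrete, the following minimal Python sketch evaluates a candidate point $(x, y)$ against $Ax + By \le f$ and computes $c \cdot x + d \cdot y$. The function name `evaluate` and the toy data are assumptions introduced purely for illustration; this only checks a given point and is not a solver.

```python
def evaluate(c, d, A, B, f, x, y):
    """Return (feasible, objective) for a candidate point (x, y).

    Feasible means: every row of A x + B y <= f holds, x is integer and
    non-negative, and y is non-negative.
    """
    x_ok = all(isinstance(v, int) and v >= 0 for v in x)
    y_ok = all(v >= 0 for v in y)
    rows_ok = all(
        sum(a * v for a, v in zip(a_row, x))
        + sum(b * v for b, v in zip(b_row, y)) <= rhs
        for a_row, b_row, rhs in zip(A, B, f)
    )
    objective = (sum(ci * v for ci, v in zip(c, x))
                 + sum(di * v for di, v in zip(d, y)))
    return (x_ok and y_ok and rows_ok), objective


# Hypothetical toy instance: n = 2 integer variables, p = 1 continuous
# variable, m = 2 constraint rows.
c = [3, 2]
d = [1.5]
A = [[2, 1], [1, 3]]
B = [[1.0], [0.5]]
f = [10, 12]

print(evaluate(c, d, A, B, f, x=[3, 2], y=[2.0]))  # (True, 16.0)
print(evaluate(c, d, A, B, f, x=[4, 2], y=[1.0]))  # (False, 17.5): row 1 gives 11 > 10
```

An MILP solver searches over all such points for the best feasible one; the sketch only shows what "feasible" and "objective value" mean for a single point.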
For a thorough exposition of MILP, see, for example, [12, 14].

An MILP comprises four main components: a set of decision variables, an objective function that is a linear combination of the decision variables, a set of constraints (each containing a linear combination of decision variables), and the index sets that enumerate the decision variables and constraints. We proposed an MILP ontology in [13]; see Figure 1 below.

Figure 1: The proposed ontology of mixed integer linear programming models for Combinatorial Optimization problems. Note: The visualization is courtesy of WebVOWL [8].

Here, we ask a fundamental question: is there a finite number of MILP constraint types, and if so, how many are there? Classic families of COPs such as routing, scheduling, and planning have dozens or hundreds of variations arising from real-life applications. Every COP is different. Numerous MILPs for real-life COPs have been found and solved, and many more are yet to be discovered and formulated. We are not able to examine every MILP ever developed; however, we performed two simple studies to obtain insights: 1) we examined all constraints used in the production planning problems listed in H. Paul Williams’ textbook “Model building in mathematical programming”; 2) we examined the constraints used in a number of publications.

We considered the production planning examples in H. Paul Williams’ book “Model building in mathematical programming”, and observed that all the production planning problems presented therein are covered by constraints that represent limits (bounds), those that model blending from raw materials to products, those that balance two quantities, those that govern logic conditions, and the classic binary integer programming constraints such as set partitioning, set packing, and set covering and their weighted variations. In Figure 2, we present a table listing the meaning of the constraints used and the section number in the textbook where the examples are discussed.
These constraints are reasonably easy to interpret, in the sense that the mathematical specification of each constraint is either very close to, or can be directly translated from, a natural language (NL) description given by a naïve end-user (a domain-expert end-user who is not trained to develop MILP models). We call these explicit constraints.

Figure 2: A table of constraints and the sections in which they are used in the H. Paul Williams book [17] for production planning problems.

There are many ways to classify explicit constraints into different types. For instance, if we classify MILP constraints by their mathematical form, each can only take one of the following forms: $a \cdot x \le b$, $a \cdot x = b$, and $a \cdot x \ge b$ (here, $b \ge 0$). For simplicity, we will write $a \cdot x$ as $ax$ for the rest of the paper.

Type I: Bound constraints (Demand and Supply)
Resource limit (supply) constraints $ax \le b$ and demand constraints $ax \ge d$ are very commonly used in MILPs, particularly in production planning-type problems. For resource limit (supply) constraints, the upper bound on the supply can be fixed (e.g., a knapsack constraint) or can depend on the value of a decision variable. The same holds for the lower bound on a demand that needs to be satisfied.

Type II: Balancing constraints
Equality constraints $ax = b$ have many variations in usage: to balance (equate) input and output quantities, to balance the flow or quantities over two consecutive time periods, to set initial conditions, to assign values, and so on.

Set packing/partitioning/covering constraints
The set packing/partitioning/covering constraints are subtypes of Type I and Type II constraints, typically used for assignment or allocation. They allow us to model the choice of at most/exactly/at least one out of many. The weighted versions of the set packing/partitioning/covering constraints allow us to model the choice of $n > 1$ out of many.

Logic constraints
The three main subtypes of logic constraints are the Big-M, If-then, and Either-or constraints, each of which has a number of varieties. (Some of these varieties were discussed in [13].)

So, the next question is: what about real-life COPs other than the ones in [17], and how do we represent the knowledge in order to enable automatic mapping from business requirements to the mathematical specification (or that of a general-purpose modelling language such as OPL or MiniZinc)?
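
As an illustration of how the set packing/partitioning/covering subtypes described above can be recognised from the mathematical form alone, here is a minimal Python sketch. The function name `classify_row` and the choice to return the right-hand side $n$ alongside the subtype are assumptions made for this example; it presumes all variables in the row are binary.

```python
def classify_row(coeffs, sense, rhs):
    """Classify one row  sum(coeffs[j] * x[j]) <sense> rhs  over binary x.

    Returns (subtype, n), where n is the right-hand side: n == 1 is the
    plain "one out of many" choice and n > 1 the "n out of many" variant.
    Rows that do not fit the pattern are reported as ("other", None).
    """
    subtypes = {
        "<=": "set packing",       # at most n out of many
        "==": "set partitioning",  # exactly n out of many
        ">=": "set covering",      # at least n out of many
    }
    unit_coeffs = all(a in (0, 1) for a in coeffs)
    if not unit_coeffs or sense not in subtypes or rhs < 1:
        return ("other", None)
    return (subtypes[sense], rhs)


print(classify_row([1, 1, 0, 1], "<=", 1))  # ('set packing', 1)
print(classify_row([1, 1, 1], "==", 1))     # ('set partitioning', 1)
print(classify_row([1, 0, 1], ">=", 2))     # ('set covering', 2)
print(classify_row([2, 1], "<=", 1))        # ('other', None)
```

A classification by form like this says nothing about usage, which is the information an end-user actually supplies.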
We designed an Optimization Modelling Tree (OMT) and examined a number of COPs in published journal articles to ascertain whether the OMT is adequate, in the sense that all elements of the MILPs can be found by traversing the tree. The focus of this exercise is to evaluate whether the constraint types and subtypes in the OMT are enough to represent these example COPs.

Figure 3: The Optimization Modelling Tree (OMT)
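
The worked examples that follow cite OMT nodes by number. As a rough aid, the sketch below collects only the node numbers that those examples mention explicitly into a lookup table and counts how often each is used. It is not a reconstruction of the full tree in Figure 3: the descriptions are paraphrased from the text, the names `OMT_NODES`, `EXAMPLES` and `node_usage` are hypothetical, and constraints cited without a node number are left out.

```python
# Node numbers as cited in the worked examples; NOT the complete OMT.
# Where the text cites two numbers together (3 and 9), both carry the
# same description because the text does not separate them.
OMT_NODES = {
    2: "variable upper bound (the bound is itself a decision variable)",
    3: "bound that applies only when a binary variable is on (Big-M); cited with 9",
    7: "upper bound on a storage limit",
    8: "variable lower bound",
    9: "bound that applies only when a binary variable is on (Big-M); cited with 3",
    11: "set packing: at most one out of many",
    12: "balance of quantities between two consecutive time slots",
    13: "equality constraint assigning a quantity",
    14: "inventory balance: previous stock plus production minus consumption",
    17: "set partitioning: exactly one out of many",
    19: "set to zero the variables that represent impossible decisions",
    24: "general if-then constraint",
}

# Node numbers cited per example (reference numbers as in the paper).
EXAMPLES = {
    "chemical production scheduling [16]": [11, 3, 9, 14, 7],
    "supply chain production planning [4]": [12, 13, 2, 8, 3, 9],
    "university course timetabling [5]": [17, 11, 24],
    "multitrip vehicle routing [15]": [17, 19, 2, 11],
}


def node_usage(examples):
    """Count in how many examples each cited OMT node appears."""
    usage = {}
    for nodes in examples.values():
        for node in set(nodes):
            usage[node] = usage.get(node, 0) + 1
    return usage


usage = node_usage(EXAMPLES)
for node in sorted(usage):
    print(node, usage[node], OMT_NODES[node])
```

In this small sample, set packing (Number 11) is cited in three of the four examples.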
A chemical production scheduling example
A chemical production scheduling MILP model is presented in [16]. The problem considers a given planning horizon, partitioned into a number of time slots. The decisions to be made are whether a unit (a machine or piece of equipment) should start processing a task at a particular time slot, what the batch size should be, and the inventory level of each material at each time slot. A basic MILP is presented with four constraints. Constraint Set (1) is a set packing constraint ensuring that each unit starts at most one task at a time (Number 11 on the OMT). Constraint Set (2) is a combined logic (Big-M) and upper/lower bound constraint: if a unit starts processing a task at a given time, then the batch size capacity must be observed; otherwise, the task is not processed on this machine at this time (Numbers 3 and 9 on the OMT). Constraint Set (3) as presented in the paper should have been two constraints. The first equates the inventory (storage) of a material at a time slot to the inventory at the previous time slot plus the new production and minus the consumption (Number 14 on the OMT). The second is an upper bound on the storage limit (Number 7 on the OMT). These constraints are reasonably straightforward for an end-user to describe, and all requirements can be found in the constraint types on the OMT.
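
For readers who prefer symbols, the four requirements just described might be written roughly as below. The notation is hypothetical and deliberately simplified; it is not the notation or the exact formulation of [16]. Here $X_{ijt} \in \{0,1\}$ indicates that unit $j$ starts task $i$ at time slot $t$, $B_{ijt} \ge 0$ is the batch size, $S_{kt} \ge 0$ is the inventory of material $k$ at slot $t$, $\beta^{\min}_{ij}$ and $\beta^{\max}_{ij}$ are assumed batch size limits, $P_{kt}$ and $C_{kt}$ stand for the (linear) production and consumption of material $k$, and $\zeta_k$ is an assumed storage capacity.

```latex
\begin{align*}
  \sum_{i} X_{ijt} &\le 1
    && \forall j, t
    && \text{(set packing, Number 11)} \\
  \beta^{\min}_{ij} X_{ijt} \;\le\; B_{ijt} &\le \beta^{\max}_{ij} X_{ijt}
    && \forall i, j, t
    && \text{(Big-M with bounds, Numbers 3 and 9)} \\
  S_{kt} &= S_{k,t-1} + P_{kt} - C_{kt}
    && \forall k, t
    && \text{(inventory balance, Number 14)} \\
  S_{kt} &\le \zeta_k
    && \forall k, t
    && \text{(storage limit, Number 7)}
\end{align*}
```

In the second line the binary variable acts as a switch: when $X_{ijt} = 0$ the batch size is forced to zero, and when $X_{ijt} = 1$ it must lie between the two limits.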
A supply chain production planning example
An MILP model for mid-term production planning for high-tech low-volume supply chains is presented in [4]. The decision variables are mostly general integer variables, and the six constraint sets are as follows. Constraint Set (1) balances two quantities, specifically quantities between two consecutive time slots (Number 12 on the OMT). Constraint Sets (2) and (5) are equality constraints for assigning quantities (Number 13 on the OMT). Constraint Sets (3) and (6) are variable upper-bounding and lower-bounding constraints (Numbers 2 and 8 on the OMT), whereas Constraint Set (4) carries the logic condition that the upper and lower bounds on the quantity decision variables apply only when the associated binary decision variable is non-zero (Numbers 3 and 9 on the OMT).
A university course timetabling problem example
A university course timetabling problem was modelled as an MILP in [5]. The decision variables are binary. One set of decision variables represents yes/no answers to whether a particular section of a course should be assigned to a particular professor in a particular time slot. Translating them from NL to formal specifications should be reasonably straightforward. Constraint Sets (2) and (3) are Set Partitioning constraints (for the choice of exactly one out of many, Number 17 on the OMT), and Constraint Sets (4) to (9) are Set Packing constraints (for the choice of at most one out of many, Number 11 on the OMT). Constraint Sets (11)–(13) and (15) are general if-then constraints requiring that if X occurs, then both Y and Z must occur. The constraints are in the form $2X \le Y + Z$ (Number 24 on the OMT), although $X \le Y$ and $X \le Z$ are better constraints to use. This brings up an important aspect of knowledge engineering for MILPs: multiple valid MILP constraints exist for the same requirement, and some are stronger for computational use than others. The OMT in its current state has some limitations, as we can see from the next example.

A multitrip vehicle routing problem with time windows example
The COP described in [15] is a routing-type problem. An end-user not trained in MILP does not normally state that a yes/no decision is associated with each pair of locations (e.g., $i$ and $j$, with a yes answer indicating that Location $j$ must be visited immediately after Location $i$). However, commonly used MILPs for routing problems typically use a binary variable for each of these decisions. Once the hurdle of decision variable definition is overcome, the rest of the constraints can be found in the constraint types or subtypes described in the OMT. Constraint Sets (1), (3), and (4) are all Set Partitioning constraints, i.e., to choose exactly one out of many (Number 17 on the OMT). Constraint Set (2) sets to zero the variables that represent impossible decisions (Number 19 on the OMT). Constraint Sets (6) to (8) regulate the arrival time of a vehicle route at a customer, and these constraints are of the if-then subtype found in the OMT. The last constraint set, (10), is a straightforward upper-bounding constraint on the total time used, where the bound itself is a variable (Number 2 on the OMT). Constraint Set (9) is a special type of demand-capacity constraint commonly used in routing-type problems.
The requirement is not trivial for an end-user to describe in NL, but the mathematical constraint itself can be found in the OMT (it is in fact a Set Packing constraint, Number 11 on the OMT). Constraint Set (5) is a flow-balance constraint (which is covered by the OMT); its mathematical meaning is that if a customer is visited, then there must be a customer visited before and one visited after. An end-user would not describe the requirement like this. We call these implicit constraints.

What the OMT contains is not just the mathematical specification of the MILP constraints. Mathematically, $ax \le b$, $ax = b$, and $ax \ge b$ are enough to cover all MILP constraints. However, the OMT we designed branches by usage (or, the meaning of the constraints in application). A beginner MILP modeller, for example, can traverse the tree to elicit business requirements from a non-expert end-user. We have tested some COP instances, and the constraints in the OMT do in fact cover all the “explicit” (or straightforward) constraints. Even the constraints in our test cases that are not straightforward, i.e., the “implicit” constraints, are covered by the OMT, though the mapping mechanism is not represented on the OMT.

We have the same results for the assignment constraints (ACs) and the subtour elimination constraints (SECs) of an asymmetric travelling salesman problem (ATSP): they can appear in the form of Set Partitioning and Set Covering constraints respectively, but the mapping is not explicitly represented in the tree.

In summary, we hypothesize that the OMT is scalable; however, we are unable to prove it in this paper.

Now, what is required appears to be the compilation of a list of mappings for commonly used implicit constraints. For example, the knowledge that “visiting each city exactly once and returning to the home city” is equivalent to “each entity has one that precedes it and one that succeeds it” needs to be represented on the OMT, so that the Set Partitioning and Set Covering constraints can be identified as the right ones to use.
This will be the subject of our next research project: to perform a literature survey of commonly used implicit constraints and the usages they map to, and to represent such a mapping on the OMT in an efficient way.
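
The remark in the timetabling example, that several valid constraints can express the same requirement with different computational strength, can be checked on a small case. The sketch below assumes binary variables X, Y, Z, takes the aggregated form of "if X then both Y and Z" to be $2X \le Y + Z$, and compares it with the disaggregated pair $X \le Y$, $X \le Z$. The function names and the fractional test point are assumptions chosen for illustration.

```python
from itertools import product


def aggregated(x, y, z):
    """Single constraint: 2X <= Y + Z."""
    return 2 * x <= y + z


def disaggregated(x, y, z):
    """Pair of constraints: X <= Y and X <= Z."""
    return x <= y and x <= z


# On binary points the two formulations accept exactly the same assignments.
binary_points = list(product((0, 1), repeat=3))
same_binary_set = all(aggregated(*p) == disaggregated(*p) for p in binary_points)

# In the linear relaxation (variables allowed in [0, 1]) they differ:
# this fractional point satisfies the aggregated form but violates X <= Z.
fractional = (0.5, 1.0, 0.0)

print(same_binary_set)             # True
print(aggregated(*fractional))     # True
print(disaggregated(*fractional))  # False
```

Because the disaggregated pair cuts off fractional points that the aggregated form admits, its linear relaxation is tighter, which is generally what "stronger" means for branch-and-bound based solvers.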
References
[1] K. Akartunali, V. Mak-Hau, and T. Tran. A unified mixed-integer programming model for simultaneous fluence weight and aperture optimization in vmat, tomotherapy, and cyberknife. Computers & Operations Research, 56:134–150, 2015.
[2] S. Ceschia, N. Dang, P. De Causmaecker, S. Haspeslagh, and A. Schaerf. The second international nurse rostering competition. Annals of Operations Research, 274, 03 2019.
[3] D. Chen, R. Batson, and Y. Dang. Applied Integer Programming: Modeling and Solution. Wiley, 2010.
[4] J. T. de Kruijff, C. A. Hurkens, and T. G. de Kok. Integer programming models for mid-term production planning for high-tech low-volume supply chains. European Journal of Operational Research, 269(3):984–997, 2018.
[5] A. Ghoniem, V. Pereira, and H. Gomes Costa. Linear integer model for the course timetabling problem of a faculty in Rio de Janeiro. Advances in Operations Research, 2016:7597062, 2016.
[6] M. Gorman, L. Nittala, and J. Alden. Anatomy of the edelman: Measuring the world’s best analytics projects. INFORMS Journal on Applied Analytics, 50:6:343–403, 2020.
[7] P. Lalbakhsh, V. Mak-Hau, R. Séguin, V. Nguyen, and A. Novak. Capacity analysis for aircrew training schools - estimating optimal manpower flows under time varying policy and resource constraints. In , pages 2285–2296, 2018.
[8] S. Lohmann, S. Negru, F. Haag, and T. Ertl. Visualizing ontologies with VOWL. Semantic Web, 7(4):399–419, 2016.
[9] V. Mak and A. Ernst. New cutting-planes for the time- and/or precedence-constrained atsp and directed vrp. Math Meth Oper Res, 66:69–98, 2007.
[10] V. Mak-Hau. On the kidney exchange problem: cardinality constrained cycle and chain problems on directed graphs: a survey of integer programming approaches. Journal of Combinatorial Optimization, 33(1):35–59, 2017.
[11] V. Mak-Hau, B. Hill, D. Kirszenblat, B. Moran, V. Nguyen, and A. Novak. A simultaneous sequencing and allocation problem for military pilot training: integer programming approaches. Computers & Industrial Engineering, page 107161, 2021.
[12] G. Nemhauser and L. Wolsey. Integer and Combinatorial Optimization. John Wiley & Sons, Ltd, 1988.
[13] B. Ofoghi, V. Mak, and J. Yearwood. A knowledge representation approach to automated mathematical modelling, 2020.
[14] Y. Pochet and L. Wolsey. Production Planning by Mixed Integer Programming. Springer, 2006.
[15] M. P. Seixas and A. B. Mendes. Column generation for a multitrip vehicle routing problem with time windows, driver work hours, and heterogeneous fleet. Mathematical Problems in Engineering, 2013.
[16] S. Velez and C. T. Maravelias. Reformulations and branching methods for mixed-integer programming chemical production scheduling models. Industrial & Engineering Chemistry Research, 52(10):3832–3841, 2013.
[17] H. Williams. Model building in mathematical programming. 01 2013.
[18] W. Zhou, X. You, and W. Fan. A mixed integer linear programming method for simultaneous multi-periodic train timetabling and routing on a high-speed rail network.