Efficient Incremental Modelling and Solving
Gökberk Koçak, Özgür Akgün, Nguyen Dang, Ian Miguel
School of Computer Science, University of St Andrews, UK
{gk34,ozgur.akgun,nttd,ijm}@st-andrews.ac.uk

Abstract.
In various scenarios, a single phase of modelling and solving is either not sufficient or not feasible to solve the problem at hand. A standard approach to solving AI planning problems, for example, is to incrementally extend the planning horizon and solve the problem of trying to find a plan of a particular length. Indeed, any optimisation problem can be solved as a sequence of decision problems in which the objective value is incrementally updated. Another example is constraint dominance programming (CDP), in which search is organised into a sequence of levels. The contribution of this work is to enable a native interaction between SAT solvers and the automated modelling system Savile Row to support efficient incremental modelling and solving. This allows adding new decision variables, posting new constraints and removing existing constraints (via assumptions) between incremental steps. Two additional benefits of the native coupling of modelling and solving are the ability to retain learned information between SAT solver calls and to enable SAT assumptions, further improving flexibility and efficiency. Experiments on one optimisation problem and five pattern mining tasks demonstrate that the native interaction between the modelling system and SAT solver consistently improves performance significantly.

Keywords:
Constraint Programming · Constraint Modelling · Incremental Solving · Constraint Optimisation · Planning · Data Mining · Itemset Mining · Pattern Mining · Dominance Programming
1 Introduction

When approaching the solution of a class of problems, in many cases a simple single-phase approach works well: formulate a model parameterised on the data that defines an individual instance of the problem class, and solve each instance in a single solving phase. In some scenarios, however, as we will illustrate below, this approach is either not sufficient or not feasible to solve the problem at hand. Instead, a larger or more difficult problem instance is solved as a sequence of smaller or simpler related instances. In this situation, communication between a modelling system that prepares an instance for solution by a low-level solver and the solver itself can become a bottleneck, with much work repeated between consecutive, very similar instances.

Incremental modelling and solving is the process of constructing an initial low-level instance and obtaining further instances in a sequence by modelling and encoding just the differences between the previous and the new instance. Most SAT solvers are capable of working incrementally, by allowing new irrevocable clauses to be appended or by setting assumptions that hold only for a single solver call.
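To make this interface concrete, the following Rust sketch shows the shape of an incremental solving loop. The IncrementalSolver trait and its methods are hypothetical stand-ins for an ipasir-style binding, not the API of any particular solver.

    // Minimal sketch of incremental SAT solving: clauses are permanent once
    // added, while assumptions hold only for the solve call they are passed to.
    enum SolveResult { Sat, Unsat }

    trait IncrementalSolver {
        fn add_clause(&mut self, lits: &[i32]);                  // irrevocable
        fn solve(&mut self, assumptions: &[i32]) -> SolveResult; // per-call
    }

    fn two_calls(solver: &mut dyn IncrementalSolver) {
        solver.add_clause(&[1, 2]);     // permanent: x1 or x2
        let _r1 = solver.solve(&[-1]);  // first call temporarily assumes not x1
        solver.add_clause(&[-2, 3]);    // permanent: not x2 or x3
        let _r2 = solver.solve(&[]);    // the clauses persist, the assumption does not
    }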
To illustrate, consider the task of pattern mining, the process of extracting useful patterns from large data sets. The most well-known pattern mining task, frequent itemset mining [1], requires us to find the sets of items whose number of occurrences together (known as the support) in a transactional database exceeds a specified threshold. Specialised, efficient tools exist for standard pattern mining tasks [26]. However, finding all frequent patterns is rarely useful, since it usually produces a very large volume of results. Rather, an end-user is typically interested in focusing on a much smaller set of patterns for further inspection. One approach is to seek patterns that compactly represent the full set of patterns [23]; another is to consider domain-specific side constraints [4] that further reduce the volume of patterns returned. Both methods require a more sophisticated search for patterns and hence carry an increase in computational cost.

Constraint-based mining [8] offers a general means of modelling more sophisticated pattern mining tasks. Its flexibility means that side constraints can easily be added to the basic model of a pattern mining problem, which is difficult to do with a specialised mining tool. We distinguish local and non-local constraints in modelling pattern mining problems. The former, such as the frequent itemset property, can be expressed simply on a candidate solution, e.g. by constraining the support of a candidate itemset to be equal to or greater than the threshold. Non-local constraints, however, must be expressed between candidate solutions and are therefore more challenging to model. Closed frequent itemset mining [23,13], which is one approach to representing the full set of frequent itemsets more compactly, is an illustrative example: it stipulates that a frequent itemset is closed if its support exceeds that of all of its supersets.

Constraint Dominance Programming (CDP) [19] provides a method of supporting constraints between solutions via dominance blocking constraints: every time a new solution is found, a new blocking constraint is added to disallow solutions that it would dominate. An extension to CDP, CDP+I [11,12] exploits incomparability between solutions (solutions A and B are incomparable if A does not dominate B and B does not dominate A) so that they may be found in batches. The search is organised into levels in which all solutions are incomparable, and hence may be found together through a single call to a solver without the need for additional per-solution blocking constraints. Solving with CDP, which requires posting new constraints after each solution, and solving with CDP+I, which has similar requirements but per batch of solutions, are both examples of incremental modelling and solving.

Other problem types that might be considered for incrementality are constrained optimisation problems (COPs), where an objective function is given in addition to a standard constraint satisfaction problem, and AI planning problems, where we can incrementally extend the planning horizon and solve the problem of trying to find a plan of a particular length.

CP solvers like Minion [9] or Chuffed [7] are typically capable of supporting COP directly in addition to CSP. However, other solver types, such as standard SAT solvers, sometimes lack the facility to represent objective values. Instead of using a standard SAT encoding for the problem, a maximum satisfiability (MaxSAT) encoding can be used to represent the objective function.
However, converting a SAT encoding to a MaxSAT encoding may be time consuming, depending on the size of the instance.
Alternatively, SAT or SMT solvers can be used for optimisation and planning problems via a sequence of solver calls in an incremental structure. The COP can be encoded as a series of CSPs, with a different objective value bound encoded into each CSP. These CSP instances can then be solved for satisfiability. The threshold at which the problem switches from SAT to UNSAT (or the other way around) indicates the proven optimal value for the original COP instance. For example, in a maximisation problem whose optimum is 7, the decision instances with a bound of at most 7 are satisfiable, while those with a larger bound are unsatisfiable. Searching for this threshold requires multiple solver calls, and the search strategy can be adjusted for efficiency.
Contribution
This paper proposes to enable a native interaction between the SAT solver and the automated modelling system that organises the CDP+I mining process and the optimisation process using a SAT backend. This removes a major bottleneck in how consecutive SAT calls are performed. Two additional benefits of this native coupling are the ability to retain learned information between SAT solver calls and to enable SAT assumptions, further improving efficiency by reducing redundant search between levels.

Our experiments on one optimisation problem and five pattern mining tasks demonstrate that the native interaction between the modelling system and SAT solver consistently improves the performance of each system significantly.

2 Background

2.1 The Essence Pipeline and Constraint Dominance Programming

Essence [2] is an abstract high-level constraint specification language. It has the power to represent complex abstract structures, such as sets, multisets, sequences, and partitions. It supports arbitrary nesting of these structures and also supports quantification over decision variables. Hence, the language is ideally suited to expressing data mining problems.
Essence can be refined into a constraint model in Essence Prime [21] using Conjure [2]. Due to the high-level abstract nature of the specification, there are multiple ways of compiling Essence to Essence Prime. Conjure has a number of built-in heuristics to make modelling decisions automatically. Alternatively, the modelling decisions can be selected manually. Savile Row translates Essence Prime into input suitable for a variety of black-box solvers, while applying solver-specific optimisations to the model, such as rewriting constraint expressions, common sub-expression elimination and using Minion to enforce strong levels of consistency in a preprocessing step [22].

A constraint satisfaction problem (CSP) consists of decision variables (V), their domains (D) and problem constraints (C). CDP extends CSPs by adding a dominance relation (R), which defines the condition under which an assignment to the decision variables is dominated by another assignment. In CDP, an assignment is a solution if it is not dominated by any other solution. When enumerating all solutions of a CDP instance, dominance blocking constraints can be generated for each solution as soon as it is found. These constraints will eliminate all future dominated assignments. However, a post-processing step may still be needed [19].
language Essence

letting ITEM be domain int(...)
letting SUPPORT be domain int(...)
given db : mset of set of ITEM
given minSupport : int
find itemset : set of ITEM
find support : SUPPORT
such that
    support = sum entry in db . toInt(itemset subsetEq entry),
    support >= minSupport,
    SideConstraints
dominanceRelation
    (itemset subsetEq fromSolution(itemset)) -> (support != fromSolution(support))
incomparabilityFunction descending |itemset|
Fig. 1: Closed Frequent Itemset Mining in Essence. The dominance relation defines the closedness property between the currently sought solution and the previous solutions via fromSolution. The incomparability function is defined on cardinality using a descending order, since closedness is defined by a superset relation.
CDP+I extends CDP by defining an incomparability function (I), which defines when two assignments are incomparable (mutually non-dominating).

An itemset mining problem can be specified naturally in Essence as a multiset of transactions. Depending on the nature of the mining task, each transaction can be represented using a set of integer item labels, or ornamented (using tuples or records) with additional information such as a class label. Figure 1 presents the specification of the Closed Frequent Itemset Mining problem in three parts. The first part is the declaration of the parameters, the decision variables and any constraints that concern a single solution. The second part gives the dominance relation in terms of previously found solutions. The third part defines the incomparability function, which in this problem declares incomparable any two solutions that have the same itemset cardinality.

Algorithm 1 makes use of both the dominance relation and the incomparability function when solving CDP+I instances. The CDP+I algorithm aims to find all non-dominated solutions. It achieves this by partitioning the search space into levels extracted from the incomparability function. For example, for the closed itemset mining problem, a separate search is conducted for every value in the domain of |itemset|. For every level, we take the base CSP model and start by adding a level restriction constraint to it.
Algorithm 1 The CDP+I algorithm

    (V, D, C, R, I) ← CDP+I instance
    levels ← getLevels(I)
    for l ∈ levels do
        C ← C ∪ levelRestriction(l)
        CSP ← (V, D, C)
        S ← findAllSolutions(CSP)
        B ← generateDominanceBlocking(R, S)
        C ← C − levelRestriction(l)
        C ← C ∪ B

In our running example, adding the level restriction corresponds to posting a cardinality constraint on the itemset. Then, we enumerate all solutions and generate the corresponding dominance blocking constraints. The problem constraints are then updated to remove the level restriction constraint before adding the new dominance blocking constraints.

Previous implementations of CDP+I made a separate solver call for each level when using an AllSAT solver, and a separate solver call for each solution when using a standard SAT solver. This allows for a simple implementation of the CDP+I algorithm at the cost of losing learned clauses between separate solver calls. The performance of modern SAT solvers relies heavily on learned clauses [16]. Section 4 presents our approach for enabling native interaction with SAT and AllSAT solvers. Through the use of assumptions in SAT, we achieve improved performance without changing the high-level problem specifications.
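Algorithm 1 can also be read as the following Rust-style sketch. The types and the helpers find_all_solutions and generate_dominance_blocking are illustrative names for the operations described above, not an existing API.

    // Sketch of the CDP+I loop of Algorithm 1 (all names illustrative).
    struct Constraint;   // a posted constraint
    struct Solution;     // an assignment to the decision variables
    struct Csp { constraints: Vec<Constraint> }

    fn cdp_plus_i(mut csp: Csp, levels: Vec<Constraint>) {
        for level_restriction in levels {
            // Restrict search to one incomparability level,
            // e.g. a fixed itemset cardinality.
            csp.constraints.push(level_restriction);
            let solutions = find_all_solutions(&csp);
            let blocking = generate_dominance_blocking(&solutions);
            // Drop the level restriction; keep the dominance blocking constraints.
            csp.constraints.pop();
            csp.constraints.extend(blocking);
        }
    }

    fn find_all_solutions(_csp: &Csp) -> Vec<Solution> { unimplemented!() }
    fn generate_dominance_blocking(_s: &[Solution]) -> Vec<Constraint> { unimplemented!() }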
The use of Essence for specifying the problems allows access to a large number of different models (via Conjure options), different preprocessing options (via Savile Row options), and different solvers (SAT and AllSAT).

2.2 Optimisation Strategies with a SAT Backend

A COP can be rewritten as a series of CSPs in which the objective function value is encoded differently in each of them. A naive but inefficient approach would be to try all possible values exhaustively and pick the best one that satisfies the instance. Alternatively, we can search for the optimal objective value within its domain. Three search strategies supported by Savile Row can be considered for this purpose, namely Linear, UNSAT and Bisect. They are explained as follows (assuming that we are solving a maximisation problem).

Linear search
Linear search is a straightforward strategy for finding the optimal value. It starts from the lowest value in the objective's domain and increases the bound by one until the problem becomes unsatisfiable; the last satisfiable bound is the optimum.
UNSAT search
This is also a straightforward strategy: it starts from the highest objective function value and decreases it one by one until the problem becomes satisfiable; the first satisfiable bound is the optimum.
Bisect search
This is a binary search strategy, also known as dichotomic search. It starts by splitting the objective function's domain into two halves, giving two CSP instances, each with half of the domain. The satisfiable half is chosen and the same procedure is repeated until the objective function's domain is reduced to a single value (the optimal objective value).
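For illustration, the following sketch implements the three strategies over an abstract satisfiability check. Here sat(b) is a hypothetical callback asking whether the instance has a solution with objective value at least b; it is assumed to be monotone (if sat(b) holds, so does sat(b') for every b' <= b).

    // The three optimisation strategies for a maximisation problem (sketch).
    fn linear_search(lo: i64, hi: i64, sat: impl Fn(i64) -> bool) -> Option<i64> {
        // Raise the bound until the instance becomes unsatisfiable.
        (lo..=hi).take_while(|&b| sat(b)).last()
    }

    fn unsat_search(lo: i64, hi: i64, sat: impl Fn(i64) -> bool) -> Option<i64> {
        // Lower the bound until the instance becomes satisfiable.
        (lo..=hi).rev().find(|&b| sat(b))
    }

    fn bisect_search(mut lo: i64, mut hi: i64, sat: impl Fn(i64) -> bool) -> Option<i64> {
        // Halve the bound's domain, keeping the satisfiable side.
        let mut best = None;
        while lo <= hi {
            let mid = lo + (hi - lo) / 2;
            if sat(mid) { best = Some(mid); lo = mid + 1; } else { hi = mid - 1; }
        }
        best
    }

Bisect needs a number of solver calls logarithmic in the size of the objective domain, whereas the two linear strategies may need linearly many; which is fastest in practice also depends on whether satisfiable or unsatisfiable calls are cheaper for the instance at hand.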
3 Problem Classes

Throughout this paper we experiment on six problem classes to demonstrate the enhancements we introduce. Five of these problem classes are pattern mining problems encoded in CDP+I, and the instances we use are taken from the supplementary material of [12]. The sixth problem class is the Multi-Mode Resource-Constrained Project Scheduling Problem (MRCPSP).

The pattern mining problems are variations of the frequent itemset mining problem, each parameterised over a dataset of transactions. The task is to find a set of frequent items that satisfies minimum value and maximum cost side constraints. In addition, each problem class has a different constraint among assignments, which encodes the dominance relationship.
Closed frequent itemset mining (CFIS)
A frequent itemset is closed if and only if its support is greater than that of all of its supersets [23]. The support of an itemset is the number of transactions in the database in which all of its items occur together. Maximal itemset mining is a similar problem class, where the only difference is that a frequent itemset is maximal if none of its supersets are frequent. We do not include maximal itemset mining in our experiments since it is a simpler version of closed itemset mining.
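As a small worked example (toy data, not from our benchmarks): over the transactions {a,b,c}, {a,b} and {a,c}, the itemset {b} has support 2, but so does its superset {a,b}, so {b} is not closed; {a,b} is closed because its only superset {a,b,c} has support 1. The following self-contained snippet checks these support values.

    // Toy support computation for the closedness example above (illustrative).
    use std::collections::BTreeSet;

    fn support(db: &[BTreeSet<char>], itemset: &BTreeSet<char>) -> usize {
        db.iter().filter(|t| itemset.is_subset(t)).count()
    }

    fn main() {
        let t = |s: &str| s.chars().collect::<BTreeSet<char>>();
        let db = vec![t("abc"), t("ab"), t("ac")];
        assert_eq!(support(&db, &t("b")), 2);
        assert_eq!(support(&db, &t("ab")), 2);  // equal support: {b} is not closed
        assert_eq!(support(&db, &t("abc")), 1); // strictly smaller: {a,b} is closed
    }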
Generator frequent itemset mining (GFIS)
Generator itemsets (also called free itemsets or key itemsets) [5] are frequent itemsets which do not have any frequent subsets with the same support.
Minimal rare itemset mining (MRIM)
A minimal rare itemset is an infrequent itemset whose subsets are all frequent [25].
Closed discriminative itemset mining (DFIS)
Discriminative itemset mining [6] is parameterised over a dataset of transactions that also have a class label (positive/negative). Instead of a single support value, we maintain two support values: the positive support of an itemset is the number of transactions that are labelled positive and have the itemset as a subset. The negative support, similarly, is the number of transactions that are labelled negative and have the itemset as a subset. A discriminative itemset is one where the difference between the positive and the negative support is greater than a given threshold. A closed discriminative itemset is a discriminative itemset whose support is greater than that of all of its supersets.
Relevant subgroup discovery (RSD)
Relevant subgroup discovery [15] is similar to discriminative itemset mining. While discriminative itemset mining reasons on the support numbers of different classes of transactions, relevant subgroup discovery reasons using the actual sets of transactions that provide the support [19]. A relevant subgroup X is an itemset where at least one of the following conditions holds: 1) for positive transactions, no other itemset covers a superset of the transactions covered by X; 2) for negative transactions, no other itemset covers a subset of the transactions covered by X; or 3) for both kinds of transactions, no other itemset that has the same total cover is a superset of X.

Multi-mode resource constrained project scheduling problem (MRCPSP)
This is a variant of the project scheduling problem [14], a classical and well-known optimisation problem in operations research. We are given a number of activities and a set of renewable resources. Each activity is associated with a duration and demands for some resources. The activities cannot be interrupted, and there are precedence constraints which state that some activities can only start once some others have finished. The variant considered in this paper is the multi-mode variant [18], where each activity may have multiple modes. Each mode dictates the duration and resource demands of the activity. The goal is to schedule the activities and choose a mode for each of them so that the makespan (the latest completion time) is minimised. An Essence specification of this problem is presented in Appendix A (Figure 7).
4 Native Interaction

The main CDP+I algorithm (Algorithm 1) and the SAT optimisation backend require multiple solver calls. For CDP+I, a solver call occurs once per level when using an AllSAT solver, and once per solution when using a standard SAT solver. Solutions from a level are used to produce dominance blocking constraints for the next level. Furthermore, level restriction constraints are both added and removed between levels. Likewise, for optimisation problems using a standard SAT backend, multiple solver calls occur when applying the three optimisation strategies to reach the optimal value. In addition to adding temporary constraints, the ability to remove added constraints is also required. Adding constraints during search is relatively common, even without an incremental process. However, removing constraints requires special treatment by the solver in question. A direct implementation of these algorithms would indeed call the solver several times and consequently would not benefit from any learned clauses between solver calls.

There are two main ways of maintaining learned clauses between solver calls. The first option works by extracting learned clauses once the solver finishes the search and post-processing them to keep a relevant subset for a future solver invocation. A similar approach is used in [24] to learn candidate implied constraints from a learning solver. The second option works by keeping the solver active, modifying the active model by posting additional constraints and restarting search. Adding new variables and constraints in this way is a relatively common operation, available in ipasir, an incrementality API for SAT solvers used in SAT competitions [10]. Removing constraints requires the assumptions machinery that is available in most modern SAT solvers. Constraints that are going to be removed are posted as new clauses conditional on an assumption. Hence, when the assumption is lifted (and the constraint is removed), any learned clauses which depend on that assumption can be deactivated.

We define a new API for SAT solvers that shares most of the functionality of ipasir, including methods for adding new clauses, adding assumptions, solving and retrieving solutions. We extend this basic API to also include methods for reporting detailed statistics about learned clauses and the solver's state, in addition to triggering solution callbacks. Our extended API is implemented in the Rust programming language. It works with the SAT solvers GLUCOSE, CADICAL and MINISAT, and the AllSAT solver NBC MINISAT ALL. Our Rust implementation encapsulates the required functionality of these solvers and compiles them into a shared library.
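The removal mechanism can be sketched as follows. The Solver trait mirrors the shape of an ipasir-style interface; all names are illustrative rather than our actual API.

    // Posting a removable constraint by guarding its clauses with a fresh
    // selector literal s: each clause c becomes (not s or c), so the clauses
    // are active only while s is assumed.
    trait Solver {
        fn new_var(&mut self) -> i32;
        fn add_clause(&mut self, lits: &[i32]);
        fn solve(&mut self, assumptions: &[i32]) -> bool;
    }

    fn post_removable(solver: &mut dyn Solver, clauses: &[Vec<i32>]) -> i32 {
        let s = solver.new_var();
        for c in clauses {
            let mut guarded = vec![-s];
            guarded.extend_from_slice(c);
            solver.add_clause(&guarded);
        }
        s // assume s to activate the constraint
    }

    fn example(solver: &mut dyn Solver) {
        let level = post_removable(solver, &[vec![1, 2], vec![-1, 3]]);
        let _ = solver.solve(&[level]); // constraint active
        let _ = solver.solve(&[]);      // constraint inactive; clauses learnt under the
                                        // assumption contain (not level) and no longer
                                        // constrain the search
    }

To retire such a constraint permanently, the unit clause (not s) can be added, after which the solver may discard everything that depends on s.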
The entire pipeline of tools starts with Conjure, which produces an Essence Prime model for each problem class. A modified Savile Row is then used to instantiate the problem class model using a given data file, preprocess it using Minion to shave domains, and then encode it into SAT using the standard encodings found in Savile Row [20]. Prior to our work, Savile Row worked by producing a DIMACS file containing the entire encoding and calling a SAT solver on this file. Thanks to the new API we define and implement, Savile Row now skips building this file and directly makes calls to the SAT solver to create the model.

Our solver API layer is implemented in Rust, while Savile Row is implemented in Java. We use the Java Native Interface (JNI) to integrate the API layer into Savile Row.
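For illustration, a native method exported from the Rust side for Savile Row to call might have the following shape. This is a raw-JNI sketch: the class name savilerow.NativeSolver and the method are hypothetical, and the environment and array parameters are left as opaque pointers that a real implementation would access through a JNI binding.

    // JNI naming convention: Java_<package>_<Class>_<method>. This symbol
    // implements a Java declaration such as
    //   package savilerow;
    //   class NativeSolver { static native void addClause(int[] lits); }
    type JniEnvPtr = *mut core::ffi::c_void;
    type JIntArray = *mut core::ffi::c_void;

    #[no_mangle]
    pub extern "system" fn Java_savilerow_NativeSolver_addClause(
        _env: JniEnvPtr,
        _class: *mut core::ffi::c_void,
        _lits: JIntArray,
    ) {
        // A real implementation would read the literals out of `_lits`
        // via the JNI environment and forward them to the solver.
    }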
5 Experimental Evaluation

To demonstrate the effectiveness of keeping SAT learnt clauses between levels during the optimisation process using native interaction, we evaluate the three optimisation strategies explained in Section 2.2 on 928 MRCPSP instances from PSPLIB [14]. The SAT solver GLUCOSE [3] is combined with each of the three optimisation strategies. We also compare the resulting performance with Open-WBO [17], a MaxSAT solver, and with Chuffed [7], a learning CP solver.

Each run on an instance is given a time limit of one CPU hour and is repeated three times; the average solving time is recorded. The effect of native interaction on GLUCOSE is shown in Figure 2. The results suggest that, for all three strategies, native interaction boosts efficiency significantly on all tested instances.

Comparisons against Open-WBO and Chuffed are plotted in Figure 3. The first plot includes only the default SAT strategies, while the second replaces them with their native equivalents. The results suggest that native interaction creates a drastic performance improvement for the GLUCOSE backend, and the results on these problem instances are competitive with the two established optimisation solvers.
Fig. 2: Solving time of GLUCOSE with versus without native interaction on 928 MRCPSP instances. (Three scatter plots: glucose-bisect, glucose-linear and glucose-UNSAT; axes: solving time without vs. with native interaction.)
Fig. 3: Solving time of GLUCOSE with three settings (bisect, linear and UNSAT), Open-WBO and Chuffed on 928 MRCPSP instances. GLUCOSE's results are shown (a) without and (b) with native interaction. (Axes: instances vs. time in seconds.)
Computational Evaluation with an AllSAT Solver

In order to evaluate the effectiveness of maintaining learned clauses and using SAT assumptions between CDP+I levels, we solve 240 instances across 5 problem classes (see Section 3). Within a 6-hour time limit, the native version solves 210 instances, whereas pure CDP+I solves only 173 instances. We believe this is due to needing fewer search nodes, which is made possible by pruning large parts of the search tree via the learned clauses.

Figure 4 presents the median number of search nodes per level. Since instances have different numbers of levels, we normalise the number of levels on the horizontal axis. The plot also shows that the performance of default CDP+I can vary amongst different instances, while the performance of CDP+I-native is more stable, indicating that CDP+I-native is more robust.
Fig. 4: Median solver nodes per CDP+I level for (a) CFIS, (b) GFIS, (c) MRIM, (d) DFIM, (e) RSD and (f) all problem classes. Error bars range between the 45th and the 55th percentile. The horizontal axis represents normalised levels between instances. Native CDP+I uses significantly fewer search nodes, thanks to accumulated learned clauses between levels.

CDP+I-native uses fewer search nodes than pure CDP+I, due to maintaining a subset of learned clauses between levels. Figure 6a presents a comparison of the total solver run time of the two CDP+I variants on NBC MINISAT ALL and shows that native interaction clearly results in faster run times as well. On PAR2 average (penalised average runtime, counting each timeout as twice the time limit), CDP+I-native spends 493 seconds per instance, whereas pure CDP+I spends 8,210 seconds.
A Case Study on the CFIS Tumor 20% Instance
To evaluate whether keeping learned clauses improves efficiency, we examine one particular instance in detail as a case study. Figure 5 presents two plots. The first shows that CDP+I-native uses fewer search nodes on each level. The second illustrates the increased number of SAT clauses in each level that results from keeping learnt clauses. The improved efficiency seen in the first plot is a direct result of the search space being restricted by the additional clauses.
Fig. 5: A comparison on one CDP+I instance (CFIS, Tumor with 20% frequency) with and without native interaction, using the NBC MINISAT ALL AllSAT solver. Each plot is averaged from a single model and multiple random seeds. The plot on the left shows the number of solver nodes on each level (CDP+I vs. CDP+I-native); the plot on the right shows the total number of SAT clauses on each level.
Computational Evaluation with a Standard SAT Solver
CDP+I on a standard SAT solver operates by generating solution blocking clauses between each solver call within a level. Once a level is completed, the dominance blocking clauses generated by Savile Row are encoded and passed on to the next level. The solution blocking clauses are not encoded again, since they are redundant: they are already implied by the dominance blocking constraints.
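Generating a solution blocking clause is straightforward: it is the disjunction of the negations of the literals describing the found solution, restricted to the variables of interest (for example, the itemset variables). A sketch, with a hypothetical clause-posting callback:

    // Block a found solution by adding the clause (not l1 or ... or not ln).
    fn block_solution(add_clause: &mut dyn FnMut(&[i32]), solution_lits: &[i32]) {
        let blocking: Vec<i32> = solution_lits.iter().map(|&l| -l).collect();
        add_clause(&blocking);
    }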
Implementing a native interactive system on a standard SAT solver brings both costs and benefits to its performance. AllSAT solvers are already capable of keeping learned information within a level, due to their all-solution enumeration behaviour. Native interaction grants the standard SAT solver this capability, in addition to making the learned information persistent between levels. Thus, the increase in the standard SAT solver's performance should be relatively much higher than the increase in the AllSAT solver's performance. However, since we still use solution blocking clauses within a level, and since the system cannot eliminate the redundant solution blocking clauses once the level is done, the standard SAT model might grow far beyond its non-native equivalent. AllSAT solvers are not susceptible to this because they can operate without the use of solution blocking clauses, regardless of whether they use native interaction.

Fig. 6: Comparison between pure CDP+I and CDP+I-native: (a) total solver time using the AllSAT solver NBC MINISAT ALL; (b) total solver time using the standard SAT solver GLUCOSE. The time limit is 6 hours per instance. Each data point is averaged from a single model and multiple random seeds.
Figure 6b compares CDP+I with and without native interaction using the standard SAT solver GLUCOSE. Native interaction increases performance significantly across all instances. The results also suggest that the anticipated decrease in performance due to the growth of the model did not outweigh the increase provided by native interaction.

In this section we have evaluated the effect of native interaction on the performance of CDP+I, conducting our analysis on an AllSAT solver and a standard SAT solver. In the next section we conclude and discuss future work, including the configuration space of CDP+I-native.
6 Conclusions and Future Work

We have proposed and implemented a new native interaction component to bridge the gap between low-level SAT solving and higher-level model compilation in Savile Row. We integrated this component into Savile Row so that it can be used in the CDP+I framework and for optimisation problems. Our experiments on different pattern mining tasks and an optimisation problem (MRCPSP) show that the native component boosts solving performance significantly. This interaction enables access to SAT assumptions to encode level information in a transparent way, and also makes learned information persistent across multiple runs.

Future work includes evaluating the native interaction component on different problem classes. We believe this native interaction can be a viable option for multi-objective optimisation tasks as well. Additionally, there is a large space of possible configurable options yet to cover, including different modelling and reformulation methods, other SAT solvers and SMT solvers.
Acknowledgements
This work is supported by EPSRC grant EP/P015638/1. Nguyen Dang is a Leverhulme Trust Early Career Fellow (ECF-2020-168).
References
1. Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: 20th Int. Conf. Very Large Data Bases, VLDB. vol. 1215, pp. 487–499 (1994)
2. Akgün, Ö., Frisch, A.M., Gent, I.P., Hussain, B.S., Jefferson, C., Kotthoff, L., Miguel, I., Nightingale, P.: Automated symmetry breaking and model selection in Conjure. In: International Conference on Principles and Practice of Constraint Programming. pp. 107–116. Springer (2013)
3. Audemard, G., Simon, L.: On the Glucose SAT solver. International Journal on Artificial Intelligence Tools (01), 1840001 (2018)
4. Bonchi, F., Lucchese, C.: On closed constrained frequent pattern mining. In: Fourth IEEE International Conference on Data Mining (ICDM'04). pp. 35–42. IEEE (2004)
5. Boulicaut, J.F., Bykowski, A., Rigotti, C.: Approximation of frequency queries by means of free-sets. In: European Conference on Principles of Data Mining and Knowledge Discovery. pp. 75–85. Springer (2000)
6. Cheng, H., Yan, X., Han, J., Hsu, C.W.: Discriminative frequent pattern analysis for effective classification. In: 2007 IEEE 23rd International Conference on Data Engineering. pp. 716–725. IEEE (2007)
7. Chu, G., Stuckey, P.J.: Chuffed solver description (2014)
8. De Raedt, L., Guns, T., Nijssen, S.: Constraint programming for itemset mining. In: SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 204–212. ACM (2008)
9. Gent, I.P., Jefferson, C., Miguel, I.: Minion: A fast scalable constraint solver. In: ECAI. vol. 141, pp. 98–102 (2006)
10. Järvisalo, M., Le Berre, D., Roussel, O., Simon, L.: The international SAT solver competitions. AI Magazine (1), 89–92 (2012)
11. Koçak, G., Akgün, Ö., Guns, T., Miguel, I.: Towards improving solution dominance with incomparability conditions: A case-study using generator itemset mining. arXiv preprint arXiv:1910.00505 (2019)
12. Koçak, G., Akgün, Ö., Guns, T., Miguel, I.: Exploiting incomparability in solution dominance: Improving general purpose constraint-based mining. In: ECAI (2020)
13. Koçak, G., Akgün, Ö., Miguel, I., Nightingale, P.: Closed frequent itemset mining with arbitrary side constraints. In: 2018 IEEE International Conference on Data Mining Workshops (ICDMW). pp. 1224–1232. IEEE (2018)
14. Kolisch, R., Sprecher, A.: PSPLIB - a project scheduling problem library: OR software - ORSEP operations research software exchange program. European Journal of Operational Research (1), 205–216 (1997)
15. Lemmerich, F., Rohlfs, M., Atzmueller, M.: Fast discovery of relevant subgroup patterns. In: Twenty-Third International FLAIRS Conference (2010)
16. Marques-Silva, J., Lynce, I., Malik, S.: Conflict-driven clause learning SAT solvers. In: Handbook of Satisfiability, pp. 131–153. IOS Press (2009)
17. Martins, R., Manquinho, V., Lynce, I.: Open-WBO: A modular MaxSAT solver. In: International Conference on Theory and Applications of Satisfiability Testing. pp. 438–445. Springer (2014)
18. Mori, M., Tseng, C.C.: A genetic algorithm for multi-mode resource constrained project scheduling problem. European Journal of Operational Research (1), 134–141 (1997)
19. Negrevergne, B., Dries, A., Guns, T., Nijssen, S.: Dominance programming for itemset mining. In: 2013 IEEE 13th International Conference on Data Mining. pp. 557–566. IEEE (2013)
20. Nightingale, P., Akgün, Ö., Gent, I.P., Jefferson, C., Miguel, I., Spracklen, P.: Automatically improving constraint models in Savile Row. Artificial Intelligence, 35–61 (2017)
21. Nightingale, P., Rendl, A.: Essence' description (2016), arXiv:1601.02865 [cs.AI]
22. Nightingale, P., Spracklen, P., Miguel, I.: Automatically improving SAT encoding of constraint problems through common subexpression elimination in Savile Row. In: International Conference on Principles and Practice of Constraint Programming. pp. 330–340. Springer (2015)
23. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. In: International Conference on Database Theory. pp. 398–416. Springer (1999)
24. Shishmarev, M., Mears, C., Tack, G., de la Banda, M.G.: Learning from learning solvers. In: International Conference on Principles and Practice of Constraint Programming. pp. 455–472. Springer (2016)
25. Szathmary, L., Napoli, A., Valtchev, P.: Towards rare itemset mining. In: 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007). vol. 1, pp. 305–312. IEEE (2007)
26. Zaki, M.J.: Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering (3), 372–390 (2000)

A Essence specification for MRCPSP

language Essence

given nonRenewableResources new type enum
given renewableResources new type enum
given jobs new type enum
given startDummy, endDummy : jobs
given modes new type enum
given renewableLimits : function (total) renewableResources --> int
given nonRenewableLimits : function (total) nonRenewableResources --> int
given successors : function (total) jobs --> set of jobs
given renewableResourceUsage : function (jobs, modes, renewableResources) --> int
given nonRenewableResourceUsage : function (jobs, modes, nonRenewableResources) --> int
given duration : function (jobs, modes) --> int
given horizon : int
letting timesRange be domain int(1..horizon)
find start : function (total) jobs --> timesRange
find mode : function (total) jobs --> modes
find jobActive : function (total) (jobs, timesRange) --> bool
such that
    forAll job : jobs . forAll jobSuccessor in successors(job) .
        start(jobSuccessor) >= start(job) + duration((job, mode(job)))
such that
    forAll job : jobs . forAll time : timesRange .
        jobActive((job, time)) <->
            (time >= start(job) /\ time < start(job) + duration((job, mode(job))))
such that
    forAll resource : nonRenewableResources .
        sum([nonRenewableResourceUsage((job, mode(job), resource)) | job : jobs])
            <= nonRenewableLimits(resource)
such that
    forAll resource : renewableResources . forAll time : timesRange .
        sum([renewableResourceUsage((job, mode(job), resource))
                | job : jobs, jobActive((job, time))])
            <= renewableLimits(resource)
such that start(startDummy) = 1
minimising start(endDummy)