A threshold search based memetic algorithm for the disjunctively constrained knapsack problem
Zequn Wei and Jin-Kao Hao ∗

LERIA, Université d'Angers, 2 Boulevard Lavoisier, 49045 Angers, France

Abstract
The disjunctively constrained knapsack problem (DCKP) consists in packing a subset of pairwisely compatible items in a capacity-constrained knapsack such that the total profit of the selected items is maximized. The DCKP has numerous applications and is computationally challenging (NP-hard). In this work, we present a threshold search based memetic algorithm for solving the DCKP that combines the memetic framework with threshold search to find high quality solutions. Extensive computational assessments on two sets of 6340 benchmark instances from the literature demonstrate that the proposed algorithm is highly competitive with the state-of-the-art methods. In particular, we report 24 and 354 improved best-known results (new lower bounds) for Set I (100 instances) and for Set II (6240 instances), respectively. We analyze the key algorithmic components and shed light on their roles in the performance of the algorithm. The code of our algorithm will be made publicly available.
Keywords: Knapsack problems; Disjunctive constraint; Threshold search; Heuristics.
As a generalization of the conventional 0-1 knapsack problem (KP) [18], the disjunctively constrained knapsack problem (DCKP) is defined as follows. Let V = {1, ..., n} be a set of n items, where each item i ∈ V has a profit p_i > 0 and a weight w_i > 0. Let G = (V, E) be a conflict graph, where V is

∗ Corresponding author.
Email addresses: [email protected] (Zequn Wei), [email protected] (Jin-Kao Hao).
Preprint submitted to Elsevier, 14 January 2021.

the set of n items and an edge {i, j} ∈ E defines the incompatibility of items i and j. Let C > 0 be the capacity of the knapsack. The DCKP is to select a subset S of pairwisely compatible items of V to maximize the total profit of S while ensuring that the total weight of S does not surpass the knapsack capacity C. Formally, the DCKP can be stated as follows.

(DCKP)  Maximize f(S) = Σ_{i=1}^{n} p_i x_i    (1)

subject to  W(S) = Σ_{i=1}^{n} w_i x_i ≤ C,  S ⊆ V,    (2)

x_i + x_j ≤ 1,  ∀ {i, j} ∈ E,    (3)

x_i ∈ {0, 1},  i = 1, ..., n.    (4)

Objective function (1) commits to maximize the total profit of the selected item set S. Constraint (2) ensures that the knapsack capacity constraint is satisfied. Constraints (3), called disjunctive constraints, guarantee that two incompatible items are never selected simultaneously. Constraints (4) force that each item is selected at most once.

It is easy to observe that the DCKP reduces to the NP-hard KP when G is an empty graph. The DCKP is equivalent to the NP-hard maximum weighted independent set problem [17] when the knapsack capacity is unbounded. Moreover, the DCKP is closely related to other combinatorial optimization problems, such as the multiple-choice knapsack problem [18] and the bin packing problem with conflicts [16]. In addition to its theoretical significance, the DCKP is a useful model for practical applications where resources with conflicts cannot be used simultaneously while a given budget envelope cannot be surpassed.

Given the importance of the DCKP, a number of solution methods have been developed, including exact, approximation and heuristic algorithms. As the literature review in Section 2 shows, considerable progress has been continually made since the introduction of the problem.
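For illustration, formulation (1)-(4) can be checked with the following minimal Python sketch (the helper names are ours, not from the paper):

```python
# Minimal sketch of the DCKP model (1)-(4); function names are illustrative.
def is_feasible(x, weights, conflicts, capacity):
    """Check the knapsack constraint (2) and the disjunctive constraints (3)
    for a binary solution vector x."""
    total_weight = sum(w for w, xi in zip(weights, x) if xi)
    if total_weight > capacity:
        return False
    # No two conflicting items may be selected simultaneously.
    return all(not (x[i] and x[j]) for i, j in conflicts)

def profit(x, profits):
    """Objective (1): total profit of the selected items."""
    return sum(p for p, xi in zip(profits, x) if xi)
```

For example, with profits (10, 20, 30), weights (1, 2, 3), capacity 5 and a single conflict {1, 2} (0-indexed as (0, 1)), the solution selecting items 2 and 3 is feasible with profit 50, while selecting items 1 and 2 violates constraint (3).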
Meanwhile, given the NP-hard nature of the problem, more powerful algorithms are still needed to push the limits of existing methods.

In this work, we investigate for the first time the population-based memetic framework [21] for solving the DCKP and design an effective algorithm mixing threshold based local optimization and crossover based solution recombination. The threshold search procedure ensures the main role of search intensification by finding high quality local optimal solutions. The specialized backbone crossover generates promising offspring solutions for search diversification. The algorithm also uses a distance-and-quality strategy for population management, and has the advantage of avoiding the difficult task of parameter tuning.

From the perspective of performance assessment, we apply the proposed algorithm to the two sets of DCKP benchmark instances in the literature. The results show that for the 100 instances of Set I (optimality still unknown), which were commonly tested by heuristic algorithms, our algorithm discovers 24 new best-known results (new lower bounds) and matches the best-known results for the 76 remaining instances. For the 6240 instances of Set II, which were tested by exact algorithms, our algorithm finds 354 improved best lower bounds on the difficult instances whose optimal values are unknown and attains the known optimal results on most of the remaining instances.

The rest of the paper is organized as follows. Section 2 provides a literature review on the DCKP. Section 3 presents the proposed algorithm. Section 4 shows computational results of our algorithm and provides comparisons with the state-of-the-art algorithms. Section 5 analyzes essential components of the algorithm. Finally, Section 6 summarizes the work and provides perspectives for future research.

The DCKP has attracted considerable attention in the past two decades. In this section, we review the related literature for solving the DCKP.
Existing solution methods can be roughly classified into two categories as follows.

(1)
Exact and approximation algorithms: These algorithms are able to guarantee the quality of the solutions they find. In 2002, Yamada et al. [31] introduced the DCKP and proposed the first implicit enumeration algorithm, where the disjunctive constraints are relaxed. In 2007, Hifi and Michrafy [14] introduced three versions of an exact algorithm based on a local reduction strategy. In 2009, Pferschy and Schauer [22] proposed a pseudo-polynomial time and space algorithm for solving three special cases of the DCKP and proved that the DCKP is strongly NP-hard on perfect graphs. In 2016, Salem et al. [27] developed a branch-and-cut algorithm that combines a greedy clique generation procedure with a separation procedure. In 2017, Bettinelli et al. [1] presented a branch-and-bound algorithm that combines an upper bounding procedure, considering both the capacity constraint and the disjunctive constraints, with a branching procedure that employs dynamic programming to presolve the 0-1 KP. They generated 4800 DCKP instances with conflict graph densities between 0.1 and 0.9 (see Section 4.1). Also in 2017, Pferschy and Schauer [23] applied the approximation methods of modular decompositions and clique separators to the DCKP, and showed complexity results on special graph classes. In 2019, Gurski and Rehs [10] designed a dynamic programming algorithm and achieved pseudo-polynomial solutions for the DCKP. In 2020, Coniglio et al. [4] presented another branch-and-bound algorithm based on an n-ary branching scheme and solved the integer linear programming formulations of the DCKP with the CPLEX solver. They introduced 1440 new and challenging DCKP instances (see Section 4.1).

(2) Heuristic algorithms: These algorithms aim to find good near-optimal solutions within a given time. In 2002, Yamada et al. [31] proposed a greedy algorithm to generate an initial solution and a 2-opt neighborhood search algorithm to improve the obtained solution.
In 2006, Hifi and Michrafy [13] reported a local search algorithm, which combines a complementary constructive procedure to improve the initial solution with a degrading procedure to diversify the search. They generated a set of 50 DCKP instances with 500 and 1000 items (see Section 4.1), which was widely tested in later studies. In 2012, Hifi and Otmani [15] studied two scatter search algorithms. In 2014, Hifi [12] devised an iterative rounding search-based algorithm that uses a rounding strategy to perform a linear relaxation of the fractional variables. In 2017, Salem et al. [26] designed a probabilistic tabu search algorithm (PTS) that operates with multiple neighborhoods. In the same year, Quan and Wu investigated two parallel algorithms: the parallel neighborhood search algorithm (PNS) [25] and the cooperative parallel adaptive neighborhood search algorithm (CPANS) [24]. They also designed a new set of 50 large DCKP instances with 1500 and 2000 items (see Section 4.1).

Existing studies have significantly contributed to better solving the DCKP. According to the computational results reported in the literature, the parallel neighborhood search algorithm [25], the cooperative parallel adaptive neighborhood search algorithm [24], and the probabilistic tabu search algorithm [26] can be regarded as the state-of-the-art methods for the instances of Set I. For the instances of Set II, the branch-and-bound algorithms presented in [1,4] and the integer linear programming formulations solved by the CPLEX solver [4] showed the best performance. In this work, we aim to advance the state of the art by proposing the first threshold search based memetic approach, which proves to be effective on the two sets of DCKP instances tested in the literature.
Threshold search based memetic algorithm for the DCKP
Our threshold search based memetic algorithm (TSBMA) for the DCKP is a population-based algorithm combining evolutionary search and local optimization. In this section, we first present the general procedure of the algorithm and then describe its components.
3.1 General procedure

The TSBMA algorithm relies on the general memetic algorithm framework [21] and follows the design principles recommended in [11]. The flowchart of TSBMA and its pseudo-code are shown in Figure 1 and Algorithm 1, respectively.
Fig. 1. Flowchart of the proposed TSBMA algorithm.
The algorithm starts from a set of feasible solutions of good quality that are generated by the population initialization procedure (line 4, Alg. 1, and Section 3.3). The best solution is identified and recorded as the overall best solution S∗ (line 5, Alg. 1). Then the algorithm enters the main ‘while’ loop (lines 6-15, Alg. 1) to perform a number of generations. At each generation, two solutions are randomly picked and used by the crossover operator to create an offspring solution (lines 7-8, Alg. 1, and Section 3.5). Afterwards, the threshold search procedure is triggered to perform local optimization with three neighborhoods N1, N2 and N3 (line 9, Alg. 1, and Section 3.4). After conditionally updating the overall best solution S∗ (lines 11-13, Alg. 1), the diversity-based pool updating procedure is applied to decide whether the best solution S_b found during the threshold search should be inserted into the population (line 14, Alg. 1, and Section 3.6). Finally, when the given time limit t_max is reached, the algorithm returns the overall best solution S∗ found during the search and terminates.

Algorithm 1
Main framework of the threshold search based memetic algorithm for the DCKP
1: Input: Instance I, cut-off time t_max, population POP, the maximum number of iterations IterMax, neighborhoods N1, N2, N3.
2: Output: The overall best solution S∗ found.
3: S∗ ← ∅ /* Initialize S∗ (i.e., f(S∗) = 0) */
4: POP = {S^1, ..., S^{|P|}} ← Population_Initialization(I) /* Section 3.3 */
5: S∗ ← argmax{f(S^k) | k = 1, ..., |P|}
6: while Time ≤ t_max do
7:   Randomly pick two solutions S^i and S^j from the population POP
8:   S^o ← Crossover_Operator(S^i, S^j) /* Section 3.5 */
9:   S_b ← Threshold_Search(S^o, N1-N3, IterMax) /* Section 3.4 */
10:  /* Record the best solution S_b found during threshold search */
11:  if f(S_b) > f(S∗) then
12:    S∗ ← S_b /* Update the overall best solution S∗ found so far */
13:  end if
14:  POP ← Pool_Updating(S_b, POP) /* Section 3.6 */
15: end while
16: return S∗

3.2 Solution representation and search space

The DCKP is a subset selection problem. Thus, a candidate solution for a set V = {1, ..., n} of n items can be conveniently represented by a binary vector S = (x_1, ..., x_n), such that x_i = 1 if item i is selected, and x_i = 0 otherwise. Equivalently, S can also be represented by S = <A, Ā> such that A = {q : x_q = 1 in S} and Ā = {p : x_p = 0 in S}.

Let G = (V, E) be the given conflict graph and C be the knapsack capacity. Our TSBMA algorithm explores the feasible search space Ω_F satisfying both the disjunctive constraints and the knapsack constraint:

Ω_F = {x ∈ {0, 1}^n : Σ_{i=1}^{n} w_i x_i ≤ C; x_i + x_j ≤ 1, ∀ {i, j} ∈ E, 1 ≤ i, j ≤ n, i ≠ j}    (5)

The quality of a solution S in Ω_F is determined by the objective value f(S) of the DCKP (Equation 1).

3.3 Population initialization

The TSBMA algorithm builds each of the |P| initial solutions of the population P in two steps. First, it randomly adds non-selected items one by one into an individual solution S^i (i = 1, ..., |P|) until the capacity of the knapsack is reached, while keeping the disjunctive constraints satisfied. Second, to obtain an initial population of reasonable quality, it improves each solution S^i by a short run of the threshold search procedure (Section 3.4) with IterMax = 2n.

It is worth mentioning that the population size |P| is determined according to the number of candidate items n of the given instance, i.e., |P| = n/100 + 5. This strategy is based on two considerations. First, since the TSBMA algorithm is powerful enough to solve the instances of small size, a smaller population size helps to reduce the initialization time. Second, since the instances of large size are more challenging, a larger population size helps to diversify the search.
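The first initialization step and the population-size rule can be sketched as follows (a simplified illustration; `conflict[i]` is assumed to hold the set of items incompatible with item i):

```python
import random

def random_feasible_solution(n, weights, conflict, capacity, rng):
    """First initialization step: scan the items in random order and add each
    item that keeps both the capacity and the disjunctive constraints satisfied."""
    selected, load = set(), 0
    candidates = list(range(n))
    rng.shuffle(candidates)
    for i in candidates:
        if load + weights[i] <= capacity and conflict[i].isdisjoint(selected):
            selected.add(i)
            load += weights[i]
    return selected

def population_size(n):
    """|P| = n/100 + 5 (integer division assumed), as stated above."""
    return n // 100 + 5
```

In the algorithm, each solution produced this way is then improved by a short threshold search run with IterMax = 2n before entering the population.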
3.4 Threshold search procedure

The local optimization procedure of the TSBMA algorithm relies on the threshold accepting method [7]. To explore a given neighborhood, the method accepts both improving and deteriorating neighbor solutions as long as the solution satisfies a quality threshold. One notices that this method has been successfully applied to several knapsack problems (e.g., the quadratic multiple knapsack problem [3], the multi-constraint knapsack problem [8] and the multiple-choice knapsack problem [33]) and other combinatorial optimization problems (e.g., [2,28]). In this work, we adopt this method for the first time for solving the DCKP and devise a multiple neighborhood threshold search procedure reinforced by an operation-prohibiting mechanism.
3.4.1 General procedure of the threshold search

As shown in Algorithm 2, the threshold search procedure (TSP) starts its process from an input solution and three empty hash vectors (used for the operation-prohibiting mechanism, lines 3-5, Alg. 2). It then performs a number of iterations to explore three neighborhoods (Section 3.4.2) to improve the current solution S. Specifically, for each ‘while’ iteration (lines 9-25, Alg. 2), the TSP procedure explores the neighborhoods N1, N2 and N3 in a deterministic way, as explained in the next section. Any sampled non-prohibited neighbor solution S′ is accepted immediately if the quality threshold T is satisfied (i.e., f(S′) ≥ T). Then the hash vectors are updated for solution prohibition and the best solution found during the TSP procedure is recorded in S_b (lines 18-20, Alg. 2). The main search (‘while’ loop) terminates when 1) no admissible neighbor solution (i.e., non-prohibited and satisfying the quality threshold) exists in the neighborhoods N1, N2 and N3, or 2) the best solution S_b cannot be further improved during IterMax consecutive iterations. Specifically, the quality threshold T is determined adaptively by f(S_b) − n/10 (n being the number of items of the instance), while IterMax is set to a fixed multiple of n/500 + 5.

Algorithm 2
Threshold search procedure
1: Input: Input solution S^o, threshold T, the maximum number of iterations IterMax, hash vectors H1, H2, H3, hash functions h1, h2, h3, length of hash vectors L, neighborhoods N1, N2, N3.
2: Output: The best feasible solution S_b found by the threshold search procedure.
3: for i ← 0 to L − 1 do
4:   H1[i] ← H2[i] ← H3[i] ← 0 /* Initialization of hash vectors */
5: end for
6: S_b ← S^o /* S_b records the best solution found */
7: S ← S^o /* S records the current solution */
8: iter ← 0
9: while iter ≤ IterMax do
10:  Examine the neighborhoods N1(S), N2(S), N3(S) in turn /* Section 3.4.2 */
     /* Each non-prohibited neighbor solution S′ satisfies H1[h1(S′)] ∧ H2[h2(S′)] ∧ H3[h3(S′)] = 0 */
11:  for each non-prohibited S′ of N1(S) or N2(S) or N3(S) do
12:    if f(S′) ≥ T then
13:      S ← S′
14:      H1[h1(S)] ← H2[h2(S)] ← H3[h3(S)] ← 1 /* Update the hash vectors with S, Section 3.4.3 */
15:      break
16:    end if
17:  end for
18:  if f(S) > f(S_b) then
19:    S_b ← S /* Update the best solution S_b found during threshold search */
20:    iter ← 0
21:  else
22:    iter ← iter + 1
23:  end if
24: end while
25: return S_b

3.4.2 Move operators and neighborhoods

The TSP procedure examines candidate solutions by exploring three neighborhoods induced by the popular move operators add, swap and drop. Let S be the current solution and mv be one of these operators. We use S′ = S ⊕ mv to denote a feasible neighbor solution obtained by applying mv to S, and N_x (x = 1, 2, 3) to represent the resulting neighborhoods. To avoid the examination of unpromising neighbor solutions, TSP employs the following dynamic neighborhood filtering strategy inspired by [20,29]. Let S′ be a neighbor solution in the neighborhood currently under examination, and S^c be the best neighbor solution encountered during the current neighborhood examination. Then S′ is excluded from consideration if it is no better than S^c (i.e., f(S′) ≤ f(S^c)). By eliminating unpromising neighbor solutions, TSP increases the efficiency of its neighborhood search. Specifically, the neighborhoods induced by add, swap and drop are defined as follows.

• add(p): This move operator expands the selected item set A by one non-selected item p from the set Ā such that the resulting neighbor solution is feasible. This operator induces the neighborhood N1.

N1(S) = {S′ : S′ = S ⊕ add(p), p ∈ Ā}    (6)

• swap(q, p): This move operator exchanges a pair of items (q, p), where item q belongs to the selected item set A and p belongs to the non-selected item set Ā, such that the resulting neighbor solution is feasible. This operator induces the neighborhood N2.

N2(S) = {S′ : S′ = S ⊕ swap(q, p), q ∈ A, p ∈ Ā, f(S′) > f(S^c)}    (7)

• drop(q): This operator displaces one selected item q from the set A to the non-selected item set Ā and induces the neighborhood N3.

N3(S) = {S′ : S′ = S ⊕ drop(q), q ∈ A, f(S′) > f(S^c)}    (8)

One notices that the add operator always leads to a better current solution with an additional eligible item, and thus neighborhood filtering is not needed for N1. The drop operator always deteriorates the quality of the current solution, and the feasibility of a neighbor solution is always ensured. The swap operator may either increase or decrease the objective value, and the feasibility of a neighbor solution needs to be verified.
For N2 and N3, neighborhood filtering excludes uninteresting solutions that can in no way be accepted during the TSP process.

The TSP procedure examines the neighborhoods N1, N2 and N3 in a token-ring way [5] to explore different local optimal solutions. For N1, as long as there exists a non-prohibited neighbor solution, TSP selects such a neighbor solution to replace the current solution (ties are broken randomly). Once N1 becomes empty, TSP moves to N2; if there exists a non-prohibited neighbor solution S′ satisfying f(S′) ≥ T, TSP selects S′ to become the current solution and immediately returns to the neighborhood N1. When N2 becomes empty, TSP continues its search with N3 and explores N3 exactly like N2. When N3 becomes empty, TSP terminates its search and returns the best solution found S_b. TSP may also terminate if its best solution remains unchanged during IterMax consecutive iterations.
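The search scheme above can be condensed into the following Python sketch. It is a simplification, not the paper's implementation: the three hash vectors of the operation-prohibiting mechanism (Section 3.4.3) are replaced by a single set of visited solutions, and the token-ring order is compressed into "try add moves first, then swap/drop moves", with the adaptive threshold T = f(S_b) − n/10:

```python
import random

def threshold_search(start, profits, weights, conflict, capacity, iter_max, rng):
    """Simplified sketch of the TSP procedure: explore add, then swap/drop
    moves, accepting any non-prohibited neighbor S' with f(S') >= T, where
    T = f(S_b) - n/10.  A single visited-set stands in for the hash vectors."""
    n = len(profits)
    value = lambda s: sum(profits[i] for i in s)
    cur = set(start)
    load = sum(weights[i] for i in cur)
    best = set(cur)
    visited = {frozenset(cur)}
    iters = 0
    while iters < iter_max:
        threshold = value(best) - n / 10
        moved = False
        # N1: add moves (always improving, so no threshold test is needed)
        for i in rng.sample(range(n), n):
            if (i not in cur and load + weights[i] <= capacity
                    and conflict[i].isdisjoint(cur)):
                cand = frozenset(cur | {i})
                if cand not in visited:
                    cur.add(i); load += weights[i]
                    visited.add(cand); moved = True
                    break
        if not moved:
            # N2 (swap) and N3 (drop): accept the first non-prohibited
            # neighbor whose value reaches the quality threshold T
            for q in list(cur):
                rest = cur - {q}
                neighbors = [rest | {p} for p in range(n)
                             if p not in cur and conflict[p].isdisjoint(rest)
                             and load - weights[q] + weights[p] <= capacity]
                neighbors.append(rest)  # the drop(q) move
                for cand in neighbors:
                    fz = frozenset(cand)
                    if fz not in visited and value(cand) >= threshold:
                        cur = set(cand)
                        load = sum(weights[i] for i in cur)
                        visited.add(fz); moved = True
                        break
                if moved:
                    break
        if not moved:
            break  # no admissible neighbor left in N1, N2 or N3
        if value(cur) > value(best):
            best, iters = set(cur), 0
        else:
            iters += 1
    return best
```

On a toy instance with profits (3, 4, 5), unit weights, capacity 2 and a conflict between the second and third items, the sketch escapes the local optimum {item 1, item 2} through a swap accepted under the threshold and reaches the optimal pair of total profit 8.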
3.4.3 Operation-prohibiting mechanism

During the TSP procedure, it is important to prevent the search from revisiting previously encountered solutions. For this purpose, TSP utilizes an operation-prohibiting (OP) mechanism that is based on the tabu list strategy [9]. To implement the OP mechanism, we adopt the solution-based tabu search technique [30]. Specifically, we employ three hash vectors H_v (v = 1, 2, 3) of length L to record previously visited solutions. Given a solution S = (x_1, ..., x_n) (x_i ∈ {0, 1}), we pre-compute for each item i a weight W_i^v = ⌊i^{γ_v}⌋ (v = 1, 2, 3), where each γ_v takes a different predefined value. The hash functions h_v (v = 1, 2, 3) are then defined as follows.

h_v(S) = (Σ_{i=1}^{n} W_i^v × x_i) mod L    (9)

The hash value of a neighbor solution S′ obtained from the add, swap or drop operator can be efficiently computed as follows (x ∈ A, y ∈ Ā, Section 3.2).

h_v(S′) = h_v(S) + W_y^v for the add operator; h_v(S′) = h_v(S) − W_x^v + W_y^v for the swap operator; h_v(S′) = h_v(S) − W_x^v for the drop operator.    (10)

Starting with the hash vectors set to 0, the corresponding positions in the three hash vectors H_v are set to 1 whenever a new neighbor solution S′ is accepted to replace the current solution S (lines 12-16, Alg. 2). For each candidate neighbor solution S′, its hash value h_v(S′) is calculated with Equation (10) in O(1). Then, the neighbor solution S′ is considered previously visited if H1[h1(S′)] ∧ H2[h2(S′)] ∧ H3[h3(S′)] = 1 and is prohibited from consideration by the TSP procedure.

3.5 Crossover operator

The crossover operator generally creates new solutions by recombining two existing solutions. For the DCKP, we adopt the idea of the double backbone-based crossover (DBC) operator [32] and adapt it to the problem.

Given two solutions S^i and S^j, we use them to divide the set of n items into three subsets: the common items set X1 = S^i ∩ S^j, the unique items set X2 = (S^i ∪ S^j) \ (S^i ∩ S^j) and the unrelated items set X3 = V \ (S^i ∪ S^j). The basic idea of the DBC operator is to generate an offspring solution S^o by selecting all items in set X1 (the first backbone) and some items in set X2 (the second backbone), while excluding the items in set X3.

As shown in Algorithm 3, from two randomly selected parent solutions S^i and S^j, the DBC operator generates S^o in three steps. First, we initialize S^o by setting all the variables x_a^o (a = 1, ..., n) to 0 (line 3, Alg. 3). Second, we identify the common items set X1 and the unique items set X2 (lines 4-10, Alg. 3). Third, we add all items belonging to X1 into S^o and randomly add items from X2 into S^o until the knapsack constraint is reached (lines 11-17, Alg.
3). Note that the knapsack and disjunctive constraints are always satisfied during the crossover process.

Since the DCKP is a constrained problem, the DBC operator adopted for TSBMA has several special features to handle the constraints, which differ from the DBC operator introduced in [32]. First, we iteratively add items into S^o by randomly selecting items from the unique items set X2 until the knapsack constraint is reached, while each item in X2 is considered with a probability p (0 < p < 1) in [32]. Second, unlike [32], where a repair operation is used to achieve a feasible offspring solution, our DBC operator ensures the satisfaction of the problem constraints during the offspring generation process.
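The adapted DBC operator can be sketched as follows (set-based representation; `conflict[i]` is assumed to hold the set of items incompatible with item i):

```python
import random

def double_backbone_crossover(si, sj, weights, conflict, capacity, rng):
    """Sketch of the adapted DBC operator: keep every common item (X1),
    then scan the shuffled unique items (X2) and add each one that keeps
    the offspring feasible; unrelated items (X3) are never considered."""
    common = si & sj             # X1: first backbone
    unique = (si | sj) - common  # X2: second backbone candidates
    child = set(common)
    load = sum(weights[i] for i in child)
    pool = list(unique)
    rng.shuffle(pool)
    for i in pool:
        if load + weights[i] <= capacity and conflict[i].isdisjoint(child):
            child.add(i)
            load += weights[i]
    return child
```

Since both parents are feasible, X1 is itself feasible (its items are pairwise compatible and its total weight does not exceed C), so the offspring remains feasible at every step of the construction.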
3.6 Pool updating

Once a new offspring solution is obtained by the DBC operator, it is further improved by the threshold search procedure presented in Section 3.4. Then we adopt a diversity-based population updating strategy [19] to decide whether the improved offspring solution should replace an existing solution in the population. This strategy is beneficial to balance the quality of the offspring solution and its distance to the population.

To accomplish this task, we temporarily insert the improved offspring solution into the population and compute the (Hamming) distance between any two solutions in the population. Then we obtain the goodness score of each solution in the same way as proposed in [19]. Finally, the worst solution in the population is identified according to the goodness score and deleted from the population.

Algorithm 3
The double backbone-based crossover operator
1: Input: Two parent solutions S^i = (x_1^i, ..., x_n^i) and S^j = (x_1^j, ..., x_n^j).
2: Output: An offspring solution S^o = (x_1^o, ..., x_n^o).
3: S^o ← ∅ /* Initialize S^o (i.e., f(S^o) = 0) */
4: for a ← 1 to n do
5:   if x_a^i = 1 and x_a^j = 1 then
6:     X1 ← X1 ∪ {a} /* X1 is the common items set */
7:   else if x_a^i = 1 or x_a^j = 1 then
8:     X2 ← X2 ∪ {a} /* X2 is the unique items set */
9:   end if
10: end for
11: S^o ← X1 /* Add all items belonging to X1 into S^o */
12: Randomly shuffle all items in X2
13: for each a ∈ X2 do
14:   if S^o ∪ {a} is a feasible solution then
15:     x_a^o ← 1
16:   end if
17: end for
18: return S^o

3.7 Computational complexity

As shown in Section 3.3, the population initialization procedure includes two steps. Given a DCKP instance with n items, the first step of random selection takes time O(n). Given an input solution S = <A, Ā> (see Section 3.2), the complexity of one iteration of the TSP procedure is O(n + |A| × |Ā|). Then the second step of the initialization procedure can be realized in O((n + |A| × |Ā|) × IterMax), where
IterMax is set to 2n in the initialization procedure, so the initialization of one solution takes time O((n + |A| × |Ā|) × n).

Now we consider the four procedures in the main loop of the TSBMA algorithm: parent selection, the crossover operator, the TSP procedure and population updating. The parent selection procedure is realized in O(1). The crossover operator takes time O(n). The complexity of the TSP procedure is O((n + |A| × |Ā|) × IterMax), where IterMax is determined in Section 3.4.1. The population updating procedure can be achieved in O(n|P|), where |P| is the population size. Then, the complexity of one iteration of the main loop of the TSBMA algorithm is O(n × IterMax).

4 Computational results and comparisons

In this section, we assess the proposed TSBMA algorithm by performing extensive experiments and making comparisons with state-of-the-art DCKP algorithms. We report computational results on the two sets of 6340 benchmark instances.
4.1 Benchmark instances

The DCKP benchmark instances tested in our experiments were widely used in the literature and can be divided into two sets (see Tables 1 and 2 for the main characteristics of these instances).
Set I (100 instances): These instances are grouped into 20 classes (each with 5 instances) and named xIy (x ∈ {1, ..., 20}, y ∈ {1, ..., 5}). The first 50 instances (1Iy to 10Iy) were introduced in 2006 [13] and have the following features: number of items n = 500 or 1000, capacity C = 1800 or 2000, and density η going from 0.05 to 0.40. Note that the density is given by 2m/(n(n − 1)), where m is the number of disjunctive constraints (i.e., the number of edges of the conflict graph). These instances have item weights w_i uniformly distributed at random and profits p_i = w_i + 10. For the instance classes 11Iy to 20Iy introduced in 2017 [24], the number of items n is set to 1500 or 2000, the capacity C is set to 4000, and the density η ranges from 0.04 to 0.20. These instances also have item weights w_i uniformly distributed at random and profits p_i = w_i + 10.

Set II (6240 instances): This set of instances was introduced in 2017 [1] and expanded in 2020 [4]. For the four correlated instance classes (denoted by CC) and the four random instance classes (denoted by CR), the number of items n is from 60 to 1000, the capacity C is from 150 to 15000, and the density η is from 0.10 to 0.90. Each of these eight classes contains 720 instances. For the correlated instance class SC and the random instance class SR defined on sparse graphs, the number of items n is from 500 to 1000, the capacity C is from 1000 to 2000, and the density η is from 0.001 to 0.05. Each of these two classes contains 240 DCKP instances. More details about this set of instances can be found in [4].

Reference algorithms.
For the 100 DCKP instances of Set I, which were widely tested by heuristic algorithms, we adopt as reference methods three state-of-the-art heuristic algorithms: the parallel neighborhood search algorithm (PNS) [25], the cooperative parallel adaptive neighborhood search algorithm (CPANS) [24], and the probabilistic tabu search algorithm (PTS) [26]. Note that PTS only reported results for the 50 instances 1Iy to 10Iy, since the other 50 instances 11Iy to 20Iy were designed later. For the 6240 DCKP instances of Set II, which until now were only tested by exact algorithms, we cite the results of the three best performing methods: the branch-and-bound algorithms BCM [1] and CFS [4], as well as the integer linear programming formulations of the DCKP solved by the CPLEX solver (ILP) [4].

Table 1. Summary of main characteristics of the 100 DCKP instances of Set I.

Table 2. Summary of main characteristics of the 6240 DCKP instances of Set II.

Class  Total  n (Min-Max)   C (Min-Max)     η (Min-Max)
C10    720    60-1000       1500-10000      0.10-0.90
C15    720    60-1000       15000-15000     0.10-0.90
R10    720    60-1000       1500-10000      0.10-0.90
R15    720    60-1000       15000-15000     0.10-0.90
SC     240    500-1000      1000-2000       0.001-0.05
SR     240    500-1000      1000-2000       0.001-0.05
Computing platform.

The proposed TSBMA algorithm was written in C++ and compiled using the g++ compiler with the -O3 option. All experiments were carried out on an Intel Xeon E5-2670 processor (2.5 GHz CPU and 2 GB RAM) under the Linux operating system. The results of the main reference algorithms were obtained on computing platforms with comparable Intel Xeon processors. The code of our TSBMA algorithm will be made publicly available.
Parameter settings.

The TSBMA algorithm does not require parameter tuning (it is parameter-free). However, for the 6240 instances of Set II (with a wide range of densities and numbers of items), we adjusted the threshold T (see Section 3.4.1) to T = MinP + rand(20), where MinP is the minimum profit value of the instance tested.
Stopping condition.

For the 100 DCKP instances of Set I, the TSBMA algorithm adopted the same cut-off time as the reference algorithms (PNS, CPANS and PTS), i.e., 1000 seconds. Note that for the instances 11Iy to 20Iy, PNS used a much longer limit of 2000 seconds. Given its stochastic nature, TSBMA was run 20 times independently with different random seeds on each instance. For the 6240 instances of Set II, the cut-off time was set to 600 seconds as in the CFS algorithm and the number of repeated runs was set to 10.

4.2 Comparative results

In this section, we first present summarized comparisons of the proposed TSBMA algorithm against each reference algorithm on the 100 instances of Set I, and then show the comparative results on the 6240 DCKP instances of Set II. The detailed computational results of our algorithm and the reference algorithms on the instances of Set I are shown in the Appendix, while our solution certificates for these 100 instances are available at the webpage indicated in footnote 1. For the 6240 instances of Set II, we report their objective values at the same website.
The comparative results of the TSBMA algorithm and each reference algorithm are summarized in Table 3. Column 1 indicates the pairs of compared algorithms and column 2 gives the names of the instance classes. Column 3 shows the quality indicators: the best objective value (f_best) and the average objective value (f_avg) (when the average results are available in the literature). The following columns report, for each indicator, the number of instances on which TSBMA obtains a better, an equal and a worse result than the reference algorithm. The outcomes of the Wilcoxon tests are shown in the last column, where 'NA' means that the two sets of compared results are exactly the same.

From Table 3, one observes that the TSBMA algorithm competes very favorably with all the reference algorithms by reporting improved or equal results on all the instances. Compared to the probabilistic tabu search algorithm (PTS) [26], which reported results only on the first 50 instances of classes 1Iy to 10Iy, TSBMA finds 8 (45) better f_best (f_avg) values, while matching the remaining results. Compared to the two parallel algorithms (PNS) [25] and (CPANS) [24], which reported only the f_best values, TSBMA obtained 35 and 29 better f_best results, respectively. The small p-values (< 0.05) from the Wilcoxon tests between TSBMA and its competitors indicate that the performance differences are statistically significant. Finally, it is remarkable that our TSBMA algorithm discovered 24 new lower bounds on the instances 11Iy to 20Iy (see the detailed results shown in the Appendix).

Table 3
Summarized comparisons of the TSBMA algorithm against each reference algorithm with the p-values of the Wilcoxon signed-rank test on the 100 DCKP instances of Set I.

Algorithm pair         Instances        Indicator  #Better  #Equal  #Worse  p-value
TSBMA vs. PTS [26]     1Iy-10Iy (50)    f_avg      45       5       0       5.34e-9
TSBMA vs. PNS [25]     11Iy-20Iy (50)   f_best     26       24      0       8.25e-6
TSBMA vs. CPANS [24]   11Iy-20Iy (50)   f_best     29       21      0       2.59e-6
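For readers who wish to reproduce such comparisons, the Wilcoxon signed-rank test can be sketched in pure Python with a normal approximation of the test statistic. This is an illustrative sketch, not the exact statistical routine used in our experiments, and the result vectors in the usage note are hypothetical.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are discarded (Wilcoxon's convention).
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    if n == 0:
        return 0.0, 1.0  # identical result vectors (the 'NA' case in Table 3)
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    w = min(w_plus, w_minus)
    # Normal approximation of the null distribution (adequate for n >= ~20).
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mu) / sigma
    p = math.erfc(-z / math.sqrt(2))  # two-sided; w <= mu implies z <= 0
    return w, min(1.0, p)
```

For two clearly shifted result vectors the routine returns a p-value well below the 0.05 significance threshold used in our comparisons.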
To complete the assessment, we provide the performance profiles [6] of the four compared algorithms on the 100 instances of Set I. Basically, the performance profile of an algorithm shows the cumulative distribution of a given performance metric, which reveals the overall performance of the algorithm on a set of instances. In our case, the plots concern the best objective values (f_best) of the compared algorithms, since the average results of some reference algorithms are not available in the literature. Given a set of algorithms (solvers) S and an instance set P, the performance ratio is given by r_{p,s} = max{f_{p,s'} : s' ∈ S}/f_{p,s}, where f_{p,s} is the f_best value of instance p of P obtained by algorithm s of S. The performance profiles are shown in Figure 2, where the performance ratio and the percentage of instances solved by each compared algorithm are displayed on the X-axis and Y-axis, respectively. When the value on the X-axis is 1, the corresponding value on the Y-axis indicates the fraction of instances for which algorithm s reaches the best f_best value among the compared algorithms of S.

From Figure 2, we observe that our TSBMA algorithm has a very good performance on the 100 benchmark instances of Set I compared to the reference algorithms. For the 50 instances 1Iy to 10Iy, TSBMA and CPANS are able to reach 100% of the best f_best values on these 50 instances, while PTS and PNS fail on around 15% of the instances. When considering the 50 instances 11Iy to 20Iy, the plot of TSBMA strictly runs above the plots of PNS and CPANS, revealing that our algorithm dominates the reference algorithms on these 50 instances. These outcomes again confirm the high performance of our TSBMA algorithm.

Fig. 2. Performance profiles of the compared algorithms on the 100 DCKP instances of Set I.
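The profile construction is straightforward to implement. The sketch below computes, for a maximization objective, each algorithm's fraction of instances whose performance ratio falls below a given threshold; the algorithm names and objective values in the test are hypothetical.

```python
def performance_profile(fbest, taus):
    """Performance-profile fractions for a maximization objective.

    fbest maps an algorithm name to its list of best objective values,
    one per instance (same instance order for all algorithms). For each
    algorithm s and threshold tau, returns the fraction of instances
    whose ratio (overall best value) / f_{p,s} is at most tau.
    """
    algos = list(fbest)
    n = len(fbest[algos[0]])
    # Best value found by any algorithm on each instance.
    best = [max(fbest[s][p] for s in algos) for p in range(n)]
    profile = {}
    for s in algos:
        ratios = [best[p] / fbest[s][p] for p in range(n)]
        profile[s] = [sum(r <= t for r in ratios) / n for t in taus]
    return profile
```

By construction the ratio is at least 1, and the profile value at threshold 1 is exactly the fraction of instances on which the algorithm attains the overall best value.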
Table 4 summarizes the comparative results of our TSBMA algorithm on the 6240 instances of Set II, together with the three reference algorithms mentioned in Section 4.2. Note that three ILP formulations were studied in [4]; we extracted the best results of these formulations for Table 4, i.e., the results on the instances CC and CR (conflict graph density from 0.10 to 0.90) with ILP and the results on the very sparse instances SC and SR (conflict graph density from 0.0001 to 0.005) with ILP. Columns 1 and 2 of Table 4 identify each instance class and the total number of instances of the class. Columns 3 to 5 indicate the number of instances solved to optimality by the three reference algorithms. Column 6 shows the number of instances for which our TSBMA algorithm reaches an optimal solution proved by the exact algorithms. The number of new lower bounds (denoted by New LB in Table 4) found by TSBMA is provided in column 7. The best results of the compared algorithms are highlighted in bold. In order to further evaluate the performance of our algorithm, we summarize the available comparative results between TSBMA and the main reference algorithm CFS in columns 8 to 10. The last three rows provide an additional summary of the results for each column.

From Table 4, we observe that TSBMA performs globally very well on the instances of Set II. For the 5760 CC and CR instances, TSBMA reaches most of the proved optimal solutions (5381 out of 5389) and discovers new lower bounds for 323 difficult instances whose optima are still unknown. For the 240 very sparse SC instances, TSBMA matches 195 out of 200 proved optimal solutions and finds 24 new lower bounds for the remaining instances. Although TSBMA successfully solves only 9 out of the 229 solved very sparse SR instances, it discovers 7 new lower bounds.
The high performance of TSBMA is further evidenced by the comparison with the best exact algorithm CFS (last three columns).

Notice that the performance of CPLEX with ILP is better than that of TSBMA as well as the two reference algorithms BCM and CFS on the two classes of very sparse instances (SC and SR). As analyzed in [4], one of the main reasons is that the LP relaxation of ILP provides a very strong upper bound, which makes the ILP formulation very suitable for solving very sparse instances: the disjunctive constraints become very weak when the conflict graph is very sparse. For these two classes of instances, the pure branch-and-bound CFS algorithm is more effective on extremely sparse instances with densities up to 0.005, whereas our TSBMA algorithm is more suitable for solving sparse instances with densities between 0.01 and 0.05. In fact, the new lower bounds found by TSBMA all concern instances with a density of 0.05. Finally, the TSBMA algorithm remains competitive on the 240 correlated sparse instances SC, even if the density is the smallest (0.001), which means that only the random sparse instance class SR is challenging for TSBMA.

In summary, our TSBMA algorithm is computationally efficient on a majority of the 6240 benchmark instances of Set II and is able to discover new lower bounds on 354 difficult DCKP instances whose optimal solutions are still unknown.

In this section, we analyze two essential components of the TSBMA algorithm: the importance of the threshold search and the contribution of the operation-prohibiting mechanism. The studies in this section are based on the 50 benchmark instances 11Iy to 20Iy of Set I.

The threshold search procedure of the TSBMA algorithm is the first adaptation of the threshold accepting method to the DCKP.
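The TSP procedure itself (its threshold schedule, neighborhoods and prohibition rule) is described in Section 3. As a generic illustration of the underlying threshold accepting principle [7], the following sketch explores one-item flips and accepts any feasible move that loses at most a fixed profit threshold; all parameter values and the toy instance in the test are hypothetical, and this is not the paper's TSP procedure.

```python
import random

def threshold_accepting_dckp(profits, weights, capacity, conflicts,
                             threshold=5, iters=20000, seed=1):
    """Generic threshold accepting for the DCKP (illustrative sketch).

    conflicts is a set of frozenset({i, j}) pairs that must not be
    packed together. A neighbor is obtained by flipping one item; a
    move is accepted if it loses at most `threshold` profit.
    """
    rng = random.Random(seed)
    n = len(profits)
    sol = [0] * n  # start from the empty (trivially feasible) knapsack

    def feasible_flip(i):
        if sol[i] == 1:
            return True  # dropping an item is always feasible
        if sum(w * x for w, x in zip(weights, sol)) + weights[i] > capacity:
            return False
        return all(frozenset((i, j)) not in conflicts
                   for j in range(n) if sol[j] == 1)

    cur = 0
    best, best_sol = 0, sol[:]
    for _ in range(iters):
        i = rng.randrange(n)
        if not feasible_flip(i):
            continue
        delta = profits[i] if sol[i] == 0 else -profits[i]
        if delta >= -threshold:  # threshold acceptance rule
            sol[i] ^= 1
            cur += delta
            if cur > best:
                best, best_sol = cur, sol[:]
    return best, best_sol
```

A full TSP implementation would additionally vary the threshold during the search and embed the procedure in the memetic framework of Section 3.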
Table 4
Summarized comparisons of the TSBMA algorithm against each reference algorithm on the 6240 DCKP instances of Set II.

Class        Total  ILP [4]  BCM [1]  CFS [4]  TSBMA (this work)  TSBMA vs. CFS
                    Solved   Solved   Solved   Solved   New LB    #Better  #Equal  #Worse
CC 10        720    446      552      617      617      91        91       629     0
CC 15        720    428      550      600      600      117       117      603     0
CR 10        720    508      630      -        -        37        37       681     2
CR 15        720    483      590      622      622      78        78       641     1
SC           240    200      109      156      195      24        70       165     5
SR           240    229      154      176      9        7         43       8       189
CC and CR    5760   4569     5161     5389     5381     323       323      5427    10
SC and SR    480    429      263      332      204      31        113      173     194
Grand total  6240   4998     5424     5721     5585     354       436      5600    204

To assess the importance of this component, we compare TSBMA with two TSBMA variants obtained by replacing the TSP procedure with a first-improvement descent procedure and a best-improvement descent procedure, respectively. In other words, these variants (named MA1 and MA2) use, in each iteration, the first and the best improving solution S' in the neighborhood to replace the current solution, respectively. We carried out an experiment by running the two variants on the 50 instances 11Iy to 20Iy with the same experimental settings as in Section 4.2. The performance profiles of TSBMA and these TSBMA variants are shown in Figure 3, based on the best objective values (left sub-figure) and the average objective values (right sub-figure).

From Figure 3, we can clearly observe that TSBMA dominates MA1 and MA2 according to the cumulative probability obtained with the f_best and f_avg values. The plots of TSBMA strictly run above the plots of MA1 and MA2, indicating that TSBMA always performs better than the two variants. This experiment implies that the adopted threshold search procedure of TSBMA is relevant for its performance.

Fig. 3. Performance profiles of the compared algorithms on the 50 DCKP instances 11Iy to 20Iy.

TSBMA avoids revisiting previously encountered solutions with the OP mechanism introduced in Section 3.4.3. To assess the usefulness of the OP mechanism, we created a TSBMA variant (denoted by TSBMA−) by disabling the OP component while keeping the other components unchanged. We ran TSBMA− on the 50 instances 11Iy to 20Iy according to the experimental settings given in Section 4.2 and report the results in Table 5. The first column gives the name of each instance and the remaining columns show the best objective values (f_best), the average objective values (f_avg) and the standard deviations (std). Row p-values gives the p-values from the Wilcoxon signed-rank test.
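The OP mechanism itself is specified in Section 3.4.3. A related way to avoid revisiting solutions is the solution-hashing technique of [30], which the following sketch illustrates: each visited 0-1 solution is mapped to a hash value stored in a set, and a candidate is rejected when its hash was already recorded. This is an illustrative alternative, not the paper's exact OP rule.

```python
import random

class VisitedFilter:
    """Record visited 0-1 solutions via hashing (in the spirit of [30]).

    Each item i gets a random 64-bit weight h[i]; a solution's hash is
    the masked sum of the weights of its selected items. Collisions are
    possible but extremely rare with 64-bit weights.
    """

    def __init__(self, n, seed=7):
        rng = random.Random(seed)
        self.h = [rng.getrandbits(64) for _ in range(n)]
        self.seen = set()

    def hash_of(self, sol):
        return sum(w for w, x in zip(self.h, sol) if x) & (2**64 - 1)

    def check_and_add(self, sol):
        """Return True if `sol` is new (and record it), False otherwise."""
        key = self.hash_of(sol)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True
```

Because the hash is a sum over selected items, it can also be updated incrementally in O(1) when a single item is flipped, which keeps the revisit test cheap inside a local search loop.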
The best results of the compared algorithms are highlighted in bold.

From Table 5, we observe that TSBMA− performs worse than TSBMA. TSBMA− obtains worse f_best values for 35 out of the 50 instances and worse f_avg values for 48 instances. Considering the std values, TSBMA− shows a much less stable performance than TSBMA. Moreover, the small p-values (< 0.05) from the Wilcoxon tests confirm the statistically significant difference between the results of TSBMA and TSBMA−. This experiment demonstrates the effectiveness and robustness of the operation-prohibiting mechanism employed by the TSBMA algorithm.

Table 5
Comparison between TSBMA− (without the OP mechanism) and TSBMA (with the OP mechanism) on the instances 11Iy to 20Iy.

Instance   TSBMA−: f_best  f_avg  std   TSBMA: f_best  f_avg  std

Conclusions
The disjunctively constrained knapsack problem is a well-known NP-hard model. Given its practical significance and intrinsic difficulty, a variety of exact and heuristic algorithms have been designed for solving the problem. We proposed a threshold search based memetic algorithm that combines, for the first time, threshold search with the memetic framework. Extensive evaluations on a large number of benchmark instances from the literature (6340 instances in total) showed that the algorithm performs competitively with respect to the state-of-the-art algorithms. Our approach is able to discover 24 new lower bounds on the 100 instances of Set I and 354 new lower bounds on the 6240 instances of Set II. These new lower bounds are useful for future studies on the DCKP. The algorithm also attains the best-known or known optimal results on most of the remaining instances. We carried out additional experiments to investigate the two essential ingredients of the algorithm (the threshold search technique and the operation-prohibiting mechanism). The disjunctively constrained knapsack problem is a useful model for formulating a number of practical applications, and our algorithm and its code (which we will make available) can contribute to solving these problems.

There are at least two possible directions for future work. First, TSBMA performed poorly on most random sparse instances of class SR. It would be interesting to improve the algorithm to better handle such instances. Second, given the good performance of the adopted approach, it is worth investigating its underlying ideas to solve the related problems discussed in the introduction.

Declaration of competing interest
The authors declare that they have no known competing interests that couldhave appeared to influence the work reported in this paper.
Acknowledgments
We would like to thank Dr. Zhe Quan, Dr. Lei Wu, Dr. Pablo San Segundo and their co-authors for sharing the instances of the DCKP and the detailed results of their algorithms reported in [24], [25], and [4].

References

[1] A. Bettinelli, V. Cacchiani, E. Malaguti, A branch-and-bound algorithm for the knapsack problem with conflict graph, INFORMS Journal on Computing 29 (3) (2017) 457–473.
[2] D. Castelino, N. Stephens, Tabu thresholding for the frequency assignment problem, in: Meta-Heuristics, Springer, 1996, pp. 343–359.
[3] Y. Chen, J.-K. Hao, Iterated responsive threshold search for the quadratic multiple knapsack problem, Annals of Operations Research 226 (1) (2015) 101–131.
[4] S. Coniglio, F. Furini, P. San Segundo, A new combinatorial branch-and-bound algorithm for the knapsack problem with conflicts, European Journal of Operational Research 289 (2) (2020) 435–455.
[5] L. Di Gaspero, A. Schaerf, Neighborhood portfolio approach for local search applied to timetabling problems, Journal of Mathematical Modelling and Algorithms 5 (1) (2006) 65–89.
[6] E. D. Dolan, J. J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming 91 (2) (2002) 201–213.
[7] G. Dueck, T. Scheuer, Threshold accepting: A general purpose optimization algorithm appearing superior to simulated annealing, Journal of Computational Physics 90 (1) (1990) 161–175.
[8] G. Dueck, J. Wirsching, Threshold accepting algorithms for 0–1 knapsack problems, in: Proceedings of the Fourth European Conference on Mathematics in Industry, Springer, 1991, pp. 255–262.
[9] F. Glover, M. Laguna, Tabu search, Springer Science+Business Media New York, 1997.
[10] F. Gurski, C. Rehs, Solutions for the knapsack problem with conflict and forcing graphs of bounded clique-width, Mathematical Methods of Operations Research 89 (3) (2019) 411–432.
[11] J.-K. Hao, Memetic algorithms in discrete optimization, in: Handbook of Memetic Algorithms, Springer, 2012, pp. 73–94.
[12] M. Hifi, An iterative rounding search-based algorithm for the disjunctively constrained knapsack problem, Engineering Optimization 46 (8) (2014) 1109–1122.
[13] M. Hifi, M. Michrafy, A reactive local search-based algorithm for the disjunctively constrained knapsack problem, Journal of the Operational Research Society 57 (6) (2006) 718–726.
[14] M. Hifi, M. Michrafy, Reduction strategies and exact algorithms for the disjunctively constrained knapsack problem, Computers & Operations Research 34 (9) (2007) 2657–2673.
[15] M. Hifi, N. Otmani, An algorithm for the disjunctively constrained knapsack problem, International Journal of Operational Research 13 (1) (2012) 22–43.
[16] K. Jansen, An approximation scheme for bin packing with conflicts, Journal of Combinatorial Optimization 3 (4) (1999) 363–377.
[17] D. S. Johnson, M. R. Garey, Computers and intractability: A guide to the theory of NP-completeness, WH Freeman, 1979.
[18] H. Kellerer, U. Pferschy, D. Pisinger, Knapsack problems, Springer, 2004.
[19] X. Lai, J.-K. Hao, F. Glover, Z. Lü, A two-phase tabu-evolutionary algorithm for the 0–1 multidimensional knapsack problem, Information Sciences 436 (2018) 282–301.
[20] X. Lai, J.-K. Hao, D. Yue, Two-stage solution-based tabu search for the multidemand multidimensional knapsack problem, European Journal of Operational Research 274 (1) (2019) 35–48.
[21] P. Moscato, Memetic algorithms: A short introduction, New Ideas in Optimization (1999) 219–234.
[22] U. Pferschy, J. Schauer, The knapsack problem with conflict graphs, Journal of Graph Algorithms and Applications 13 (2) (2009) 233–249.
[23] U. Pferschy, J. Schauer, Approximation of knapsack problems with conflict and forcing graphs, Journal of Combinatorial Optimization 33 (4) (2017) 1300–1323.
[24] Z. Quan, L. Wu, Cooperative parallel adaptive neighbourhood search for the disjunctively constrained knapsack problem, Engineering Optimization 49 (9) (2017) 1541–1557.
[25] Z. Quan, L. Wu, Design and evaluation of a parallel neighbor algorithm for the disjunctively constrained knapsack problem, Concurrency and Computation: Practice and Experience 29 (20) (2017) e3848.
[26] M. B. Salem, S. Hanafi, R. Taktak, H. B. Abdallah, Probabilistic tabu search with multiple neighborhoods for the disjunctively constrained knapsack problem, RAIRO-Operations Research 51 (3) (2017) 627–637.
[27] M. B. Salem, R. Taktak, A. R. Mahjoub, H. Ben-Abdallah, Optimization algorithms for the disjunctively constrained knapsack problem, Soft Computing 22 (6) (2018) 2025–2043.
[28] C. D. Tarantilis, C. T. Kiranoudis, V. S. Vassiliadis, A threshold accepting metaheuristic for the heterogeneous fixed fleet vehicle routing problem, European Journal of Operational Research 152 (1) (2004) 148–158.
[29] Z. Wei, J.-K. Hao, Iterated two-phase local search for the set-union knapsack problem, Future Generation Computer Systems 101 (2019) 1005–1017.
[30] D. L. Woodruff, E. Zemel, Hashing vectors for tabu search, Annals of Operations Research 41 (2) (1993) 123–137.
[31] T. Yamada, S. Kataoka, K. Watanabe, Heuristic and exact algorithms for the disjunctively constrained knapsack problem, Journal of Information Processing Society of Japan 43 (9) (2002) 2864–2870.
[32] Y. Zhou, J.-K. Hao, F. Glover, Memetic search for identifying critical nodes in sparse graphs, IEEE Transactions on Cybernetics 49 (10) (2018) 3699–3712.
[33] Y. Zhou, V. Naroditskiy, Algorithm for stochastic multiple-choice knapsack problem and application to keywords bidding, in: Proceedings of the 17th International Conference on World Wide Web, 2008, pp. 1175–1176.
A. Computational results on the 100 DCKP instances of Set I
Tables A.1 and A.2 report the detailed computational results of the TSBMA algorithm and the reference algorithms (PNS [25], CPANS [24] and PTS [26]) on the 100 DCKP instances of Set I. The first two columns of the tables give the name of each instance and the best-known objective value (BKV) ever reported in the literature. We employ the following four performance indicators to present our results: best objective value (f_best), average objective value over 20 runs (f_avg), standard deviation over 20 runs (std), and average run time t_avg in seconds to reach the best objective value. However, some of the performance indicators of the reference algorithms are not available in the literature (i.e., f_avg, t_avg and std). Note that for [25] (PNS) and [24] (CPANS), the authors reported several groups of results obtained by using different numbers of processors (ranging from 10 to 400). To make a fair comparison, we take the best f_best value of each instance over these groups of results as the final result, and we use the average of the t_avg values over these groups as the final average run time. The last row of each table provides a summary of the results.

Table A.1
Computational results of the TSBMA algorithm and the reference algorithms on the 50 DCKP instances of Set I (1Iy to 10Iy).

Instance  BKV  PNS [25]: f_best  CPANS [24]: f_best  t_avg(s)  PTS [26]: f_best  f_avg  TSBMA (this work): f_best  f_avg  std  t_avg(s)

Table A.2
Computational results of the TSBMA algorithm and the reference algorithms on the 50 DCKP instances of Set I (11Iy to 20Iy).

Instance  BKV  PNS [25]: f_best  CPANS [24]: f_best  t_avg(s)  TSBMA (this work): f_best  f_avg  std  t_avg(s)
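For completeness, the four indicators reported in Tables A.1 and A.2 can be computed from the raw per-run data as follows; the run values in the test are hypothetical.

```python
import statistics

def summarize_runs(objectives, times_to_best):
    """Summarize repeated independent runs on one instance.

    objectives: objective value reached in each run.
    times_to_best: time (s) at which each run reached its best value.
    Returns f_best, f_avg, std (population std over runs) and t_avg.
    """
    return {
        "f_best": max(objectives),
        "f_avg": statistics.mean(objectives),
        "std": statistics.pstdev(objectives),
        "t_avg": statistics.mean(times_to_best),
    }
```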