A Weighted Population Update Rule for PACO Applied to the Single Machine Total Weighted Tardiness Problem

Abstract

In this paper a new population update rule for population based ant colony optimization (PACO) is proposed. PACO is a well-known alternative to the standard ant colony optimization algorithm. The new update rule makes it possible to weight different parts of the solutions. PACO with the new update rule is evaluated for the example of the single machine total weighted tardiness problem (SMTWTP). This is an NP-hard optimization problem where the aim is to schedule jobs on a single machine such that their total weighted tardiness is minimized. PACO with the new population update rule is evaluated with several benchmark instances from the OR-Library. Moreover, the impact of the weights of the jobs on the solutions in the population and on the convergence of the algorithm is analyzed experimentally. The results show that PACO with the new update rule has on average better solution quality than PACO with the standard update rule.


arXiv [cs.NE]

Daniel Abitz, Tom Hartmann, and Martin Middendorf

Swarm Intelligence and Complex Systems Group, Faculty of Mathematics and Computer Science, University of Leipzig, Augustusplatz 10, D-04109 Leipzig, Germany.

Bioinformatics Group, Department of Computer Science & Interdisciplinary Center for Bioinformatics, Universität Leipzig, Härtelstraße 16–18, D-04107 Leipzig, Germany.


Keywords:

Ant algorithms, Combinatorial optimization, Metaheuristics, Swarm intelligence, Time-tabling and scheduling

The population based ant colony optimization (PACO) algorithm [9] is an iterative metaheuristic where a population of solutions is transferred from one iteration to the next. In each iteration the population is used to generate corresponding pheromone information, which is then used by the ants as in standard ant colony optimization (ACO) in order to construct new solutions. An advantage of PACO is that it exhibits faster pheromone update and evaporation mechanisms than usual ACO algorithms while being competitive with respect to solution quality (e.g., [9, 14, 15]). For more information on different ACO approaches as well as recent developments in the research field of ACO metaheuristics the reader is referred to [6] and [7], respectively.

Since the population in PACO determines the pheromone information, the PACO metaheuristic uses a population update rule to change the pheromone values. Thus, the population update rule in PACO corresponds to the pheromone update rule of ACO. In this paper a new population update rule for PACO is proposed that uses weights for different parts of the solutions in order to control the strength of their influence on the optimization process. PACO with the new update rule (WPACO) is applied to the single machine total weighted tardiness problem (SMTWTP). The SMTWTP is a well-studied scheduling problem that is known to be NP-hard [10]. An instance of the SMTWTP is a set of jobs where each job has a processing time, a due date, and a weight. The aim is to find a schedule of the jobs on a single machine such that the total weighted tardiness, i.e., the weighted sum of the delays caused by finishing jobs after their due dates, is minimized.

The principle of the new update rule (when applied to the SMTWTP) is to incorporate the weights of the jobs into the PACO algorithm. Instead of a population of solutions, WPACO uses a sequence $(P_1, \ldots, P_n)$ of multisets of jobs, where $n$ is the number of jobs. Multiset $P_i$ contains the jobs that were at position $i$ in the best solutions found by the ants in the last iterations. In PACO it is a principle that the iteration-best solution is entered into the population. In WPACO a corresponding principle is used: for each position $i \in [1, n]$ of the iteration-best solution, the job $j$ at position $i$ enters the multiset $P_i$. Each multiset $P_i$ has a maximum capacity $k$ and the sum of the weights of the jobs that are stored in the multiset cannot exceed $k$. Details of the new update rule are described in Section 3.

Algorithm WPACO is evaluated on several benchmark instances from the OR-Library [2] and is experimentally compared to PACO with the standard population update rule. For the experiments the parameter values of WPACO are optimized using the automatic configuration tool Irace [11].

The paper is organized as follows: in Section 2 a formal definition of the SMTWTP is presented. In Section 3 ACO and PACO are described as well as the new weighted population update rule. The experimental setup is given in Section 4 and the results are discussed in Section 5. In Section 6 a summary of the paper is given and avenues for future work are outlined.

The single machine total weighted tardiness problem (SMTWTP) is defined as follows. Consider a set of $n \in \mathbb{N}$ jobs that need to be processed on a single machine that can handle at most one job at a time. Each job $j$ is assigned a processing time $p_j \in \mathbb{N}$ that describes the time that is needed to process job $j$, a due date $d_j \in \mathbb{N}$ that describes the time point when the processing of job $j$ should have been finished, and a weight $w_j \in \mathbb{N}$ that represents the priority of job $j$. Given such a set of $n$ jobs, a schedule $\pi$ is a permutation of length $n$, i.e., a bijective mapping $\pi : \{1, \ldots, n\} \to \{1, \ldots, n\}$ that assigns to each place $i$ in the queue a job $\pi(i)$. For the sake of a clear notation, we represent a permutation $\pi$ as the $n$-tuple $(\pi(1), \ldots, \pi(n))$. Clearly, a schedule $\pi$ defines a total order in which the $n$ jobs are processed on a single machine with $\pi(1)$ (respectively $\pi(n)$) being the first (respectively last) job in $\pi$.

For a given schedule $\pi$, the completion time $C_j$ of a job $j$ is the time that is needed to complete job $j$ in $\pi$, i.e.,

$$C_j := \sum_{i=1}^{\pi^{-1}(j)} p_{\pi(i)},$$

where $\pi^{-1}(j)$ denotes the position of job $j$ in $\pi$. The tardiness $T_j$ of a job $j$ is defined as $T_j := \max\{C_j - d_j, 0\}$. Note that the tardiness cannot be negative and thus, it can be seen as a penalty for completing a job after its due date. Given a set of $n$ jobs, the single machine total weighted tardiness problem aims to find a schedule of all $n$ jobs that minimizes the weighted tardiness of all jobs, i.e., it aims to minimize the objective $\sum_{j=1}^{n} w_j T_j$. The expression $\sum_{j=1}^{n} w_j T_j$ of a schedule $\pi$ is also called the total weighted tardiness of $\pi$.

If $w_1 = \ldots = w_n = 1$, then the objective function of the SMTWTP can be simplified to $\sum_{j=1}^{n} T_j$. This problem is called the single machine total tardiness problem.

This section presents background information on different ACO approaches for the SMTWTP. In particular, in Section 3.1 the ACO approaches proposed in [5] and [12] are described. In Section 3.2 the PACO metaheuristic that has been presented in [9] is outlined. The details of the proposed weighted population update rule for PACO are described in Section 3.3.
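To make the objective of the SMTWTP concrete, the following minimal Python sketch evaluates the total weighted tardiness of a schedule. It assumes 0-based job indices and plain lists `p`, `d`, and `w`; the function name and the small example instance are illustrative and not taken from the paper.

```python
def total_weighted_tardiness(schedule, p, d, w):
    """Return sum_j w_j * T_j for a schedule given as a sequence of job indices."""
    completion = 0
    total = 0
    for job in schedule:
        completion += p[job]                     # C_j: jobs run back to back from time 0
        tardiness = max(completion - d[job], 0)  # T_j = max{C_j - d_j, 0}
        total += w[job] * tardiness
    return total


# Hypothetical instance with three jobs (0-based indices).
p = [3, 2, 4]   # processing times
d = [4, 2, 5]   # due dates
w = [2, 1, 3]   # weights
print(total_weighted_tardiness((1, 0, 2), p, d, w))
print(total_weighted_tardiness((0, 1, 2), p, d, w))
```

Setting all weights to 1 gives the objective of the single machine total tardiness problem.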

In this section the ACO approach for the SMTWTP of Besten et al. [5] is described. Consider a given SMTWTP instance with $n$ jobs to be scheduled. Recall that a schedule of the $n$ jobs is represented as a permutation $(\pi(1), \ldots, \pi(n))$. For example, the permutation $(4, 1, 3, 2)$ is a schedule of 4 jobs in which job 4 is processed first, followed by jobs 1, 3, and 2 in that order. The ACO approach of Besten et al. is initialized with a fixed number of ants and a fixed number of iterations. In each iteration every ant starts with an empty schedule and iteratively appends unscheduled jobs until the schedule is complete. This process ensures that all jobs are scheduled and no job is scheduled multiple times.

Two kinds of information are used to influence an ant's decision to select a job $j$ at position $i$. The heuristic information $\eta_{ij}$ indicates how desirable it is to schedule job $j$ at position $i$ with respect to a problem-specific heuristic function. The pheromone value $\tau_{ij}$ carries information about favorable schedules that have been found in previous iterations. A large value $\tau_{ij}$ indicates that job $j$ has often been placed at position $i$ by ants in previous iterations. This implies that placing job $j$ at position $i$ may be favorable with respect to the objective function, since usually only "good" schedules of former iterations are allowed to update the pheromone values.

For the solution construction, Besten et al. combine a maximization strategy with a probabilistic decision. Consider a probability parameter $q$ with $0 \le q < 1$. With probability $q$ an ant places at position $i$ the job $j \in S^{*}$ from the set of unscheduled jobs $S^{*}$ that maximizes the expression

$$(\tau_{ij})^{\alpha} \cdot (\eta_{ij})^{\beta}, \qquad (1)$$

where $\alpha$ and $\beta$ are parameters that represent the influence of the pheromone information and the heuristic information, respectively. With probability $(1 - q)$ job $j \in S^{*}$ is scheduled at position $i$ randomly with respect to the probability

$$p_{ij} = \begin{cases} \dfrac{(\tau_{ij})^{\alpha} \cdot (\eta_{ij})^{\beta}}{\sum_{k \in S^{*}} (\tau_{ik})^{\alpha} \cdot (\eta_{ik})^{\beta}}, & \text{if } j \in S^{*}, \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$

In [12] it was argued that, for the decision at position $i$, all preceding pheromone values should be considered and thus, the authors suggested using the sum $\sum_{l=1}^{i} \tau_{lj}$ of the pheromone values of job $j$ over all positions up to $i$. Using the ideas presented in [12], formulas (1) and (2) can be updated, resulting in the following ACO algorithm. With probability $q$ an ant chooses a job $j \in S^{*}$ at position $i$ that maximizes

$$\Big(\sum_{l=1}^{i} \tau_{lj}\Big)^{\alpha} \cdot (\eta_{ij})^{\beta} \qquad (3)$$

and with probability $(1 - q)$ job $j \in S^{*}$ is chosen at position $i$ randomly with respect to the probability

$$p_{ij} = \begin{cases} \dfrac{\big(\sum_{l=1}^{i} \tau_{lj}\big)^{\alpha} \cdot (\eta_{ij})^{\beta}}{\sum_{k \in S^{*}} \big(\sum_{l=1}^{i} \tau_{lk}\big)^{\alpha} \cdot (\eta_{ik})^{\beta}}, & \text{if } j \in S^{*}, \\ 0, & \text{otherwise.} \end{cases} \qquad (4)$$

Several heuristics can be used to determine the heuristic values $\eta_{ij}$. The most basic one is called Earliest Due Date (EDD). The EDD heuristic prefers jobs that have a small due date and thus, the heuristic values are calculated by $\eta_{ij} = 1/d_j$.

A more elaborate heuristic called the Modified Due Date (MDD) heuristic has been proposed in [1] and was further improved in [12]. The MDD heuristic considers (in addition to the due date) the potential completion time $C_j$ a job $j$ would have if scheduled at position $i$. Following this notion, the heuristic values are calculated by

$$\eta_{ij} = \frac{1}{\max\{C_j, d_j\} - (C_j - p_j)}. \qquad (5)$$

After an ant has scheduled a job, the following local pheromone update is performed: the pheromone value $\tau_{ij}$ is replaced with $(1 - \rho)\,\tau_{ij} + \rho\,\tau_0$, where $0 \le \rho < 1$ is a parameter and $\tau_0$ is an initial amount of pheromone. Value $\tau_0$ is computed with respect to the SMTWTP as $\tau_0 = 1/(n\,T_{EDD})$, where $n$ is the number of jobs and $T_{EDD}$ is the total tardiness of a schedule obtained via the EDD heuristic.
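The construction step can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the implementation used in [5] or [12]: positions and jobs are 0-based, `tau[i][j]` is a list of lists, the heuristic is the MDD rule from formula (5) with `elapsed` standing for $C_j - p_j$, the decision uses the summation rule from formulas (3) and (4), and the local pheromone update is omitted. All function names are hypothetical.

```python
import random


def mdd_heuristic(job, elapsed, p, d):
    """eta_ij from formula (5); elapsed + p[job] is the potential completion time C_j."""
    completion = elapsed + p[job]
    return 1.0 / (max(completion, d[job]) - elapsed)


def selection_scores(i, unscheduled, tau, elapsed, p, d, alpha, beta):
    """Numerators of formulas (3) and (4) for every unscheduled job."""
    scores = {}
    for j in sorted(unscheduled):
        tau_sum = sum(tau[l][j] for l in range(i + 1))  # summation rule
        scores[j] = tau_sum ** alpha * mdd_heuristic(j, elapsed, p, d) ** beta
    return scores


def choose_job(i, unscheduled, tau, elapsed, p, d, alpha, beta, q, rng):
    """With probability q take the maximizer of (3), otherwise sample from (4)."""
    scores = selection_scores(i, unscheduled, tau, elapsed, p, d, alpha, beta)
    if rng.random() < q:
        return max(scores, key=scores.get)
    jobs = list(scores)
    return rng.choices(jobs, weights=[scores[j] for j in jobs])[0]


def construct_schedule(tau, p, d, alpha, beta, q, rng):
    """One ant: append unscheduled jobs until the schedule is complete."""
    unscheduled = set(range(len(p)))
    schedule, elapsed = [], 0
    for i in range(len(p)):
        job = choose_job(i, unscheduled, tau, elapsed, p, d, alpha, beta, q, rng)
        schedule.append(job)
        unscheduled.remove(job)
        elapsed += p[job]
    return schedule


# Hypothetical three-job instance with a uniform pheromone matrix.
p, d = [2, 3, 4], [10, 4, 6]
tau = [[1.0] * 3 for _ in range(3)]
print(construct_schedule(tau, p, d, alpha=1, beta=2, q=0.5, rng=random.Random(0)))
```

Because jobs are removed from `unscheduled` as soon as they are placed, every constructed schedule is a permutation of the jobs.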
Note that the value $\tau_0$ is also used to initialize the pheromone matrix.

At the end of each iteration, i.e., after all ants have constructed a schedule for all jobs, the global amount of pheromone is updated by means of two procedures. First, pheromone is evaporated by setting $\tau_{ij}$ to $(1 - \rho)\,\tau_{ij}$. The idea behind evaporation is that the influence of old solutions is reduced during the run of the algorithm. The second procedure performs an additional update of the global pheromone values for all job-position pairs that occur in the best schedule that has been found so far. In detail, if job $j$ is at position $i$ in the best schedule $\pi$, then the pheromone value $\tau_{ij}$ is increased by $1/T_b$, where $T_b$ is the weighted tardiness of $\pi$.

The algorithm stops after a termination criterion is met, e.g., a specific number of iterations is reached.

In this section, the population based ant colony optimization approach (PACO) for the SMTWTP that has been presented in [9] is described. Generally, the PACO algorithm follows the same procedure as the algorithms explained in Section 3.1: in each iteration a fixed number of artificial ants construct a new solution, pheromone is updated, and pheromone is transmitted to the following iteration. However, the pheromone update and the transmission are different. Instead of using a matrix of pheromone values that is transmitted from one iteration to another, PACO uses a set of solutions, called a population, from which the pheromone values can be calculated in each iteration. In addition, pheromone update in the PACO algorithm is performed by changing the solutions in the population. Compared to ACO, the pheromone update of the population based approach is much faster [14]. The following paragraph describes these procedures of the PACO algorithm in detail; see also Figure 1 for an example.

[Figure 1, three panels, each listing the population from newest to oldest schedule. (a) (3, 1, 2, 4), (3, 2, 1, 4), (4, 2, 1, 3), empty. (b) (1, 4, 2, 3), (3, 1, 2, 4), (3, 2, 1, 4), (4, 2, 1, 3). (c) (2, …, 1), (1, 4, 2, 3), (3, 1, 2, 4), (3, 2, 1, 4).]

Figure 1: The figure shows the standard population update of the PACO algorithm. Subfigures (a) to (c) illustrate population $P$ with a capacity of 4 at the end of iterations 3 to 5, respectively. Each line within $P$ is either empty (dashes) or filled with a schedule. The schedule that was added to $P$ during the respective iteration is highlighted. (b) Population $P$ after schedule (1, 4, 2, 3) was added. (c) Since the capacity of $P$ is reached, the oldest schedule (4, 2, 1, 3) is removed from $P$. In addition, the schedule (2, …, 1) is added to $P$.

Consider a population $P$ with a capacity of $k \in \mathbb{N}$ schedules, i.e., $P = \{\pi_1, \ldots, \pi_h\}$ with $0 \le h \le k$. The pheromone values $\tau_{ij}$ that are needed for constructing a new solution, i.e., for formulas (3) and (4), can be computed by $\tau_{ij} = \tau_0 + \tau_s\,\ell_{ij}$, where $\tau_s = (\tau_{max} - \tau_0)/k$ and $\ell_{ij} \in \{0, \ldots, h\}$ denotes how often job $j$ is at position $i$ in the $h$ schedules contained in $P$. Parameter $\tau_{max}$ controls the maximal amount of pheromone. Formally, the value $\ell_{ij}$ is defined as $\ell_{ij} := |\{\pi \in P : \pi(i) = j\}|$, where $|X|$ denotes the cardinality of a set $X$. Note that this implies that for each pheromone value $\tau_{ij}$ it holds that $\tau_0 \le \tau_{ij} \le \tau_0 + k\,\tau_s$, and thus $\tau_0 \le \tau_{ij} \le \tau_{max}$.

At the end of each iteration, i.e., after all artificial ants have constructed a new schedule, evaporation is performed by removing the oldest schedule from the current population. This population update rule is called the age-based strategy [9] and it is the standard rule for PACO. In addition, the iteration-best schedule is added to the population. At the beginning of the optimization process, the initial population is either empty or filled with $k$ schedules that are constructed randomly or heuristically. If the initial population is empty, then no schedule is removed from the population in the first $k$ iterations. See Figure 1 for an example of the population update rule of the PACO algorithm.

A novel population update rule for the PACO algorithm for the SMTWTP is proposed in this section. The idea of this rule is to consider the weights of the jobs of a schedule that is added to the population. We refer to the PACO algorithm that uses the novel population update rule as the weighted population based ant colony optimization (WPACO) algorithm. Generally, the WPACO algorithm follows the same procedures as the PACO algorithm that is described in Section 3.2, but it differs in the way a population is construed and in the way a schedule is added to the population at the end of an iteration.
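For reference, the standard PACO bookkeeping from Section 3.2, from which the weighted rule departs, can be sketched in a few lines of Python. The sketch assumes 0-based jobs and positions and stores the population in a `collections.deque` with `maxlen=k`, so that appending to a full population drops the oldest schedule (the age-based strategy). The function names and parameter values are illustrative; the schedules are those of Figure 1 shifted to 0-based indices, and the last one is a hypothetical completion of the partially legible schedule in Figure 1(c).

```python
from collections import deque


def pheromone_matrix(population, n, k, tau_0, tau_max):
    """tau[i][j] = tau_0 + tau_s * l_ij, with l_ij = number of schedules placing job j at position i."""
    tau_s = (tau_max - tau_0) / k
    tau = [[tau_0] * n for _ in range(n)]
    for schedule in population:
        for i, j in enumerate(schedule):
            tau[i][j] += tau_s
    return tau


def age_based_update(population, iteration_best):
    """Add the iteration-best schedule; a full deque(maxlen=k) evicts its oldest entry."""
    population.append(tuple(iteration_best))


k, n = 4, 4
population = deque(maxlen=k)  # empty initial population
for schedule in [(3, 1, 0, 2), (2, 1, 0, 3), (2, 0, 1, 3), (0, 3, 1, 2)]:
    age_based_update(population, schedule)

tau = pheromone_matrix(population, n, k, tau_0=1.0, tau_max=5.0)
print(tau[0])  # pheromone values for the first position

age_based_update(population, (1, 2, 3, 0))  # evicts the oldest schedule (3, 1, 0, 2)
print(list(population))
```

Since at most $k$ schedules are stored, every entry of the matrix stays between `tau_0` and `tau_max`.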
Instead of considering a population $P = \{\pi_1, \ldots, \pi_k\}$ of capacity $k \in \mathbb{N}$, the idea of the novel population update rule in the WPACO algorithm is to consider a weighted population $wP = (P_1, \ldots, P_n)$ that contains for each position $i \in \{1, \ldots, n\}$ a multiset of jobs $P_i = \{\pi_1(i), \ldots, \pi_k(i)\}$ with capacity $k$ that were scheduled to position $i$ in the last $k$ iterations. Observe that this differs from the notion of a population in the PACO algorithm, since a weighted population is no longer bound to contain feasible schedules. Although this may appear counterproductive, it permits an update rule that considers the weight of a job of an SMTWTP instance, as explained in the following.

Consider an SMTWTP instance of $n \in \mathbb{N}$ jobs and a weighted population $wP = (P_1, \ldots, P_n)$ in which each multiset has a capacity of $k \in \mathbb{N}$. Recall that each job $\pi(i)$ is assigned a weight $w_{\pi(i)}$ that represents its priority. Suppose that schedule $\pi$ is added to $wP$ at the end of an iteration and job $\pi(i)$ is scheduled to position $i$ in $\pi$. Then evaporation is performed by removing the oldest $w_{\pi(i)}$ jobs from multiset $P_i$. In addition, $\pi(i)$ is added $w_{\pi(i)}$ times to $P_i$ in order to fill the weighted population. At the beginning of the optimization process, WPACO behaves analogously to the PACO algorithm. In particular, each multiset of the weighted population is either empty or filled with jobs of schedules that were constructed randomly or heuristically. Likewise, if the initial weighted population is empty, then no job is removed from a multiset $P_i$ until $P_i$ is completely filled. Figure 2 illustrates an example of this novel population update rule.

Figures 1 and 2 demonstrate the primary difference between the population update rule in the PACO algorithm and the WPACO algorithm: using the novel update rule and weighted populations, a population may contain partial and invalid solutions.
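The weighted update can be sketched in Python as follows. This is a minimal illustration under assumptions, not the authors' implementation: jobs and positions are 0-based, each multiset $P_i$ is a `collections.deque` with `maxlen=k`, and a job of weight $w$ is appended $w$ times, so that the oldest entries are evicted only as far as the capacity requires. The initial multiset contents in the example are hypothetical; the weights and the added schedule follow the example of Figure 2 shifted to 0-based indices.

```python
from collections import deque


def make_weighted_population(n, k):
    """One bounded multiset per position; all multisets start empty."""
    return [deque(maxlen=k) for _ in range(n)]


def weighted_update(w_population, iteration_best, w):
    """Add job pi(i) w[pi(i)] times to P_i; a full deque drops its oldest entries."""
    for i, job in enumerate(iteration_best):
        for _ in range(w[job]):
            w_population[i].append(job)


def weighted_pheromone_matrix(w_population, n, k, tau_0, tau_max):
    """tau[i][j] = tau_0 + tau_s * l_ij, where l_ij counts occurrences of job j in P_i."""
    tau_s = (tau_max - tau_0) / k
    return [[tau_0 + tau_s * w_population[i].count(j) for j in range(n)]
            for i in range(n)]


n, k = 4, 4
w = [1, 2, 3, 1]                      # weights of jobs 0..3
wP = make_weighted_population(n, k)
for column, jobs in zip(wP, [(3, 0), (1, 3), (0, 2), (2, 1)]):
    column.extend(jobs)               # hypothetical contents, oldest job first

weighted_update(wP, (2, 0, 1, 3), w)  # iteration-best schedule
print([list(column) for column in wP])
print(weighted_pheromone_matrix(wP, n, k, tau_0=1.0, tau_max=5.0)[0])
```

In this example only the first multiset exceeds its capacity, so only its oldest job is evicted. A job whose weight is at least $k$ would fill its multiset completely.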
For example, only the first row in Figure 2(b) represents a feasible schedule for the given SMTWTP instance, because jobs occur multiple times in the remaining rows. However, it is worth mentioning that this does not affect the way artificial ants construct their schedules. The reason is that the artificial ants of the PACO and the WPACO algorithm construct their schedules with respect to the pheromone values $\tau_{ij}$, which depend on the values of parameters $\tau_0$, $\tau_s$ and on the value $\ell_{ij}$ that represents how often job $j$ is at position $i$ in the current population. The values of $\tau_0$ and $\tau_s$ can be set easily. In addition, the value $\ell_{ij}$ can also be obtained from a weighted population $wP = (P_1, \ldots, P_n)$ by counting how often job $j$ occurs in the multiset $P_i$. Consequently, the artificial ants of the WPACO algorithm construct feasible solutions just like those of the PACO algorithm.

[Figure 2, two panels (a) and (b), showing the weighted population with the multisets $P_1, \ldots, P_4$ as columns.]

Figure 2: Weighted population $wP = (P_1, \ldots, P_4)$ with a capacity of 4 at the start (a) and end (b) of an iteration of the WPACO algorithm. The multisets $P_1, \ldots, P_4$ of $wP$ are illustrated by the columns (from left to right); at the start of the iteration each multiset contains two jobs. The weights of the four jobs in the exemplified SMTWTP instance are $w_1 = w_4 = 1$, $w_2 = 2$, and $w_3 = 3$. The schedule (3, 1, 2, 4) is added to $wP$ at the end of this iteration. Since the capacity of $P_1$ is 4, the oldest job 4 is removed: adding job 3 three times to $P_1$ would otherwise result in a multiset that contains 5 jobs. Observe that no job was removed from $P_2$, $P_3$, and $P_4$ as they satisfy the capacity constraint. Subsequently, jobs 3, 1, 2, and 4 are added 3, 1, 2, and 1 times to $P_1$, $P_2$, $P_3$, and $P_4$, respectively. The added jobs are highlighted in (b).

SMTWTP instances from the OR-Library [2] were used to investigate the optimization behavior of the proposed weighted population update rule. An SMTWTP instance consists of $n = 100$ jobs and is generated as follows: for each job $j \in \{1, \ldots, 100\}$ a processing time $p_j$ is chosen uniformly at random from $\{1, \ldots, 100\}$ and a job weight $w_j$ is chosen uniformly at random from $\{1, \ldots, 10\}$. In addition, the due date $d_j$ of job $j$ is chosen uniformly at random from

$$\left[\, \sum_{j=1}^{n} p_j \cdot \Big(1 - TF - \frac{RDD}{2}\Big),\; \sum_{j=1}^{n} p_j \cdot \Big(1 - TF + \frac{RDD}{2}\Big) \right],$$

where $TF$ and $RDD$ are parameters each from the set $\{0.2, 0.4, 0.6, 0.8, 1.0\}$. Parameter $TF$ represents the hardness of an SMTWTP instance and parameter $RDD$ represents the relative range of due dates. A large $TF$ value results in a small (or even negative) lower bound on the due dates. Since due dates cannot be negative, all negative due dates are set to 0. Observe that such a job is always completed after its due date, i.e., it contributes a positive tardiness for all schedules. In contrast, a small $TF$ value results in large due dates and thus, more jobs can be expected to be finished in time. The variance of the due dates is determined by the parameter $RDD$. In particular, a large $RDD$ value results in more diverse due dates, whereas a small $RDD$ value results in more similar due dates.

For each combination of parameters $TF$ and $RDD$ five problem instances were generated. Consequently, a set of 125 SMTWTP instances was generated. The set of all 125 generated SMTWTP instances is called the evaluation set and it is henceforth denoted by $\mathcal{X}$. The subset of $\mathcal{X}$ that was generated with the parameter values $RDD = a$ and $TF = b$ is denoted by $\mathcal{X}_{a,b}$. The evaluation set $\mathcal{X}$ was used for investigating the optimization behavior of the weighted population update rule. For that reason, we compared the results of the PACO algorithm with the weighted population update rule (henceforth WPACO) with the results of the standard PACO. In addition, the best-known solutions of all problem instances are used for the comparison. A listing of these solutions can be found in [2]. It is worth mentioning that these solutions were obtained using the method that has been proposed in [4].

As the parameters $k$, $q$, $\alpha$, $\beta$, and $\tau_{max}$ have a crucial impact on the optimization behavior of ACO algorithms, a parameter optimization was conducted for PACO and WPACO in a two-stage procedure. First, sets of standard parameters were obtained from the literature. Then, for each algorithm an average initial best parameter setting was obtained by checking all combinations of the chosen parameters. In the second step, the automatic configuration tool Irace [11] was used to optimize the values of parameters $\alpha$ and $\beta$. Both steps are explained in detail in the following paragraphs.

The candidate values $q \in \{0.1, 0.5, 0.9\}$, together with three candidate values each for $k$ and $\tau_{max}$, were obtained from the literature on PACO [9]. Recall that PACO and WPACO have different notions of the term population. Therefore, we use the parameters $k_{PACO}$ and $k_{WPACO}$, each with three candidate values, to denote the size of the population in the respective algorithm. The larger values of parameter $k_{WPACO}$ were chosen with respect to the maximum weight 10.
The reasoning is that all jobs within a multiset of the weighted population would be equivalent if the weight of a job is larger than the capacity of the weighted population. In the first parameter optimization step, all problem instances were solved by the PACO and the WPACO algorithm using all 27 combinations of the values of $q$, $\tau_{max}$, $k_{PACO}$, and $q$, $\tau_{max}$, $k_{WPACO}$, respectively. Each problem instance from $\mathcal{X}$ was solved 5 times for each value combination and each algorithm. The remaining parameter values of both algorithms were chosen as follows: the number of ants was 10 and the number of iterations was 10000. In addition, the standard values $\alpha = 1$ and $\beta = 2$ were used. The heuristic information was obtained using the modified MDD heuristic (formula (5)) and the solutions were constructed using the summation rule (formulas (3) and (4)). These decisions are based on results presented in [12].

The aim of the second step of parameter optimization is to improve the initial parameter values that were obtained at the end of the first step. In particular, the values for parameters $\alpha$ and $\beta$ were optimized using the automatic configuration tool Irace [11], which is an extension of the Iterated F-race procedure [3]. Given a set of problem instances, an algorithm that solves these problems, and a set of parameters of the algorithm, the iterated racing procedure consists of three main phases that are iteratively performed until a stopping criterion is met. First, new parameter configurations are selected from the parameter space according to a particular sampling distribution. The initial parameter space is spanned by the ranges of the input parameters. Second, the best of these parameter configurations are determined according to a statistical approach. Third, the sampling distribution is adjusted in order to sample towards the best configurations. After the stopping criterion is met, Irace returns a set of most appropriate parameter settings for the given set of problem instances. For more information on iterated racing and the Irace tool, the reader is referred to [11].

Irace was used to optimize the values of parameters $\alpha, \beta \in [0.…, ….0]$ (with step size 0.001) for each problem instance of the evaluation set individually. Initial tests showed that the lower and upper bounds on $\alpha$ and $\beta$ are appropriate. Irace was configured to run each instance 2000 times. The step of parameter optimization is performed for the PACO algorithm only in order to achieve a clear competitive advantage for the PACO algorithm. The idea is to show that the weighted population update rule is able to improve the solutions of the standard PACO algorithm even if the values of the parameters are not explicitly tuned for this algorithm. The outcome of the second parameter optimization step is that for each problem instance a most appropriate setting of parameter values is determined for the standard PACO algorithm.

To evaluate the proposed population update rule a third experiment was conducted. Each problem instance of $\mathcal{X}$ was solved 5 times by the PACO and the WPACO algorithm. For each computation, the parameter settings that were obtained by the use of Irace were used for both algorithms.

Combining ACO algorithms with local search strategies has a high impact on solution quality. On the one hand, it has been shown to improve the solution quality significantly, e.g., see [5, 12] and [14] for results on ACO and PACO, respectively. On the other hand, it moves much of the optimization process away from the ACO algorithm. As the main objective of this work is to investigate the proposed population update rule, none of the presented PACO algorithms utilizes local search strategies.

Figure 3 shows the average total weighted tardiness (TWT) achieved by the PACO and the WPACO algorithm for the first step of the parameter optimization. It can be seen that on average the best TWT is obtained for parameter values $q = 0.1$, $\tau_{max} = 1$, $k_{PACO} = 5$, and $k_{WPACO} = 50$. The results show that both algorithms, i.e., PACO and WPACO, achieve on average a smaller TWT for smaller values of $q$ and $\tau_{max}$. Whereas small values of $q$ enhance the exploration of different solutions, small values of parameter $\tau_{max}$ increase the influence of the heuristic information during the optimization process. The reason is that a small $\tau_{max}$ value results in small pheromone values $\tau_{ij}$. As the values $\eta_{ij}$ are independent of $\tau_{max}$, it holds by formulas (3) and (4) that the influence of the heuristic information increases for smaller pheromone values. The results for the population size parameters $k_{PACO}$ and $k_{WPACO}$ show that the best TWT is obtained by both algorithms with a medium-sized population. This result agrees with results from the literature on PACO, e.g., see [9].

The aim of the second step of parameter optimization is to utilize the software tool Irace to further improve the parameter values used for the PACO algorithm. Parameter values $q = 0.1$, $k_{PACO} = 5$, and $\tau_{max} = 1$ were fixed during this step of parameter optimization. The reason is that initial tests showed that optimizing these parameters on the evaluation set results in highly similar parameter configurations. As a consequence, the software tool Irace was used to optimize the values of parameters $\alpha$ and $\beta$ within the range $[0.…, ….0]$ only.

[Figure 3, three panels plotting the relative deviation (y-axis) of PACO and WPACO against (a) $q \in \{0.1, 0.5, 0.9\}$, (b) $k_{PACO}$ / $k_{WPACO}$, and (c) $\tau_{max}$.]

Figure 3: Relative deviation from the average total weighted tardiness (TWT) achieved by applying the algorithms PACO and WPACO to the evaluation set $\mathcal{X}$. The results are illustrated for different parameter values of $q$ (a), $k_{PACO}$ / $k_{WPACO}$ (b), and $\tau_{max}$ (c) and in relation to the parameter configuration that resulted in the smallest TWT, i.e., $q = 0.1$, $\tau_{max} = 1$, $k_{PACO} = 5$, and $k_{WPACO} = 50$.

[Figure 4, two panels (a) and (b), each showing boxplots of $\alpha$ and $\beta$.]

Figure 4: Distribution of parameters $\alpha$ and $\beta$ obtained by Irace. The boxplots show the optimized values of $\alpha$ and $\beta$ for all problem instances of the evaluation set $\mathcal{X}$ (a) and of $\mathcal{X}$ with four of the subsets $\mathcal{X}_{a,b}$ removed (b).

[Figure 5, total weighted tardiness (y-axis) of PACO-D, WPACO-D, PACO-I, and WPACO-I.]

Figure 5: Average weighted tardiness over 10000 iterations computed for each problem instance of the evaluation set including 5 repetitions for each problem instance. PACO-D and WPACO-D use the default values $\alpha = 1$, $\beta = 2$, whereas PACO-I and WPACO-I use the parameter configurations that were optimized by Irace. PACO-D and PACO-I use the standard population update rule; WPACO-D and WPACO-I use the proposed population update rule.

Table 1: Average total weighted tardiness of the best solutions from the OR-Library as well as the solutions obtained by applying PACO-D, PACO-I, WPACO-D, and WPACO-I to the problem instances from the evaluation set.

Method        Total weighted tardiness   Diff. to OR
OR-Library    217851                     -
PACO-D        274140                     25.8%
WPACO-D       268571                     23.3%
PACO-I        233718                     7.3%
WPACO-I       225641                     3.6%
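The last column of Table 1 is the relative deviation of each average TWT from the OR-Library average. The following short Python sketch recomputes it from the tabulated values; the function and variable names are illustrative.

```python
def relative_deviation(value, reference):
    """Fraction by which value exceeds reference."""
    return (value - reference) / reference


or_library = 217851
table_1 = {"PACO-D": 274140, "WPACO-D": 268571, "PACO-I": 233718, "WPACO-I": 225641}

for method, twt in table_1.items():
    print(f"{method}: {100 * relative_deviation(twt, or_library):.1f}%")
```

For example, PACO-D yields $(274140 - 217851)/217851 \approx 0.258$, i.e., 25.8%.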

Since Irace has been used to optimize the parameter values for each problem instance of the evaluation set separately, a distribution of optimized values for the parameters $\alpha$ and $\beta$ has been obtained. Figure 4(a) illustrates this distribution. It can be seen that most optimized values for $\alpha$ and $\beta$ lie in the intervals $[0.…, ….76]$ and $[0.…, …]$, respectively. […] Most of the outliers correspond to problem instances from $\mathcal{X}$ that were generated with one particular $RDD$ value and four of the $TF$ values, i.e., to four of the sets $\mathcal{X}_{a,b}$. The reason is that these problem instances could be solved optimally with nearly each combination of values for $\alpha$ and $\beta$. Consequently, the problem instances from these four sets do not allow a parameter optimization. As a result, the $\alpha$ and $\beta$ values that were obtained from Irace by tuning the parameter values for PACO on these problem instances were removed from the distribution. The resulting distributions for $\alpha$ and $\beta$ are shown in Figure 4(b). The figure shows that now only a few outliers occur.

Figure 5 shows, over the course of 10000 iterations, the average TWT achieved by applying the algorithms PACO and WPACO with and without optimized parameter values to each problem instance of the evaluation set. PACO and WPACO using the default values $\alpha = 1$, $\beta = 2$ are denoted by PACO-D and WPACO-D, respectively, and by PACO-I and WPACO-I, respectively, when using the values of $\alpha$ and $\beta$ that were optimized by Irace. It can be seen that algorithms PACO-I and WPACO-I achieve solutions with significantly smaller average TWT than algorithms PACO-D and WPACO-D. Whereas algorithms PACO-I and WPACO-I converge approximately at iteration 1500, the algorithms PACO-D and WPACO-D converge considerably later, around iteration 8000. The results show the benefit of the parameter optimization that was performed by Irace. Figure 5 also shows the effect of the proposed population update rule: for each iteration, algorithm WPACO-I (WPACO-D) achieves solutions with smaller average TWT than PACO-I (respectively PACO-D).

The average TWT of the best solutions from the OR-Library as well as of the solutions obtained by applying PACO-D, PACO-I, WPACO-D, and WPACO-I to the problem instances of the evaluation set are listed in Table 1. It shows that all four algorithms produce solutions that are on average worse than the best solutions of the OR-Library. This result is not surprising as all four algorithms are metaheuristics. Another fact that contributes to the deviation from the average TWT of the best OR-Library solutions is that no local search strategy was used in order to improve already found solutions. However, Table 1 also shows that the PACO algorithms that use the proposed weighted population update rule give much better results than PACO with the standard update rule. More precisely, algorithms WPACO-I and WPACO-D achieve solutions that exhibit an average TWT that is larger than the average TWT of the best OR-Library solutions by 3.6% and 23.3%, respectively. The corresponding PACO algorithms that use the standard update rule, i.e., PACO-I and PACO-D, obtain solutions with a TWT that is larger than the average TWT of the best OR-Library solutions by 7.3% and 25.8%, respectively.

[Figure 6, relative deviation (y-axis) of PACO-D, WPACO-D, PACO-I, and WPACO-I for each set of problem instances $(a, b)$ (x-axis).]

Figure 6: Relative deviation of the average weighted tardiness (y-axis) from the average weighted tardiness of the best OR-Library solutions for PACO-D, PACO-I, WPACO-D, and WPACO-I for the problem instances of the evaluation set. The relative deviation is illustrated for each algorithm and the solutions of each set $\mathcal{X}_{a,b}$ that was constructed using the parameters $RDD = a$ and $TF = b$. The set $\mathcal{X}_{a,b}$ is represented by the notation $(a, b)$ (x-axis).

[Figure 7, average fraction (y-axis) over iterations 0 to 10000 (x-axis) for PACO-I and WPACO-I; legend: job weight.]

Figure 7: Average fraction of iterations where a job with a certain weight changes its position within the iteration-best schedules.

Figure 6 shows by which fraction the average TWT obtained by all four investigated PACO algorithms deviates from the average TWT of the best OR-Library solutions with respect to all combinations of parameter values $TF$ and $RDD$ that were used for the construction of the evaluation set. Generally, it can be seen that the algorithms PACO-I and WPACO-I achieve much smaller relative deviations, i.e., a much smaller average TWT, than algorithms PACO-D and WPACO-D, respectively. The figure also shows that the best OR-Library solutions were found by each algorithm for the problem instances of three of the sets $\mathcal{X}_{a,b}$. This result can be explained by the fact that these combinations of $RDD$ and $TF$ values lead to comparatively large and diverse due dates. The problem instances of the evaluation set that were generated with $TF = 0.…$ […] $RDD$ values. In particular, the problem instances from set $\mathcal{X}_{…,…}$ were solved optimally by the PACO-D algorithm only. Moreover, the problem instances from set $\mathcal{X}_{…,…}$ were not solved optimally by all investigated algorithms. One reason is that the due dates become less diverse for smaller values of $RDD$. The worst average TWT was achieved for the problem instances from the sets $\mathcal{X}_{…,…}$ and $\mathcal{X}_{…,…}$. This result is consistent with the observations that were made in [5, 8] and which were explained and investigated in more detail in [13]. In particular, the authors stated that SMTWTP instances that were generated with $TF = 0.…$ […] $\mathcal{X} \setminus \mathcal{X}_{…,…}$ the algorithms WPACO-I and WPACO-D achieve a much smaller average TWT than the algorithms PACO-I and PACO-D, respectively.

To investigate the difference between the weighted population update rule and the standard population update rule with respect to the composition of a population during optimization, Figure 7 shows

Figure 8:

Average fraction that a job with a certain weight changes its position within successiveiteration best schedules over 10000 iterations by applying WPACO-I. For better visualization only jobweights 1, 3, 5, 7 and 10 are included. P r obab ili t y Job weight

Figure 9:

Empirical probability that a job with a certain weight changes its position within theiteration best schedules over 10000 iterations by applying WPACO-I. For better visualization only jobweights 1, 3, 5, 7 and 10 are included. ow often a job with a certain weight changes its position in a single iteration. For each problem instanceof the evaluation set and the algorithms PACO-I and WPACO-I, the ﬁgure was obtained by comparingthe iteration best schedules of successive iterations. For PACO-I it can be seen that the average fractionof iterations is equally distributed among the job weights. The reason for this is that PACO-I usesthe standard population update rule which cannot consider the weights of the jobs a given SMTWTPinstance. For WPACO-I the ﬁgure shows that the average fraction of a position change increases for adecreasing job weight. Thus, jobs with a large weight are less likely to get scheduled to another position.The reason for this is explained in the following. Jobs with a small weight have less inﬂuence on the TWTthan jobs with a large weight. Hence, jobs with a large weight are scheduled early during optimizationin order to reduce the TWT of a schedule. Since the weighed population update rule adds jobs with alarge weight multiple times to the population, it follows that future ants prefer the same position forthose jobs. A consequence is that the position of a job with a large weight is ﬁxed in early iterationswhich reduces its average fraction of position change illustrated in Figure 7. In the following iterations,the process of optimization focuses on jobs with smaller weights leading to an increased average fractionof their position changes. Figure 8 displays this result. After a short initialization phase the averagefraction of jobs with smaller weights increases signiﬁcantly. After approximately 1500 iterations, theaverage fractions adjust. At this point the algorithm converges, as Figure 5 shows. 
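The comparison of successive iteration best schedules described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the function name `position_change_fractions`, the representation of a schedule as a list of job ids in processing order, the pooling of all jobs that share a weight, and the toy data are all hypothetical choices made for this example.

```python
from collections import defaultdict


def position_change_fractions(schedules, weights):
    """Fraction of iteration transitions in which jobs of each weight move.

    schedules: list of iteration best schedules, one per iteration; each
               schedule is a list of job ids in processing order (assumed
               representation, every schedule contains the same jobs).
    weights:   dict mapping job id -> job weight.
    Returns a dict mapping weight -> fraction in [0, 1], pooled over all
    jobs that have this weight.
    """
    changes = defaultdict(int)  # weight -> number of observed position changes
    chances = defaultdict(int)  # weight -> number of (job, transition) pairs

    # Compare the schedule of every iteration with the one that follows it.
    for prev, curr in zip(schedules, schedules[1:]):
        prev_pos = {job: pos for pos, job in enumerate(prev)}
        for pos, job in enumerate(curr):
            w = weights[job]
            chances[w] += 1
            if prev_pos[job] != pos:
                changes[w] += 1

    return {w: changes[w] / chances[w] for w in chances}


if __name__ == "__main__":
    # Toy data (hypothetical): job 0 is heavy and keeps its position,
    # jobs 1 and 2 are light and swap places in every iteration.
    toy_weights = {0: 10, 1: 1, 2: 1}
    toy_schedules = [[0, 1, 2], [0, 2, 1], [0, 1, 2]]
    print(position_change_fractions(toy_schedules, toy_weights))
    # prints {10: 0.0, 1: 1.0}
```

In the toy data the heavy job never moves while the light jobs move in every transition, which mirrors the qualitative trend reported for WPACO-I. How the fractions are averaged over problem instances is not specified in the text, so the sketch leaves that step out.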
Additionally, the probability that a job changes its position decreases in further iterations. Figure 9 illustrates this effect. After the algorithm converges (after approximately 1500 iterations), the probabilities become stable. Altogether this shows that for WPACO-I, which uses the weighted population update rule, a correlation becomes noticeable: jobs with large weights are less likely to get rescheduled at another position. To verify this assumption, the Pearson correlation between the job weight and the average fraction of iterations where its position changes was calculated. The result is a probability p = 6 . · − and a correlation coefficient r = − .

In this paper a novel population update rule for population based ant colony optimization (PACO) has been presented for the example of the single machine total weighted tardiness problem. The new update rule, called weighted population update rule, allows different parts of a solution to be weighted. PACO with the new population update rule (WPACO) has achieved better solution quality for 125 benchmark problem instances than its counterpart that used the standard population update rule. A detailed analysis of the solutions obtained by WPACO has revealed a strong negative correlation between the weight of a job and the probability that the job gets rescheduled at another position in successive iterations.

For future work it is planned to explore the possibility of applying WPACO to other optimization problems such as traveling salesperson problems or quadratic assignment problems.

References

[1] Andreas Bauer, Bernd Bullnheimer, Richard F Hartl, and Christine Strauss. An ant colony optimization approach for the single machine total tardiness problem. In Proc. Congress on Evolutionary Computation (CEC 1999), volume 2, pages 1445–1450, 1999.

[2] John E Beasley. OR-Library: distributing test problems by electronic mail. Journal of the Operational Research Society, 41(11):1069–1072, 1990.

[3] Mauro Birattari, Zhi Yuan, Prasanna Balaprakash, and Thomas Stützle. F-Race and iterated F-Race: An overview. In Experimental methods for the analysis of optimization algorithms, pages 311–336. Springer, 2010.

[4] Richard K Congram, Chris N Potts, and Steef L van de Velde. An iterated dynasearch algorithm for the single-machine total weighted tardiness scheduling problem. INFORMS Journal on Computing, 14(1):52–67, 2002.

[5] Matthijs Den Besten, Thomas Stützle, and Marco Dorigo. Ant colony optimization for the total weighted tardiness problem. In Proc. Int'l Conference on Parallel Problem Solving from Nature (PPSN 2000), pages 611–620. Springer, 2000.

[6] Marco Dorigo and Thomas Stützle. Ant Colony Optimization. Bradford Company, Scituate, MA, USA, Jan. 2004.

[7] Marco Dorigo and Thomas Stützle. Ant colony optimization: overview and recent advances. In Handbook of metaheuristics, pages 311–351. Springer, 2019.

[8] Martin Josef Geiger. On heuristic search for the single machine total weighted tardiness problem – some theoretical insights and their empirical verification. European Journal of Operational Research, 207(3):1235–1243, 2010.

[9] Michael Guntsch and Martin Middendorf. A population based approach for ACO. In Proc. Workshops on Applications of Evolutionary Computation, pages 72–81. Springer, 2002.

[10] Jan Karel Lenstra, AHG Rinnooy Kan, and Peter Brucker. Complexity of machine scheduling problems. In Annals of Discrete Mathematics, volume 1, pages 343–362. Elsevier, 1977.

[11] Manuel López-Ibáñez, Jérémie Dubois-Lacoste, Leslie Pérez Cáceres, Mauro Birattari, and Thomas Stützle. The irace package: Iterated racing for automatic algorithm configuration. Operations Research Perspectives, 3:43–58, 2016.

[12] Daniel Merkle and Martin Middendorf. An ant algorithm with a new pheromone evaluation rule for total tardiness problems. In Proc. Workshops on Real-World Applications of Evolutionary Computation, pages 290–299. Springer, 2000.

[13] Daniel Merkle and Martin Middendorf. A new approach to solve permutation scheduling problems with ant colony optimization. In Proc. Workshops on Applications of Evolutionary Computation (EvoWorkshops 2001), pages 484–494. Springer, 2001.

[14] Sabrina M Oliveira, Mohamed Saifullah Hussin, Thomas Stützle, Andrea Roli, and Marco Dorigo. A detailed analysis of the population-based ant colony optimization algorithm for the TSP and the QAP. In Proc. Genetic and Evolutionary Computation Conference Companion (GECCO 2011), pages 13–14. ACM, 2011.

[15] Thomas Weise, Raymond Chiong, Ke Tang, Jörg Lässig, Shigeyoshi Tsutsui, Wenxiang Chen, Zbigniew Michalewicz, and Xin Yao. Benchmarking optimization algorithms: An open source framework for the traveling salesman problem. IEEE Computational Intelligence Magazine, 9(3):40–52, 2014.