Lin-Kernighan Heuristic Adaptations for the Generalized Traveling Salesman Problem
D. Karapetyan*, G. Gutin
Royal Holloway, University of London, Egham, Surrey, TW20 0EX, United Kingdom
Abstract
The Lin-Kernighan heuristic is known to be one of the most successful heuristics for the Traveling Salesman Problem (TSP). It has also proven its efficiency in application to some other problems.

In this paper we discuss possible adaptations of TSP heuristics for the Generalized Traveling Salesman Problem (GTSP) and focus on the case of the Lin-Kernighan algorithm. At first, we provide an easy-to-understand description of the original Lin-Kernighan heuristic. Then we propose several adaptations, both trivial and complicated. Finally, we conduct a fair competition between all the variations of the Lin-Kernighan adaptation and some other GTSP heuristics.

It appears that our adaptation of the Lin-Kernighan algorithm for the GTSP reproduces the success of the original heuristic. Different variations of our adaptation outperform all other heuristics in a wide range of trade-offs between solution quality and running time, making Lin-Kernighan the state-of-the-art GTSP local search.
Keywords:
Heuristics, Lin-Kernighan, Generalized Traveling Salesman Problem, Combinatorial Optimization.
1. Introduction
One of the most successful heuristic algorithms for the famous Traveling Salesman Problem (TSP) known so far is the Lin-Kernighan heuristic (Lin and Kernighan, 1973). It was proposed almost forty years ago, but even nowadays it is the state-of-the-art TSP local search (Johnson and McGeoch, 2002).

In this paper we attempt to reproduce the success of the original TSP Lin-Kernighan heuristic for the Generalized Traveling Salesman Problem (GTSP), which is an important extension of the TSP. In the TSP, we are given a set V of n vertices and weights w(x → y) of moving from a vertex x ∈ V to a vertex y ∈ V. A feasible solution, or a tour, is a cycle visiting every vertex in V exactly once. In the GTSP, we are given a set V of n vertices, weights w(x → y) of moving from x ∈ V to y ∈ V, and a partition of V into m nonempty clusters C_1, C_2, …, C_m such that C_i ∩ C_j = ∅ for each i ≠ j and C_1 ∪ C_2 ∪ … ∪ C_m = V. A feasible solution, or a tour, is a cycle visiting exactly one vertex in every cluster. The objective of both the TSP and the GTSP is to find the shortest tour.

* Corresponding author. Email addresses: [email protected] (D. Karapetyan), [email protected] (G. Gutin).

Preprint submitted to Elsevier, June 24, 2010.

If the weight matrix is symmetric, i.e., w(x → y) = w(y → x) for any x, y ∈ V, the problem is called symmetric. Otherwise it is an asymmetric GTSP. In what follows, the number of vertices in cluster C_i is denoted by |C_i|, the size of the largest cluster is s, and Cluster(x) is the cluster containing a vertex x. The weight function w can be used for edges, for paths (w(x_1 → x_2 → … → x_k) = w(x_1 → x_2) + w(x_2 → x_3) + … + w(x_{k−1} → x_k)) and for cycles.

Since Lin-Kernighan is designed for the symmetric problem, we do not consider the asymmetric GTSP in this research. However, some of the algorithms proposed in this paper are naturally suited for both symmetric and asymmetric cases.

Observe that the TSP is a special case of the GTSP with |C_i| = 1 for each i and, hence, the GTSP is NP-hard. The GTSP has a host of applications in warehouse order picking with multiple stock locations, sequencing computer files, postal routing, airport selection and routing for courier planes, and some others, see, e.g., (Fischetti et al., 1995, 1997; Laporte et al., 1996; Noon and Bean, 1991) and references therein.

A lot of attention has been paid in the literature to solving the GTSP. Several researchers (Ben-Arieh et al., 2003; Laporte and Semet, 1999; Noon and Bean, 1993) proposed transformations of the GTSP into the TSP. At first glance, the idea to transform a little-studied problem into a well-known one seems natural; however, this approach has a very limited application. On the one hand, it requires exact solutions of the obtained TSP instances, because even a near-optimal solution of such a TSP may correspond to an infeasible GTSP solution.
On the other hand, the produced TSP instances have quite an unusual structure which is difficult for the existing solvers. A more efficient way to solve the GTSP exactly is the branch-and-bound algorithm designed by Fischetti et al. (1997). This algorithm was able to solve instances with up to 89 clusters. Two approximation algorithms were proposed in the literature, but both of them are unsuitable for the general case of the problem, and the guaranteed solution quality is unreasonably low for real-world applications, see (Bontoux et al., 2010) and references therein.

In order to obtain good (i.e., not necessarily exact) solutions for larger GTSP instances, one should use the heuristic approach. Several construction heuristics and local searches were discussed in (Bontoux et al., 2010; Gutin and Karapetyan, 2010; Hu and Raidl, 2008; Renaud and Boctor, 1998; Snyder and Daskin, 2006) and some others. A number of metaheuristics were proposed by Bontoux et al. (2010); Gutin and Karapetyan (2010); Gutin et al. (2008); Huang et al. (2005); Pintea et al. (2007); Silberholz and Golden (2007); Snyder and Daskin (2006); Tasgetiren et al. (2007); Yang et al. (2008).

In this paper we thoroughly discuss possible adaptations of a TSP heuristic for the GTSP and focus on the Lin-Kernighan algorithm. The idea of the Lin-Kernighan algorithm was already successfully applied to the Multidimensional Assignment Problem (Balas and Saltzman, 1991; Karapetyan and Gutin, 2010). A straightforward adaptation for the GTSP was proposed by Hu and Raidl (2008); their algorithm constructs a set of TSP instances and solves all of them with the TSP Lin-Kernighan heuristic. Bontoux et al. (2010) apply the TSP Lin-Kernighan heuristic to the TSP tours induced by the GTSP tours. It will be shown in Section 3 that both of these approaches are relatively weak.

The Lin-Kernighan heuristic is a sophisticated algorithm adjusted specifically for the TSP.
The explanation provided by Lin and Kernighan (1973) is full of details which complicate understanding of the main idea of the method. We start our paper with a clear explanation of a simplified TSP Lin-Kernighan heuristic (Section 2) and then propose several adaptations of the heuristic for the GTSP (Section 3). In Section 4, we provide the results of a thorough experimental evaluation of all the proposed Lin-Kernighan adaptations and discuss the success of our approach in comparison to other GTSP heuristics. In Section 5 we discuss the outcomes of the conducted research and select the state-of-the-art GTSP local searches.
2. The TSP Lin-Kernighan Heuristic
In this section we describe the TSP Lin-Kernighan heuristic (LK_tsp). It is a simplified version of the original algorithm. Note that (Lin and Kernighan, 1973) was published almost 40 years ago, when modest computer resources, obviously, influenced the algorithm design, hiding the main idea behind the technical details. Also note that, back then, the 'goto' operator was widely used; this affects the original algorithm description. In contrast, our interpretation of the algorithm is easy to understand and implement.

LK_tsp is a generalization of the k-opt local search. The k-opt neighborhood N_k-opt(T) includes all the TSP tours which can be obtained by removing k edges from the original tour T and adding k different edges such that the resulting tour is feasible. Observe that exploring the whole of N_k-opt(T) takes O(n^k) operations and, thus, with a few exceptions, only 2-opt and rarely 3-opt are used in practice (Johnson and McGeoch, 2002; Rego and Glover, 2006).

Similarly to k-opt, LK_tsp tries to remove and insert edges in the tour, but it explores only those parts of the k-opt neighborhood that seem to be the most promising. Consider removing an edge from a tour; this produces a path. Rearrange this path to minimize its weight. To close up the tour we only need to add one edge. Since we did not consider this edge during the path optimization, it is likely that its weight is neither minimized nor maximized. Hence, the weight of the whole tour is probably reduced together with the weight of the path. Here is a general scheme of LK_tsp:

1. Let T be the original tour.
2. For every edge e → b ∈ T do the following:
   (a) Let P = b → … → e be the path obtained from T by removing the edge e → b.
   (b) Rearrange P to minimize its weight. Every time an improvement is found during this optimization, try to close up the path P. If it leads to a tour shorter than T, save this tour as T and start the whole procedure again.
   (c) If no tour improvement was found, continue to the next edge (Step 2).

In order to reduce the weight of the path, a local search is used as follows. On every move, it tries to break up the path into two parts, invert one of these parts, and then rejoin them (see Figure 1). In particular, the algorithm tries every edge x → y and selects the one which maximizes the gain g = w(x → y) − w(e → x). If the maximum g is positive, the corresponding move is an improvement and the local search is applied again to the improved path.

Figure 1: An example of a local search move for a path improvement: (a) the original path b → … → x → y → … → e; (b) the path after a local search move, b → … → x → e → … → y. The weight of the path is reduced by w(x → y) − w(x → e).

Observe that this algorithm tries only the best improvement and skips the other ones. A natural enhancement of the heuristic would be to use a backtracking mechanism to try all the improvements. However, this would slow down the algorithm too much. A compromise is to use backtracking only for the first α moves. This approach is implemented in a recursive function ImprovePath(P, depth, R), see Algorithm 1.

Algorithm 1
ImprovePath(P, depth, R) recursive algorithm (LK_tsp version). The function either terminates after an improved tour is found or finishes normally with no profit.

Require: The path P = b → … → e, recursion depth depth and a set of restricted vertices R.
if depth < α then
    for every edge x → y ∈ P such that x ∉ R do
        Calculate g = w(x → y) − w(e → x) (see Figure 1b).
        if g > 0 then
            if the tour b → … → x → e → … → y → b is an improvement over the original one then
                Accept the produced tour and terminate.
            else
                ImprovePath(b → … → x → e → … → y, depth + 1, R ∪ {x}).
else
    Find the edge x → y which maximizes g = w(x → y) − w(e → x).
    if g > 0 then
        if the tour b → … → x → e → … → y → b is an improvement over the original one then
            Accept the produced tour and terminate.
        else
            return ImprovePath(b → … → x → e → … → y, depth + 1, R ∪ {x}).

ImprovePath(P, 1, ∅) takes O(n^α · depth_max) operations, where depth_max is the maximum depth of recursion achieved during the run. Hence, one should use only small values of the backtracking depth α.

The algorithm presented above is a simplified Lin-Kernighan heuristic. Here is a list of major differences between the described algorithm and the original one.

1. The original heuristic does not accept the first found tour improvement. It records it and continues optimizing the path in the hope of finding a better tour improvement. Note that it was reported by Helsgaun (2000) that this complicates the algorithm but does not really improve its quality.
2. The original heuristic does not try all the n options when optimizing a path. It considers only the five shortest edges x → e in non-decreasing order. This hugely reduces the running time and helps to find the best rather than the first improvement at the backtracking stage.
However, this speed-up approach is known to be a weak point of the original implementation (Helsgaun, 2000; Johnson and McGeoch, 2002). Indeed, even if the edge x → y is long, the algorithm does not try to break it if the edge x → e is not in the list of the five shortest edges to e. Note also that looking for the closest vertices or clusters may be meaningless in the application to the GTSP. In our implementation, every edge x → y is considered.
3. The original heuristic does not allow deleting previously added edges or adding previously deleted edges. It was noted (Helsgaun, 2000; Johnson and McGeoch, 2002) that either of these restrictions is enough to prevent an infinite loop. In our implementation a previously deleted edge is allowed to be added again, but every edge can be deleted only once. Our implementation also prevents some other moves; however, the experimental evaluation shows that this does not affect the performance of the heuristic.
4. The original heuristic also considers some more sophisticated moves to produce a path from the tour.
5. The original heuristic is, in fact, embedded into a metaheuristic which runs the optimization several times. There are several tricks related to the metaheuristic which are inapplicable to a single run.

The worst case time complexity of the Lin-Kernighan heuristic seems to be unknown in the literature (Helsgaun, 2009), but we assume that it is exponential. Indeed, observe that the number of iterations of the k-opt local search may be non-polynomial for any k (Chandra et al., 1994) and that LK_tsp is a modification of k-opt. However, Helsgaun (2009) notes that such undesirable instances are very rare and normally LK_tsp proceeds in polynomial time.
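The simplified scheme above can be sketched in a few dozen lines. The following Python sketch is ours, not the authors' code: it represents a tour as a plain list of vertices, uses first-improvement acceptance and a backtracking depth α as in Algorithm 1, and omits all the speed-ups discussed above.

```python
# A hedged sketch of the simplified LK_tsp scheme. All names (lk_tsp,
# improve_path, ...) are ours; this is an illustration, not the authors'
# implementation.

def tour_weight(w, tour):
    """Weight of a cyclic tour given a symmetric weight matrix w."""
    return sum(w[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def path_weight(w, path):
    """Weight of an open path."""
    return sum(w[path[i]][path[i + 1]] for i in range(len(path) - 1))

def lk_tsp(w, tour, alpha=2):
    """Repeatedly break the tour into a path b..e and try to improve it."""
    improved = True
    while improved:
        improved = False
        for i in range(len(tour)):
            # Removing the edge e -> b turns the tour into the path b..e.
            path = tour[i:] + tour[:i]
            better = improve_path(w, path, tour_weight(w, tour), 1, set(), alpha)
            if better is not None:       # accepted a shorter tour; restart
                tour = better
                improved = True
                break
    return tour

def improve_path(w, path, best, depth, restricted, alpha):
    """One level of the ImprovePath recursion (backtracking while depth < alpha)."""
    b, e = path[0], path[-1]
    candidates = []
    for k in range(1, len(path) - 1):
        x, y = path[k], path[k + 1]
        if x in restricted:
            continue
        g = w[x][y] - w[e][x]            # gain of replacing x->y with x->e
        if g > 0:
            candidates.append((g, k))
    if depth >= alpha:                   # past the backtracking depth: greedy
        candidates = sorted(candidates, reverse=True)[:1]
    for g, k in candidates:
        x = path[k]
        # New path b..x followed by the reversed tail, i.e. b..x, e..y.
        new_path = path[:k + 1] + path[k + 1:][::-1]
        closed = path_weight(w, new_path) + w[new_path[-1]][b]
        if closed < best:
            return new_path              # caller treats it as the new tour
        result = improve_path(w, new_path, best, depth + 1,
                              restricted | {x}, alpha)
        if result is not None:
            return result
    return None
```

On a small symmetric instance (unit square under the Manhattan metric), the sketch repairs a crossing tour such as 0 → 2 → 1 → 3 down to the optimal weight.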
3. Adaptations of the Lin-Kernighan Heuristic for the GTSP
It may seem that the GTSP is only a slight variation of the TSP. In particular, one may propose splitting the GTSP into two problems (Renaud and Boctor, 1998): solving the TSP induced by the given tour to find the cluster order, and finding the shortest cycle visiting the clusters according to the found order. We will show now that this approach is poor with regards to solution quality. Let N_TSP(T) be the set of tours which can be obtained from the tour T by reordering the vertices in T. Observe that one has to solve a TSP instance induced by T to find the best tour in N_TSP(T).

Let N_CO(T) be the set of all the GTSP tours which visit the clusters in exactly the same order as in T. The size of the N_CO(T) neighborhood is ∏_i |C_i| ∈ O(s^m), but there exists a polynomial algorithm (we call it Cluster Optimization, CO) which finds the best tour in N_CO(T) in O(ms^3) operations (Fischetti et al., 1997). Moreover, it requires only O(ms^2 · min_i |C_i|) time, i.e., if the instance has at least one cluster of size O(1), CO proceeds in O(ms^2) operations. (Recall that s is the size of the largest cluster: s = max_i |C_i|.)

The following theorem shows that splitting the GTSP into two problems (local search in N_TSP(T) and then local search in N_CO(T)) does not guarantee any solution quality.

Theorem 1.
The best tour among N_CO(T) ∪ N_TSP(T) can be a longest GTSP tour different from a shortest one.

Proof. Consider the GTSP instance G in Figure 2a. It is a symmetric GTSP containing 5 clusters {1}, {2, 2′}, {3}, {4} and {5}. The weights of the edges not displayed in the graph are as follows: w(1 → 3) = w(1 → 4) = 0 and w(2 → 5) = w(2′ → 5) = 1.

Observe that the tour T = 1 → 2 → 3 → 4 → 5 → 1, shown in Figure 2b, is a local minimum in both N_CO(T) and N_TSP(T). The dashed line shows the second solution in N_CO(T), but it gives the same objective value. It is also clear that T is a local minimum in N_TSP(T). Indeed, all the edges incident to the vertex 2 are of weight 1 and, hence, any tour through the vertex 2 is at least of weight 2.

The tour T is in fact a longest tour in G. Observe that all nonzero edges in G are incident to the vertices 2 and 2′. Since only one of these vertices can be visited by a tour, at most two nonzero edges can be included into a tour. Hence, the weight of the worst tour in G is 2.

However, there exists a better GTSP tour T_opt = 1 → 2′ → 5 → 4 → 3 → 1 of weight 1, see Figure 2a.

Figure 2: An example of a local minimum in both N_TSP(T) and N_CO(T) which is a longest possible GTSP tour: (a) the instance G and the optimal GTSP tour T_opt; (b) a local minimum T which is the worst possible GTSP tour.

In fact, the TSP and the GTSP behave quite differently during optimization. Observe that there exists no way to find out quickly whether some modification of the cluster order improves the tour. Indeed, choosing wrong vertices within clusters may lead to an arbitrarily large increase of the tour weight. And since a replacement of a vertex within one cluster may require a replacement of vertices in the neighboring clusters, any local change influences the whole tour in the general case.
A typical local search with the neighborhood N(T) performs as follows:

Require: The original solution T.
for all T′ ∈ N(T) do
    if w(T′) < w(T) then
        T ← T′.
        Run the whole algorithm again.
return T.

Let N_1(T) ⊆ N_TSP(T) be a neighborhood of some TSP local search LS_1(T). Let N_2(T) ⊆ N_CO(T) be a neighborhood of some GTSP local search LS_2(T) which leaves the cluster order fixed. Then one can think of the following two adaptations of a TSP local search for the GTSP:

(i) Enumerate all solutions T′ ∈ N_1(T). For every candidate T′ run T′ ← LS_2(T′) to optimize it in N_2(T′).
(ii) Enumerate all solutions T′ ∈ N_2(T). For every candidate T′ run T′ ← LS_1(T′) to optimize it in N_1(T′).

Observe that the TSP neighborhood N_1(T) is normally harder to explore than the cluster optimization neighborhood N_2(T). Consider, e.g., N_1(T) = N_TSP(T) and N_2(T) = N_CO(T). Then both options yield an optimal GTSP solution, but Option (i) requires O(m! · ms^3) operations while Option (ii) requires O(s^m · m!) operations.

Moreover, many practical applications of the GTSP have some localization of clusters, i.e., |w(x → y_1) − w(x → y_2)| ≪ w(x → y_1) on average, where Cluster(y_1) = Cluster(y_2) ≠ Cluster(x). Hence, the landscape of N_1(T) depends on the cluster order more than the landscape of N_2(T) depends on the vertex selection. From the above it follows that Option (i) is preferable.

Option (ii) was used by Hu and Raidl (2008) as follows. The cluster optimization neighborhood N_2(T) includes there all the tours which differ from T in exactly one vertex. For every T′ ∈ N_2(T) the Lin-Kernighan heuristic was applied. This results in n runs of the Lin-Kernighan heuristic, which makes the algorithm unreasonably slow.

Option (i) may be implemented as follows:

Require: The original tour T.
for all T′ ∈ N_1(T) do
    T′ ← QuickImprove(T′).
    if w(T′) < w(T) then
        T ← SlowImprove(T′).
        Run the whole algorithm again.
return T.

Here QuickImprove(T) and SlowImprove(T) are some tour improvement heuristics which leave the cluster order unchanged. Formally, these heuristics should meet the following requirements:

• QuickImprove(T), SlowImprove(T) ∈ N_CO(T) for any tour T;
• w(QuickImprove(T)) ≤ w(T) and w(SlowImprove(T)) ≤ w(T) for any tour T.

QuickImprove is applied to every candidate T′ before its evaluation. SlowImprove is applied only to successful candidates in order to further improve them. One can think of the following improvement functions:
• The trivial improvement I(T), which leaves the solution without any change: I(T) = T.
• The full optimization CO(T), which applies the CO algorithm to the given solution.
• The local optimization L(T), which updates the vertices only within the clusters affected by the latest solution change. E.g., if a tour x_1 → x_2 → x_3 → x_4 → x_5 was changed to x_1 → x_3 → x_2 → x_4 → x_5, some implementation of L(T) will try every x_1 → x_3′ → x_2′ → x_4 → x_5, where x_3′ ∈ Cluster(x_3) and x_2′ ∈ Cluster(x_2).

There are five meaningful combinations of QuickImprove and SlowImprove:

1. QuickImprove(T) = I(T) and SlowImprove(T) = I(T). This actually yields the original TSP local search.
2. QuickImprove(T) = I(T) and SlowImprove(T) = CO(T), i.e., the algorithm explores the TSP neighborhood, but every time an improvement is found, the solution T is optimized in N_CO(T). One can also consider SlowImprove(T) = L(T), but it has no practical interest. Indeed, SlowImprove is used quite rarely and so its impact on the total running time is negligible. At the same time, CO(T) is much better than L(T) with respect to solution quality.
3. QuickImprove(T) = L(T) and SlowImprove(T) = I(T), i.e., every solution T′ ∈ N_1(T) is improved locally before it is compared to the original solution.
4. QuickImprove(T) = L(T) and SlowImprove(T) = CO(T), which is the same as Option 3 but it additionally optimizes the solution T′ globally in N_CO(T′) every time an improvement is found.
5. QuickImprove(T) = CO(T) and SlowImprove(T) = I(T), i.e., every candidate T′ ∈ N_1(T) is optimized globally in N_CO(T′) before it is compared to the original solution T.

These adaptations were widely applied in the literature. For example, the heuristics G2 and G3 (Renaud and Boctor, 1998) are actually 2-opt and 3-opt adapted according to Option 5. An improvement over the naive implementation of 2-opt adapted in this way is proposed by Hu and Raidl (2008); asymptotically, it is faster by a factor of 3. However, this approach is still too slow. Adaptations of 2-opt and some other heuristics according to Option 3 were used by Fischetti et al. (1997), Gutin and Karapetyan (2010), Silberholz and Golden (2007), Snyder and Daskin (2006), and Tasgetiren et al. (2007). Some unadapted TSP local searches (Option 1) were used by Bontoux et al. (2010), Gutin and Karapetyan (2010), Silberholz and Golden (2007), and Snyder and Daskin (2006).

In this section we present our adaptation LK of LK_tsp for the GTSP.
A pseudo-code of the whole heuristic is presented in Algorithm 2. Some of its details are encapsulated into the following functions (note that LK_tsp is not a typical local search based on some neighborhood and, thus, the framework presented above cannot be applied to it straightforwardly):

• Gain(P, x → y) is intended to calculate the gain of breaking a path P at an edge x → y.
• RearrangePath(P, x → y) removes an edge x → y from a path P and adds the edge x → e, where P = b → … → x → y → … → e, see Figure 1. Together with CloseUp, it includes an implementation of QuickImprove(T) (see Section 3.1), so RearrangePath may also apply some cluster optimization.
• GainIsAcceptable(P, x → y) determines if the gain of breaking a path P at an edge x → y is worth any further effort.
• CloseUp(P) adds an edge to a path P to produce a feasible tour. Together with RearrangePath, it includes an implementation of QuickImprove(T) (see Section 3.1), so CloseUp may also apply some cluster optimization.
• ImproveTour(T) is a tour improvement function. It is an analogue of SlowImprove(T) (see Section 3.1).

Algorithm 2 LK general implementation
Require: The original tour T.
Initialize the number of idle iterations i ← 0.
while i < m do
    Cyclically select the next edge e → b ∈ T.
    Let P_o = b → … → e be the path obtained from T by removing the edge e → b.
    Run T′ ← ImprovePath(P_o, 1, ∅) (see below).
    if w(T′) < w(T) then
        Set T ← ImproveTour(T′).
        Reset the number of idle iterations i ← 0.
    else
        Increase the number of idle iterations i ← i + 1.

Procedure ImprovePath(P, depth, R)
Require: The path P = b → … → e, recursion depth depth and the set of restricted vertices R.
if depth ≥ α then
    Find the edge x → y ∈ P, x ≠ b, x ∉ R, which maximizes the path gain Gain(P, x → y).
else
    Repeat the rest of the procedure for every edge x → y ∈ P, x ≠ b, x ∉ R.
Conduct the local search move: P ← RearrangePath(P, x → y) (i.e., replace the edge x → y with x → e in P).
if GainIsAcceptable(P, x → y) then
    T′ ← CloseUp(P).
    if w(T′) ≥ w(T) then
        Run T′ ← ImprovePath(P, depth + 1, R ∪ {x}).
    if w(T′) < w(T) then
        return T′.
else
    Restore the path P.
return T.

These functions are the key points in the adaptation of LK_tsp for the GTSP. They determine the behaviour of the heuristic. In Sections 3.3, 3.4 and 3.5 we describe different implementations of these functions. The
Basic variation of LK_tsp (in what follows denoted by B) is a trivial adaptation of LK_tsp according to Option 1 (see Section 3.1). It defines the functions Gain, RearrangePath, CloseUp and ImproveTour as follows:

Gain_B(b → … → e, x → y) = w(x → y) − w(e → x),

RearrangePath_B(b → … → x → y → … → e, x → y) = b → … → x → e → … → y,

CloseUp_B(b → … → e) = b → … → e → b,

and ImproveTour_B(T) is trivial. We also consider a B_co variation (Option 2) which applies CO every time an improvement is found: ImproveTour(T) = CO(T). The implementation of GainIsAcceptable(P, x → y) will be discussed in Section 3.6. The
Closest and the Shortest variations (denoted as C and S, respectively) are two adaptations of LK_tsp according to Option 3, i.e., QuickImprove(T) = L(T) and SlowImprove(T) = I(T). In other words, a local cluster optimization is applied to every candidate during the path optimization.

Consider an iteration of the path improvement heuristic ImprovePath. Let the path P = b → … → p → x → y → … → r → e be broken at the edge x → y (see Figure 3).

Figure 3: Path optimization. The path b → … → p → x → y → … → r → e is broken at the edge x → y, and the edge x → e is added instead.

Then, to calculate Gain(P, x → y) in C, we replace x with a vertex x′ ∈ Cluster(x) such that the weight of the new edge x′ → e is minimized:

Gain_C(b → … → p → x → y → … → e, x → y) = w(p → x → y) − w(p → x′ → e),

where x′ ∈ Cluster(x) is chosen to minimize w(x′ → e).

In S, we update both x and e such that the weight of the path p → x → e → r is minimized:

Gain_S(b → … → p → x → y → … → r → e, x → y) = w(p → x → y) + w(r → e) − w(p → x′ → e′ → r),

where x′ ∈ Cluster(x) and e′ ∈ Cluster(e) are chosen to minimize w(p → x′ → e′ → r).

Observe that the most time-consuming part of LK is the path optimization. In the case of the S variation, the bottleneck is the gain evaluation function, which takes O(s^2) operations. In order to reduce the number of gain evaluations in S, we do not consider some edges x → y. In particular, we assume that the improvement is usually not larger than w_min(X, Y) − w_min(X, E), where X = Cluster(x), Y = Cluster(y), E = Cluster(e) and w_min(A, B) is the weight of the shortest edge between clusters A and B: w_min(A, B) = min_{a ∈ A, b ∈ B} w(a → b). Obviously, all the values w_min(A, B) are precalculated. Note that this speed-up heuristic is used only when depth ≥ α, see Algorithm 2. One can hardly speed up the Gain function in B or C.

The RearrangePath function does some further cluster optimization in the C variation:

RearrangePath_C(b → … → p → x → y → … → e, x → y) = b → … → p → x′ → e → … → y,

where x′ ∈ Cluster(x) is chosen to minimize the weight w(p → x′ → e). In S it just repeats the optimization performed for the Gain evaluation:

RearrangePath_S(b → … → p → x → y → … → r → e, x → y) = b → … → p → x′ → e′ → r → … → y,

where x′ ∈ Cluster(x) and e′ ∈ Cluster(e) are chosen to minimize w(p → x′ → e′ → r).
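For illustration, the S gain evaluation can be sketched as follows. This is a hypothetical helper of ours, assuming a weight matrix w and a map cluster_of from each vertex to the list of vertices of its cluster; the precalculated w_min speed-up is omitted.

```python
# A hedged sketch of the S ("Shortest") gain evaluation: when the path
# b .. p x | y .. r e is broken at x -> y, both x and e are re-selected
# within their clusters so as to minimize w(p -> x' -> e' -> r).

def gain_s(w, cluster_of, p, x, y, r, e):
    """Return (gain, best_x, best_e) for the S variation; O(s^2) work."""
    # Weight of the fragments removed from the original path: p->x->y and r->e.
    removed = w[p][x] + w[x][y] + w[r][e]
    # Best replacement fragment p -> x' -> e' -> r over both clusters.
    added, best_x, best_e = min(
        (w[p][xp] + w[xp][ep] + w[ep][r], xp, ep)
        for xp in cluster_of[x]
        for ep in cluster_of[e]
    )
    return removed - added, best_x, best_e
```

The same double minimum is reused by RearrangePath_S, so in an actual implementation the chosen pair (x′, e′) would be cached rather than recomputed.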
Every time we want to close up the path, both C and S try all the combinations of the end vertices to minimize the weight of the loop:

CloseUp_{C,S}(b → p → … → q → e) = b′ → p → … → q → e′ → b′,

where b′ ∈ Cluster(b) and e′ ∈ Cluster(e) are chosen to minimize w(q → e′ → b′ → p).

We also implemented the C_co and S_co variations, in which CO is applied every time a tour improvement is found (see Option 4 above): ImproveTour(T) = CO(T).

Finally we propose the
Exact (E) variation. For every cluster ordering under consideration it finds the shortest path from the first to the last cluster (via all clusters in that order). After closing up the path it always applies CO (see Option 5 above). However, it explores the neighborhood much faster than a naive implementation would do.

The Gain function for E is defined as follows:

Gain_E(b → … → x → y → … → e, x → y) = w_co(b → … → x → e → … → y) − w_co(b → … → x → y → … → e),

where w_co(P) is the weight of the shortest path through the corresponding clusters:

w_co(x_1 → x_2 → … → x_m) = min_{x′_i ∈ Cluster(x_i), i = 1, …, m} w(x′_1 → x′_2 → … → x′_m).

Note that
ImprovePath runs this function sequentially for every x → y ∈ P. In the case of a naive implementation, it would take O(m^2 s^2) operations. Our implementation requires only O(ms^3) operations, and in practice it is much faster (almost O(ms^2)). Also note that typically m ≫ s.

Our implementation proceeds as follows. Let X_1, X_2, …, X_m be the sequence of clusters in the given path (see Figure 4a). Let l_v be the length of the shortest path from X_1 to v ∈ X_j through the cluster sequence X_2, X_3, …, X_{j−1}:

l_v = min_{x_i ∈ X_i, i = 1, …, j−1} w(x_1 → x_2 → … → x_{j−1} → v).

It takes O(s^2 m) operations to calculate all the values l_v using the algorithm for the shortest path in layered networks.

Let l_eq be the length of the shortest path from e ∈ X_m to q ∈ X_{j+1} through the cluster sequence X_{m−1}, X_{m−2}, …, X_{j+2}:

l_eq = min_{x_i ∈ X_i, i = j+2, …, m−1} w(e → x_{m−1} → x_{m−2} → … → x_{j+2} → q).

It takes O(s^3 m) operations to calculate all the values l_eq using the algorithm for the shortest path in layered networks.

As a further improvement, we propose an algorithm to calculate l_eq which also takes O(s^3 m) operations in the worst case but in practice proceeds significantly faster. Note that a disadvantage of a straightforward use of the shortest path algorithm to find l_eq is that its performance strongly depends on the size of X_m; indeed, the straightforward approach requires |X_m||X_{j+2}||X_{j+1}| operations for every j. Assume |X_t| < |X_m| for some t, j + 1 < t < m, and that we know the values l_eu for every u ∈ X_t (see Figure 4b). Now for every j < t − 1 we only need to calculate l_uq, where u ∈ X_t and q ∈ X_{j+1}. This takes |X_t||X_{j+2}||X_{j+1}| operations for every j, i.e., it is |X_m|/|X_t| times faster than the straightforward approach. A formal procedure is shown in Algorithm 3.

Figure 4: A straightforward and an enhanced implementation of the E variation.
(a) The original sequence of clusters X_1, X_2, …, X_m. The value l_v denotes the shortest path from the cluster X_1 through X_2, X_3, …, X_{j−1} to the vertex v ∈ X_j; it takes O(|X_{j−1}||X_j|) operations to calculate all l_v for some j. The value l_eq denotes the shortest path from the vertex e ∈ X_m through X_{m−1}, X_{m−2}, …, X_{j+2} to the vertex q ∈ X_{j+1}; it takes O(|X_m||X_{j+2}||X_{j+1}|) operations to calculate all l_eq for some j.
(b) An improved algorithm. Let the cluster X_t be the smallest cluster among X_{j+2}, X_{j+3}, …, X_m. To calculate all the shortest paths l_uq from u ∈ X_t to q ∈ X_{j+1} via X_{t−1}, X_{t−2}, …, X_{j+2}, one needs O(|X_t||X_{j+2}||X_{j+1}|) operations for some j, i.e., it is |X_m|/|X_t| times faster than the straightforward calculations. The values l_eu are calculated as previously, see (a).
(c) The sequence of clusters after the local search move. To find the shortest path from X_1 to X_{j+1} via X_2, X_3, …, X_j, X_m, X_{m−1}, …, X_{j+2}, we find all the shortest paths l′_e from X_1 to every e ∈ X_m as l′_e = min_v {l_v + w(v → e)} in O(s^2) operations, then all the shortest paths l′_u from X_1 to every u ∈ X_t as l′_u = min_e {l′_e + l_eu} in O(s^2) operations and, finally, the whole shortest path l′ from X_1 to X_{j+1} as l′ = min_{u,q} {l′_u + l_uq} in O(s^2) operations.

Algorithm 3
Calculation of the shortest paths l_eu and l_uq for E.
Require:
The sequence of clusters X_1, X_2, ..., X_m.
  for every e ∈ X_m and every q ∈ X_{m−1} do l_eq ← w(e → q).
  Y ← X_m.
  for j ← m − 3, m − 4, ..., 1 do
    if |X_{j+2}| < |Y| then
      if Y ≠ X_m then
        for every e ∈ X_m and every u ∈ X_{j+2} do l_eu ← min_{y ∈ Y} { l_ey + l_yu }.
      Y ← X_{j+2}.
    for every y ∈ Y and every q ∈ X_{j+1} do l_yq ← min_{u ∈ X_{j+2}} { l_yu + w(u → q) }.

Having all l_v, l_eu and l_uq, where v ∈ X_j, q ∈ X_{j+1}, e ∈ X_m and u ∈ X_t, j + 1 < t ≤ m, the shortest path can be found as follows.

Algorithm 4
Require: The index j.
Require: The values l_v, l_eu and l_uq, where v ∈ X_j, q ∈ X_{j+1}, e ∈ X_m and u ∈ X_t, j + 1 < t ≤ m.
  Calculate l_e ← min_{v ∈ X_j} { l_v + w(v → e) } for every e ∈ X_m.
  if t < m then
    Calculate l_u ← min_{e ∈ X_m} { l_e + l_eu } for every u ∈ X_t.
    Calculate l_q ← min_{u ∈ X_t} { l_u + l_uq } for every q ∈ X_{j+1}.
  else
    Calculate l_q ← min_{e ∈ X_m} { l_e + l_eq } for every q ∈ X_{j+1}.
  return min_{q ∈ X_{j+1}} l_q.

Observe that, unlike other adaptations of the original LK tsp heuristic, Exact is naturally suitable for asymmetric instances.

Note that another approach to implement the CO algorithm is proposed by Pop (2007). It is based on an integer formulation of the GTSP; a more general case is studied in (Pop et al., 2006). However, we believe that the dynamic programming approach enhanced by the improvements discussed above is more efficient in our case.

The gain is a measure of a path improvement. It is used to find the best path improvement and to decide whether this improvement should be accepted. To decide this, we use a boolean function GainIsAcceptable(P, x → y). This function greatly influences the performance of the whole algorithm. We propose five different implementations of GainIsAcceptable(P, x → y) in order to find the most efficient ones. For the notation, see Algorithm 2.

1. GainIsAcceptable(P, x → y) = w(P) < w(P_o), i.e., the function accepts any changes while the path is shorter than the original one.

2.
GainIsAcceptable(P, x → y) = w(P) + w(T)/m < w(T), i.e., it is assumed that an edge of an average weight w(T)/m will close up the path.

3. GainIsAcceptable(P, x → y) = w(P) + w(x → y) < w(T), i.e., the last removed edge is 'restored' for the gain evaluation. Note that the weight of the edge x → y cannot be obtained correctly in E. Instead of w(x → y) we use the weight w_min(X, Y) of the shortest edge between X = Cluster(x) and Y = Cluster(y).

4. GainIsAcceptable(P, x → y) = w(P) < w(T), i.e., the obtained path has to be shorter than the original tour. In other words, the weight of the 'close up' edge is assumed to be 0. Unlike the first three implementations, this one is optimistic and, hence, yields deeper search trees. This takes more time but also improves the solution quality.

5. GainIsAcceptable(P, x → y) = w(P) + w(T)/(2m) < w(T), i.e., it is assumed that an edge of half of an average weight will close up the path. It is a mixture of Options 2 and 4.

4. Experiments

In order to select the most successful variations of the proposed heuristic and to prove its efficiency, we conducted a set of computational experiments.

Our test bed includes several TSP instances taken from TSPLIB (Reinelt, 1991) converted into the GTSP by the standard clustering procedure of Fischetti et al. (1997) (the same approach is widely used in the literature, see, e.g., (Gutin and Karapetyan, 2010; Silberholz and Golden, 2007; Snyder and Daskin, 2006; Tasgetiren et al., 2007)). Like Bontoux et al.
(2010), Gutin and Karapetyan (2010), and Silberholz and Golden (2007), we do not consider any instances with less than 10 or more than 217 clusters (in other papers the bounds are stricter).

Every instance name consists of three parts: 'm t n', where m is the number of clusters, t is the type of the original TSP instance (see (Reinelt, 1991) for details) and n is the number of vertices.

Observe that the optimal solutions are known only for some instances with up to 89 clusters (Fischetti et al., 1997). For the rest of the instances we use the best known solutions, see (Bontoux et al., 2010; Gutin and Karapetyan, 2010; Silberholz and Golden, 2007).

The following heuristics were included in the experiments:

1. The Basic variations, i.e., B_αx and B^co_αx, where α ∈ { , , } and x ∈ {1, 2, 3, 4, 5} define the backtracking depth and the gain acceptance strategy, respectively. The letters 'co' in the superscript mean that the CO algorithm is applied every time a tour improvement is found (for details see Section 3.1).

2. The Closest variations, i.e., C_αx and C^co_αx, where α ∈ { , , } and x ∈ {1, 2, 3, 4, 5}.

3. The Shortest variations, i.e., S_αx and S^co_αx, where α ∈ { , , } and x ∈ {1, 2, 3, 4, 5}.

4. The Exact variations, i.e., E_αx, where α ∈ { , , } and x ∈ {1, 2, 3, 4, 5}.

5. Adaptations of the 2-opt and 3-opt local searches according to Section 3.1.

6. A state-of-the-art memetic algorithm ma by Gutin and Karapetyan (2010).

Observe that ma dominates all other GTSP metaheuristics known from the literature. In particular, Gutin and Karapetyan (2010) compare it to the heuristics proposed by Silberholz and Golden (2007), Snyder and Daskin (2006) and Tasgetiren et al. (2007), and it appears that ma dominates all these algorithms in every experiment with respect to both solution quality and running time. Similarly, one can see that it dominates two more recent algorithms by Bontoux et al. (2010) and Tasgetiren et al. (2010) in every experiment.
Note that the running times of all these algorithms were normalized according to the computational platforms used to evaluate them. Hence, we do not include the results of the other metaheuristics in our comparison.

In order to generate the starting tour for the local search procedures, we use a simplified Nearest Neighbour construction heuristic (NN). Unlike the algorithm proposed by Noon (1988), ours tries only one starting vertex. Trying every vertex as a starting one significantly slows down the heuristic and usually does not improve the solutions of the local searches. Note that in what follows the running time of a local search includes the running time of the construction heuristic.

All the heuristics are implemented in Visual C++. The evaluation platform is based on an Intel Core i7 2.67 GHz processor.

The experimental results are presented in two forms. The first form is a fair competition of all the heuristics joined in one table. The second form is a set of standard tables reporting the solution quality and running time of the most successful heuristics.

Many researchers face the problem of a fair comparison of several heuristics. Indeed, every experimental result consists of at least two parameters: solution error and running time. There is a trade-off between speed and quality, and both quick (and low-quality) and slow (and high-quality) heuristics are of interest. A heuristic should only be considered useless if it is dominated by another heuristic, i.e., it is both slower and yields solutions of a lower quality.

Hence, one can clearly separate a set of successful heuristics from a set of dominated ones. However, this only works for a single experiment. If the experiment is conducted on several test instances, the comparison becomes less obvious. Indeed, a heuristic may be successful in one experiment and unsuccessful in another one.
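The dominance criterion used here is simply Pareto dominance on the (running time, error) plane. A minimal sketch of filtering out dominated heuristics (the heuristic names and numbers below are invented for illustration):

```python
def pareto_filter(results):
    """Keep only non-dominated heuristics.  H dominates G if H is at least
    as fast and at least as accurate, and strictly better in one of the two."""
    def dominates(a, b):
        # a, b are (running time, relative error) pairs
        return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

    return {
        name: pair
        for name, pair in results.items()
        if not any(dominates(other, pair)
                   for oname, other in results.items() if oname != name)
    }

# (running time in ms, relative error in %) for some hypothetical heuristics
results = {"NN": (0.5, 12.0), "2-opt": (3.0, 6.0), "LK": (20.0, 1.0),
           "slow-bad": (50.0, 8.0)}
print(pareto_filter(results))  # 'slow-bad' is dominated by LK and dropped
```

Everything the filter keeps sits on the speed/quality trade-off frontier; the fair competition described below extends this idea across instance groups and time limits.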
A natural solution to this problem is to use averages, but if the results vary a lot between instances, this approach may be misleading.

In a fair competition, one should compare heuristics which have similar running times. For every time limit τ_i ∈ { } we compare the solution quality of all the heuristics which were able to solve an instance in less than τ_i. In order to further reduce the size of the table and to smooth out the experimental results, we additionally group similar instances together and report only the average values for each group.

Moreover, we repeat every experiment 10 times. It requires some extra effort to ensure that an algorithm H proceeds differently in different runs, i.e., H_i(I) ≠ H_j(I) in the general case, where i and j are the run numbers. For ma_r the run number r is the random generator seed value. In NN_r, we start the tour construction from the vertex C_{r,1}, i.e., from the first vertex of the r-th cluster of the instance. This also affects all the local searches since they start from the NN_r solutions.

Finally we get Table 1. Roughly speaking, every cell of this table reports the most successful heuristics for a given range of instances when given some limited time. More formally, let τ = {τ_1, τ_2, ...} be a set of predefined time limits. Let I = {I_1, I_2, ...} be a set of predefined instance groups such that all instances in every I_j have similar difficulty. Let H be the set of all heuristics included in the competition. H(I)_time and H(I)_error are the running time and the relative solution error, respectively, of the heuristic H ∈ H for the instance I ∈ I:

H(I)_error = (w(H(I)) − w(I_best)) / w(I_best),

where I_best is the optimal or the best known solution for the instance I. H(I_j)_time and H(I_j)_error denote the corresponding values averaged over all the instances I ∈ I_j and all r ∈ {1, 2, ..., 10}.

For every cell i, j we define a winner heuristic Winner_{i,j} ∈ H as follows:

1.
Winner^r_{i,j}(I)_time ≤ τ_i for every instance I ∈ I_j and every r ∈ {1, 2, ..., 10}.

2. Winner_{i,j}(I_j)_error < Winner_{i−1,j}(I_j)_error (only applicable if i > 1).

3. If several heuristics meet the conditions above, we choose the one with the smallest H(I_j)_error.

4. If several heuristics meet the conditions above and have the same solution quality, we choose the one with the smallest H(I_j)_time.

Apart from the winner, every cell contains all the heuristics H ∈ H meeting the following conditions:

1. H^r(I)_time ≤ τ_i for every instance I ∈ I_j and every r ∈ {1, 2, ..., 10}.

2. H(I_j)_error < Winner_{i−1,j}(I_j)_error (only applicable if i > 1).

3. H(I_j)_error ≤ … · Winner_{i,j}(I_j)_error.

4. H(I_j)_time ≤ … · Winner_{i,j}(I_j)_time.

Since LK is a powerful heuristic, we did not consider any instances with less than 30 clusters in this competition. Note that all the smaller instances are relatively easy to solve; e.g., ma was able to solve all of them to optimality in our experiments, and it took only about 30 ms on average, while for S it takes, on average, less than 0.5 ms to reach a 0.3% error, see Table 3.

We use the following groups I_j of instances:

Tiniest: …
Tiny: …
Small: …
Moderate: …
Large: …
Huge: …
Giant: …

Note that the instances … are excluded from this competition since they are significantly harder to solve than the other instances of the corresponding groups. This is discussed in Section 4.2 and the results for these instances are included in Tables 4 and 5.

One can see from Table 1 that there is a clear tendency: the proposed Lin-Kernighan adaptation outperforms all the other heuristics in a wide range of trade-offs between solution quality and running time. Only the state-of-the-art memetic algorithm ma is able to beat LK when given a large amount of time.
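The relative solution error on which all of these rules rely can be written as a one-line helper (a reference sketch; the weights below are invented):

```python
def relative_error(heuristic_weight, best_known_weight):
    """Relative solution error: (w(H(I)) - w(I_best)) / w(I_best)."""
    return (heuristic_weight - best_known_weight) / best_known_weight

# An invented tour of weight 1030 against a best known solution of weight 1000
# gives a 3% relative error.
print(f"{relative_error(1030, 1000):.1%}")  # 3.0%
```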
There are several occurrences of … in the upper right corner (i.e., for Huge and Giant instances and less than 5 ms time), but this is because this time is too small for even the most basic variations of LK. Note that B and coB denote the local search adapted for the GTSP according to Options 1 and 2, respectively, see Section 3.1.

Clearly, the most important parameter of LK is its variation, and each of the four variations (Basic, Closest, Shortest and Exact) is successful in a certain running time range. B wins the competition for small running times. For the middle range of running times one should choose C or S. The E variation wins only in a small range of times; having more time, one should choose the memetic algorithm ma.

Table 1: The fair competition. Every cell reports the most successful heuristics being given some limited time (see the first column) for a given range of instances (see the header). Every heuristic is provided with the average relative solution error in percent. To make the table easier to read, all the B and E adaptations of LK are set in bold font. All the cells where the dominating heuristic is C or S are highlighted with a grey background.

         Tiniest  Tiny  Small  Moderate  Large  Huge  Giant
≤ … ms   S S C C C B B B B B coB B
≤ … ms   S C C S C B B B B B B coB
≤ … ms   — — C C C C C B
≤ … ms   — S S S C C C
≤ … ms   — S S S S S S C C
≤ … s    — S S C C S
≤ … s    — — — E S
≤ … s    — ma S
≤ … s    — — — E E S
≤ … s    — — — — — S
≤ … s    — — — ma E E
≤ … s    — — — — ma
≤ … s    — — — — — — ma

Here are some tendencies with regard to the rest of the LK parameters:

• It is usually beneficial to apply CO every time a tour improvement is found.

• The most successful gain acceptance options are 4 and 5 (see Section 3.6).

• The larger the backtracking depth α, the better the solutions.
However, it is an expensive way to improve the solutions; one should normally keep α ∈ { , , }.

Table 1, however, does not make it clear what parameters one should use in practice. In order to give some advice, we calculated the distances d(H) between each heuristic H ∈ H and the winner algorithms. For every column j of Table 1 we calculated d_j(H):

d_j(H) = (H(I_j)_error − Winner_{i,j}(I_j)_error) / Winner_{i,j}(I_j)_error,

where i is the minimal index such that H^r(I)_time ≤ τ_i for every I ∈ I_j and r ∈ {1, 2, ..., 10}. Then the values d_j(H) were averaged over all j to obtain the required distance d(H).

The list of the heuristics H with the smallest distances d(H) is presented in Table 2. In fact, we added coB, B and E to this list only to fill the gaps. Every heuristic H in Table 2 is also provided with its average running time T(H), in % of the ma running time: T(H) is T(H, I, r) averaged over all the instances I ∈ I and all r ∈ {1, 2, ..., 10}, where

T(H, I, r) = H^r(I)_time / MA(I)_time

and MA(I)_time is MA^r(I)_time averaged over all r ∈ {1, 2, ..., 10}.

Table 2: The list of the most successful heuristics. The heuristics H are ordered according to their running times, from the fastest to the slowest. coB denotes the local search adapted for the GTSP according to Option 2, see Section 3.1.

H      d(H), %   Time, % of ma time
coB    44        0.04
B      34        0.10
C      12        0.40
S      19        0.97
S      19        2.53
S      35        8.70
S      32        15.34
E      56        43.62
ma

In this section we provide detailed information on the experimental results for the most successful heuristics, see Section 4.1. Tables 3, 4 and 5 include the following information:

• The 'Instance' column contains the instance name as described above.

• The 'Best' column contains the best known or optimal (Fischetti et al., 1997) objective values of the test instances.

• The rest of the columns correspond to different heuristics and report either relative solution error or running time in milliseconds.
Every value is averaged over ten runs, see Section 4.1 for details.

• The 'Average' row reports the averages for all the instances in the table.

• The 'Light avg' row reports the averages for all the instances used in Section 4.1.

• Similarly, the 'Heavy avg' row reports the averages for all the instances (m ≥ 30) excluded from the competition in Section 4.1.

All the small instances (m < 30) are separated from the rest of the test bed into Table 3. One can see that all these instances are relatively easy to solve; in fact, several heuristics are able to solve all or almost all of them to optimality in every run, and it takes only a small fraction of a second. A useful observation is that E solves all the instances with up to 20 clusters to optimality, and in this range E is significantly faster than ma.

As regards the larger instances (m ≥ 30), it is worth noting that there exist several 'heavy' instances among them: …. Some heuristics perform extremely slowly on these instances: the running time of S, S, S and E is 3 to 500 times larger for every 'heavy' instance than for the other instances of a similar size. Other LK variations are also affected, though this mostly relates to the ones which use the 'optimistic' gain acceptance functions (Options 4 and 5), see Section 3.6.

Our analysis has shown that all of these instances have an unusual weight distribution. In particular, all of them contain an enormous number of 'heavy' edges, i.e., edges whose weights are close to the maximum weight in the instance prevail over edges of smaller weights. Recall that LK is based on the assumption that a randomly selected edge will probably have a 'good' weight. Then we can optimize a path in the hope of finding a good option to close it up later. However, the probability of finding a 'good' edge is low in a 'heavy' instance. Hence, the termination condition GainIsAcceptable does not usually stop the search even though only a few tour improvements can be found.
This, obviously, slows down the algorithm.

Note that a similar result was obtained by Karapetyan and Gutin (2010) for the adaptation of the Lin-Kernighan heuristic for the Multidimensional Assignment Problem.

Observe that such 'unfortunate' instances can easily be detected before the algorithm's run. Observe also that even the fast heuristics yield relatively good solutions for these instances (see Tables 4 and 5). Hence, in this case one can use a lighter heuristic to obtain a reasonable solution quality in a reasonable time.

5. Conclusion

The Lin-Kernighan heuristic is known to be a very successful TSP heuristic. In this paper we present a number of adaptations of Lin-Kernighan for the GTSP. Several approaches to adapting a TSP local search for the GTSP are discussed, and the best ones are selected and applied to the Lin-Kernighan heuristic. The experimental evaluation confirms the success of these approaches and proves that the proposed adaptations reproduce the efficiency of the original TSP heuristic.

Based on the experimental results, we selected the most successful Lin-Kernighan adaptations for different solution quality/running time requirements. Only for very small running times (5 ms or less) and huge instances (132 clusters and more) is our heuristic outperformed by some very basic local searches, simply because none of our adaptations is able to complete within this time. For very large running times, the Lin-Kernighan adaptations are outperformed by the state-of-the-art memetic algorithm, which usually solves the problem to optimality.

To implement the most powerful adaptation 'Exact', a new approach was proposed. Note that the same approach can be applied to many other TSP local searches.
Compared to the previous results in the literature, the time complexity of exploring the corresponding neighborhood is significantly reduced, which makes this adaptation practical. Though it was often outperformed by either faster adaptations or the memetic algorithm in our experiments, it is clearly the best heuristic for small instances (up to 20 clusters in our experiments) and it is also naturally suitable for the asymmetric GTSP.

Further research on adaptations of the Lin-Kernighan heuristic for other combinatorial optimization problems may be of interest. Our future plans also include a thorough study of different GTSP neighborhoods and their combinations.

References

Balas, E., Saltzman, M.J., 1991. An algorithm for the three-index assignment problem. Operations Research 39, 150–161.

Ben-Arieh, D., Gutin, G., Penn, M., Yeo, A., Zverovitch, A., 2003. Transformations of generalized ATSP into ATSP. Operations Research Letters 31, 357–365.

Bontoux, B., Artigues, C., Feillet, D., 2010. A memetic algorithm with a large neighborhood crossover operator for the generalized traveling salesman problem. Computers & Operations Research 37, 1844–1852.

Chandra, B., Karloff, H., Tovey, C., 1994. New results on the old k-opt algorithm for the TSP, in: Proceedings of the 5th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 150–159.

Fischetti, M., Salazar González, J.J., Toth, P., 1995. The symmetric generalized traveling salesman polytope. Networks 26, 113–123.

Fischetti, M., Salazar González, J.J., Toth, P., 1997. A branch-and-cut algorithm for the symmetric generalized traveling salesman problem. Operations Research 45, 378–394.

Gutin, G., Karapetyan, D., 2010. A memetic algorithm for the generalized traveling salesman problem. Natural Computing 9, 47–60.

Gutin, G., Karapetyan, D., Krasnogor, N., 2008. A memetic algorithm for the generalized asymmetric traveling salesman problem, in: Nature Inspired Cooperative Strategies for Optimization (NICSO 2007), pp.
199–210.

Helsgaun, K., 2000. An effective implementation of the Lin-Kernighan traveling salesman heuristic. European Journal of Operational Research 126, 106–130.

Helsgaun, K., 2009. General k-opt submoves for the Lin-Kernighan TSP heuristic. Mathematical Programming Computation 1, 119–163.

Hu, B., Raidl, G.R., 2008. Effective neighborhood structures for the generalized traveling salesman problem, in: Proceedings of EvoCOP 2008, pp. 36–47.

Huang, H., Yang, X., Hao, Z., Wu, C., Liang, Y., Zhao, X., 2005. Hybrid chromosome genetic algorithm for generalized traveling salesman problems, in: Proceedings of ICNC 2005, pp. 137–140.

Johnson, D.S., McGeoch, L.A., 2002. Experimental analysis of heuristics for the STSP, in: Gutin, G., Punnen, A.P. (Eds.), The Traveling Salesman Problem and its Variations. Kluwer, pp. 369–444.

Karapetyan, D., Gutin, G., 2010. Local search heuristics for the multidimensional assignment problem. To appear in Journal of Heuristics. A preliminary version is published in Lecture Notes in Computer Science.

Table 3: Results for the small instances (m < 30).
Solution error, % / Running time, ms
Instance  Best  coB  C  S  E  ma  2o  coB  C  S  E  ma

Table 4: Results for the larger instances (m ≥ 30). The reported values are relative solution errors, %.
Instance  Best  coB  B  C  S  S  S  S  E  ma

Table 5: Results for the larger instances (m ≥ 30). The reported values are running times, ms.
Instance     coB   B     C      S      S       S         S         E         ma
30ch150      0.1   0.1   0.5    1.4    2.8     2.7       8.0       46.7      56.2
30kroa150    0.0   0.1   0.4    1.0    1.8     4.0       4.2       32.3      57.7
30krob150    0.0   0.1   0.4    1.2    1.5     2.5       7.0       50.6      65.5
31pr152      0.0   0.2   0.4    1.4    4.5     25.3      33.4      38.8      39.0
32u159       0.1   0.1   0.3    0.9    2.7     4.2       25.7      31.9      62.4
35si175      0.1   0.2   1.8    3.6    10.0    23.5      358.8     232.5     64.0
36brg180     0.0   0.2   0.4    0.4    1.2     2.3       279.3     46.4      53.0
39rat195     0.1   0.1   0.7    1.2    3.1     7.3       13.7      64.9      138.8
40d198       0.2   0.6   2.0    3.7    21.9    134.2     310.4     98.7      126.4
40kroa200    0.1   0.1   0.7    1.6    3.2     4.1       11.7      60.6      123.2
40krob200    0.1   0.2   0.5    1.4    2.4     4.2       16.2      56.3      157.6
41gr202      0.1   0.3   0.8    1.9    7.8     11.0      81.2      86.1      198.1
45ts225      0.1   0.3   0.7    3.0    8.0     10.0      19.9      273.0     191.9
45tsp225     0.1   0.2   0.8    2.3    3.1     7.6       15.5      112.3     156.0
46pr226      0.1   0.4   1.0    1.9    4.5     12.7      21.7      44.1      95.2
46gr229      0.1   0.2   1.0    3.2    3.5     8.8       13.9      145.1     224.6
53gil262     0.2   0.3   1.9    3.8    7.9     9.2       21.3      107.8     290.2
53pr264      0.2   1.0   5.7    6.5    66.2    282.4     505.4     230.9     204.4
56a280       0.2   0.3   1.1    2.2    11.2    9.3       43.9      148.2     291.7
60pr299      0.1   0.2   1.5    3.8    8.7     12.6      31.4      146.7     347.9
64lin318     0.2   0.3   2.0    4.2    17.3    48.6      81.4      223.1     404.0
80rd400      0.3   0.7   3.8    5.6    18.2    36.7      74.4      305.8     872.0
84fl417      0.3   2.3   5.9    9.7    59.0    174.8     315.1     645.8     583.4
87gr431      0.4   0.8   4.4    9.3    19.8    59.6      107.9     485.2     1673.9
88pr439      0.3   0.8   3.0    11.6   24.4    54.3      109.3     764.4     1146.6
89pcb442     0.5   0.8   4.1    9.5    23.1    42.9      88.8      656.8     1530.4
99d493       0.7   2.0   7.5    13.1   148.3   2666.1    1616.2    591.2     3675.4
107ali535    1.0   2.3   7.1    13.4   29.9    52.2      170.2     795.6     3558.4
107att532    0.6   1.9   8.0    17.1   33.1    71.5      312.1     932.9     2942.2
107si535     0.5   5.5   32.9   46.7   337.0   1921.9    12725.0   3503.8    1449.2
113pa561     0.7   1.3   5.4    11.6   28.6    51.0      104.2     695.8     2931.3
115u574      0.7   1.9   6.8    10.7   53.3    63.9      156.1     956.3     3017.1
115rat575    0.5   1.3   6.4    17.9   41.0    92.9      128.0     697.3     2867.3
131p654      1.2   9.4   40.9   27.3   213.9   1074.8    2964.0    3293.2    2137.2
132d657      0.9   2.3   13.6   22.0   109.4   1009.3    2322.9    794.0     4711.2
134gr666     1.0   2.3   8.7    28.1   51.5    135.9     374.4     1425.8    10698.6
145u724      1.0   2.7   13.4   32.6   62.8    105.8     242.0     1326.0    7952.9
157rat783    1.5   2.2   17.7   30.8   73.7    131.3     248.3     2165.3    9459.9
200dsj1000   3.5   10.3  80.7   104.5  592.8   5199.5    8032.5    9361.6    22704.4
201pr1002    2.3   6.2   39.1   57.0   156.4   290.6     539.8     2719.1    21443.9
207si1032    3.5   37.4  839.3  875.2  7063.7  195644.0  306944.8  112926.4  17840.3
212u1060     3.7   7.1   36.4   80.2   195.5   307.5     1040.5    2990.5    31201.8
217vm1084    2.5   6.6   51.4   78.5   204.8   496.1     978.1     4687.8    27587.2
Average      0.7   2.6   29.3   36.3   226.4   4890.9    7941.8    3604.6    4310.1
Light avg.   0.7   1.6   9.5    16.8   56.0    315.7     488.5     972.0     4653.6
Heavy avg.   0.8   7.1   116.1  121.6  971.6   24907.3   40550.4   15122.2   2807.2