A New Integer Programming Formulation of the Graphical Traveling Salesman Problem

Abstract

In the Traveling Salesman Problem (TSP), a salesman wants to visit a set of cities and return home. There is a cost $c_{ij}$ of traveling from city $i$ to city $j$, which is the same in either direction for the Symmetric TSP. The objective is to visit each city exactly once, minimizing total travel costs. In the Graphical TSP, a city may be visited more than once, which may be necessary on a sparse graph. We present a new integer programming formulation for the Graphical TSP requiring only two classes of constraints that are either polynomial in number or polynomially separable, while addressing an open question proposed by Denis Naddef.



Robert D. Carr · Neil Simonetti


Keywords

Linear Program · Relaxation · TSP · Traveling Salesman Problem · GTSP · Graphical Traveling Salesman Problem

Robert D. Carr
Computer Science Department, University of New Mexico, Albuquerque, NM 87131
This material is based upon research supported in part by the U.S. Office of Naval Research under award number N00014-18-1-2099.
E-mail: [email protected]

Neil Simonetti
Business, Computer Science, and Mathematics Department, Bryn Athyn College, Bryn Athyn, PA 19009-0717
E-mail: [email protected]

arXiv [cs.DM], Jun

The Traveling Salesman Problem (TSP) is one of the most studied problems in combinatorial optimization [9] [10]. In its classic form, a salesman wants to visit each of a set of cities exactly once and return home while minimizing travel costs. Costs of traveling between cities are stored in a matrix where entry $c_{ij}$ indicates the cost of traveling from city $i$ to city $j$. Units may be distance, time, money, etc.

If the underlying graph for the TSP is sparse, a complete cost matrix can still be constructed by setting $c_{ij}$ equal to the cost of the shortest path between city $i$ and city $j$ for each pair of cities. However, this has the disadvantage of turning a sparse graph $G = (V, E)$, where the edge set $E$ could be of size $O(|V|)$, into a complete graph $G' = (V, E')$, where the edge set $E'$ is of size $O(|V|^2)$.

Ratliff and Rosenthal were the first to consider a case where the edge set is not expanded to a complete graph, but left sparse [17], while soon after, Fleischmann [8] and Cornuéjols, Fonlupt, and Naddef [5] examined this in a more general case, the latter giving this its name: the Graphical Traveling Salesman Problem (GTSP). As a consequence, a city may be visited more than once, since there is no guarantee the underlying graph will be Hamiltonian. While the works of Fleischmann and Cornuéjols et al. focused on cutting planes and facet-defining inequalities, this paper will look at a new compact formulation that can improve on the integrality gap created when solving a linear programming relaxation of the problem.

This paper will investigate the symmetric GTSP, where the cost of traveling between two cities is the same, regardless of direction, which allows the following notation to be used:

$G = (V, E)$ : The graph $G$ with vertex set $V$ and edge set $E$.
$c_e$ : The cost of using edge $e$, replaces $c_{ij}$.
$x_e$ : The variable indicating the use of edge $e$, replaces $x_{ij}$, which is used in most general TSP formulations.
$\delta(v)$ : The set of edges incident to vertex $v$.
$\delta(S)$ : The set of edges with exactly one endpoint in vertex set $S$.
$x(F)$ : The sum of variables $x_e$ for all $e \in F \subset E$.

If given a formulation on a complete graph $K_n$, a formulation for a sparse graph $G$ can be created by simply setting $x_e = 0$ for any edge $e$ in the graph $K_n$ but not in the graph $G$.

2.1 Symmetric TSP

The standard formulation for the TSP, attributed to Dantzig, Fulkerson, and Johnson [6], contains constraints that guarantee the degree of each node in a solution is exactly two (degree constraints) and constraints that prevent a solution from being a collection of disconnected subtours (subtour elimination constraints).

\[
\begin{array}{lll}
\text{minimize} & \sum_{e \in E} c_e x_e & \\
\text{subject to} & \sum_{e \in \delta(v)} x_e = 2 & \forall v \in V \\
 & \sum_{e \in \delta(S)} x_e \ge 2 & \forall S \subset V,\ S \ne \emptyset \\
 & x_e \in \{0, 1\} & \forall e \in E.
\end{array}
\]

When this integer program is relaxed, the integer constraints $x_e \in \{0, 1\}$ are replaced by the boundary constraints $0 \le x_e \le 1$. Even when the subtour elimination constraints are limited to sets $S$ within a restricted size range, there are still exponentially many of these constraints. Using a similar technique to Martin [14], which was directly applied to the TSP by Carr and Lancia [1], these constraints can be replaced by a polynomial number of flow constraints which ensure the solution is a 2-edge-connected graph.

2.2 Symmetric GTSP

This formulation for the Graphical TSP comes from Cornuéjols, Fonlupt, and Naddef [5], and differs from the formulation above by allowing the degree of a node to be any even integer, and by removing any upper bound on the variables.

\[
\begin{array}{lll}
\text{minimize} & \sum_{e \in E} c_e x_e & \\
\text{subject to} & \sum_{e \in \delta(v)} x_e \text{ is positive and even} & \forall v \in V \\
 & \sum_{e \in \delta(S)} x_e \ge 2 & \forall S \subset V,\ S \ne \emptyset \\
 & x_e \ge 0 & \forall e \in E \\
 & x_e \text{ is integer} & \forall e \in E.
\end{array}
\]
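As an illustration of the GTSP formulation above, the following minimal sketch checks whether an integer vector $x$ describes a GTSP tour. It relies on the observation that, for integer $x$ with every degree even, every cut has even weight, so the cut constraints hold exactly when the support graph is connected. The function name `is_gtsp_tour`, the node labels `0..num_nodes-1`, and the dictionary representation of $x$ are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def is_gtsp_tour(num_nodes, x):
    """Check the integer GTSP conditions for x.

    x maps undirected edges (i, j) to non-negative integer multiplicities.
    Nodes are assumed to be labelled 0 .. num_nodes - 1 (illustrative choice).
    """
    degree = defaultdict(int)
    adjacency = defaultdict(set)
    for (i, j), mult in x.items():
        if mult < 0 or mult != int(mult):
            return False  # x_e must be a non-negative integer
        if mult > 0:
            degree[i] += mult
            degree[j] += mult
            adjacency[i].add(j)
            adjacency[j].add(i)

    # Degree constraints: every node has positive, even degree.
    for v in range(num_nodes):
        if degree[v] == 0 or degree[v] % 2 != 0:
            return False

    # Cut constraints: with even degrees, it is enough that the
    # support graph is connected (depth-first search from node 0).
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == num_nodes


# A path 0-1-2 has no Hamilton cycle, but doubling both edges gives a GTSP tour.
doubled_path = {(0, 1): 2, (1, 2): 2}
path_is_tour = is_gtsp_tour(3, doubled_path)
```

The doubled path shows why the Graphical TSP must allow $x_e = 2$ on sparse graphs: the only way to visit the end nodes and return is to traverse each edge twice.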
When this program is relaxed, the integer constraints at the end are removed, and the disjunctive constraints that require the degree of each node to be any positive and even integer are effectively replaced by a lower bound of two on the degree of each node.

The disjunctive constraints for the formulation above are unusual for two reasons. Firstly, in most mixed-integer programs, only variables are constrained to be integers, not sums of variables found in constraints. Secondly, the sum is required not to be just integer, but an even integer. In terms of a mixed-integer formulation, the second peculiarity could be addressed with:

\[ \tfrac{1}{2} \sum_{e \in \delta(v)} x_e \in \mathbb{Z} \quad \forall v \in V \]

To our knowledge, no other integer programming formulation for a graph theory application uses constraints of this kind. (Even the constraints for the common T-join problem are different from what we are proposing here, which will be discussed at the end of the paper.)

Addressing the first peculiarity, that integer and mixed-integer programs only allow integrality of variables, we could set these sums to new variables indexed on the vertices of the graph, $d_v$.

\[ \tfrac{1}{2} \sum_{e \in \delta(v)} x_e = d_v \quad \forall v \in V, \qquad d_v \in \mathbb{Z} \quad \forall v \in V \]

While this approach works, it feels unsatisfying. The addition of these $d_v$ variables is purely cosmetic. When solving the relaxation, there is nothing preventing us from waiting until a solution is generated before defining the values of $d_v$ using the sums above. Thus the new variables do not facilitate the addition of any new constraints, and do nothing to strengthen the LP relaxation in any way.

When solving the integer program, we can bypass the $d_v$ variables by branching with constraints based on the degree sums. For example, if the solution from a relaxation creates a graph where node $i$ has odd degree $q$, we branch with constraints of the form:

\[ \sum_{e \in \delta(i)} x_e \le q - 1 \qquad \text{or} \qquad \sum_{e \in \delta(i)} x_e \ge q + 1 \]

At a conference, Denis Naddef proposed a challenge of finding a set of constraints for a mixed-integer formulation of GTSP, where integrality constraints are limited to only $x_e \in \mathbb{Z}$ [15]. We will address the state of this challenge in section 4.

Cornuéjols et al. proved that an upper bound of two on each $x_e$ is implied if all the edge costs are positive [5] (and also note that without this additional bound, graphs with negative weight edges would not have finite optimal solutions). This fact allows us to dissect the variables $x_e$ into two components $y_e$ and $z_e$ such that, for each edge $e \in E$:

$y_e = 1$ if edge $e$ is used exactly once, $y_e = 0$ otherwise
$z_e = 1$ if edge $e$ is used exactly twice, $z_e = 0$ otherwise

Note that

\[ x_e = y_e + 2 z_e \tag{1} \]

Additionally, we can add the constraint $y_e + z_e \le 1$ for each edge $e \in E$, since using both would imply an edge being used three times in a solution.
But more importantly, we now have a way to enforce even degree without using disjunctions, since only the $y_e$ variables matter in determining if the degree of a node is odd or even.

Since the upper bound on the $y_e$ variables is one, the following constraints will enforce even degree:

\[ \sum_{e \in F} (1 - y_e) + \sum_{e \in \delta(v) \setminus F} y_e \ge 1 \quad \forall v \in V \text{ and } F \subset \delta(v) \text{ with } |F| \text{ odd.} \tag{2} \]

This type of constraint was used by Yannakakis et al. [18] and Lancia et al. [12] when working with the parity polytope.

Note that for integer values of $y_e$, if the $y$-degree of a node $v$ is odd, then when $F$ is the set of edges incident to $v$ with $y_e = 1$, the expression in the left-hand side of the constraint above will be zero. If the $y$-degree of a node $v$ is even, then for any set $F$ with $|F|$ odd, the left-hand side must be at least one.

For sparse graphs, this adds at most $O(|V| 2^{\Delta - 1})$ constraints, where $\Delta$ is the maximum degree in $G$. Typical graphs from roadmaps usually have $4 \le \Delta \le 6$, while graphs from highway maps might have $5 \le \Delta \le 8$. Also note that Euler's formula for planar graphs guarantees that $|E| \le 3|V| - 6$, and so the average degree of a node in a planar graph cannot be more than six.

Unfortunately, in the relaxation of the linear program with these constraints, odd degree nodes can still result from allowing a path of nodes where, for each edge $e$ in the path, $y_e = 0$ and $z_e = \tfrac{1}{2}$. See figure 1.

Fig. 1 A path where $y_e = 0$ and $z_e = \tfrac{1}{2}$

One way to prevent this half-$z$ path is to require the edges indicated by $y$ and $z$ contain a spanning tree. This is different from demanding that $x$ contains a spanning tree, since each unit of $z_e$ contributes two units to $x_e$. For the spanning tree constraint, each $z_e$ contributes only one unit toward a spanning tree, which means that for any node whose $y$-degree is zero, the $z$-degree must be at least one, and in the case where two nodes with $y$-degree zero are connected using an edge in the spanning tree, the $z$-degree of one of those nodes must be at least two, which effectively prevents this half-$z$ path.

Place constraints on binary variables $t$ such that the edges where $t_e = 1$ indicate a tree that spans all nodes of $G$ (the graph must be connected and contain no cycles). This is done by the well-known partition inequalities that will be discussed in section 4. As with the subtour elimination constraints, Martin describes a compact set of constraints that ensure $t$ indicates a tree (or a convex combination of trees) [14]. Then, add the constraint:

\[ t_e \le y_e + z_e \quad \forall e \in E \tag{3} \]

Only the connectedness of the graph indicated by $t_e$ is important, since the constraint only requires that $y + z$ dominate a spanning tree, so the constraints that would prevent cycles are unnecessary.

This constraint is valid since any tour that visits every node has within it a spanning tree that touches each node.

With these constraints in place, it suffices to enforce integrality on the $y$ variables alone, $y_e \in \{0, 1\}$, to find an optimal integer solution value, making the subtour elimination constraints unnecessary. The following new mixed integer programming formulation therefore does not include these optional constraints. Note that all GTSP tours will satisfy the constraints in this formulation. The notation $\delta(V_1, \ldots, V_k)$ refers to the set of edges with endpoints in different vertex sets.

\[
\begin{array}{llll}
\text{minimize} & \sum_{e \in E} c_e x_e & & \\
\text{subject to} & x_e = y_e + 2 z_e & \forall e \in E & (4.1) \\
 & \sum_{e \in F} (1 - y_e) + \sum_{e \in \delta(v) \setminus F} y_e \ge 1 & \forall v \in V \text{ and } F \subset \delta(v) \text{ with } |F| \text{ odd} & (4.2) \\
 & \sum_{e \in \delta(V_1, \ldots, V_k)} t_e \ge k - 1 & \forall \text{ partitions } V_1, \ldots, V_k \text{ of } V & (4.3) \\
 & t_e \le y_e + z_e \le 1 & \forall e \in E & (4.4) \\
 & 0 \le t_e \le 1 & \forall e \in E & (4.5) \\
 & 0 \le z_e \le 1 & \forall e \in E & (4.6) \\
 & y_e \in \{0, 1\} & \forall e \in E & (4.7)
\end{array}
\]

Theorem 1

Given a MIP solution $(y^*, z^*)$ to the GTSP formulation above, $x^* = y^* + 2z^*$ will indicate an edge set that is an Euler tour, or a convex combination of Euler tours.

Proof. It should be noted that it is sufficient for the MIP solution to dominate an Euler tour (or a convex combination of them), since if there is some edge $e$ where $x^*_e$ is larger than necessary by $\epsilon$ units (for some $0 < \epsilon \le x^*_e \le 2$), we can add edge $e$ twice to a collection of Euler tours with total weight $\epsilon / 2$.

Let $(y^*, z^*)$ be a feasible solution to the GTSP formulation specified. By constraints (4.3) and (4.4), we know that $y^* + z^*$ dominates a convex combination of spanning trees, thus we have $y^* + z^* \ge \sum_i \lambda_i T^i$, where each $T^i$ is an edge incidence vector of a spanning tree. Define $R^i$ by $R^i_e = 1$ if both $T^i_e = 1$ and $y^*_e = 0$, and $R^i_e = 0$ otherwise. So $R^i$ becomes the remnant of tree $T^i$ when edges indicated by $y^*$ are removed. Since $y^*$ only contains integer values, the constraint $y_e + z_e \le 1$ guarantees $z_e = 0$ whenever $y_e = 1$, and guarantees $y_e = 0$ whenever $z_e > 0$. This means that for all edges where $y^*_e = 0$, we have $z^*_e \ge \sum_i \lambda_i T^i_e = \sum_i \lambda_i R^i_e$. Since $R^i$ is the result of removing edges from $T^i$ where $y^*_e = 1$, we are guaranteed that $R^i_e = 0$ for every $i$ where $y^*_e = 1$, and thus $z^* \ge \sum_i \lambda_i R^i$ over all edges.

Hence, $x^* = y^* + 2z^* \ge y^* + 2 \sum_i \lambda_i R^i = \sum_i \lambda_i (y^* + 2R^i)$, where for each $i$, $y^* + 2R^i$ is an Euler tour, because constraint (4.2) ensures the graph indicated by $y^* + 2R^i$ will have even degree at every node, and constraints (4.3) and (4.4) ensure the graph indicated by $y^* + R^i$, and thus $y^* + 2R^i$, is connected. □

Constraints (4.3) are exponential in number, but these can also be reduced to a compact set of constraints using the techniques from Martin [14]. The constraints we used are below, and require a model using directed edges to regulate flow variables $\phi$. Assume $V = \{1, 2, \ldots, n\}$ and designate city $n$ as the home city.

In Martin's formulation, we use directed flow variables $\phi^k$ that carry one unit of flow from any node with index higher than $k$ to node $k$, supported by the values $\vec{t}_{ij}$ as edge capacities. From any feasible integral solution (directed spanning tree), it is not hard to derive such a set of unit flows by directing the tree from the home node $n$. For this flow, we can now set flow going into any node $j$ with $j > k$ to zero in flow problem $\phi^k$, and flow balancing constraints among the nodes numbered $k + 1$ or higher are also unnecessary. Finally, $t_e = \vec{t}_{ij} + \vec{t}_{ji}$ creates variables for the undirected spanning tree.
\[
\begin{array}{ll}
\phi^k_{i,j} = 0 & \forall j \in V,\ k \in V \setminus \{n\} \text{ with } j > k,\ \{i, j\} \in E \\
\phi^k_{k,i} = 0 & \forall k \in V \setminus \{n\},\ \{i, k\} \in E \\
\sum_{i \in \delta(k)} \phi^k_{i,k} = 1 & \forall k \in V \setminus \{n\} \\
\sum_{i \in \delta(j)} \phi^k_{i,j} - \sum_{i \in \delta(j)} \phi^k_{j,i} = 0 & \forall j \in V,\ k \in V \setminus \{n\} \text{ with } j < k \\
0 \le \phi^k_{i,j} \le \vec{t}_{ij} & \forall k \in V \setminus \{n\},\ \{i, j\} \in E \\
t_e = \vec{t}_{ij} + \vec{t}_{ji} & \forall e = \{i, j\} \in E \\
\sum_{e \in E} t_e \le n - 1 & \\
t_e \le y_e + z_e & \forall e \in E
\end{array}
\]

Constraints (4.2) are exponential in $\Delta$, the maximum degree of the graph, which is not a concern if the graph is sparse, leading to a compact formulation. If the graph is not sparse, identifying when a constraint from the set (4.2) is violated, a process called separation, can be done quickly and efficiently, even if the solution is from a relaxation and thus contains fractional values for some $y_e$ variables.
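The separation of constraints (4.2) at a single node can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the greedy rule of putting edges with $y_e > \tfrac{1}{2}$ into $F$ and, if $|F|$ comes out even, flipping the membership of the edge whose value is closest to $\tfrac{1}{2}$. The function name `separate_parity_at_node` and the dictionary input are hypothetical, and the node is assumed to have at least one incident edge.

```python
def separate_parity_at_node(y_at_v):
    """Return (F, lhs) minimizing the left-hand side of (4.2) over odd |F|.

    y_at_v maps each edge in delta(v) to its (possibly fractional) y value.
    The constraint for the returned F is violated exactly when lhs < 1.
    """
    # Unconstrained minimum: an edge costs (1 - y_e) inside F and y_e outside.
    F = {e for e, val in y_at_v.items() if val > 0.5}

    if len(F) % 2 == 0:
        # Parity is wrong: flip the edge whose move is cheapest,
        # i.e. the one with y_e closest to 1/2.
        e_flip = min(y_at_v, key=lambda e: abs(y_at_v[e] - 0.5))
        F ^= {e_flip}

    lhs = sum(1 - y_at_v[e] for e in F)
    lhs += sum(val for e, val in y_at_v.items() if e not in F)
    return F, lhs


# Illustrative fractional values on three edges incident to one node.
y_at_v = {"a": 0.9, "b": 0.8, "c": 0.7}
F, lhs = separate_parity_at_node(y_at_v)
violated = lhs < 1  # lhs is about 0.6, so this constraint is violated
```

Each node needs only a constant number of passes over its incident edges, which is what keeps the separation polynomial even when $\Delta$ is large.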

Theorem 2

Given a solution to the relaxation of the GTSP formulation above without constraints (4.2), if a constraint from (4.2) is violated, it can be found in $O(|V|^2)$ time.

Proof. For each node $v \in V$, minimize the left-hand side of constraint (4.2) over all possible sets $F \subset \delta(v)$ ($|F|$ even or odd), by placing edges with $y_e > \tfrac{1}{2}$ in $F$ and leaving edges with $y_e < \tfrac{1}{2}$ for $\delta(v) \setminus F$. Edges with $y_e = \tfrac{1}{2}$ could go in either set.
– If this minimum is not less than 1, no constraint from (4.2) will be violated for this node.
– If the minimum is less than 1, and $|F|$ is odd, this is a violated constraint from (4.2).
– If the minimum is less than 1, and $|F|$ is even, find the edge $e$ where $|y_e - \tfrac{1}{2}|$ is smallest. Then flip the status of the membership of edge $e$ in $F$. This will create the minimum left-hand side over all sets $F$ with $|F|$ odd.

For each node, this requires summing or searching items indexed by $\delta(v)$ a constant number of times, and since $|\delta(v)| < |V|$, this requires $O(|V|)$ time per node and $O(|V|^2)$ time in total. □

One might hope to require only $x$ to be integer and allow $y$ and $z$ to hold fractional values, which would address the challenge that Denis Naddef proposed [15]. He wished to know if one could find a simple formulation for the GTSP that finds optimal solutions by only requiring integrality of the decision variables $x^*$, and nothing else. But this cannot be done (for polynomially-sized or polynomially-separable classes of inequalities unless $P = NP$), which can be seen by the following theorem.

Theorem 3

Let $G$ be a 3-regular graph, and let $G'$ be the result of adding one vertex to the middle of each edge in $G$. Consider a solution $x^*$, where $x^*_e = 1$ for each edge $e \in G'$. Then $x^*$ is in the GTSP polytope iff $G$ is Hamiltonian.

In this proof, define $x^*(S) = \sum_{e \in S} x^*_e$.

Proof. If $G$ is Hamiltonian, let $P$ be the set of edges in a Hamilton cycle of $G$, and let $P'$ be the set of corresponding edges in the graph $G'$. Note that in $G'$ every degree-two node is adjacent to two degree-three nodes, and that the cycle $P'$ reaches every degree-three node in $G'$. One GTSP tour in $G'$ can be created by adding an edge of weight two on exactly one of the two edges adjacent to each degree-two node in $G'$ but not used in $P'$. The other GTSP tour can be created by adding an edge of weight two on the edges not chosen by the first tour. The convex combination of each of these tours with weight $\tfrac{1}{2}$ will create a solution where $x^*_e = 1$ for each edge $e \in G'$.

Now suppose we have a solution $x^*$, where $x^*_e = 1$ for each edge $e \in G'$ and $x^*$ is in the GTSP polytope. Express $x^* = \sum_k \lambda_k \chi^k$ as a convex combination of GTSP tours. Consider any degree-two vertex $v$ in $G'$. Since $v$ has degree two, $x^*(\delta(v)) = 2$. Also $\chi^k(\delta(v)) \ge 2$ for each $\chi^k$, and so, by the convex combination, it must be that $\chi^k(\delta(v)) = 2$ for each $\chi^k$. If the neighbors of $v$ are nodes $i$ and $j$, then $\chi^k(\delta(v)) = 2$ implies either $\chi^k_{i,v} = 1$ and $\chi^k_{j,v} = 1$, or $\chi^k_{i,v} = 2$ and $\chi^k_{j,v} = 0$, or $\chi^k_{i,v} = 0$ and $\chi^k_{j,v} = 2$.

The edges of weight one in $\chi^k$ form disjoint cycles, so pick one such cycle $C$, and let $S_1$ be the set of vertices in $C$. (If there are no edges of weight one in $\chi^k$, let $S_1$ be a set containing any single degree-three vertex.) Let $S_2$ be the set of degree-two vertices $v$ such that $\chi^k_{i,v} = 2$ for some $i \in S_1$. Notice that $\chi^k(\delta(S_1 \cup S_2)) = 0$, since the degree of any node $v \in S_2$ is exactly two in $\chi^k$, and the edge connecting $v$ to $S_1$ has weight two. $\chi^k$ is a tour and thus must be connected, which is only possible if $S_1 \cup S_2$ is the entire vertex set of $G'$, and therefore the cycle $C$ must visit every degree-three node in $G'$. The corresponding cycle in the graph $G$ would therefore be a Hamilton cycle. □

If one knew when the integer solution $x^*$ were in the GTSP polytope, then this theorem would imply a polynomial time algorithm to determine if a 3-regular graph is Hamiltonian, which is an $NP$-complete problem.

The challenge that Naddef proposed never specifically defined what makes a formulation simple. Certainly having all constraint sets be polynomially-sized or polynomially-separable (in terms of $n$, the number of nodes) would qualify as simple, but there may be other normal sets of constraints that could satisfy the spirit of Naddef's challenge. One such example is Naddef's conjecture that simply using the three classes of inequalities from his 1985 paper (path, wheelbarrow, and bicycle inequalities) [5] with integrality constraints only on the variables $x$ would be sufficient to formulate the problem. Since it is not known if these three classes of inequalities can be separated in polynomial time, the theorem above does not directly address this conjecture.

However, if we are given an arbitrary constraint of the form $ax \ge b$, it can be recognized in polynomial time whether or not this constraint belongs to a particular class of inequality (path, wheelbarrow, or bicycle) and whether or not a potential solution $x^*$ violates this constraint. If that potential solution $x^*$ were not in the GTSP polytope, then there would be a polynomially sized verification of the graph $G$ not being Hamiltonian, which would imply $NP = \text{co-}NP$.

This would apply to any integer programming formulation with a finite number of inequality classes that contain inequalities that are normal. In this case, we define normal to mean that the membership of any individual constraint in a class can be verified in polynomial time.

This implies Naddef's challenge cannot be completed successfully using normal inequalities, unless $NP = \text{co-}NP$. However, our formulation follows its spirit, as the integer constrained variables $y$ in our formulation have a one-to-one correspondence to the integer constrained variables $x$ in the challenge.

One feature that sets the GTSP apart is that its variables may take the values $x_e \in \{0, 1, 2\}$, whereas most formulations for other graph theory applications simply require $x_e \in \{0, 1\}$. In the variant of GTSP where doubled edges are disallowed, but nodes may still be visited multiple times, the formulation from above would be a solution to the Naddef challenge, since in this case, $x = y$ and $z = 0$. If there are degree 2 nodes present in the graph, then disallowing doubled edges forces the tour across a particular path, since the tour cannot visit this degree 2 node and return back along the same edge.

Assume we have an integer programming formulation for the GTSP. Then any integer point $x^*$ dominates a convex combination of GTSP tours, or it must violate at least one inequality from this formulation. Therefore, using the graphs $G$ and $G'$ illustrated in Theorem 3 (where $G$ is a 3-regular graph, and $G'$ is the same graph with every edge subdivided into two with a new degree 2 node), the GTSP formulation when applied to $G'$ would certify whether the graph $G$ is Hamiltonian or non-Hamiltonian. If the constraint classes of the formulation are normal, as defined at the end of the previous section, then this certificate can be constructed in polynomial time.

In the case where $G$ is not Hamiltonian, an integer programming formulation for the GTSP must have a violated constraint for any solution $x^*$ where $x^*(E') = |E'|$, where $E'$ is the edge set of $G'$.
Assuming $n = |V|$, where $V$ is the node set of $G$, and $n' = |V'|$, where $V'$ is the node set of $G'$, we can determine the size of $|E'|$ by noting that every edge in $E'$ connects a degree 3 node to a degree 2 node. Therefore, the set of degree 2 nodes can be represented by $W = V' \setminus V$, and $|E'|$ is equal to the number of degree 2 nodes in $G'$ times two, or $2(n' - n)$. The number of degree 2 nodes in $G'$ is the same as the number of edges in $G$, which is $\tfrac{3}{2} n$, so we get $n' - n = \tfrac{3}{2} n$ or $n = \tfrac{2}{5} n'$, and $2(n' - n) = 2(n' - \tfrac{2}{5} n') = \tfrac{6}{5} n'$.

Since $x(\delta(v')) \ge 2$ for each degree 2 node $v' \in G'$, adding these constraints over all degree 2 nodes gives the constraint:

\[ \sum_{v' \in W} x(\delta(v')) = x(E') \ge \tfrac{6}{5} n' \]

Since the graph is not Hamiltonian, this constraint cannot be satisfied at equality, leading to $x(E') > \tfrac{6}{5} n'$. Since the expression $\tfrac{6}{5} n'$ is equal to two times an integer quantity $(n' - n)$, $\tfrac{6}{5} n'$ must be an even integer. Furthermore, for any solution to the formulation, $x(\delta(v'))$ must be positive and even for every degree 2 node $v' \in G'$, and thus $x(E')$ must also be even, so the constraint can be stated as:

\[ \sum_{v' \in W} x(\delta(v')) = x(E') \ge \tfrac{6}{5} n' + 2 \]

But these inequalities do not make up a normal class, as defined in the previous section. This is because the validity of this inequality relies on the certainty of $G$ being non-Hamiltonian. We believe these inequalities can be lifted to a complete graph with no coefficient greater than 3 (we are quite sure we could do this with maximum coefficient 4).
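The construction behind Theorem 3 and the edge count above can be checked on a small example. The sketch below is only an illustration: it takes $G = K_4$ (which is 3-regular and Hamiltonian), subdivides every edge, builds the two GTSP tours described in the proof, and confirms that their average puts weight one on every edge of $G'$. The helper names `build_tours` and `is_tour`, and the labelling of subdivision nodes as tuples, are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations


def build_tours(g_edges, cycle_edges):
    """Build the two tours on G' from the proof of Theorem 3."""
    tour_a, tour_b = {}, {}
    for (u, v) in g_edges:
        mid = ("mid", u, v)  # new degree-2 node placed on edge {u, v}
        first, second = (u, mid), (mid, v)
        if (u, v) in cycle_edges:
            # Edges of the subdivided Hamilton cycle are used once by both tours.
            tour_a[first] = tour_a[second] = 1
            tour_b[first] = tour_b[second] = 1
        else:
            # Off-cycle midpoints are reached by doubling one of their two edges.
            tour_a[first], tour_a[second] = 2, 0
            tour_b[first], tour_b[second] = 0, 2
    return tour_a, tour_b


def is_tour(x):
    """Positive even degree at every node and a connected support graph."""
    degree, adj = {}, {}
    for (a, b), mult in x.items():
        for p, q in ((a, b), (b, a)):
            degree[p] = degree.get(p, 0) + mult
            adj.setdefault(p, set())
            if mult > 0:
                adj[p].add(q)
    if any(d == 0 or d % 2 for d in degree.values()):
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(adj)


n = 4
g_edges = list(combinations(range(n), 2))        # the six edges of K4
cycle_edges = {(0, 1), (1, 2), (2, 3), (0, 3)}   # a Hamilton cycle of K4
tour_a, tour_b = build_tours(g_edges, cycle_edges)

# Convex combination with weight 1/2 on each tour.
average = {e: Fraction(tour_a[e] + tour_b[e], 2) for e in tour_a}

n_prime = n + len(g_edges)    # nodes of G': n' = 5n/2
num_edges_prime = len(tour_a) # edges of G': |E'| = 6n'/5
```

For $K_4$ this gives $n' = 10$ and $|E'| = 12 = \tfrac{6}{5} n'$, matching the count derived above.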

Another interesting graph involving degree 2 nodes comes from subdividing an edge twice, creating a path of three edges with two intermediate degree 2 nodes. Given any 3-regular, 3-edge-connected graph, Haddadan et al. [16] showed that the point $x^*$ where $x_e = 1$ for every edge in such a graph will be in the GTSP polytope, even though every node in the graph has odd degree. Now imagine choosing any individual degree 3 node, call it $v$, and subdividing each of the incident edges twice, creating six degree 2 nodes, two along each path. Now the solution $x^*$ where $x_e = 1$ for every edge cannot be in the GTSP polytope, since we can easily find a violated 3-tooth comb inequality by choosing $v$ and its immediate neighbors for the handle, and the pairs of adjacent degree 2 nodes as the teeth (see figure 2).

Fig. 2 A violated 3-tooth comb inequality

Furthermore, consider a solution $x^*$ that is in the GTSP polytope for a graph, where $x_e = 1$ along each edge of a path of at least three edges connecting two higher-degree nodes with only degree 2 nodes along the interior of the path (see figure 3). Then every GTSP tour that makes up the convex combination of tours for the solution $x^*$ must also have $x_e = 1$ along each edge of the path. To prove this for a path $P$ of three edges, notice that $x(P) \ge 3$ for every GTSP tour, and since $x^*(P) = 3$, we know $x(P) = 3$ for every GTSP tour in the convex combination indicated by $x^*$. If $x_e = 2$ for some edge in the path, then since the path has three edges, $x(P) \ge 4$, and thus such a tour could not be in the convex combination of tours indicated by $x^*$. Alternatively, consider a solution $x^*$ that is in the GTSP polytope for a graph, where $x_e = 1$ along each edge of a path of only two edges connecting two higher-degree nodes with a degree 2 node in the middle of the path. Note that an edge with $x_e = 2$ may be in one of the GTSP tours in the convex combination indicated by $x^*$, since one tour of weight $\tfrac{1}{2}$ could visit one edge of the path twice, and another tour of weight $\tfrac{1}{2}$ could visit the other edge twice.

Fig. 3 A path of 3 edges connecting 2 higher-degree nodes with interior nodes of degree 2

When the graph contains Steiner nodes (nodes that need not be visited), we write $G = (V_d \cup V_s, E)$ with $V_d \cap V_s = \emptyset$, where $V_d$ represents the set of destination nodes and $V_s$ represents the set of Steiner nodes.

\[
\begin{array}{lll}
\text{minimize} & \sum_{e \in E} c_e x_e & \\
\text{subject to} & \sum_{e \in \delta(v)} x_e \text{ is positive and even} & \forall v \in V_d \\
 & \sum_{e \in \delta(v)} x_e \text{ is even} & \forall v \in V_s \\
 & \sum_{e \in \delta(S)} x_e \ge 2 & \forall S \subset V \text{ where } S \cap V_d \ne \emptyset,\ S \cap V_d \ne V_d \\
 & x_e \ge 0 & \forall e \in E \\
 & x_e \text{ is integer} & \forall e \in E.
\end{array}
\]

Note that only sets that include destination nodes need to have corresponding cut constraints, and these can be limited to sets where the intersection with $V_d$ has size $|V_d| / 2$ or smaller. Again, these can be replaced by the flow constraints in the style proposed by Martin [14]. The constraints used in our computational results are similar to those in the multi-commodity flow formulation found by Letchford, Nasiri, and Theis [13]. We were able to reduce the number of variables used by Letchford et al. by a factor of 2, by setting many variables to 0, as we did with the flow variables in the formulation of section 4.1. Assume $V_d = \{1, 2, \ldots, d\}$ and $V_s = \{d + 1, d + 2, \ldots, n\}$ and designate city $d$ as the home city. We include $d - 1$ flow problems, where flow problem $k$ sends two units of flow from the set $S = \{k + 1, k + 2, \ldots, d\}$ to node $k$ using the values of $x$ as edge capacities. Since nodes in $S$ are all sources, we can set flow into these nodes to zero, as well as setting the flow coming out of node $k$ to zero.
\[
\begin{array}{ll}
f^k_{i,j} = 0 & \forall j \in V_d,\ k \in V_d \setminus \{d\} \text{ with } j > k,\ \{i, j\} \in E \\
f^k_{k,i} = 0 & \forall k \in V_d \setminus \{d\},\ \{i, k\} \in E \\
\sum_{i \in \delta(k)} f^k_{i,k} = 2 & \forall k \in V_d \setminus \{d\} \\
\sum_{i \in \delta(j)} f^k_{i,j} - \sum_{i \in \delta(j)} f^k_{j,i} = 0 & \forall j \in V,\ k \in V_d \setminus \{d\} \text{ with either } j < k \text{ or } j \in V_s \\
f^k_{i,j} + f^k_{j,i} \le x_e & \forall k \in V_d \setminus \{d\},\ \{i, j\} = e \in E
\end{array}
\]

5.2 Half-z Path without Spanning Trees

While the spanning tree constraints (3) of section 3.2 can prevent half-$z$ paths when integrality of $y$ is enforced, for the computational results in the next section, better integrality gaps were obtained by using the subtour elimination constraints plus the following, which prevents half-$z$ paths (with three or more edges) without requiring the integrality of $y$.

\[ \sum_{e' \in \delta(i)} x_{e'} + \sum_{e' \in \delta(j)} x_{e'} - 2 z_e \ge 4 \quad \forall e \in E \tag{4} \]

where $i$ and $j$ are the endpoints of edge $e$.

If $z_e = 1$, this constraint is the subtour elimination constraint for the set $\{i, j\}$. If $z_e = 0$, this is the sum of the degree constraints (lower bound) for nodes $i$ and $j$. But in the middle of a path of length three or longer with edges that have $y_e = 0$ and $z_e = \tfrac{1}{2}$, the left side of this constraint will only add to three.

It should be noted that this constraint can only be used when both endpoints of $e$ are destination nodes, since Steiner nodes do not have a lower bound of degree 2, but could be degree zero.

It should also be noted that if the GTSP instance is composed only of three paths of length three between two specific nodes (see figure 1 from section 3.1), constraints (2) from section 3.1 (those that enforce even degree) and (4) (defined above) will be enough to close the entire integrality gap using an LP relaxation. If the paths are all four or more edges long, this constraint will not eliminate the integrality gap, but will help. (See figure 4)

Fig. 4 Solutions from a 3-path configuration of four edges each: objective value 12 using only constraint (2), objective value 13 using both constraints (2) and (4), and objective value 14 for an integer solution
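To make the effect of constraint (4) concrete, the following small sketch evaluates its left-hand side on the middle edge of a three-edge path whose interior nodes have degree 2. It is an illustration only: the function name `constraint4_lhs`, the node labels, and the dictionary layout are assumptions of this example, and the values of $y$ and $z$ are chosen by hand rather than taken from an LP solver.

```python
def constraint4_lhs(edge, y, z, incident):
    """Left-hand side of constraint (4) for the given edge.

    y, z map edges to values; incident maps a node to its incident edges.
    Uses x_e = y_e + 2 z_e from equation (1).
    """
    i, j = edge

    def x(e):
        return y[e] + 2 * z[e]

    degree_i = sum(x(e) for e in incident[i])
    degree_j = sum(x(e) for e in incident[j])
    return degree_i + degree_j - 2 * z[edge]


# Path u - i - j - w, where i and j are degree-2 nodes.
e1, e2, e3 = ("u", "i"), ("i", "j"), ("j", "w")
incident = {"i": [e1, e2], "j": [e2, e3]}

# Half-z path: y_e = 0 and z_e = 1/2 on every edge.
half_y = {e1: 0, e2: 0, e3: 0}
half_z = {e1: 0.5, e2: 0.5, e3: 0.5}
lhs_half = constraint4_lhs(e2, half_y, half_z, incident)   # 2 + 2 - 1 = 3

# Integer alternative: every edge used exactly once.
once_y = {e1: 1, e2: 1, e3: 1}
once_z = {e1: 0, e2: 0, e3: 0}
lhs_once = constraint4_lhs(e2, once_y, once_z, incident)   # 2 + 2 - 0 = 4
```

The half-$z$ values satisfy the degree lower bounds at both interior nodes, yet the left-hand side is 3, so constraint (4) with right-hand side 4 cuts this point off while the integer solution remains feasible.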

As the paths get longer, the integrality gap slowly grows. The spanningtree constraints will be useful once the paths reach a length of at least seven.(See ﬁgure 5)Spanning tree constraints help close the integrality gap on these long-pathgraphs because any spanning tree must contain n − (cid:80) e ∈ E y e + z e ≥ n −

1, where n is the number of nodes in the graph. For a y e = 1 and z e = 0 y e = 0 and z e = y e = 0 and z e = 1 Fig. 5

Solutions from a 3-path conﬁguration of seven edges each relaxation on a graph that tries to save costs by employing frequent fractional z variables, this constraint limits the amount that can be saved. An optimalsolution for a relaxation including constraint (3) for the 3-path conﬁgurationof seven-edge paths is shown in ﬁgure 6. y e = and z e = y e = and z e = Fig. 6

Objective value 24 using constraints (2), (3), and (4)

Spanning tree constraints (3) did not contribute to smaller integrality gaps in our computational results of section 6 when added to the LP relaxation consisting of the subtour elimination constraints (or their compact equivalent), and the constraints in (4) designed specifically to prevent the short half-$z$ path.

5.3 Removing Steiner Nodes

Removing Steiner nodes increases the effectiveness of constraints in (4). A graph with Steiner nodes $G = (V_d \cup V_s, E)$ can be transformed into a graph without Steiner nodes $G' = (V_d, E')$ by doing the following:

For each pair of nodes $i, j \in V_d$, if the shortest path from $i$ to $j$ in $G$ contains no other nodes in $V_d$, then add an edge connecting $i$ to $j$ to $E'$ with a cost equal to the cost of this shortest path; otherwise, do not add an edge from $i$ to $j$ to $E'$.

In all but one of our test problems (see section 6), removing Steiner nodes resulted in fewer, not more, edges than in the original instance. That removing Steiner nodes often reduces the total edges in a graph was also observed by Corberán, Letchford, and Sanchis [4].
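The transformation described in this section can be sketched in Python. This is a minimal sketch and not the authors' implementation: the function names, the edge-list input format, and the tie rule (an edge is added when at least one shortest path avoids all other destination nodes) are assumptions made for illustration. Costs are assumed to be integers so that the equality test is exact.

```python
import heapq

def shortest_costs(adj, source, no_pass=frozenset()):
    """Dijkstra from source. Nodes in no_pass can be reached but never passed through."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u != source and u in no_pass:
            continue  # endpoint only: do not extend paths beyond it
        for v, w in adj.get(u, ()):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def remove_steiner_nodes(edges, destinations):
    """Return {(i, j): cost} for the graph on destination nodes only."""
    adj = {}
    for u, v, w in edges:  # undirected edges (u, v, cost)
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dests = sorted(destinations)
    new_edges = {}
    for i in dests:
        full = shortest_costs(adj, i)                       # true shortest-path costs
        direct = shortest_costs(adj, i, frozenset(dests))   # no intermediate destinations
        for j in dests:
            # Keep (i, j) only if a shortest path avoids every other destination node.
            if i < j and j in direct and direct[j] == full[j]:
                new_edges[(i, j)] = full[j]
    return new_edges

# Hypothetical toy instance: destinations A, B, C and one Steiner node s.
edges = [("A", "s", 1), ("s", "B", 1), ("B", "C", 1), ("A", "C", 5)]
print(remove_steiner_nodes(edges, {"A", "B", "C"}))
```

On this toy instance the result contains the edges (A, B) with cost 2 and (B, C) with cost 1. The original edge from A to C is not kept, because the shortest path from A to C has cost 3 and passes through destination B.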

Fig. 7 Map of basic United States highway system

Our search for a reasonably sized data set based on the interstate highways of the United States led us to a text file uploaded by Sergiy Kolodyazhnyy on GitHub [11]. After a few errors were corrected and additions made, we had a highway network with 216 nodes and 358 edges, with a maximum node degree of seven (Indianapolis). See figure 7. Data for this graph, and the cities used to create the instances in this section, may be found at https://ns.faculty.brynathyn.edu/interstate/

Table 1 GTSP instances

name         description                                     destinations   solution (miles)
dakota3path  3-path configuration in northern plains         11             2682
NFLcities    Cities with National Football League teams      29             11050
NWcities     Cities in the Northwest region                  43             8119
CAPcities    48 state capitals plus Washington D.C.          49             14878
AtoJcities   Cities beginning with letters from A to J       101            17931
ESTcities    Cities east of the Mississippi River            139            13251
MSAcities    Centers of 145 metropolitan statistical areas   145            22720
deg3cities   Cities in original graph with degree ≥

Instances were created from this map by choosing a subset of cities as destination nodes, and adding any cities along a shortest path between destinations as Steiner nodes. Alternate versions of these instances were constructed by removing the Steiner nodes as indicated in section 5.3. Table 1 gives the basic information for several instances we used. Table 2 shows the results from running the relaxation of the formulation from Cornuéjols et al. [5]. It should be noted that the solutions found by this relaxation were the same whether Steiner nodes were removed or not.

Table 2 Relaxations from Cornuéjols et al. formulation [5]

name         destinations (Steiner nodes)   edges (w/o Steiner)   integrality gap (%)
dakota3path  11 (0)                         12 (12)               139 (5.18%)
NFLcities    29 (152)                       304 (135)             35 (0.32%)
NWcities     43 (4)                         63 (59)               12 (0.15%)
CAPcities    49 (132)                       301 (199)             34 (0.23%)
AtoJcities   101 (95)                       326 (289)             261.5 (1.45%)
ESTcities    139 (2)                        243 (240)             61.4 (0.46%)
MSAcities    145 (63)                       348 (317)             143 (0.63%)
deg3cities   171 (16)                       321 (305)             70 (0.37%)
NScities     174 (29)                       341 (324)             93.5 (0.42%)
CtoWcities   182 (30)                       353 (358)             151 (0.62%)
ALLcities    216 (0)                        358 (358)             274.8 (1.04%)

Table 3 Relaxations from our additional constraints

name         integrality gap with   integrality gap w/o   best % of gap closed
             Steiner nodes (%)      Steiner nodes (%)     from formulation in [5]
dakota3path  -                      0 (0%)                100%
NFLcities    35 (0.32%)             same as Steiner       0%
NWcities     8 (0.10%)              same as Steiner       33.3%
CAPcities    34 (0.23%)             same as Steiner       0%
AtoJcities   228.5 (1.45%)          same as Steiner       12.6%
ESTcities    53.9 (0.46%)           same as Steiner       12.2%
MSAcities    114.5 (0.50%)          98.5 (0.43%)          31.1%
deg3cities   70 (0.37%)             same as Steiner       0%
NScities     48.5 (0.22%)           same as Steiner       48.1%
CtoWcities   103 (0.42%)            111.5 (0.46%)         31.8%
ALLcities    -                      217.8 (0.82%)         20.7%
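As a worked check of how the table columns relate, the sketch below recomputes two kinds of entries from the reported numbers: a gap expressed as a percentage of the integer solution value from Table 1, and the share of the gap from the formulation in [5] that is closed by the additional constraints. The function names are hypothetical, and the input values are copied from the NWcities and MSAcities rows.

```python
def gap_percent(gap, integer_value):
    """Gap as a percentage of the integer solution value."""
    return 100.0 * gap / integer_value

def gap_closed_percent(gap_old, gap_new):
    """Share of the older relaxation's gap that the newer relaxation removes."""
    return 100.0 * (gap_old - gap_new) / gap_old

# NWcities: integer solution 8119 miles, gap 12 with [5], gap 8 with the added constraints.
nw_gap_pct = round(gap_percent(12, 8119), 2)
nw_closed = round(gap_closed_percent(12, 8), 1)

# MSAcities: gap 143 with [5], best gap 98.5 (the variant without Steiner nodes).
msa_closed = round(gap_closed_percent(143, 98.5), 1)

print(nw_gap_pct, nw_closed, msa_closed)
```

These evaluate to 0.15, 33.3, and 31.1, matching the corresponding table entries. For MSAcities the best of the two variants is used, as the column heading indicates.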

Running times on a 2.1GHz Xeon processor for all of the relaxations were under 10 seconds, while the running times to generate the integer solutions never exceeded five minutes. We wish to point out that the value of the new formulation is not a faster running time, but the reduced integrality gap.

In this paper, the integrality gap refers to the difference in objective values between the program where integer constraints are enforced and the program where the integer constraints are relaxed. The percentage is the gap size expressed as a percentage of the integer solution value. This is different from the ratio definitions of integrality gap used in some other contexts [2].

When our constraints were added, the spanning tree constraints (3) were not useful when (2) and (4) were present. In most cases, removing Steiner nodes did not change the optimal values found by our relaxation. In one case, the relaxation was better when the Steiner nodes were removed, and in one case, the relaxation was worse when the Steiner nodes were removed. Table 3 shows our results, where the last column indicates the percentage that our formulation closed of the gap left by the formulation of Cornuéjols et al. [5].

We noticed that in instances where the $z_e$ variables were rarely positive, our relaxation fared no better than that of Cornuéjols et al. But when more of the edges with $x_e > 0$ also had $z_e > 0$, we were able to shave anywhere from 10% to almost 50% of the gap left behind by Cornuéjols et al. (See figure 8)

Fig. 8 Scatter plot of integrality gap closure and percent of variables with $z_e > 0$. Axis label: percent of $x_e > 0$ edges also with $z_e > 0$.

In section 2, we noted that the constraints requiring a sum of variables to be even were unique and difficult, which was a reason Naddef proposed his challenge, described in section 4. At first glance, these constraints may appear to have the same type of structure as the T-join problem, but this is not the case, as seen in a book by Cook et al. [3].