ADMM for the SDP relaxation of the QAP ∗

Danilo Elias Oliveira †    Henry Wolkowicz ‡    Yangyang Xu §

December 18, 2015
Abstract
The semidefinite programming (SDP) relaxation has proven to be extremely strong for many hard discrete optimization problems. This is in particular true for the quadratic assignment problem (QAP), arguably one of the hardest NP-hard discrete optimization problems. There are several difficulties that arise in efficiently solving the SDP relaxation, e.g., increased dimension; inefficiency of the current primal-dual interior point solvers in terms of both time and accuracy; and difficulty and high expense in adding cutting plane constraints. We propose using the alternating direction method of multipliers (ADMM) to solve the SDP relaxation. This first order approach allows for inexpensive iterations, a method of cheaply obtaining low rank solutions, as well as a trivial way of adding cutting plane inequalities. When compared to current approaches and current best available bounds, we obtain remarkable robustness, efficiency and improved bounds.
Keywords: Quadratic assignment problem, semidefinite programming relaxation, alternating direction method of multipliers, large scale.
Classification code:
∗ This work is partially supported by NSERC and AFOSR.
† Dept. of Combinatorics and Optimization, University of Waterloo.
‡ Dept. of Combinatorics and Optimization, University of Waterloo. Research supported by The Natural Sciences and Engineering Research Council of Canada and by AFOSR. Email: [email protected]
§ Institute for Mathematics and its Applications (IMA), University of Minnesota
1 Introduction

The quadratic assignment problem (QAP), in the trace formulation, is

    p*_X := min_{X ∈ Π_n} ⟨AXB − 2C, X⟩,                                        (1.1)

where A, B ∈ S^n are real symmetric n × n matrices, C is a real n × n matrix, ⟨·,·⟩ denotes the trace inner product, ⟨Y, X⟩ = trace(YX^T), and Π_n denotes the set of n × n permutation matrices. A typical objective of the QAP is to assign n facilities to n locations while minimizing total cost. The assignment cost is the sum of costs using the flows in A_{ij} between a pair of facilities i, j, multiplied by the distance in B_{st} between their assigned locations s, t, and adding on the location cost of a facility i in a position s given in C_{is}.

It is well known that the QAP is an NP-hard problem and that problems with size as moderate as n = 30 still remain difficult to solve. Solution techniques rely on calculating efficient lower bounds. An important tool for finding lower bounds is the work in [13] that provides a semidefinite programming (SDP) relaxation of (1.1).
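For concreteness, the following small Python sketch (ours, for illustration only; the data A, B, C are random placeholders) evaluates the trace-formulation objective of (1.1) at a permutation matrix:

    import numpy as np

    def qap_objective(A, B, C, X):
        """Trace formulation (1.1): <A X B - 2C, X> = trace((A X B - 2C) X^T)."""
        return np.trace((A @ X @ B - 2 * C) @ X.T)

    n = 6
    rng = np.random.default_rng(0)
    A = rng.random((n, n)); A = (A + A.T) / 2     # symmetric flow matrix
    B = rng.random((n, n)); B = (B + B.T) / 2     # symmetric distance matrix
    C = rng.random((n, n))                        # location costs

    X = np.eye(n)[rng.permutation(n)]             # a random permutation matrix in Pi_n
    print(qap_objective(A, B, C, X))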
The methods of choice for SDP are based on a primal-dual interior-point, p-d i-p, approach. These methods cannot solve large problems, have difficulty in obtaining high accuracy solutions, and cannot properly exploit sparsity. Moreover, it is very expensive to add on nonnegativity and cutting plane constraints. The current state for finding bounds and solving the QAP is given in e.g., [1, 2, 4, 7, 9].

In this paper we study an alternating direction method of multipliers (ADMM) for solving the SDP relaxation of the QAP. We compare this with the best known results given in [9], with the best known bounds found at QAPLIB [5], and with a p-d i-p method based on the so-called HKM direction. We see that the ADMM method is significantly faster and obtains high accuracy solutions. In addition, there are advantages in obtaining low rank SDP solutions that provide better feasible approximations for the QAP for upper bounds. Finally, it is trivial to add nonnegativity and rounding constraints while iterating, so as to obtain significantly stronger bounds and also maintain sparsity during the iterations. We note that previous success of ADMM for SDP is presented in [12]. A detailed survey article for ADMM can be found in [3].
We start the derivation from the following equivalent quadratically constrained quadratic problem:

    min_X   ⟨AXB − 2C, X⟩
    s.t.    X_{ij} X_{ik} = 0,  X_{ji} X_{ki} = 0,   ∀ i, ∀ j ≠ k,
            X_{ij}^2 − X_{ij} = 0,   ∀ i, j,                                     (2.1)
            ∑_{i=1}^n X_{ij}^2 − 1 = 0,  ∀ j,      ∑_{j=1}^n X_{ij}^2 − 1 = 0,  ∀ i.
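Every permutation matrix satisfies the constraints in (2.1); a small numerical check (ours, for illustration, assuming the squared row/column-sum form of the constraints as reconstructed above):

    import numpy as np

    def satisfies_21(X, tol=1e-12):
        """Directly check the quadratic constraints of (2.1)."""
        n = X.shape[0]
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    if j != k and (abs(X[i, j] * X[i, k]) > tol or abs(X[j, i] * X[k, i]) > tol):
                        return False          # a 'gangster' product is nonzero
        if not np.allclose(X**2, X, atol=tol):
            return False                      # X_ij^2 = X_ij fails
        if not np.allclose((X**2).sum(axis=0), 1, atol=tol):
            return False                      # a column of squares does not sum to 1
        if not np.allclose((X**2).sum(axis=1), 1, atol=tol):
            return False                      # a row of squares does not sum to 1
        return True

    X = np.eye(5)[np.random.permutation(5)]   # a random 5x5 permutation matrix
    print(satisfies_21(X))                    # True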
Remark 2.1. Note that the quadratic orthogonality constraints X^T X = I, X X^T = I, and the linear row and column sum constraints Xe = e, X^T e = e, can all be linearly represented using linear combinations of those in (2.1). In addition, the first set of constraints, the elementwise orthogonality of the rows and columns of X, are referred to as the gangster constraints. They are particularly strong constraints and enable many of the other constraints to be redundant. In fact, after the facial reduction done below, many of these constraints also become redundant. (See the definition of the index set J below.)

The Lagrangian of (2.1) is

    L(X, U, V, W, u, v) = ⟨AXB − 2C, X⟩ + ∑_{i=1}^n ∑_{j≠k} U^{(i)}_{jk} X_{ij} X_{ik} + ∑_{i=1}^n ∑_{j≠k} V^{(i)}_{jk} X_{ji} X_{ki}
                          + ∑_{i,j} W_{ij} (X_{ij}^2 − X_{ij}) + ∑_{j=1}^n u_j ( ∑_{i=1}^n X_{ij}^2 − 1 ) + ∑_{i=1}^n v_i ( ∑_{j=1}^n X_{ij}^2 − 1 ).

The dual problem is a maximization of the dual functional d_0,

    max_{U,V,W,u,v}  d_0(U, V, W, u, v) := min_X L(X, U, V, W, u, v).                         (2.2)

To simplify the dual problem, we homogenize the X terms in L by multiplying a unit scalar x_0 onto the degree-1 terms and adding the single constraint x_0^2 = 1 to the Lagrangian. We let

    L(X, x_0, U, V, W, w_0, u, v) = ⟨AXB − 2 x_0 C, X⟩ + ∑_{i=1}^n ∑_{j≠k} U^{(i)}_{jk} X_{ij} X_{ik} + ∑_{i=1}^n ∑_{j≠k} V^{(i)}_{jk} X_{ji} X_{ki}
                          + ∑_{i,j} W_{ij} (X_{ij}^2 − x_0 X_{ij}) + ∑_{j=1}^n u_j ( ∑_{i=1}^n X_{ij}^2 − 1 ) + ∑_{i=1}^n v_i ( ∑_{j=1}^n X_{ij}^2 − 1 ) + w_0 (x_0^2 − 1).

This homogenization technique is the same as that in [13]. The new dual problem is

    max  d(U, V, W, w_0, u, v) := min_{X, x_0} L(X, x_0, U, V, W, w_0, u, v).                 (2.3)

Note that d ≤ d_0. Hence, our relaxation still yields a lower bound to (2.1). In fact, the relaxations give the same lower bound. This follows from strong duality of the trust region subproblem, as shown in [13].

Let x = vec(X), y = [x_0; x], and w = vec(W), where x, w are the vectorizations, columnwise, of X and W, respectively. Then

    L(X, x_0, U, V, W, w_0, u, v) = y^T [ L_Q + B(U) + B(V) + Arrow(w, w_0) + K(u) + K(v) ] y − e^T (u + v) − w_0,

where

    K(u) = blkdiag(0, Diag(u) ⊗ I),    K(v) = blkdiag(0, I ⊗ Diag(v)),
    Arrow(w, w_0) = [ w_0, −(1/2) w^T ; −(1/2) w, Diag(w) ],
    B(U) = blkdiag(0, Ũ),    B(V) = blkdiag(0, Ṽ).

Here, Ũ and Ṽ are n × n block matrices with n × n blocks. Ũ has zero diagonal blocks and its (j, k)-th off-diagonal block is the diagonal matrix Diag(U^{(1)}_{jk}, . . . , U^{(n)}_{jk}), for all j ≠ k, while Ṽ has zero off-diagonal blocks and its i-th diagonal block is

    [ 0, V^{(i)}_{12}, · · ·, V^{(i)}_{1n} ; V^{(i)}_{21}, 0, · · ·, V^{(i)}_{2n} ; ... ; V^{(i)}_{n1}, V^{(i)}_{n2}, · · ·, 0 ].

Hence, the dual problem (2.3) is

    max   −e^T (u + v) − w_0
    s.t.  L_Q + B(U) + B(V) + Arrow(w, w_0) + K(u) + K(v) ⪰ 0.                                (2.4)

Taking the Lagrangian dual of (2.4) yields the SDP relaxation of (2.1):

    min   ⟨L_Q, Y⟩
    s.t.  G_J(Y) = E_{00},
          diag(Ȳ) = y_0,
          trace(Ỹ_{ii}) = 1,  ∀ i,                                                            (2.5)
          ∑_{i=1}^n Ỹ_{ii} = I,
          Y ⪰ 0,

where Ỹ_{ij} is an n × n matrix for each (i, j), and we have assumed the block structure

    Y = [ y_{00}, y_0^T ; y_0, Ȳ ],    Ȳ made of n × n block matrices Ỹ = (Ỹ_{ij}).          (2.6)

The index set J and the gangster operator G_J are defined properly below in Definition 2.1. (By abuse of notation, this is done after the facial reduction, which results in a smaller J.)
Remark 2.2. If one more feasible quadratic constraint q(X) can be added to (2.1), and q(X) cannot be linearly represented by those in (2.1), then the relaxation following the same derivation as above can be tighter. We conjecture that no more such q(X) exists, and thus that (2.5) is the tightest among all Lagrangian dual relaxations from a quadratically constrained program like (2.1). However, this does not mean that more linear inequality constraints cannot be added, i.e., linear cuts.

Theorem 2.1 ([13]). The matrix Y is feasible for (2.5) if, and only if, it is feasible for (3.1).

As above, let x = vec(X) ∈ R^{n^2} be the vectorization of X by column. Y is the original matrix variable of the SDP relaxation before the facial reduction. It can be motivated from the lifting

    Y = [ 1 ; vec(X) ] [ 1 ; vec(X) ]^T.

The SDP relaxation of the QAP presented in [13] uses facial reduction to guarantee strict feasibility. The SDP obtained is

    p*_R := min_R   ⟨L_Q, V̂ R V̂^T⟩
            s.t.    G_J(V̂ R V̂^T) = E_{00},                                                   (2.7)
                    R ⪰ 0,

where the so-called gangster operator, G_J, fixes all elements indexed by J and zeroes out all others,

    L_Q = [ 0, −vec(C)^T ; −vec(C), B ⊗ A ],        V̂ = [ 1, 0 ; (1/n) e, V ⊗ V ],          (2.8)

with e being the vector of all ones, of appropriate dimension, and V ∈ R^{n×(n−1)} being a basis matrix of the orthogonal complement of e, e.g.,

    V = [ I_{n−1} ; −e^T ].

We let Y = V̂ R V̂^T ∈ S^{n^2+1}.

Lemma 2.1 ([13]). The matrix R̂ defined by

    R̂ := [ 1, 0 ; 0, (1/(n^2(n−1))) (n I_{n−1} − E_{n−1}) ⊗ (n I_{n−1} − E_{n−1}) ] ∈ S^{(n−1)^2+1}_{++}

is (strictly) feasible for (2.7).
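A direct construction of the data in (2.8), following the columnwise vec convention used above, can be sketched as follows (an illustration we add here, not the implementation used for the experiments):

    import numpy as np

    def build_LQ_Vhat(A, B, C):
        """Build L_Q and the facial-reduction matrix Vhat of (2.8)."""
        n = A.shape[0]
        c = C.flatten(order='F')                               # vec(C), columnwise
        LQ = np.zeros((n * n + 1, n * n + 1))
        LQ[0, 1:] = -c
        LQ[1:, 0] = -c
        LQ[1:, 1:] = np.kron(B, A)                             # B (x) A
        V = np.vstack([np.eye(n - 1), -np.ones((1, n - 1))])   # basis of the orthogonal complement of e
        Vhat = np.zeros((n * n + 1, (n - 1) ** 2 + 1))
        Vhat[0, 0] = 1.0
        Vhat[1:, 0] = np.ones(n * n) / n                       # (1/n) e
        Vhat[1:, 1:] = np.kron(V, V)                           # V (x) V
        return LQ, Vhat

Given any R ⪰ 0 of order (n−1)^2 + 1, the matrix Y = Vhat @ R @ Vhat.T then lies in S^{n^2+1} on the face used in (2.7).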
Definition 2.1. The gangster operator G_J : S^{n^2+1} → S^{n^2+1} is defined by

    G_J(Y)_{ij} = { Y_{ij}  if (i, j) ∈ J or (j, i) ∈ J ;   0  otherwise }.

By abuse of notation, we let the same symbol also denote the projection onto R^{|J|}. We get the two equivalent primal constraints:

    G_J(V̂ R V̂^T) = E_{00} ∈ S^{n^2+1};        G_J(V̂ R V̂^T) = G_J(E_{00}) ∈ R^{|J|}.

Therefore, the dual variable for the first form is Y ∈ S^{n^2+1}. However, the dual variable for the second form is y ∈ R^{|J|}, with the adjoint now yielding Y = G*_J(y) ∈ S^{n^2+1}, obtained by symmetrization and filling in the missing elements with zeros.

The gangster index set J is defined to be (00) union the set of indices i < j in the matrix Ȳ in (2.6) corresponding to:

1. the off-diagonal elements in the n diagonal blocks;

2. the diagonal elements in the off-diagonal blocks, except for the last column of off-diagonal blocks and also not the (n−2, n−1) off-diagonal block. (These latter off-diagonal block constraints are redundant after the facial reduction.)

We note that the gangster operator is self-adjoint, G*_J = G_J. Therefore, the dual of (2.7) can be written as follows:

    d*_Y := max_Y   ⟨E_{00}, Y⟩  (= Y_{00})
            s.t.    V̂^T G_J(Y) V̂ ⪯ V̂^T L_Q V̂.                                              (2.9)

Again by abuse of notation, using the same symbol twice, we get the two equivalent dual constraints:

    V̂^T G_J(Y) V̂ ⪯ V̂^T L_Q V̂;        V̂^T G*_J(y) V̂ ⪯ V̂^T L_Q V̂.

As above, the dual variable for the first form is Y ∈ S^{n^2+1} and for the second form is y ∈ R^{|J|}. We have used G* for the second form to emphasize that only the first form is self-adjoint.

Lemma 2.2 ([13]). The matrices Ŷ, Ẑ, with M > 0 sufficiently large, defined by

    Ŷ := −M [ n, 0 ; 0, I_n ⊗ (I_n − E_n) ] ∈ S^{n^2+1},
    Ẑ := V̂^T L_Q V̂ − V̂^T G_J(Ŷ) V̂ ∈ S^{(n−1)^2+1}_{++},

are (strictly) feasible for (2.9).
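One possible encoding of the gangster index set J from Definition 2.1, and of the operator G_J, is sketched below (our illustration, with 0-based indices and (0,0) included; the exact blocks excluded in item 2 follow the reconstruction above and should be checked against [13]):

    import numpy as np

    def gangster_indices(n):
        """Index set J of Definition 2.1 (upper-triangular index pairs of Y in S^{n^2+1})."""
        J = {(0, 0)}
        blk = lambda p, i: 1 + (p - 1) * n + (i - 1)   # row index of entry i of block p (1-based p, i)
        for p in range(1, n + 1):                      # 1. off-diagonals of the n diagonal blocks
            for i in range(1, n + 1):
                for j in range(i + 1, n + 1):
                    J.add((blk(p, i), blk(p, j)))
        for p in range(1, n + 1):                      # 2. diagonals of the off-diagonal blocks
            for q in range(p + 1, n + 1):
                if q == n or (p, q) == (n - 2, n - 1):
                    continue                           # redundant after the facial reduction
                for i in range(1, n + 1):
                    J.add((blk(p, i), blk(q, i)))
        return J

    def G_J(Y, J):
        """Gangster operator: keep the entries indexed by J (symmetrically), zero out the rest."""
        out = np.zeros_like(Y)
        for (i, j) in J:
            out[i, j] = Y[i, j]
            out[j, i] = Y[j, i]
        return out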
We can write (2.7) equivalently as

    min_{R,Y}  ⟨L_Q, Y⟩    s.t.  G_J(Y) = E_{00},  Y = V̂ R V̂^T,  R ⪰ 0.                     (3.1)

The augmented Lagrangian of (3.1) is

    L_A(R, Y, Z) = ⟨L_Q, Y⟩ + ⟨Z, Y − V̂ R V̂^T⟩ + (β/2) ‖Y − V̂ R V̂^T‖_F^2.                   (3.2)

Recall that (R, Y, Z) are the primal reduced, primal, and dual variables, respectively. We denote (R, Y, Z) as the current iterate. We let S^r_{n+} denote the matrices in S_{n+} with rank at most r. Our new algorithm is an application of the alternating direction method of multipliers, ADMM, that uses the augmented Lagrangian in (3.2) and performs the following updates for (R_+, Y_+, Z_+):

    R_+ = argmin_{R ∈ S^r_{n+}}  L_A(R, Y, Z),                                               (3.3a)
    Y_+ = argmin_{Y ∈ P_i}  L_A(R_+, Y, Z),                                                  (3.3b)
    Z_+ = Z + γ · β (Y_+ − V̂ R_+ V̂^T),                                                      (3.3c)

where the simplest case for the polyhedral constraints P_i is the linear manifold from the gangster constraints:

    P_1 = { Y ∈ S^{n^2+1} : G_J(Y) = E_{00} }.

We use this notation as we add additional simple polyhedral constraints. The second case is the polytope

    P_2 = P_1 ∩ { 0 ≤ Y ≤ 1 }.

Let V̂ be normalized such that V̂^T V̂ = I. Then, if r = (n−1)^2 + 1 (so that the rank constraint is vacuous), the R-subproblem can be explicitly solved by

    R_+ = argmin_{R ⪰ 0}  ⟨Z, Y − V̂ R V̂^T⟩ + (β/2) ‖Y − V̂ R V̂^T‖_F^2
        = argmin_{R ⪰ 0}  ‖Y − V̂ R V̂^T + (1/β) Z‖_F^2
        = argmin_{R ⪰ 0}  ‖R − V̂^T (Y + (1/β) Z) V̂‖_F^2
        = P_{S_+} ( V̂^T (Y + (1/β) Z) V̂ ),                                                   (3.4)

where S_+ denotes the SDP cone and P_{S_+} is the projection onto S_+. For any symmetric matrix W, we have P_{S_+}(W) = U_+ Σ_+ U_+^T, where (U_+, Σ_+) contains the positive eigenpairs of W and (U_−, Σ_−) the negative eigenpairs.

If i = 1 in (3.3b), the Y-subproblem also has a closed-form solution:

    Y_+ = argmin_{G_J(Y) = E_{00}}  ⟨L_Q, Y⟩ + ⟨Z, Y − V̂ R_+ V̂^T⟩ + (β/2) ‖Y − V̂ R_+ V̂^T‖_F^2
        = argmin_{G_J(Y) = E_{00}}  ‖Y − V̂ R_+ V̂^T + (L_Q + Z)/β‖_F^2
        = E_{00} + G_{J^c} ( V̂ R_+ V̂^T − (L_Q + Z)/β ).                                      (3.5)
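A compact sketch of one pass of the updates (3.3)–(3.5) is given below (our illustration, not the implementation used for Table 1). It assumes the gangster index set J (as a set of index pairs) and the data LQ, Vhat from the sketches above, with Vhat column-normalized so that V̂^T V̂ = I; the box=True option applies the clipped Y-update of (3.7) discussed next.

    import numpy as np

    def proj_psd(W):
        """Projection onto the PSD cone: keep the nonnegative eigenpairs."""
        d, U = np.linalg.eigh((W + W.T) / 2)
        d = np.maximum(d, 0)
        return (U * d) @ U.T

    def admm_iteration(R, Y, Z, LQ, Vhat, J, beta, gamma=1.618, box=True):
        """One R-, Y-, Z-update of (3.3a)-(3.3c) using the closed forms (3.4) and (3.5)/(3.7)."""
        # R-update (3.4): project Vhat^T (Y + Z/beta) Vhat onto the PSD cone
        R = proj_psd(Vhat.T @ (Y + Z / beta) @ Vhat)
        # Y-update (3.5)/(3.7): free entries from the unconstrained minimizer, gangster entries fixed
        VRV = Vhat @ R @ Vhat.T
        T = VRV - (LQ + Z) / beta
        if box:
            T = np.clip(T, 0.0, 1.0)          # box constraints 0 <= Y <= 1 of (3.7)
        Y = T.copy()
        for (i, j) in J:                      # enforce G_J(Y) = E_00
            Y[i, j] = Y[j, i] = 0.0
        Y[0, 0] = 1.0
        # Z-update (3.3c)
        Z = Z + gamma * beta * (Y - VRV)
        return R, Y, Z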
The advantage of using ADMM is that its complexity only slightly increases while we add more constraints to (2.7) to tighten the SDP relaxation. If the constraint 0 ≤ V̂ R V̂^T ≤ 1, equivalently 0 ≤ Y ≤ 1, is added, we obtain

    p*_{RY} := min_{R,Y}  ⟨L_Q, Y⟩    s.t.  G_J(Y) = E_{00},  0 ≤ Y ≤ 1,  Y = V̂ R V̂^T,  R ⪰ 0.        (3.6)

The ADMM for solving (3.6) has the same R-update and Z-update as those in (3.3), and the Y-update is changed to

    Y_+ = E_{00} + min( 1, max( 0, G_{J^c} ( V̂ R_+ V̂^T − (L_Q + Z)/β ) ) ).                  (3.7)

With the nonnegativity constraint, the less-than-one constraint is redundant, but it makes the algorithm converge faster.

If we solve (2.7) or (3.1) exactly, or to a very high accuracy, we get a lower bound for the original QAP. However, the problem size of (2.7) or (3.1) can be extremely large, and thus obtaining an exact or highly accurate solution may take an extremely long time. In the following, we provide an inexpensive way to get a lower bound from the output of our algorithm that solves (3.1) to a moderate accuracy. Let (R^out, Y^out, Z^out) be the output of the ADMM for (3.6).
Lemma 3.1. Let

    R := { R ⪰ 0 },    Y := { Y : G_J(Y) = E_{00}, 0 ≤ Y ≤ 1 },    Z := { Z : V̂^T Z V̂ ⪯ 0 },

and define the dual function

    g(Z) := min_{Y ∈ Y} ⟨L_Q + Z, Y⟩.

Then the dual problem of (3.6) is defined as follows and satisfies weak duality:

    d*_Z := max_{Z ∈ Z} g(Z) ≤ p*_{RY}.
Proof. The dual problem of (3.6) can be derived as

    d*_Z := max_Z  min_{R ∈ R, Y ∈ Y}  ⟨L_Q, Y⟩ + ⟨Z, Y − V̂ R V̂^T⟩
          = max_Z  [ min_{Y ∈ Y} ⟨L_Q, Y⟩ + ⟨Z, Y⟩  +  min_{R ∈ R} ⟨Z, −V̂ R V̂^T⟩ ]
          = max_Z  [ min_{Y ∈ Y} ⟨L_Q, Y⟩ + ⟨Z, Y⟩  +  min_{R ∈ R} ⟨V̂^T Z V̂, −R⟩ ]
          = max_{Z ∈ Z}  min_{Y ∈ Y}  ⟨L_Q + Z, Y⟩
          = max_{Z ∈ Z}  g(Z).

Weak duality follows in the usual way by exchanging the max and the min.

For any Z ∈ Z, the value g(Z) is a lower bound for (3.6), and thus for the original QAP. We use the dual function value of the projection, g(P_Z(Z^out)), as the lower bound, and next we show how to get P_Z(Z̃) for any symmetric matrix Z̃.

Let V̂_⊥ be an orthonormal basis of the null space of V̂^T, so that V̄ = (V̂, V̂_⊥) is an orthogonal matrix. Let

    V̄^T Z V̄ = W = [ W_{11}, W_{12} ; W_{21}, W_{22} ],

and note that

    V̂^T Z V̂ ⪯ 0   ⇔   V̂^T Z V̂ = V̂^T V̄ W V̄^T V̂ = W_{11} ⪯ 0.

Hence,

    P_Z(Z̃) = argmin_{Z ∈ Z} ‖Z − Z̃‖_F = argmin_{W_{11} ⪯ 0} ‖V̄ W V̄^T − Z̃‖_F = argmin_{W_{11} ⪯ 0} ‖W − V̄^T Z̃ V̄‖_F
            = V̄ [ P_{S_−}(W̃_{11}), W̃_{12} ; W̃_{21}, W̃_{22} ] V̄^T,

where S_− denotes the negative semidefinite cone and we have assumed V̄^T Z̃ V̄ = [ W̃_{11}, W̃_{12} ; W̃_{21}, W̃_{22} ]. Note that P_{S_−}(W) = −P_{S_+}(−W).

Let (R^out, Y^out, Z^out) be the output of the ADMM for (3.6). Assume the largest eigenvalue and the corresponding eigenvector of Y^out are λ and v. We let X^out be the matrix reshaped from the second through the last elements of the first column of λ v v^T. Then we solve the linear program

    max_X  ⟨X^out, X⟩    s.t.  X e = e,  X^T e = e,  X ≥ 0.                                  (3.8)

Since the extreme points of the feasible set of (3.8) are the permutation matrices, a basic optimal solution of (3.8) is a permutation matrix; it gives a feasible solution of the QAP and hence an upper bound (see the sketch following Section 3.3 below).

3.3 Low-rank solution

Instead of finding a feasible solution through (3.8), we can directly get one by restricting R to a rank-one matrix, i.e., rank(R) = 1 and R ∈ S_+. With this constraint, the R-update can be modified to

    R_+ = P_{S_+ ∩ R_1} ( V̂^T (Y + Z/β) V̂ ),                                                 (3.9)

where R_1 = { R : rank(R) = 1 } denotes the set of rank-one matrices. For a symmetric matrix W with largest eigenvalue λ > 0 and corresponding unit eigenvector w, we have P_{S_+ ∩ R_1}(W) = λ w w^T.
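The rounding step of (3.8), combined with the spectral extraction of X^out, can be sketched as follows (our illustration). The Hungarian-method routine from SciPy maximizes ⟨X^out, X⟩ over permutation matrices, which is equivalent to the linear program (3.8) because its extreme points are permutations:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def round_to_permutation(Y_out, n):
        """Extract X_out from the leading eigenpair of Y_out and solve max <X_out, X>
        over permutation matrices (equivalent to the LP (3.8))."""
        d, U = np.linalg.eigh((Y_out + Y_out.T) / 2)
        lam, v = d[-1], U[:, -1]                          # largest eigenpair
        first_col = lam * v[0] * v                        # first column of lam * v v^T
        X_out = first_col[1:].reshape(n, n, order='F')    # drop the 0-th entry, de-vectorize columnwise
        row, col = linear_sum_assignment(-X_out)          # maximize <X_out, X>
        X = np.zeros((n, n))
        X[row, col] = 1.0
        return X

    def qap_upper_bound(A, B, C, X):
        """Objective value (1.1) at the rounded permutation, i.e., an upper bound."""
        return np.trace((A @ X @ B - 2 * C) @ X.T)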
3.4 Different choices for V, V̂

The matrix V̂ is essential in the steps of the algorithm, see e.g., (3.4). A sparse V̂ helps in the projection if one is using a sparse eigenvalue code. We have compared several choices. One is based on applying a QR algorithm to the original simple V from the definition of V̂ in (2.8). The other two are based on the approach in [10], and we present the most successful here. The orthogonal V we use is a block matrix built from t blocks formed from Kronecker products of identity matrices with ±1 vectors (the first being I_{⌊n/2⌋} ⊗ (1/√2)[1; −1], padded with zero rows to length n), together with one final block V̂_0 chosen to complete the appropriate size, so that V^T V = I_{n−1} and V^T e = 0. We take advantage of the 0, ±1 structure of the Kronecker blocks and delay the scaling for the normalization till the end.

The main work in the low-rank projection part of the algorithm is to evaluate one (or a few) eigenvalues of W = V̂^T (Y + (1/β) Z) V̂ to obtain the update R_+. Write

    Y + (1/β) Z = [ ρ, w^T ; w, W̄ ].

We let K := V ⊗ V, α = 1/√2, v = (1/(√2 n)) e, and x = (x_0; x̄). The structure of the normalized V̂, namely V̂ = [ α, 0 ; v, K ], means that we can evaluate the product W x as

    [ α, 0 ; v, K ]^T [ ρ, w^T ; w, W̄ ] [ α, 0 ; v, K ] x
        = [ α, 0 ; v, K ]^T [ ρ, w^T ; w, W̄ ] ( α x_0 ; x_0 v + K x̄ )
        = [ α, 0 ; v, K ]^T ( ρ α x_0 + w^T (x_0 v + K x̄) ; α x_0 w + W̄ (x_0 v + K x̄) )
        = ( ρ α^2 x_0 + ( α w^T + v^T W̄ ) (x_0 v + K x̄) + v^T (α x_0 w) ;  K^T ( α x_0 w + W̄ (x_0 v + K x̄) ) ).

We emphasize that V ⊗ V = (V̄ ⊗ V̄) / (D ⊗ D), where V̄ denotes the unscaled V, D is the diagonal matrix of scale factors that yields the orthogonality in V, and / denotes the MATLAB division on the right, i.e., multiplication by the inverse on the right. Therefore, we can evaluate

    K^T W̄ K = (V ⊗ V)^T W̄ (V ⊗ V) = (D ⊗ D) \ [ (V̄ ⊗ V̄)^T W̄ (V̄ ⊗ V̄) ] / (D ⊗ D).
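Under the assumption that only V (of size n × (n−1), orthonormal columns) is stored, the matrix-vector product with W above can be evaluated without ever forming V ⊗ V, using (V ⊗ V) x̄ = vec(V X̄ V^T). The sketch below (ours, with α and v as in the reconstruction above) implements the displayed product:

    import numpy as np

    def W_matvec(x, rho, w, Wbar, V):
        """Evaluate W x with W = Vhat^T [[rho, w^T],[w, Wbar]] Vhat, where
        Vhat = [[alpha, 0],[v, V (x) V]], alpha = 1/sqrt(2), v = (1/(sqrt(2) n)) e,
        without forming the Kronecker product V (x) V."""
        n, m = V.shape                                     # m = n - 1
        alpha = 1.0 / np.sqrt(2.0)
        v = np.full(n * n, 1.0 / (np.sqrt(2.0) * n))
        x0, xbar = x[0], x[1:]
        Xbar = xbar.reshape(m, m, order='F')
        Kx = (V @ Xbar @ V.T).flatten(order='F')           # (V (x) V) xbar = vec(V Xbar V^T)
        p = x0 * v + Kx                                    # second block of Vhat x
        q = alpha * x0 * w + Wbar @ p                      # second block of the middle product
        top = rho * alpha**2 * x0 + alpha * (w @ p) + v @ q
        Q = q.reshape(n, n, order='F')
        bottom = (V.T @ Q @ V).flatten(order='F')          # K^T q = vec(V^T Q V)
        return np.concatenate(([top], bottom))

This matvec can be wrapped in a scipy.sparse.linalg.LinearOperator and passed to an iterative eigensolver such as eigsh to obtain the one or two leading eigenpairs needed for the low-rank update (3.9).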
Numerical experiments

We illustrate our results in Table 1 on the forty-five QAP instances I and II, see [5, 6, 9]. The optimal solutions are in column 1, and the current best known lower bounds from [9] are in column 2, marked Bundle. The p-d i-p lower bound is given in the column marked HKM-FR. (The code failed to find a lower bound on several problems, marked −1111 in the table.) This bound is obtained by applying the p-d i-p approach (HKM direction) to the facially reduced SDP relaxation and exploiting the low rank (one and two) of the constraints. We used SDPT3 [11].
Our ADMM lower bound follows in column 4. We see that it is at least as good as the current best known bounds in every instance. The percent improvement is given in column 7. We then present the best upper bounds from our heuristics in column 5. This allows us to calculate the percentage gap in column 6. The CPU seconds are then given in the last columns, 8–11. We set γ = 1.618 and β = n in ADMM. We used two different tolerances, 1e−5 and 1e−12. Solving the SDP to the higher accuracy did not improve the bounds. However, it is interesting that the ADMM approach was able to solve the SDP relaxations to such high accuracy, something the p-d i-p approach has great difficulty with. We provide the CPU times for both accuracies. Our times are significantly lower than those reported in [4, 9], e.g., from 10 hours to less than an hour.
We emphasize that we have improved the bounds for all the instances and have provably found exact solutions for six of the instances: Had12, 14, 16, 18, Rou12, Tai12a. This is due to the ability to add all the nonnegativity constraints and to round entries to 0, 1 within the ADMM algorithm. We do not include the times for the p-d i-p approach, as they were much greater than for the ADMM approach, e.g., hours instead of minutes and a day instead of an hour.

Instance   1.opt     2.Bundle  3.HKM-FR  4.ADMM    5.UpBnd   6.%gap  7.%Impr  8.cpu(T5,HR)  9.cpu(T5,LR)  10.ratio(T12/5)  11.ratio(HKM)
Esc16a     68        59        50        64        72        11.76   7.35     2.30e+01      4.02          4.14             9.37
Esc16b     292       288       276       290       300       3.42    0.68     3.87e+00      4.55          2.15             8.08
Esc16c     160       142       132       154       188       21.25   7.50     1.09e+01      8.09          4.53             4.88
Esc16d     16        8         -12       13        18        31.25   31.25    2.14e+01      3.69          4.87             10.22
Esc16e     28        23        13        27        32        17.86   14.29    3.02e+01      4.29          4.80             8.79
Esc16g     26        20        11        25        28        11.54   19.23    4.24e+01      4.27          2.72             8.63
Esc16h     996       970       909       977       996       1.91    0.70     4.91e+00      3.53          2.33             10.60
Esc16i     14        9         -21       12        14        14.29   21.43    1.37e+02      4.30          2.39             8.76
Esc16j     8         7         -4        8         14        75.00   12.50    8.95e+01      4.80          3.83             7.93
Had12      1652      1643      1641      1652      1652      0.00    0.54     1.02e+01      1.08          1.06             5.91
Had14      2724      2715      2709      2724      2724      0.00    0.33     3.23e+01      1.69          1.19             10.46
Had16      3720      3699      3678      3720      3720      0.00    0.56     1.75e+02      3.15          1.04             12.51
Had18      5358      5317      5287      5358      5358      0.00    0.77     4.49e+02      6.00          2.22             13.28
Had20      6922      6885      6848      6922      6930      0.12    0.53     3.85e+02      12.15         4.20             14.53
Kra30a     149936    136059    -1111     143576    169708    17.43   5.01     5.88e+03      149.32        2.22             1111.11
Kra30b     91420     81156     -1111     87858     105740    19.56   7.33     4.36e+03      170.57        3.01             1111.11
Kra32      88700     79659     -1111     85775     103790    20.31   6.90     3.57e+03      200.26        4.28             1111.11
Nug12      578       557       530       568       632       11.07   1.90     2.60e+01      1.04          6.61             5.93
Nug14      1014      992       960       1011      1022      1.08    1.87     7.15e+01      1.87          5.06             8.43
Nug15      1150      1122      1071      1141      1306      14.35   1.65     9.10e+01      3.31          5.90             7.79
Nug16a     1610      1570      1528      1600      1610      0.62    1.86     1.81e+02      3.06          3.28             12.24
Nug16b     1240      1188      1139      1219      1356      11.05   2.50     9.35e+01      3.19          6.23             11.83
Nug17      1732      1669      1622      1708      1756      2.77    2.25     2.31e+02      4.34          3.63             13.13
Nug18      1930      1852      1802      1894      2160      13.78   2.18     4.16e+02      5.47          2.43             15.23
Nug20      2570      2451      2386      2507      2784      10.78   2.18     4.76e+02      11.56         3.75             14.35
Nug21      2438      2323      2386      2382      2706      13.29   2.42     1.41e+03      15.32         1.68             14.95
Nug22      3596      3440      3396      3529      3940      11.43   2.47     2.07e+03      21.82         1.39             13.90
Nug24      3488      3310      -1111     3402      3794      11.24   2.64     1.20e+03      29.64         3.29             1111.11
Nug25      3744      3535      -1111     3626      4060      11.59   2.43     3.12e+03      39.23         1.65             1111.11
Nug27      5234      4965      -1111     5130      5822      13.22   3.15     5.11e+03      78.18         1.58             1111.11
Nug28      5166      4901      -1111     5026      5730      13.63   2.42     4.11e+03      83.38         2.17             1111.11
Nug30      6124      5803      -1111     5950      6676      11.85   2.40     7.36e+03      133.38        1.76             1111.11
Rou12      235528    223680    221161    235528    235528    0.00    5.03     2.76e+01      0.93          0.98             6.90
Rou15      354210    333287    323235    350217    367782    4.96    4.78     3.12e+01      2.70          8.68             9.46
Rou20      725522    663833    642856    695181    765390    9.68    4.32     1.67e+02      10.31         10.90            16.08
Scr12      31410     29321     23973     31410     38806     23.55   6.65     4.40e+00      1.17          2.40             5.79
Scr15      51140     48836     42204     51140     58304     14.01   4.51     1.38e+01      2.41          1.84             10.75
Scr20      110030    94998     83302     106803    138474    28.78   10.73    1.53e+03      9.61          1.15             17.96
Tai12a     224416    222784    215637    224416    224416    0.00    0.73     1.79e+00      0.90          1.04             6.70
Tai15a     388214    364761    349586    377101    412760    9.19    3.18     2.74e+01      2.35          14.69            10.34
Tai17a     491812    451317    441294    476525    546366    14.20   5.13     6.50e+01      4.52          7.31             12.04
Tai20a     703482    637300    619092    671675    750450    11.20   4.89     1.28e+02      10.10         14.32            15.85
Tai25a     1167256   1041337   1096657   1096657   1271696   15.00   4.74     3.09e+02      38.48         5.58             1111.11
Tai30a     1818146   1652186   -1111     1706871   1942086   12.94   3.01     1.25e+03      142.55        10.51            1111.11
Tho30      88900     77647     -1111     86838     102760    17.91   10.34    2.83e+03      164.86        4.74             1111.11

Table 1: QAP Instances I and II. Requested tolerance 1e−5. Columns: 1. optimal value; 2. Bundle lower bound [9]; 3. HKM-FR lower bound; 4. ADMM lower bound; 5. feasible upper bound; 6. %gap; 7. %improvement of the ADMM bound over the Bundle bound; 8. cpusec, Tol5, HighRk; 9. cpusec, Tol5, LowRk; 10. cpuratio, Tol12/5, HighRk; 11. cpuratio, HKM, Tol9. (Entries −1111 and 1111.11 mark runs where the p-d i-p code failed.)

Conclusion

In this paper we have shown the efficiency of using the ADMM approach in solving the SDP relaxation of the QAP. In particular, we have shown that we can obtain high accuracy solutions of the SDP relaxation at significantly less cost than current approaches. In addition, the SDP relaxation includes the nonnegativity constraints at essentially no extra cost. This results in both a fast solution and improved lower and upper bounds for the QAP.

In a forthcoming study we propose to include this in a branch and bound framework and to implement it in a parallel programming approach, see e.g., [8]. In addition, we propose to test the possibility of using warm starts in the branching/bounding process, and to test it on the larger test sets such as those used in e.g., [7].
References

[1] K.M. Anstreicher. Recent advances in the solution of quadratic assignment problems. Math. Program., 97(1-2, Ser. B):27–42, 2003. ISMP, 2003 (Copenhagen).

[2] K.M. Anstreicher and N.W. Brixius. A new bound for the quadratic assignment problem based on convex quadratic programming. Math. Program., 89(3, Ser. A):341–357, 2001.

[3] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Machine Learning, 3(1):1–122, 2011.

[4] S. Burer and R.D.C. Monteiro. Local minima and convergence in low-rank semidefinite programming. Math. Program., 103(3, Ser. A):427–444, 2005.

[5] R.E. Burkard, S. Karisch, and F. Rendl. QAPLIB – a quadratic assignment problem library. European J. Oper. Res.

[6] R.E. Burkard, S.E. Karisch, and F. Rendl. QAPLIB – a quadratic assignment problem library. J. Global Optim., 10(4):391–403, 1997.

[7] E. de Klerk and R. Sotirov. Exploiting group symmetry in semidefinite programming relaxations of the quadratic assignment problem. Math. Program., 122(2, Ser. A):225–246, 2010.

[8] R. Jain and P. Yao. A parallel approximation algorithm for positive semidefinite programming. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS 2011), pages 463–471. IEEE Computer Soc., Los Alamitos, CA, 2011.

[9] F. Rendl and R. Sotirov. Bounds for the quadratic assignment problem using the bundle method. Math. Program., 109(2-3, Ser. B):505–524, 2007.

[10] H. Sun, N. Wang, T.K. Pong, and H. Wolkowicz. Eigenvalue, quadratic programming, and semidefinite programming bounds for a cut minimization problem. Comput. Optim. Appl., accepted Aug. 2015, to appear. 32 pages.

[11] K.C. Toh, M.J. Todd, and R.H. Tütüncü. SDPT3—a MATLAB software package for semidefinite programming, version 1.3. Optim. Methods Softw., 11/12(1-4):545–581, 1999. Interior point methods.

[12] Z. Wen, D. Goldfarb, and W. Yin. Alternating direction augmented Lagrangian methods for semidefinite programming. Math. Program. Comput., 2(3-4):203–230, 2010.

[13] Q. Zhao, S.E. Karisch, F. Rendl, and H. Wolkowicz. Semidefinite programming relaxations for the quadratic assignment problem.