Fast ADMM for Semidefinite Programs with Chordal Sparsity
Yang Zheng, Giovanni Fantuzzi, Antonis Papachristodoulou, Paul Goulart, Andrew Wynn
Abstract— Many problems in control theory can be formulated as semidefinite programs (SDPs). For large-scale SDPs, it is important to exploit the inherent sparsity to improve scalability. This paper develops efficient first-order methods to solve SDPs with chordal sparsity based on the alternating direction method of multipliers (ADMM). We show that chordal decomposition can be applied to either the primal or the dual standard form of a sparse SDP, resulting in ADMM algorithms that are scaled versions of each other and have the same computational cost. Each iteration of our algorithms consists of a projection onto the product of small positive semidefinite cones, followed by a projection onto an affine set, both of which can be carried out efficiently. Our techniques are implemented in CDCS, an open-source add-on to MATLAB. Numerical experiments on large-scale sparse problems in SDPLIB and on random SDPs with block-arrow sparsity patterns show speedups compared to some common state-of-the-art software packages.

† Y. Zheng and G. Fantuzzi contributed equally to this work. Y. Zheng is supported by the Clarendon Scholarship and the Jason Hu Scholarship. G. Fantuzzi was partially supported by the EPSRC grant EP/J010537/1.

Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, United Kingdom (e-mail: [email protected]; [email protected]; [email protected]).

Department of Aeronautics, Imperial College London, South Kensington Campus, London, SW7 2AZ, United Kingdom (e-mail: [email protected]; [email protected]).
I. INTRODUCTION
Semidefinite programs (SDPs) are a class of convex optimization problems over the cone of positive semidefinite (PSD) matrices. Given $b \in \mathbb{R}^m$, $C \in \mathbb{S}^n$, and matrices $A_1, \ldots, A_m \in \mathbb{S}^n$ that define the operators

$$\mathcal{A}(X) = \begin{bmatrix} \langle A_1, X \rangle & \cdots & \langle A_m, X \rangle \end{bmatrix}^T, \qquad \mathcal{A}^*(y) = \sum_{i=1}^m A_i y_i,$$

SDPs are typically written in the standard primal form

$$\min_{X} \ \langle C, X \rangle \quad \text{subject to} \quad \mathcal{A}(X) = b, \quad X \in \mathbb{S}^n_+, \tag{1}$$

or in the standard dual form

$$\max_{y,Z} \ \langle b, y \rangle \quad \text{subject to} \quad \mathcal{A}^*(y) + Z = C, \quad Z \in \mathbb{S}^n_+. \tag{2}$$

In the above and throughout this work, $\mathbb{R}^m$ is the $m$-dimensional Euclidean space, $\mathbb{S}^n$ is the space of $n \times n$ symmetric matrices, $\mathbb{S}^n_+$ is the cone of PSD matrices, and $\langle \cdot, \cdot \rangle$ denotes the inner product in the appropriate space.
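To fix ideas, the following minimal MATLAB sketch evaluates $\mathcal{A}(X)$ and $\mathcal{A}^*(y)$; it is our illustration, and the cell-array data layout and the names Amats, Aop, Aadj are assumptions, not notation from the paper.

```matlab
% A(X): vector of inner products <Ai, X>; A*(y): weighted sum of the Ai.
Amats = {eye(3), ones(3)};                 % toy data A1, A2 (assumed)
Aop   = @(X) cellfun(@(Ai) sum(sum(Ai .* X)), Amats)';         % A(X)
Aadj  = @(y) sum(cat(3, Amats{:}) .* reshape(y, 1, 1, []), 3); % A*(y)

X = eye(3); y = [1; 2];
disp(Aop(X))     % [<A1,X>; <A2,X>] = [3; 3]
disp(Aadj(y))    % y1*A1 + y2*A2
```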
SDPs have applications in control theory, machine learning, combinatorics, and operations research [1]. Moreover, linear, quadratic, and second-order-cone programs are particular instances of SDPs [2]. Small and medium-sized SDPs can be solved in polynomial time [3] using efficient second-order interior-point methods (IPMs) [4], [5]. However, many real-life problems are too large for state-of-the-art interior-point algorithms, due to memory and CPU time constraints. One approach is to abandon IPMs in favour of faster first-order methods (FOMs) with modest accuracy. For instance, Wen et al. proposed an alternating-direction augmented Lagrangian method for large-scale SDPs in the dual standard form [6]. More recently, O'Donoghue et al. developed a first-order operator-splitting method to solve the homogeneous self-dual embedding (HSDE) of a primal-dual pair of conic programs, which has the advantage of being able to provide primal or dual certificates of infeasibility [7]. A second approach relies on the fact that large-scale SDPs are often structured and/or sparse [1]. Exploiting sparsity in SDPs is an active and challenging area of research [8], one main difficulty being that the optimal solution is typically dense despite the sparsity of the problem data. If, however, the sparsity pattern of the data is chordal or has sparse chordal extensions, Grone's and Agler's theorems [9], [10] allow replacing the PSD constraint with a set of smaller semidefinite constraints, plus an additional set of equality constraints. In some cases, the converted SDP can then be solved more efficiently than the original problem. These ideas underlie the domain- and range-space conversion techniques [11], [12] implemented in SparseCoLO [13].

However, adding equality constraints often offsets the benefit of working with smaller PSD cones. One possible solution is to exploit chordal sparsity directly in the IPMs: Fukuda et al. used Grone's theorem [9] to develop a primal-dual path-following method for SDPs [11]; Burer proposed a nonsymmetric primal-dual IPM using Cholesky factors of the dual variable and the maximum-determinant completion of the primal variable [14]; and Andersen et al. developed fast recursive algorithms for SDPs with chordal sparsity [15]. Alternatively, one can solve the decomposed SDP with FOMs: Sun et al. proposed a first-order splitting method for decomposable conic programs [16]; Kalbat and Lavaei applied the alternating direction method of multipliers (ADMM) to SDPs with fully decomposable constraints [17]; and Madani et al. developed a highly parallelizable ADMM algorithm for sparse SDPs with inequality constraints, with applications to optimal power flow [18].

In this work, we adopt the strategy of exploiting sparsity using first-order algorithms in the spirit of [16]–[18], and develop efficient ADMM algorithms to solve large-scale sparse SDPs. Our contributions are:

1) We combine ADMM and chordal decomposition to solve sparse SDPs in primal or dual standard form. The resulting algorithms are scaled versions of each other. This gives a conversion framework for the application of FOMs, analogous to that of [11], [12] for IPMs.

2) In each iteration, the PSD constraint is enforced via parallel projections onto small PSD cones. The affine constraints are imposed by a quadratic program with equality constraints, whose KKT system matrix can be factorized before iterating the ADMM algorithm since it only depends on the problem data.

3) We implement our methods in the open-source MATLAB solver CDCS (Cone Decomposition Conic Solver) [19]. Numerical simulations on random SDPs with block-arrow sparsity patterns and on four large-scale sparse problems in SDPLIB [20] demonstrate the efficiency of our algorithms compared to other solvers.

The rest of this paper is organized as follows. Section II reviews chordal sparsity and decomposition techniques. We show how to apply ADMM to primal and dual standard-form SDPs in Sections III and IV, respectively, and report our numerical experiments in Section V. Finally, Section VI offers concluding remarks.

II. PRELIMINARIES: CHORDAL DECOMPOSITION AND THE ADMM ALGORITHM
A. Chordal graphs
Let $\mathcal{G}(\mathcal{V}, \mathcal{E})$ be an undirected graph with vertices $\mathcal{V} = \{1, 2, \ldots, n\}$ and edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. A clique $\mathcal{C} \subseteq \mathcal{V}$ is a subset of vertices such that $(i, j) \in \mathcal{E}$ for any distinct vertices $i, j \in \mathcal{C}$, and the number of vertices in $\mathcal{C}$ is denoted by $|\mathcal{C}|$. If $\mathcal{C}$ is not a subset of any other clique, then it is referred to as a maximal clique. A cycle of length $k$ in $\mathcal{G}$ is a set of pairwise distinct vertices $\{v_1, v_2, \ldots, v_k\} \subset \mathcal{V}$ such that $(v_k, v_1) \in \mathcal{E}$ and $(v_i, v_{i+1}) \in \mathcal{E}$ for $i = 1, \ldots, k-1$. A chord is an edge joining two non-adjacent vertices in a cycle.

Definition 1 (Chordal graph):
An undirected graph is chordal if all cycles of length four or higher have a chord. For example, the cycle on four vertices is not chordal, but adding either of its two diagonals as an edge makes it so. Note that if $\mathcal{G}(\mathcal{V}, \mathcal{E})$ is not chordal, it can be chordal extended, i.e., we can construct a chordal graph $\mathcal{G}'(\mathcal{V}, \mathcal{E}')$ by adding edges to $\mathcal{E}$. Finding the chordal extension with the minimum number of additional edges is an NP-complete problem [21], but good chordal extensions can be computed efficiently using several heuristics [22], as illustrated below.
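One standard heuristic (see [22]) computes a fill-reducing ordering and reads a chordal extension off the symbolic Cholesky factorization. The sketch below is our illustration, under the assumption that the pattern is stored as a sparse symmetric MATLAB matrix S with ones on its diagonal:

```matlab
% Sketch: chordal extension via symbolic Cholesky factorization.
% S: sparse symmetric 0/1 pattern with self-loops on the diagonal.
p = amd(S);                          % approximate minimum degree ordering
[~, ~, ~, ~, R] = symbfact(S(p, p)); % 0/1 pattern of the Cholesky factor
Ext = spones(R + R');                % chordal extension (in permuted order)
fillin = nnz(Ext) - nnz(S);          % number of edges added
```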
B. Sparse matrices defined by graphs

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be an undirected graph such that $(i, i) \in \mathcal{E}$, i.e., each node has a self-loop. We say that $X$ is a sparse symmetric matrix defined by $\mathcal{G}$ if $X_{ij} = X_{ji} = 0$ whenever $(i, j) \notin \mathcal{E}$. The spaces of sparse and PSD sparse symmetric matrices defined by $\mathcal{G}$ are

$$\mathbb{S}^n(\mathcal{E}, 0) = \{X \in \mathbb{S}^n \mid X_{ij} = X_{ji} = 0 \text{ if } (i, j) \notin \mathcal{E}\},$$
$$\mathbb{S}^n_+(\mathcal{E}, 0) = \{X \in \mathbb{S}^n(\mathcal{E}, 0) \mid X \succeq 0\}.$$

Similarly, we say that $X$ is a partial symmetric matrix defined by $\mathcal{G}$ if $X_{ij} = X_{ji}$ are given when $(i, j) \in \mathcal{E}$, and arbitrary otherwise. Moreover, we say that $M$ is a PSD completion of the partial symmetric matrix $X$ if $M \succeq 0$ and $M_{ij} = X_{ij}$ when $(i, j) \in \mathcal{E}$. We can then define the spaces

$$\mathbb{S}^n(\mathcal{E}, ?) = \{X \in \mathbb{S}^n \mid X_{ij} = X_{ji} \text{ given if } (i, j) \in \mathcal{E}\},$$
$$\mathbb{S}^n_+(\mathcal{E}, ?) = \{X \in \mathbb{S}^n(\mathcal{E}, ?) \mid \exists M \succeq 0,\ M_{ij} = X_{ij}\ \forall (i, j) \in \mathcal{E}\}.$$

Finally, given a clique $\mathcal{C}_k$ of $\mathcal{G}$, $E_k \in \mathbb{R}^{|\mathcal{C}_k| \times n}$ is the matrix with $(E_k)_{ij} = 1$ if $\mathcal{C}_k(i) = j$ and zero otherwise, where $\mathcal{C}_k(i)$ is the $i$-th vertex in $\mathcal{C}_k$, sorted in the natural ordering. The submatrix of $X \in \mathbb{S}^n$ defined by $\mathcal{C}_k$ is $E_k X E_k^T \in \mathbb{S}^{|\mathcal{C}_k|}$.
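As a concrete illustration (our sketch, with a toy clique), $E_k$ can be assembled as a sparse matrix, and multiplication by it simply extracts the principal submatrix indexed by the clique:

```matlab
% Sketch: entry-selector matrix Ek for a clique given as an index vector.
n = 5; clique = [2 4 5];                       % toy clique C_k (assumed)
Ek = sparse(1:numel(clique), clique, 1, numel(clique), n);
X  = randn(n); X = X + X';                     % a symmetric test matrix
norm(full(Ek*X*Ek') - X(clique, clique))       % identical: returns 0
```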
C. Chordal decomposition of PSD matrices

The spaces $\mathbb{S}^n_+(\mathcal{E}, ?)$ and $\mathbb{S}^n_+(\mathcal{E}, 0)$ are a pair of dual convex cones for any undirected graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ [15], [22]. If $\mathcal{G}$ is chordal, $\mathbb{S}^n_+(\mathcal{E}, ?)$ and $\mathbb{S}^n_+(\mathcal{E}, 0)$ can be expressed in terms of several coupled smaller convex cones:

Theorem 1 (Grone's theorem [9]): Let $\{\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_p\}$ be the set of maximal cliques of a chordal graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. Then, $X \in \mathbb{S}^n_+(\mathcal{E}, ?)$ if and only if $X_k := E_k X E_k^T \in \mathbb{S}^{|\mathcal{C}_k|}_+$ for all $k = 1, \ldots, p$.

Theorem 2 (Agler's theorem [10]): Let $\{\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_p\}$ be the set of maximal cliques of a chordal graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. Then, $Z \in \mathbb{S}^n_+(\mathcal{E}, 0)$ if and only if there exist matrices $Z_k \in \mathbb{S}^{|\mathcal{C}_k|}_+$ for $k = 1, \ldots, p$ such that $Z = \sum_{k=1}^p E_k^T Z_k E_k$.

These results can be proven individually, but can also be derived from each other using the duality of the cones $\mathbb{S}^n_+(\mathcal{E}, ?)$ and $\mathbb{S}^n_+(\mathcal{E}, 0)$ [22].
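The clique-wise condition of Theorem 1 is straightforward to test numerically. The sketch below is our illustration, assuming the maximal cliques are stored as index vectors in a cell array:

```matlab
% Sketch: test the condition of Theorem 1 for a partial matrix X whose
% chordal pattern has maximal cliques listed in the cell array 'cliques'.
function ok = grone_check(X, cliques)
ok = true;
for k = 1:numel(cliques)
    Xk = X(cliques{k}, cliques{k});       % principal submatrix Ek*X*Ek'
    ok = ok && min(eig(Xk)) >= 0;         % PSD test on the small block
end
end
```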
D. ADMM algorithm

The ADMM algorithm solves the optimization problem

$$\min\ f(x) + g(y) \quad \text{subject to} \quad Ax + By = c,$$

where $f$ and $g$ are convex functions, $x \in \mathbb{R}^{n_x}$, $y \in \mathbb{R}^{n_y}$, $A \in \mathbb{R}^{n_c \times n_x}$, $B \in \mathbb{R}^{n_c \times n_y}$, and $c \in \mathbb{R}^{n_c}$. Given a penalty parameter $\rho > 0$ and a dual multiplier $z \in \mathbb{R}^{n_c}$, the ADMM algorithm minimizes the augmented Lagrangian

$$L_\rho(x, y, z) = f(x) + g(y) + \frac{\rho}{2} \left\| Ax + By - c + \frac{1}{\rho} z \right\|^2$$

with respect to the variables $x$ and $y$ separately, followed by a dual variable update:

$$x^{(n+1)} = \arg\min_x L_\rho(x, y^{(n)}, z^{(n)}), \tag{3a}$$
$$y^{(n+1)} = \arg\min_y L_\rho(x^{(n+1)}, y, z^{(n)}), \tag{3b}$$
$$z^{(n+1)} = z^{(n)} + \rho \left( Ax^{(n+1)} + By^{(n+1)} - c \right). \tag{3c}$$

The superscript $(n)$ indicates that a variable is fixed to its value at the $n$-th iteration. ADMM is particularly suitable when the minimizations with respect to each of the variables $x$ and $y$ in (3a) and (3b) can be carried out efficiently through closed-form expressions.
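As a toy illustration of (3) (our example, not from the paper), take $f(x) = \tfrac{1}{2}\|x-a\|^2$, $g(y) = \tfrac{1}{2}\|y-b\|^2$, $A = I$, $B = -I$, and $c = 0$, so the constraint is $x = y$, both subproblems have closed-form solutions, and the iterates converge to the average $(a+b)/2$:

```matlab
% Minimal ADMM sketch for min 0.5*||x-a||^2 + 0.5*||y-b||^2 s.t. x = y.
a = [1; 3]; b = [5; -1]; rho = 1;
x = zeros(2, 1); y = zeros(2, 1); z = zeros(2, 1);
for n = 1:100
    x = (a + rho*y - z) / (1 + rho);   % step (3a), closed form
    y = (b + rho*x + z) / (1 + rho);   % step (3b), closed form
    z = z + rho*(x - y);               % step (3c), dual ascent
end
disp([x, (a + b)/2])                   % x and y converge to (a+b)/2
```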
III. ADMM FOR SPARSE PRIMAL-FORM SDPS

A. Reformulation and decomposition of the PSD constraint
Let the primal-standard-form SDP (1) be sparse with an aggregate sparsity pattern described by the graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$, meaning that $(i, j) \in \mathcal{E}$ if and only if the entry $ij$ of at least one of the data matrices $C, A_1, \ldots, A_m$ is nonzero. We assume that $\mathcal{G}$ is chordal (otherwise it can be chordal extended) and that its maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_p$ are small. Then, only the entries of the matrix variable $X$ corresponding to the graph edges $\mathcal{E}$ appear in the cost and constraint functions, so the constraint $X \in \mathbb{S}^n_+$ can be replaced by $X \in \mathbb{S}^n_+(\mathcal{E}, ?)$. Using Theorem 1, we can then reformulate (1) as

$$\begin{aligned} \min_{X, X_1, \ldots, X_p} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \mathcal{A}(X) = b, \\ & X_k - E_k X E_k^T = 0, \quad k = 1, \ldots, p, \\ & X_k \in \mathbb{S}^{|\mathcal{C}_k|}_+, \quad k = 1, \ldots, p. \end{aligned} \tag{4}$$

In other words, we can decompose the original large semidefinite cone into multiple smaller cones, at the expense of introducing a set of consensus constraints between the variables.

To ease the exposition, we rewrite (4) in a vectorized form. Letting $\mathrm{vec}: \mathbb{S}^n \to \mathbb{R}^{n^2}$ be the usual operator mapping a matrix to the stack of its columns, define the vectorized data $c := \mathrm{vec}(C)$, $A := [\mathrm{vec}(A_1)\ \ldots\ \mathrm{vec}(A_m)]^T$, the vectorized variables $x := \mathrm{vec}(X)$, $x_k := \mathrm{vec}(X_k)$, $k = 1, \ldots, p$, and the matrices $H_k := E_k \otimes E_k$, such that

$$x_k = \mathrm{vec}(X_k) = \mathrm{vec}(E_k X E_k^T) = H_k x.$$

In other words, the matrices $H_1, \ldots, H_p$ are "entry-selector" matrices of 1's and 0's, whose rows are orthonormal, that project $x$ onto the subvectors $x_1, \ldots, x_p$, respectively. Denoting the constraint $X_k \in \mathbb{S}^{|\mathcal{C}_k|}_+$ by $x_k \in \mathcal{S}_k$, we can rewrite (4) as

$$\begin{aligned} \min_{x, x_1, \ldots, x_p} \quad & \langle c, x \rangle \\ \text{subject to} \quad & Ax = b, \\ & x_k = H_k x, \quad k = 1, \ldots, p, \\ & x_k \in \mathcal{S}_k, \quad k = 1, \ldots, p. \end{aligned} \tag{5}$$
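The identity $\mathrm{vec}(E_k X E_k^T) = (E_k \otimes E_k)\,\mathrm{vec}(X)$ follows from the standard rule $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$. A quick numerical check (our sketch, reusing the $E_k$ construction above):

```matlab
% Sketch: verify vec(Ek*X*Ek') == Hk*vec(X) on a random symmetric matrix.
n = 5; clique = [2 4 5];
Ek = sparse(1:3, clique, 1, 3, n);
Hk = kron(Ek, Ek);                         % H_k = E_k (x) E_k
X  = randn(n); X = X + X';
norm(Hk*X(:) - reshape(Ek*X*Ek', [], 1))   % zero up to round-off
```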
B. The ADMM algorithm for primal SDPs

Moving the constraints $Ax = b$ and $x_k \in \mathcal{S}_k$ in (5) to the objective using the indicator functions $\delta_0(\cdot)$ and $\delta_{\mathcal{S}_k}(\cdot)$ gives

$$\begin{aligned} \min_{x, x_1, \ldots, x_p} \quad & \langle c, x \rangle + \delta_0(Ax - b) + \sum_{k=1}^p \delta_{\mathcal{S}_k}(x_k) \\ \text{subject to} \quad & x_k = H_k x, \quad k = 1, \ldots, p. \end{aligned} \tag{6}$$

This problem is in the standard form for the application of ADMM. Given a penalty parameter $\rho > 0$ and a Lagrange multiplier $\lambda_k$ for each constraint $x_k = H_k x$, we define

$$L := \langle c, x \rangle + \delta_0(Ax - b) + \sum_{k=1}^p \left[ \delta_{\mathcal{S}_k}(x_k) + \frac{\rho}{2} \left\| x_k - H_k x + \frac{1}{\rho} \lambda_k \right\|^2 \right], \tag{7}$$

and group the variables as $\mathcal{X} := \{x\}$, $\mathcal{Y} := \{x_1, \ldots, x_p\}$, and $\mathcal{Z} := \{\lambda_1, \ldots, \lambda_p\}$. As in (3), in each ADMM iteration we minimize (7) with respect to $\mathcal{X}$ and $\mathcal{Y}$, then update $\mathcal{Z}$.

Algorithm 1
ADMM for decomposed primal-form SDPs

Input: $\rho > 0$, $\epsilon_{\mathrm{tol}} > 0$, initial guesses $x^{(0)}$, $x_1^{(0)}, \ldots, x_p^{(0)}$, $\lambda_1^{(0)}, \ldots, \lambda_p^{(0)}$
Setup: chordal decomposition, KKT factorization.
while $\max(\epsilon_{\mathrm{p}}, \epsilon_{\mathrm{d}}) \geq \epsilon_{\mathrm{tol}}$ do
    Compute $x^{(n+1)}$ with (9).
    for $k = 1, \ldots, p$: compute $x_k^{(n+1)}$ with (11).
    for $k = 1, \ldots, p$: compute $\lambda_k^{(n+1)}$ with (12).
    Update the residuals $\epsilon_{\mathrm{p}}$, $\epsilon_{\mathrm{d}}$.
end while

1) Minimization over $\mathcal{X}$: Minimizing (7) over $\mathcal{X}$ is equivalent to the equality-constrained quadratic program

$$\begin{aligned} \min_x \quad & \langle c, x \rangle + \frac{\rho}{2} \sum_{k=1}^p \left\| x_k^{(n)} - H_k x + \frac{1}{\rho} \lambda_k^{(n)} \right\|^2 \\ \text{subject to} \quad & Ax = b. \end{aligned} \tag{8}$$

Define $D := \sum_{k=1}^p H_k^T H_k$ and let $\rho y$ be the multiplier for the equality constraint. We can write the optimality conditions for (8) as the KKT system

$$\begin{bmatrix} D & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \sum_{k=1}^p H_k^T \left( x_k^{(n)} + \rho^{-1} \lambda_k^{(n)} \right) - \rho^{-1} c \\ b \end{bmatrix}. \tag{9}$$

Note that $D$ is a diagonal matrix, because the rows of each matrix $H_k$ are orthonormal, so (9) can be solved efficiently, e.g., by block elimination. Moreover, the coefficient matrix is the same at every iteration, so its factorization can be pre-computed and cached before starting the ADMM iterations.

2) Minimization over $\mathcal{Y}$: Minimizing (7) over $\mathcal{Y}$ is equivalent to the $p$ independent problems

$$\min_{x_k}\ \left\| x_k - H_k x^{(n+1)} + \rho^{-1} \lambda_k^{(n)} \right\|^2 \quad \text{subject to} \quad x_k \in \mathcal{S}_k. \tag{10}$$

In terms of the original matrix variables $X_1, \ldots, X_p$, this amounts to a projection onto the PSD cone. More precisely, if $\mathbb{P}_k$ denotes the projection onto $\mathbb{S}^{|\mathcal{C}_k|}_+$, we have

$$x_k^{(n+1)} = \mathrm{vec} \left\{ \mathbb{P}_k \left[ \mathrm{vec}^{-1} \left( H_k x^{(n+1)} - \rho^{-1} \lambda_k^{(n)} \right) \right] \right\}. \tag{11}$$

Since computing $\mathbb{P}_k$ amounts to an eigenvalue decomposition and each cone $\mathbb{S}^{|\mathcal{C}_k|}_+$ is small by assumption, we can compute $x_1^{(n+1)}, \ldots, x_p^{(n+1)}$ efficiently and in parallel.

3) Updating the multipliers $\mathcal{Z}$: Each multiplier $\lambda_k$, $k = 1, \ldots, p$, is updated with the usual gradient ascent rule

$$\lambda_k^{(n+1)} = \lambda_k^{(n)} + \rho \left( x_k^{(n+1)} - H_k x^{(n+1)} \right). \tag{12}$$

This computation is cheap and can be parallelized.

The ADMM algorithm is stopped after the $n$-th iteration if the relative primal/dual error measures $\epsilon_{\mathrm{p}}$ and $\epsilon_{\mathrm{d}}$ are smaller than a specified tolerance $\epsilon_{\mathrm{tol}}$; see [23] for more details on stopping conditions for a generic ADMM algorithm. Algorithm 1 summarizes the steps to solve a decomposable SDP in standard primal form (5).
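Because $D$ is diagonal, (9) reduces to a solve with the Schur complement $A D^{-1} A^T$, whose Cholesky factor can be cached once. A minimal sketch (our illustration, with generic right-hand sides r1, r2, and assuming the $H_k$ are stacked vertically in H so that every component of $x$ is covered by at least one clique, i.e., the diagonal d is positive):

```matlab
% Sketch: block elimination for [D A'; A 0][x; y] = [r1; r2], D diagonal.
d = sum(H.^2, 1)';            % diag of D = sum_k Hk'*Hk (assumed d > 0)
% -- setup, done once before the ADMM iterations --
S = A * (A' ./ d);            % Schur complement A*inv(D)*A' (m x m)
R = chol((S + S')/2);         % cached Cholesky factor (symmetrized)
% -- per-iteration solve --
y = R \ (R' \ (A * (r1 ./ d) - r2));
x = (r1 - A' * y) ./ d;
```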
IV. ADMM FOR SPARSE DUAL-FORM SDPS

A. Reformulation and decomposition of the PSD constraint
Similar to Section III, suppose the aggregate sparsity pattern of an SDP in standard dual form (2) is described by the chordal graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. The equality constraint in (2) implies that the PSD variable $Z$ has the same sparsity pattern as the aggregate sparsity pattern of the problem data, i.e., $Z \in \mathbb{S}^n_+(\mathcal{E}, 0)$, so using Theorem 2 we can rewrite (2) as

$$\begin{aligned} \min_{y, Z_1, \ldots, Z_p} \quad & -\langle b, y \rangle \\ \text{subject to} \quad & \mathcal{A}^*(y) + \sum_{k=1}^p E_k^T Z_k E_k = C, \\ & Z_k \in \mathbb{S}^{|\mathcal{C}_k|}_+, \quad k = 1, \ldots, p. \end{aligned} \tag{13}$$

While the original PSD constraint has been replaced by multiple smaller PSD constraints, it is not convenient to apply ADMM to this problem form because the PSD variables $Z_1, \ldots, Z_p$ in the equality constraint are weighted by the matrices $E_k$. Instead, we replace $Z_1, \ldots, Z_p$ in the equality constraint with slack variables $V_1, \ldots, V_p$ such that $Z_k = V_k$, $k = 1, \ldots, p$. Defining $z_k := \mathrm{vec}(Z_k)$ and $v_k := \mathrm{vec}(V_k)$ for all $k = 1, \ldots, p$, and using the same vectorized notation as in Section III, we then reformulate (13) in the vectorized form

$$\begin{aligned} \min_{y, z_1, \ldots, z_p, v_1, \ldots, v_p} \quad & -\langle b, y \rangle \\ \text{subject to} \quad & A^T y + \sum_{k=1}^p H_k^T v_k = c, \\ & z_k - v_k = 0, \quad k = 1, \ldots, p, \\ & z_k \in \mathcal{S}_k, \quad k = 1, \ldots, p. \end{aligned} \tag{14}$$
Remark 1: Although we have derived (14) by applying Theorem 2, (14) is exactly the dual of the decomposed primal SDP (5). Consequently, our analysis provides a decomposition framework for the application of FOMs analogous to that of [11], [12] for IPMs. This elegant picture, in which the decomposed SDPs inherit the duality of the original ones by virtue of the duality between Grone's and Agler's theorems, is shown in Fig. 1.
B. The ADMM algorithm for dual SDPs
Using indicator functions to move all but the equality constraints $z_k = v_k$, $k = 1, \ldots, p$, to the objective gives

$$\begin{aligned} \min \quad & -\langle b, y \rangle + \delta_0 \left( c - A^T y - \sum_{k=1}^p H_k^T v_k \right) + \sum_{k=1}^p \delta_{\mathcal{S}_k}(z_k) \\ \text{subject to} \quad & z_k = v_k, \quad k = 1, \ldots, p. \end{aligned} \tag{15}$$

Given a penalty parameter $\rho > 0$ and a Lagrange multiplier $\lambda_k$ for each constraint $z_k = v_k$, $k = 1, \ldots, p$, we define

$$L := -\langle b, y \rangle + \delta_0 \left( c - A^T y - \sum_{k=1}^p H_k^T v_k \right) + \sum_{k=1}^p \left[ \delta_{\mathcal{S}_k}(z_k) + \frac{\rho}{2} \left\| z_k - v_k + \frac{1}{\rho} \lambda_k \right\|^2 \right], \tag{16}$$

and group the variables as $\mathcal{X} := \{y, v_1, \ldots, v_p\}$, $\mathcal{Y} := \{z_1, \ldots, z_p\}$, and $\mathcal{Z} := \{\lambda_1, \ldots, \lambda_p\}$.
Fig. 1. Duality relationships between the primal and dual SDPs (1) and (2), and the decomposed primal and dual SDPs (5) and (14).

1) Minimization over $\mathcal{X}$: Minimizing (16) over the block $\mathcal{X}$ is equivalent to the equality-constrained quadratic program

$$\begin{aligned} \min_{y, v_1, \ldots, v_p} \quad & -\langle b, y \rangle + \frac{\rho}{2} \sum_{k=1}^p \left\| z_k^{(n)} - v_k + \frac{1}{\rho} \lambda_k^{(n)} \right\|^2 \\ \text{subject to} \quad & c - A^T y - \sum_{k=1}^p H_k^T v_k = 0. \end{aligned} \tag{17}$$

Let $\rho x$ be the multiplier for the equality constraint. After some algebra, the optimality conditions for (17) can be written as the KKT system

$$\begin{bmatrix} D & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c - \sum_{k=1}^p H_k^T \left( z_k^{(n)} + \rho^{-1} \lambda_k^{(n)} \right) \\ -\rho^{-1} b \end{bmatrix}, \tag{18}$$

plus a set of $p$ uncoupled equations for the variables $v_k$,

$$v_k = z_k^{(n)} + \frac{1}{\rho} \lambda_k^{(n)} + H_k x, \quad k = 1, \ldots, p. \tag{19}$$

The KKT system (18) is the same as (9) after rescaling $x \mapsto -x$, $y \mapsto -y$, $c \mapsto \rho^{-1} c$, and $b \mapsto \rho b$, and as in Section III-B.1 the factors of the coefficient matrix required to solve (18) can be pre-computed and cached. Consequently, updating $\mathcal{X}$ has the same cost as in Section III-B.1, plus the cost of (19), which is cheap and can also be parallelized.

2) Minimization over $\mathcal{Y}$: As in Section III-B.2, $z_1, \ldots, z_p$ are updated with $p$ independent and efficient projections,

$$z_k^{(n+1)} = \mathrm{vec} \left\{ \mathbb{P}_k \left[ \mathrm{vec}^{-1} \left( v_k^{(n+1)} - \rho^{-1} \lambda_k^{(n)} \right) \right] \right\}. \tag{20}$$

3) Updating the multipliers $\mathcal{Z}$: The multipliers $\lambda_k$, $k = 1, \ldots, p$, are updated with the usual gradient ascent rule

$$\lambda_k^{(n+1)} = \lambda_k^{(n)} + \rho \left( z_k^{(n+1)} - v_k^{(n+1)} \right). \tag{21}$$

As in Section III-B, we stop the ADMM algorithm when the relative primal/dual error measures $\epsilon_{\mathrm{p}}$ and $\epsilon_{\mathrm{d}}$ are smaller than a specified tolerance $\epsilon_{\mathrm{tol}}$. Algorithm 2 summarizes the full ADMM algorithm for sparse dual-standard-form SDPs.
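Both (11) and (20) rely on the conic projection $\mathbb{P}_k$, which clips the negative eigenvalues of a small symmetric matrix. A minimal sketch (our illustration):

```matlab
% Sketch: projection of a symmetric matrix onto the PSD cone.
function Xp = proj_psd(X)
[V, D] = eig((X + X')/2);        % symmetrize to guard against round-off
Xp = V * max(D, 0) * V';         % zero out the negative eigenvalues
Xp = (Xp + Xp')/2;               % re-symmetrize the result
end
```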
Remark 2: The computational cost of (19) is the same as that of (12), so the ADMM iterations for the decomposed dual-standard-form SDP (14) have the same cost as those for the decomposed primal-standard-form SDP (5), plus the cost of (21). However, if one minimizes (16) over $\mathcal{Y}$ before minimizing it over $\mathcal{X}$, substituting (19) into (21) gives

$$\lambda_k^{(n+1)} = \rho H_k x^{(n+1)}, \quad k = 1, \ldots, p. \tag{22}$$

Since $H_1 x, \ldots, H_p x$ have already been computed to update $v_1, \ldots, v_p$, updating $\lambda_1, \ldots, \lambda_p$ requires only a scaling operation. Consequently, the ADMM algorithms for the primal- and dual-standard-form SDPs can be considered as scaled versions of each other, with the same leading-order computational cost at each iteration.

Algorithm 2 ADMM for decomposed dual-form SDPs

Input: $\rho > 0$, $\epsilon_{\mathrm{tol}} > 0$, initial guesses $y^{(0)}$, $z_1^{(0)}, \ldots, z_p^{(0)}$, $\lambda_1^{(0)}, \ldots, \lambda_p^{(0)}$
Setup: chordal decomposition, KKT factorization.
while $\max(\epsilon_{\mathrm{p}}, \epsilon_{\mathrm{d}}) \geq \epsilon_{\mathrm{tol}}$ do
    for $k = 1, \ldots, p$: compute $z_k^{(n+1)}$ with (20).
    Compute $y^{(n+1)}$, $x$ with (17).
    for $k = 1, \ldots, p$: compute $v_k^{(n+1)}$ with (19) and $\lambda_k^{(n+1)}$ with (22) (no cost).
    Update the residuals $\epsilon_{\mathrm{p}}$ and $\epsilon_{\mathrm{d}}$.
end while

V. NUMERICAL SIMULATIONS
We have implemented our techniques in CDCS (Cone Decomposition Conic Solver) [19], an open-source MATLAB solver. CDCS supports Cartesian products of the following cones: $\mathbb{R}^n$, the non-negative orthant, second-order cones, and the PSD cone. Currently, only chordal decomposition techniques for semidefinite cones are implemented, while the other cone types are not decomposed. Our codes can be downloaded from https://github.com/OxfordControl/CDCS.

We tested CDCS on four sparse large-scale problems in SDPLIB [20], as well as on randomly generated SDPs with the block-arrow sparsity pattern used as a benchmark in [16]. The performance is compared to that of the IPM solver SeDuMi [24] and of the first-order solver SCS [25], on both the full SDPs (without decomposition) and the SDPs decomposed by SparseCoLO [13]. The comparison has two purposes: 1) SeDuMi computes accurate optimal points, which can be used to assess the quality of the solutions computed by CDCS; 2) SCS is a high-performance first-order solver for general conic programs, so we can assess the advantages of chordal decomposition. SeDuMi should not be compared to the other solvers on CPU time, because the latter only aim to achieve moderate accuracy. In the experiments reported below, CDCS and SCS were run with the same termination tolerance $\epsilon_{\mathrm{tol}}$ and a maximum of 2000 iterations. All experiments were carried out on a PC with an Intel(R) Core(TM) i7 CPU, 2.8 GHz processor, and 8 GB of RAM.
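For reference, a hypothetical calling sequence for CDCS is sketched below. CDCS mimics a SeDuMi-style interface, but the option names used here (solver, maxIter) are our assumptions for illustration; consult the solver's documentation for the actual calling convention.

```matlab
% Hypothetical usage sketch for CDCS (assumed SeDuMi-style interface).
% At, b, c describe the SDP data and K the cone, as in SeDuMi format.
opts.solver  = 'primal';   % assumed switch between Algorithms 1 and 2
opts.maxIter = 2000;       % assumed name for the iteration cap
[x, y, z, info] = cdcs(At, b, c, K, opts);
```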
A. SDPs with block-arrow pattern

SDPs with the block-arrow sparsity pattern shown in Fig. 2, which is chordal, are used as a benchmark in [16]. The SDP parameters are: the number of blocks, $l$; the block size, $d$; the size of the arrow head, $h$; and the number of constraints, $m$. Here, we consider the following cases (a construction sketch for the pattern is given after this list):

1) Fix $l = 40$, $d = 10$, $h = 20$; vary $m$.
2) Fix $m = 1000$, $d = 10$, $h = 20$; vary $l$.
3) Fix $l = 40$, $h = 10$, $m = 1000$; vary $d$.

Fig. 2. Block-arrow sparsity pattern: the number of blocks, $l$; block size, $d$; the size of the arrow head, $h$.
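The following sketch (our illustration; it builds only the aggregate sparsity pattern of Fig. 2, not the random problem data) assembles the block-arrow pattern for the parameters of case 1:

```matlab
% Sketch: aggregate block-arrow sparsity pattern (l blocks of size d,
% plus an arrow head of size h), as in Fig. 2.
l = 40; d = 10; h = 20; n = l*d + h;
S = sparse(n, n);
for i = 1:l
    idx = (i-1)*d + (1:d);
    S(idx, idx) = 1;              % i-th diagonal block
end
S(:, n-h+1:n) = 1;                % arrow head: last h columns...
S(n-h+1:n, :) = 1;                % ...and last h rows
spy(S)                            % visualize the chordal pattern
```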
TABLE I
PROBLEM STATISTICS FOR SDPLIB PROBLEMS

                             maxG32    maxG51    thetaG51    qpG51
  Affine constraints, m
  Matrix dimension, n
  Maximal cliques, p

The CPU times for the different solvers, averaged over five random problem instances, are shown in Fig. 3. CDCS is approximately 10 times faster than SeDuMi and the combination SparseCoLO+SeDuMi, with our Algorithm 2 being the fastest. Moreover, the optimal value from CDCS was always within 0.02% of the accurate value from SeDuMi.

Fig. 3. CPU time for SDPs with block-arrow patterns. Left to right: varying number of constraints; varying number of blocks; varying block size.
B. Sparse SDPs from SDPLIB
We consider two max-cut problems (maxG32 and maxG51), a Lovász theta problem (thetaG51), and a box-constrained quadratic problem (qpG51) from SDPLIB [20]. Table I reports the dimensions and chordal decomposition details of these large, sparse SDPs. Table II summarizes our numerical results; maxG51, thetaG51, and qpG51 could not be solved by the combination SparseCoLO+SeDuMi due to memory overflow. For all four problems, CDCS (primal and dual) is faster than SeDuMi and gives speedups compared to SCS and SparseCoLO+SCS. We remark that the objective value at which CDCS stops is within 2% of the optimal value returned by SeDuMi (which is highly accurate and can be considered exact) in all four cases, and within 0.08% for the max-cut problems maxG32 and maxG51, a negligible difference in applications.

TABLE II
RESULTS FOR THE PROBLEM INSTANCES IN SDPLIB

                            SeDuMi      SparseCoLO   SCS         SparseCoLO   CDCS       CDCS
                                        +SeDuMi                  +SCS         (primal)   (dual)
  maxG32    Total time (s)  974.6       355.2        2.553×10^…
            Iterations      14          15           2000        960          238        127
  maxG51    Total time (s)  134.5       –            87.9        1.201×10^…
            Iterations      16          –            540         2000         235        157
  thetaG51  Total time (s)  2.218×10^…  –            424.2       1.346×10^…
            Iterations      22          –            2000        2000         1287       1048
  qpG51     Total time (s)
            Iterations
VI. CONCLUSION
We proposed a conversion framework for SDPs characterized by chordal sparsity, suitable for the application of FOMs and analogous to the conversion techniques for IPMs of [11], [12]. We also developed efficient ADMM algorithms for sparse SDPs in primal or dual standard form, which are implemented in the open-source solver CDCS. Numerical experiments on SDPs with block-arrow sparsity patterns and on large sparse problems in SDPLIB show that our methods can provide speedups compared to both IPM solvers such as SeDuMi [24] (even when the chordal sparsity is exploited using SparseCoLO [13]) and the state-of-the-art first-order solver SCS [25]. In future work, it would be desirable to exploit chordal sparsity in a first-order HSDE formulation similar to that of [7], so as to be able to detect infeasibility.
REFERENCES

[1] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear Matrix Inequalities in System and Control Theory. SIAM, 1994.
[2] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[3] L. Vandenberghe and S. Boyd, "Semidefinite programming," SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.
[4] F. Alizadeh, J.-P. A. Haeberly, and M. L. Overton, "Primal-dual interior-point methods for semidefinite programming: convergence rates, stability and numerical results," SIAM J. Optim., vol. 8, no. 3, pp. 746–768, 1998.
[5] C. Helmberg, F. Rendl, R. J. Vanderbei, and H. Wolkowicz, "An interior-point method for semidefinite programming," SIAM J. Optim., vol. 6, no. 2, pp. 342–361, 1996.
[6] Z. Wen, D. Goldfarb, and W. Yin, "Alternating direction augmented Lagrangian methods for semidefinite programming," Math. Program. Comput., vol. 2, no. 3-4, pp. 203–230, 2010.
[7] B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd, "Conic optimization via operator splitting and homogeneous self-dual embedding," J. Optim. Theory Appl., vol. 169, no. 3, pp. 1042–1068, 2016.
[8] M. Andersen, J. Dahl, Z. Liu, and L. Vandenberghe, "Interior-point methods for large-scale cone programming," in Optimization for Machine Learning. MIT Press, 2011, pp. 55–83.
[9] R. Grone, C. R. Johnson, E. M. Sá, and H. Wolkowicz, "Positive definite completions of partial Hermitian matrices," Linear Algebra Appl., vol. 58, pp. 109–124, 1984.
[10] J. Agler, W. Helton, S. McCullough, and L. Rodman, "Positive semidefinite matrices with a given sparsity pattern," Linear Algebra Appl., vol. 107, pp. 101–149, 1988.
[11] M. Fukuda, M. Kojima, K. Murota, and K. Nakata, "Exploiting sparsity in semidefinite programming via matrix completion I: General framework," SIAM J. Optim., vol. 11, no. 3, pp. 647–674, 2001.
[12] S. Kim, M. Kojima, M. Mevissen, and M. Yamashita, "Exploiting sparsity in linear and nonlinear matrix inequalities via positive semidefinite matrix completion," Math. Program., vol. 129, no. 1, pp. 33–68, 2011.
[13] K. Fujisawa, S. Kim, M. Kojima, Y. Okamoto, and M. Yamashita, "User's manual for SparseCoLO: Conversion methods for sparse conic-form linear optimization problems," Research Report B-453, Tokyo Institute of Technology, Tokyo 152-8552, Japan, Tech. Rep., 2009.
[14] S. Burer, "Semidefinite programming in the space of partial positive semidefinite matrices," SIAM J. Optim., vol. 14, no. 1, pp. 139–172, 2003.
[15] M. S. Andersen, J. Dahl, and L. Vandenberghe, "Implementation of nonsymmetric interior-point methods for linear optimization over sparse matrix cones," Math. Program. Comput., vol. 2, no. 3-4, pp. 167–201, 2010.
[16] Y. Sun, M. S. Andersen, and L. Vandenberghe, "Decomposition in conic optimization with partially separable structure," SIAM J. Optim., vol. 24, no. 2, pp. 873–897, 2014.
[17] A. Kalbat and J. Lavaei, "A fast distributed algorithm for decomposable semidefinite programs," in Proc. 54th IEEE Conf. Decis. Control, Dec. 2015, pp. 1742–1749.
[18] R. Madani, A. Kalbat, and J. Lavaei, "ADMM for sparse semidefinite programming with applications to optimal power flow problem," in Proc. 54th IEEE Conf. Decis. Control, Dec. 2015, pp. 5932–5939.
[19] Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, "CDCS: Cone decomposition conic solver, version 1.0," https://github.com/OxfordControl/CDCS, Sep. 2016.
[20] B. Borchers, "SDPLIB 1.2, a library of semidefinite programming test problems," Optim. Methods Softw., vol. 11, no. 1-4, pp. 683–690, 1999.
[21] M. Yannakakis, "Computing the minimum fill-in is NP-complete," SIAM J. Algebraic Discrete Methods, vol. 2, pp. 77–79, 1981.
[22] L. Vandenberghe and M. S. Andersen, "Chordal graphs and semidefinite optimization," Found. Trends Optim., vol. 1, no. 4, pp. 241–433, 2014.
[23] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2011.
[24] J. F. Sturm, "Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones," Optim. Methods Softw., vol. 11, no. 1-4, pp. 625–653, 1999.
[25] B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd, "SCS: Splitting conic solver," https://github.com/cvxgrp/scs.