Fast ADMM for Semidefinite Programs with Chordal Sparsity
Yang Zheng, Giovanni Fantuzzi, Antonis Papachristodoulou, Paul Goulart, Andrew Wynn
Abstract— Many problems in control theory can be formulated as semidefinite programs (SDPs). For large-scale SDPs, it is important to exploit the inherent sparsity to improve scalability. This paper develops efficient first-order methods to solve SDPs with chordal sparsity based on the alternating direction method of multipliers (ADMM). We show that chordal decomposition can be applied to either the primal or the dual standard form of a sparse SDP, resulting in ADMM algorithms that are scaled versions of each other and have the same computational cost. Each iteration of our algorithms consists of a projection onto the product of small positive semidefinite cones, followed by a projection onto an affine set, both of which can be carried out efficiently. Our techniques are implemented in CDCS, an open-source add-on to MATLAB. Numerical experiments on large-scale sparse problems in SDPLIB and on random SDPs with block-arrow sparsity patterns show speedups compared to some common state-of-the-art software packages.

† Y. Zheng and G. Fantuzzi contributed equally to this work. Y. Zheng is supported by the Clarendon Scholarship and the Jason Hu Scholarship. G. Fantuzzi was partially supported by the EPSRC grant EP/J010537/1.

Department of Engineering Science, University of Oxford, Parks Road, Oxford, OX1 3PJ, United Kingdom (e-mail: [email protected]; [email protected]; [email protected]).

Department of Aeronautics, Imperial College London, South Kensington Campus, London, SW7 2AZ, United Kingdom (e-mail: [email protected]; [email protected]).
I. INTRODUCTION
Semidefinite programs (SDPs) are a class of convex optimization problems over the cone of positive semidefinite (PSD) matrices. Given $b \in \mathbb{R}^m$, $C \in \mathbb{S}^n$, and matrices $A_1, \ldots, A_m \in \mathbb{S}^n$ that define the operators

$$\mathcal{A}(X) = \begin{bmatrix} \langle A_1, X \rangle & \cdots & \langle A_m, X \rangle \end{bmatrix}^T, \qquad \mathcal{A}^*(y) = \sum_{i=1}^m A_i y_i,$$

SDPs are typically written in the standard primal form

$$\min_{X} \ \langle C, X \rangle \quad \text{subject to} \quad \mathcal{A}(X) = b, \quad X \in \mathbb{S}^n_+, \tag{1}$$

or in the standard dual form

$$\max_{y,Z} \ \langle b, y \rangle \quad \text{subject to} \quad \mathcal{A}^*(y) + Z = C, \quad Z \in \mathbb{S}^n_+. \tag{2}$$

In the above and throughout this work, $\mathbb{R}^m$ is the $m$-dimensional Euclidean space, $\mathbb{S}^n$ is the space of $n \times n$ symmetric matrices, $\mathbb{S}^n_+$ is the cone of PSD matrices, and $\langle \cdot, \cdot \rangle$ denotes the inner product in the appropriate space.
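To fix ideas, the following minimal MATLAB sketch evaluates $\mathcal{A}(X)$ and $\mathcal{A}^*(y)$; it is our illustration, and the cell-array data layout and the names Amats, Aop, Aadj are assumptions, not notation from the paper.

```matlab
% A(X): vector of inner products <Ai, X>; A*(y): weighted sum of the Ai.
Amats = {eye(3), ones(3)};                 % toy data A1, A2 (assumed)
Aop   = @(X) cellfun(@(Ai) sum(sum(Ai .* X)), Amats)';         % A(X)
Aadj  = @(y) sum(cat(3, Amats{:}) .* reshape(y, 1, 1, []), 3); % A*(y)

X = eye(3); y = [1; 2];
disp(Aop(X))     % [<A1,X>; <A2,X>] = [3; 3]
disp(Aadj(y))    % y1*A1 + y2*A2
```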
SDPs have applications in control theory, machine learning, combinatorics, and operations research [1]. Moreover, linear, quadratic, and second-order-cone programs are particular instances of SDPs [2]. Small and medium-sized SDPs can be solved in polynomial time [3] using efficient second-order interior-point methods (IPMs) [4], [5]. However, many real-life problems are too large for state-of-the-art interior-point algorithms, due to memory and CPU time constraints. One approach is to abandon IPMs in favour of faster first-order methods (FOMs) with modest accuracy. For instance, Wen et al. proposed an alternating-direction augmented Lagrangian method for large-scale SDPs in the dual standard form [6]. More recently, O'Donoghue et al. developed a first-order operator-splitting method to solve the homogeneous self-dual embedding (HSDE) of a primal-dual pair of conic programs, which has the advantage of being able to provide primal or dual certificates of infeasibility [7]. A second approach relies on the fact that large-scale SDPs are often structured and/or sparse [1]. Exploiting sparsity in SDPs is an active and challenging area of research [8], one main difficulty being that the optimal solution is typically dense despite the sparsity of the problem data. If, however, the sparsity pattern of the data is chordal or has sparse chordal extensions, Grone's and Agler's theorems [9], [10] allow replacing the PSD constraint with a set of smaller semidefinite constraints, plus an additional set of equality constraints. In some cases, the converted SDP can then be solved more efficiently than the original problem. These ideas underlie the domain- and range-space conversion techniques [11], [12] implemented in SparseCoLO [13].

However, adding equality constraints often offsets the benefit of working with smaller PSD cones. One possible solution is to exploit chordal sparsity directly in the IPMs: Fukuda et al. used Grone's theorem [9] to develop a primal-dual path-following method for SDPs [11]; Burer proposed a nonsymmetric primal-dual IPM using Cholesky factors of the dual variable and the maximum-determinant completion of the primal variable [14]; and Andersen et al. developed fast recursive algorithms for SDPs with chordal sparsity [15]. Alternatively, one can solve the decomposed SDP with FOMs: Sun et al. proposed a first-order splitting method for decomposable conic programs [16]; Kalbat and Lavaei applied the alternating direction method of multipliers (ADMM) to SDPs with fully decomposable constraints [17]; and Madani et al. developed a highly parallelizable ADMM algorithm for sparse SDPs with inequality constraints, with applications to optimal power flow [18].

In this work, we adopt the strategy of exploiting sparsity using first-order algorithms in the spirit of [16]–[18], and develop efficient ADMM algorithms to solve large-scale sparse SDPs. Our contributions are:

1) We combine ADMM and chordal decomposition to solve sparse SDPs in primal or dual standard form. The resulting algorithms are scaled versions of each other. This gives a conversion framework for the application of FOMs, analogous to that of [11], [12] for IPMs.

2) In each iteration, the PSD constraint is enforced via parallel projections onto small PSD cones. The affine constraints are imposed by a quadratic program with equality constraints, whose KKT system matrix can be factorized before iterating the ADMM algorithm since it only depends on the problem data.

3) We implement our methods in the open-source MATLAB solver CDCS (Cone Decomposition Conic Solver) [19]. Numerical simulations on random SDPs with block-arrow sparsity patterns and on four large-scale sparse problems in SDPLIB [20] demonstrate the efficiency of our algorithms compared to other solvers.

The rest of this paper is organized as follows. Section II reviews chordal sparsity and decomposition techniques. We show how to apply ADMM to primal and dual standard-form SDPs in Sections III and IV, respectively, and report our numerical experiments in Section V. Finally, Section VI offers concluding remarks.

II. PRELIMINARIES: CHORDAL DECOMPOSITION AND THE ADMM ALGORITHM
A. Chordal graphs
Let $\mathcal{G}(\mathcal{V}, \mathcal{E})$ be an undirected graph with vertices $\mathcal{V} = \{1, 2, \ldots, n\}$ and edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. A clique $\mathcal{C} \subseteq \mathcal{V}$ is a subset of vertices such that $(i, j) \in \mathcal{E}$ for any distinct vertices $i, j \in \mathcal{C}$, and the number of vertices in $\mathcal{C}$ is denoted by $|\mathcal{C}|$. If $\mathcal{C}$ is not a subset of any other clique, then it is referred to as a maximal clique. A cycle of length $k$ in $\mathcal{G}$ is a set of pairwise distinct vertices $\{v_1, v_2, \ldots, v_k\} \subset \mathcal{V}$ such that $(v_k, v_1) \in \mathcal{E}$ and $(v_i, v_{i+1}) \in \mathcal{E}$ for $i = 1, \ldots, k-1$. A chord is an edge joining two non-adjacent vertices in a cycle.

Definition 1 (Chordal graph):
An undirected graph is chordal if all cycles of length four or higher have a chord. For example, the cycle on four vertices is not chordal, but adding either of its two diagonals as an edge makes it so. Note that if $\mathcal{G}(\mathcal{V}, \mathcal{E})$ is not chordal, it can be chordal extended, i.e., we can construct a chordal graph $\mathcal{G}'(\mathcal{V}, \mathcal{E}')$ by adding edges to $\mathcal{E}$. Finding the chordal extension with the minimum number of additional edges is an NP-complete problem [21], but good chordal extensions can be computed efficiently using several heuristics [22], as illustrated below.
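One standard heuristic (see [22]) computes a fill-reducing ordering and reads a chordal extension off the symbolic Cholesky factorization. The sketch below is our illustration, under the assumption that the pattern is stored as a sparse symmetric MATLAB matrix S with ones on its diagonal:

```matlab
% Sketch: chordal extension via symbolic Cholesky factorization.
% S: sparse symmetric 0/1 pattern with self-loops on the diagonal.
p = amd(S);                          % approximate minimum degree ordering
[~, ~, ~, ~, R] = symbfact(S(p, p)); % 0/1 pattern of the Cholesky factor
Ext = spones(R + R');                % chordal extension (in permuted order)
fillin = nnz(Ext) - nnz(S);          % number of edges added
```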
B. Sparse matrices defined by graphs

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be an undirected graph such that $(i, i) \in \mathcal{E}$, i.e., each node has a self-loop. We say that $X$ is a sparse symmetric matrix defined by $\mathcal{G}$ if $X_{ij} = X_{ji} = 0$ whenever $(i, j) \notin \mathcal{E}$. The spaces of sparse and PSD sparse symmetric matrices defined by $\mathcal{G}$ are

$$\mathbb{S}^n(\mathcal{E}, 0) = \{X \in \mathbb{S}^n \mid X_{ij} = X_{ji} = 0 \text{ if } (i, j) \notin \mathcal{E}\},$$
$$\mathbb{S}^n_+(\mathcal{E}, 0) = \{X \in \mathbb{S}^n(\mathcal{E}, 0) \mid X \succeq 0\}.$$

Similarly, we say that $X$ is a partial symmetric matrix defined by $\mathcal{G}$ if $X_{ij} = X_{ji}$ are given when $(i, j) \in \mathcal{E}$, and arbitrary otherwise. Moreover, we say that $M$ is a PSD completion of the partial symmetric matrix $X$ if $M \succeq 0$ and $M_{ij} = X_{ij}$ when $(i, j) \in \mathcal{E}$. We can then define the spaces

$$\mathbb{S}^n(\mathcal{E}, ?) = \{X \in \mathbb{S}^n \mid X_{ij} = X_{ji} \text{ given if } (i, j) \in \mathcal{E}\},$$
$$\mathbb{S}^n_+(\mathcal{E}, ?) = \{X \in \mathbb{S}^n(\mathcal{E}, ?) \mid \exists M \succeq 0,\ M_{ij} = X_{ij}\ \forall (i, j) \in \mathcal{E}\}.$$

Finally, given a clique $\mathcal{C}_k$ of $\mathcal{G}$, $E_k \in \mathbb{R}^{|\mathcal{C}_k| \times n}$ is the matrix with $(E_k)_{ij} = 1$ if $\mathcal{C}_k(i) = j$ and zero otherwise, where $\mathcal{C}_k(i)$ is the $i$-th vertex in $\mathcal{C}_k$, sorted in the natural ordering. The submatrix of $X \in \mathbb{S}^n$ defined by $\mathcal{C}_k$ is $E_k X E_k^T \in \mathbb{S}^{|\mathcal{C}_k|}$.
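As a concrete illustration (our sketch, with a toy clique), $E_k$ can be assembled as a sparse matrix, and multiplication by it simply extracts the principal submatrix indexed by the clique:

```matlab
% Sketch: entry-selector matrix Ek for a clique given as an index vector.
n = 5; clique = [2 4 5];                       % toy clique C_k (assumed)
Ek = sparse(1:numel(clique), clique, 1, numel(clique), n);
X  = randn(n); X = X + X';                     % a symmetric test matrix
norm(full(Ek*X*Ek') - X(clique, clique))       % identical: returns 0
```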
C. Chordal decomposition of PSD matrices

The spaces $\mathbb{S}^n_+(\mathcal{E}, ?)$ and $\mathbb{S}^n_+(\mathcal{E}, 0)$ are a pair of dual convex cones for any undirected graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ [15], [22]. If $\mathcal{G}$ is chordal, $\mathbb{S}^n_+(\mathcal{E}, ?)$ and $\mathbb{S}^n_+(\mathcal{E}, 0)$ can be expressed in terms of several coupled smaller convex cones:

Theorem 1 (Grone's theorem [9]): Let $\{\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_p\}$ be the set of maximal cliques of a chordal graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. Then, $X \in \mathbb{S}^n_+(\mathcal{E}, ?)$ if and only if $X_k := E_k X E_k^T \in \mathbb{S}^{|\mathcal{C}_k|}_+$ for all $k = 1, \ldots, p$.

Theorem 2 (Agler's theorem [10]): Let $\{\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_p\}$ be the set of maximal cliques of a chordal graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. Then, $Z \in \mathbb{S}^n_+(\mathcal{E}, 0)$ if and only if there exist matrices $Z_k \in \mathbb{S}^{|\mathcal{C}_k|}_+$ for $k = 1, \ldots, p$ such that $Z = \sum_{k=1}^p E_k^T Z_k E_k$.

These results can be proven individually, but can also be derived from each other using the duality of the cones $\mathbb{S}^n_+(\mathcal{E}, ?)$ and $\mathbb{S}^n_+(\mathcal{E}, 0)$ [22].
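The clique-wise condition of Theorem 1 is straightforward to test numerically. The sketch below is our illustration, assuming the maximal cliques are stored as index vectors in a cell array:

```matlab
% Sketch: test the condition of Theorem 1 for a partial matrix X whose
% chordal pattern has maximal cliques listed in the cell array 'cliques'.
function ok = grone_check(X, cliques)
ok = true;
for k = 1:numel(cliques)
    Xk = X(cliques{k}, cliques{k});       % principal submatrix Ek*X*Ek'
    ok = ok && min(eig(Xk)) >= 0;         % PSD test on the small block
end
end
```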
D. ADMM algorithm

The ADMM algorithm solves the optimization problem

$$\min\ f(x) + g(y) \quad \text{subject to} \quad Ax + By = c,$$

where $f$ and $g$ are convex functions, $x \in \mathbb{R}^{n_x}$, $y \in \mathbb{R}^{n_y}$, $A \in \mathbb{R}^{n_c \times n_x}$, $B \in \mathbb{R}^{n_c \times n_y}$, and $c \in \mathbb{R}^{n_c}$. Given a penalty parameter $\rho > 0$ and a dual multiplier $z \in \mathbb{R}^{n_c}$, the ADMM algorithm minimizes the augmented Lagrangian

$$L_\rho(x, y, z) = f(x) + g(y) + \frac{\rho}{2} \left\| Ax + By - c + \frac{1}{\rho} z \right\|^2$$

with respect to the variables $x$ and $y$ separately, followed by a dual variable update:

$$x^{(n+1)} = \arg\min_x L_\rho(x, y^{(n)}, z^{(n)}), \tag{3a}$$
$$y^{(n+1)} = \arg\min_y L_\rho(x^{(n+1)}, y, z^{(n)}), \tag{3b}$$
$$z^{(n+1)} = z^{(n)} + \rho \left( Ax^{(n+1)} + By^{(n+1)} - c \right). \tag{3c}$$

The superscript $(n)$ indicates that a variable is fixed to its value at the $n$-th iteration. ADMM is particularly suitable when the minimizations with respect to each of the variables $x$ and $y$ in (3a) and (3b) can be carried out efficiently through closed-form expressions.
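As a toy illustration of (3) (our example, not from the paper), take $f(x) = \tfrac{1}{2}\|x-a\|^2$, $g(y) = \tfrac{1}{2}\|y-b\|^2$, $A = I$, $B = -I$, and $c = 0$, so the constraint is $x = y$, both subproblems have closed-form solutions, and the iterates converge to the average $(a+b)/2$:

```matlab
% Minimal ADMM sketch for min 0.5*||x-a||^2 + 0.5*||y-b||^2 s.t. x = y.
a = [1; 3]; b = [5; -1]; rho = 1;
x = zeros(2, 1); y = zeros(2, 1); z = zeros(2, 1);
for n = 1:100
    x = (a + rho*y - z) / (1 + rho);   % step (3a), closed form
    y = (b + rho*x + z) / (1 + rho);   % step (3b), closed form
    z = z + rho*(x - y);               % step (3c), dual ascent
end
disp([x, (a + b)/2])                   % x and y converge to (a+b)/2
```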
III. ADMM FOR SPARSE PRIMAL-FORM SDPS

A. Reformulation and decomposition of the PSD constraint
Let the primal-standard-form SDP (1) be sparse with an aggregate sparsity pattern described by the graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$, meaning that $(i, j) \in \mathcal{E}$ if and only if the entry $ij$ of at least one of the data matrices $C, A_1, \ldots, A_m$ is nonzero. We assume that $\mathcal{G}$ is chordal (otherwise it can be chordal extended) and that its maximal cliques $\mathcal{C}_1, \ldots, \mathcal{C}_p$ are small. Then, only the entries of the matrix variable $X$ corresponding to the graph edges $\mathcal{E}$ appear in the cost and constraint functions, so the constraint $X \in \mathbb{S}^n_+$ can be replaced by $X \in \mathbb{S}^n_+(\mathcal{E}, ?)$. Using Theorem 1, we can then reformulate (1) as

$$\begin{aligned} \min_{X, X_1, \ldots, X_p} \quad & \langle C, X \rangle \\ \text{subject to} \quad & \mathcal{A}(X) = b, \\ & X_k - E_k X E_k^T = 0, \quad k = 1, \ldots, p, \\ & X_k \in \mathbb{S}^{|\mathcal{C}_k|}_+, \quad k = 1, \ldots, p. \end{aligned} \tag{4}$$

In other words, we can decompose the original large semidefinite cone into multiple smaller cones, at the expense of introducing a set of consensus constraints between the variables.

To ease the exposition, we rewrite (4) in a vectorized form. Letting $\mathrm{vec}: \mathbb{S}^n \to \mathbb{R}^{n^2}$ be the usual operator mapping a matrix to the stack of its columns, define the vectorized data $c := \mathrm{vec}(C)$, $A := [\mathrm{vec}(A_1)\ \ldots\ \mathrm{vec}(A_m)]^T$, the vectorized variables $x := \mathrm{vec}(X)$, $x_k := \mathrm{vec}(X_k)$, $k = 1, \ldots, p$, and the matrices $H_k := E_k \otimes E_k$, such that

$$x_k = \mathrm{vec}(X_k) = \mathrm{vec}(E_k X E_k^T) = H_k x.$$

In other words, the matrices $H_1, \ldots, H_p$ are "entry-selector" matrices of 1's and 0's, whose rows are orthonormal, that project $x$ onto the subvectors $x_1, \ldots, x_p$, respectively. Denoting the constraint $X_k \in \mathbb{S}^{|\mathcal{C}_k|}_+$ by $x_k \in \mathcal{S}_k$, we can rewrite (4) as

$$\begin{aligned} \min_{x, x_1, \ldots, x_p} \quad & \langle c, x \rangle \\ \text{subject to} \quad & Ax = b, \\ & x_k = H_k x, \quad k = 1, \ldots, p, \\ & x_k \in \mathcal{S}_k, \quad k = 1, \ldots, p. \end{aligned} \tag{5}$$
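The identity $\mathrm{vec}(E_k X E_k^T) = (E_k \otimes E_k)\,\mathrm{vec}(X)$ follows from the standard rule $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$. A quick numerical check (our sketch, reusing the $E_k$ construction above):

```matlab
% Sketch: verify vec(Ek*X*Ek') == Hk*vec(X) on a random symmetric matrix.
n = 5; clique = [2 4 5];
Ek = sparse(1:3, clique, 1, 3, n);
Hk = kron(Ek, Ek);                         % H_k = E_k (x) E_k
X  = randn(n); X = X + X';
norm(Hk*X(:) - reshape(Ek*X*Ek', [], 1))   % zero up to round-off
```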
B. The ADMM algorithm for primal SDPs

Moving the constraints $Ax = b$ and $x_k \in \mathcal{S}_k$ in (5) to the objective using the indicator functions $\delta_0(\cdot)$ and $\delta_{\mathcal{S}_k}(\cdot)$ gives

$$\begin{aligned} \min_{x, x_1, \ldots, x_p} \quad & \langle c, x \rangle + \delta_0(Ax - b) + \sum_{k=1}^p \delta_{\mathcal{S}_k}(x_k) \\ \text{subject to} \quad & x_k = H_k x, \quad k = 1, \ldots, p. \end{aligned} \tag{6}$$

This problem is in the standard form for the application of ADMM. Given a penalty parameter $\rho > 0$ and a Lagrange multiplier $\lambda_k$ for each constraint $x_k = H_k x$, we define

$$L := \langle c, x \rangle + \delta_0(Ax - b) + \sum_{k=1}^p \left[ \delta_{\mathcal{S}_k}(x_k) + \frac{\rho}{2} \left\| x_k - H_k x + \frac{1}{\rho} \lambda_k \right\|^2 \right], \tag{7}$$

and group the variables as $\mathcal{X} := \{x\}$, $\mathcal{Y} := \{x_1, \ldots, x_p\}$, and $\mathcal{Z} := \{\lambda_1, \ldots, \lambda_p\}$. As in (3), in each ADMM iteration we minimize (7) with respect to $\mathcal{X}$ and $\mathcal{Y}$, then update $\mathcal{Z}$.

Algorithm 1
ADMM for decomposed primal-form SDPs

Input: $\rho > 0$, $\epsilon_{\mathrm{tol}} > 0$, initial guesses $x^{(0)}$, $x_1^{(0)}, \ldots, x_p^{(0)}$, $\lambda_1^{(0)}, \ldots, \lambda_p^{(0)}$
Setup: chordal decomposition, KKT factorization.
while $\max(\epsilon_{\mathrm{p}}, \epsilon_{\mathrm{d}}) \geq \epsilon_{\mathrm{tol}}$ do
    Compute $x^{(n+1)}$ with (9).
    for $k = 1, \ldots, p$: compute $x_k^{(n+1)}$ with (11).
    for $k = 1, \ldots, p$: compute $\lambda_k^{(n+1)}$ with (12).
    Update the residuals $\epsilon_{\mathrm{p}}$, $\epsilon_{\mathrm{d}}$.
end while

1) Minimization over $\mathcal{X}$: Minimizing (7) over $\mathcal{X}$ is equivalent to the equality-constrained quadratic program

$$\begin{aligned} \min_x \quad & \langle c, x \rangle + \frac{\rho}{2} \sum_{k=1}^p \left\| x_k^{(n)} - H_k x + \frac{1}{\rho} \lambda_k^{(n)} \right\|^2 \\ \text{subject to} \quad & Ax = b. \end{aligned} \tag{8}$$

Define $D := \sum_{k=1}^p H_k^T H_k$ and let $\rho y$ be the multiplier for the equality constraint. We can write the optimality conditions for (8) as the KKT system

$$\begin{bmatrix} D & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \sum_{k=1}^p H_k^T \left( x_k^{(n)} + \rho^{-1} \lambda_k^{(n)} \right) - \rho^{-1} c \\ b \end{bmatrix}. \tag{9}$$

Note that $D$ is a diagonal matrix, because the rows of each matrix $H_k$ are orthonormal, so (9) can be solved efficiently, e.g., by block elimination. Moreover, the coefficient matrix is the same at every iteration, so its factorization can be pre-computed and cached before starting the ADMM iterations.

2) Minimization over $\mathcal{Y}$: Minimizing (7) over $\mathcal{Y}$ is equivalent to the $p$ independent problems

$$\min_{x_k}\ \left\| x_k - H_k x^{(n+1)} + \rho^{-1} \lambda_k^{(n)} \right\|^2 \quad \text{subject to} \quad x_k \in \mathcal{S}_k. \tag{10}$$

In terms of the original matrix variables $X_1, \ldots, X_p$, this amounts to a projection onto the PSD cone. More precisely, if $\mathbb{P}_k$ denotes the projection onto $\mathbb{S}^{|\mathcal{C}_k|}_+$, we have

$$x_k^{(n+1)} = \mathrm{vec} \left\{ \mathbb{P}_k \left[ \mathrm{vec}^{-1} \left( H_k x^{(n+1)} - \rho^{-1} \lambda_k^{(n)} \right) \right] \right\}. \tag{11}$$

Since computing $\mathbb{P}_k$ amounts to an eigenvalue decomposition and each cone $\mathbb{S}^{|\mathcal{C}_k|}_+$ is small by assumption, we can compute $x_1^{(n+1)}, \ldots, x_p^{(n+1)}$ efficiently and in parallel.

3) Updating the multipliers $\mathcal{Z}$: Each multiplier $\lambda_k$, $k = 1, \ldots, p$, is updated with the usual gradient ascent rule

$$\lambda_k^{(n+1)} = \lambda_k^{(n)} + \rho \left( x_k^{(n+1)} - H_k x^{(n+1)} \right). \tag{12}$$

This computation is cheap and can be parallelized.

The ADMM algorithm is stopped after the $n$-th iteration if the relative primal/dual error measures $\epsilon_{\mathrm{p}}$ and $\epsilon_{\mathrm{d}}$ are smaller than a specified tolerance $\epsilon_{\mathrm{tol}}$; see [23] for more details on stopping conditions for a generic ADMM algorithm. Algorithm 1 summarizes the steps to solve a decomposable SDP in standard primal form (5).
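Because $D$ is diagonal, (9) reduces to a solve with the Schur complement $A D^{-1} A^T$, whose Cholesky factor can be cached once. A minimal sketch (our illustration, with generic right-hand sides r1, r2, and assuming the $H_k$ are stacked vertically in H so that every component of $x$ is covered by at least one clique, i.e., the diagonal d is positive):

```matlab
% Sketch: block elimination for [D A'; A 0][x; y] = [r1; r2], D diagonal.
d = sum(H.^2, 1)';            % diag of D = sum_k Hk'*Hk (assumed d > 0)
% -- setup, done once before the ADMM iterations --
S = A * (A' ./ d);            % Schur complement A*inv(D)*A' (m x m)
R = chol((S + S')/2);         % cached Cholesky factor (symmetrized)
% -- per-iteration solve --
y = R \ (R' \ (A * (r1 ./ d) - r2));
x = (r1 - A' * y) ./ d;
```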
IV. ADMM FOR SPARSE DUAL-FORM SDPS

A. Reformulation and decomposition of the PSD constraint
Similar to Section III, suppose the aggregate sparsity pattern of an SDP in standard dual form (2) is described by the chordal graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. The equality constraint in (2) implies that the PSD variable $Z$ has the same sparsity pattern as the aggregate sparsity pattern of the problem data, i.e., $Z \in \mathbb{S}^n_+(\mathcal{E}, 0)$, so using Theorem 2 we can rewrite (2) as

$$\begin{aligned} \min_{y, Z_1, \ldots, Z_p} \quad & -\langle b, y \rangle \\ \text{subject to} \quad & \mathcal{A}^*(y) + \sum_{k=1}^p E_k^T Z_k E_k = C, \\ & Z_k \in \mathbb{S}^{|\mathcal{C}_k|}_+, \quad k = 1, \ldots, p. \end{aligned} \tag{13}$$

While the original PSD constraint has been replaced by multiple smaller PSD constraints, it is not convenient to apply ADMM to this problem form because the PSD variables $Z_1, \ldots, Z_p$ in the equality constraint are weighted by the matrices $E_k$. Instead, we replace $Z_1, \ldots, Z_p$ in the equality constraint with slack variables $V_1, \ldots, V_p$ such that $Z_k = V_k$, $k = 1, \ldots, p$. Defining $z_k := \mathrm{vec}(Z_k)$ and $v_k := \mathrm{vec}(V_k)$ for all $k = 1, \ldots, p$, and using the same vectorized notation as in Section III, we then reformulate (13) in the vectorized form

$$\begin{aligned} \min_{y, z_1, \ldots, z_p, v_1, \ldots, v_p} \quad & -\langle b, y \rangle \\ \text{subject to} \quad & A^T y + \sum_{k=1}^p H_k^T v_k = c, \\ & z_k - v_k = 0, \quad k = 1, \ldots, p, \\ & z_k \in \mathcal{S}_k, \quad k = 1, \ldots, p. \end{aligned} \tag{14}$$
Remark 1: Although we have derived (14) by applying Theorem 2, (14) is exactly the dual of the decomposed primal SDP (5). Consequently, our analysis provides a decomposition framework for the application of FOMs analogous to that of [11], [12] for IPMs. This elegant picture, in which the decomposed SDPs inherit the duality of the original ones by virtue of the duality between Grone's and Agler's theorems, is shown in Fig. 1.
B. The ADMM algorithm for dual SDPs
Using indicator functions to move all but the equality constraints $z_k = v_k$, $k = 1, \ldots, p$, to the objective gives

$$\begin{aligned} \min \quad & -\langle b, y \rangle + \delta_0 \left( c - A^T y - \sum_{k=1}^p H_k^T v_k \right) + \sum_{k=1}^p \delta_{\mathcal{S}_k}(z_k) \\ \text{subject to} \quad & z_k = v_k, \quad k = 1, \ldots, p. \end{aligned} \tag{15}$$

Given a penalty parameter $\rho > 0$ and a Lagrange multiplier $\lambda_k$ for each constraint $z_k = v_k$, $k = 1, \ldots, p$, we define

$$L := -\langle b, y \rangle + \delta_0 \left( c - A^T y - \sum_{k=1}^p H_k^T v_k \right) + \sum_{k=1}^p \left[ \delta_{\mathcal{S}_k}(z_k) + \frac{\rho}{2} \left\| z_k - v_k + \frac{1}{\rho} \lambda_k \right\|^2 \right], \tag{16}$$

and group the variables as $\mathcal{X} := \{y, v_1, \ldots, v_p\}$, $\mathcal{Y} := \{z_1, \ldots, z_p\}$, and $\mathcal{Z} := \{\lambda_1, \ldots, \lambda_p\}$.
Fig. 1. Duality relationships between the primal and dual SDPs (1) and (2), and the decomposed primal and dual SDPs (5) and (14).

1) Minimization over $\mathcal{X}$: Minimizing (16) over the block $\mathcal{X}$ is equivalent to the equality-constrained quadratic program

$$\begin{aligned} \min_{y, v_1, \ldots, v_p} \quad & -\langle b, y \rangle + \frac{\rho}{2} \sum_{k=1}^p \left\| z_k^{(n)} - v_k + \frac{1}{\rho} \lambda_k^{(n)} \right\|^2 \\ \text{subject to} \quad & c - A^T y - \sum_{k=1}^p H_k^T v_k = 0. \end{aligned} \tag{17}$$

Let $\rho x$ be the multiplier for the equality constraint. After some algebra, the optimality conditions for (17) can be written as the KKT system

$$\begin{bmatrix} D & A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} c - \sum_{k=1}^p H_k^T \left( z_k^{(n)} + \rho^{-1} \lambda_k^{(n)} \right) \\ -\rho^{-1} b \end{bmatrix}, \tag{18}$$

plus a set of $p$ uncoupled equations for the variables $v_k$,

$$v_k = z_k^{(n)} + \frac{1}{\rho} \lambda_k^{(n)} + H_k x, \quad k = 1, \ldots, p. \tag{19}$$

The KKT system (18) is the same as (9) after rescaling $x \mapsto -x$, $y \mapsto -y$, $c \mapsto \rho^{-1} c$, and $b \mapsto \rho b$, and as in Section III-B.1 the factors of the coefficient matrix required to solve (18) can be pre-computed and cached. Consequently, updating $\mathcal{X}$ has the same cost as in Section III-B.1, plus the cost of (19), which is cheap and can also be parallelized.

2) Minimization over $\mathcal{Y}$: As in Section III-B.2, $z_1, \ldots, z_p$ are updated with $p$ independent and efficient projections,

$$z_k^{(n+1)} = \mathrm{vec} \left\{ \mathbb{P}_k \left[ \mathrm{vec}^{-1} \left( v_k^{(n+1)} - \rho^{-1} \lambda_k^{(n)} \right) \right] \right\}. \tag{20}$$

3) Updating the multipliers $\mathcal{Z}$: The multipliers $\lambda_k$, $k = 1, \ldots, p$, are updated with the usual gradient ascent rule

$$\lambda_k^{(n+1)} = \lambda_k^{(n)} + \rho \left( z_k^{(n+1)} - v_k^{(n+1)} \right). \tag{21}$$

As in Section III-B, we stop the ADMM algorithm when the relative primal/dual error measures $\epsilon_{\mathrm{p}}$ and $\epsilon_{\mathrm{d}}$ are smaller than a specified tolerance $\epsilon_{\mathrm{tol}}$. Algorithm 2 summarizes the full ADMM algorithm for sparse dual-standard-form SDPs.
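Both (11) and (20) rely on the conic projection $\mathbb{P}_k$, which clips the negative eigenvalues of a small symmetric matrix. A minimal sketch (our illustration):

```matlab
% Sketch: projection of a symmetric matrix onto the PSD cone.
function Xp = proj_psd(X)
[V, D] = eig((X + X')/2);        % symmetrize to guard against round-off
Xp = V * max(D, 0) * V';         % zero out the negative eigenvalues
Xp = (Xp + Xp')/2;               % re-symmetrize the result
end
```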
Remark 2: The computational cost of (19) is the same as that of (12), so the ADMM iterations for the decomposed dual-standard-form SDP (14) have the same cost as those for the decomposed primal-standard-form SDP (5), plus the cost of (21). However, if one minimizes (16) over $\mathcal{Y}$ before minimizing it over $\mathcal{X}$, substituting (19) into (21) gives

$$\lambda_k^{(n+1)} = \rho H_k x^{(n+1)}, \quad k = 1, \ldots, p. \tag{22}$$

Since $H_1 x, \ldots, H_p x$ have already been computed to update $v_1, \ldots, v_p$, updating $\lambda_1, \ldots, \lambda_p$ requires only a scaling operation. Consequently, the ADMM algorithms for the primal- and dual-standard-form SDPs can be considered as scaled versions of each other, with the same leading-order computational cost at each iteration.

Algorithm 2 ADMM for decomposed dual-form SDPs

Input: $\rho > 0$, $\epsilon_{\mathrm{tol}} > 0$, initial guesses $y^{(0)}$, $z_1^{(0)}, \ldots, z_p^{(0)}$, $\lambda_1^{(0)}, \ldots, \lambda_p^{(0)}$
Setup: chordal decomposition, KKT factorization.
while $\max(\epsilon_{\mathrm{p}}, \epsilon_{\mathrm{d}}) \geq \epsilon_{\mathrm{tol}}$ do
    for $k = 1, \ldots, p$: compute $z_k^{(n+1)}$ with (20).
    Compute $y^{(n+1)}$, $x$ with (17).
    for $k = 1, \ldots, p$: compute $v_k^{(n+1)}$ with (19) and $\lambda_k^{(n+1)}$ with (22) (no cost).
    Update the residuals $\epsilon_{\mathrm{p}}$ and $\epsilon_{\mathrm{d}}$.
end while

V. NUMERICAL SIMULATIONS
We have implemented our techniques in CDCS (Cone Decomposition Conic Solver) [19], an open-source MATLAB solver. CDCS supports Cartesian products of the following cones: $\mathbb{R}^n$, the non-negative orthant, second-order cones, and the PSD cone. Currently, only chordal decomposition techniques for semidefinite cones are implemented, while the other cone types are not decomposed. Our codes can be downloaded from https://github.com/OxfordControl/CDCS.

We tested CDCS on four sparse large-scale problems in SDPLIB [20], as well as on randomly generated SDPs with the block-arrow sparsity pattern used as a benchmark in [16]. The performance is compared to that of the IPM solver SeDuMi [24] and of the first-order solver SCS [25], on both the full SDPs (without decomposition) and the SDPs decomposed by SparseCoLO [13]. The comparison has two purposes: 1) SeDuMi computes accurate optimal points, which can be used to assess the quality of the solutions computed by CDCS; 2) SCS is a high-performance first-order solver for general conic programs, so we can assess the advantages of chordal decomposition. SeDuMi should not be compared to the other solvers on CPU time, because the latter only aim to achieve moderate accuracy. In the experiments reported below, CDCS and SCS were run with the same termination tolerance $\epsilon_{\mathrm{tol}}$ and a maximum of 2000 iterations. All experiments were carried out on a PC with an Intel(R) Core(TM) i7 CPU, 2.8 GHz processor, and 8 GB of RAM.
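For reference, a hypothetical calling sequence for CDCS is sketched below. CDCS mimics a SeDuMi-style interface, but the option names used here (solver, maxIter) are our assumptions for illustration; consult the solver's documentation for the actual calling convention.

```matlab
% Hypothetical usage sketch for CDCS (assumed SeDuMi-style interface).
% At, b, c describe the SDP data and K the cone, as in SeDuMi format.
opts.solver  = 'primal';   % assumed switch between Algorithms 1 and 2
opts.maxIter = 2000;       % assumed name for the iteration cap
[x, y, z, info] = cdcs(At, b, c, K, opts);
```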
A. SDPs with block-arrow pattern

SDPs with the block-arrow sparsity pattern shown in Fig. 2, which is chordal, are used as a benchmark in [16]. The SDP parameters are: the number of blocks, $l$; the block size, $d$; the size of the arrow head, $h$; and the number of constraints, $m$. Here, we consider the following cases (a construction sketch for the pattern is given after this list):

1) Fix $l = 40$, $d = 10$, $h = 20$; vary $m$.
2) Fix $m = 1000$, $d = 10$, $h = 20$; vary $l$.
3) Fix $l = 40$, $h = 10$, $m = 1000$; vary $d$.

Fig. 2. Block-arrow sparsity pattern: the number of blocks, $l$; block size, $d$; the size of the arrow head, $h$.
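The following sketch (our illustration; it builds only the aggregate sparsity pattern of Fig. 2, not the random problem data) assembles the block-arrow pattern for the parameters of case 1:

```matlab
% Sketch: aggregate block-arrow sparsity pattern (l blocks of size d,
% plus an arrow head of size h), as in Fig. 2.
l = 40; d = 10; h = 20; n = l*d + h;
S = sparse(n, n);
for i = 1:l
    idx = (i-1)*d + (1:d);
    S(idx, idx) = 1;              % i-th diagonal block
end
S(:, n-h+1:n) = 1;                % arrow head: last h columns...
S(n-h+1:n, :) = 1;                % ...and last h rows
spy(S)                            % visualize the chordal pattern
```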
TABLE I
PROBLEM STATISTICS FOR SDPLIB PROBLEMS

                             maxG32    maxG51    thetaG51    qpG51
  Affine constraints, m
  Matrix dimension, n
  Maximal cliques, p

The CPU times for the different solvers, averaged over five random problem instances, are shown in Fig. 3. CDCS is approximately 10 times faster than SeDuMi and the combination SparseCoLO+SeDuMi, with our Algorithm 2 being the fastest. Moreover, the optimal value from CDCS was always within 0.02% of the accurate value from SeDuMi.

Fig. 3. CPU time for SDPs with block-arrow patterns. Left to right: varying number of constraints; varying number of blocks; varying block size.
B. Sparse SDPs from SDPLIB
We consider two max-cut problems (maxG32 and maxG51), a Lovász theta problem (thetaG51), and a box-constrained quadratic problem (qpG51) from SDPLIB [20]. Table I reports the dimensions and chordal decomposition details of these large, sparse SDPs. Table II summarizes our numerical results; maxG51, thetaG51, and qpG51 could not be solved by the combination SparseCoLO+SeDuMi due to memory overflow. For all four problems, CDCS (primal and dual) is faster than SeDuMi and gives speedups compared to SCS and SparseCoLO+SCS. We remark that the objective value at which CDCS stops is within 2% of the optimal value returned by SeDuMi (which is highly accurate and can be considered exact) in all four cases, and within 0.08% for the max-cut problems maxG32 and maxG51, a negligible difference in applications.

TABLE II
RESULTS FOR THE PROBLEM INSTANCES IN SDPLIB

                            SeDuMi      SparseCoLO   SCS         SparseCoLO   CDCS       CDCS
                                        +SeDuMi                  +SCS         (primal)   (dual)
  maxG32    Total time (s)  974.6       355.2        2.553×10^…
            Iterations      14          15           2000        960          238        127
  maxG51    Total time (s)  134.5       –            87.9        1.201×10^…
            Iterations      16          –            540         2000         235        157
  thetaG51  Total time (s)  2.218×10^…  –            424.2       1.346×10^…
            Iterations      22          –            2000        2000         1287       1048
  qpG51     Total time (s)
            Iterations
VI. CONCLUSION
We proposed a conversion framework for SDPs characterized by chordal sparsity, suitable for the application of FOMs and analogous to the conversion techniques for IPMs of [11], [12]. We also developed efficient ADMM algorithms for sparse SDPs in primal or dual standard form, which are implemented in the open-source solver CDCS. Numerical experiments on SDPs with block-arrow sparsity patterns and on large sparse problems in SDPLIB show that our methods can provide speedups compared to both IPM solvers such as SeDuMi [24] (even when the chordal sparsity is exploited using SparseCoLO [13]) and the state-of-the-art first-order solver SCS [25]. In future work, it would be desirable to exploit chordal sparsity in a first-order HSDE formulation similar to that of [7], so as to be able to detect infeasibility.
REFERENCES

[1] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, Linear Matrix Inequalities in System and Control Theory. SIAM, 1994.
[2] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[3] L. Vandenberghe and S. Boyd, "Semidefinite programming," SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.
[4] F. Alizadeh, J.-P. A. Haeberly, and M. L. Overton, "Primal-dual interior-point methods for semidefinite programming: convergence rates, stability and numerical results," SIAM J. Optim., vol. 8, no. 3, pp. 746–768, 1998.
[5] C. Helmberg, F. Rendl, R. J. Vanderbei, and H. Wolkowicz, "An interior-point method for semidefinite programming," SIAM J. Optim., vol. 6, no. 2, pp. 342–361, 1996.
[6] Z. Wen, D. Goldfarb, and W. Yin, "Alternating direction augmented Lagrangian methods for semidefinite programming," Math. Program. Comput., vol. 2, no. 3-4, pp. 203–230, 2010.
[7] B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd, "Conic optimization via operator splitting and homogeneous self-dual embedding," J. Optim. Theory Appl., vol. 169, no. 3, pp. 1042–1068, 2016.
[8] M. Andersen, J. Dahl, Z. Liu, and L. Vandenberghe, "Interior-point methods for large-scale cone programming," in Optimization for Machine Learning. MIT Press, 2011, pp. 55–83.
[9] R. Grone, C. R. Johnson, E. M. Sá, and H. Wolkowicz, "Positive definite completions of partial Hermitian matrices," Linear Algebra Appl., vol. 58, pp. 109–124, 1984.
[10] J. Agler, W. Helton, S. McCullough, and L. Rodman, "Positive semidefinite matrices with a given sparsity pattern," Linear Algebra Appl., vol. 107, pp. 101–149, 1988.
[11] M. Fukuda, M. Kojima, K. Murota, and K. Nakata, "Exploiting sparsity in semidefinite programming via matrix completion I: General framework," SIAM J. Optim., vol. 11, no. 3, pp. 647–674, 2001.
[12] S. Kim, M. Kojima, M. Mevissen, and M. Yamashita, "Exploiting sparsity in linear and nonlinear matrix inequalities via positive semidefinite matrix completion," Math. Program., vol. 129, no. 1, pp. 33–68, 2011.
[13] K. Fujisawa, S. Kim, M. Kojima, Y. Okamoto, and M. Yamashita, "User's manual for SparseCoLO: Conversion methods for sparse conic-form linear optimization problems," Research Report B-453, Tokyo Institute of Technology, Tokyo 152-8552, Japan, Tech. Rep., 2009.
[14] S. Burer, "Semidefinite programming in the space of partial positive semidefinite matrices," SIAM J. Optim., vol. 14, no. 1, pp. 139–172, 2003.
[15] M. S. Andersen, J. Dahl, and L. Vandenberghe, "Implementation of nonsymmetric interior-point methods for linear optimization over sparse matrix cones," Math. Program. Comput., vol. 2, no. 3-4, pp. 167–201, 2010.
[16] Y. Sun, M. S. Andersen, and L. Vandenberghe, "Decomposition in conic optimization with partially separable structure," SIAM J. Optim., vol. 24, no. 2, pp. 873–897, 2014.
[17] A. Kalbat and J. Lavaei, "A fast distributed algorithm for decomposable semidefinite programs," in Proc. 54th IEEE Conf. Decis. Control, Dec. 2015, pp. 1742–1749.
[18] R. Madani, A. Kalbat, and J. Lavaei, "ADMM for sparse semidefinite programming with applications to optimal power flow problem," in Proc. 54th IEEE Conf. Decis. Control, Dec. 2015, pp. 5932–5939.
[19] Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, "CDCS: Cone decomposition conic solver, version 1.0," https://github.com/OxfordControl/CDCS, Sep. 2016.
[20] B. Borchers, "SDPLIB 1.2, a library of semidefinite programming test problems," Optim. Methods Softw., vol. 11, no. 1-4, pp. 683–690, 1999.
[21] M. Yannakakis, "Computing the minimum fill-in is NP-complete," SIAM J. Algebraic Discrete Methods, vol. 2, pp. 77–79, 1981.
[22] L. Vandenberghe and M. S. Andersen, "Chordal graphs and semidefinite optimization," Found. Trends Optim., vol. 1, no. 4, pp. 241–433, 2014.
[23] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2011.
[24] J. F. Sturm, "Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones," Optim. Methods Softw., vol. 11, no. 1-4, pp. 625–653, 1999.
[25] B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd, "SCS: Splitting conic solver," https://github.com/cvxgrp/scs.