The Constant Trace Property in Noncommutative Optimization

Abstract

In this article, we show that each semidefinite relaxation of a ball-constrained noncommutative polynomial optimization problem can be cast as a semidefinite program with a constant trace matrix variable. We then demonstrate how this constant trace property can be exploited via first order numerical methods to solve efficiently the semidefinite relaxations of the noncommutative problem.



Ngoc Hoang Anh Mai

LAAS, Toulouse

Abhishek Bhardwaj

LAAS, Toulouse

Victor Magron

LAAS, Toulouse


KEYWORDS

noncommutative polynomial optimization, sums of hermitian squares, eigenvalue and trace optimization, conditional gradient-based augmented Lagrangian, constant trace property, semidefinite programming

ACM Reference Format:

Ngoc Hoang Anh Mai, Abhishek Bhardwaj, and Victor Magron. 2021. The Constant Trace Property in Noncommutative Optimization. In Proceedings of ACM Conference (ISSAC'21). ACM, New York, NY, USA, 8 pages.

Polynomial optimization problems (POP) are present in many areas of mathematics, and science in general. There are many applications in global optimization, control and analysis of dynamical systems, to name a few [13], and being able to efficiently solve POP is of great importance.

In this article we focus on noncommutative (nc) polynomial optimization problems (NCPOP), that is, polynomial optimization with non-commuting variables. NCPOP has several applications in control [26] and quantum information [6, 19, 21].

Since the advent of interior point methods for semidefinite programs (SDP) [2], there have been many approaches to solving POP, using powerful representation results from real algebraic geometry for positive polynomials. Inspired by Schmüdgen's solution to the moment problem on compact semialgebraic sets [25], these methods aim to provide certificates of global positivity. There are natural analogues to these approaches in the nc setting, coming from free algebraic geometry [9], and the tracial moment problems [4].


A standard approach in the commutative setting is Lasserre's hierarchy [12], which provides a sequence of lower bounds on the optimal values of POPs, with guaranteed convergence under some natural constraints according to Putinar's Positivstellensatz [24]. This hierarchy and its nc extension to eigenvalue/trace optimization [5, 23] involve solving SDPs over the space of multivariate moment and nc Hankel matrices, respectively.

Due to the current capacity of interior-point SDP solvers such as Mosek [1, 20], these hierarchies can only be applied when the multivariate moment (or nc Hankel) matrices are of "moderate" size. This often restricts their use to polynomials of low degree, or in few variables, with the situation being worse in the nc setting.

A strategy for reducing the size of the SDP hierarchies is to exploit the sparsity structures of POPs. These include correlative sparsity (CS) in [11] and term sparsity (TS) and CS-TS in [28], all of which are the analogs of the commutative works on CS [27], TS [29, 30] and CS-TS [31].

Encouraged by [7, 33], in [17, 18] the first and third authors showed how to exploit the Constant Trace Property (CTP) for SDP relaxations of POPs, which is satisfied when the matrices involved in the SDP relaxations have constant trace. By utilizing first order spectral methods to solve the required SDP relaxations, they attained significant computational gains for POPs constrained on simple domains, e.g., sphere, ball, annulus, box and simplex.

In this article, we extend the exploitation of the CTP to NCPOPs. Our two main contributions are the following. First, we obtain analogous results to [17, 18], which ensure the CTP for a broad class of dense NCPOPs. In particular, if an nc ball (or nc polydisc) constraint is present, then CTP holds. We also extend this CTP-framework to some NCPOPs with correlative sparsity. Second, we provide a Julia package for solving NCPOPs with CTP. The package makes use of first order methods for solving SDPs with CTP. We also demonstrate the numerical and computational efficiency of this approach on some sample classes of dense NCPOPs and NCPOPs with correlative sparsity.

Here we introduce some basic preliminary knowledge needed in the sequel. For a more detailed introduction to the topics introduced in this section, the reader is referred to [5].

ISSAC'21, July 2021, Saint Petersburg, Russia. Mai, Bhardwaj and Magron

We denote by $X$ the noncommuting letters $X_1, \dots, X_n$. Let $\langle X \rangle = \langle X_1, \dots, X_n \rangle$ be the free monoid generated by $X$, and call its elements words in $X$. Given a word $w = X_{i_1} \cdots X_{i_r}$, $w^*$ is its reverse, i.e., $w^* = X_{i_r} \cdots X_{i_1}$. Consider the free algebra $\mathbb{R}\langle X \rangle$ of polynomials in $X$ with coefficients in $\mathbb{R}$. Its elements are called noncommutative (nc) polynomials. Endow $\mathbb{R}\langle X \rangle$ with the involution $f \mapsto f^*$ which fixes $\mathbb{R} \cup \{X\}$ pointwise. The length of the longest word in a polynomial $f \in \mathbb{R}\langle X \rangle$ is called the degree of $f$ and is denoted $\deg(f)$. We write $\mathbb{R}\langle X \rangle_d$ for all nc polynomials of degree at most $d$. The set of symmetric elements of $\mathbb{R}\langle X \rangle$ is defined as $\operatorname{Sym} \mathbb{R}\langle X \rangle = \{ f \in \mathbb{R}\langle X \rangle : f^* = f \}$. We employ the graded lexicographic ordering on all structures and objects we consider.

We write $\langle X \rangle_d$ for the set of all words in $\langle X \rangle$ of degree at most $d$, and we let $\mathbf{W}_d(X) \equiv \mathbf{W}_d$ be the column vector of words in $\langle X \rangle_d$, and $\mathbf{V}_d(X) \equiv \mathbf{V}_d$ the column vector of words of degree $d$. We also denote by $\mathbf{W}_d$ (resp. $\mathbf{V}_d$) the set of all entries of $\mathbf{W}_d(X)$ (resp. $\mathbf{V}_d(X)$). The length of $\mathbf{W}_d$ is equal to $s(d, n) := \sum_{i=0}^{d} n^i$, which we write as $s(d)$ when contextually appropriate. Given a polynomial $f \in \mathbb{R}\langle X \rangle_d$, let $\mathbf{f} = (f_w)_{w \in \mathbf{W}_d} \in \mathbb{R}^{s(d)}$ be its vector of coefficients. It is clear that every polynomial $f \in \mathbb{R}\langle X \rangle_d$ is of the form $f = \sum_{w \in \mathbf{W}_d} f_w w = \mathbf{f}^* \mathbf{W}_d = \mathbf{W}_d^* \mathbf{f}$. For $f \in \mathbb{R}\langle X \rangle$ let $\lceil f \rceil = \lceil \deg(f)/2 \rceil$, and given some $k \in \mathbb{N}$, we define $k_f := k - \lceil f \rceil$, e.g., $\mathbf{W}_{k - \lceil f \rceil} = \mathbf{W}_{k_f}$. We use standard notations on $\mathbb{R}^m$, i.e., given $\mathbf{a} \in \mathbb{R}^m$, $\|\mathbf{a}\|_2$ denotes the usual 2-norm of $\mathbf{a}$.

Let $\mathbb{S}_r$ denote the space of real symmetric matrices of size $r$; we will normally omit the subscript $r$ when we discuss matrices of arbitrary size, or if the size is clear from context. Given $\mathbf{A} \in \mathbb{S}$, $\mathbf{A}$ is positive semidefinite (psd) (resp. positive definite (pd)) if all eigenvalues of $\mathbf{A}$ are non-negative (resp. positive), and we write $\mathbf{A} \succeq 0$ (resp. $\mathbf{A} \succ 0$). We denote by $\operatorname{Tr}(\mathbf{A})$ the trace ($\sum_{i=1}^{r} A_{i,i}$) of the matrix $\mathbf{A} \in \mathbb{S}_r$, and $\operatorname{tr}(\mathbf{A}) = \frac{1}{r} \operatorname{Tr}(\mathbf{A})$ is the normalized trace. Let $\mathbb{S}_{+}$ (resp. $\mathbb{S}_{++}$) be the cone of psd (resp. pd) matrices. For a subset $\mathcal{S} \subseteq \mathbb{S}$, we define $\mathcal{S}_{+} := \mathcal{S} \cap \mathbb{S}_{+}$ and $\mathcal{S}_{++} := \mathcal{S} \cap \mathbb{S}_{++}$. We write $A = (\mathbf{A}_1, \dots, \mathbf{A}_n) \in \mathbb{S}^n$, and given $q \in \mathbb{R}\langle X \rangle$, by $q(A)$ we mean the evaluation of $q(X)$ on $A$, i.e., replacement of the nc letters $X_i$ with the matrices $\mathbf{A}_i$. We write $\operatorname{diag}(\mathbf{B}_1, \dots, \mathbf{B}_r)$ for the block diagonal matrix with diagonal blocks being $\mathbf{B}_i$.

Finally, given a positive $m \in \mathbb{N}$, we write $\mathbb{N}^{\ge m} = \{ m, m+1, \dots \}$, $[m] = \{ 1, \dots, m \}$, and we use $|\cdot|$ to denote the cardinality of a set.

Let $\mathfrak{g} = \{ g_0, \dots, g_m \}$ and $\mathfrak{h} = \{ h_1, \dots, h_\ell \}$ be subsets of $\operatorname{Sym} \mathbb{R}\langle X \rangle$, with the requirement that $g_0 = 1$, unless otherwise stated.
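The word bookkeeping above can be made concrete with a short sketch. It is only an illustration, not part of the authors' package: words are encoded as tuples of letter indices (the empty tuple standing for the empty word $1$), and the helper names `words_up_to`, `s` and `star` are hypothetical.

```python
from itertools import product

def words_up_to(n, d):
    """Words in X_1..X_n of degree <= d as tuples of letter indices,
    in graded lexicographic order; () is the empty word 1."""
    out = []
    for r in range(d + 1):
        # product() enumerates words of degree r in lexicographic order
        out.extend(product(range(1, n + 1), repeat=r))
    return out

def s(d, n):
    """Length of W_d: sum_{i=0}^{d} n^i."""
    return sum(n ** i for i in range(d + 1))

def star(w):
    """Involution on words: reverse the order of the letters."""
    return tuple(reversed(w))

W2 = words_up_to(2, 2)
print(W2)               # the entries of W_2 for n = 2
print(len(W2), s(2, 2)) # both counts agree
```

Encoding words as tuples keeps concatenation (`u + v`) and the involution cheap, which is convenient when indexing moment sequences by words.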

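The quadratic module generated by $\mathfrak{g}$ is the set
$$Q(\mathfrak{g}) := \Big\{ \sum_{i=0}^{m} \sum_{j=1}^{r_i} p_i^{(j)*} g_i\, p_i^{(j)} : r_i \in \mathbb{N}^{\ge 1},\ p_i^{(j)} \in \mathbb{R}\langle X \rangle \Big\}.$$
The ideal generated by the set $\mathfrak{h}$ is the set $I(\mathfrak{h}) := Q(\{ \pm h_1, \dots, \pm h_\ell \})$. The quadratic module associated to $\mathfrak{g} = \{ g_0 \}$ is the set of sums of Hermitian squares (SOHS).

Given $k \in \mathbb{N}$, the $k$th-order truncation of $Q(\mathfrak{g})$ (resp. $I(\mathfrak{h})$), denoted by $Q_k(\mathfrak{g})$ (resp. $I_k(\mathfrak{h})$), is the set of all polynomials in $Q(\mathfrak{g})$ (resp. $I(\mathfrak{h})$) with degree at most $2k$. Moreover, one has
$$Q_k(\mathfrak{g}) = \Big\{ \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) : \mathbf{G}_i \succeq 0 \Big\}, \qquad I_k(\mathfrak{h}) = \Big\{ \sum_{i=1}^{\ell} \operatorname{Tr}\big( \mathbf{H}_i \mathbf{W}_{k_{h_i}} h_i \mathbf{W}_{k_{h_i}}^* \big) : \mathbf{H}_i \in \mathbb{S} \Big\}.$$
We say that $Q(\mathfrak{g}) + I(\mathfrak{h})$ is Archimedean if for all $q \in \mathbb{R}\langle X \rangle$, there is a positive $R \in \mathbb{N}$ such that $R - q^* q \in Q(\mathfrak{g}) + I(\mathfrak{h})$.

We define the semialgebraic set associated to $\mathfrak{g}$ as
$$\mathcal{D}_{\mathfrak{g}} = \{ A \in \mathbb{S}^n : \forall g \in \mathfrak{g},\ g(A) \succeq 0 \}.$$
We can naturally extend this notion from matrix tuples of the same order to bounded self-adjoint operators on some Hilbert space $\mathcal{H}$ which make $g(A)$ psd for all $g \in \mathfrak{g}$. This extension is called the operator semialgebraic set associated to $\mathfrak{g}$, and we denote it as $\mathcal{D}^{\infty}_{\mathfrak{g}}$. Similarly, we define the variety associated to $\mathfrak{h}$ as
$$\mathcal{D}_{\mathfrak{h}} = \{ A \in \mathbb{S}^n : \forall h \in \mathfrak{h},\ h(A) = 0 \},$$
and the natural extension to the operator variety $\mathcal{D}^{\infty}_{\mathfrak{h}}$.

Suppose we have a truncated real valued sequence $\mathbf{y} = (y_w)_{w \in \mathbf{W}_{2d}}$. For each such sequence, we define the Riesz functional $L_{\mathbf{y}} : \mathbb{R}\langle X \rangle_{2d} \to \mathbb{R}$ as $L_{\mathbf{y}}(q) := \sum_{w} q_w y_w$ for $q = \sum_{w} q_w w \in \mathbb{R}\langle X \rangle_{2d}$. Suppose further that $\mathbf{y}$ satisfies $y_w = y_{w^*}$ for all $w \in \mathbf{W}_{2d}$. We associate to such $\mathbf{y}$ the nc Hankel matrix of order $d$, $\mathbf{M}_d(\mathbf{y})$, defined as $(\mathbf{M}_d(\mathbf{y}))_{u,v} = L_{\mathbf{y}}(u^* v)$, where $u, v \in \mathbf{W}_d$. Given $q \in \operatorname{Sym} \mathbb{R}\langle X \rangle$, we define the localizing matrix $\mathbf{M}_{d_q}(q\mathbf{y})$ as $(\mathbf{M}_{d_q}(q\mathbf{y}))_{u,v} = L_{\mathbf{y}}(u^* q v)$, where now $u, v \in \mathbf{W}_{d_q}$.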
Given $f \in \operatorname{Sym} \mathbb{R}\langle X \rangle$ and $\mathfrak{g}, \mathfrak{h} \subset \operatorname{Sym} \mathbb{R}\langle X \rangle$, the minimal eigenvalue of $f$ over $\mathcal{D}^{\infty}_{\mathfrak{g}} \cap \mathcal{D}^{\infty}_{\mathfrak{h}}$ is given by:
$$\lambda_{\min}(f, \mathfrak{g}, \mathfrak{h}) = \inf \big\{ \mathbf{v}^* f(A) \mathbf{v} : A \in \mathcal{D}^{\infty}_{\mathfrak{g}} \cap \mathcal{D}^{\infty}_{\mathfrak{h}},\ \|\mathbf{v}\|_2 = 1 \big\}. \tag{2.1}$$
We will assume that the eigenvalue minimization problem (EG) (2.1) has at least one global minimizer. We can approximate the solution of EG (2.1) from below with a hierarchy of converging SOHS relaxations [23], indexed by $k \in \mathbb{N}$:
$$\rho_k := \sup \{ \xi \in \mathbb{R} : f - \xi \in Q_k(\mathfrak{g}) + I_k(\mathfrak{h}) \}.$$
Each relaxation gives rise to the following SDP:
$$\rho_k = \sup_{\xi, \mathbf{G}_i, \mathbf{H}_j} \Big\{ \xi \ \Big|\ f - \xi = \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) + \sum_{j=1}^{\ell} \operatorname{Tr}\big( \mathbf{H}_j \mathbf{W}_{k_{h_j}} h_j \mathbf{W}_{k_{h_j}}^* \big),\ \mathbf{H}_j \in \mathbb{S} \text{ and } \mathbf{G}_i \succeq 0 \Big\}. \tag{2.2}$$
Our primary interest is in the dual formulation of this SDP, which can be stated as
$$\tau_k := \inf_{\mathbf{y} \in \mathbb{R}^{s(2k)}} \Big\{ L_{\mathbf{y}}(f) \ \Big|\ y_1 = 1,\ \mathbf{M}_k(\mathbf{y}) \succeq 0,\ \mathbf{M}_{k_{g_i}}(g_i \mathbf{y}) \succeq 0,\ i \in [m],\ \mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0,\ j \in [\ell] \Big\}. \tag{2.3}$$
Let $k_{\min} := \max \{ \lceil f \rceil, \lceil g_i \rceil, \lceil h_j \rceil : i \in [m],\ j \in [\ell] \}$. When $Q(\mathfrak{g}) + I(\mathfrak{h})$ is Archimedean, both $(\rho_k)_{k \in \mathbb{N}^{\ge k_{\min}}}$ and $(\tau_k)_{k \in \mathbb{N}^{\ge k_{\min}}}$ converge to $\lambda_{\min}(f, \mathfrak{g}, \mathfrak{h})$ due to an nc analog of Putinar's Positivstellensatz [8].

Let $f, \mathfrak{g}, \mathfrak{h}$ be as above. The minimal trace of $f$ over $\mathcal{D}_{\mathfrak{g}} \cap \mathcal{D}_{\mathfrak{h}}$ is
$$\operatorname{tr}_{\min}(f, \mathfrak{g}, \mathfrak{h}) = \inf \{ \operatorname{tr}(f(A)) : A \in \mathcal{D}_{\mathfrak{g}} \cap \mathcal{D}_{\mathfrak{h}} \}. \tag{2.4}$$
For trace optimization, we need some additional definitions that capture the specific properties of the $\operatorname{tr}$ operator. Let us start first with cyclic equivalence.
Given two polynomials $p, q \in \mathbb{R}\langle X \rangle$, we say that $p$ is cyclically equivalent to $q$ if $p - q$ is a sum of commutators, i.e., $p - q = \sum_{i=1}^{k} (u_i v_i - v_i u_i)$ for some $k \in \mathbb{N}$ and $u_i, v_i \in \mathbb{R}\langle X \rangle$, and we write $p \overset{\mathrm{cyc}}{\sim} q$. One can now define the cyclic quadratic module $Q^{\mathrm{cyc}}(\mathfrak{g})$ as the set of all polynomials $f \in \operatorname{Sym} \mathbb{R}\langle X \rangle$ which are cyclically equivalent to some element of $Q(\mathfrak{g})$ (see [5, Definition 1.56]).

We cannot in general work with the sets $\mathcal{D}^{\infty}_{\mathfrak{g}}$, $\mathcal{D}^{\infty}_{\mathfrak{h}}$, since the algebra of bounded operators over a Hilbert space $\mathcal{H}$ does not admit a trace if $\mathcal{H}$ is infinite dimensional. Instead we restrict to a certain subset of finite von Neumann algebras of type I and II, a subset of the algebra of bounded operators on $\mathcal{H}$, and we denote this by $\mathcal{D}^{\mathrm{II}}_{\mathfrak{g}}$. Then we consider the following relaxation of (2.4):
$$\operatorname{tr}^{\mathrm{II}}_{\min}(f, \mathfrak{g}, \mathfrak{h}) = \inf \big\{ \operatorname{tr}(f(A)) : A \in \mathcal{D}^{\mathrm{II}}_{\mathfrak{g}} \cap \mathcal{D}^{\mathrm{II}}_{\mathfrak{h}} \big\}. \tag{2.5}$$
A discussion of von Neumann algebras is beyond the scope of this article, and we refer the reader to [5, Definition 1.59] for more details. An SOHS relaxation hierarchy, indexed by $k \in \mathbb{N}$, for (2.5) can be written as
$$\rho^{\mathrm{tr}}_k := \sup \big\{ \xi \in \mathbb{R} : f - \xi \in Q^{\mathrm{cyc}}_k(\mathfrak{g}) + I^{\mathrm{cyc}}_k(\mathfrak{h}) \big\}, \tag{2.6}$$
which, once again, can be written and solved as an SDP. The dual formulation of this SDP, which is our primary interest, is
$$\tau^{\mathrm{tr}}_k := \inf_{\mathbf{y} \in \mathbb{R}^{s(2k)}} \Big\{ L_{\mathbf{y}}(f) \ \Big|\ y_1 = 1, \text{ and } y_u = y_v \text{ if } u \overset{\mathrm{cyc}}{\sim} v,\ \mathbf{M}_k(\mathbf{y}) \succeq 0,\ \mathbf{M}_{k_{g_i}}(g_i \mathbf{y}) \succeq 0,\ i \in [m],\ \mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0,\ j \in [\ell] \Big\}. \tag{2.7}$$
Compared to the relaxation (2.3) for EG, (2.7) has several additional linear constraints arising from cyclic equivalences. Under Archimedeanity of $Q(\mathfrak{g})$, $(\rho^{\mathrm{tr}}_k)_{k \in \mathbb{N}^{\ge k_{\min}}}$ is monotonically increasing, and converges to $\operatorname{tr}^{\mathrm{II}}_{\min}(f, \mathfrak{g}, \mathfrak{h})$, see [5, Corollary 5.5].

In this section we develop a framework which exploits CTP for NCPOPs. Our results below hold for both eigenvalue (2.1) and trace (2.5) minimization hierarchies (2.3) and (2.7), respectively. We provide sufficient conditions under which CTP is guaranteed, as well as simple linear programming methods to check these conditions. We conclude by examining some special cases.
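To illustrate the extra linear constraints $y_u = y_v$ in (2.7), the sketch below groups words into cyclic equivalence classes. It relies on the standard fact that two words are cyclically equivalent exactly when one is a cyclic rotation of the other. Words are encoded as tuples of letter indices, and the function names are hypothetical; the symmetry constraint $y_w = y_{w^*}$ is not taken into account here.

```python
from itertools import product

def cyclic_class(w):
    """Canonical representative of a word under cyclic rotation:
    the lexicographically smallest rotation of the tuple w."""
    if not w:
        return w
    return min(w[i:] + w[:i] for i in range(len(w)))

def cyc_equivalent(u, v):
    """True if the words u and v are cyclic rotations of each other."""
    return cyclic_class(u) == cyclic_class(v)

def num_cyclic_classes(n, d):
    """Number of cyclic classes among words of degree <= d in n letters."""
    words = [w for r in range(d + 1)
             for w in product(range(1, n + 1), repeat=r)]
    return len({cyclic_class(w) for w in words})

print(cyc_equivalent((1, 2, 2), (2, 1, 2)))        # rotations of each other
print(cyc_equivalent((1, 1, 2, 2), (1, 2, 1, 2)))  # not rotations
print(num_cyclic_classes(2, 2))                    # fewer classes than words
```

Each class corresponds to one free moment variable in the trace relaxation, so counting classes gives a rough idea of how much (2.7) shrinks compared with (2.3).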

We first give a precise definition of CTP for NCPOP. Recall the sets $\mathfrak{g}$ and $\mathfrak{h}$ from §2. For every $k \in \mathbb{N}^{\ge k_{\min}}$, define $s_k := \sum_{i=0}^{m} s(k_{g_i})$, and the set $\mathcal{S}^{(k)} \subseteq \mathbb{S}_{s_k}$ as
$$\mathcal{S}^{(k)} := \big\{ \mathbf{Y} \in \mathbb{S}_{s_k} : \mathbf{Y} = \operatorname{diag}(\mathbf{Y}_0, \dots, \mathbf{Y}_m), \text{ and each } \mathbf{Y}_i \in \mathbb{S}_{s(k_{g_i})} \big\}.$$
Letting $\mathbf{D}_k(\mathbf{y}) := \operatorname{diag}(\mathbf{M}_k(\mathbf{y}), \mathbf{M}_{k_{g_1}}(g_1 \mathbf{y}), \dots, \mathbf{M}_{k_{g_m}}(g_m \mathbf{y}))$, SDP (2.3) can be rewritten as
$$\tau_k = \inf_{\mathbf{y} \in \mathbb{R}^{s(2k)}} \Big\{ L_{\mathbf{y}}(f) \ \Big|\ y_1 = 1,\ \mathbf{D}_k(\mathbf{y}) \in \mathcal{S}^{(k)}_{+},\ \mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0,\ j \in [\ell] \Big\} \tag{3.1}$$
and we can similarly reformulate SDP (2.7) to
$$\tau^{\mathrm{tr}}_k := \inf_{\mathbf{y} \in \mathbb{R}^{s(2k)}} \Big\{ L_{\mathbf{y}}(f) \ \Big|\ y_1 = 1, \text{ and } y_u = y_v \text{ if } u \overset{\mathrm{cyc}}{\sim} v,\ \mathbf{D}_k(\mathbf{y}) \in \mathcal{S}^{(k)}_{+},\ \mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0,\ j \in [\ell] \Big\}. \tag{3.2}$$

Definition 3.1 (CTP). We say that an NCPOP has CTP if for every $k \in \mathbb{N}^{\ge k_{\min}}$, there exist $a_k > 0$ and $\mathbf{P}_k \in \mathcal{S}^{(k)}_{++}$ such that for all $\mathbf{y} \in \mathbb{R}^{s(2k)}$,
$$\left. \begin{array}{l} \mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0,\ j \in [\ell], \\ y_1 = 1 \end{array} \right\} \Rightarrow \operatorname{Tr}(\mathbf{P}_k^* \mathbf{D}_k(\mathbf{y}) \mathbf{P}_k) = a_k.$$
In other words, we say that NCPOP (2.1) or (2.5) has CTP if each dual SDP relaxation (3.1) or (3.2) has an equivalent form involving a psd matrix whose trace is constant. In this case, $a_k$ is the constant trace and $\mathbf{P}_k$ is the change of basis matrix. The next proposition is an example of an NCPOP which has CTP.

Proposition 3.2 (nc polydisc equality). Let $m = 0$, $\ell \ge n$ and $h_j = X_j^2 - 1$, for $j \in [n]$. Then
$$\left. \begin{array}{l} \mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0,\ j \in [\ell], \\ y_1 = 1 \end{array} \right\} \Rightarrow \operatorname{Tr}(\mathbf{D}_k(\mathbf{y})) = s(k). \tag{3.3}$$

Proof. Note that $\mathbf{D}_k(\mathbf{y}) = \mathbf{M}_k(\mathbf{y})$ since $\mathfrak{g} = \{ 1 \}$. Suppose that $\mathbf{M}_{k_{h_j}}(h_j \mathbf{y}) = 0$, $j \in [\ell]$, and $y_1 = 1$. This implies that for every $j \in [n]$, the diagonal of $\mathbf{M}_{k_{h_j}}(h_j \mathbf{y})$ is zero, i.e., $L_{\mathbf{y}}(u^* (X_j^2 - 1) u) = 0$ for all $u \in \mathbf{W}_{k-1}$. This now implies, for every $w = X_{i_1} \cdots X_{i_r} \in \mathbf{W}_k$,
$$\begin{aligned} y_{w^* w} = L_{\mathbf{y}}(w^* w) &= L_{\mathbf{y}}(X_{i_r} \cdots X_{i_1} X_{i_1} \cdots X_{i_r}) \\ &= L_{\mathbf{y}}(X_{i_r} \cdots X_{i_2} (X_{i_1}^2 - 1) X_{i_2} \cdots X_{i_r}) + L_{\mathbf{y}}(X_{i_r} \cdots X_{i_2} X_{i_2} \cdots X_{i_r}) \\ &= L_{\mathbf{y}}(X_{i_r} \cdots X_{i_2} X_{i_2} \cdots X_{i_r}) = \cdots = L_{\mathbf{y}}(X_{i_r} X_{i_r}) \\ &= L_{\mathbf{y}}(X_{i_r}^2 - 1) + L_{\mathbf{y}}(1) = y_1 = 1. \end{aligned}$$
This yields $\operatorname{Tr}(\mathbf{M}_k(\mathbf{y})) = \sum_{w \in \mathbf{W}_k} y_{w^* w} = s(k)$. □

A general solution method for solving NCPOPs which satisfy CTP can be described as follows. We first convert the $k$-th order relaxation (3.1) or (3.2) to a standard (primal) SDP with CTP and then leverage appropriate first-order algorithms, such as CGAL [32] or the spectral method (SM) [17, Appendix A.3], which exploit CTP to solve the SDP.

For a detailed exposition on how the SDP (3.1) or (3.2) can be converted to a standard (primal) form, the reader is invited to consult [18]. There one will also find explanations of how the primal and dual forms of the SDP are related, and their use with CGAL/SM.
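Proposition 3.2 can be sanity-checked numerically. The sketch below is an illustration under assumptions of our own choosing: it uses $2 \times 2$ symmetric reflection matrices $\mathbf{A}_j$ (which satisfy $\mathbf{A}_j^2 = \mathbf{I}$, so every $h_j = X_j^2 - 1$ vanishes on them), a unit vector $\mathbf{v}$, and the moment sequence $y_w = \mathbf{v}^* w(A) \mathbf{v}$. Letters are 0-indexed and all helper names are hypothetical.

```python
import math
from itertools import product

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def reflection(theta):
    """Symmetric 2x2 matrix A with A*A = I, so X_j^2 - 1 vanishes on it."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[c, sn], [sn, -c]]

def evaluate(word, mats):
    """w(A): replace each letter index i in the word by mats[i] and multiply."""
    M = [[1.0, 0.0], [0.0, 1.0]]
    for i in word:
        M = matmul(M, mats[i])
    return M

def moment(word, mats, v):
    """y_w = v^T w(A) v."""
    M = evaluate(word, mats)
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def hankel_trace(n, k, mats, v):
    """Tr(M_k(y)) = sum of the diagonal entries y_{w^* w}, w of degree <= k."""
    total = 0.0
    for r in range(k + 1):
        for w in product(range(n), repeat=r):
            total += moment(tuple(reversed(w)) + w, mats, v)
    return total

n, k = 2, 3
mats = [reflection(0.3), reflection(1.1)]
v = [math.cos(0.7), math.sin(0.7)]          # unit vector, so y_1 = 1
s_k = sum(n ** i for i in range(k + 1))     # s(k) = 1 + n + ... + n^k
print(hankel_trace(n, k, mats, v), s_k)     # agree up to rounding
```

The trace does not depend on the angles or on $\mathbf{v}$, which is exactly the constant trace behaviour that (3.3) asserts for feasible moment sequences.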

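We now provide a sufficient condition for NCPOP to satisfy CTP. For $k \in \mathbb{N}^{\ge k_{\min}}$, let $Q^{\circ}_k(\mathfrak{g})$ be the interior of the truncated quadratic module $Q_k(\mathfrak{g})$, i.e.,
$$Q^{\circ}_k(\mathfrak{g}) := \Big\{ \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) : \mathbf{G}_i \succ 0 \Big\}.$$

Theorem 3.3. Suppose that for every $k \in \mathbb{N}$, the following inclusion holds:
$$\mathbb{R}^{>0} \subset Q^{\circ}_k(\mathfrak{g}). \tag{3.4}$$
Then NCPOP (2.1) and (2.4) satisfy CTP.

Proof. Let $k \in \mathbb{N}^{\ge k_{\min}}$ and $a_k > 0$ be such that $a_k \in Q^{\circ}_k(\mathfrak{g})$. Then we can write
$$a_k = \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big), \tag{3.5}$$
with each $\mathbf{G}_i \in \mathbb{S}_{++}$. We denote by $\mathbf{G}_i^{1/2}$ the square root of $\mathbf{G}_i$. Set $\mathbf{P}_k = \operatorname{diag}(\mathbf{G}_0^{1/2}, \dots, \mathbf{G}_m^{1/2})$. From this and (3.5), and since $L_{\mathbf{y}}(a_k) = a_k y_1 = a_k$ whenever $y_1 = 1$,
$$a_k = L_{\mathbf{y}}\Big( \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) \Big) = \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{M}_{k_{g_i}}(g_i \mathbf{y}) \big) = \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i^{1/2} \mathbf{M}_{k_{g_i}}(g_i \mathbf{y}) \mathbf{G}_i^{1/2} \big) = \operatorname{Tr}(\mathbf{P}_k \mathbf{D}_k(\mathbf{y}) \mathbf{P}_k). \qquad \square$$

The following lemmas will be used later on.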

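Lemma 3.4.

For every $r \in \mathbb{N}^{\ge 1}$, there exists a positive real sequence $(c^{(r-1)}_u)_{u \in \mathbf{W}_{r-1}}$ such that
$$\sum_{w \in \mathbf{V}_r} w^* w = 1 + \sum_{u \in \mathbf{W}_{r-1}} c^{(r-1)}_u\, u^* \Big( \sum_{j \in [n]} X_j^2 - 1 \Big) u. \tag{3.6}$$

Proof.

We prove (3.6) by induction on $r$. One has $\sum_{w \in \mathbf{V}_1} w^* w = \sum_{j \in [n]} X_j^2 = 1 + \big( \sum_{j \in [n]} X_j^2 - 1 \big)$ since $\mathbf{V}_1 = (X_j)_{j \in [n]}$, so (3.6) holds for $r = 1$ with $c^{(0)}_1 = 1$. Assume that (3.6) is true with $r = t$. We claim that (3.6) is true with $r = t + 1$ and
$$\forall u \in \mathbf{W}_t, \quad c^{(t)}_u := \begin{cases} c^{(t-1)}_u & \text{if } u \in \mathbf{W}_{t-1}, \\ 1 & \text{otherwise.} \end{cases} \tag{3.7}$$
Indeed, it holds that
$$\begin{aligned} \sum_{w \in \mathbf{V}_{t+1}} w^* w &= \sum_{i_1, \dots, i_{t+1} \in [n]} X_{i_1} \cdots X_{i_t} X_{i_{t+1}}^2 X_{i_t} \cdots X_{i_1} \\ &= \sum_{i_1, \dots, i_{t+1} \in [n]} X_{i_1} \cdots X_{i_t} \big( X_{i_{t+1}}^2 - 1/n \big) X_{i_t} \cdots X_{i_1} + \frac{1}{n} \sum_{i_1, \dots, i_{t+1} \in [n]} X_{i_1} \cdots X_{i_t} X_{i_t} \cdots X_{i_1} \\ &= \sum_{i_1, \dots, i_t \in [n]} X_{i_1} \cdots X_{i_t} \Big( \sum_{i_{t+1} \in [n]} X_{i_{t+1}}^2 - 1 \Big) X_{i_t} \cdots X_{i_1} + \sum_{\tilde{w} \in \mathbf{V}_t} \tilde{w}^* \tilde{w} \\ &= \sum_{v \in \mathbf{V}_t} v^* \Big( \sum_{j \in [n]} X_j^2 - 1 \Big) v + 1 + \sum_{u \in \mathbf{W}_{t-1}} c^{(t-1)}_u\, u^* \Big( \sum_{j \in [n]} X_j^2 - 1 \Big) u, \end{aligned}$$
where the latter equality is due to the induction assumption. In the third line, the sum over $i_{t+1}$ of the second term contributes $n$ identical copies, which cancels the factor $1/n$ and leaves the sum over words of degree exactly $t$. □

Lemma 3.5.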

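For every $k \in \mathbb{N}^{\ge 1}$, there exists a positive real sequence $(d^{(k-1)}_u)_{u \in \mathbf{W}_{k-1}}$ such that
$$\sum_{w \in \mathbf{W}_k} w^* w = 1 + k + \sum_{u \in \mathbf{W}_{k-1}} d^{(k-1)}_u\, u^* \Big( \sum_{j \in [n]} X_j^2 - 1 \Big) u. \tag{3.8}$$

Proof.

Let $k \in \mathbb{N}^{\ge 1}$. From Lemma 3.4, we obtain that
$$\sum_{w \in \mathbf{W}_k} w^* w = 1 + \sum_{r \in [k]} \sum_{w \in \mathbf{V}_r} w^* w = 1 + k + \sum_{r \in [k]} \sum_{u \in \mathbf{W}_{r-1}} c^{(r-1)}_u\, u^* \Big( \sum_{j \in [n]} X_j^2 - 1 \Big) u,$$
yielding the selection $d^{(k-1)}_u = \sum_{r = \deg(u)+1}^{k} c^{(r-1)}_u$, for $u \in \mathbf{W}_{k-1}$, in (3.8), since $u \in \mathbf{W}_{r-1}$ exactly when $r \ge \deg(u) + 1$. Hence the desired result follows. □

The next result shows that CTP is satisfied whenever an NCPOP involves a ball constraint. For a real symmetric matrix $\mathbf{A}$, denote the largest eigenvalue of $\mathbf{A}$ by $\lambda_{\max}(\mathbf{A})$.

Theorem 3.6. If $1 - \sum_{j \in [n]} X_j^2 \in \mathfrak{g}$ then the inclusions (3.4) hold and therefore NCPOP (2.1) and (2.4) have CTP.

Proof.

Without loss of generality, set $g_m := 1 - \sum_{j \in [n]} X_j^2$ and let $k \in \mathbb{N}^{\ge k_{\min}}$ be fixed. By Lemma 3.5,
$$a_k = \operatorname{Tr}(\mathbf{W}_k \mathbf{W}_k^*) + \operatorname{Tr}(\mathbf{G}_m \mathbf{W}_{k-1} g_m \mathbf{W}_{k-1}^*),$$
where $a_k = 1 + k$ and $\mathbf{G}_m = \operatorname{diag}((d^{(k-1)}_u)_{u \in \mathbf{W}_{k-1}})$ is pd. Denote by $\mathbf{I}_t$ the identity matrix of size $s(t)$ for $t \in \mathbb{N}$. Let $\mathbf{U}$ be a real symmetric matrix such that
$$\sum_{i=1}^{m-1} \operatorname{Tr}\big( \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) = \operatorname{Tr}(\mathbf{U} \mathbf{W}_k \mathbf{W}_k^*).$$
Let $\delta > 0$ be such that $\mathbf{I}_k - \delta \mathbf{U} \succ 0$, namely, $\delta = 1/(|\lambda_{\max}(\mathbf{U})| + 1)$. Set $\mathbf{G}_0 := \mathbf{I}_k - \delta \mathbf{U}$. Then
$$a_k = \operatorname{Tr}(\mathbf{G}_0 \mathbf{W}_k \mathbf{W}_k^*) + \delta \sum_{i=1}^{m-1} \operatorname{Tr}\big( \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) + \operatorname{Tr}(\mathbf{G}_m \mathbf{W}_{k-1} g_m \mathbf{W}_{k-1}^*),$$
which implies $a_k \in Q^{\circ}_k(\mathfrak{g})$. Since $Q^{\circ}_k(\mathfrak{g})$ is closed under multiplication by positive scalars, this gives the desired result. □

Even though this is not of crucial interest in the context of this paper, we mention that Theorem 3.6 can be used to prove that strong duality holds for the primal-dual pair (2.2)-(2.3) for all $k \ge k_{\min}$ (see also [28, Theorem 3.6], which is an nc analog of [10]). The following corollary states that polynomials positive definite on a semialgebraic set belong to the interior of the truncated quadratic module for a sufficiently large truncation order when an nc ball constraint is present.

Corollary 3.7.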

Assume that $Q(\mathfrak{g})$ is Archimedean. Let $q \in \operatorname{Sym} \mathbb{R}\langle X \rangle$ be such that $q(A) \succ 0$ for all $A \in \mathcal{D}_{\mathfrak{g}}$. If $1 - \sum_{j \in [n]} X_j^2 \in \mathfrak{g}$, then $q \in Q^{\circ}_k(\mathfrak{g})$ for $k$ sufficiently large.

Proof.

Let $1 - \sum_{j \in [n]} X_j^2 \in \mathfrak{g}$. Then for all $A = (\mathbf{A}_1, \dots, \mathbf{A}_n) \in \mathcal{D}_{\mathfrak{g}}$, $\mathbf{I} - \sum_{j \in [n]} \mathbf{A}_j^2 \succeq 0$, so $\mathbf{I} \succeq \mathbf{A}_j^2$, $j \in [n]$, where $\mathbf{I}$ is the identity matrix. It implies that $\mathcal{D}_{\mathfrak{g}}$ is bounded. Thus there exists a small enough $\varepsilon > 0$ such that $(q - \varepsilon)(A) = q(A) - \varepsilon \mathbf{I} \succ 0$ for all $A \in \mathcal{D}_{\mathfrak{g}}$. By using the nc analog of Putinar's Positivstellensatz [5, Theorem 1.32], there exists $\tilde{k} \in \mathbb{N}$ such that $q - \varepsilon \in Q_k(\mathfrak{g})$ for all $k \ge \tilde{k}$. Let $k \in \mathbb{N}^{\ge \tilde{k}}$ be fixed. By Theorem 3.6, $\varepsilon \in Q^{\circ}_k(\mathfrak{g})$ and therefore $q = (q - \varepsilon) + \varepsilon \in Q^{\circ}_k(\mathfrak{g})$, which yields the desired conclusion. □

Remark 3.8. Combining the proof of Theorem 3.6 with the proof of Theorem 3.3, one can obtain explicit expressions for $a_k$ and $\mathbf{P}_k$ in Definition 3.1. Namely, $a_k = 1 + k$ and
$$\mathbf{P}_k = \operatorname{diag}\big( \mathbf{G}_0^{1/2}, \sqrt{\delta}\, \mathbf{I}_{k_{g_1}}, \dots, \sqrt{\delta}\, \mathbf{I}_{k_{g_{m-1}}}, \mathbf{G}_m^{1/2} \big).$$
However, in our experience this choice leads to poor numerical properties. In the next section we provide a hierarchy of linear programs (LPs) inspired from the inclusions (3.4), to obtain the constant trace $a_k$ and the change of basis matrix $\mathbf{P}_k$ which achieve a better numerical performance.

For any $k \in \mathbb{N}^{\ge k_{\min}}$, let $\widehat{\mathcal{S}}^{(k)}$ be the set of real diagonal matrices of size $s(k)$ and consider the following linear program (LP)
$$\inf_{\xi, \mathbf{G}_i, \mathbf{H}_j} \Big\{ \xi \ \Big|\ \xi = \sum_{i=0}^{m} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}_{k_{g_i}} g_i \mathbf{W}_{k_{g_i}}^* \big) + \sum_{j=1}^{\ell} \operatorname{Tr}\big( \mathbf{H}_j \mathbf{W}_{k_{h_j}} h_j \mathbf{W}_{k_{h_j}}^* \big),\ \mathbf{G}_i - \mathbf{I}_i \in \widehat{\mathcal{S}}^{(k_{g_i})}_{+},\ i \in \{0\} \cup [m] \Big\}, \tag{3.9}$$
where $\mathbf{I}_i$ is the identity matrix of size $s(k_{g_i})$ for $i \in \{0\} \cup [m]$.

Lemma 3.9.

If LP (3.9) has a feasible solution $(\xi_k, \mathbf{G}_{i,k}, \mathbf{H}_{j,k})$ for every $k \in \mathbb{N}^{\ge k_{\min}}$, then NCPOP (2.1) and (2.4) have CTP with $a_k = \xi_k$ and $\mathbf{P}_k = \operatorname{diag}(\mathbf{G}_{0,k}^{1/2}, \dots, \mathbf{G}_{m,k}^{1/2})$.

The proof of Lemma 3.9 is similar to that of Theorem 3.3 with $a_k = \xi_k$ and $\mathbf{G}_i = \mathbf{G}_{i,k}$, $i \in \{0\} \cup [m]$. We provide in the following proposition a more detailed description of some feasible solutions to (3.9) in the special cases of the nc polydisc and the nc ball.

Proposition 3.10. Suppose either $\mathfrak{g} = \big\{ 1, 1 - \sum_{i \in [n]} X_i^2 \big\}$ or that $\mathfrak{g} = \big\{ \tfrac{1}{n} - X_i^2 : i \in [n] \big\} \cup \{ 1 \}$. Then LP (3.9) has a feasible solution for every $k \in \mathbb{N}^{\ge k_{\min}}$, and therefore NCPOP (2.1) and (2.4) satisfy CTP.

Proof.

It is sufficient in both cases to show that (3.9) has a feasible solution for every $k \in \mathbb{N}^{\ge k_{\min}}$.

Let $m = 1$ and $g_1 = 1 - \sum_{j \in [n]} X_j^2$. By Lemma 3.5, $a_k = \operatorname{Tr}(\mathbf{W}_k \mathbf{W}_k^*) + \operatorname{Tr}(\mathbf{G}_1 \mathbf{W}_{k-1} g_1 \mathbf{W}_{k-1}^*)$, where $\mathbf{G}_1 = \operatorname{diag}((d^{(k-1)}_u)_{u \in \mathbf{W}_{k-1}})$ is pd. Denote by $\mathbf{I}_{s(k)}$ the identity matrix of size $s(k)$. Thus with large enough $r > 0$, $(r a_k, (r \mathbf{I}_{s(k)}, r \mathbf{G}_1), 0)$ is a feasible solution of (3.9), for every $k \in \mathbb{N}^{\ge k_{\min}}$.

On the other hand, if $m = n$ and $g_j = \tfrac{1}{n} - X_j^2$, for $j \in [n]$, then by Lemma 3.5,
$$a_k = \operatorname{Tr}(\mathbf{W}_k \mathbf{W}_k^*) + \operatorname{Tr}\Big( \mathbf{G} \mathbf{W}_{k-1} \Big( \sum_{j \in [n]} g_j \Big) \mathbf{W}_{k-1}^* \Big) = \operatorname{Tr}(\mathbf{W}_k \mathbf{W}_k^*) + \sum_{j \in [n]} \operatorname{Tr}(\mathbf{G} \mathbf{W}_{k-1} g_j \mathbf{W}_{k-1}^*),$$
where the diagonal matrix $\mathbf{G} = \operatorname{diag}((d^{(k-1)}_u)_{u \in \mathbf{W}_{k-1}})$ is pd. Thus with large enough $r > 0$, $(r a_k, (r \mathbf{I}_{s(k)}, r \mathbf{G}), 0)$, with $r \mathbf{G}$ used as the multiplier of every $g_j$, is a feasible solution of (3.9), for every $k \in \mathbb{N}^{\ge k_{\min}}$. □

Since small constant traces are highly desirable for efficiency of first-order algorithms (e.g. CGAL), we search for an optimal solution of LP (3.9) instead of just a feasible solution.
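The identity behind the ball case, $1 + k = \sum_{w \in \mathbf{W}_k} w^* w + \sum_{u \in \mathbf{W}_{k-1}} d_u\, u^* g\, u$ with $g = 1 - \sum_j X_j^2$, can be checked symbolically on small instances. The sketch below is an illustration only: nc polynomials are stored as dictionaries from words (tuples of 0-indexed letters) to integer coefficients, and it uses the explicit weights $d_u = k - \deg(u)$, which is an assumption of this example (one positive choice for which the identity can be verified directly), not a formula quoted from the paper.

```python
from collections import defaultdict
from itertools import product

def words(n, r):
    """All words of degree exactly r in n letters, as tuples."""
    return list(product(range(n), repeat=r))

def ball_identity_holds(n, k):
    """Check 1 + k == sum_{w in W_k} w^* w + sum_{u in W_{k-1}} d_u u^* g u
    for g = 1 - sum_j X_j^2 and the assumed weights d_u = k - deg(u)."""
    rhs = defaultdict(int)
    # First sum: w^* w for every word w of degree <= k
    for r in range(k + 1):
        for w in words(n, r):
            rhs[tuple(reversed(w)) + w] += 1
    # Second sum: d_u * u^* (1 - sum_j X_j^2) u for deg(u) <= k - 1
    for r in range(k):
        d_u = k - r
        for u in words(n, r):
            u_star = tuple(reversed(u))
            rhs[u_star + u] += d_u                  # d_u * u^* u
            for j in range(n):
                rhs[u_star + (j, j) + u] -= d_u     # -d_u * u^* X_j^2 u
    nonzero = {w: c for w, c in rhs.items() if c != 0}
    # Only the constant term (empty word) should survive, with value 1 + k
    return nonzero == {(): 1 + k}

print(ball_identity_holds(2, 3))
```

All non-constant words cancel in a telescoping fashion, which is why the constant trace $a_k = 1 + k$ depends on the relaxation order but not on the number of variables.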

Algorithm 1 below solves EG (2.1) where CTP can be verified by LP. A similar algorithm solves NCPOP (2.4).

Algorithm 1 SpecialEP-CTP

Input: EG (2.1) and a relaxation order $k \in \mathbb{N}^{\ge k_{\min}}$
Output: The optimal value $\tau_k$ of SDP (3.1)
1: Solve LP (3.9) to obtain an (optimal) solution $(\xi_k, \mathbf{G}_{i,k}, \mathbf{H}_{j,k})$;
2: Let $a_k = \xi_k$ and $\mathbf{P}_k = \operatorname{diag}(\mathbf{G}_{0,k}^{1/2}, \dots, \mathbf{G}_{m,k}^{1/2})$;
3: Compute the optimal value $\tau_k$ of SDP (3.1) by running an algorithm based on first-order methods which exploits CTP.

Two examples of algorithms based on first-order methods which exploit CTP are CGAL [32] and SM [17, Appendix A.3].
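One reason a constant trace suits conditional-gradient-type methods is that minimizing a linear function $\langle \mathbf{C}, \mathbf{X} \rangle$ over $\{ \mathbf{X} \succeq 0,\ \operatorname{Tr}(\mathbf{X}) = a \}$ only requires an extreme eigenpair: the minimum is $a\, \lambda_{\min}(\mathbf{C})$, attained at $\mathbf{X} = a\, \mathbf{v}\mathbf{v}^*$ with $\mathbf{v}$ a unit eigenvector for $\lambda_{\min}(\mathbf{C})$. The sketch below illustrates this single step with a plain power iteration; it is not the CGAL or SM implementation used in the paper, the function names are hypothetical, and the fixed starting vector is an assumption that could fail if it happened to be orthogonal to the sought eigenvector.

```python
import math

def min_eigenpair(C, iters=500):
    """Approximate smallest eigenpair of a symmetric matrix C by power
    iteration on shift*I - C, with shift > lambda_max(C) (Gershgorin bound)."""
    n = len(C)
    shift = 1.0 + max(sum(abs(x) for x in row) for row in C)
    v = [float(i + 1) for i in range(n)]   # fixed, non-symmetric start
    for _ in range(iters):
        w = [shift * v[i] - sum(C[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * C[i][j] * v[j] for i in range(n) for j in range(n))
    return lam, v

def linear_minimization(C, a):
    """Minimize <C, X> over {X psd, Tr(X) = a}; returns (value, X)."""
    lam, v = min_eigenpair(C)
    X = [[a * vi * vj for vj in v] for vi in v]
    return a * lam, X

C = [[2.0, 1.0], [1.0, 2.0]]      # eigenvalues 1 and 3
value, X = linear_minimization(C, 3.0)
print(value, X[0][0] + X[1][1])   # about 3 * lambda_min(C), and Tr(X) = 3
```

In a large-scale setting the power iteration would be replaced by a Lanczos or Arnoldi routine, which is the role an eigensolver such as Arpack plays in the experiments reported later.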

In this section, we show that the CTP can analogously be exploited for EG (2.1) with sparse nc polynomials. For brevity, we focus on EG, and show only the framework for correlative sparsity (CS) [11]; however, the trace minimization setting, as well as the frameworks for term sparsity (TS) and correlative-term sparsity (CS-TS) [28], are very similar. The proofs are very similar to those presented in §3, and so we omit them.

To begin with, we define CS and present the associated approximation hierarchies for EG (2.1) satisfying CS, which were initially proposed in [11, 27].

For $w = X_{i_1} \cdots X_{i_r}$, let $\operatorname{var}(w) := \{ i_1, \dots, i_r \}$. For $I \subseteq [n]$, let $X(I) := \{ X_j : j \in I \}$ and $\mathbf{W}^I_d(X) := \{ w \in \mathbf{W}_d(X) : \operatorname{var}(w) \subseteq I \}$ with length $s(d, |I|) := \sum_{i=0}^{d} |I|^i$. Similarly, we write $\mathbf{V}^I_d(X) := \{ w \in \mathbf{V}_d(X) : \operatorname{var}(w) \subseteq I \}$. Given $\mathbf{y} = (y_w)_{w \in \mathbf{W}_{2d}}$ and $I \subseteq [n]$, the nc Hankel submatrix associated to $I$ of order $d$ is defined as
$$(\mathbf{M}_d(\mathbf{y}, I))_{u,v} := L_{\mathbf{y}}(u^* v), \quad \text{for } u, v \in \mathbf{W}^I_d,$$
and for $q \in \mathbb{R}\langle X(I) \rangle$, the localizing (sub)matrix is
$$(\mathbf{M}_{d_q}(q\mathbf{y}, I))_{u,v} := L_{\mathbf{y}}(u^* q v), \quad \text{for } u, v \in \mathbf{W}^I_{d_q}.$$
Assume that $\{ I_j \}_{j \in [p]}$ (with $n_j := |I_j|$) are the maximal cliques of (a chordal extension of) the correlative sparsity pattern (csp) graph associated to EG (2.1), as defined in [11, 27]. Let $\{ J_j \}_{j \in [p]}$ (resp. $\{ W_j \}_{j \in [p]}$) be a partition of $[m]$ (resp. $[\ell]$) such that for all $i \in J_j$, $g_i \in \mathbb{R}\langle X(I_j) \rangle$ (resp. $i \in W_j$, $h_i \in \mathbb{R}\langle X(I_j) \rangle$), for every $j \in [p]$. For each $j \in [p]$, let $m_j := |J_j|$, $l_j := |W_j|$ and $\mathfrak{g}_{J_j} := \{ g_i : i \in J_j \}$, $\mathfrak{h}_{W_j} := \{ h_i : i \in W_j \}$. Then $Q(\mathfrak{g}_{J_j})$ (resp. $I(\mathfrak{h}_{W_j})$) is a quadratic module (resp. an ideal) in $\mathbb{R}\langle X(I_j) \rangle$, for each $j \in [p]$.

For each $k \in \mathbb{N}^{\ge k_{\min}}$, consider the hierarchy of sparse SOHS relaxations
$$\rho^{\mathrm{cs}}_k := \sup \Big\{ \xi : f - \xi \in \sum_{j \in [p]} \big( Q(\mathfrak{g}_{J_j})_k + I(\mathfrak{h}_{W_j})_k \big) \Big\}. \tag{4.1}$$
This relaxation can be stated as a primal SDP similar to (2.2), but we are mostly interested in the dual SDP, which can be stated as follows:
$$\tau^{\mathrm{cs}}_k := \inf_{\mathbf{y} \in \mathbb{R}^{s(2k)}} \Big\{ L_{\mathbf{y}}(f) \ \Big|\ y_1 = 1,\ \mathbf{M}_k(\mathbf{y}, I_j) \succeq 0,\ j \in [p],\ \mathbf{M}_{k_{g_i}}(g_i \mathbf{y}, I_j) \succeq 0,\ i \in J_j,\ j \in [p],\ \mathbf{M}_{k_{h_i}}(h_i \mathbf{y}, I_j) = 0,\ i \in W_j,\ j \in [p] \Big\}. \tag{4.2}$$
It is shown in [11, Corollary 6.6] that the optimal values of the primal-dual SDP pair arising from (4.1) are guaranteed to converge to the optimal value if there are ball constraints present on each clique of variables.

Consider EG (2.1) with CS described in §4.1. Given $j \in [p]$, $k \in \mathbb{N}^{\ge k_{\min}}$, and $\mathbf{y} \in \mathbb{R}^{s(2k)}$, let
$$\mathbf{D}_k(\mathbf{y}, I_j) := \operatorname{diag}\big( \mathbf{M}_k(\mathbf{y}, I_j), (\mathbf{M}_{k_{g_i}}(g_i \mathbf{y}, I_j))_{i \in J_j} \big),$$
with size denoted by $s_{k,j}$. Then SDP (4.2) can be rewritten as
$$\tau^{\mathrm{cs}}_k = \inf_{\mathbf{y} \in \mathbb{R}^{s(2k)}} \Big\{ L_{\mathbf{y}}(f) \ \Big|\ y_1 = 1,\ \mathbf{D}_k(\mathbf{y}, I_j) \succeq 0,\ j \in [p],\ \mathbf{M}_{k_{h_i}}(h_i \mathbf{y}, I_j) = 0,\ i \in W_j,\ j \in [p] \Big\}. \tag{4.3}$$
As in the dense case, let us define
$$\mathcal{S}^{(k,j)} := \big\{ \mathbf{Y} \in \mathbb{S}_{s_{k,j}} : \mathbf{Y} = \operatorname{diag}(\mathbf{Y}_0, (\mathbf{Y}_i)_{i \in J_j}),\ \mathbf{Y}_0 \in \mathbb{S}_{s(k, n_j)} \text{ and each } \mathbf{Y}_i \in \mathbb{S}_{s(k_{g_i}, n_j)} \big\}.$$
We define CTP for EG with CS as follows.

Definition 4.1 (CS-CTP). We say that EG (2.1) with CS has CTP if for every $k \in \mathbb{N}^{\ge k_{\min}}$ and for every $j \in [p]$, there exist a positive number $a^{(j)}_k$ and $\mathbf{P}^{(j)}_k \in \mathcal{S}^{(k,j)}_{++}$ such that for all $\mathbf{y} \in \mathbb{R}^{s(2k)}$,
$$\left. \begin{array}{l} \mathbf{M}_{k_{h_i}}(h_i \mathbf{y}, I_j) = 0,\ i \in W_j, \\ y_1 = 1 \end{array} \right\} \Rightarrow \operatorname{Tr}\big( \mathbf{P}^{(j)}_k \mathbf{D}_k(\mathbf{y}, I_j) \mathbf{P}^{(j)}_k \big) = a^{(j)}_k.$$
The following result provides a sufficient condition for an EG (2.1) with CS to satisfy CTP.

Theorem 4.2.

Assume that there is an nc ball constraint on each clique of variables, i.e., $1 - \sum_{i \in I_j} X_i^2 \in \mathfrak{g}_{J_j}$, for every $j \in [p]$. Then one has $\mathbb{R}^{>0} \subseteq Q^{\circ}_k(\mathfrak{g}_{J_j})$, for all $k \in \mathbb{N}^{\ge k_{\min}}$ and for all $j \in [p]$. As a consequence, EG (2.1) has CTP.

A proof of Theorem 4.2 can be obtained in a similar fashion to §3.2, by considering each clique of variables.
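To give a feel for what the clique-wise formulation buys, the sketch below compares the size $s(k, n)$ of the dense nc Hankel matrix with the sizes $s(k, n_j)$ of the clique submatrices $\mathbf{M}_k(\mathbf{y}, I_j)$. The chain-structured clique family is a hypothetical example chosen for illustration, not one of the paper's test problems, and the function names are assumptions.

```python
def s(d, n):
    """Number of words of degree <= d in n letters: sum_{i=0}^{d} n^i."""
    return sum(n ** i for i in range(d + 1))

def clique_block_sizes(cliques, k):
    """Size s(k, |I_j|) of the nc Hankel submatrix M_k(y, I_j) per clique."""
    return [s(k, len(I)) for I in cliques]

# Hypothetical chain structure: 10 variables, overlapping cliques of size 3
n, k = 10, 2
cliques = [set(range(j, j + 3)) for j in range(1, n - 1)]

dense = s(k, n)
sparse = clique_block_sizes(cliques, k)
print(dense)         # size of the single dense psd block
print(max(sparse))   # size of the largest clique block
print(len(sparse))   # number of clique blocks
```

With a ball constraint on every clique, Theorem 4.2 then yields one constant trace $a^{(j)}_k$ per block, so a first-order solver only ever handles matrices of the smaller clique size.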

As in the dense case, given an EG (2.1) with CS, we can verify if CS-CTP is satisfied via a hierarchy of LPs. For every $k \in \mathbb{N}^{\ge k_{\min}}$ and for every $j \in [p]$, let $\widehat{\mathcal{S}}^{(k,j)}$ be the set of real diagonal matrices of size $s(k, n_j)$ and consider the following LP
$$\inf_{\xi, \mathbf{G}_i, \mathbf{H}_i} \Big\{ \xi \ \Big|\ \mathbf{G}_i - \mathbf{I}^{(j)}_i \in \widehat{\mathcal{S}}^{(k_{g_i}, j)}_{+},\ i \in J_j \cup \{0\},\ \xi = \sum_{i \in \{0\} \cup J_j} \operatorname{Tr}\big( \mathbf{G}_i \mathbf{W}^{I_j}_{k_{g_i}} g_i (\mathbf{W}^{I_j}_{k_{g_i}})^* \big) + \sum_{i \in W_j} \operatorname{Tr}\big( \mathbf{H}_i \mathbf{W}^{I_j}_{k_{h_i}} h_i (\mathbf{W}^{I_j}_{k_{h_i}})^* \big) \Big\}, \tag{4.4}$$
where $\mathbf{I}^{(j)}_i$ is the identity matrix of size $s(k_{g_i}, n_j)$, for every $i \in \{0\} \cup J_j$.

Lemma 4.3.

Let EG (2.1) with CS be as described in §4.1. If LP (4.4) has a feasible solution $(\xi^{(j)}_k, \mathbf{G}^{(j)}_{i,k}, \mathbf{H}^{(j)}_{i,k})$, for every $k \in \mathbb{N}^{\ge k_{\min}}$ and for every $j \in [p]$, then EG (2.1) satisfies CS-CTP with $\mathbf{P}^{(j)}_k = \operatorname{diag}\big( (\mathbf{G}^{(j)}_{0,k})^{1/2}, ((\mathbf{G}^{(j)}_{i,k})^{1/2})_{i \in J_j} \big)$ and $a^{(j)}_k = \xi^{(j)}_k$, for $k \in \mathbb{N}^{\ge k_{\min}}$ and for $j \in [p]$.

Similar to §3, two special cases where CS-CTP can be verified through LP (4.4) are the nc polydisc and the nc ball on each clique of variables.

Proposition 4.4.

Let EG (2.1) with CS be as described in §4.1. Suppose either of the following holds:
• Case 1: $\mathfrak{g}_{J_j} = \big\{ 1 - \sum_{i \in I_j} X_i^2 \big\}$, $j \in [p]$.
• Case 2: $\mathfrak{g}_{J_j} = \big\{ \tfrac{1}{|I_j|} - X_i^2 : i \in I_j \big\}$, $j \in [p]$.
Then LP (4.4) has a feasible solution for every $k \in \mathbb{N}^{\ge k_{\min}}$, and therefore EG (2.1) satisfies CS-CTP.

The proof of Proposition 4.4 is similar to the dense setting.

Algorithm 2 below solves EG (2.1) with CS and whose CS-CTPcan be veriﬁed by solving LP (4.4).

Algorithm 2

SpecialEP-CS-CTP

Input:

An EG (2.1) with CS and a relaxation order 𝑘 ∈ N ≥ 𝑘 min Output:

The optimal value 𝜏 cs 𝑘 of SDP (4.2) for 𝑗 ∈ [ 𝑝 ] do Solve LP (4.4) to obtain an optimal solution ( 𝜉 ( 𝑗 ) 𝑘 , G ( 𝑗 ) 𝑖,𝑘 , H ( 𝑗 ) 𝑗,𝑘 ) ; Let 𝑎 ( 𝑗 ) 𝑘 = 𝜉 ( 𝑗 ) 𝑘 and P ( 𝑗 ) 𝑘 = diag (( G ( 𝑗 ) ,𝑘 ) / , . . . , ( G ( 𝑗 ) 𝑚,𝑘 ) / ) ; Compute the optimal value 𝜏 cs 𝑘 of SDP (4.3) by running an algorithmbased on ﬁrst-order methods and which exploits CTP. In this section we report results of numerical experiments con-ducted on the eigenvalue minimization problem (2.1). These resultswere obtained by executing Algorithm 1 and Algorithm 2, respec-tively for dense and sparse randomly generated instances of ncquadratically constrained quadratic problems (QCQPs) with CTP.In the dense case, one computes the ﬁrst and second order SDPrelaxations, namely the optimal values 𝜏 and 𝜏 of SDP (3.1). Sim-ilarly in the sparse case, one computes the optimal values 𝜏 cs1 and 𝜏 cs2 of SDP (4.3). The experiments are performed in Julia 1.3.1 withthe following software packages: • NCTSSOS [28] is a modeling library for solving Moment-SOSrelaxations of sparse EPs based on JuMP (with Mosek 9.1used as SDP solver). • Arpack [15] is used to compute the smallest eigenvalues andthe corresponding eigenvectors of real symmetric matricesof (potentially) large size, which is based on the implicitlyrestarted Arnoldi method [14].Both implementation of Algorithm 1 and 2 are available online: https://github.com/maihoanganh/ctpNCPOP . he Constant Trace Property in Noncommutative Optimization ISSAC’21, July 2021, Saint Petersburg, Russia Table 1:

Numerical results for randomly generated dense QCQPs with nc ball constraint.
• EP size: $m = 1$, $l = \lceil n/4 \rceil$; SDP size: $\omega = 2$, $a_{\max} = 3$; CGAL accuracy: $\varepsilon =$ − .

EP size    SDP size              Mosek               CGAL
 n    l    k   s_max      ζ      val       time      val       time
10    3    1      11      5   -3.2413      1      -3.2411       2
           2     111    815   -3.1110     28      -3.1107      59
20    5    1      21      7   -3.5534      0.03   -3.5525       1
           2     421   5587      −         −      -3.5026     203
30    8    1      31     10   -4.6984      0.1    -4.6954       1
           2     931  18415      −         −      -4.6819    1392

We use a desktop computer with an Intel(R) Core(TM) i7-8665U CPU @ 1.9GHz × 𝑛, 𝑚 and 𝑙, respectively. We denote by $k$ the relaxation order used to solve the dense SDP (3.1) and the sparse SDP (4.3). For NCPOP with CS, we denote by $u_{\max}$ the largest size of variable cliques and by $p$ the number of variable cliques. We denote by $\omega$, $s_{\max}$, $\zeta$ and $a_{\max}$ the number of psd blocks, the largest size of psd blocks, the number of affine equality constraints and the largest constant trace of matrices involved in the SDP relaxations, respectively. Let “val” stand for the approximate optimal value of the SDP relaxation with desired accuracy $\varepsilon$ for CGAL, and let “time” be the corresponding running time in seconds. We use “−” to indicate that the calculation runs out of memory. For all examples tested in this paper, the modeling time for both NCTSSOS and ctpNCPOP is typically negligible compared to the solving time of Mosek and CGAL. Hence the total running time mainly depends on the solvers, and we compare their performances below.
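The role of the smallest-eigenvalue computation mentioned above can be illustrated with a short sketch. Conditional gradient type methods repeatedly minimize a linear function over the set of psd matrices with a fixed trace, and the minimizer is a rank-one matrix built from an eigenvector for the smallest eigenvalue. The code below is a minimal illustration in Python, not the ctpNCPOP implementation: the function names are hypothetical, the trace value `a = 3.0` is only an example, and a plain shifted power iteration is used here in place of the implicitly restarted Arnoldi method.

```python
import math
import random


def matvec(M, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * t for m, t in zip(row, x)) for row in M]


def norm(x):
    return math.sqrt(sum(t * t for t in x))


def smallest_eigpair(C, iters=500, seed=0):
    """Approximate the smallest eigenpair of a real symmetric matrix C.

    Power iteration is applied to shift*I - C, whose dominant eigenvector is
    an eigenvector of C for its smallest eigenvalue. Convergence assumes a
    gap between the two smallest eigenvalues of C.
    """
    n = len(C)
    # Every eigenvalue of C lies in [-shift, shift] (largest absolute row sum).
    shift = max(sum(abs(c) for c in row) for row in C)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    nv = norm(v)
    v = [t / nv for t in v]
    for _ in range(iters):
        Cv = matvec(C, v)
        w = [shift * vi - cvi for vi, cvi in zip(v, Cv)]
        nw = norm(w)
        if nw == 0.0:
            break
        v = [t / nw for t in w]
    lam = sum(vi * cvi for vi, cvi in zip(v, matvec(C, v)))
    return lam, v


def lmo_constant_trace(C, a):
    """Minimize <C, X> over {X psd, trace(X) = a}.

    A minimizer is X = a * v v^T with v a unit eigenvector for the smallest
    eigenvalue of C, and the minimum value is a * lambda_min(C).
    """
    lam, v = smallest_eigpair(C)
    X = [[a * vi * vj for vj in v] for vi in v]
    return a * lam, X


# Example: C has eigenvalues 3 and -1; the trace value a = 3.0 is illustrative.
value, X = lmo_constant_trace([[1.0, 2.0], [2.0, 1.0]], 3.0)
print(value, X[0][0] + X[1][1])
```

Each call needs only matrix-vector products with `C`, which is why an iterative eigenvalue routine is a natural fit when the psd blocks are large.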

Test problems: We construct randomly generated dense QCQPs with nc ball and nc polydisk constraints as follows:
(1) Generate a dense quadratic polynomial objective function $f = \sum_{w \in \mathbf{W}} \bar f_w (w + w^*) \in \operatorname{Sym} \mathbb{R}\langle X \rangle$ with coefficients $\bar f_w$ randomly chosen w.r.t. the uniform probability distribution on $(-1, 1)$.
(2) Do one of the following two cases:
    • nc ball: let $m = 1$ and $g_1 := 1 - \sum_{r \in [n]} X_r^2$;
    • nc polydisk: let $m = n$ and $g_i := \frac{1}{n} - X_i^2$, $i \in [n]$;
(3) Take a random point $\mathbf{a}$ in $\{ x \in \mathbb{R}^n : g_i(x) \ge 0,\ i \in [m] \}$ w.r.t. the uniform distribution;
(4) For every $j \in [l]$, generate a dense quadratic polynomial $h_j = \sum_{w \in \mathbf{W}} \bar h^{(j)}_w (w + w^*) \in \operatorname{Sym} \mathbb{R}\langle X \rangle$:
    (i) for each $w \in \mathbf{W} \setminus \{1\}$, select a random coefficient $\bar h^{(j)}_w$ in $(-1, 1)$ w.r.t. the uniform distribution;
    (ii) set $\bar h^{(j)}_1 := - \sum_{w \in \mathbf{W} \setminus \{1\}} \bar h^{(j)}_w w(\mathbf{a})$.
Then $\mathbf{a}$ is a feasible solution of EG (2.1).

The numerical results are displayed in Tables 1 and 2. The results show that CGAL is typically the fastest solver and returns an approximate optimal value which differs by less than 1% from the one returned by Mosek when $n \le 20$. Mosek runs out of memory when $n \ge 20$ and $k = 2$, while CGAL still works well. With our current setting for the CGAL accuracy, the approximate optimal value is correct up to the first two digits, so we can guarantee that the bound improves from $k = 1$ to $k = 2$.

Table 2: Numerical results for randomly generated dense QCQPs with nc polydisk constraints.
• EP size: $m = n$, $l = \lceil n / \ \rceil$; SDP size: $\omega = n + 1$, $a_{\max} = 3$; CGAL accuracy: $\varepsilon =$ − .

EP size    SDP size              Mosek               CGAL
 n    l    k   s_max      ζ      val       time      val       time
10    2    1      11     13   -3.2165      0.009  -3.2154       0.4
           2     111   1343   -3.2039     26      -3.2037     229
20    3    1      21     24   -4.5773      0.03   -4.5767       2
           2     421   9514      −         −      -4.5147     753
30    3    1      31     36   -5.1182      0.8    -5.1172       3
           2     931  31311      −         −      -5.0717    2215

Test problems:

We construct randomly generated nc QCQPs with CS and ball constraints on each clique of variables as follows:
(1) Take a positive integer $u$, $p := \lfloor n/u \rfloor +$
$$ I_j = \begin{cases} [u], & \text{if } j = 1, \\ \{ u(j-1), \dots, uj \}, & \text{if } j \in \{ 2, \dots, p-1 \}, \\ \{ u(p-1), \dots, n \}, & \text{if } j = p; \end{cases} \tag{5.1} $$
(2) Generate a quadratic polynomial objective function $f = \sum_{j \in [p]} f_j$ such that for each $j \in [p]$, $f_j = \sum_{w \in \mathbf{W}^{I_j}} \bar f^{(j)}_w (w + w^*) \in \operatorname{Sym} \mathbb{R}\langle X(I_j) \rangle$, and the coefficients are randomly generated as in the dense setting;
(3) Take $m = p$ and $g_j := 1 - \sum_{i \in I_j} X_i^2$, $j \in [m]$;
(4) Take a random point $\mathbf{a}$ as in the dense setting;
(5) Let $r := \lfloor l/p \rfloor$ and
$$ W_j := \begin{cases} \{ (j-1)r + 1, \dots, jr \}, & \text{if } j \in [p-1], \\ \{ (p-1)r + 1, \dots, l \}, & \text{if } j = p. \end{cases} \tag{5.2} $$
For every $j \in [p]$ and every $i \in W_j$, generate a quadratic polynomial $h_i = \sum_{w \in \mathbf{W}^{I_j}} \bar h^{(i)}_w (w + w^*) \in \operatorname{Sym} \mathbb{R}\langle X(I_j) \rangle$ as in the dense setting to ensure that $\mathbf{a}$ is a feasible solution of EG (2.1).

The number of variables is fixed as 𝑛 = 𝑢 so that the number of variable cliques $p$ decreases accordingly. The numerical results are displayed in Table 3. Again, the results in Table 3 show that CGAL is slower than Mosek for 𝑘 = 𝑘 = 𝑢 ≤ 𝑘 = 𝑢 ≥ 15, while CGAL is once again able to obtain improved lower bounds.
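The feasibility trick shared by the dense and sparse constructions (draw a random point, then fix the constant coefficient of each equality constraint so that the constraint vanishes at that point) can be sketched as follows. This is an illustration under stated assumptions, not the generator used for the experiments: the function names are hypothetical, the word set is taken to be all words of degree at most two, the coefficient interval is taken to be (-1, 1), and the point is sampled from the unit ball. Since the point is a vector of real numbers, a word is evaluated as an ordinary product.

```python
import math
import random


def random_point_in_ball(n, rng):
    """Draw a point uniformly from the unit Euclidean ball in R^n."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(n)]
    length = math.sqrt(sum(t * t for t in direction))
    radius = rng.random() ** (1.0 / n)
    return [radius * t / length for t in direction]


def words_up_to_degree_two(n):
    """Words as tuples of variable indices: (), (i,), (i, j)."""
    words = [()]
    words += [(i,) for i in range(n)]
    words += [(i, j) for i in range(n) for j in range(n)]
    return words


def evaluate_word(word, a):
    """Evaluate a word at a scalar point a (variables commute on scalars)."""
    value = 1.0
    for i in word:
        value *= a[i]
    return value


def random_equality_constraint(n, a, rng):
    """Coefficients hbar[w] of h = sum_w hbar[w] * (w + w^*) with h(a) = 0.

    Non-constant coefficients are uniform on (-1, 1) (an assumption of this
    sketch); the coefficient of the empty word () is then set to
    -sum_{w != ()} hbar[w] * w(a), as in step (ii) of the construction.
    """
    hbar = {w: rng.uniform(-1.0, 1.0)
            for w in words_up_to_degree_two(n) if w != ()}
    hbar[()] = -sum(c * evaluate_word(w, a) for w, c in hbar.items())
    return hbar


def evaluate_h(hbar, a):
    """Evaluate h at a; the adjoint w^* is the reversed word."""
    return sum(c * (evaluate_word(w, a) + evaluate_word(w[::-1], a))
               for w, c in hbar.items())


rng = random.Random(0)
a = random_point_in_ball(4, rng)
hbar = random_equality_constraint(4, a, rng)
print(sum(t * t for t in a) <= 1.0, abs(evaluate_h(hbar, a)))
```

The printed residual is zero up to rounding, which is exactly what makes the sampled point feasible for the equality constraints.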


Table 3: Numerical results for randomly generated QCQPs with CS and nc ball constraint on each clique of variables.
• EP size: $n =$ , $m = p$, $l =$ , $u_{\max} = u + 1$; SDP size: $\omega = 2p$, $a_{\max} = 3$; CGAL accuracy: $\varepsilon =$ − .

EP size     SDP size                      Mosek                CGAL
 u     p    k     ω   s_max       ζ       val       time       val       time
10   100    1   200      12     541    -2.9659        3     -2.9662      206
            2   200     133   91850    -2.9594    32008     -2.9598     7790
15    66    1   132      27     405    -2.3230        1     -2.3225       38
            2   132     703  185592       −           −     -2.3179    10051
20    50    1   100      22     341    -2.1517        4     -2.1515       54
            2   100     463  290908       −           −     -2.1260    11791

We have provided a constructive proof that the constant trace property holds for semidefinite relaxations of eigenvalue or trace optimization problems whenever an nc ball constraint is present. This property can be easily verified by solving a hierarchy of linear programs when the only inequality constraints involved are either nc ball or nc polydisk constraints. This allows one to use first order methods exploiting the constant trace property (e.g., CGAL) to solve the semidefinite relaxations of large-scale eigenvalue problems more efficiently than with second order interior-point solvers (e.g., Mosek). We have experimentally demonstrated some of these computational gains on eigenvalue minimization. Similar gains should be achievable for trace minimization. In the test examples of this paper for which Mosek returns a value, the relative optimality gap of CGAL w.r.t. Mosek is smaller than 1%.

As a topic of further research, we intend to rely on our framework to tackle applications arising from quantum information and condensed matter, including bounds on maximal violation levels for Bell inequalities [21] or ground state energies of many-body Hamiltonians [3]. Preliminary experiments not reported in this paper show that relying on CGAL improves some existing bounds for Bell inequalities, while Mosek runs out of memory. We intend to improve our software implementation to overcome the accuracy issues arising when using CGAL. A related investigation track is to design a numerical method for finding the constant trace and the change of basis for noncommutative problems with arbitrary inequality constraints (possibly including an nc ball constraint). Ideally, first order semidefinite solvers should have rich numerical properties when combined with the constant trace and the change of basis matrix obtained in our method. This will allow us to design, implement and analyze a hybrid numeric-symbolic scheme as in [16, 22], to obtain exact nonnegativity certificates of noncommutative problems.
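The statement on the relative optimality gap can be checked directly from the values reported in Tables 1 to 3. The short script below is only a sketch: the variable and function names are hypothetical, and the gap is taken to be |val_CGAL - val_Mosek| / |val_Mosek|, which is one common convention and an assumption here. Only rows where Mosek returned a value are included.

```python
# (Mosek value, CGAL value) pairs copied from Tables 1, 2 and 3.
reported_values = {
    "Table 1": [(-3.2413, -3.2411), (-3.1110, -3.1107),
                (-3.5534, -3.5525), (-4.6984, -4.6954)],
    "Table 2": [(-3.2165, -3.2154), (-3.2039, -3.2037),
                (-4.5773, -4.5767), (-5.1182, -5.1172)],
    "Table 3": [(-2.9659, -2.9662), (-2.9594, -2.9598),
                (-2.3230, -2.3225), (-2.1517, -2.1515)],
}


def relative_gap(mosek_value, cgal_value):
    """Relative gap of the CGAL value with respect to the Mosek value."""
    return abs(cgal_value - mosek_value) / abs(mosek_value)


def largest_gap(pairs):
    """Largest relative gap over a list of (Mosek, CGAL) value pairs."""
    return max(relative_gap(m, c) for m, c in pairs)


for table, pairs in reported_values.items():
    print(table, "largest relative gap:", largest_gap(pairs))
```

On these rows every gap is well below the 1% threshold quoted in the text.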

REFERENCES

[1] E. D. Andersen and K. D. Andersen. The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In High performance optimization, pages 197–232. Springer, 2000.
[2] M. F. Anjos and J. B. Lasserre, editors. Handbook on semidefinite, conic and polynomial optimization, volume 166 of International Series in Operations Research & Management Science. Springer, New York, 2012.
[3] T. Barthel and R. Hübener. Solving condensed-matter ground-state problems by semidefinite relaxations. Physical review letters, 108(20):200404, 2012.
[4] S. Burgdorf and I. Klep. The truncated tracial moment problem. Journal of Operator Theory, pages 141–163, 2012.
[5] S. Burgdorf, I. Klep, and J. Povh. Optimization of polynomials in non-commuting variables, volume 2. Springer, 2016.
[6] S. Gribling, D. de Laat, and M. Laurent. Bounds on entanglement dimensions and quantum graph parameters via noncommutative polynomial optimization. Mathematical Programming, 170(1):5–42, 2018.
[7] C. Helmberg and F. Rendl. A spectral bundle method for semidefinite programming. SIAM Journal on Optimization, 10(3):673–696, 2000.
[8] J. Helton and S. McCullough. A positivstellensatz for non-commutative polynomials. Transactions of the American Mathematical Society, 356(9):3721–3737, 2004.
[9] J. W. Helton, I. Klep, and S. McCullough. Free convex algebraic geometry. Semidefinite optimization and convex algebraic geometry, 13:341–405, 2013.
[10] C. Josz and D. Henrion. Strong duality in Lasserre’s hierarchy for polynomial optimization. Optimization Letters, 10(1):3–10, 2016.
[11] I. Klep, V. Magron, and J. Povh. Sparse noncommutative polynomial optimization. arXiv preprint arXiv:1909.00569, 2019.
[12] J. B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM Journal on optimization, 11(3):796–817, 2001.
[13] J.-B. Lasserre. Moments, positive polynomials and their applications, volume 1. World Scientific, 2010.
[14] R. B. Lehoucq and D. C. Sorensen. Deflation techniques for an implicitly restarted Arnoldi iteration. SIAM Journal on Matrix Analysis and Applications, 17(4):789–821, 1996.
[15] R. B. Lehoucq, D. C. Sorensen, and C. Yang. ARPACK users’ guide: solution of large-scale eigenvalue problems with implicitly restarted Arnoldi methods. SIAM, 1998.
[16] V. Magron and M. S. E. Din. On Exact Polya and Putinar’s Representations. In ISSAC’18: Proceedings of the 2018 ACM International Symposium on Symbolic and Algebraic Computation. ACM, New York, NY, USA, 2018.
[17] N. H. A. Mai, J.-B. Lasserre, V. Magron, and J. Wang. Exploiting constant trace property in large-scale polynomial optimization. arXiv preprint arXiv:2012.08873, 2020.
[18] N. H. A. Mai, V. Magron, and J.-B. Lasserre. A hierarchy of spectral relaxations for polynomial optimization. arXiv preprint arXiv:2007.09027, 2020.
[19] J. Marecek and J. Vala. Quantum optimal control via Magnus expansion: The non-commutative polynomial optimization problem. arXiv preprint arXiv:2001.06464, 2020.
[20] A. MOSEK. The MOSEK Optimization Toolbox, Version 8.1, 2017.
[21] K. F. Pál and T. Vértesi. Quantum bounds on Bell inequalities. Physical Review A, 79(2):022120, 2009.
[22] H. Peyrl and P. Parrilo. Computing sum of squares decompositions with rational coefficients. Theoretical Computer Science, 409(2):269–281, 2008.
[23] S. Pironio, M. Navascués, and A. Acin. Convergent relaxations of polynomial optimization problems with noncommuting variables. SIAM Journal on Optimization, 20(5):2157–2180, 2010.
[24] M. Putinar. Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J., 42(3):969–984, 1993.
[25] K. Schmüdgen. The k-moment problem for compact semi-algebraic sets. Mathematische Annalen, 289(1):203–206, 1991.
[26] R. E. Skelton, T. Iwasaki, and D. E. Grigoriadis. A unified algebraic approach to control design. CRC Press, 1997.
[27] H. Waki, S. Kim, M. Kojima, and M. Muramatsu. Sums of squares and semidefinite program relaxations for polynomial optimization problems with structured sparsity. SIAM Journal on Optimization, 17(1):218–242, 2006.
[28] J. Wang and V. Magron. Exploiting term sparsity in noncommutative polynomial optimization. arXiv preprint arXiv:2010.06956, 2020.
[29] J. Wang, V. Magron, and J.-B. Lasserre. Chordal-TSSOS: a moment-SOS hierarchy that exploits term sparsity with chordal extension. SIAM Journal on Optimization, 31(1):114–141, 2021.
[30] J. Wang, V. Magron, and J.-B. Lasserre. TSSOS: A moment-SOS hierarchy that exploits term sparsity. SIAM Journal on Optimization, 31(1):30–58, 2021.
[31] J. Wang, V. Magron, J. B. Lasserre, and N. H. A. Mai. CS-TSSOS: Correlative and term sparsity for large-scale polynomial optimization. arXiv preprint arXiv:2005.02828, 2020.
[32] A. Yurtsever, O. Fercoq, and V. Cevher. A conditional gradient-based augmented Lagrangian framework. arXiv preprint arXiv:1901.04013, 2019.
[33] A. Yurtsever, J. A. Tropp, O. Fercoq, M. Udell, and V. Cevher. Scalable semidefinite programming. arXiv preprint arXiv:1912.02949