Symmetry reduction in AM/GM-based optimization
Philippe Moustrou, Helen Naumann, Cordian Riener, Thorsten Theobald, Hugues Verdure
Abstract.
The arithmetic mean/geometric mean-inequality (AM/GM-inequality) facilitates classes of non-negativity certificates and of relaxation techniques for polynomials and, more generally, for exponential sums. Here, we present a first systematic study of the AM/GM-based techniques in the presence of symmetries under the linear action of a finite group. We prove a symmetry-adapted representation theorem and develop techniques to reduce the size of the resulting relative entropy programs. We study in more detail the complexity gain in the case of the symmetric group. In this setup, we can show in particular certain stabilization results. We exhibit several sequences of examples in growing dimensions where the size of the problem stabilizes. Finally, we provide some numerical results, emphasizing the computational speed-up.

1. Introduction
Deciding whether a real function only takes non-negative values is a fundamental question in real algebraic geometry. Non-negativity certificates and optimization approaches are tightly related to each other by observing that the infimum $f^*$ of a function $f : \mathbb{R}^n \to \mathbb{R}$ can be expressed as the largest $\lambda \in \mathbb{R}$ for which $f - \lambda$ is non-negative on $\mathbb{R}^n$:
\[ f^* = \inf\{ f(x) : x \in \mathbb{R}^n \} = \sup\{ \lambda \in \mathbb{R} : f - \lambda \text{ is non-negative on } \mathbb{R}^n \}. \]
Both in the context of polynomials and in the broader context of exponential sums, the last years have seen strong interest in non-negativity certificates and optimization techniques based on the arithmetic mean/geometric mean-inequality (AM/GM inequality). More precisely, an exponential sum (or signomial) supported on a finite subset $\mathcal{T} \subset \mathbb{R}^n$ is a linear combination $\sum_{\alpha \in \mathcal{T}} c_\alpha \exp(\langle \alpha, x \rangle)$ with real coefficients $c_\alpha$. In particular cases, the non-negativity of the real function defined by an exponential sum can be decided via the arithmetic-geometric mean inequality. For example, for support points $\alpha_0, \ldots, \alpha_m \in \mathbb{R}^n$ and coefficients $\lambda = (\lambda_1, \ldots, \lambda_m) \in \mathbb{R}^m_+$ satisfying $\sum_{i=1}^m \lambda_i = 1$ and $\sum_{i=1}^m \lambda_i \alpha_i = \alpha_0$, the exponential sum
\[ \sum_{i=1}^m \lambda_i \exp(\langle \alpha_i, x \rangle) - \exp(\langle \alpha_0, x \rangle) \]
is non-negative on $\mathbb{R}^n$ as a consequence of the weighted arithmetic-geometric mean inequality, namely $\sum_{i=1}^m \lambda_i \exp(\langle \alpha_i, x \rangle) \ge \prod_{i=1}^m (\exp(\langle \alpha_i, x \rangle))^{\lambda_i}$. Clearly, sums of such exponential sums are non-negative as well. Note that exponential sums can be seen as a generalization of polynomials: when $\mathcal{T} \subset \mathbb{N}^n$, the transformation $x_i = \ln y_i$ gives polynomial functions $y \mapsto \sum_{\alpha \in \mathcal{T}} c_\alpha y^\alpha$ on $\mathbb{R}^n_{>0}$.

Date: February 26, 2021.
2010 Mathematics Subject Classification.
Key words and phrases. Positive functions, SAGE certificates, Symmetry reduction, Symmetric group.

These AM/GM-based certificates appear to be particularly useful in sparse settings. In the specialized situation of polynomials, they can be seen as an alternative to non-negativity certificates based on sums of squares. The ideas of these approaches go back to Reznick [25] and have been recently brought back into the focus of the developments by Pantea, Koeppl, and Craciun [23], Chandrasekaran and Shah [5] ("SAGE" cone: sums of arithmetic-geometric exponentials) and Iliman and de Wolff [15] ("SONC" cone: sums of non-negative circuit polynomials), see also [18] for a generalized, uniform framework. The AM/GM certificates can be effectively obtained by relative entropy programming (see [5, 6]), and in restricted settings these relative entropy programs become geometric programs [16]. These techniques have been extended to cover constrained situations, prominently by the work of Murray, Chandrasekaran and Wierman based on partial dualization [21]. This method can also be approached from sublinear circuits, see [22]. Furthermore, in the setting of polynomials, the AM/GM-based approaches can be combined with sums of squares [17]. Other recent approaches to sparse polynomials besides the ones based on the AM/GM inequality can be found in the sparse moment hierarchies [30, 31].

From an algebraic point of view, a problem is symmetric when it is invariant under some group action. Symmetries are ubiquitous in the context of polynomials and optimization, since they manifest both in the problem formulation and the solution set. This often allows one to reduce the complexity of the corresponding algorithmic questions. Regarding the set of solutions, it was observed by Terquem as early as 1840 that a symmetric polynomial does not always have a fully symmetric minimizer (see also Waterhouse's survey [32]). However, in many instances, the set of minimizers contains highly symmetric points (see [12, 19, 26, 29]). With respect to problem formulations, symmetry reduction has provided essential advances in many situations (see, for example, [2, 7, 9]), especially in the context of sums of squares (see [1, 4, 8, 13, 14, 24, 27]).

The current paper starts from the question of to what extent symmetries can be exploited in AM/GM-based optimization, assuming that the problem affords symmetries. We provide a first systematic study of the AM/GM-based approaches in $G$-invariant situations under the action of a group $G$.
Our focus is on symmetry-adapted representation theorems and algorithmic symmetry reduction techniques.

Our contributions.
1. We prove a symmetry-adapted decomposition theorem and develop a symmetry-adapted relative entropy formulation for SAGE exponentials in a general $G$-invariant setting.
2. This adaptation reduces the size of the resulting relative entropy programs or geometric programs, see Theorem 3.1, Theorem 4.1 and Corollary 4.3. As revealed by these statements, the gain depends on the orbit structure of the group action.
3. In the case of the symmetric group, we use combinatorial aspects of the representation theory of the symmetric group in order to measure the size of the resulting relative entropy program. In particular, we identify situations in which the size of the symmetry-adapted relative entropy program stabilizes with respect to the number of variables.
4. We evaluate the structural results in the paper in terms of computations. In situations with strong symmetry structure, the number of variables and the number of equations and inequalities become substantially smaller. Accordingly, the interior-point solvers underlying the computation of SAGE bounds then show strong reductions of computation time. In various cases, the symmetry-adapted computation succeeds when the conventional SAGE computation fails.

We mostly concentrate on unconstrained optimization, but the techniques can generally also be extended to the constrained case. See, for example, Corollary 4.5.

The paper is structured as follows. After collecting relevant notions and concepts in Section 2, we provide in Section 3 a specific way of writing sums of arithmetic-geometric exponentials in the presence of a group symmetry. In Section 4, we study how to characterize and to decide whether a $G$-symmetric exponential sum is contained in the SAGE cone with reduced relative entropy programs. The case of the symmetric group is treated in Section 5, while Section 6 provides experimental results of an implementation of the symmetry reduction techniques. We conclude the paper in Section 7.

Acknowledgement. The authors gratefully acknowledge partial support through the project "Real Algebraic Geometry and Optimization" jointly funded by the German Academic Exchange Service DAAD and the Research Council of Norway RCN, and through the Tromsø Research foundation grant agreement 17matteCR.

2. Preliminaries
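Throughout the article, we use the notation $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$. For a finite subset $\mathcal{T} \subset \mathbb{R}^n$, let $\mathbb{R}^{\mathcal{T}}$ be the set of $|\mathcal{T}|$-tuples whose components are indexed by the set $\mathcal{T}$. We denote by $\langle \cdot, \cdot \rangle$ the standard Euclidean inner product in $\mathbb{R}^n$.

The SAGE cone. For a given non-empty finite set $\mathcal{T}$, we consider exponential sums supported on $\mathcal{T}$ as defined in the Introduction. For finite $\mathcal{T} \subset \mathbb{R}^n$, the SAGE cone $C_{\mathrm{SAGE}}(\mathcal{T})$ is defined as
\[ C_{\mathrm{SAGE}}(\mathcal{T}) := \sum_{\beta \in \mathcal{T}} C_{\mathrm{AGE}}(\mathcal{T} \setminus \{\beta\}, \beta), \]
where for $\mathcal{A} := \mathcal{T} \setminus \{\beta\}$
\[ C_{\mathrm{AGE}}(\mathcal{A}, \beta) := \Big\{ f = \sum_{\alpha \in \mathcal{A}} c_\alpha e^{\langle \alpha, x \rangle} + c_\beta e^{\langle \beta, x \rangle} \,:\, c_\alpha \ge 0 \text{ for } \alpha \in \mathcal{A},\ c_\beta \in \mathbb{R},\ f(x) \ge 0 \text{ on } \mathbb{R}^n \Big\} \]
denotes the non-negative exponential sums which may only have a negative coefficient in the term indexed by $\beta$ (see [5]). The elements in these cones are called SAGE signomials and AGE signomials, respectively. The cone $C_{\mathrm{SAGE}}(\mathcal{T})$ is a closed convex cone in $\mathbb{R}^{\mathcal{T}}$ (see [18, Proposition 2.10]).

Membership to this convex cone can be decided in terms of relative entropy programming. For a finite set $\emptyset \neq \mathcal{A} \subset \mathbb{R}^n$, denote by
\[ D : \mathbb{R}^{\mathcal{A}}_{>0} \times \mathbb{R}^{\mathcal{A}}_{>0} \to \mathbb{R}, \qquad D(\nu, \gamma) = \sum_{\alpha \in \mathcal{A}} \nu_\alpha \ln\Big(\frac{\nu_\alpha}{\gamma_\alpha}\Big) \]
the relative entropy function, which can be extended to $\mathbb{R}^{\mathcal{A}}_+ \times \mathbb{R}^{\mathcal{A}}_+ \to \mathbb{R} \cup \{\infty\}$ via the conventions $0 \cdot \ln \frac{0}{y} = 0$ for $y \ge 0$ and $y \cdot \ln \frac{y}{0} = \infty$ for $y > 0$. To decide membership of a given signomial $f$ supported on $\mathcal{T}$ to the SAGE cone, assume that $f$ is written in the form
\[ f = \sum_{\alpha \in \mathcal{A}} c_\alpha \exp(\langle \alpha, x \rangle) + \sum_{\beta \in \mathcal{B}} c_\beta \exp(\langle \beta, x \rangle) \]
with $c_\alpha > 0$ for $\alpha \in \mathcal{A}$ and $c_\beta < 0$ for $\beta \in \mathcal{B}$. In this notation, the overall support set of $f$ is $\mathcal{T} = \mathcal{A} \cup \mathcal{B}$. Accordingly, for disjoint sets $\emptyset \neq \mathcal{A} \subset \mathbb{R}^n$ and $\mathcal{B} \subset \mathbb{R}^n$, it is convenient to denote by
\[ C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B}) := \sum_{\beta \in \mathcal{B}} C_{\mathrm{AGE}}(\mathcal{A} \cup \mathcal{B} \setminus \{\beta\}, \beta) \tag{2.1} \]
the signed SAGE cone, which allows negative coefficients only in a certain subset $\mathcal{B}$ of the support $\mathcal{A} \cup \mathcal{B}$. This is a common notation from the optimization viewpoint [10, 11, 16, 20, 21].

Proposition 2.1 ([20]). A signomial $f$ belongs to $C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$ if and only if for every $\beta \in \mathcal{B}$ there exist $c^{(\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ and $\nu^{(\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ such that
\begin{align*}
\sum_{\alpha \in \mathcal{A}} \nu^{(\beta)}_\alpha \alpha &= \Big( \sum_{\alpha \in \mathcal{A}} \nu^{(\beta)}_\alpha \Big) \beta && \text{for } \beta \in \mathcal{B}, \\
D(\nu^{(\beta)}, e \cdot c^{(\beta)}) &\le c_\beta && \text{for } \beta \in \mathcal{B}, \\
\sum_{\beta \in \mathcal{B}} c^{(\beta)}_\alpha &\le c_\alpha && \text{for } \alpha \in \mathcal{A}.
\end{align*}
Note that this proposition reflects the statement of Murray, Chandrasekaran and Wierman [20] that every SAGE signomial can be decomposed into AGE signomials in such a way that every term with a negative coefficient only appears in a single AGE signomial.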
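The three conditions of Proposition 2.1 are easy to test numerically once candidate vectors $c^{(\beta)}$ and $\nu^{(\beta)}$ are given. The following is a minimal sketch, not taken from the paper: the function names `relative_entropy` and `check_certificate`, the tolerance, and the univariate test signomial $e^{2x} - 2e^{x} + 1 = (e^x - 1)^2$ are assumptions chosen for illustration. It only verifies a given certificate and does not solve the relative entropy program.

```python
import math

def relative_entropy(nu, gamma):
    """D(nu, gamma) = sum_a nu_a * ln(nu_a / gamma_a) with the stated conventions."""
    total = 0.0
    for nu_a, gamma_a in zip(nu, gamma):
        if nu_a == 0.0:
            continue              # convention 0 * ln(0 / y) = 0
        if gamma_a == 0.0:
            return math.inf       # convention y * ln(y / 0) = infinity for y > 0
        total += nu_a * math.log(nu_a / gamma_a)
    return total

def check_certificate(A, c_A, B, c_B, c_cert, nu_cert, tol=1e-9):
    """Check the three conditions of Proposition 2.1 for candidate vectors.

    A, B: lists of exponent vectors; c_A, c_B: their coefficients.
    c_cert[j], nu_cert[j]: candidate c^(beta), nu^(beta) for beta = B[j],
    indexed like A. Hypothetical helper, for illustration only.
    """
    n = len(A[0])
    for j, beta in enumerate(B):
        c, nu = c_cert[j], nu_cert[j]
        if min(c) < 0 or min(nu) < 0:
            return False
        mass = sum(nu)
        # sum_a nu_a * a = (sum_a nu_a) * beta, coordinate by coordinate
        for k in range(n):
            lhs = sum(nu_a * alpha[k] for nu_a, alpha in zip(nu, A))
            if abs(lhs - mass * beta[k]) > tol:
                return False
        # D(nu, e * c) <= c_beta
        if relative_entropy(nu, [math.e * c_a for c_a in c]) > c_B[j] + tol:
            return False
    # sum_beta c^(beta)_a <= c_a for every a in A
    for i in range(len(A)):
        if sum(c_cert[j][i] for j in range(len(B))) > c_A[i] + tol:
            return False
    return True

# Illustrative data: f(x) = 1 + e^{2x} - 2 e^{x}, so A = {0, 2}, B = {1}.
A, c_A = [(0.0,), (2.0,)], [1.0, 1.0]
B = [(1.0,)]
cert_c, cert_nu = [[1.0, 1.0]], [[1.0, 1.0]]
print(check_certificate(A, c_A, B, [-2.0], cert_c, cert_nu))
print(check_certificate(A, c_A, B, [-2.5], cert_c, cert_nu))
```

With the coefficient $-2$ the candidate vectors satisfy all three conditions (the entropy condition holds with equality up to rounding), while with $-2.5$ the same candidate fails the entropy condition, in line with $1 + e^{2x} - 2.5\,e^{x}$ being negative at $x = 0$.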
Optimizing over the SAGE cone. Since the SAGE cone is contained in the cone of non-negative signomials, relaxing to the SAGE cone gives an approximation of the global infimum $f^*$ of a signomial $f$ supported on $\mathcal{T}$:
\[ f^{\mathrm{SAGE}} = \sup\{ \lambda \in \mathbb{R} : f - \lambda \in C_{\mathrm{SAGE}}(\mathcal{T}) \} \]
satisfying $f^{\mathrm{SAGE}} \le f^*$.

Constrained versions. While many aspects of this article are devoted to the unconstrained situation, we briefly collect the extension of SAGE certificates to the constrained situation. Let $K$ be a convex and closed subset of $\mathbb{R}^n$. For a non-empty finite set $\mathcal{T} \subset \mathbb{R}^n$, the $K$-SAGE cone $C_K(\mathcal{T})$ is defined (see [21]) as
\[ C_K(\mathcal{T}) := \sum_{\beta \in \mathcal{T}} C_K(\mathcal{T} \setminus \{\beta\}, \beta), \]
where for $\mathcal{A} := \mathcal{T} \setminus \{\beta\}$,
\[ C_K(\mathcal{A}, \beta) := \Big\{ f = \sum_{\alpha \in \mathcal{A}} c_\alpha e^{\langle \alpha, x \rangle} + c_\beta e^{\langle \beta, x \rangle} \,:\, c_\alpha \ge 0 \text{ for } \alpha \in \mathcal{A},\ c_\beta \in \mathbb{R},\ f(x) \ge 0 \text{ on } K \Big\}. \]
Moreover, (2.1) can be generalized by defining, for disjoint sets $\emptyset \neq \mathcal{A} \subset \mathbb{R}^n$ and $\mathcal{B} \subset \mathbb{R}^n$, the signed $K$-SAGE cone
\[ C_K(\mathcal{A}, \mathcal{B}) := \sum_{\beta \in \mathcal{B}} C_K(\mathcal{A}, \beta). \]
This is the set of $K$-SAGE signomials, where negative coefficients are only possible in a certain subset $\mathcal{B}$ of the support $\mathcal{A} \cup \mathcal{B}$. The following decomposition result holds.

Theorem 2.2 ([21], Corollary 5). If $f \in C_K(\mathcal{A}, \mathcal{B})$ with $c_\alpha > 0$ for all $\alpha \in \mathcal{A}$ and $c_\beta < 0$ for all $\beta \in \mathcal{B} \neq \emptyset$, then there exist $K$-AGE signomials $f_\beta \in C_K(\mathcal{A}, \beta)$ for $\beta \in \mathcal{B}$ such that $f = \sum_{\beta \in \mathcal{B}} f_\beta$.

For the constrained approach, a similar result to Proposition 2.1 is known.

Proposition 2.3 ([21]). $f \in C_K(\mathcal{A} \cup \mathcal{B})$ if and only if for every $\beta \in \mathcal{B}$ there exist $c^{(\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ and $\nu^{(\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ such that
\begin{align*}
D(\nu^{(\beta)}, e \cdot c^{(\beta)}) + \sup_{x \in K} \Big\langle -\sum_{\alpha \in \mathcal{A}} \nu^{(\beta)}_\alpha (\alpha - \beta),\, x \Big\rangle &\le c_\beta && \text{for } \beta \in \mathcal{B}, \\
\sum_{\beta \in \mathcal{B}} c^{(\beta)}_\alpha &\le c_\alpha && \text{for } \alpha \in \mathcal{A}.
\end{align*}

3. Orbit decompositions of symmetric exponential sums
In this section, we provide a structural result on the decomposition of symmetric SAGE exponentials as sums of orbits of (non-symmetric) AGE exponentials.

Let $G$ be a finite group acting linearly on $\mathbb{R}^n$ on the left, namely we have a group homomorphism
\[ \varphi : G \to \mathrm{GL}_n(\mathbb{R}), \qquad \sigma \mapsto \varphi(\sigma). \]
For $\sigma \in G$ and $x \in \mathbb{R}^n$, we denote by $\sigma \cdot x$ the image of $x$ through $\varphi(\sigma)$. In order to get a left action on the set of functions defined on $\mathbb{R}^n$, we need to take
\[ (\sigma * f)(x) = f(\sigma^{-1} \cdot x) = f(\varphi(\sigma^{-1})(x)). \tag{3.1} \]
For a signomial $f(x) = \sum_\alpha c_\alpha \exp(\langle \alpha, x \rangle)$, we see an exponent vector $\alpha$ as an element of the dual space. Then, the dual action of $G$ on the exponent vectors is given by
\[ \sigma \perp \alpha := \varphi(\sigma^{-1})^{\top}(\alpha), \]
where $A^{\top}$ denotes the adjoint operator of $A$. Note that this is a left action as well. Therefore, even if the exponents and the variables lie in isomorphic spaces, the actions of $G$ on these spaces are different and dual to each other, and satisfy
\[ \langle \alpha, \sigma \cdot x \rangle = \langle \alpha, \varphi(\sigma)(x) \rangle = \langle \varphi(\sigma)^{\top}(\alpha), x \rangle = \langle \sigma^{-1} \perp \alpha, x \rangle \]
and furthermore, for a signomial $f$,
\[ (\sigma * f)(x) = f(\sigma^{-1} \cdot x) = \sum_\alpha c_\alpha \exp(\langle \alpha, \sigma^{-1} \cdot x \rangle) = \sum_\alpha c_\alpha \exp(\langle \sigma \perp \alpha, x \rangle). \tag{3.2} \]
From now on, in order to keep the notation as light as possible, with a slight abuse of notation, we write $\sigma(x) = \sigma \cdot x$ for the action on the variables, $\sigma f = \sigma * f$ for the action on functions, and $\sigma(\alpha) = \sigma \perp \alpha$ for the dual action. Even if the actions are different, the context should clarify the correspondence.

For a set $\mathcal{S} \subset \mathbb{R}^n$ of exponent vectors, the orbit of $\mathcal{S}$ under $G$ is
\[ G \cdot \mathcal{S} = \{ \sigma(s) : s \in \mathcal{S},\ \sigma \in G \}. \]
We call a subset $\hat{\mathcal{S}} \subset \mathcal{S}$ a set of orbit representatives for $\mathcal{S}$ if $\hat{\mathcal{S}}$ is an inclusion-minimal set with $G \cdot \hat{\mathcal{S}} = \mathcal{S}$. Moreover, let
\[ \operatorname{Stab} \beta := \{ \sigma \in G : \sigma(\beta) = \beta \} \]
denote the stabilizer of an exponent vector $\beta$.

In the following statements, we consider $G$-invariant signomials $f$. It is convenient to write $f$ here in the form
\[ f = \sum_{\alpha \in \mathcal{A}} c_\alpha \exp(\langle \alpha, x \rangle) + \sum_{\beta \in \mathcal{B}} c_\beta \exp(\langle \beta, x \rangle) \tag{3.3} \]
with $c_\alpha > 0$ for $\alpha \in \mathcal{A}$ and $c_\beta < 0$ for $\beta \in \mathcal{B}$, i.e., $f$ has the sign pattern of the elements of the signed SAGE cone $C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$ introduced in Section 2. As already mentioned, in this notation, the overall support set of $f$ is $\mathcal{A} \cup \mathcal{B}$.

Theorem 3.1.
Let $K \subset \mathbb{R}^n$ be convex and $G$-invariant, let $f$ be a $G$-invariant signomial of the form (3.3) and $\hat{\mathcal{B}}$ be a set of orbit representatives for $\mathcal{B}$. Then $f \in C_K(\mathcal{A}, \mathcal{B})$ if and only if for every $\hat\beta \in \hat{\mathcal{B}}$, there exists a $K$-AGE signomial $h_{\hat\beta} \in C_K(\mathcal{A}, \hat\beta)$ such that
\[ f = \sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\rho \in G/\operatorname{Stab}(\hat\beta)} \rho h_{\hat\beta}. \tag{3.4} \]
The functions $h_{\hat\beta}$ can be chosen to be invariant under the action of $\operatorname{Stab}(\hat\beta)$.

Here, $\rho \in G/\operatorname{Stab}(\hat\beta)$ is shorthand for $\rho$ running over a set of representatives of the left quotient space $G/\operatorname{Stab}(\hat\beta)$, which is defined through the left cosets $\{ \sigma \operatorname{Stab}(\hat\beta) : \sigma \in G \}$. We will also use the right quotient space, denoted by $\operatorname{Stab}(\hat\beta) \backslash G$, further below.

Proof. Since it is clear that a signomial $f$ of the form (3.4) is non-negative, we only have to show the converse direction. Let $f \in C_K(\mathcal{A}, \mathcal{B})$. By Theorem 2.2, there exist $K$-AGE signomials $f_\beta \in C_K(\mathcal{A}, \beta)$ for $\beta \in \mathcal{B}$, such that $f = \sum_{\beta \in \mathcal{B}} f_\beta$. The $G$-invariance of $f$ gives
\[ f = \frac{1}{|G|} \sum_{\sigma \in G} \sigma f = \frac{1}{|G|} \sum_{\sigma \in G} \sum_{\beta \in \mathcal{B}} \sigma f_\beta. \tag{3.5} \]
The idea is to group in this sum all the $\sigma f_\beta$ that have the same "possibly negative" term. According to (3.2), the possibly negative term of $\sigma f_\beta$ is given by $\sigma(\beta)$. For any $\beta \in \mathcal{B}$, the signomial
\[ h_\beta = \frac{1}{|G|} \sum_{\sigma \in G} \sigma f_{\sigma^{-1}(\beta)} \]
is a sum of $K$-AGE signomials in $C_K(\mathcal{A}, \beta)$, hence it is contained in $C_K(\mathcal{A}, \beta)$ as well. Moreover, (3.5) can be expressed as
\[ f = \frac{1}{|G|} \sum_{\sigma \in G} \sum_{\beta \in \mathcal{B}} \sigma f_\beta = \frac{1}{|G|} \sum_{\sigma \in G} \sum_{\gamma \in \mathcal{B}} \sigma f_{\sigma^{-1}(\gamma)} = \sum_{\gamma \in \mathcal{B}} h_\gamma. \]
Let $\beta \in \mathcal{B}$ and $\hat\beta \in \hat{\mathcal{B}}$ be the representative of its orbit in $\hat{\mathcal{B}}$. If $\sigma, \tau \in G$ are such that $\sigma(\hat\beta) = \tau(\hat\beta) = \beta$, then $\tau^{-1}\sigma \in \operatorname{Stab}(\hat\beta)$ and $\tau = \sigma$ in $G/\operatorname{Stab}(\hat\beta)$. Hence,
\[ f = \sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\rho \in G/\operatorname{Stab}(\hat\beta)} h_{\rho(\hat\beta)}. \tag{3.6} \]
Now observe that $h_{\rho(\beta)} = \rho h_\beta$ for every $\beta \in \mathcal{B}$ and $\rho \in G$, because
\[ |G|\, \rho h_\beta = \sum_{\sigma \in G} \rho\sigma f_{\sigma^{-1}(\beta)} = \sum_{\tau \in G} \tau f_{\tau^{-1}\rho(\beta)} = |G|\, h_{\rho(\beta)}. \tag{3.7} \]
Substituting (3.7) into (3.6) gives $f = \sum_{\hat\beta \in \hat{\mathcal{B}}} \sum_{\rho \in G/\operatorname{Stab}(\hat\beta)} \rho h_{\hat\beta}$ as desired. Moreover, the $\operatorname{Stab}(\hat\beta)$-invariance of $h_{\hat\beta}$ for $\hat\beta \in \hat{\mathcal{B}}$ follows from (3.7). □

Remark 3.2.
Note that the previous results extend naturally to compact/reductive groups, since they mainly rely on the existence of a Reynolds operator. For the sake of simplicity, we presented them for finite groups, where the Reynolds operator corresponds to a finite average over the group.

4. Symmetry reduction in relative entropy programming
Building upon the previous decomposition theorem, we provide a symmetry-adaptedrelative entropy formulation for containment in the SAGE cone.
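Theorem 4.1.
Let $\hat{\mathcal{B}}$ be a set of orbit representatives for $\mathcal{B}$. A $G$-invariant signomial $f$ of the form (3.3) is contained in $C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$ if and only if for every $\hat\beta \in \hat{\mathcal{B}}$ there exist $c^{(\hat\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ and $\nu^{(\hat\beta)} \in \mathbb{R}^{\mathcal{A}}_+$, invariant under the action of $\operatorname{Stab}(\hat\beta)$, such that
\begin{align*}
\sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_\alpha (\alpha - \hat\beta) &= 0 && \text{for every } \hat\beta \in \hat{\mathcal{B}}, \tag{4.1} \\
D(\nu^{(\hat\beta)}, e \cdot c^{(\hat\beta)}) &\le c_{\hat\beta} && \text{for every } \hat\beta \in \hat{\mathcal{B}}, \tag{4.2} \\
\sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\sigma \in \operatorname{Stab}(\hat\beta) \backslash G} c^{(\hat\beta)}_{\sigma(\alpha)} &\le c_\alpha && \text{for every } \alpha \in \mathcal{A}. \tag{4.3}
\end{align*}

Remark 4.2. The right coset condition (4.3) can equivalently be expressed in terms of the left cosets,
\[ \sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\sigma \in G/\operatorname{Stab}(\hat\beta)} c^{(\hat\beta)}_{\sigma^{-1}(\alpha)} \le c_\alpha \quad \text{for every } \alpha \in \mathcal{A}. \]
Namely, if $\beta \in \mathcal{B}$, $\hat\beta \in \hat{\mathcal{B}}$ and $\sigma, \tau \in G$ are such that $\sigma^{-1}(\hat\beta) = \tau^{-1}(\hat\beta) = \beta$, then $\tau\sigma^{-1} \in \operatorname{Stab}(\hat\beta)$ and $\tau = \sigma$ in the right quotient space $\operatorname{Stab}(\hat\beta) \backslash G$.

Proof of Theorem 4.1. If $f \in C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$ is $G$-symmetric, then, by Theorem 3.1, there exist $\operatorname{Stab}(\hat\beta)$-invariant AGE signomials $h_{\hat\beta} \in C_{\mathrm{SAGE}}(\mathcal{A}, \hat\beta)$ for every $\hat\beta \in \hat{\mathcal{B}}$ such that
\[ f = \sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\rho \in G/\operatorname{Stab}(\hat\beta)} \rho h_{\hat\beta}. \]
Writing $h_{\hat\beta}$ in the form
\[ h_{\hat\beta} = \sum_{\alpha \in \mathcal{A}} c^{(\hat\beta)}_\alpha \exp(\langle \alpha, x \rangle) + c_{\hat\beta} \exp(\langle \hat\beta, x \rangle) \]
with coefficients $c^{(\hat\beta)}_\alpha$ and $c_{\hat\beta}$ for $\alpha \in \mathcal{A}$ and $\hat\beta \in \hat{\mathcal{B}}$, the two conditions (4.1) and (4.2) follow from the property $h_{\hat\beta} \in C_{\mathrm{SAGE}}(\mathcal{A}, \hat\beta)$. For (4.3), we observe that for $\alpha \in \mathcal{A}$, the coefficient of $\exp(\langle \alpha, x \rangle)$ in $\rho h_{\hat\beta}$ is $c^{(\hat\beta)}_{\rho^{-1}(\alpha)}$. We obtain inequality (4.3), even with equality, by setting $\sigma := \rho^{-1}$ and summing over $\hat\beta \in \hat{\mathcal{B}}$ and over $\sigma \in \operatorname{Stab}(\hat\beta) \backslash G$, following Remark 4.2.

Moreover, the $\operatorname{Stab}(\hat\beta)$-invariance of $h_{\hat\beta}$ implies the $\operatorname{Stab}(\hat\beta)$-invariance of $c^{(\hat\beta)}$. In order to make $\nu^{(\hat\beta)}$ invariant under $\operatorname{Stab}(\hat\beta)$, we can replace it with
\[ \mu^{(\hat\beta)}_\alpha = \frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_{\sigma(\alpha)}. \]
Obviously, this has no influence on (4.3). For (4.1), we have
\begin{align*}
|\operatorname{Stab}(\hat\beta)| \sum_{\alpha \in \mathcal{A}} \mu^{(\hat\beta)}_\alpha (\alpha - \hat\beta)
&= \sum_{\alpha \in \mathcal{A}} \ \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_{\sigma(\alpha)} (\alpha - \hat\beta) \\
&= \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \sigma^{-1} \sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_{\sigma(\alpha)} (\sigma(\alpha) - \sigma(\hat\beta)) \\
&= \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \sigma^{-1} \sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_\alpha (\alpha - \hat\beta) = 0.
\end{align*}
Finally, for (4.2), using $c^{(\hat\beta)}_\alpha = c^{(\hat\beta)}_{\sigma(\alpha)}$ for $\sigma \in \operatorname{Stab}(\hat\beta)$ and applying Jensen's inequality to the convex function $x \mapsto x \ln x$ gives, for all $\alpha \in \mathcal{A}$,
\begin{align*}
\mu^{(\hat\beta)}_\alpha \ln \frac{\mu^{(\hat\beta)}_\alpha}{c^{(\hat\beta)}_\alpha}
&= \frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_{\sigma(\alpha)} \, \ln \frac{\frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_{\sigma(\alpha)}}{c^{(\hat\beta)}_\alpha} \\
&= c^{(\hat\beta)}_\alpha \, \frac{\sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_{\sigma(\alpha)} / c^{(\hat\beta)}_{\sigma(\alpha)}}{|\operatorname{Stab}(\hat\beta)|} \, \ln \frac{\sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_{\sigma(\alpha)} / c^{(\hat\beta)}_{\sigma(\alpha)}}{|\operatorname{Stab}(\hat\beta)|} \\
&\le \frac{c^{(\hat\beta)}_\alpha}{|\operatorname{Stab}(\hat\beta)|} \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \frac{\nu^{(\hat\beta)}_{\sigma(\alpha)}}{c^{(\hat\beta)}_{\sigma(\alpha)}} \ln \frac{\nu^{(\hat\beta)}_{\sigma(\alpha)}}{c^{(\hat\beta)}_{\sigma(\alpha)}}.
\end{align*}
Using again the $\operatorname{Stab}(\hat\beta)$-invariance of $c^{(\hat\beta)}$ and the precondition then yields
\[ \sum_{\alpha \in \mathcal{A}} \mu^{(\hat\beta)}_\alpha \ln \frac{\mu^{(\hat\beta)}_\alpha}{e\, c^{(\hat\beta)}_\alpha} \le \frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} \ \sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_{\sigma(\alpha)} \ln \frac{\nu^{(\hat\beta)}_{\sigma(\alpha)}}{e\, c^{(\hat\beta)}_{\sigma(\alpha)}} \le \frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\sigma \in \operatorname{Stab}(\hat\beta)} c_{\hat\beta} = c_{\hat\beta}. \]

Conversely, assume that $c^{(\hat\beta)}$ and $\nu^{(\hat\beta)}$, invariant under the action of $\operatorname{Stab}(\hat\beta)$, satisfy (4.1)-(4.3). Let $\beta \in \mathcal{B}$ and $\hat\beta \in \hat{\mathcal{B}}$ be the representative of its orbit in $\hat{\mathcal{B}}$. If $\sigma, \tau \in G$ are such that $\sigma(\beta) = \tau(\beta) = \hat\beta$, then $\tau\sigma^{-1} \in \operatorname{Stab}(\hat\beta)$ and $\tau = \sigma$ in $\operatorname{Stab}(\hat\beta) \backslash G$. Since $c^{(\hat\beta)}$ and $\nu^{(\hat\beta)}$ are invariant under $\operatorname{Stab}(\hat\beta)$, we have
\[ c^{(\hat\beta)}_{\tau(\alpha)} = c^{(\hat\beta)}_{\sigma(\alpha)}, \qquad \nu^{(\hat\beta)}_{\tau(\alpha)} = \nu^{(\hat\beta)}_{\sigma(\alpha)} \qquad \text{for } \alpha \in \mathcal{A}. \]
Thus we can define
\[ c^{(\beta)}_\alpha = c^{(\hat\beta)}_{\sigma(\alpha)}, \qquad \nu^{(\beta)}_\alpha = \nu^{(\hat\beta)}_{\sigma(\alpha)} \qquad \text{for } \alpha \in \mathcal{A}, \]
which is independent of the choice of $\sigma$ with $\sigma(\beta) = \hat\beta$. As a consequence, if $\tau \in \operatorname{Stab}(\hat\beta) \backslash G$, then $c^{(\tau^{-1}(\hat\beta))}_\alpha = c^{(\hat\beta)}_{\tau(\alpha)}$ is well defined.

To see that the first conditions of Proposition 2.1 are satisfied, let $\beta \in \mathcal{B}$ and $\sigma \in G$ such that $\sigma(\beta) = \hat\beta$. Then
\begin{align*}
\sum_{\alpha \in \mathcal{A}} \nu^{(\beta)}_\alpha (\alpha - \beta)
&= \sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_{\sigma(\alpha)} (\alpha - \sigma^{-1}(\hat\beta)) \\
&= \sigma^{-1} \sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_{\sigma(\alpha)} (\sigma(\alpha) - \hat\beta) = \sigma^{-1} \sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_\alpha (\alpha - \hat\beta) = 0
\end{align*}
and
\[ D(\nu^{(\beta)}, e\, c^{(\beta)}) = D(\nu^{(\hat\beta)}, e\, c^{(\hat\beta)}) \le c_{\hat\beta} = c_\beta. \]
For the third condition of Proposition 2.1, we obtain
\[ \sum_{\beta \in \mathcal{B}} c^{(\beta)}_\alpha = \sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\tau \in \operatorname{Stab}(\hat\beta) \backslash G} c^{(\tau^{-1}(\hat\beta))}_\alpha = \sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\tau \in \operatorname{Stab}(\hat\beta) \backslash G} c^{(\hat\beta)}_{\tau(\alpha)} \le c_\alpha, \]
which altogether shows that $f \in C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$. □

The following consequence of Theorem 4.1 further reduces the number of variables in the relative entropy program, since a certain number of the $c^{(\hat\beta)}_\alpha$ and $\nu^{(\hat\beta)}_\alpha$ are actually equal, and we can take each $c^{(\hat\beta)}$, $\nu^{(\hat\beta)}$ in the ground set $\mathbb{R}^{\mathcal{A}/\operatorname{Stab}(\hat\beta)}_+$.

Corollary 4.3.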
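Let $\hat{\mathcal{A}}$ and $\hat{\mathcal{B}}$ be sets of orbit representatives for $\mathcal{A}$ and $\mathcal{B}$, respectively. A $G$-invariant signomial $f$ of the form (3.3) is contained in $C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$ if and only if for every $\hat\beta \in \hat{\mathcal{B}}$ there exist $c^{(\hat\beta)} \in \mathbb{R}^{\mathcal{A}/\operatorname{Stab}(\hat\beta)}_+$ and $\nu^{(\hat\beta)} \in \mathbb{R}^{\mathcal{A}/\operatorname{Stab}(\hat\beta)}_+$ such that
\begin{align*}
\sum_{\alpha \in \mathcal{A}/\operatorname{Stab}(\hat\beta)} \nu^{(\hat\beta)}_\alpha \sum_{\alpha' \in \operatorname{Stab}(\hat\beta) \cdot \alpha} (\alpha' - \hat\beta) &= 0 && \text{for every } \hat\beta \in \hat{\mathcal{B}}, \tag{4.4} \\
\sum_{\alpha \in \mathcal{A}/\operatorname{Stab}(\hat\beta)} \big| \operatorname{Stab}(\hat\beta) \cdot \alpha \big| \, \nu^{(\hat\beta)}_\alpha \ln \frac{\nu^{(\hat\beta)}_\alpha}{e\, c^{(\hat\beta)}_\alpha} &\le c_{\hat\beta} && \text{for every } \hat\beta \in \hat{\mathcal{B}}, \tag{4.5} \\
\sum_{\hat\beta \in \hat{\mathcal{B}}} \frac{|\operatorname{Stab}(\alpha)|}{|\operatorname{Stab}(\hat\beta)|} \sum_{\gamma \in (G \cdot \alpha)/\operatorname{Stab}(\hat\beta)} \big| \operatorname{Stab}(\hat\beta) \cdot \gamma \big| \, c^{(\hat\beta)}_\gamma &\le c_\alpha && \text{for every } \alpha \in \hat{\mathcal{A}}. \tag{4.6}
\end{align*}

Proof. For (4.4) and (4.5), equivalence to their versions in Theorem 4.1 is straightforward to check. For (4.6), equivalence to (4.3) follows by observing that for every $\alpha \in \mathcal{A}$
\begin{align*}
\sum_{\sigma \in \operatorname{Stab}(\hat\beta) \backslash G} c^{(\hat\beta)}_{\sigma(\alpha)}
&= \sum_{\sigma \in \operatorname{Stab}(\hat\beta) \backslash G} \frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\tau \in \operatorname{Stab}(\hat\beta)} c^{(\hat\beta)}_{\tau(\sigma(\alpha))}
= \frac{1}{|\operatorname{Stab}(\hat\beta)|} \sum_{\rho \in G} c^{(\hat\beta)}_{\rho(\alpha)} \\
&= \frac{|\operatorname{Stab}(\alpha)|}{|\operatorname{Stab}(\hat\beta)|} \sum_{\gamma \in G \cdot \alpha} c^{(\hat\beta)}_\gamma
= \frac{|\operatorname{Stab}(\alpha)|}{|\operatorname{Stab}(\hat\beta)|} \sum_{\gamma \in (G \cdot \alpha)/\operatorname{Stab}(\hat\beta)} \big| \operatorname{Stab}(\hat\beta) \cdot \gamma \big| \, c^{(\hat\beta)}_\gamma,
\end{align*}
and the last expression only depends on the orbit $G \cdot \alpha$ rather than on $\alpha$ itself. □

Remark 4.4. Note that we cannot simply assume $c^{(\beta)}_\alpha = c^{(\beta)}_{\alpha'}$ for some $\alpha' \in G \cdot \alpha$ and, similarly, we cannot simply assume $\nu^{(\beta)}_\alpha = \nu^{(\beta)}_{\alpha'}$ for some $\alpha' \in G \cdot \alpha$, for instance due to (2.1). Namely, if an element $\beta$ lies in $\operatorname{conv} \mathcal{A}$ with barycentric coordinates $\lambda$, say $\beta = \sum_{\alpha \in \mathcal{A}} \lambda_\alpha \alpha$, then for any $\sigma \in G$, we have
\[ \sigma(\beta) = \sigma\Big( \sum_{\alpha \in \mathcal{A}} \lambda_\alpha \alpha \Big) = \sum_{\alpha \in \mathcal{A}} \sigma(\lambda_\alpha \alpha) = \sum_{\alpha \in \mathcal{A}} \lambda_\alpha \sigma(\alpha) \]
rather than $\sigma(\beta) = \sigma(\sum_{\alpha \in \mathcal{A}} \lambda_\alpha \alpha) = \sum_{\alpha \in \mathcal{A}} \lambda_{\sigma(\alpha)} \sigma(\alpha)$. Of course, this caveat does not occur whenever there is a single inner term.

For symmetric constraint sets $K$, a constrained version of Theorem 4.1 (and similarly, of Corollary 4.3) can be given as well. The proof is similar.

Corollary 4.5. Let $K \subset \mathbb{R}^n$ be convex and $G$-invariant, and let $\hat{\mathcal{B}}$ be a set of orbit representatives for $\mathcal{B}$. A $G$-invariant signomial $f$ of the form (3.3) is contained in $C_K(\mathcal{A}, \mathcal{B})$ if and only if for every $\hat\beta \in \hat{\mathcal{B}}$ there exist $c^{(\hat\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ and $\nu^{(\hat\beta)} \in \mathbb{R}^{\mathcal{A}}_+$ such that
\begin{align*}
D(\nu^{(\hat\beta)}, e \cdot c^{(\hat\beta)}) + \sup_{x \in K} \Big\langle -\sum_{\alpha \in \mathcal{A}} \nu^{(\hat\beta)}_\alpha (\alpha - \hat\beta),\, x \Big\rangle &\le c_{\hat\beta} && \text{for every } \hat\beta \in \hat{\mathcal{B}}, \\
\sum_{\hat\beta \in \hat{\mathcal{B}}} \ \sum_{\sigma \in \operatorname{Stab}(\hat\beta) \backslash G} c^{(\hat\beta)}_{\sigma(\alpha)} &\le c_\alpha && \text{for every } \alpha \in \mathcal{A}.
\end{align*}

To close this section we discuss the resulting complexity reduction. Note that the initial relative entropy formulation, which does not take the symmetry into consideration, involves $2|\mathcal{B}||\mathcal{A}|$ variables. Furthermore, since every vector equality in (4.4) brings $n$ scalar equalities, it consists of $|\mathcal{B}| n + |\mathcal{B}| + |\mathcal{A}|$ (in)equalities.

In contrast, let us analyze the number of variables and constraints involved in the relative entropy program in Corollary 4.3. Observe that $\mathcal{A}/\operatorname{Stab}(\hat\beta)$ is the disjoint union of the $(G \cdot \hat\alpha)/\operatorname{Stab}(\hat\beta)$ where $\hat\alpha$ runs through $\hat{\mathcal{A}}$. It follows that for every pair $\hat\beta \in \hat{\mathcal{B}}$, $\hat\alpha \in \hat{\mathcal{A}}$, we have exactly $2\,|(G \cdot \hat\alpha)/\operatorname{Stab}(\hat\beta)|$ variables $c^{(\hat\beta)}_\gamma$ and $\nu^{(\hat\beta)}_\gamma$.

By definition, $|(G \cdot \hat\alpha)/\operatorname{Stab}(\hat\beta)|$ is the number of $\operatorname{Stab}(\hat\beta)$-orbits in $G \cdot \hat\alpha$. Since $G \cdot \hat\alpha$ is in bijection with $\operatorname{Stab}(\hat\alpha) \backslash G$, we get a bijection between $(G \cdot \hat\alpha)/\operatorname{Stab}(\hat\beta)$ and the set of double cosets $\operatorname{Stab}(\hat\alpha) \backslash G / \operatorname{Stab}(\hat\beta)$. Therefore, the number of orbits in question equals $|\operatorname{Stab}(\hat\alpha) \backslash G / \operatorname{Stab}(\hat\beta)|$, satisfying, according to Burnside's Lemma (see for instance [28, Lemma 7.24.5]):
\[ |\operatorname{Stab}(\hat\alpha) \backslash G / \operatorname{Stab}(\hat\beta)| = \frac{1}{|\operatorname{Stab}(\hat\alpha)|\,|\operatorname{Stab}(\hat\beta)|} \sum_{\substack{\sigma \in \operatorname{Stab}(\hat\alpha) \\ \tau \in \operatorname{Stab}(\hat\beta)}} |G^{\sigma,\tau}|, \]
where $|G^{\sigma,\tau}|$ is the number of elements of $G$ fixed under the action of $(\sigma, \tau)$.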
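The orbit count discussed above can be checked by brute force on a small case. The following is a minimal sketch and not part of the paper: it assumes $G = \mathcal{S}_3$ acting on $\mathbb{R}^3$ by permuting coordinates, two illustrative exponent vectors, and the action $g \mapsto \sigma g \tau^{-1}$ of a pair $(\sigma, \tau)$ on $G$, whose orbits are the double cosets. All function names are hypothetical.

```python
from itertools import permutations

def act(p, v):
    """Permutation action on vectors: entry i of v moves to position p[i]."""
    out = [None] * len(v)
    for i, target in enumerate(p):
        out[target] = v[i]
    return tuple(out)

def compose(p, q):
    """Composition (p o q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, target in enumerate(p):
        inv[target] = i
    return tuple(inv)

def stabilizer(G, v):
    return [p for p in G if act(p, v) == v]

def orbits_of_stabilizer(G, alpha, beta):
    """Number of Stab(beta)-orbits in the G-orbit of alpha, by enumeration."""
    orbit = {act(p, alpha) for p in G}
    H = stabilizer(G, beta)
    return len({frozenset(act(h, gamma) for h in H) for gamma in orbit})

def double_cosets_burnside(G, alpha, beta):
    """Number of double cosets Stab(alpha)\\G/Stab(beta) via Burnside's Lemma.

    The pair (s, t) is assumed to act on G by g -> s g t^{-1}; we count
    the fixed points of every pair and average.
    """
    Ha, Hb = stabilizer(G, alpha), stabilizer(G, beta)
    fixed = sum(1 for s in Ha for t in Hb for g in G
                if compose(compose(s, g), inverse(t)) == g)
    return fixed // (len(Ha) * len(Hb))

G = list(permutations(range(3)))
alpha, beta = (7, 0, 0), (1, 1, 2)   # illustrative exponent vectors
print(orbits_of_stabilizer(G, alpha, beta), double_cosets_burnside(G, alpha, beta))
```

For these two vectors both counts agree: the stabilizer of $(1,1,2)$ swaps the first two coordinates, so the orbit of $(7,0,0)$ splits into $\{(7,0,0),(0,7,0)\}$ and $\{(0,0,7)\}$.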
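From another point of view, the number of double cosets can be interpreted in terms of representation theory as follows: it is given by the inner product of the two characters corresponding to the representations induced respectively by the trivial representations of $\operatorname{Stab}(\hat\alpha)$ and $\operatorname{Stab}(\hat\beta)$ on $G$ (see [28, Exercise 7.77.a] for more details).

Furthermore, the program in Corollary 4.3 amounts to $|\hat{\mathcal{A}}| + |\hat{\mathcal{B}}|$ inequalities, coming from (4.5) and (4.6), together with one vector equality (4.4) for every element of $\hat{\mathcal{B}}$. We observe that for a given $\hat\beta$, this vector is invariant under $\operatorname{Stab}(\hat\beta)$ and therefore is contained in $(\mathbb{R}^n)^{\operatorname{Stab}(\hat\beta)}$, the subspace of $\mathbb{R}^n$ of points fixed by $\operatorname{Stab}(\hat\beta)$. Thus, by projecting onto this subspace, the number of resulting equations reduces to $\dim\big( (\mathbb{R}^n)^{\operatorname{Stab}(\hat\beta)} \big)$. As a conclusion, we obtain:

Theorem 4.6. Let $\hat{\mathcal{A}}$ and $\hat{\mathcal{B}}$ be sets of orbit representatives for $\mathcal{A}$ and $\mathcal{B}$, respectively. For $\hat\alpha \in \hat{\mathcal{A}}$, $\hat\beta \in \hat{\mathcal{B}}$, denote by ${}_{\hat\alpha}G_{\hat\beta}$ the cardinality $|\operatorname{Stab}(\hat\alpha) \backslash G / \operatorname{Stab}(\hat\beta)|$, and by $n_{\hat\beta}$ the dimension of the fixed subspace $(\mathbb{R}^n)^{\operatorname{Stab}(\hat\beta)}$. Then, the relative entropy program in Corollary 4.3 consists of
\[ 2 \sum_{\substack{\hat\alpha \in \hat{\mathcal{A}} \\ \hat\beta \in \hat{\mathcal{B}}}} {}_{\hat\alpha}G_{\hat\beta} \ \text{ variables}, \qquad \sum_{\hat\beta \in \hat{\mathcal{B}}} n_{\hat\beta} \ \text{ scalar equalities}, \qquad \text{and } |\hat{\mathcal{A}}| + |\hat{\mathcal{B}}| \text{ inequalities}. \]

5. The case of the symmetric group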
In this section, we focus our attention on the case of the symmetric group $\mathcal{S}_n$ acting on $\mathbb{R}^n$ by permutation of the coordinates: for $\sigma \in \mathcal{S}_n$ and $x \in \mathbb{R}^n$,
\[ \sigma(x) = (x_{\sigma^{-1}(1)}, \ldots, x_{\sigma^{-1}(n)}). \]
Note that because the action is orthogonal, the dual action on the exponent vectors is the same. Optimization problems invariant under this action arise naturally in different contexts, for example, in the context of graph homomorphisms [3]. This action is very natural, and the representation theory of the symmetric group is very well understood and affords strong connections with combinatorics. This connection has been successfully used in several instances to reduce the sizes of optimization problems. In particular, it was shown in [27, Theorem 4.7] (see also [8, Theorem 3.21]) that the size of a semi-definite program which certifies whether a given symmetric polynomial is a sum of squares stabilizes once the number of variables is big enough. Similarly to these results, we show in Theorem 5.2 an analogous result in the AM/GM setup. This result mainly stems from the fact that the cardinalities appearing in Theorem 4.6 have a combinatorial interpretation in the context of symmetric group actions.
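The two claims about the permutation action, that it is a left action and that the dual action on exponent vectors coincides with it, can be verified by enumeration in a small case. The sketch below is an illustration rather than part of the paper: it uses 0-based indices, $n = 3$, integer test vectors chosen arbitrarily, and hypothetical helper names.

```python
from itertools import permutations

def act(sigma, x):
    """sigma(x)_j = x_{sigma^{-1}(j)}: entry i of x moves to position sigma[i]."""
    out = [None] * len(x)
    for i, target in enumerate(sigma):
        out[target] = x[i]
    return tuple(out)

def compose(sigma, tau):
    """Composition (sigma o tau)(i) = sigma[tau[i]]."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))

def inverse(sigma):
    inv = [0] * len(sigma)
    for i, target in enumerate(sigma):
        inv[target] = i
    return tuple(inv)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

G = list(permutations(range(3)))
x = (3, -1, 4)        # arbitrary point, integer entries keep comparisons exact
alpha = (1, 1, 2)     # arbitrary exponent vector

# Left action: (sigma o tau)(x) = sigma(tau(x)) for all sigma, tau.
is_left_action = all(act(compose(s, t), x) == act(s, act(t, x))
                     for s in G for t in G)

# Duality: <alpha, sigma(x)> = <sigma^{-1}(alpha), x>, where sigma^{-1}
# acts on alpha by the same coordinate permutation rule.
dual_matches = all(dot(alpha, act(s, x)) == dot(act(inverse(s), alpha), x)
                   for s in G)

print(is_left_action, dual_matches)
```

Both checks succeed for every pair of permutations, which reflects that permutation matrices are orthogonal, so the adjoint of $\varphi(\sigma)$ is $\varphi(\sigma^{-1})$.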
First, up to permutation, every $\alpha \in \mathbb{R}^n$ is of the form
\[ \alpha = (\underbrace{\alpha_1, \ldots, \alpha_1}_{\lambda_1}, \underbrace{\alpha_2, \ldots, \alpha_2}_{\lambda_2}, \ldots, \underbrace{\alpha_k, \ldots, \alpha_k}_{\lambda_k}), \]
with pairwise distinct entries $\alpha_1, \ldots, \alpha_k$ and $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_k > 0$. The stabilizer $\operatorname{Stab}(\alpha)$ is, up to conjugation, of the form
\[ \mathcal{S}_{\lambda_1} \times \cdots \times \mathcal{S}_{\lambda_k}, \]
so that $|\operatorname{Stab}(\alpha)| = \lambda_1! \cdots \lambda_k!$. The corresponding partition $\lambda = (\lambda_1, \ldots, \lambda_k)$ of $n$ is called the orbit type $\Lambda(\alpha)$ of $\alpha$. We then denote by $\operatorname{len}(\alpha)$ the length of this partition, namely $\operatorname{len}(\alpha) = k$. Consequently, for $\hat\beta \in \hat{\mathcal{B}}$, the dimension $n_{\hat\beta}$ of the fixed subspace $(\mathbb{R}^n)^{\operatorname{Stab}(\hat\beta)}$ is precisely $\operatorname{len}(\hat\beta)$.

Furthermore, let $\hat\alpha \in \hat{\mathcal{A}}$ be of orbit type $\Lambda(\hat\alpha) = (\lambda_1, \ldots, \lambda_k)$, and $\hat\beta \in \hat{\mathcal{B}}$ of orbit type $\Lambda(\hat\beta) = (\mu_1, \ldots, \mu_\ell)$. Then, the interpretation of ${}_{\hat\alpha}(\mathcal{S}_n)_{\hat\beta}$ as the inner product of characters gives a combinatorial understanding of the number ${}_{\hat\alpha}(\mathcal{S}_n)_{\hat\beta} = |\operatorname{Stab}(\hat\alpha) \backslash \mathcal{S}_n / \operatorname{Stab}(\hat\beta)|$: it is given by the number $N_{\Lambda(\hat\alpha), \Lambda(\hat\beta)} = |\mathcal{M}_{\Lambda(\hat\alpha), \Lambda(\hat\beta)}|$, where $\mathcal{M}_{\Lambda(\hat\alpha), \Lambda(\hat\beta)}$ is the set of matrices of size $k \times \ell$ with non-negative integer coefficients such that, for $1 \le i \le k$, the elements of the $i$-th row sum up to $\lambda_i$, and for $1 \le j \le \ell$, the elements of the $j$-th column sum up to $\mu_j$. This quantity can be alternatively computed by using the so-called Kostka numbers defined for pairs of partitions. More precisely, we have
\[ {}_{\hat\alpha}(\mathcal{S}_n)_{\hat\beta} = N_{\Lambda(\hat\alpha), \Lambda(\hat\beta)} = \sum_{\mu} K_{\mu, \Lambda(\hat\alpha)} K_{\mu, \Lambda(\hat\beta)}, \]
where $\mu$ runs through the partitions of $n$. For more details about these interpretations, see [28, Chapter 7], in particular Corollary 7.12.3 therein.

Now we illustrate the potential gain of this reduction, already in a very small example:

Example 5.1.
Consider the support set
\[ \{\alpha_0, \ldots, \alpha_7\} = \{ (0,0,0)^T, (7,0,0)^T, (0,7,0)^T, (0,0,7)^T, (1,1,2)^T, (1,2,1)^T, (2,1,1)^T, (2,2,2)^T \} \]
and let $G := \mathcal{S}_3$ be the symmetric group on three elements. In order to avoid too heavy notation, we will write $c^{(i)}_j$ instead of $c^{(\alpha_i)}_{\alpha_j}$ and $\nu^{(i)}_j$ instead of $\nu^{(\alpha_i)}_{\alpha_j}$. Consider a signomial
\[ f(x_1, x_2, x_3) = \sum_{i=0}^{7} c_i\, e^{\langle \alpha_i, (x_1, x_2, x_3) \rangle}, \]
with $c_0, c_1, c_2, c_3 > 0$ and $c_4, c_5, c_6, c_7 < 0$, i.e., set $\mathcal{A} = \{\alpha_0, \ldots, \alpha_3\}$, $\mathcal{B} = \{\alpha_4, \ldots, \alpha_7\}$. Then $\hat{\mathcal{A}} = \{\alpha_0, \alpha_1\}$ and $\hat{\mathcal{B}} = \{\alpha_4, \alpha_7\}$ are sets of orbit representatives. The corresponding partitions are $\Lambda(\alpha_0) = \Lambda(\alpha_7) = (3)$, and $\Lambda(\alpha_1) = \Lambda(\alpha_4) = (2,1)$. By Corollary 4.3, $f \in C_{\mathrm{SAGE}}(\mathcal{A}, \mathcal{B})$ if and only if there exist $c^{(4)} = (c^{(4)}_0, c^{(4)}_1, c^{(4)}_3)$, $\nu^{(4)} = (\nu^{(4)}_0, \nu^{(4)}_1, \nu^{(4)}_3)$, $c^{(7)} = (c^{(7)}_0, c^{(7)}_1)$ and $\nu^{(7)} = (\nu^{(7)}_0, \nu^{(7)}_1)$ satisfying the conditions
\begin{align*}
\nu^{(4)}_0 (\alpha_0 - \alpha_4) + \nu^{(4)}_1 (\alpha_1 + \alpha_2 - 2\alpha_4) + \nu^{(4)}_3 (\alpha_3 - \alpha_4) &= 0, \\
\nu^{(7)}_0 (\alpha_0 - \alpha_7) + \nu^{(7)}_1 (\alpha_1 + \alpha_2 + \alpha_3 - 3\alpha_7) &= 0, \\
\nu^{(4)}_0 \ln \frac{\nu^{(4)}_0}{e\, c^{(4)}_0} + 2 \nu^{(4)}_1 \ln \frac{\nu^{(4)}_1}{e\, c^{(4)}_1} + \nu^{(4)}_3 \ln \frac{\nu^{(4)}_3}{e\, c^{(4)}_3} &\le c_4, \\
\nu^{(7)}_0 \ln \frac{\nu^{(7)}_0}{e\, c^{(7)}_0} + 3 \nu^{(7)}_1 \ln \frac{\nu^{(7)}_1}{e\, c^{(7)}_1} &\le c_7, \\
3 c^{(4)}_0 + c^{(7)}_0 &\le c_0, \\
2 c^{(4)}_1 + c^{(4)}_3 + c^{(7)}_1 &\le c_1.
\end{align*}
Note that here $\operatorname{len}(\alpha_4) = 2$ and $\operatorname{len}(\alpha_7) = 1$, so that the two vectorial equations bring together $2 + 1$ scalar equations. In total, we get $2(1 + 1 + 2 + 1) = 10$ variables and $2 + 1 + 2 + 2 = 7$ constraints, against $2 \cdot 4 \cdot 4 = 32$ variables and $4 \cdot 3 + 4 + 4 = 20$ constraints in the formulation without symmetry reduction, following the counts $2|\mathcal{B}||\mathcal{A}|$ and $|\mathcal{B}| n + |\mathcal{B}| + |\mathcal{A}|$ from Section 4.

Now fix $n_0 \in \mathbb{N}$ and start with a signomial $f_{n_0}$ in $n_0$ variables, represented by the orbit representatives of the exponent vectors $\hat{\mathcal{A}}$ and $\hat{\mathcal{B}}$, as well as the corresponding coefficients. For each of these exponents $\alpha$, we denote by $\tilde\Lambda(\alpha)$ the orbit type of $\alpha$ where we forget about the entries equal to $0$. For instance, when $n_0 = 3$, $\hat{\mathcal{A}} = \{\hat\alpha\} = \{(1,1,2)\}$ and $\hat{\mathcal{B}} = \{\hat\beta\} = \{(0,0,1)\}$, then $\tilde\Lambda(\hat\alpha) = (2,1)$ while $\tilde\Lambda(\hat\beta) = (1)$. Note that these sequences do not have to be partitions of $n_0$; we therefore introduce
\[ \operatorname{wt}(\alpha) = \sum_{\lambda \in \tilde\Lambda(\alpha)} \lambda, \]
counting the number of non-zero coordinates of $\alpha$, and refer to it as the weight of $\alpha$. Hence $\operatorname{wt}(1,1,2) = 3$, while $\operatorname{wt}(0,0,1) = 1$. Now, for every $n \ge n_0$, we can see $\alpha$ as an exponent in $\mathbb{R}^n$, by adding $n - n_0$ zeroes. This procedure does not affect $\tilde\Lambda(\alpha)$ and $\operatorname{wt}(\alpha)$. In this way, we can define for every $n > n_0$ the unique $\mathcal{S}_n$-invariant signomial $f_n$ whose support is made of the $\mathcal{S}_n$-orbits of $\hat{\mathcal{A}}$ and $\hat{\mathcal{B}}$ with the corresponding coefficients. Clearly, in this situation, the number of constraints $C_n$ in Corollary 4.3 does not depend on $n$, since it only involves $|\hat{\mathcal{B}}|$, $|\hat{\mathcal{A}}|$, and the length of the elements in $\hat{\mathcal{B}}$, which does not change when $n \ge n_0 + 1$. In this framework, a similar phenomenon holds for the number of variables:

Theorem 5.2.
Let $n_0 \in \mathbb{N}$, and let $\hat{\mathcal{A}}$, $\hat{\mathcal{B}}$ be finite sets of orbit representatives of exponent vectors in $\mathbb{R}^{n_0}$. Consider, for $n \geq n_0$, the signomial $f_n$ previously defined, and denote by $V_n$ the number of variables in the symmetry-adapted relative entropy program in Corollary 4.3. Let $m = \max\{\mathrm{wt}(\alpha) : \alpha \in \hat{\mathcal{A}} \cup \hat{\mathcal{B}}\}$. Then, for every $n \geq 2m$, $V_n = V_{2m}$.

Proof. We show by induction that $V_n = V_{2m}$ for every $n \geq 2m$. The initial step being obvious, assume $n > 2m$. The definition of $m$ ensures that for $n \geq 2m$ and every $\alpha$ in $\hat{\mathcal{A}} \cup \hat{\mathcal{B}}$, the coordinate occurring the most in $\alpha$ is $0$, and therefore
$$\Lambda(\alpha) = (n - \mathrm{wt}(\alpha), \lambda_1, \dots, \lambda_k),$$
where $(\lambda_1,\dots,\lambda_k) = \tilde\Lambda(\alpha)$. Recall that the number of variables is given by
$$V_n = 2 \sum_{\hat\alpha \in \hat{\mathcal{A}},\, \hat\beta \in \hat{\mathcal{B}}} N^{n}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)},$$
where, if $\Lambda(\hat\alpha) = (n - \mathrm{wt}(\hat\alpha), \lambda_1,\dots,\lambda_k)$ and $\Lambda(\hat\beta) = (n - \mathrm{wt}(\hat\beta), \mu_1,\dots,\mu_\ell)$, the quantity $N^{n}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)}$ counts the number of matrices of size $(k+1) \times (\ell+1)$ with non-negative integer coefficients of the form
$$\begin{array}{c|cccc}
 & n - \mathrm{wt}(\hat\beta) & \mu_1 & \cdots & \mu_\ell \\ \hline
n - \mathrm{wt}(\hat\alpha) & \cdot & \cdot & \cdots & \cdot \\
\lambda_1 & \cdot & \cdot & \cdots & \cdot \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda_k & \cdot & \cdot & \cdots & \cdot
\end{array} \tag{5.1}$$
where the labels give the sum of the coefficients in the corresponding row/column. Since $\hat{\mathcal{A}}$ and $\hat{\mathcal{B}}$ keep the same number of elements, we only need to show that for every $\hat\alpha \in \hat{\mathcal{A}}$ and $\hat\beta \in \hat{\mathcal{B}}$, we have $N^{n}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)} = N^{n-1}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)}$.

If we start with a matrix of the form
$$\begin{array}{c|cccc}
 & n - 1 - \mathrm{wt}(\hat\beta) & \mu_1 & \cdots & \mu_\ell \\ \hline
n - 1 - \mathrm{wt}(\hat\alpha) & \cdot & \cdot & \cdots & \cdot \\
\lambda_1 & \cdot & \cdot & \cdots & \cdot \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\lambda_k & \cdot & \cdot & \cdots & \cdot
\end{array} \tag{5.2}$$
then adding $1$ to the top left coefficient provides a matrix of the form (5.1). Since this operation is injective, it proves $N^{n}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)} \geq N^{n-1}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)}$.

In order to show the reverse inequality, we claim that the top left coefficient in (5.1) cannot be $0$. Indeed, if it is $0$, then the sum of the coefficients in the first row is at most
$$\mu_1 + \mu_2 + \dots + \mu_\ell = \mathrm{wt}(\hat\beta).$$
This implies $n - \mathrm{wt}(\hat\alpha) \leq \mathrm{wt}(\hat\beta)$, which forces $n \leq 2m$ and gives a contradiction. Now, if the top left coefficient is a positive integer, subtracting $1$ from this coefficient provides a matrix of the form (5.2), which proves $N^{n}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)} \leq N^{n-1}_{\Lambda(\hat\alpha),\Lambda(\hat\beta)}$, and hence $V_n = V_{n-1}$. $\Box$

Building on this, we can actually show that a stabilization occurs for a large class of problems.
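The stabilization can be checked by brute force on small instances. The following Python sketch is only an illustration under stated assumptions: the helper names (`orbit_type`, `weight`, `count_tables`, `num_variables`) are hypothetical and not taken from the paper, and the instance $\hat\alpha = (1,1,2)$, $\hat\beta = (0,0,1)$ is assumed as a toy input. It enumerates the non-negative integer matrices with prescribed row and column sums and evaluates $V_n = 2\sum N_{\Lambda(\hat\alpha),\Lambda(\hat\beta)}$ after padding the exponents with zeros.

```python
from collections import Counter
from functools import lru_cache


def orbit_type(alpha):
    """Partition given by the multiplicities of the distinct entries of alpha."""
    return tuple(sorted(Counter(alpha).values(), reverse=True))


def weight(alpha):
    """Number of non-zero coordinates of alpha."""
    return sum(1 for a in alpha if a != 0)


def count_tables(row_sums, col_sums):
    """Count non-negative integer matrices with the given row and column sums."""
    if sum(row_sums) != sum(col_sums):
        return 0
    rows = tuple(row_sums)

    def fill_row(total, caps):
        # All ways to split `total` over the columns, entry j being at most caps[j].
        if not caps:
            if total == 0:
                yield ()
            return
        for x in range(min(total, caps[0]) + 1):
            for rest in fill_row(total - x, caps[1:]):
                yield (x,) + rest

    @lru_cache(maxsize=None)
    def count(i, remaining):
        # `remaining` holds the column sums still to be filled by rows i, i+1, ...
        if i == len(rows):
            return 1 if all(c == 0 for c in remaining) else 0
        return sum(
            count(i + 1, tuple(c - x for c, x in zip(remaining, row)))
            for row in fill_row(rows[i], remaining)
        )

    return count(0, tuple(col_sums))


def num_variables(A_hat, B_hat, n):
    """V_n = 2 * sum of N over all pairs, with exponents padded by zeros to length n."""
    def pad(v):
        return tuple(v) + (0,) * (n - len(v))

    return 2 * sum(
        count_tables(orbit_type(pad(a)), orbit_type(pad(b)))
        for a in A_hat
        for b in B_hat
    )


A_hat, B_hat = [(1, 1, 2)], [(0, 0, 1)]
m = max(weight(v) for v in A_hat + B_hat)  # m = 3 for this instance
for n in range(3, 10):
    print(n, num_variables(A_hat, B_hat, n))
```

For this instance $m = 3$, and the printed values are constant for $n \geq 6 = 2m$, in line with the statement of the theorem. Here they are in fact already constant from $n = 4$ on, so the threshold $2m$ is sufficient but not always necessary.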
Theorem 5.3.
Let $k, l, w \in \mathbb{N}$ be fixed. Then for every integer $n \geq 2w$ and every $\mathcal{S}_n$-invariant signomial $f \in C(\mathcal{A},\mathcal{B})$ with $|\hat{\mathcal{A}}| \leq k$, $|\hat{\mathcal{B}}| \leq l$, and
$$\max_{\hat\gamma \in \hat{\mathcal{A}} \cup \hat{\mathcal{B}}} \mathrm{wt}(\hat\gamma) \leq w,$$
the number of constraints and the number of variables of the symmetry-adapted program are bounded by constants only depending on $k$, $l$ and $w$:
$$C_n \leq k + l + l(w+1) \quad \text{and} \quad V_n \leq 2lk\,u(w), \quad \text{where } u(w) = \sum_{i=0}^{w} \binom{w}{i}^2 i!.$$

Proof.
Let us begin with the number of constraints. This follows from Theorem 5.2, because $|\hat{\mathcal{A}}| \leq k$, $|\hat{\mathcal{B}}| \leq l$ and $n_{\hat\beta} \leq w + 1$, since $\mathrm{wt}(\hat\beta) \leq w$. As in the previous proof, the first part of $\Lambda(\hat\alpha)$ is $n - \mathrm{wt}(\hat\alpha)$, and similarly for $\hat\beta$. For the number of variables, we will show that
$$N_{\Lambda(\hat\alpha),\Lambda(\hat\beta)} \leq N_{(n-w,1^w),(n-w,1^w)} = u(w) \tag{5.3}$$
for every $\hat\alpha$, $\hat\beta$ satisfying the conditions of the theorem. This will be done in two steps.

First, we show that if $\lambda' = (\lambda_1,\dots,\lambda_t,1)$ is a partition and $\lambda = (\lambda_1,\dots,\lambda_{t-1},\lambda_t+1)$, then for every partition $\mu$, we have
$$N_{\lambda',\mu} \geq N_{\lambda,\mu} \quad \text{and} \quad N_{\mu,\lambda'} \geq N_{\mu,\lambda}.$$
Indeed, there is a surjection from the set $M_{\lambda',\mu}$ onto $M_{\lambda,\mu}$, obtained by adding the last two rows. Namely, let $(x_1,\dots,x_r)$ denote the $t$-th row of an element in $M_{\lambda,\mu}$, and let $s$ be such that $x_s > 0$. Replacing the $t$-th row of this element by $(x_1,\dots,x_{s-1},x_s-1,x_{s+1},\dots,x_r)$ and inserting $(0,\dots,0,1,0,\dots,0)$, with the entry $1$ in position $s$, as the $(t+1)$-th row, we get an element in $M_{\lambda',\mu}$ which is mapped to the original matrix. By applying this procedure recursively for rows and columns, we get the inequality in (5.3).

To show the equality in (5.3), observe that the top-left entry of a matrix in $M_{(n-w,1^w),(n-w,1^w)}$ has to be an integer $a$ between $n - 2w$ and $n - w$. For every such choice, we have to distribute $n - w - a$ ones in the first row and in the first column. This gives $\binom{w}{n-w-a}^2$ possibilities. Restricted to the $w \times w$ lower right submatrix, the selected rows and columns contain only $0$. For each of these possibilities, after removing the chosen rows and columns, we get a $(2w-n+a) \times (2w-n+a)$ matrix which contains exactly one $1$ per row and column. There are $(2w-n+a)!$ such matrices. By the change of index $i = 2w - n + a$, we get the desired result:
$$V_n = 2 \sum_{\hat\alpha \in \hat{\mathcal{A}},\, \hat\beta \in \hat{\mathcal{B}}} \hat\alpha(\mathcal{S}_n)\hat\beta = 2 \sum_{\hat\alpha \in \hat{\mathcal{A}},\, \hat\beta \in \hat{\mathcal{B}}} N_{\Lambda(\hat\alpha),\Lambda(\hat\beta)} \leq 2lk\,u(w). \qquad \Box$$

We conclude this section by giving explicit estimates on the signomials where $|\hat{\mathcal{B}}| = 1$ and $\hat{\mathcal{A}} = \{0, \hat\alpha\}$. We have chosen four different classes of examples that show the influence of the sizes of the orbits on the numbers of variables and constraints. These classes represent extremal situations, namely when the orbits are either very large or very small. In these situations, we can actually compute the exact number of variables and constraints for both methods according to the previous discussions. Note that the last case falls into the framework of Theorem 5.2, where $\mathrm{wt}(\hat\alpha) = \mathrm{wt}(\hat\beta) = 1$. There is also a stabilization in the first sequence: when $\mathrm{len}(\hat\beta) = 1$, for every $\hat\alpha \in \hat{\mathcal{A}}$, the number of variables $\hat\alpha(\mathcal{S}_n)\hat\beta$ is equal to $1$. The subsequent table summarizes our analysis. Specific signomials realizing the cases are given in Examples 6.1–6.4 in the next section.

| $\lvert\mathcal{S}_n \cdot \hat\beta\rvert$ | $\lvert\mathcal{S}_n \cdot \hat\alpha\rvert$ | Standard $V_n$ | Standard $C_n$ | Symmetric $V_n$ | Symmetric $C_n$ | Example |
|---|---|---|---|---|---|---|
| $1$ | $n!$ | $2n! + 3$ | $n! + n + 2$ | $5$ | $4$ | 6.1 |
| $n!$ | $n$ | $2(n+1)n! + 1$ | $(n+1)(n!+1)$ | $2n + 3$ | $n + 3$ | 6.2 |
| $n!$ | $n!$ | $2(n!+1)n! + 1$ | $n!(n+2) + 1$ | $2n! + 3$ | $n + 3$ | 6.3 |
| $n$ | $n$ | $2n(n+1) + 1$ | $(n+1)^2$ | $7$ | $5$ | 6.4 |

Table 1. Comparison of the parameters when $\hat{\mathcal{A}} = \{0, \hat\alpha\}$ and $\hat{\mathcal{B}} = \{\hat\beta\}$.

6. Numerical experiments
To illustrate the previous considerations, we present in this section classes of examples that highlight the computational gains by comparing calculation times in the case of the symmetric group. For these computations, we used the ECOS solver and Python 3.7 on an Intel(R) Xeon(R) Platinum 8168 CPU with 2.7 GHz and 768 GB of RAM under CentOS Linux release 7.9.2009. Keeping the previous notation, for the standard method, that is, the method that does not exploit the symmetries, the input consists of $\mathcal{A}$, $\mathcal{B}$ as well as the coefficients, while for the symmetry-adapted version, the input is $\hat{\mathcal{A}}$, $\hat{\mathcal{B}}$ and the coefficients. This difference of input is mainly due to practical considerations and does not in itself influence the comparison of the time used by the solver. When both methods give an answer, the bounds coincide.

In all the tables in the sequel, dim is the dimension, $V_n$ and $C_n$ are the number of variables and constraints of the program, while $t_s$ and $t_r$ denote the solver time and the overall running time (including the building of the optimization program) in seconds. While it might happen that the standard method is slightly faster for very small instances, the size growth of the program in the standard method quickly makes it unsolvable. This case is represented by "–" in the table. The symmetric approach, however, allows us to go further, and we give all the results until the solver warns about a possible inaccuracy. In this case, we mark the bound with "∗".

The first four examples give numerical results for each of the classes discussed in Table 1. We tried to choose the coefficients in a way that avoids numerical issues, namely preventing the bound from being either too small or too large.
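The size growth mentioned above can be made concrete with the closed formulas of Table 1. The short Python sketch below is an illustration only: the function name `sizes_example_6_2` is hypothetical, and the formulas are assumed to be those of the Table 1 row with $\lvert\mathcal{S}_n \cdot \hat\beta\rvert = n!$ and $\lvert\mathcal{S}_n \cdot \hat\alpha\rvert = n$, namely $V_n = 2(n+1)n! + 1$ and $C_n = (n+1)(n!+1)$ for the standard method against $V_n = 2n+3$ and $C_n = n+3$ for the symmetric method.

```python
from math import factorial


def sizes_example_6_2(n):
    """Return (V_std, C_std, V_sym, C_sym) for the class with orbit sizes n! and n."""
    nf = factorial(n)
    v_std = 2 * (n + 1) * nf + 1   # variables, standard method
    c_std = (n + 1) * (nf + 1)     # constraints, standard method
    v_sym = 2 * n + 3              # variables, symmetry-adapted program
    c_sym = n + 3                  # constraints, symmetry-adapted program
    return v_std, c_std, v_sym, c_sym


for n in range(6, 11):
    print(n, *sizes_example_6_2(n))
```

For $n = 7$ this gives $80641$ variables and $40328$ constraints for the standard method against $17$ and $10$ for the symmetric one, which are the sizes reported for dimension 7 in Table 3.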
Example 6.1.
Consider first the signomial f (1) n = 1 n ! (cid:88) σ ∈S n σ exp( (cid:104) α, x (cid:105) ) − exp( (cid:104) β, x (cid:105) ) , where β = (1 , . . . ,
1) and α = (1 , , . . . , n ). The numerical results are shown in Table 2. Standard method Symmetric method dim bound V n C n t s t r V n C n t s t r . . . . . . ∗ Table 2.
Numerical results for f (1) n . Example 6.2.
Consider now the signomial
$$f^{(2)}_n = (n-1)! \sum_{i=1}^{n} \exp(n^2 x_i) - \sum_{\sigma \in \mathcal{S}_n} \sigma \exp(\langle \beta, x \rangle),$$
where $\beta = (1,2,\dots,n)$ (and $\alpha = (n^2,0,\dots,0)$).

| dim | bound | Standard $V_n$ | Standard $C_n$ | Standard $t_s$ | Standard $t_r$ | Symmetric $V_n$ | Symmetric $C_n$ | Symmetric $t_s$ | Symmetric $t_r$ |
|---|---|---|---|---|---|---|---|---|---|
| ⋮ | | | | | | | | | |
| 6 | | | | | 5.843 | 15 | 9 | 0.0423 | 0.0458 |
| 7 | -1024 | 80641 | 40328 | 57.26 | 66.67 | 17 | 10 | 0.0491 | 0.0538 |
| 8 | -8418 | 725761 | 362889 | 1514 | 2211 | 19 | 11 | 0.0568 | 0.0626 |
| 9 | -77355 | 7257601 | 3628810 | – | – | 21 | 12 | 0.0661 | 0.0835 |
| 10 | | 79833601 | 39916811 | – | – | 23 | 13 | – | – |

Table 3. Numerical results for $f^{(2)}_n$.

Example 6.3.
Next, we consider the case where both orbits are of maximal size. Let f (3) n = 1 n (cid:88) σ ∈S n exp( (cid:104) α, x (cid:105) ) − n (cid:88) σ ∈S n σ exp( (cid:104) β, x (cid:105) ) , where β = (1 , , . . . , n ) and α = (2 , , . . . , n ).The numerical results are shown in Table 4. Standard method Symmetric method dim bound V n C n t s t r V n C n t s t r . . . . Table 4.
Numerical results for f (3) n . Example 6.4.
Finally, we consider the case where both orbits are small. Let f (4) n = 1 n n (cid:88) i =1 exp( n x i ) − n n (cid:88) i =1 exp(( n − x + · · · + x n ) + x i ) , ( β = ( n, n − , n − , . . . , n −
1) and α = ( n , , . . . , Standard method Symmetric method dim bound V n C n t s t r V n C n t s t r .
019 01 0 . . . . . ∗ Table 5.
Numerical results for f (4) n . Example 6.5.
Finally, we give an example where A and B consist of two orbits each:ˆ A = { ( n , , . . . , , (1 , , . . . , n ) } and ˆ B = { (1 , . . . , , (1 , , . . . , n ) } . In this case, we are still able to compute the number of constraints and the number ofvariables. With the standard approach, V n = 2( n ! + n + 1)( n ! + 1) + 1 , C n = ( n ! + 1)( n + 2) + n, while using symmetries, V n = 2 n ! + 2 n + 9 , C n = n + 6 . YMMETRY REDUCTION IN AM/GM-BASED OPTIMIZATION 19
Table 6 shows the numerical results for the signomials g n = 1 n n (cid:88) i =1 exp( n x i ) + 1 n (cid:88) σ ∈S n σ exp( (cid:104) α, x (cid:105) ) − exp( x + · · · + x n ) − n (cid:88) σ ∈S n σ exp( (cid:104) β, x (cid:105) )for α = (1 , , . . . , n ) and β = (1 , , . . . , n ). Standard method Symmetric method dim bound V n C n t s t r V n C n t s t r . . . . Table 6.
Numerical results for $g_n$.

7. Conclusion and open questions
We have developed techniques to exploit symmetries in AM/GM-based optimization and confirmed their benefit in terms of computational results. In particular, in the case of symmetric signomials, we showed that, both theoretically and practically, our orbit reduction allows for substantial computational gains. This motivates a theoretical study of the strength of the AM/GM bounds in this framework. In particular, it encourages the comparison of the symmetric SAGE cones with the cone of symmetric non-negative signomials.
References

[1] C. Bachoc, D. C. Gijswijt, A. Schrijver, and F. Vallentin. Invariant semidefinite programs. In Handbook on Semidefinite, Conic and Polynomial Optimization, pages 219–269. Springer, 2012.
[2] C. Bachoc and F. Vallentin. New upper bounds for kissing numbers from semidefinite programming. J. Amer. Math. Soc., 21(3):909–924, 2008.
[3] G. Blekherman, A. Raymond, M. Singh, and R. Thomas. Simple graph density inequalities with no sum of squares proofs. Combinatorica, 40(4):455–471, 2020.
[4] G. Blekherman and C. Riener. Symmetric non-negative forms and sums of squares. To appear in Discrete Comput. Geom., 2021.
[5] V. Chandrasekaran and P. Shah. Relative entropy relaxations for signomial optimization. SIAM J. Optim., 26(2):1147–1173, 2016.
[6] V. Chandrasekaran and P. Shah. Relative entropy optimization and its applications. Math. Program., Ser. A, 161(1-2):1–32, 2017.
[7] E. de Klerk and R. Sotirov. Exploiting group symmetry in semidefinite programming relaxations of the quadratic assignment problem. Math. Program., 122(2):225, 2010.
[8] S. Debus and C. Riener. Reflection groups and cones of sums of squares. arXiv preprint arXiv:2011.09997, 2020.
[9] C. Dobre and J. Vera. Exploiting symmetry in copositive programs via semidefinite hierarchies. Math. Program., 151(2):659–680, 2015.
[10] M. Dressler, J. Heuer, H. Naumann, and T. de Wolff. Global optimization via the dual SONC cone and linear programming. In Proc. 45th International Symposium on Symbolic and Algebraic Computation, pages 138–145, 2020.
[11] M. Dressler, S. Iliman, and T. de Wolff. An approach to constrained polynomial optimization via nonnegative circuit polynomials and geometric programming. J. Symb. Comp., 91:149–172, 2019.
[12] T. Friedl, C. Riener, and R. Sanyal. Reflection groups, reflection arrangements, and invariant real varieties. Proc. Amer. Math. Soc., 146(3):1031–1045, 2018.
[13] K. Gatermann and P. A. Parrilo. Symmetry groups, semidefinite programs, and sums of squares. J. Pure & Applied Algebra, 192(1-3):95–128, 2004.
[14] A. Heaton, S. Hoşten, and I. Shankar. Symmetry adapted Gram spectrahedra. Preprint, arXiv:2004.09641, 2020.
[15] S. Iliman and T. de Wolff. Amoebas, nonnegative polynomials and sums of squares supported on circuits. Res. Math. Sci., 3(paper no. 9), 2016.
[16] S. Iliman and T. de Wolff. Lower bounds for polynomials with simplex Newton polytopes based on geometric programming. SIAM J. Optim., 26(2):1128–1146, 2016.
[17] O. Karaca, G. Darivianakis, P. Beuchat, A. Georghiou, and J. Lygeros. The REPOP toolbox: Tackling polynomial optimization using relative entropy relaxations. In [...], volume 50(1), pages 11652–11657. Elsevier, 2017.
[18] L. Katthän, H. Naumann, and T. Theobald. A unified framework of SAGE and SONC polynomials and its duality theory. To appear in Math. Comput., 2021.
[19] P. Moustrou, C. Riener, and H. Verdure. Symmetric ideals, Specht polynomials and solutions to symmetric systems of equations. J. Symb. Comp., 107:106–121, 2021.
[20] R. Murray, V. Chandrasekaran, and A. Wierman. Newton polytopes and relative entropy optimization. To appear in Found. Comput. Math., 2021.
[21] R. Murray, V. Chandrasekaran, and A. Wierman. Signomial and polynomial optimization via relative entropy and partial dualization. To appear in Math. Program. Comput., 2021.
[22] R. Murray, H. Naumann, and T. Theobald. Sublinear circuits and the constrained signomial optimization problem. Preprint, arXiv:2006.06811, 2020.
[23] C. Pantea, H. Koeppl, and G. Craciun. Global injectivity and multiple equilibria in uni- and bi-molecular reaction networks. Discrete and Continuous Dynamical Systems - Series B, 17(6):2153–2170, May 2012.
[24] A. Raymond, J. Saunderson, M. Singh, and R. R. Thomas. Symmetric sums of squares over k-subset hypercubes. Math. Program., Ser. A, 167(2):315–354, 2018.
[25] B. Reznick. Forms derived from the arithmetic-geometric inequality. Math. Annalen, 283(3):431–464, 1989.
[26] C. Riener. On the degree and half-degree principle for symmetric polynomials. J. Pure & Applied Algebra, 216(4):850–856, 2012.
[27] C. Riener, T. Theobald, L. Jansson-Andrén, and J. B. Lasserre. Exploiting symmetries in SDP-relaxations for polynomial optimization. Math. Oper. Res., 38(1):122–141, 2013.
[28] R. P. Stanley. Enumerative Combinatorics, volume 2. Cambridge University Press, 1999.
[29] V. Timofte. On the positivity of symmetric polynomial functions. Part I: General results. J. Math. Analysis and Applications, 284(1):174–190, 2003.
[30] J. Wang, V. Magron, and J. B. Lasserre. Chordal-TSSOS: a moment-SOS hierarchy that exploits term sparsity with chordal extension. SIAM J. Optim., 31:114–141, 2021.
[31] J. Wang, V. Magron, and J. B. Lasserre. TSSOS: a moment-SOS hierarchy that exploits term sparsity. SIAM J. Optim., 31:1–29, 2021.
[32] W. C. Waterhouse. Do symmetric problems have symmetric solutions? Amer. Math. Monthly, 90(6):378–387, 1983.