Dual optimal design and the Christoffel-Darboux polynomial

Abstract

The purpose of this short note is to show that the Christoffel-Darboux polynomial, useful in approximation theory and data science, arises naturally when deriving the dual to the problem of semi-algebraic D-optimal experimental design in statistics. It uses only elementary notions of convex analysis. Geometric interpretations and algorithmic consequences are mentioned.


arXiv preprint [math.OC]

Yohann De Castro, Fabrice Gamboa, Didier Henrion, Jean Bernard Lasserre

Draft of September 9, 2020


In [1] the problem of optimal design of statistical experiments was revisited in the broad framework of polynomial regressions on semi-algebraic domains. A numerical solution was proposed, based on the so-called moment-SOS (sums of squares) hierarchy of semidefinite programming relaxations [2]. Whereas optimality arguments were used in [1] to derive many of the results, the dual to the problem of optimal experimental design was not explicitly constructed and studied. It is the purpose of this note to clarify this point in a self-contained and direct way. We use elementary arguments of convex analysis to show how the Christoffel-Darboux polynomial, ubiquitous in approximation theory and data science [3], arises naturally in the dual to the D-optimal design problem.

Author affiliations: Institut Camille Jordan UMR 5208, École Centrale de Lyon, 36 Avenue Guy de Collongue, F-69134 Écully, France; Institut de Mathématiques de Toulouse, Université Paul Sabatier, CNRS, 118 route de Narbonne, F-31062 Toulouse, France; CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France; Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, CZ-16626 Prague, Czechia.

Let $\mathbb{S}^n$ denote the set of symmetric real matrices of size $n$. Given a vector $v_d(x)$ whose elements form a basis of the vector space of real polynomials of degree up to $d$ in the vector $x \in \mathbb{R}^n$, let

$$\mathcal{M}_d(\mathcal{X}) := \left\{ y \in \mathbb{R}^{n_d} : y = \int_{\mathcal{X}} v_d(x)\, d\mu(x) \ \text{for some positive Borel measure } \mu \text{ on } \mathcal{X} \right\}$$

denote the convex cone of moments of degree up to $d$ on a given compact semi-algebraic set $\mathcal{X} \subset \mathbb{R}^n$ with non-empty interior. This cone has dimension $n_d := \frac{(n+d)!}{n!\,d!}$. Given a measure $\mu$ and its moment vector $y \in \mathcal{M}_d(\mathcal{X})$, define the moment matrix

$$M_d(y) := \int_{\mathcal{X}} v_d(x)\, v_d^\star(x)\, d\mu(x) \in \mathbb{S}^{n_d}$$

where the star denotes transposition. Each entry of the matrix $v_d(x) v_d^\star(x)$ is a polynomial of degree up to $2d$, and hence it is a linear combination of elements in basis vector $v_d(x)$. Consequently, each entry of $M_d(y)$ is a linear combination of entries of the moment vector $y$, whose dimension is $n_d$. It follows that $M_d$ can be interpreted as a linear map from $\mathbb{R}^{n_d}$ to $\mathbb{S}^{n_d}$. Let $M_d^\star$ denote its adjoint map from $\mathbb{S}^{n_d}$ to $\mathbb{R}^{n_d}$, defined such that

$$\operatorname{trace}(M_d(y)\, X) = M_d^\star(X)\, y \tag{1}$$

holds for all $X \in \mathbb{S}^{n_d}$ and $y \in \mathbb{R}^{n_d}$.

Given a matrix $A \in \mathbb{R}^{m \times n_d}$, a vector $b \in \mathbb{R}^m$, a vector $c \in \mathbb{R}^{n_d}$ and a strictly concave function $\phi$ from $\mathbb{S}^{n_d}$ to $\mathbb{R}$, consider the primal optimal design problem

$$\begin{array}{ll} \inf_{y} & c^\star y - \phi(M_d(y)) \\ \text{s.t.} & Ay = b \\ & y \in \mathcal{M}_d(\mathcal{X}) \end{array} \tag{2}$$

where the infimum is with respect to vectors $y \in \mathbb{R}^{n_d}$.
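The objects defined above can be made concrete for an atomic measure. The following Python sketch is only an illustration under stated assumptions: it uses the monomial basis for $v_d(x)$, NumPy, and randomly drawn atoms in $[-1,1]^2$; the helper names `monomial_exponents`, `basis_vector` and `moment_matrix` are hypothetical and do not come from the note.

```python
import itertools
import math
import numpy as np

def monomial_exponents(n, d):
    """All exponent tuples of total degree <= d in n variables (assumed monomial basis)."""
    return [e for e in itertools.product(range(d + 1), repeat=n) if sum(e) <= d]

def basis_vector(x, exponents):
    """Evaluate the monomial basis vector v_d(x) at a single point x."""
    x = np.asarray(x, dtype=float)
    return np.array([np.prod(x ** np.array(e)) for e in exponents])

def moment_matrix(points, weights, exponents):
    """M_d = sum_i w_i v_d(x_i) v_d(x_i)^T for the atomic measure sum_i w_i delta_{x_i}."""
    V = np.array([basis_vector(x, exponents) for x in points])  # shape (atoms, n_d)
    return V.T @ (np.asarray(weights)[:, None] * V)

n, d = 2, 2
exponents = monomial_exponents(n, d)
n_d = math.comb(n + d, n)  # (n + d)! / (n! d!)

# Illustrative atomic probability measure: 10 random atoms with equal weights.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(10, n))
weights = np.full(10, 0.1)
M = moment_matrix(points, weights, exponents)

print("n_d =", n_d, "basis size =", len(exponents))
print("rank of moment matrix =", np.linalg.matrix_rank(M))
```

The number of exponent tuples matches $n_d$, and for generic atoms the resulting moment matrix has full rank, which is the situation used later to build an interior point.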
In problem (2) we introduce the matrix variable $Y \in \mathbb{S}^{n_d}$ and equality constraint $Y = M_d(y)$, the Lagrange multipliers $X$, $z$, $p$ and the Lagrangian

$$f(X, z, p, Y, y) := c^\star y - \phi(Y) + \operatorname{trace}\{X (Y - M_d(y))\} + z^\star (b - Ay) - p^\star y$$

where $X \in \mathbb{S}^{n_d}$, $z \in \mathbb{R}^m$ and $p$ belongs to $\mathcal{P}_d(\mathcal{X})$, the convex cone of polynomials that are positive on $\mathcal{X}$, which is dual to the moment cone $\mathcal{M}_d(\mathcal{X})$ according to the Riesz-Haviland Theorem [2, Theorem 3.1]. Use (1) to rearrange terms as follows:

$$f(X, z, p, Y, y) = b^\star z + \{\operatorname{trace}(XY) - \phi(Y)\} + \{c^\star - z^\star A - M_d^\star(X) - p^\star\}\, y.$$

The dual problem to (2) is obtained by maximizing the dual function

$$g(X, z, p) := \inf_{Y \in \mathbb{S}^{n_d},\ y \in \mathbb{R}^{n_d}} f(X, z, p, Y, y).$$

Since $f$ is affine in $y$, this infimum is finite only if the coefficient of $y$ vanishes, i.e.

$$p^\star = c^\star - z^\star A - M_d^\star(X). \tag{3}$$

Defining

$$\phi^\star(X) := \inf_{Y \in \mathbb{S}^{n_d}} \{\operatorname{trace}(XY) - \phi(Y)\}$$

as the concave conjugate function to $\phi$, the problem of maximizing the dual function becomes

$$\begin{array}{ll} \sup_{X, z} & b^\star z + \phi^\star(X) \\ \text{s.t.} & c^\star - z^\star A - M_d^\star(X) \in \mathcal{P}_d(\mathcal{X}) \end{array} \tag{4}$$

where the supremum is with respect to matrices $X \in \mathbb{S}^{n_d}$ and vectors $z \in \mathbb{R}^m$. Since $M_d^\star(X)\, v_d(x) = v_d(x)^\star X v_d(x)$, the conic constraint in dual problem (4) can be formulated as a polynomial positivity constraint

$$(c^\star - z^\star A)\, v_d(x) \geq v_d(x)^\star X v_d(x)$$

satisfied for all $x \in \mathcal{X}$.

Theorem 1 (Duality for optimal design)

Problems (2) and (4) are in strong duality, i.e. their values coincide.

Proof:

Weak duality, i.e. the value of primal problem (2) is greater than or equal to the value of dual problem (4), follows from the inequality

$$\phi^\star(X) + \phi(Y) \leq \operatorname{trace}(XY)$$

which holds by definition of $\phi^\star$ for every positive semidefinite pair $X$, $Y$. Indeed, for any feasible $X$, $y$, $z$ it holds

$$0 \leq (c^\star - z^\star A - M_d^\star(X))\, y = c^\star y - b^\star z - \operatorname{trace}(X M_d(y)) \leq c^\star y - b^\star z - \phi^\star(X) - \phi(M_d(y))$$

and hence

$$c^\star y - \phi(M_d(y)) \geq b^\star z + \phi^\star(X).$$

Strong duality, i.e. the above inequality is an equality for any optimal values $\hat{X}$, $\hat{y}$, $\hat{z}$, follows from concavity of function $\phi$ and the so-called Slater constraint qualification [2, Section C.1], i.e. the existence of an interior point for primal problem (2): choose e.g. the vector of moments of an atomic measure supported on $\mathcal{X}$ with more than $n_d$ distinct atoms. Then the moment matrix $M_d(y)$ is positive definite. Equivalently, the complementarity condition

$$\hat{p}^\star \hat{y} = 0 \tag{5}$$

holds for $\hat{p}^\star := c^\star - \hat{z}^\star A - M_d^\star(\hat{X})$ as in (3). $\square$

Christoffel-Darboux polynomial

In [1] various functions $\phi$ are considered, depending on the optimal design problem of interest. In optimal design problem (2), let

$$\phi(Y) := \log\det Y, \quad Ay := \int_{\mathcal{X}} d\mu(x), \quad b = 1, \quad c = 0 \tag{6}$$

i.e. we are minimizing over probability measures supported on $\mathcal{X}$ the function $-\phi$, which is the classical barrier function used in interior point methods for semidefinite programming [5]. The domain of $\phi$ is the cone of positive definite matrices. This optimal design problem has the same solution as the D-optimal design problem corresponding to the positively homogeneous objective function $(\det Y)^{-1/n_d}$.

Theorem 2

Problem (2) with data (6) has a unique solution $\hat{y} \in \mathcal{M}_d(\{x \in \mathcal{X} : p(x) = n_d\})$ where $p(x) := v_d^\star(x)\, M_d^{-1}(\hat{y})\, v_d(x)$ is the Christoffel-Darboux polynomial.

Proof:

Uniqueness of the solution $\hat{y}$ follows from convexity of the feasibility set and strict concavity of the function $\phi$ in the objective of problem (2). The solution $\hat{y}$ is an interior point, i.e. $M_d(\hat{y})$ is positive definite. As explained in the proof of Theorem 1, the Karush-Kuhn-Tucker (KKT) optimality conditions are necessary and sufficient for an optimal solution, see e.g. [2, Section C.1]: all partial derivatives of the Lagrangian $f$ must vanish, and this implies that

$$\frac{\partial f}{\partial Y} = \hat{X} - \frac{\partial \phi}{\partial Y}(M_d(\hat{y})) = 0$$

and hence that

$$\hat{X} = M_d^{-1}(\hat{y})$$

for an optimal primal-dual pair $\hat{X}$, $\hat{y}$. From the complementarity condition (5) and property (1) we deduce that the optimal $\hat{z}$ satisfies

$$\hat{z} = -M_d^\star(M_d^{-1}(\hat{y}))\, \hat{y} = -\operatorname{trace}(M_d(\hat{y})\, M_d^{-1}(\hat{y})) = -n_d.$$

Complementarity condition (5) means that an optimal vector of moments $\hat{y}$ corresponds to a measure $\mu$ supported on the zero level set of the optimal positive polynomial with coefficients $\hat{p}^\star = n_d A - M_d^\star(M_d^{-1}(\hat{y}))$, i.e. the polynomial $n_d - p(x)$, whose zero level set on $\mathcal{X}$ is the algebraic set $\{x \in \mathcal{X} : p(x) = n_d\}$. $\square$

The concave conjugate function is $\phi^\star(X) = n_d + \phi(X)$, its domain is the cone of positive definite matrices, and from the proof of Theorem 2, the dual design problem (4) has the simple form

$$\begin{array}{ll} \sup_{X} & \log\det X \\ \text{s.t.} & n_d - v_d(x)^\star X v_d(x) \geq 0, \quad \forall x \in \mathcal{X}. \end{array} \tag{7}$$

Its solution is $\hat{X} = M_d^{-1}(\hat{y})$ where $\hat{y}$ is the unique solution of problem (2). The Christoffel-Darboux polynomial $p(x) = v_d^\star(x)\, \hat{X}\, v_d(x)$ is SOS since matrix $\hat{X}$ is positive definite, and every dual feasible matrix $X$ is such that $n_d \geq v_d(x)^\star X v_d(x) \geq 0$ for all $x \in \mathcal{X}$.

From Theorem 2 the optimal sequence of moments $\hat{y}$ in primal problem (2) has a representing atomic measure $\mu$ whose atoms are given by the level set of the Christoffel-Darboux polynomial. This sequence of moments is unique, but there could be another measure, atomic or not, with the same moments.

Dual problem (7) also has an interpretation in computational geometry.
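Theorem 2, dual problem (7) and its geometric reading can be checked numerically on a small instance. The Python sketch below is an illustration under explicit assumptions rather than the method of the note: the set is the square $[-1,1]^2$ replaced by a $5 \times 5$ grid, $d = 1$ with the monomial basis $(1, x_1, x_2)$ so that $n_d = 3$, and the design weights are computed with the classical multiplicative update $w_i \leftarrow w_i\, p(x_i)/n_d$ from the optimal design literature instead of the moment-SOS hierarchy.

```python
import itertools
import numpy as np

# Assumed discretization of the design space X = [-1, 1]^2.
grid_1d = np.linspace(-1.0, 1.0, 5)
points = np.array(list(itertools.product(grid_1d, grid_1d)))

# d = 1 with the monomial basis v_1(x) = (1, x1, x2), hence n_d = 3.
V = np.column_stack([np.ones(len(points)), points])
n_d = V.shape[1]

def moment_matrix(w):
    """Moment matrix of the atomic measure with weights w on the grid points."""
    return V.T @ (w[:, None] * V)

def cd_values(w):
    """Values p(x_i) = v(x_i)^T M(w)^{-1} v(x_i) of the Christoffel-Darboux polynomial."""
    M_inv = np.linalg.inv(moment_matrix(w))
    return np.einsum("ij,jk,ik->i", V, M_inv, V)

# Multiplicative update (an assumed solver, not taken from the note).
# The weights stay normalized because sum_i w_i p(x_i) = trace(M^{-1} M) = n_d.
w = np.full(len(points), 1.0 / len(points))
for _ in range(500):
    w = w * cd_values(w) / n_d

M_hat = moment_matrix(w)
X_hat = np.linalg.inv(M_hat)
p = cd_values(w)

# Primal value of (2) with data (6): c = 0, so the objective is -log det M_d(y).
primal_value = -np.linalg.slogdet(M_hat)[1]
# Dual value of (4): b*z + phi*(X) with z = -n_d and phi*(X) = n_d + log det X.
dual_value = -n_d + (n_d + np.linalg.slogdet(X_hat)[1])

support = points[w > 1e-6]
print("support of the design:\n", support)
print("max of p on the grid:", p.max(), "(n_d =", n_d, ")")
print("primal value:", primal_value, "dual value:", dual_value)
```

Under these assumptions the weights concentrate on the four corners of the square, where $p(x) = 1 + x_1^2 + x_2^2$ reaches the value $n_d = 3$, and the primal and dual values agree. The set $\{x : p(x) \leq 3\}$ is then the disc of radius $\sqrt{2}$ circumscribing the square.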
If $d = 1$, then $x \mapsto n_d - v_d(x)^\star X v_d(x)$ is a quadratic polynomial and so the set $E := \{x \in \mathbb{R}^n : n_d - v_d(x)^\star X v_d(x) \geq 0\}$ is an ellipsoid that contains $\mathcal{X}$, and $\log\det X$ is related to the volume of $E$. So the dual design problem is also equivalent to the problem of finding the ellipsoid of minimum volume that contains $\mathcal{X}$, which is the celebrated Löwner-John ellipsoid problem. For $d = 1$ this was already observed in [6] and therefore (7) can be considered as a generalization to the case $d > 1$, with $E \supset \mathcal{X}$ and with $\log\det X$ as a proxy for the volume of $E$.

In this note we use only elementary concepts of convex analysis to show that the Christoffel-Darboux polynomial, so useful in approximation theory and data analysis [3], also arises naturally in the dual problem of D-optimal experimental design with semi-algebraic data, a standard convex optimization problem in statistics. Numerically, problem (2) is solved with the moment-SOS hierarchy, i.e. the moment cone $\mathcal{M}_d(\mathcal{X})$ is relaxed with a hierarchy of projections of spectrahedra of increasing size. As shown in [1], the Christoffel-Darboux polynomial can be used as a certificate of finite convergence of the hierarchy: the contact points of its level set at $n_d$ with $\mathcal{X}$ are the support of an optimal design.

The Christoffel-Darboux polynomial corresponds to a particular choice of a convex function to be minimized in the design problem, namely the logarithmic barrier function of the positive semidefinite cone. It would be interesting to study the polynomials arising in the dual design problem corresponding to other convex functions of the eigenvalues of positive semidefinite matrices [4].

Acknowledgement

Support from the ANR-3IA Artificial and Natural Intelligence Toulouse Institute is gratefully acknowledged.