Dual optimal design and the Christoffel-Darboux polynomial
Yohann de Castro, Fabrice Gamboa, Didier Henrion, Jean Lasserre
arXiv preprint [math.OC]
Draft of September 9, 2020
Abstract
The purpose of this short note is to show that the Christoffel-Darboux polynomial, useful in approximation theory and data science, arises naturally when deriving the dual to the problem of semi-algebraic D-optimal experimental design in statistics. It uses only elementary notions of convex analysis.
In [1] the problem of optimal design of statistical experiments was revisited in the broad framework of polynomial regressions on semi-algebraic domains. A numerical solution was proposed, based on the so-called moment-SOS (sums of squares) hierarchy of semidefinite programming relaxations [2]. Whereas optimality arguments were used in [1] to derive many of the results, the dual to the problem of optimal experimental design was not explicitly constructed and studied. It is the purpose of this note to clarify this point in a self-contained and direct way. We use elementary arguments of convex analysis to show how the Christoffel-Darboux polynomial, ubiquitous in approximation theory and data science [3], arises naturally in the dual to the D-optimal design problem.
Institut Camille Jordan UMR 5208, École Centrale de Lyon, 36 Avenue Guy de Collongue, F-69134 Écully, France.
Institut de Mathématiques de Toulouse, Université Paul Sabatier, CNRS, 118 route de Narbonne, F-31062 Toulouse, France.
CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France.
Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, CZ-16626 Prague, Czechia.

Let $\mathbb{S}^n$ denote the set of symmetric real matrices of size $n$. Given a vector $v_d(x)$ whose elements form a basis of the vector space of real polynomials of degree up to $d$ in the vector $x \in \mathbb{R}^n$, let
$$ \mathscr{M}_d(\mathcal{X}) := \left\{ y \in \mathbb{R}^{n_d} : y = \int_{\mathcal{X}} v_d(x)\, d\mu(x) \ \text{for some positive Borel measure } \mu \text{ on } \mathcal{X} \right\} $$
denote the convex cone of moments of degree up to $d$ on a given compact semi-algebraic set $\mathcal{X} \subset \mathbb{R}^n$ with non-empty interior. This cone has dimension
$$ n_d := \binom{n+d}{n} = \frac{(n+d)!}{n!\, d!}. $$
Given a measure $\mu$ and its moment vector $y \in \mathscr{M}_{2d}(\mathcal{X})$, define the moment matrix
$$ M_d(y) := \int_{\mathcal{X}} v_d(x)\, v_d^\star(x)\, d\mu(x) \in \mathbb{S}^{n_d} $$
where the star denotes transposition. Each entry of the matrix $v_d(x) v_d^\star(x)$ is a polynomial of degree up to $2d$, and hence it is a linear combination of the elements of the basis vector $v_{2d}(x)$. Consequently, each entry of $M_d(y)$ is a linear combination of entries of the moment vector $y$, whose dimension is $n_{2d}$. It follows that $M_d$ can be interpreted as a linear map from $\mathbb{R}^{n_{2d}}$ to $\mathbb{S}^{n_d}$. Let $M_d^\star$ denote its adjoint map from $\mathbb{S}^{n_d}$ to $\mathbb{R}^{n_{2d}}$, defined such that
$$ \mathrm{trace}(M_d(y) X) = M_d^\star(X)\, y \tag{1} $$
holds for all $X \in \mathbb{S}^{n_d}$ and $y \in \mathbb{R}^{n_{2d}}$.

Given a matrix $A \in \mathbb{R}^{m \times n_{2d}}$, a vector $b \in \mathbb{R}^m$, a vector $c \in \mathbb{R}^{n_{2d}}$ and a strictly concave function $\varphi$ from $\mathbb{S}^{n_d}$ to $\mathbb{R}$, consider the primal optimal design problem
$$ \begin{array}{ll} \inf_{y} & c^\star y - \varphi(M_d(y)) \\ \text{s.t.} & Ay = b \\ & y \in \mathscr{M}_{2d}(\mathcal{X}) \end{array} \tag{2} $$
where the infimum is with respect to vectors $y \in \mathbb{R}^{n_{2d}}$.
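The definitions above can be illustrated numerically. The following is a minimal sketch, not taken from the note itself: it assumes the monomial basis, the interval $[-1, 1]$ as design space, degree $d = 2$, and an arbitrary five-atom measure with equal weights. All function and variable names (`monomial_exponents`, `basis_matrix`, `adjoint`, `X_test`) are hypothetical. It builds the moment matrix directly from the measure, rebuilds it linearly from the moment vector of degree up to $2d$, and checks the adjoint identity (1).

```python
import itertools
import math
import numpy as np

def monomial_exponents(n, d):
    """Exponent tuples alpha in N^n with |alpha| <= d, sorted by total degree."""
    alphas = [a for a in itertools.product(range(d + 1), repeat=n) if sum(a) <= d]
    return sorted(alphas, key=lambda a: (sum(a), a))

def basis_matrix(atoms, alphas):
    """Row i holds the monomial basis vector evaluated at atom x_i."""
    atoms = np.asarray(atoms, dtype=float)
    return np.array([[np.prod(x ** np.array(a)) for a in alphas] for x in atoms])

def add(a, b):
    """Exponent of the product of two monomials."""
    return tuple(i + j for i, j in zip(a, b))

# Illustrative data (assumptions): design space [-1, 1], d = 2, five equal atoms.
n, d = 1, 2
atoms = np.linspace(-1.0, 1.0, 5).reshape(-1, n)
weights = np.full(len(atoms), 1.0 / len(atoms))

alphas_d = monomial_exponents(n, d)        # indexes the basis vector v_d(x)
alphas_2d = monomial_exponents(n, 2 * d)   # indexes the moment vector y
n_d = math.comb(n + d, d)                  # (n + d)! / (n! d!)

# Moment matrix computed directly from the atomic measure.
V = basis_matrix(atoms, alphas_d)
M = V.T @ (weights[:, None] * V)

# Same matrix assembled linearly from the moment vector y.
y = basis_matrix(atoms, alphas_2d).T @ weights
position = {a: k for k, a in enumerate(alphas_2d)}
M_from_y = np.array([[y[position[add(a, b)]] for b in alphas_d] for a in alphas_d])

def adjoint(X):
    """Adjoint map: coefficient vector of the polynomial v_d(x)^T X v_d(x)."""
    out = np.zeros(len(alphas_2d))
    for i, a in enumerate(alphas_d):
        for j, b in enumerate(alphas_d):
            out[position[add(a, b)]] += X[i, j]
    return out

X_test = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
lhs = np.trace(M @ X_test)     # left-hand side of identity (1)
rhs = adjoint(X_test) @ y      # right-hand side of identity (1)
min_eig = np.linalg.eigvalsh(M).min()
```

In this univariate setting the moment matrix is a Hankel matrix of the moments, and `min_eig` is positive because the five atoms are distinct and outnumber the basis elements.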
In problem (2) we introduce the matrix variable $Y \in \mathbb{S}^{n_d}$ and the equality constraint $Y = M_d(y)$, the Lagrange multipliers $X$, $z$, $\mathbf{p}$ and the Lagrangian
$$ f(X, z, \mathbf{p}, Y, y) := c^\star y - \varphi(Y) + \mathrm{trace}\{X (Y - M_d(y))\} + z^\star (b - Ay) - \mathbf{p}^\star y $$
where $X \in \mathbb{S}^{n_d}$, $z \in \mathbb{R}^m$ and $\mathbf{p}$ belongs to $\mathscr{P}_{2d}(\mathcal{X})$, the convex cone of polynomials that are positive on $\mathcal{X}$, which is dual to the moment cone $\mathscr{M}_{2d}(\mathcal{X})$ according to the Riesz-Haviland Theorem [2, Theorem 3.1]. Use (1) to rearrange terms as follows
$$ f(X, z, \mathbf{p}, Y, y) = b^\star z + \{\mathrm{trace}(XY) - \varphi(Y)\} + \{c^\star - z^\star A - M_d^\star(X) - \mathbf{p}^\star\}\, y. $$
The dual problem to (2) is obtained by maximizing the dual function
$$ g(X, z, \mathbf{p}) := \inf_{Y \in \mathbb{S}^{n_d},\ y \in \mathbb{R}^{n_{2d}}} f(X, z, \mathbf{p}, Y, y). $$
Since $f$ is linear in $y$, this infimum is finite only if the coefficient of $y$ vanishes, i.e.
$$ \mathbf{p}^\star = c^\star - z^\star A - M_d^\star(X). \tag{3} $$
Defining
$$ \varphi^\star(X) := \inf_{Y \in \mathbb{S}^{n_d}} \{\mathrm{trace}(XY) - \varphi(Y)\} $$
as the concave conjugate function to $\varphi$, the problem of maximizing the dual function becomes
$$ \begin{array}{ll} \sup_{X, z} & b^\star z + \varphi^\star(X) \\ \text{s.t.} & c^\star - z^\star A - M_d^\star(X) \in \mathscr{P}_{2d}(\mathcal{X}) \end{array} \tag{4} $$
where the supremum is with respect to matrices $X \in \mathbb{S}^{n_d}$ and vectors $z \in \mathbb{R}^m$. Since $M_d^\star(X)\, v_{2d}(x) = v_d(x)^\star X v_d(x)$, the conic constraint in dual problem (4) can be formulated as a polynomial positivity constraint
$$ (c^\star - z^\star A)\, v_{2d}(x) \geq p(x) := v_d(x)^\star X v_d(x) $$
satisfied for all $x \in \mathcal{X}$.

Theorem 1 (Duality for optimal design)
Problems (2) and (4) are in strong duality, i.e. their values coincide.
Proof:
Weak duality, i.e. the value of primal problem (2) is greater than or equal to the value of dual problem (4), follows from the inequality
$$ \varphi^\star(X) + \varphi(Y) \leq \mathrm{trace}(XY) $$
which holds by definition of $\varphi^\star$ for every positive semidefinite pair $X$, $Y$. Indeed, for any feasible $X$, $y$, $z$ it holds
$$ 0 \leq (c^\star - z^\star A - M_d^\star(X))\, y = c^\star y - b^\star z - \mathrm{trace}(X M_d(y)) \leq c^\star y - b^\star z - \varphi^\star(X) - \varphi(M_d(y)) $$
and hence
$$ c^\star y - \varphi(M_d(y)) \geq b^\star z + \varphi^\star(X). $$
Strong duality, i.e. the above inequality is an equality for any optimal values $\hat{X}$, $\hat{y}$, $\hat{z}$, follows from concavity of the function $\varphi$ and the so-called Slater constraint qualification [2, Section C.1], i.e. the existence of an interior point for primal problem (2): choose e.g. the vector of moments of an atomic measure supported on $\mathcal{X}$ with more than $n_d$ distinct atoms. Then the moment matrix $M_d(y)$ is positive definite. Equivalently, the complementarity condition
$$ \hat{\mathbf{p}}^\star \hat{y} = 0 \tag{5} $$
holds for $\hat{\mathbf{p}}^\star := c^\star - \hat{z}^\star A - M_d^\star(\hat{X})$ as in (3). □

Christoffel-Darboux polynomial
In [1] various functions $\varphi$ are considered, depending on the optimal design problem of interest. In optimal design problem (2), let
$$ \varphi(Y) := \log\det Y, \quad Ay := \int_{\mathcal{X}} d\mu(x), \quad b = 1, \quad c = 0 \tag{6} $$
i.e. we are minimizing $-\log\det M_d(y)$ over probability measures supported on $\mathcal{X}$, where $-\varphi$ is the classical barrier function used in interior point methods for semidefinite programming [5]. The domain of $\varphi$ is the cone of positive definite matrices. This optimal design problem has the same solution as the D-optimal design problem corresponding to the positively homogeneous objective function $(\det Y)^{-1/n_d}$.

Theorem 2
Problem (2) with data (6) has a unique solution $\hat{y} \in \mathscr{M}_{2d}(\{x \in \mathcal{X} : p(x) = n_d\})$ where $p(x) := v_d^\star(x)\, M_d^{-1}(\hat{y})\, v_d(x)$ is the Christoffel-Darboux polynomial.

Proof:
Uniqueness of the solution $\hat{y}$ follows from convexity of the feasibility set and strict concavity of the function $\varphi$ in problem (2). The solution $\hat{y}$ is an interior point, i.e. $M_d(\hat{y})$ is positive definite. As explained in the proof of Theorem 1, the Karush-Kuhn-Tucker (KKT) optimality conditions are necessary and sufficient for an optimal solution, see e.g. [2, Section C.1]: all partial derivatives of the Lagrangian $f$ must vanish, and this implies that
$$ \frac{\partial f}{\partial Y} = \hat{X} - \frac{\partial \varphi}{\partial Y}(M_d(\hat{y})) = 0 $$
and hence, since the gradient of $\log\det Y$ is $Y^{-1}$, that
$$ \hat{X} = M_d^{-1}(\hat{y}) $$
for an optimal primal-dual pair $\hat{X}$, $\hat{y}$. From the complementarity condition (5) and property (1) we deduce that the optimal $\hat{z}$ satisfies
$$ \hat{z} = -M_d^\star(M_d^{-1}(\hat{y}))\, \hat{y} = -\mathrm{trace}(M_d(\hat{y})\, M_d^{-1}(\hat{y})) = -n_d. $$
Complementarity condition (5) means that an optimal vector of moments $\hat{y}$ corresponds to a measure $\mu$ supported on the zero level set of the optimal positive polynomial $n_d - p(x)$, whose coefficient vector is $\hat{\mathbf{p}}^\star = n_d A - M_d^\star(M_d^{-1}(\hat{y}))$ by (3), i.e. on the algebraic set $\{x \in \mathcal{X} : p(x) = n_d\}$. □

The concave conjugate function is $\varphi^\star(X) = n_d + \varphi(X)$: the infimum of $\mathrm{trace}(XY) - \log\det Y$ over positive definite $Y$ is attained at $Y = X^{-1}$. Its domain is the cone of positive definite matrices, and from the proof of Theorem 2, the dual design problem (4) has the simple form
$$ \begin{array}{ll} \sup_{X} & \log\det X \\ \text{s.t.} & n_d - v_d(x)^\star X v_d(x) \geq 0, \quad \forall x \in \mathcal{X}. \end{array} \tag{7} $$
Its solution is $\hat{X} = M_d^{-1}(\hat{y})$ where $\hat{y}$ is the unique solution of problem (2). The Christoffel-Darboux polynomial $p(x) = v_d^\star(x)\, \hat{X}\, v_d(x)$ is SOS since the matrix $\hat{X}$ is positive definite, and the dual matrix $X$ is such that $n_d \geq v_d(x)^\star X v_d(x) \geq 0$ for all $x \in \mathcal{X}$.

From Theorem 2 the optimal sequence of moments $\hat{y}$ in primal problem (2) has a representing atomic measure $\mu$ whose atoms are given by the level set of the Christoffel-Darboux polynomial. This sequence of moments is unique, but there could be another measure, atomic or not, with the same moments.

Dual problem (7) also has an interpretation in computational geometry.
Indeed, if $d = 1$ then $x \mapsto p(x) = n_d - v_d(x)^\star X v_d(x)$ is a quadratic polynomial and so the set $E := \{x \in \mathbb{R}^n : p(x) \geq 0\}$ is an ellipsoid that contains $\mathcal{X}$, and $\log\det X$ is related to the volume of $E$. So the dual design problem is also equivalent to the problem of finding the ellipsoid of minimum volume that contains $\mathcal{X}$, which is the celebrated Löwner-John ellipsoid problem. For $d = 1$ this was already observed in [6] and therefore (7) can be considered as a generalization to the case $d > 1$, with $E \supset \mathcal{X}$ and with $\log\det X$ as a proxy for the volume of $E$.

In this note we use only elementary concepts of convex analysis to show that the Christoffel-Darboux polynomial, so useful in approximation theory and data analysis [3], also arises naturally in the dual problem of D-optimal experimental design with semi-algebraic data, a standard convex optimization problem in statistics. Numerically, problem (2) is solved with the moment-SOS hierarchy, i.e. the moment cone $\mathscr{M}_{2d}(\mathcal{X})$ is relaxed with a hierarchy of projections of spectrahedra of increasing size. As shown in [1], the Christoffel-Darboux polynomial can be used as a certificate of finite convergence of the hierarchy: the contact points of its level set at $n_d$ with $\mathcal{X}$ are the support of an optimal design.

The Christoffel-Darboux polynomial corresponds to a particular choice of a convex function to be minimized in the design problem, namely the logarithmic barrier function of the positive semidefinite cone. It would be interesting to study the polynomials arising in the dual design problem corresponding to other convex functions of the eigenvalues of positive semidefinite matrices [4].

Acknowledgement
Support from the ANR-3IA Artificial and Natural Intelligence Toulouse Institute is gratefully acknowledged.
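As a numerical sanity check of Theorem 2 and dual problem (7), the sketch below evaluates the Christoffel-Darboux polynomial for quadratic regression ($d = 2$) on the interval $[-1, 1]$ in the monomial basis. It is not part of the note: the candidate design, equal weights on the points $-1$, $0$, $1$, is the classical D-optimal design for this setting and is assumed here rather than computed with the moment-SOS hierarchy. The names `v`, `cd_poly`, `support` and `weights` are hypothetical.

```python
import numpy as np

d = 2
n_d = d + 1  # univariate case: (1 + d)! / (1! d!) = d + 1

def v(x):
    """Monomial basis v_d(x) = (1, x, ..., x^d), one row per evaluation point."""
    return np.vander(np.atleast_1d(x), N=d + 1, increasing=True)

# Assumed candidate design (classical result, not derived in the note).
support = np.array([-1.0, 0.0, 1.0])
weights = np.full(3, 1.0 / 3.0)

# Moment matrix of the design and candidate dual solution X_hat = M^{-1}.
V = v(support)
M = V.T @ (weights[:, None] * V)
X_hat = np.linalg.inv(M)

def cd_poly(x):
    """Christoffel-Darboux polynomial p(x) = v_d(x)^T X_hat v_d(x)."""
    Vx = v(x)
    return np.einsum("ij,jk,ik->i", Vx, X_hat, Vx)

grid = np.linspace(-1.0, 1.0, 2001)
p_grid = cd_poly(grid)         # should stay below n_d on the design space
p_support = cd_poly(support)   # should equal n_d at the atoms

z_hat = -np.trace(M @ X_hat)                  # multiplier of the mass constraint
primal_value = -np.log(np.linalg.det(M))      # objective of (2) with data (6)
dual_value = np.log(np.linalg.det(X_hat))     # objective of (7)
```

For this design the polynomial works out to $p(x) = 3 - \tfrac{9}{2} x^2 (1 - x^2)$, which is at most $n_d = 3$ on $[-1, 1]$ with equality exactly at the three atoms, so `X_hat` is feasible for (7) and the primal and dual values coincide.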