[PDF] Whitney's Theorem, Triangular Sets and Probabilistic Descent on Manifolds

Abstract

We examine doing probabilistic descent over manifolds implicitly defined by a set of polynomials with rational coefficients. The system of polynomials is assumed to be triangularized. An application of Whitney's embedding theorem allows us to work in a reduced dimensional embedding space. A numerical continuation method applied to the reduced-dimensional embedded manifold is used to drive the procedure.

Full PDF

JJOTA manuscript No. (will be inserted by the editor)

Whitney’s Theorem, Triangular Sets and ProbabilisticDescent on Manifolds

David W. Dreisigmeyer the date of receipt and acceptance should be inserted later

Abstract

We examine doing probabilistic descent over manifolds implicitlydeﬁned by a set of polynomials with rational coeﬃcients. The system of poly-nomials is assumed to be triangularized. An application of Whitney’s embed-ding theorem allows us to work in a reduced dimensional embedding space. Anumerical continuation method applied to the reduced-dimensional embeddedmanifold is used to drive the procedure.

Keywords

Probabilistic descent · Manifold · Nonlinear optimization

Mathematics Subject Classiﬁcation (2000) · In an optimization problem with equality constraints the feasible set often hasnice geometric properties. If we ‘stand’ at a typical point the feasible set willlocally look to us like a Euclidean space. But this locally Euclidean space willbe contained in a larger space. For example, the feasible set could be a one-dimensional string contained in a hundred-dimensional ambient space. It seemslike a waste to work with a hundred variables when the geometric structurewe’re actually interested in is only one-dimensional. Whiney’s theorem tells usthat in fact we could work in a three-dimensional ambient space. The stringcan be projected from the original hundred dimensions into a random three-dimensional subspace. Provided we can ﬁnd a map from the projected string

David W. Dreisigmeyer, Corresponding authorUnited States Census BureauCenter for Economic StudiesSuitland, MDDepartment of Electrical and Computer EngineeringColorado State UniversityFort Collins, [email protected] a r X i v : . [ m a t h . O C ] A ug David W. Dreisigmeyer back to the original string we lose nothing by working in the three-dimensionalspace.Sard’s theorem is well known in optimization since it tells us that equalityconstraints often give a reasonable feasible set over which we work. Or at leastit will alert us when this may not be the case. Whitney’s theorem is perhapsless well known. The theorem oﬀers a potentially large decrease in the eﬀectivedimension one needs to optimize in, which is certainly an attractive proposal.However this is not cost free. Firstly, the decrease in dimensionality is achievedby a linear projection from the initial ambient space into a lower-dimensionalworking space. This requires some sort of variable elimination from the set ofequality constraints in order to do the projection. The variable elimination isthe nonlinear analogue of triangularizing a matrix. Secondly, the reconstruc-tion from the reduced dimension working space back into the original spaceis generally nonlinear. A numerical implementation of the inverse (implicit)function theorem over a manifold is required. This is a nonlinear version ofbasic and nonbasic variables. The manifold in the reduced dimensional spaceserves as the basic variables and the embedding in the higher dimensionalspace represents the nonbasic variables.A decade ago the author was interested in applying Whitney’s theorem tooptimization problems. Unfortunately the required machinery was not quitesuﬃcient enough to give any satisfactory results. In the intervening years devel-opments in polynomial decompositions over real numbers has made it possibleto provide a sketch of the main ideas. While the method is not quite completeit is complete enough to oﬀer some potential applications and, more impor-tantly, a clear map of what work still needs to be done. In other words: itworks but it may not yet work well enough .The paper is organized as follows. We give a simple example in Section 2that demonstrates the basic problem and principles we will be dealing with.Section 3 covers the necessary results from diﬀerential geometry. The mainresult is Whitney’s weak embedding theorem which gives an upper bound onthe minimal embedding dimension of a manifold. In Section 4 we look at someneeded results from real algebraic geometry. In particular some real polyno-mials decompositions are introduced. These put the constraint equations intoa form convenient for applying Whitney’s theorem, which we examine in Sec-tion 5. Moving over manifolds in a controllable and intelligent way is necessary.Section 6 looks at some methods to accomplish this. The full optimizationprocedure is laid out in Section 7 with Section 8 providing an example. Adiscussion follows in Section 9.

Here we give a simple example in order to motivate and demonstrate themethodology that will be developed. Consider the optimization problemmin z ∈ R f ( z ) hitney’s Theorem, Triangular Sets and Probabilistic Descent on Manifolds 3 subject to the constraints0 = G ( z ) = z + z + z + z ,0 = G ( z ) = z z + z z + z z + z z ,0 = G ( z ) = z z z + z z z + z z z + z z z and0 = G ( z ) = z z z z − g ( x, u ) = u x −

10 = g ( y , u ) = y + u g ( y , x ) = y + x (1)where we renamed the variables u = z , x = z , y = z and y = z . Lookingat the polynomials in (1) we see that they are in a ‘triangular form’: g i onlydepends on u , x or y j where j < i .Looking at the new polynomials we see that it is not necessary to workin R . Only the ( x, u ) ∈ R need to be explicitly optimized over, treating y ( x, u ) = − u and y ( x, u ) = − x as functions of u and x . The original opti-mization problem now becomesmin ( x,u ) ∈ R ˆ f ( y ( x, u ) , y ( x, u ) , x, u ) subject to g ( x, u ) = 0where f ( z , z , z , z ) = ˆ f ( z , z , z , z ). [We will typically drop the ‘hat’ no-tation on the objective function when the variables are simply permuted. Thecontext should make it clear when this occurs.]This example highlights the key techniques we will employ. First, we havean optimization problem initially formulated in a ‘large’ Euclidean space. Theconstraints are given by polynomials with rational coeﬃcients. A triangulardecomposition can be performed on the original constraint statements to derivea new set of constraints. This new constraint set is then used to reduce theeﬀective dimension we need to optimize in. The optimization can now proceedin a lower-dimensional Euclidean space versus the original one. As part of theprojection into the lower-dimensional space an inverse function is used to mapfrom the feasible set in the lower dimension back to the original feasible set. A manifold M is a subset of R n that looks locally like R m , n > m . Let I be anindex set and for each i ∈ I let V i ⊂ M be an open subset where (cid:83) i ∈I V i = M .Additionally, let there be homeomorphisms φ i : V i → R m . Each pair ( V i , φ i )is called a chart and the set { ( V i , φ i ) } i ∈I is called an atlas .When two open subsets V i and V j of M overlap there will be a homeo-morphic transition function τ ij : R m → R m on the overlap V i ∩ V j deﬁned by David W. Dreisigmeyer τ ij = φ j ◦ φ − i . The transition functions allow us to stitch together the chartsinto a coherent whole. This then deﬁnes a topological manifold .The transition functions are often taken to be diﬀerentiable. In this casewe have a diﬀerentiable manifold . If the τ ij are all C k , k ≥

1, then M is saidto be a C k -manifold. If M is a C ∞ -manifold then it is called smooth . Every C k -manifold can be made smooth [2].At every point p ∈ M ⊂ R n a tangent space T p M can be attached. If aninner product h : T p M × T p M → R is deﬁned for every p ∈ M and variessmoothly over M then we have a Riemannian manifold . The function h ( · , · ) iscalled a metric and with it notions of lengths and angles in T p M are availableover M .The manifolds we consider arise from equality constraints. Deﬁnition 3.1 (Regular level sets)

Let g : R m + k → R k . For the set M = { z : g ( z ) = c } assume ∇ g ( z ) is full rank. Then M is a regular level set of g ( z ). Additionally, the null space of ∇ g ( z ) coincides with the tangent spaceto M . (cid:3) This paper will consider equality constraints given by members of Q [ z ], thepolynomials in z = [ z , . . . , z m + k ] with rational coeﬃcients. Then by Sard’stheorem almost every level set is a Riemannian manifold. Theorem 3.1 (Regular level set manifolds and Sard’s theorem [2])

Let g : R m + k → R k be a C ∞ function. Then almost every level set of g ( z ) isregular and a m -dimensional manifold. A regular level set acquires a metricfrom its embedding in R m + k by restricting the standard Euclidean metric tothe tangent space T p M given by null( ∇ g ( p )) for all p ∈ M . One issue is that the k in Theorem 3.1 may be much larger than m . Soeven though the manifold may be low-dimensional it could be embedded in avery high-dimensional space. Whitney’s theorem gives an upper bound on theminimal embedding dimensional for any manifold. Theorem 3.2 (Whitney’s weak embedding theorem [2])

Let M be an m -dimensional manifold. Then M can be embedded in R m +1 . Further, if M is embedded in R m + k , k > m + 1 , a random projection into a m + 1 subspacegives an embedding of M in R m +1 . Here we cover the necessary background from real algebraic geometry. Ourmain focus is putting a set of polynomials P ⊂ Q [ z ] into a convenient formfor utilizing Whitney’s theorem. Deﬁnition 4.1 (Semi-algebraic sets, real algebraic varieties and Nashmanifolds)

A set S ⊂ R n is a semi-algebraic set if it satisﬁes a set of poly-nomial equations g i ( z ) = 0, i = 1 , . . . , j , and polynomial strict inequalities h i ( z ) > i = 1 , . . . , l , or a ﬁnite union of such sets. A real algebraic variety hitney’s Theorem, Triangular Sets and Probabilistic Descent on Manifolds 5 is deﬁned using only a set of polynomial equations g i ( z ) = 0, i = 1 , . . . , j ,without ﬁnite unions. Nash manifolds are semi-algebraic sets that are alsomanifolds. (cid:3)

This paper focuses on Nash manifolds that are also varieties.In order to utilize Whitney’s theorem the polynomials deﬁning a Nashmanifold will need to be decomposed into a friendlier form. Let the Nashmanifold

N ⊂ R m + k be deﬁned by the set of polynomial equations g i ( z ) = 0, i = 1 , . . . , k . Assign to the entries of z the ordering z < z < . . . < z m + k .Deﬁne Q i = Q [ z , . . . , z i ] as the ring of polynomials in the variables z , . . . , z i with rational coeﬃcients. deg ( g, z i ) is the degree of polynomial g in the vari-able z i . mvar ( g ) is the greatest variable z i such that deg ( g, z i ) (cid:54) = 0. We candecompose a g ∈ Q i into the form g = ιz di + τ where d = deg ( g, z i ) > ι ∈ Q i − and τ ∈ Q i with deg ( τ, z i ) < d . Then init ( g ) = ι is the initial of g , mdeg ( g ) = d is the main degree of g , rank ( g ) = z di is the rank of g , tail ( g ) = τ is the tail of g and head ( g ) = g − tail ( g ) = ιz di is the head of g .Every polynomial is then decomposed as p = init ( p ) rank ( p ) + tail ( p ). Deﬁnition 4.2 (Triangular sets)

Let T ⊂ Q n . We call T a triangular set if no member of T is constant and for any pair g, h ∈ T , g (cid:54) = h , we have mvar ( g ) (cid:54) = mvar ( h ). If z i = mvar ( g ) for some g ∈ T then z i is called algebraicwith respect to T . Otherwise z i is called free with respect to T . (cid:3) Triangular sets are a nonlinear generalization of triangulating a matrix. Theyare very convenient for using Whitney’s theorem.

Example 4.1

Let T = { z z − , z + z , z + z } . This is a triangular set where mvar ( z z −

1) = z , mvar ( z + z ) = z and mvar ( z + z ) = z .Then z is free and the other variables are algebraic. (cid:3) One of the diﬃculties we face with using triangular sets is that most of theprior work on them has been with C [ z ], the polynomials with complex coeﬃ-cients. Our concern is only with the real solutions of a system of polynomialsin Q [ z ] which has seen recent development [3, 4]. We’ll develop the necessarymachinery for working with Nash manifolds but ﬁrst we’ll look at a standardsolution method over C .The algorithm developed by Kalkbrener in [1] seems like a reasonable choicefor calculating triangular sets [5, 6]. This method builds on a particular typeof triangular set which requires some additional machinery. David W. Dreisigmeyer

Deﬁnition 4.3 (Saturated ideals)

Given a triangular set T = { t , . . . , t k } the ideal (cid:104) T (cid:105) generated by T is given by (cid:104) T (cid:105) = { q ∈ Q [ z ] : q = (cid:80) i p i t i , p i ∈ Q [ z ] } .Let h T = (cid:81) i init ( t i ). The saturated ideal of T is deﬁned as sat ( T ) = { q ∈ Q [ z ] : ∃ n ∈ N such that h nT q ∈ (cid:104) T (cid:105)} with sat ( ∅ ) = (cid:104) (cid:105) . (cid:3) The ideal (cid:104) T (cid:105) gives all valid equality constraints that follow from those in T .We see that (cid:104) T (cid:105) ⊂ sat ( T ) but sat ( T ) can contain additional polynomials. Deﬁnition 4.4 (Regular chains)

Let T = { t , . . . , t k } be a triangular set.A p ∈ Q [ z ] is regular modulo sat ( T ) if p (cid:54)∈ sat ( T ) and there does not exista q ∈ Q [ z ] where q (cid:54)∈ sat ( T ) but pq ∈ sat ( T ). Deﬁne T

Let T = { T , . . . , T l } be a set of regular chains and V ( F ) be an algebraic variety. T isa Kalkbrener triangular decomposition of V ( F ) if V ( F ) = (cid:83) i V ( sat ( T i )). (cid:3) So V ( sat ( T i )) is geometrically more relevant to Kalkbrener’s method than V ( T i ): each V ( sat ( T i )) gives some piece of V ( F ). We can relate V ( T ) and V ( sat ( T )). Deﬁnition 4.6 (Regular zeros)

Given a triangular set T = { t , . . . , t k } let h T = (cid:81) i init ( t i ). The regular zeros W ( T ) of the triangular set T is deﬁnedas W ( T ) = V ( T ) − V ( h T ). (cid:3) For a triangular set T it is the case that [7] V ( sat ( T )) = W ( T )where closure is with respect to the Zariski topology. If V ( T ) is also a manifoldthen W ( T ) = V ( T ) = V ( sat ( T )). Deﬁnition 4.7 (Radical of (cid:104) T (cid:105) ) The radical of the ideal (cid:104) T (cid:105) is (cid:112) (cid:104) T (cid:105) = { q ∈ Q [ z ] : ∃ n ∈ N such that q n ∈ (cid:104) T (cid:105)} .That is, (cid:112) (cid:104) T (cid:105) consists of all those polynomials that vanish on V ( T ). (cid:3) Since V ( T ) = V ( sat ( T )) it follows from Hilbert’s Nullstellensatz [8] that (cid:112) (cid:104) T (cid:105) = (cid:112) sat ( T ). From this we see that a Kalkbrener triangular decomposi-tion T = { T , . . . , T l } of F ⊂ Q [ z ] gives (cid:112) (cid:104) F (cid:105) = (cid:84) i (cid:112) sat ( T i ).A Kalkbrener triangular decomposition allows for complex solutions butNash manifolds only work with the real solutions. hitney’s Theorem, Triangular Sets and Probabilistic Descent on Manifolds 7 We ﬁrst look at an easy to understand case. Consider the linear system A z = b .This set of polynomials deﬁnes a special type of manifold that is globally like R m when A ∈ R k × ( m + k ) . Let k > m + 1 for what follows. It is convenient tohave A in a triangular form A = k − m − m + 1 m (cid:20) (cid:21) A A A k − m − A A m + 1with A and A upper triangular. Partition the z as z = [ y x u ]with y ∈ R k − m − , x ∈ R m +1 and u ∈ R m . Whitney’s theorem tells us that thespace given by x and u is suﬃcient to embed the manifold implicitly deﬁnedby the set of polynomials A z − b . Let the original manifold deﬁned in R m + k bedenoted by M and the projected image in R m +1 be denoted by M (cid:48) . Partition b as b = [ b b ]. We can solve the equation[ A A ] (cid:20) xu (cid:21) = b to ﬁnd the points on M (cid:48) . Call one of these [ x ∗ u ∗ ] T ∈ M (cid:48) . This can then bemapped onto the corresponding point on M by solving[ A A A ]  yx ∗ u ∗  = b for y . While we may not want to do it in this case, the optimization canbe carried out in R m +1 and we can then map from M (cid:48) to M by ﬁnding y = y ( x , u ). Using the full vector z = [ y x u ] we can pull back any functiondeﬁned on M to M (cid:48) . A function that is pulled back could be, e.g., the objectivefunction or an inequality constraint.Now we can repeat the above example for the general nonlinear case. Wewill make the simplifying assumption that M is path connected. If there aremultiple pieces of M then the following argument still carries through for eachindividual piece. After a possible reordering of the coordinates, partition the z ∈ R m + k from Deﬁnition 3.1 as z = [ y x u ]where y ∈ R k − m − , x ∈ R m +1 and u ∈ R m . We take the members of g ( z ) tobe triangularized [4] in the x and y as g (cid:63) ( x , u ) : R m +1 → R m +1 where (2a) g (cid:63)i ( x , u ) = g (cid:63)i ( x , . . . , x i , u ) (2b) David W. Dreisigmeyer for i = 1 , . . . , m + 1 and g ◦ ( y ; x , u ) : R k − m − → R k − m − where (2c) g ◦ j ( y ; x , u ) = g ◦ j ( y , . . . , y j ; x , u ) (2d)for j = 1 , . . . , k − m −

1, with g = [ g (cid:63) g ◦ ] T . In (2c) the x and u are determinedfrom (2a) and are treated as parameters. If there is at most one solution to g ◦ ( y ; x , u ) = for every x and u such that g (cid:63) ( x , u ) = , then g (cid:63) ( x , u ) gives alow-dimensional embedding of our manifold. Theorem 3.2 tells us that typicallyany choice for the x and u should accomplish this.We see that the optimization can progress in R m +1 by using the x and u variables. The manifold deﬁned implicitly by (2a) can then be mapped ontothe original manifold by solving (2c) for the unique y ( x , u ). We then use z ( u , x ) = [ y ( x , u ) x u ]to pull back the objective function and any inequality constraints from themanifold embedding in R m + k to the embedding in R m +1 .The ability to triangularize the polynomials in g ( z ) is what allows Whit-ney’s theorem to be easily implemented. There are two things to notice aboutthis triangularization. First, all of this is restricted to working only with thereal versus complex roots. Second, as noted above, we only work on the highestdimensional subcomponent of the variety V ( g ).Finding the function y ( x , u ) is the crucial step. The most obvious way ofdoing this is the one we presented above. However it’s unimportant how thisis done in practice. Any numerical implementation of the implicit functiontheorem applied to g ◦ ( y , x , u ) will suﬃce, where now x and u are treated asvariables. We will need a procedure to intelligently move over the manifold M (cid:48) ⊂ R m +1 .One advantage of working with a lower-dimensional embedding space is thatmethods that may be expensive when working in high-dimensions can becomefeasible. This is counter-balanced by the need to map back into the originalspace, say with the y ( x , u ) from Section 4. But we’re free to use diﬀerentmethods to accomplish these diﬀerent goals. We move along a path on themanifold M (cid:48) with one technique and drag along a point on M ⊂ R m + k witha diﬀerent one. Numerical continuation methods will likely play a key roll inany implementation employing Whitney’s theorem. We mention the references[9–11] in addition to the ones below.On a Riemannian manifold M a geodesic is the shortest path betweentwo (nearby) points on M and generalizes the notion of a straight line. The hitney’s Theorem, Triangular Sets and Probabilistic Descent on Manifolds 9 coupled nonlinear equations for a geodesic on M are [12]¨ z i + Γ ijk ˙ z j ˙ z k = 0 Γ ijk = (cid:2) ∇ g + (cid:3) i ∂ g ∂z j ∂z k where [ ∇ g + ] i is the i -th row of the pseudo-inverse of ∇ g . The Einstein sum-mation convention is used above where a repeated index in a term is summedover. The Γ ijk are the Christoﬀel symbols associated with g . The distancemoved over the manifold in a unit time period is controlled by the length ofthe initial tangent vector ˙ z . For M (cid:48) ⊂ R m +1 g (cid:63) is substituted for g and weonly work with the x and u variables. If 2 m + 1 is ‘small’ it’s possible thatﬁnding geodesics on M (cid:48) would be feasible. One could then ﬁnd y ( x , u ) for theend point of a geodesic and use this to pull back any necessary functions from M to M (cid:48) .In [13] a method of approximating a manifold by locally projecting downtangent spaces was developed. Very little of the machinery is needed. As statedin Theorem 3.1 the tangent space T p M (cid:48) to M (cid:48) at p = [ x u ] ∈ M (cid:48) is givenby null( ∇ g (cid:63) ( p )). Let U (cid:48) be an orthonormal basis for T p M (cid:48) . Then the vector q = p + U (cid:48) w lies in T p M (cid:48) when the latter is viewed as an aﬃne subspace of R m +1 . Set N (cid:48) = [ ∇ g (cid:63) ( p )] + where A + is the Moore-Penrose pseudoinverse ofthe matrix A . The tangent vector q is projected down onto M (cid:48) by q n +1 = q n − N (cid:48) g (cid:63) ( q n ), n = 0 , , . . . , with (3a) q = p + U (cid:48) w (3b)for some w ∈ R m . Computationally this is the most attractive of the pro-cedures presented and is the one we will use for the remainder. Since this isa numerical implementation of the implicit function theorem there will be aradius around the origin in T p M (cid:48) where it will be guaranteed to converge toa point on M (cid:48) . This is restated as a restriction on the length of the w in (3b). All of the pieces are in place to state the optimization procedure. Let theoptimization problem be stated asmin z ∈ R m + k f ( z ) subject to ˜ g ( z ) ≤ (4)with ˜ g : R m + k → R k and ˜ g i ∈ Q [ z ]. Triangularize ˜ g ( z ) into g ( z ) which has thestructure presented in (2). Take g (cid:63) : R m +1 → R m +1 as deﬁning the manifold M (cid:48) .We will assume we have a black-box implementation B ( U (cid:48) , w ) of the map-ping deﬁned by (3). B ( U (cid:48) , w ) will return FALSE if the algorithm doesn’t con-verge to a point on M (cid:48) close to the base point p ∈ M (cid:48) . That is, B ( U (cid:48) , w ) isalso an oracle for when the implicit function theorem no longer holds around p . We will additionally assume a black-box implementation Φ : M (cid:48) → M ofthe implicit function mapping. For p ∈ M (cid:48) the point Φ ( p ) = [ y ( p ) p ] is theunique point that lies on M .Now we will use the probabilistic descent method developed in [14]. Aparticular implementation is given by Procedure 1. The Move step in Proce-dure 1 picks a new base point on M (cid:48) when the oracle B ( U (cid:48) , w ) tells us that theimplicit function implementation is failing. The base point deﬁnes the tangentspace we are working in. If the base point eventually becomes ﬁxed then theconvergence results are the same as those presented in [14]. If the base pointnever becomes ﬁxed we say the algorithm didn’t converge and no solution wasfound. Procedure 1 (Probabilistic descent on manifolds)

Specify the maximum step sizeincrease α max ∈ (0 , ∞ ] and an α ∈ (0 , α max ] . Set θ = 1 / and γ = 2 . Let the forc-ing function be ρ ( α ) = Cα for some constant C . Start with an initial point p ∈ M (cid:48) and set the current base point b = p and the current tangent vector w = ∈ R m . Let ˜ f = f ◦ Φ ( p ) : M (cid:48) → M . Pick a maximum number of iterations j max ∈ N . Finally let j = 0 .Tangents Let U (cid:48) j be an orthonormal basis for T b j M (cid:48) . Pick a random u ∈ R m such that (cid:107) u (cid:107) = 1 . Construct the two vectors w ± = w j ± α j u .Move If B ( U (cid:48) j , w + ) or B ( U (cid:48) j , w − ) returns FALSE set p j +1 = p j , b j +1 = p j , w j +1 = and α j +1 = θα j . If j > j max return p j otherwise increment j and Goto the Tangents step.Polling Let p + = B ( U j , w + ) and p − = B ( U j , w − ) . If ˜ f ( p + ) < ˜ f ( p j ) − ρ ( α j ) then set p j +1 = p + , w j = w + and declare the poll successful. Else if ˜ f ( p − ) < ˜ f ( p j ) − ρ ( α j ) then set p j +1 = p − , , w j = w − and declare the poll successful. Otherwise set p j +1 = p j and declare the poll unsuccessful.Step Size If the Polling step was successful set α j +1 = min( α max , γα j ) . Otherwise set α j +1 = θα j . Set b j = b j − . If j > j max return p j otherwise increment j and Gotothe Tangents step.

Because we are working with the vector space T p M (cid:48) we can also lay outa mesh. This means that the MADS algorithm ﬁrst developed in [15] is alsoviable. The advantage here is the ability to handle inequality constraints aswell. Here we look at a simple example that demonstrates how to ﬁnd the piecesfor Procedure 1. Let the equality constraints be given by g ( y, x, u ) = (cid:20) u + x + y u + x − (cid:21) = . See Figure 1 for the solution set of g ( y, x, u ). This triangularized system of hitney’s Theorem, Triangular Sets and Probabilistic Descent on Manifolds 11 Fig. 1

The manifolds M and M (cid:48) for the example in Section 8. M (cid:48) is only embedded in R given by the coordinates [ x u ]. polynomials gives the pair g (cid:63) ( x, u ) = u + x − g ◦ ( y ; x, u ) = u + x + y .Using g ◦ ( y ; x, u ) we can ﬁnd y as a function of x and u : y ( x, u ) = − (cid:112) u + x .This is the implicit function we need to map from M (cid:48) back to M . From thiswe have Φ = [ y ( x, u ) x u ]. We also need the gradient of g (cid:63) ( x, u ) to ﬁnd thetangent space of M (cid:48) : ∇ g (cid:63) ( x, u ) = (cid:20) x u (cid:21) .Looking at Figure 1 we see that at least two diﬀerent tangent spaces are re-quired to implement the implicit function mapping T p M (cid:48) to M (cid:48) . For B ( U (cid:48) , w )we could have it return FALSE whenever, say, (cid:107) q n − q (cid:107) > /

2, with the q i asin (3). This would require more than two diﬀerent tangent spaces to cover M (cid:48) but it’s likely an acceptable criterion to function as an oracle. We have not implemented the procedure in Section 7 numerically. There aremultiple components to the method each of which is a signiﬁcant undertakingin itself to be done correctly and eﬃciently. However this was not our mainpurpose. Rather we wished to draw upon classic and recent results that worktogether in ways that are perhaps novel for nonlinear optimization problems.

Disclaimer

Any opinions and conclusions expressed herein are those of the author anddo not necessarily represent the views of the U.S. Census Bureau. The research in this paperdoes not use any conﬁdential Census Bureau information. This was authored by an employeeof the US national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Governmentpurposes only.

References

1. Kalkbrener, M.: A generalized euclidean algorithm for computing triangular represen-tations of algebraic varieties. Journal of Symbolic Computation (2), 143 – 167 (1993)2. Hirsch, M.W.: Diﬀerential Topology. Graduate Texts in Mathematics. Springer NewYork (1997)3. Chen, C., Davenport, J.H., Moreno Maza, M., Xia, B., Xiao, R.: Computing with semi-algebraic sets represented by triangular decomposition. In: Proceedings of the 36thInternational Symposium on Symbolic and Algebraic Computation, pp. 75–82 (2011)4. Chen, C., Davenport, J.H., May, J.P., Maza, M.M., Xia, B., Xiao, R.: Triangular de-composition of semi-algebraic systems. Journal of Symbolic Computation , 3 – 26(2013)5. Aubry, P., Maza, M.M.: Triangular sets for solving polynomial systems: a comparativeimplementation of four methods. Journal of Symbolic Computation (1), 125 – 154(1999)6. Chen, C., Maza, M.M.: Algorithms for computing triangular decomposition of polyno-mial systems. Journal of Symbolic Computation (6), 610 – 642 (2012)7. Aubry, P., Lazard, D., Maza, M.M.: On the theories of triangular sets. Journal ofSymbolic Computation (1), 105–124 (1999)8. Cox, D.A., Little, J., O’Shea, D.: Ideals, Varieties, and Algorithms, 4th edn. Under-graduate Texts in Mathematics. Springer International Publishing (2015)9. Allgower, E.L., Schmidt, P.H.: An algorithm for piecewise-linear approximation of animplicitly deﬁned manifold. SIAM Journal on Numerical Analysis (2), 322–346 (1985)10. Allgower, E.L., Georg, K.: Piecewise linear methods for nonlinear equations and op-timization. Journal of Computational and Applied Mathematics (1), 245 – 261(2000)11. Rheinboldt, W.C.: Numerical continuation methods: a perspective. Journal of Compu-tational and Applied Mathematics (1), 229 – 244 (2000)12. Dreisigmeyer, D.W.: Equality constraints, Riemannian manifolds and direct searchmethods. (2006)13. Brodzik, M.: The computation of simplicial approximations of implicitly deﬁned p-dimensional manifolds. Computers & Mathematics with Applications (6), 93 – 113(1998)14. Gratton, S., Royer, C.W., Vicente, L.N., Zhang, Z.: Direct search based on probabilisticdescent. SIAM Journal on Optimization (3), 1515–1541 (2015)15. Audet, C., J. E. Dennis, J.: Mesh adaptive direct search algorithms for constrainedoptimization. SIAM Journal on Optimization17