Optimization with Equality and Inequality Constraints Using Parameter Continuation
OOptimization with Equality and Inequality Constraints Using ParameterContinuation ˚ Mingwu Li : and Harry Dankowicz : Abstract.
We generalize the successive continuation paradigm introduced by Kern´evez and Doedel [16] forlocating locally optimal solutions of constrained optimization problems to the case of simultaneousequality and inequality constraints. The analysis shows that potential optima may be found at theend of a sequence of easily-initialized separate stages of continuation, without the need to seed thefirst stage of continuation with nonzero values for the corresponding Lagrange multipliers. A keyenabler of the proposed generalization is the use of complementarity functions to define relaxedcomplementary conditions, followed by the use of continuation to arrive at the limit required bythe Karush-Kuhn-Tucker theory. As a result, a successful search for optima is found to be possiblealso from an infeasible initial solution guess. The discussion shows that the proposed paradigm iscompatible with the staged construction approach of the coco software package. This is evidencedby a modified form of the coco core used to produce the numerical results reported here. Theseillustrate the efficacy of the continuation approach in locating stationary solutions of an objectivefunction along families of two-point boundary value problems and in optimal control problems.
Key words. constrained optimization, feasible solutions, complementarity conditions, boundary-value prob-lems, periodic orbits, optimal control, successive continuation, software implementation
AMS subject classifications.
1. Introduction.
Parameter continuation techniques enable a global study of smooth man-ifolds of solutions to underdetermined systems of equations. It stands to reason that theyshould also be useful for optimization problems constrained to such manifolds. Starting froma single solution and local information about the governing equations, such techniques gen-erate a successively expanding, suitably dense cover of solutions, among which optima maybe sought. Suggestive examples include optimization along families of periodic or quasiperi-odic solutions of nonlinear dynamical systems or optimal control problems with end-pointconstraints.As shown first by Kern´evez and Doedel [16, 9], parameter continuation techniques maybe effectively deployed as a core element of a search strategy for optima along a constraintmanifold. This is accomplished by seeking simultaneous solutions to the original set of equa-tions and a set of additional adjoint conditions, linear and homogeneous in a set of unknownLagrange multipliers. The technique introduced in [16] demonstrates how local optima maybe found at the terminal points of sequences of continuation runs, with each successive runinitialized by the final solution found in the previous run. Remarkably, by the linearity andhomogeneity of the adjoint conditions, the initial runs may be conveniently initialized withvanishing Lagrange multipliers from solutions to the original set of equations. Kern´evez andDoedel’s technique thereby overcomes the difficulty of determining values for the original un- ˚ This paper was submitted on August 16, 2018 : Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL([email protected], [email protected]). a r X i v : . [ m a t h . O C ] S e p M. LI, H. DANKOWICZ knowns and the Lagrange multipliers that provide an adequate initial solution guess to thenecessary conditions for local optima.The objective of this paper is to extend Kern´evez and Doedel’s technique to optimizationproblems with simultaneous equality and inequality constraints. This nontrivial generaliza-tion is here accomplished by seeking simultaneous solutions to the original set of equations,additional adjoint conditions that are again linear and homogeneous in a set of unknownLagrange multipliers, and relaxed nonlinear complementarity conditions in terms of the in-equality functions and the corresponding Lagrange multipliers. As before, local optima arefound at the terminal points of sequences of consecutive continuation runs with linearity andhomogeneity again enabling initialization with vanishing Lagrange multipliers. Notably, here,certain stages of continuation are used to drive the relaxation parameters to zero in order toensure that the necessary nonlinear complementarity conditions are exactly satisfied.In the absence of inequality constraints, Kern´evez and Doedel’s technique relies on theexistence of a branch point for the initial continuation problem from which emanate twoone-dimensional branches of solutions with vanishing and non-vanishing Lagrange multipliers,respectively. As shown here, such a branch point may also be found in the presence ofinequality constraints. Interestingly, inequality constraints afford additional opportunities forgenerating solutions with non-vanishing Lagrange multipliers from an initial solution withall zero multipliers. Remarkably, in the presence of inequality constraints, the successivecontinuation technique may even benefit from initialization on solutions that violate theseconstraints. An original contribution of this manuscript is the formulation and proof of severalkey lemmas that establish these properties for large classes of optimization problems.Example applications of Kern´evez and Doedel’s technique can be found in [7, 6, 22, 23]. Ineach case, the governing set of equations, including the adjoint conditions, forms a two-pointboundary-value problem that is analyzed using the software package auto [8]. Such an imple-mentation is also possible for the extension to the presence of inequality constraints, as nothingin the formulation relies on a particular software implementation. Nevertheless, recent work bythe present authors [18] demonstrates the implementation of a staged construction approachfor the adjoint equations in the matlab -based software platform coco [20], supporting theassembly of the full continuation problem from partial problems with predefined structureand adjoints. A powerful example of this functionality is the optimal design of a transfertrajectory between two halo orbits near a libration point of the circular restricted three-bodyproblem [18]. In this paper, we use a further extension of coco to locate stationary points ofan algebraic-integral objective functional constrained by a two-point boundary-value problemand an integral inequality, as well as for an optimal control problem under integral inequalityconstraints.We finally note a further use of continuation methods when seeking solutions to singular,non-smooth, or non-differentiable optimization problems as limits of continuous families ofregularized optimization problems. Such applications will not be considered here, but includeregularized optimal control problems with differentiable solutions that approach discontinu-ous bang-bang control solutions in the singular limit [2, 15], as well as regularized optimalcontrol problems with index-1 differential-algebraic constraints that converge to higher-indexconstraints in the singular limit [10]. Such regularizations can, in principle, be combined withthe techniques described in this paper, provided that the singular limits can be reached in the
ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 3 corresponding stages of continuation.The remainder of this paper is organized as follows. In section 2, we formulate the first-order necessary conditions for local optima of an objective function in the presence of equalityand real-valued inequality constraints on a general Banach space. We describe the use ofa nonsmooth complementarity function to enforce the complementary slackness conditionson the inequality functions and the corresponding Lagrange multipliers. Section 3 presentsthe application of the successive continuation technique to a finite-dimensional optimizationproblem that motivates the subsequent theoretical development. The detailed analysis illus-trates the additional flexibility afforded by the presence of inequality constraints and lays thefoundation for an algorithmic implementation in numerical software. The generalization tothe infinite-dimensional context is discussed in section 4, with reference to several originalkey lemmas that are proved in Appendix A. After presenting some implementation details insection 5 that are particular to the advantages afforded by the staged construction paradigmof coco , we consider additional examples in section 6 to demonstrate the effectiveness of theproposed optimization approach. A brief summary and several directions for future researchare considered in the concluding section 7.
2. Problem statement.
Consider the problem of finding a locally unique pair p ˆ u, ˆ µ q thatis a stationary point of the function p u, µ q ÞÑ µ under the equality constraints F p u, µ q “ G p u q ď
0, where(2.1) F p u, µ q ÞÑ ˆ Φ p u q Ψ p u q ´ µ ˙ . Here, Φ : U Ñ Y , Ψ : U Ñ R l and G : U Ñ R q are continuously Fr´echet differentiablemappings, and U and Y are real Banach spaces with duals U ˚ and Y ˚ . We refer to the set t u P U : G p u q ď u as the feasible region and to its complement as the infeasible region.Let A : “ t i : G i p ˆ u q “ u denote the set of indices of active inequality constraints evaluatedat ˆ u , and suppose that the range of p D Φ p ˆ u q , DG A p ˆ u qq equals Y ˆ R | A | . It follows from Corollary9.4 in [1] that there exist unique ˆ λ P Y ˚ , ˆ η P R l , and ˆ σ P R q that satisfy the generalizedKarush-Kuhn-Tucker (KKT) optimality conditions(2.2) Φ p ˆ u q “ , Ψ p ˆ u q ´ ˆ µ “ , (2.3) p D Φ p ˆ u qq ˚ ˆ λ ` p D Ψ p ˆ u qq ˚ ˆ η ` p DG p ˆ u qq ˚ ˆ σ “ , ˆ η “ , ˆ η t ,...,l u “ , and(2.4) ˆ σ i ě , ´ G i p ˆ u q ě , ˆ σ i G i p ˆ u q “ , ď i ď q, where p D Φ p u qq ˚ : Y ˚ Ñ U ˚ , p D Ψ p u qq ˚ : R l Ñ U ˚ and p DG p u qq ˚ : R q Ñ U ˚ are the adjointsof the Fr´echet derivatives D Φ p u q , D Ψ p u q and DG p u q , respectively.Inspired by the general theory of nonlinear complementarity problems [3, 12, 21], we findit convenient to convert (2.4) into a set of nonlinear equations. Specifically, let χ : R ˆ R Ñ R be a function that satisfies the conditions(2.5) χ p a, b q “ ðñ a, b ě , ab “ . M. LI, H. DANKOWICZ
Then (2.4) is equivalent to the condition(2.6) K p ˆ σ, ´ G p ˆ u qq : “ ` χ p ˆ σ , ´ G p ˆ u qq . . . χ p ˆ σ q , ´ G q p ˆ u qq ˘ J “ . In this paper, we let χ equal the Fischer-Burmeister function(2.7) χ p a, b q “ a a ` b ´ a ´ b, whose contour plot is shown in Fig. 1. In particular, for κ ą χ p a, b q “ κ ñ a “ ´ κ p b ` κ q p b ` κ q , b ą ´ κ. As long as p a, b q ‰ p , q ,(2.9) χ a p a, b q : “ B χ B a p a, b q “ a ? a ` b ´ , χ b p a, b q : “ B χ B b p a, b q “ b ? a ` b ´ , from which we conclude that(2.10) χ a p , b q “ ´ , χ b p , b q “ sgn p b q ´ , for b ‰
0. In particular, χ b p , b q “ b ą
0. The function χ is clearly singular at p , q . Figure 1.
Contour plot of the Fischer-Burmeister function χ p a, b q “ ? a ` b ´ a ´ b . ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 5
3. Motivating example.
Consider the problem of locating a local minimum of the function u : “ p x, y q ÞÑ Ψ p u q : “ p x ´ q ` p y ´ q subject to the inequalities(3.1) G p u q : “ x ` y ´ ď , G p u q : “ ´ x ` y ď . In the notation of the previous section, U “ R , Y “ H , l “
1, and q “
2. By the Karush-Kuhn-Tucker conditions, there exist unique non-negative scalars ˆ σ and ˆ σ such that2 ˆ ˆ x ´ y ´ ˙ ` ˆ ˙ ˆ σ ` ˆ ´ ˙ ˆ σ “ , ˆ σ p ˆ x ` y ´ q “ , ˆ σ p´ ˆ x ` ˆ y q “ . (3.2)It follows that a candidate stationary point in the feasible region is located at ˆ u “ p { , { q ,since ‚ ˆ σ “ ˆ σ “ u “ p , q , but G p , q “ ę ‚ ˆ σ “ , ˆ σ ‰ u “ p { , { q and ˆ σ “ ´ { ğ ‚ ˆ σ ‰ , ˆ σ “ u “ p { , { q and ˆ σ “ { ě ‚ ˆ σ ‰ , ˆ σ ‰ u “ p { , { q , ˆ σ “ {
25, and ˆ σ “ ´ { ğ G “ G ă
0, alongwhich Ψ evaluates to 3 p y ´ y ` q and attains a minimum at ˆ y “ {
3. Moreover, for (cid:15) ! p { ` (cid:15)v, { ` (cid:15)w q « ´ G p { ` (cid:15)v, { ` (cid:15)w q{
3. We conclude that ˆ u is a unique localminimum of Ψ in the feasible region.We illustrate next a method for locating the minimum at ˆ u using a successive continuationapproach that connects an initial point u to ˆ u via a sequence of intersecting one-dimensionalmanifolds. To this end, consider the following four regions: U `{´ “ t u P R : G p u q ą , G p u q ă u , U ´{` “ t u P R : G p u q ă , G p u q ą u , U `{` “ t u P R : G p u q ą , G p u q ą u , and U ´{´ “ t u P R : G p u q ă , G p u q ă u . It follows that U `{´ , U ´{` , and U `{` areopen subsets of the infeasible region, while U ´{´ is an open subset of the feasible region.Suppose that u P U `{´ and let κ , : “ χ p , ´ G p u qq “ G p u q ą
0. It follows that u lies on a locally unique one-dimensional solution manifold of the equation(3.3) χ p , ´ G p u qq ´ κ , “ ðñ G p u q “ G p u q . The point(3.4) u : “ ˆ ` G p u q , ` G p u q ˙ is a stationary point of Ψ on this manifold and G ă u and u provided that G p u q ă $’’&’’% Ψ p u q ´ µ “ , p D Ψ p u qq J η ` p DG p u qq J σ “ ,χ p σ , ´ G p u qq ´ κ “ ,χ p σ , ´ G p u qq ´ κ “ , (3.5) M. LI, H. DANKOWICZ with κ “ κ , , κ “
0, and unknowns p u, µ , η , σ , σ q . Then, every solution u P U `{´ of(3.3) corresponds to a solution p u, Ψ p u q , , , q of (3.5). Indeed, for every such point with u ‰ u , the corresponding Jacobian is found to have full rank. Thus, provided that u ‰ u ,the one-dimensional solution manifold of (3.3) through u corresponds to a locally uniqueone-dimensional solution manifold of (3.5) through p u , Ψ p u q , , , q .Since u is a stationary point of Ψ along a level curve of G , the matrices D Ψ p u q and DG p u q are linearly dependent. It follows that if G p u q ă
12 then p u , Ψ p u q , , , q is abranch point of (3.5) through which passes a secondary one-dimensional solution manifold,locally parameterized by η , along which σ “ D Ψ p u q and DG p u q arelinearly dependent. Substitution yields y “ x ´ σ “ p ´ x q η , where x is implicitlydefined by(3.6) χ p p ´ x q η, ´ x q ´ κ , “ . Since G p u q ă
12, it follow from (2.8) that G ă η P r , s . Denotethe corresponding u for η “ u . Then, p u , Ψ p u q , p ´ x q , , κ , q is a solution to (3.5)with η “ κ “
0, and unknowns p u, µ , σ , σ , κ q . Indeed, this point lies on a locally uniqueone-dimensional manifold of such solutions, along which σ “ y “ x ´ σ “ p ´ x q ,where x is implicitly defined by(3.7) χ p p ´ x q , ´ x q ´ κ “ . Since G p u q ă
12, it again follows from (2.8) that G ă κ P r , κ , s .The corresponding u for κ “ u , as expected.Numerical results using parameter continuation with the coco software package validatethe above analysis. With u “ p , q P U `{´ , we have G p u q “ κ , “
8. Continuationalong the solution manifold to (3.5) with κ “ κ , and κ “ is detected at p x, y, µ , η , σ , σ q “ p . , . , . , , , q represented by the red dots in Figure 2. Branch switching at the stationary point resultsin the secondary branch terminating at the point p . , . , . , . , ´ . , q represented by the blue dots in Figure 2. Finally, continuation along the solution manifoldto (3.5) with η “ κ “ p x, y, µ , σ , σ , κ q “ p . , . , . , . , , q , consistent with theanalytical solution.Notably, the initial continuation from u terminates with a failure of the Newton solverto converge at the singular point p x, y, µ , η , σ , σ q “ p { , { , { , , , q on G “ G “ σ and σ , also terminatesat this point. This branch extends in a direction oppositely aligned (angle greater than 90 ˝ )to the direction along which continuation arrives at the singular point, thereby explainingthe failure of the pseudo-arclength algorithm to bypass the singular point (cf. the right panelof Figure 3). Moreover, since η decreases from 0 as σ increases from 0 along this secondarybranch, it cannot be used to reach a point with η “ u P U ´{` and let κ , “ χ p , ´ G p u qq “ G p u q ą u lies on a one-dimensional solution manifold of the equation(3.8) χ p , ´ G p u qq ´ κ , “ ðñ G p u q “ G p u q . ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 7 x y G ( x; y ) = 0 G ( x; y ) = 0
010 10.50.5 <
01 0 -0.5
Figure 2.
Projections of continuation paths associated with a successive search for stationary solutions.Here, dark green thin lines and hollow markers are used to denote projections of black thick lines and filledmarkers in three-dimensional space onto the three coordinate planes. Gray planes are used to represent tightconstraints. Starting from u “ p , q and holding κ and κ fixed at and , respectively, a fold point in µ ,denoted by the red dots, is detected along the first solution manifold in the η “ σ “ σ “ subspace. Alongthe secondary branch, blue dots denote locations where η “ . With fixed η , the terminal points (black dots)on the tertiary manifolds denote the stationary points where κ “ . The magenta square in the left panelcorresponds to a singular point for the initial stage of continuation. Regular Positively aligned Negatively aligned i i + 1 i i + 1 i Figure 3.
Illustration of the pseudo-arclength algorithm for one-dimensional continuation. Starting froma point i located on a solution manifold, the next point i ` is obtained by two steps in the algorithm. In thefirst step, a predictor denoted by a red cross is generated along the tangent direction at point i . A projectioncondition is then applied in the second step to locate the point i ` on the manifold. Singular points are denotedby magenta squares. In the left panel, no singular point is encountered. In the middle panel, the extendingdirections of two branches terminating on the singular point are positively aligned and the algorithm may beable to bypass such a singular point. By contrast, in the right panel, the secondary branch extends in a directionoppositely aligned to the first branch, and no local solution exists to the projection condition. The algorithm isnot able to bypass the singular point in this case. The point(3.9) u “ ˆ ´ G p u q , ` G p u q ˙ is a stationary point of Ψ on this manifold, but G p u q ą
0. The segment of the manifold
M. LI, H. DANKOWICZ between u and u intersects the G “ u “ ˆ ´ G p u q , ` G p u q ˙ . We again considering the system of equations in (3.5), this time with κ “ κ “ κ , ,and unknowns p u, µ , η , σ , σ q . As before, the one-dimensional solution manifold of (3.8)through u corresponds to a locally unique one-dimensional solution manifold of (3.5) through p u , Ψ p u q , , , q . Rather than reaching the stationary point of Ψ , this manifold terminatesat u , which is a singular point for the corresponding Jacobian. Interestingly, a secondarybranch, locally parameterized by η and along which G ” σ , σ ‰
0, also terminates atthis point. Substitution yields x “ ´ y , σ “ p ` y q η {
5, and σ “ p ´ y q η {
5, where y is implicitly defined by(3.11) χ p p ´ y q η { , ´ y q ´ κ , “ . It follows from (2.8) that σ ą η P p , s . Denote the corresponding u for η “ u . Then p u , Ψ p u q , p ` y q{ , p ´ y q{ , κ , q is a solution to (3.5) with η “ κ “
0, and unknowns p u, µ , σ , σ , κ q . Indeed, this point lies on a locally unique one-dimensional manifold of such solutions on G “
0, along which x “ ´ y , σ “ p ` y q{ σ “ p ´ y q{
5, where y is implicitly defined by(3.12) χ p p ´ y q{ , ´ y q ´ κ “ . It is again easy to show from (2.8) that σ ą κ P r , κ , s . Thecorresponding u for κ “ u , as expected.Continuation results are again consistent with the above analysis. Starting from u “p´ , q P U ´{` , we have G p u q “ κ , “
8. Continuation along the solution man-ifold to (3.5) with κ “ κ “ κ , results in a curve that appears to terminateat p x, y, µ , η , σ , σ q “ p´ . , . , . , , , q represented by the magenta dotsin Figure 4. As it happens, the pseudo-arclength algorithm manages to cross this point (cf. themiddle panel in Figure 3) and continuation proceeds along the secondary branch in the G “ p x, y, µ , η , σ , σ q “ p´ . , . , . , , . , ´ . q represented by the blue dots in Figure 4. Finally, continuation along the solution manifoldto (3.5) with η “ κ “ p x, y, µ , σ , σ , κ q “ p . , . , . , . , , q , consistent with theanalytical solution.Suppose, instead, that u P U `{` and let κ , : “ χ p , G p u qq “ G p u q ą κ , : “ χ p , G p u qq “ G p u q ą
0. We again consider the system of equations in (3.5), this timewith κ “ κ , , κ “ κ , , and unknowns p u, µ , η , σ , σ q . We obtain a one-dimensionalsolution manifold through u along which σ “ p ´ x ´ y q η { σ “ p x ´ y ´ q η { x and y are implicitly defined by χ p p ´ x ´ y q η { , ´ x ´ y q ´ κ , “ , (3.13) χ p p x ´ y ´ q η { , x ´ y q ´ κ , “ . (3.14) ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 9 y x -2020 -4 G ( x; y ) = 0 G ( x; y ) = 0 < Figure 4.
Projections of continuation paths associated with a successive search for stationary solutions.Here, dark green thin lines and hollow markers are used to denote projections of black thick lines and filledmarkers in three-dimensional space onto the three coordinate planes. Gray planes are used to represent tightconstraints. Starting from u “ p´ , q and holding κ and κ fixed at and , respectively, continuation firstproceeds along a solution manifold in the η “ σ “ σ “ subspace. Rather than terminating at a singularpoint (magenta squares) on the G “ surface, the pseudo-arclength continuation algorithm bypasses this pointand switches to a secondary branch in the G “ surface along which η , σ , and σ vary. The first runterminates at the point corresponding to η “ (blue dots). In the second run, with fixed η , the terminalpoints (black dots) on the second manifolds denote the stationary points where κ “ . Denote the u corresponding to η “ u . Then p u , Ψ p u q , p ´ x ´ y q{ , p x ´ y ´ q{ , κ , q is a solution to (3.5) with η “ κ “ κ , , and unknowns p u, µ , σ , σ , κ q .Indeed, this point lies on a locally unique one-dimensional manifold of such solutions, alongwhich σ “ p ´ x ´ y q{ σ “ p x ´ y ´ q{
5, where x and y are implicitly defined by χ p p ´ x ´ y q{ , ´ x ´ y q ´ κ “ , (3.15) χ p p x ´ y ´ q{ , x ´ y q ´ κ , “ . (3.16)Denote the u corresponding to κ “ u . Then p u , Ψ p u q , p ´ x ´ y q{ , p x ´ y ´ q{ , κ , q is a solution to (3.5) with η “ κ “
0, and unknowns p u, µ , σ , σ , κ q . Indeed,this point lies on a locally unique one-dimensional manifold of such solutions, along which σ “ p ´ x ´ y q{ σ “ p x ´ y ´ q{
5, where x and y are implicitly defined by χ p p ´ x ´ y q{ , ´ x ´ y q “ , (3.17) χ p p x ´ y ´ q{ , x ´ y q ´ κ “ . (3.18)The corresponding u for κ “ u , as expected.The numerical results in Figure 5 confirm these predictions. Here, u “ p , q impliesthat G p u q “ G p u q “ κ , “
12, and κ , “
2. Continuation along the solu-tion manifold to (3.5) with κ “ κ , and κ “ κ , results in a curve that intersects thepoint p x, y, µ , η , σ , σ q “ p . , . , . , . , ´ . , ´ . q represented bythe yellow dots in Figure 5. Continuation from this point along the solution manifold to (3.5)with η “ κ “ κ , results in a curve that intersects the point p x, y, µ , σ , σ , κ q “ p . , . , . , . , ´ . , q represented by the blue dots Figure 5. Finally, con-sistent with the analytical solution, continuation along the solution manifold to (3.5) with η “ κ “ p x, y, µ , σ , σ , κ q “ p . , . , . , . , , q . x y G ( x; y ) = 0 G ( x; y ) = 0 -0.5 0 0.5 1 σ -1.4-1.2-1-0.8-0.6-0.4-0.20 σ Figure 5.
Projections of continuation paths associated with a successive search for stationary solutions.Here, dark green thin lines and hollow markers are used to denote projections of black thick lines and filledmarkers in three-dimensional space onto the three coordinate planes. Gray planes are used to represent tightconstraints. Starting from u “ p , q and holding κ and κ fixed at and , respectively, continuationproceeds along a one-dimensional solution manifold until η “ (yellow dots). Fixing η and varying κ ,continuation is conducted until κ “ (blue dots). Finally, with fixed κ and κ free to vary, the terminalpoints (black dots) on the tertiary manifolds denote the stationary points where κ “ . We finally consider u P U ´{´ . Locally, the solutions to the system of equations (3.5)with κ “ κ “
0, and unknowns p u, µ , η , σ , σ q constitute a two-dimensional manifoldof the form p u, Ψ p u q , , , q for arbitrary u « u . We obtain a one-dimensional manifold byintroducing the function Ψ : u ÞÑ y and considering the system of equations $’’’’&’’’’% Ψ p u q ´ µ “ , Ψ p u q ´ µ “ , p D Ψ p u qq J η ` p DG p u qq J σ “ ,χ p σ , ´ G p u qq ´ κ “ ,χ p σ , ´ G p u qq ´ κ “ , (3.19)with κ “ κ “ µ “ y , and unknowns p u, µ , η , η , σ , σ q . Along this manifold, η “ η “ σ “ σ “ y “ y and x ranges between 3 ´ y and y correspond-ing to singular points on the G “ G “ u “ p , y q is a stationary point of Ψ along this manifold that lies in U ´{´ provided that y ă { p , y , p y ´ q , η , p ´ y q η , , q . It fol-lows that p , y , p y ´ q , y , p ´ y q , , q is a solution to the system of equations (3.5) with κ “ κ “ η “
1, and unknowns p u, µ , µ , η , σ , σ q . The corresponding one-dimensionalsolution manifold consists of solutions of the form p , µ , p µ ´ q , µ , p ´ µ q , , q and ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 11 terminates at the singular point p , { , { , { , , , q on the G “ G ” σ ‰
0, also terminates at thispoint. Substitution yields x “ ´ y , σ “ p y ´ q , σ “
0, and η “ p ´ y q . Thecorresponding u for η “ u , as expected. These predictions are confirmed bythe numerical results in Figure 6 where y “ ´
2. The singular point on G “ x y -1 110 -2 020 G ( x; y ) = 0 G ( x; y ) = 0 < Figure 6.
Projections of continuation paths associated with a successive search for stationary solutions.Here, dark green thin lines and hollow markers are used to denote projections of black thick lines and filledmarkers in three-dimensional space onto the three coordinate planes. Gray planes are used to represent tightconstraints. Starting from u “ p , ´ q and holding κ , κ , and µ fixed at , , and ´ , respectively, a foldpoint in µ , denoted by the red dots, is detected along the first solution manifold in the η “ η “ σ “ σ “ subspace. Along the secondary branch, blue dots denote locations where η “ . Notably, u remains unchangedalong this branch. Finally, with fixed η and η free to vary, the terminal points (black dots) on the tertiarymanifolds denote the stationary points where η “ . This run bypasses a singular point (magenta squares) andthen continues in the plane G p x, y q “ .
4. Successive continuation.
The finite-dimensional motivating example in the previoussection highlights a general approach to locating candidate stationary points of an objectivefunction along a constraint manifold. In this section, we generalize this procedure to theinfinite-dimensional context that includes boundary-value problem and integral constraintsand objective functions.We proceed to consider the function(4.1) F aug : p u, λ, η, σ, µ, ν, κ q ÞÑ ¨˚˚˚˚˝ Φ p u q Ψ p u q ´ µ p D Φ p u qq ˚ λ ` p D Ψ p u qq ˚ η ` p DG p u qq ˚ ση ´ νK p σ, ´ G p u qq ´ κ ˛‹‹‹‹‚ , which augments the function F by incorporating a subset of the left-hand sides of the nec-essary KKT conditions. Various restrictions of F aug result by fixing different subsets of thecomponents of µ , ν and κ . For example, the preceding discussion shows that ` u, λ, η, σ, µ ˘ “ ` ˆ u, ˆ λ, ˆ η, ˆ σ, Ψ p ˆ u q ˘ is a root of the restriction obtained by fixing ν “ ν t ,...,l u “ κ “ a priori approximation. To overcomethis difficulty, we propose a modification to the successive continuation algorithm introducedby Kern´evez and Doedel [16] and described by us in [18] in the context of a nonlinear functionsimilar in form to (4.1) but with q “ u is a root of Φ for which the set of active constraints isempty, i.e., t i : G i p u q “ u “ H , thus avoiding the singularity of χ at p , q . Then, p u , , , , Ψ p u q , , κ q is a root of F aug provided that the elements of κ indexed by Z : “t i : G i p u q ă u equal 0, while those indexed by P : “ t i : G i p u q ą u are positive. We pro-ceed to develop a series of continuation problems whose solutions correspond to points on asequence of embedded manifolds that continuously connect this initial root of the augmentedcontinuation problem to a root of this problem with ν “ ν t ,...,l u “
0, and κ “ I Ă t , . . . , l u and J : “ t , . . . , l uz I , and consider therestriction F rest obtained by fixing the values of µ I , ν J , and κ to Ψ I p u q , 0, and κ , respectively.It follows by construction that p u, λ, η, σ, µ , µ J , ν , ν I q “ p u , , , , Ψ p u q , Ψ J p u q , , q is aroot of F rest . Indeed, by continuity of G , every root u « u of the function(4.2) F red : u ÞÑ ¨˝ Φ p u q Ψ I p u q ´ Ψ I p u q K P p , ´ G p u qq ´ κ , P ˛‚ corresponds to a root of the form p u, , , , Ψ p u q , Ψ J p u q , , q of F rest .Now suppose that the range of DF red p u q is Y ˆ R | I | ˆ R | P | and that its nullspace is ofdimension d ´| I |´| P | , where d equals the dimension of the nullspace of D Φ p u q . It follows fromthe implicit function theorem that all roots of F red near u lie on a locally unique d ´ | I | ´ | P | -dimensional manifold. Consider the special case that | I | ` | P | “ d ´
1. Then, by Corollary A.2,all roots of F rest sufficiently close to p u , , , , Ψ p u q , Ψ J p u q , , q lie on a one-dimensionalmanifold of points of the form p u, , , , Ψ p u q , Ψ J p u q , , q , for some root u of F red providedthat p D Ψ p u qq ˚ is linearly independent of p DF red p u qq ˚ . If, instead, u is a stationary pointof Ψ along the one-dimensional solution manifold of F red “
0, then Lemma A.3 impliesthat p u , , , , Ψ p u q , Ψ J p u q , , q is a branch point of F rest through which runs a secondaryone-dimensional solution manifold, locally parameterized by η , along which λ , η I , and σ P vary.Suppose that no element of G Z equals 0 along this manifold for η P r , s . Continuationcan then proceed from the point with η “ F rest “ η “ I to J and fixing the corresponding elements of ν once they equal 0. Suppose that no elementof G Z equals 0 along any such segment. Continuation can then proceed from the point with η “ η ,...,l “ F rest obtainedby fixing η and successively allowing the elements of κ P to vary and fixing them once theyequal 0. Provided that no element of G Z equals 0 along any such segment, the final pointcorresponds to the sought stationary point.Alternatively, consider the case that | I | ` | P | “ d , in which case DF red p u q is a bijection.Lemma A.5 then implies that p u , , , , Ψ p u q , Ψ J p u q , , q lies on a one-dimensional man-ifold of solutions to F rest “ η , along which λ , η I , and σ P vary. ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 13
Suppose that no element of G Z equals 0 along this manifold for η P r , s . The stationarypoint ˆ u may again be sought following the approach in the preceding paragraph.Finally, note that if any element of G Z were to equal 0 along any of the segments,Lemma A.6 allows for the possibility of branch switching to a one-dimensional solution mani-fold along the corresponding zero surface. This manifold is again locally parameterized by η ,and σ k , σ P ‰ η close, but not equal to 0. The stationary point ˆ u may again be soughtfollowing the successive continuation approach.The various approaches to locating a stationary point in the example in Section 3 corre-spond to the possibilities discussed above. Throughout the analysis, d “ ‚ In the case that u P U `{´ , I “ H and P “ t u so that | I | ` | P | “ d ´
1. The analysisproceeds by locating a fold along the solution manifold with trivial Lagrange multi-pliers, branch switching to a secondary branch with nontrivial Lagrange multipliers,and then driving κ to 0. ‚ In the case that u P U ´{` , I “ H and P “ t u so that, again, | I | ` | P | “ d ´
1. Theanalysis proceeds by continuing to a singular point on G “
0, branch switching ontoa secondary branch on G “
0, and then driving κ to 0. ‚ In the case that u P U `{` , I “ H and P “ t , u so that | I | ` | P | “ d . The analysisproceeds by continuing along a branch of nontrivial Lagrange multipliers and thensuccessively driving κ and κ to 0. ‚ Finally, in the case that u P U ´{´ , the problem is enlarged with the function Ψ ,thereby making I “ t u and P “ H so that, again, | I | ` | P | “ d ´
1. The analysisproceeds by locating a fold along the solution manifold with trivial Lagrange multipli-ers, branch switching to a secondary branch with nontrivial Lagrange multipliers, andthen driving η to 0, first along a branch with G ‰ G “ G “ u and theset I .We conclude this section with a few comments on the proposed algorithm: ‚ Initialization.
The algorithm requires the user to select the initial point u and theset of initially inactive continuation parameters I such that I ` P P t d, d ´ u , where d denotes the dimension of the solution manifold to the zero problem Φ p u q “
0. Asseen in the motivating example, there is a great deal of flexibility in this selection,one that may allow for different approaches to one of possibly many local extrema. Inparticular, initial solution guesses in the infeasible region may be used for the successfulsearch of optima. We are not able to propose a systematic selection algorithm beyondthe principles outlined above. ‚ Number of continuation runs.
The number of subproblems analyzed in the successivecontinuation approach is determined by the number of control or design variablesrather than by the number of constraints. Indeed, in the generalized Kern´evez andDoedel approach, d ` I ` P “ d , continuation is performed until ν “ d runs drive nonzero elements of ν I and κ P to 0, one at a time.If, instead, I ` P “ d ´
1, due to branch switching, the first two runs are conducted to drive ν to 1. The remaining d ´ ν I and κ P to 0. ‚ Continuation order.
The algorithm requires the user to commit to an order in whichelements associated with I are released, and constraints associated with P are imposed.If the solution to the first-order necessary conditions is not unique, different choicesmay yield distinct locally optimal solutions, as illustrated in ref. [18]. We leave theeffects of the imposed order of constraints to future studies.
5. Some implementation details.
A fundamental element of the successive continuationalgorithm is the analysis of solutions of the augmented continuation problem F aug “ µ , ν , and κ . A practical implementation of the al-gorithm should then consider two perspectives, namely, that of formulating the correspondingcontinuation problem and that of solving the formulated problem. As we did in the motivat-ing example, one may manually derive the restricted continuation problem and then solve itanalytically using symbolic computation packages. Given the difficulty of finding analyticalsolutions, numerical continuation arises as a powerful alternative for characterizing the solu-tion manifolds for each of the restricted continuation problems. One can apply packages like auto [8] and coco [20] to perform such continuation.The numerical results documented in this paper were produced using coco because of itssupport for automatic generation of the corresponding adjoints (or discretized approximationsthereof). The construction of adjoints is not a trivial task if the optimization problem hasdifferential or integral constraints. Packages supporting the automatic computation of adjointsinclude sundials [14] for ordinary differential equations and differential-algebraic equations,and dolfin-adjoint [11] for partial differential equations. However, numerical continuationis not available in these packages. A recent release of coco provides a predefined library ofrealizations for the adjoints of common types of integral, differential, and algebraic operators.This feature of coco , coupled with the ease in which continuation parameters may be fixedor released, makes the implementation of the algorithm both intuitive and straightforwardusing this package. Nevertheless, every problem considered in this paper could in theory beapproached using other continuation packages.A key feature of coco is its support for a staged paradigm of problem construction,in which a continuation problem is decomposed into a “forward-coupled” set of equations.Specifically, at each stage of construction, equations are added that depend on subsets ofvariables introduced in previous stages and a new set of variables. The full set of unknownsis therefore not known until the complete problem has been constructed.In its original form (released in 2013), coco was designed to construct problems of theform F p u, µ q “
0, where(5.1) F : p u, µ q ÞÑ ˆ Φ p u q Ψ p u q ´ µ ˙ in terms of sets of finite-dimensional zero functions Φ, monitor functions Ψ, continuationvariables u , and continuation parameters µ . Various restrictions obtained by fixing elements of µ could then be realized at run-time using the appropriate coco syntax. Each such restrictionwould correspond to analysis of an embedded submanifold of the solution manifold to the ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 15 original zero problem Φ p u q “
0. For infinite-dimensional problems, F would be implementedin terms of suitable discretizations of u , Φ, and Ψ.In our recent work [18], an expanded definition of the coco construction paradigm allowedfor analysis of problems of the form F p u, λ, η, µ, ν q “
0, where(5.2) F : p u, λ, η, µ, ν q ÞÑ ¨˚˚˝ Φ p u q Ψ p u q ´ µ Λ J Φ p u q ¨ λ ` Λ J Ψ p u q ¨ ηη ´ ν ˛‹‹‚ in terms of additional sets of finite-dimensional matrix-valued adjoint functions Λ Φ and Λ Ψ , continuation multipliers λ and η , and continuation parameters ν . The expanded definitionsupported the use of a successive continuation technique for locating stationary points of oneof the monitor functions along the solution manifold to the zero problem Φ p u q “
0. In thiscontext, the transposes Λ J Φ and Λ J Ψ represented the (discretized if necessary) adjoints of theFrechet derivatives of the functions Φ and Ψ, and λ and η were the corresponding Lagrangemultipliers .Notably, in (5.2), the additional entries Λ J Φ p u q ¨ λ ` Λ J Ψ p u q ¨ η may be combined with theoriginal zero problem to form an expanded zero problem in u , λ , and η . Similarly, in termsof the augmented monitor functions p u, λ, η q ÞÑ p Ψ p u q , η q the second and last entries maybe combined into a single term of the form of the bottom entry in (5.1). Moreover, λ and η are introduced only once all the original zero and monitor functions have been added, atwhich point all continuation variables have been defined. Indeed, the corresponding equationscan be added automatically by the coco core, rather than manually constructed by a user,provided that the user constructs Λ J Φ and Λ J Ψ either concurrently with the addition of thecorresponding zero and monitor functions or at the very least before calling the coco coreto perform continuation. This expanded functionality was implemented with the November2017 release of coco and discussed in tutorial documentation included with the release (https://sourceforge.net/projects/cocotools/files/releases/).In the context of the proposed treatment of optimization under simultaneous equality andinequality constraints, we consider a further extension of the coco construction paradigm toproblems of the form F p u, λ, η, σ, µ, ν, ξ, κ q “
0, where(5.3) F : p u, λ, η, σ, µ, ν, ξ, κ q ÞÑ ¨˚˚˚˚˚˚˝ Φ p u q Ψ p u q ´ µ Λ J Φ p u q ¨ λ ` Λ J Ψ p u q ¨ η ` Λ J G p u q ¨ ση ´ νG p u q ´ ξK p σ, ´ G p u qq ´ κ ˛‹‹‹‹‹‹‚ in terms of additional sets of finite-dimensional inequality functions G , matrix-valued adjointfunctions Λ G , continuation multipliers σ , and continuation parameters ξ and κ . Here, Λ J G represents the (discretized if necessary) adjoint of the Frechet derivative of the function G and σ are the corresponding Lagrange multipliers. We match this expanded definition againstthe original coco construction paradigm by combining Λ J Φ p u q ¨ λ ` Λ J Ψ p u q ¨ η ` Λ J G p u q ¨ σ with the original zero problem and defining the augmented monitor function p u, λ, η, σ q ÞÑp Ψ p u q , η, G p u q , K p σ, ´ G p u qqq . As before, λ , η , and σ are introduced only once all the originalzero, monitor, and inequality functions have been added, at which point all continuation vari-ables have been defined. Again, the corresponding equations can be added automatically bythe coco core, rather than manually constructed by a user, provided that the user constructsΛ J Φ , Λ J Ψ , and Λ J G either concurrently with the addition of the corresponding zero, monitor, andinequality functions or at the very least before calling the coco core to perform continuation.Such a modification to the coco core was implemented to produce the results reported inthis paper.In the case that U and Y are finite dimensional, the adjoints of the linearizations of Φ, Ψ,and G are straightforward to construct since they simply equal the transposes of the corre-sponding Jacobians. When U and Y are infinite dimensional, the adjoint contributions can bederived using a Lagrangian formalism. In [18], a library of realizations of such functions andtheir adjoints for algebraic and integro-differential boundary-value problems were established.For example, for boundary-value problems defined in terms of ordinary differential equations,the unknown functions together with the corresponding unknown Lagrange multiplier func-tions were discretized over a finite mesh in the independent variable in terms of continuous,piecewise-polynomial functions. The original differential equations and the corresponding ad-joint differential equations were then discretized by requiring that these be satisfied by thefunctional approximants on a set of collocation nodes. These collocation nodes and the asso-ciated quadrature weights were also used to approximate any integral functions of the originalunknowns or Lagrange multipliers. Finally, in the implementation in coco , an adaptive meshalgorithm that varies the sizes and number of mesh intervals was implemented to obtain nu-merical solutions with desirable accuracy within reasonable limits on computational efforts.We utilize this adaptive mesh algorithm in the computations performed in the next section.For further details of the numerical implementation the reader is referred to [18, 5].We finally remark on the possibility of restricting continuation to a level surface of G k forsome k , either by fixing ξ k , or by fixing κ k provided that σ k is constant along the solutionmanifold if G k ‰ G k “
0. As we saw in the finite-dimensional example, we mayswitch from continuation away from G k “ κ k “ G k “ σ k “ G k “
0. The current implementation doesnot consider such detection and, instead, assumes manual switching between branches.
6. Applications.6.1. Doedel’s example.
We revisit a two-point boundary-value problem from auto [9]augmented by an inequality constraint. Consider the following objective functional(6.1) J : “ p p ` p ` p q ` ż p x p t q ´ q d t ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 17 subject to the differential equations(6.2) x “ x , x “ ´ p exp p x ` p x ` p x q and boundary conditions(6.3) x p q “ , x p q “ . There are three local extrema [9, 18], two of which violate the integral inequality constraint(6.4) G int : “ . ´ ż x p t q d t ď . We apply the formalism from previous sections to locate the remaining extremum. Throughoutthe analysis, we restrict attention to a computational domain defined by ´ . ď p ď . ´ . ď p ď . ´ . ď p ď .
0, and 0 ď J ď . u “ p x p t q , p q , Φ represents the boundary-value problem, d “
3, and q “
1. Let Ψ : u ÞÑ J , while Ψ i : u ÞÑ p i ´ for i “ , ,
4, such that l “
4, and denotethe corresponding continuation parameters µ J , µ p , µ p , and µ p . Finally, let κ int denotethe continuation parameter for the NCP condition associated with the integral inequalityconstraint. We let λ p t q , λ bc , η J , η p , η p , η p , and σ int denote the corresponding Lagrangemultipliers, and let ν J , ν p , ν p , and ν p denote the remaining continuation parameters.Since l ` q ě d , the requirement | I | ` | P | “ d ´ I and P . We consider two cases here, viz., P “ t u with I “ t u and P “ H with I “ t , u , respectively.In the first case, let u “ p , , . , q , in which case G int p u q “ . µ p , ν p “ ν p “
0, and κ int “ µ J . As predicted by Lemma A.3, con-tinuation is now possible along a secondary solution manifold, emanating from the extremumand parameterized by ν J . In contrast to the case when P “ H , the value of u changes alongthis manifold, which is consistent with prediction given by Corollary A.4. As required by thesuccessive continuation paradigm, continuation is performed until ν J “
1. Next, we proceedto fix ν J “ µ p to vary during continuation from this point until ν p “
0. Wearrive at the desired result by fixing ν p “ κ int to vary during continuation fromthis point until κ int “
0. Notably, while we find it possible to drive ν J and ν p monotonicallyto 1 and 0, respectively, in the corresponding continuation runs, two fold points in the valueof κ int are encountered on the way to 0 in the final continuation run.In the second case, we perform continuation of solutions to the original boundary-valueproblem from x p t q “ p “ p “ .
1, and p “ p . As seen in Figure 8,three fold points are detected and we locate two local extrema ( F P , ) in the value of µ J within the feasible region. We let u equal the F P of this initial run. Next, we considercontinuation along the one-dimensional solution manifold with trivial Lagrange multipliersobtained by fixing µ p , µ p , ν p “
0, and κ int “ p p p p p J -6 J int -3 -2 -1 0 1 σ int -0.4-0.200.20.40.6 G i n t Figure 7.
Projections of continuation paths associated with a successive search for stationary solutions.Here, dark green thin lines and hollow markers are used to denote projections of black thick lines and filledmarkers in three-dimensional space onto the three coordinate planes. Starting at u “ p , , . , q and holding µ p , ν p , ν p , and κ int fixed at , , , and , respectively, a fold point in µ J , denoted by red dots, is detectedalong the first solution manifold in the subspace with vanishing Lagrange multipliers. Along the secondarymanifold, blue dots denote locations where ν J “ . With ν J , ν p , ν p , and κ int fixed at , , , and ,respectively, the cyan dots denote locations where ν p “ . As seen in the bottom left panel, ν p decreasesmonotonically from . ˆ ´ to . Finally, the terminal points (black dots) on the fourth manifold denotethe stationary points where κ int “ assuming fixed ν J , ν p , ν p , and ν p . Notably, the last manifold crosses G int “ at a regular point (yellow diamond) with σ int ‰ . The green dots denote points on the boundary ofthe computational domain. parameters to vary. Branch switching from F P or F P should allow for continuation alongsecondary branches of solutions with nonzero Lagrange multipliers and unchanged values u .This is verified by the numerical results in Figure 8.Starting from F P , we drive ν J to 1. As predicted by Corollary A.4, u is preserved inthis run because P “ H . Next, we fix ν J “ µ p to vary until ν p “
0. In the finalstage, we fix ν p “ µ p to vary until ν p “
0. In each of these runs the terminalvalues are approached monotonically. The final point corresponds to the local extremum at p “ . p “ . p “ . J “ . ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 19
F P , we again drive ν J to 1 monotonically. However, once we fix ν J “ µ p tovary, we observe a failure of the Newton solver to converge as we approach a singular pointon G int “ ν p “
0. This point is denoted by the magenta squares in Figure 8. If weallow larger computational domain, it turns out that we can arrive at ν p “ -0.21.50 1 30.2 p p p F P F P MX F P p J p F P F P F P MX012 10 8 p p J F P p p J F P MX Figure 8.
Projections of continuation paths associated with a successive search for stationary solutions.Here, dark green thin lines and hollow markers are used to denote projections of black thick lines and filledmarkers in three-dimensional space onto the three coordinate planes. A preliminary run is conducted to obtainan initial solution in the feasible region. Starting at ˜ u “ p , , . , q , three fold points (denoted by F P , F P and F P and identified by red dots) in µ J are detected during continuation of the original boundaryvalue problem (ignoring the inequality constraint and adjoint conditions) with µ p and µ p fixed at . and ,respectively. The last two fold points are located within the feasible region. Taking F P as u , both F P and F P correspond to branch points during continuation with µ p , µ p , ν p , and κ int fixed at . , , , and ,respectively, along the first solution manifold in the subspace of vanishing Lagrange multipliers. Along eachof the secondary manifolds emanating from F P (bottom right) and F P (bottom left), respectively, blue dotsdenote locations where ν J “ . Continuation along the tertiary manifolds with ν J , µ p , ν p , and κ int fixed at , , , and , respectively, reaches a point with ν p “ (cyan dot) in the bottom left panel, but terminates ata singular point (magenta squares labeled by MX) or at a point on the boundary of the computational domainbefore reaching ν p in the bottom right panel. In the case of the bottom left panel, the terminal points (blackdots) on the manifolds obtained from continuation with ν J , ν p , ν p , and κ int fixed at , , , and , respectively,denote stationary points where ν p “ . The green dots denote points on the boundary of the computationaldomain. Consider the problem of minimizing the objective functional(6.5) J “ ż ´ θ p t q ` θ p t q ` u p t q ¯ d t ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 21 on solutions of the nonlinear inverted pendulum dynamical system [19] shown in Figure 9 andgoverned by the differential equations in terms of displacement x , rotation θ and input u p M ` m q : x ´ ml θ sin θ ` ml : θ cos θ “ u, m : x cos θ ` ml : θ ´ mg sin θ “ , (6.6)and initial conditions θ p q “ . θ p q “ x p q “ x p q “
0, subject to integral inequalityconstraints of the form(6.7) } u } : “ ż t f u p t q dt ď E c or(6.8) } y } : “ ż t f p θ p t q ` x p t qq dt ď Y c , where E c and Y c are input and output thresholds, respectively. Following the scheme in [18],we accommodate this optimal control problem within the proposed optimization framework byparameterizing the control input using a 10-term truncated Chebyshev-polynomial expansionand let the problem parameters p , . . . , p denote the unknown coefficients of the expansion.In the notation of previous sections, it follows that d “
10 and l “
11. We restrict attentionthroughout to variations in the elements of ν p that approach 0 monotonically. In all thenumerical results reported here, M “ m “ . l “ .
5, and g “ . M x mlu θ g Figure 9.
Schematic of dynamical system corresponding to (6.6) (adapted from [19]).
In order to explore the effects of boundedness of input and output, we first solve theoptimal control problem in the absence of such bounds. It follows that q “ P “ H . Weset p , “ ¨ ¨ ¨ “ p , “ x p t q via forward simulation. Forthis optimization problem, we follow the successive continuation approach by1. taking I “ t , ..., u , i.e., fixing µ p t ,..., u and allowing µ p vary to yield a one-dimensional manifold. A fold point of µ J is detected along this manifold;2. branching off from the fold point and driving ν J to 1;3. fixing ν J “
1, and then successively allowing each of the remaining elements of µ p tovary (in order of the expansion of u p t q ), and fixing the corresponding element of ν p once it equals 0. The resulting optimal trajectories and control input are represented by solid lines in Figure 10and Figure 11, respectively. For this optimal solution, the input integral } u } “ . } y } “ . ˆ ´ , and J “ . t -0.0500.050.1 θ ( t ) No BoundBounded InputBounded Output t -0.100.10.20.30.40.50.6 x ( t ) No BoundBounded InputBounded Output
Figure 10.
Optimal time histories for θ p t q (left panel) and x p t q (right panel) in the case without inequalityconstraints (solid lines), with input integral inequality (dashed lines), and with output integral inequality (dottedlines). t -5051015 u ( t ) No BoundBounded InputBounded Output
Figure 11.
Optimal time histories u p t q for the control input in the case without inequality constraints (solidlines), with input integral inequality (dashed lines), and with output integral inequality (dotted lines). We next consider optimization subject to the integral inequality (6.7) with E c “ q “
1. Let κ input denotes the continuation parameter for the NCPcondition of (6.7). With initial parameters p “ p “ p “ p “ p t ,..., u “ ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 23 we construct an initial solution x p t q located in te infeasible region by forward simulation.Since P “ t u , the inactive problem parameter set I should be selected in such a way that | I | “ d ´ ´ | P | “
8. To this end, we apply the successive continuation approach by1. taking I “ t , ..., u , i.e., fixing µ p t ,..., u and κ input , and allowing µ p and µ p to varyto yield a one-dimensional manifold. Several fold points of µ J are detected along thismanifold;2. branching off from the fold associated with the smallest value of µ J , and driving ν J to1;3. fixing ν J “
1, and then successively allowing each of the remaining elements of µ p tovary (in order of the expansion of u p t q ), and fixing the corresponding element of ν p once it equals 0;4. driving κ input to 0.The resulting optimal trajectories and control input are represented by dashed lines in Fig-ure 10 and Figure 11, respectively. With bounded control input, we observe that x p t q oscillateswith larger amplitude (see the right panel of Figure 10) and the objective functional is in-creased to J “ . Y c “ .
01. With initial parameters p “ x p t q that is again located in theinfeasible region. Following the same approach as in the case of bounded input, we obtain theoptimal trajectories and control input represented by dotted lines in Figure 10 and Figure 11,respectively. The bound on the output dampens the vibration of x p t q , as can be seen in theright panel of Figure 10. For this optimal solution, we have } u p t q} “ . J “ .
7. Conclusions.
As advertised in the introduction, this paper has developed a rigorousframework within which the successive continuation paradigm for single-objective-functionconstrained optimization of Kern´evez and Doedel [16] may be extended to the case of simulta-neous equality and inequality constraints. The discussion has also shown that the structure ofthe corresponding continuation problems fits naturally with the coco construction paradigm.Indeed, a forthcoming release of coco will include documented support for this extended func-tionality. The finite- and infinite-dimensional examples illustrate the general methodology aswell as number of opportunities for further development and automation.In this study, the nonsmooth Fischer-Burmeister function was used to express comple-mentarity constraints in the form of equalities. Notably, the singularity of this function atthe origin was associated with singular points along various solution manifolds and a poten-tial failure of the Newton solver to converge. Some generalized Newton methods have beendeveloped to tackle such a singularity. One approach is to replace regular derivatives byClarke subdifferentials [3] or other generalized Jacobians [13]. An alternative approach is toapproximate the nonsmooth problem by a family of smooth problems [3, 21]. More specif-ically, the NCP function χ may be approximated by a family of smooth approximants χ (cid:15) ,parameterized by the scalar (cid:15) , such that the solutions to the perturbed problems χ (cid:15) “ (cid:15) that converges to the solution of χ “ (cid:15) Ñ
0. Such a smoothing approach is analogous to the homotopy approach used in this paper to satisfyinequality constraints. The pesky singularity encountered in this study could thus be avoidedby further expanding the successive continuation technique to a stage during which (cid:15) is drivento 0.It has been tacitly assumed that each successive stage of continuation is able to drive theappropriate continuation parameters to their desired values, preferably monotonically. Butthe examples showed that this may not be possible within a given computational domain,or may only be possible by occasionally bearing in a direction away from the desired values.Similar observations were already made in [18] and generalize in this paper to the relaxationparameters κ . Indeed, in the presence of inequality constraints, we observed instances in whichsolutions branches would terminate on singular points on zero-level surfaces of the inequalityfunctions before the desired end points were reached. In some cases, the pseudo-arclengthcontinuation algorithm automatically switched to a separate branch in such a zero-level surfaceallowing us to continue to drive a component of ν or κ to its desired value. In other cases,this did not happen automatically, and we did not attempt to locate the secondary branchmanually.On a related note, we observe that in each successive stage of continuation implementedin the examples, only one component of ν or κ at a time was driven to its desired value.Alternatively, one might imagine first driving the component of ν associated with the objectivefunction to 1 and then attempting to drive multiple remaining components of ν and κ to 0simultaneously. Furthermore, although this did not happen in our examples, one imagines thepossibility that continuation in a zero-level surface of an inequality function would terminateat a singular point (with the corresponding component of σ equal to 0). Further continuationmight then need to branch off the zero-level surface in order to locate the desired stationarypoint. Clearly, any automated search for stationary points would need to consider these manypossibilities.Finally, we note that the approach in this paper is restricted to finite-dimensional inequal-ity constraints and does not automatically generalize to the infinite-dimensional case. If thelatter is discretized before the formulation of adjoints [24, 17], then the present frameworkis again applicable. If, as advocated here and in [18], the formulation of adjoints precedesdiscretization, an appropriate and consistent discretization scheme for the original equations,adjoint equations, and complementarity constraints needs to be carefully established [4, 13].In either case, a high-dimensional vector κ of relaxation parameters may result. Driving eachcomponent of κ to zero one-by-one could be very time-consuming, so a smarter search strategy(as alluded to in the previous paragraph) would be desirable. Appendix A. Essential lemmas.
Let R p L q and N p L q denote the range and nullspace ofa linear map L : V ÞÑ W . We say that L is full rank if V “ V ‘ N p L q and L ˇˇ V is a bijectiononto W . This holds, for example, if R p L q “ W and dim p N p L qq ă 8 . By the implicit functiontheorem, it follows that if F p u q “ F betweentwo Banach spaces V and W , and if DF p u q is full rank with finite-dimensional nullspace,then the roots of F near u lie on a manifold with tangent space spanned by N p DF p u qq .Recall the construction of the augmented function F aug in (4.1), the restriction F rest ob-tained by fixing the values of µ I , ν J , and κ to Ψ I p u q , 0, and κ , respectively, and the re- ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 25 duced function F red in (4.2) obtained by retaining only entries corresponding to constraintson u in F rest “
0. Assume throughout that F red p u q “ t i : G i p u q “ u “ H . Lemma A.1.
Suppose that the linear map DF red p u q is full rank with one-dimensionalnullspace. This also holds for DF rest p u , , , , Ψ p u q , Ψ J p u q , , q provided that p D Ψ p u qq ˚ is linearly independent of p DF red p u qq ˚ .Proof. By assumption, the inverse image of every w P Y ˆ R | I | ˆ R | P | is a one-dimensionalaffine subspace of U of the form v ‘ N p DF red p u qq for some v . By the theory of Fredholmoperators, it follows that N ` p DF red p u qq ˚ ˘ “ U ˚ , such that U ˚ “ Σ ‘ R ` p DF red p u qq ˚ ˘ . If p D Ψ p u qq ˚ is linearly independent of p DF red p u qq ˚ , it follows that U ˚ “ R ` p D Ψ p u qq ˚ ˘ ‘ R ` p DF red p u qq ˚ ˘ . The claim thenfollows by inspection of the image of the linear map DF rest p u , , , , Ψ p u q , Ψ J p u q , , q . Corollary A.2.
Suppose that the linear map DF red p u q is full rank with one-dimensionalnullspace and that p D Ψ p u qq ˚ is linearly independent of p DF red p u qq ˚ . It follows that theroots of F rest sufficiently close to p u , , , , Ψ p u q , Ψ J p u q , , q lie on a one-dimensionalmanifold of points of the form p u, , , , Ψ p u q , Ψ J p u q , , q for some root u of F red . Lemma A.3.
Suppose that the linear map DF red p u q is full rank with one-dimensionalnullspace and that u is a stationary point of Ψ along the corresponding one-dimensionalsolution manifold. Then, generically, p u , , , , Ψ p u q , Ψ J p u q , , q is a branch point of F rest through which runs a secondary one-dimensional solution manifold, locally parameterized by η , along which λ , η I , and σ P vary.Proof. By assumption, there exists a unique vector z P ` Y ˆ R | I | ˆ R | P | ˘ ˚ such that p D Ψ p u qq ˚ “ p DF red p u qq ˚ z . It follows by inspection of its image that the linear map DF rest p u , , , , Ψ p u q , Ψ J p u q , , q has a two-dimensional nullspace and is no longer fullrank. Indeed, for u « u , roots of F rest correspond to solutions to the system of equations(A.1) $’’&’’% Φ p u q “ , Ψ I p u q ´ Ψ I p u q “ ,K P p σ, ´ G p u qq ´ κ , P “ , p D Φ p u qq ˚ λ ` p D Ψ p u qq ˚ η ` p D Ψ I p u qq ˚ η I ` p DG P p u qq ˚ σ P “ . For every σ P with } σ P } “ (cid:15) !
1, there exists a one-dimensional manifold of solutions to the firstthree equations. By continuity, each such manifold generically contains a unique stationarypoint of Ψ close to u . At each such point, the fourth equation may be solved for λ , η I ,and σ P for given η ą p´ λ { η , ´ η I { η , ´ σ P { η q is close to z . It followsthat there exists a least one such point where the two values of the vector σ P agree for some0 ă η !
1. The claim follows by considering variations in (cid:15) . Corollary A.4.
Under the assumptions of Lemma A.3, u varies along the secondary branchonly if P ‰ H . Lemma A.5.
Suppose that the linear map DF red p u q is a bijection, i.e., that u is a locallyunique root of F red . Then, the point p u , , , , Ψ p u q , Ψ J p u q , , q lies on a one-dimensionalmanifold of solutions to F rest “ locally parameterized by η , along which λ , η I , and σ P vary. Proof.
By assumption, there exists a unique inverse image v P U for every w P Y ˆ R | I | ˆ R | P | . By the standard theory of Fredholm operators, N ` p DF red p u qq ˚ ˘ “ U ˚ “ R ` p DF red p u qq ˚ ˘ for all u « u . In particular, there exists a unique z P ` Y ˆ R | I | ˆ R | P | ˘ ˚ suchthat p D Ψ p u qq ˚ “ p DF red p u qq ˚ z . For every σ P with } σ P } “ (cid:15) !
1, there exists a uniquesolution near u to the first three equations in (A.1). At each such point, the fourth equationmay be solved for λ , η I , and σ P for given η ą p´ λ { η , ´ η I { η , ´ σ P { η q is close to z . It follows that there exists a least one such point where the two values of thevector σ P agree for some 0 ă η !
1. The claim follows by considering variations in (cid:15) . Lemma A.6.
Suppose that t i : G i p u q “ u “ k , the linear map DF red p u q is full rankwith one-dimensional nullspace, and p D Ψ p u qq ˚ and p DG k p u qq ˚ are linearly independentof p DF red p u qq ˚ . Then, p u , , , , Ψ p u q , Ψ J p u q , , q lies on a one-dimensional manifold ofsolutions to the continuation problem obtained by substituting G k p u q “ for the correspondingnonlinear complementary condition in F rest “ . This manifold is locally parameterized by η ,and σ k , σ P ‰ for η close, but not equal to .Proof. By assumption, there exists a unique p z, ζ q P ` Y ˆ R | I | ˆ R | P | ˆ R ˘ ˚ such that p D Ψ p u qq ˚ “ p DF red p u qq ˚ z ` p DG k p u qq ˚ ζ . For u « u , roots of the modified continuationproblem correspond to solutions to the system of equations obtained by adding p DG k p u qq ˚ σ k to the left-hand side of the last equation in (A.1) and appending G k p u q “
0. For every σ P with } σ P } “ (cid:15) !
1, there exists a unique one-dimensional manifold of solutions to the firstthree equations in (A.1). By continuity, each such manifold generically contains a uniqueintersection with G k “
0. At each such point, the remaining equation may be solved for λ , η I , σ P , and σ k for given η ą p´ λ { η , ´ η I { η , ´ σ P { η , ´ σ k { η q is closeto p z, ζ q . It follows that there exists a least one such point where the two values of the vector σ P agree for some 0 ă η !
1. The claim follows by considering variations in (cid:15) .The point u in the lemma is a singular point of the restricted continuation problem F rest “ σ k is typically positive only on one sideof u . It follows that two solution manifolds of the restricted continuation problem terminateat u , one in G k “ σ k ą
0) and one away from G k “ σ k “ u , provided that the tangent directions are positively aligned. REFERENCES [1]
A. Ben-Tal and J. Zowe , A unified theory of first and second order conditions for extremum problems intopological vector spaces , in Optimality and Stability in Mathematical Programming, Springer, 1982,pp. 39–76, https://doi.org/10.1007/BFb0120982.[2]
R. Bertrand and R. Epenoy , New smoothing techniques for solving bang–bang optimal control prob-lemsnumerical results and statistical interpretation , Optimal Control Applications and Methods, 23(2002), pp. 171–197, https://doi.org/10.1002/oca.709.[3]
S. C. Billups and K. G. Murty , Complementarity problems , J. Comput. Appl. Math., 124 (2000),pp. 303–318, https://doi.org/10.1016/S0377-0427(00)00432-5.[4]
A. E. Bryson and Y.-C. Ho , Applied Optimal Control , Hemisphere, Washington, DC, 1975.[5]
H. Dankowicz and F. Schilder , Recipes for Continuation , SIAM, 2013.
ONSTRAINED OPTIMIZATION USING PARAMETER CONTINUATION 27 [6]
G. D’Avino, S. Crescitelli, P. Maffettone, and M. Grosso , A critical appraisal of the π -criterionthrough continuation/optimization , Chem. Eng. Sci., 61 (2006), pp. 4689–4696, https://doi.org/10.1016/j.ces.2006.02.024.[7] G. D’Avino, S. Crescitelli, P. Maffettone, and M. Grosso , On the choice of the optimal periodicoperation for a continuous fermentation process , Biotechnol. Prog., 26 (2010), pp. 1580–1589, https://doi.org/10.1002/btpr.461.[8]
E. Doedel, T. F. Fairgrieve, B. Sandstede, A. R. Champneys, Y. A. Kuznetsov, andX. Wang , auto-07p : Continuation and bifurcation software for ordinary differential equations ,https://sourceforge.net/projects/auto-07p/ (accessed 2018/11/30).[9] E. Doedel, H. B. Keller, and J. P. Kern´evez , Numerical analysis and control of bifurcation problems(ii): Bifurcation in infinite dimensions , Internat. J. Bifur. Chaos Appl. Sci. Engrg., 1 (1991), pp. 745–772, https://doi.org/10.1142/S0218127491000555.[10]
B. C. Fabien , Indirect solution of inequality constrained and singular optimal control problems via asimple continuation method , J. Dyn. Syst. Meas. Control, 136 (2014), p. 021003, https://doi.org/10.1115/1.4025596.[11]
P. E. Farrell, D. A. Ham, S. W. Funke, and M. E. Rognes , Automated derivation of the adjointof high-level transient finite element programs , SIAM Journal on Scientific Computing, 35 (2013),pp. C369–C393.[12]
M. C. Ferris and J.-S. Pang , Engineering and economic applications of complementarity problems ,SIAM Review, 39 (1997), pp. 669–713, https://doi.org/10.1137/S0036144595285963.[13]
M. Gerdts , Global convergence of a nonsmooth newton method for control-state constrained optimalcontrol problems , SIAM J. Optim., 19 (2008), pp. 326–350, https://doi.org/10.1137/060657546.[14]
A. C. Hindmarsh, P. N. Brown, K. E. Grant, S. L. Lee, R. Serban, D. E. Shumaker, andC. S. Woodward , SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers , ACMTransactions on Mathematical Software (TOMS), 31 (2005), pp. 363–396.[15]
F. Jiang, H. Baoyin, and J. Li , Practical techniques for low-thrust trajectory optimization with ho-motopic approach , Journal of Guidance, Control, and Dynamics, 35 (2012), pp. 245–258, https://doi.org/10.2514/1.52476.[16]
J. Kern´evez and E. Doedel , Optimization in bifurcation problems using a continuation method , inBifurcation: Analysis, Algorithms, Applications, Springer, 1987, pp. 153–160, https://doi.org/10.1007/978-3-0348-7241-6 16.[17]
T. Lau ß , S. Oberpeilsteiner, W. Steiner, and K. Nachbagauer , The discrete adjoint gradientcomputation for optimization problems in multibody dynamics , J. Comput. Nonlinear Dynam., 12(2017), 031016, https://doi.org/10.1115/1.4035197.[18]
M. Li and H. Dankowicz , Staged construction of adjoints for constrained optimization of integro-differential boundary-value problems , SIAM J. Appl. Dyn. Syst., 17 (2018), pp. 1117–1151, https://doi.org/10.1137/17M1143563.[19]
L. B. Prasad, B. Tyagi, and H. O. Gupta , Optimal control of nonlinear inverted pendulum dynamicalsystem with disturbance input using PID controller & LQR , in Control System, Computing andEngineering (ICCSCE), 2011 IEEE International Conference on, IEEE, 2011, pp. 540–545, https://doi.org/10.1109/ICCSCE.2011.6190585.[20]
F. Schilder, H. Dankowicz, and M. Li , coco , http://sourceforge.net/projects/cocotools (accessed2018/11/30).[21] D. Sun and L. Qi , On NCP-functions , Comput. Optim. Appl., 13 (1999), pp. 201–220, https://doi.org/10.1023/A:1008669226453.[22]
J. O. Toilliez and A. J. Szeri , Optimized translation of microbubbles driven by acoustic fields , J.Acoust. Soc. Am., 123 (2008), pp. 1916–1930, https://doi.org/10.1121/1.2887413.[23]
M. Wyczalkowski and A. J. Szeri , Optimization of acoustic scattering from dual-frequency drivenmicrobubbles at the difference frequency , J. Acoust. Soc. Am., 113 (2003), pp. 3073–3079, https://doi.org/10.1121/1.1570442.[24]