Successive Convexification: A Superlinearly Convergent Algorithm for Non-convex Optimal Control Problems
Yuanqi Mao, Michael Szmuk, Xiangru Xu, and Behçet Açıkmeşe

Abstract.
This paper presents the SCvx algorithm, a successive convexification algorithm designed to solve non-convex constrained optimal control problems with global convergence and superlinear convergence-rate guarantees. The proposed algorithm can handle nonlinear dynamics and non-convex state and control constraints. It solves the original problem to optimality by successively linearizing the non-convex dynamics and constraints about the solution of the previous iteration. The resulting convex subproblems are numerically tractable, and can be computed quickly and reliably using convex optimization solvers, making the SCvx algorithm well suited for real-time applications. Analysis is presented to show that the algorithm converges both globally and superlinearly, guaranteeing: i) local optimality recovery: if the converged solution is feasible with respect to the original problem, then it is also a local optimum; ii) strong convergence: if the Kurdyka–Łojasiewicz (KL) inequality holds at the converged solution, then the solution is unique. The superlinear rate of convergence is obtained by exploiting the structure of optimal control problems, showcasing that a faster rate of convergence can be achieved by leveraging specific problem properties, compared to generic nonlinear programming methods. Numerical simulations are performed for a non-convex quad-rotor motion planning problem, and corresponding results obtained using Sequential Quadratic Programming (SQP) and general-purpose Interior Point Method (IPM) solvers are provided for comparison. The results show that the convergence rate of the SCvx algorithm is indeed superlinear, and that SCvx outperforms the other two methods by converging in fewer iterations.
Key words. optimal control, nonlinear dynamics, state constraints, control constraints, globalconvergence, superlinear convergence
AMS subject classifications.
1. Introduction.
The Successive Convexification (SCvx) algorithm was first introduced in [37] to solve non-convex optimal control problems with nonlinear system dynamics. This paper extends the SCvx algorithm to include non-convex state and control constraints, further establishes the uniqueness of the converged solution if the Kurdyka–Łojasiewicz (KL) inequality holds, and, more importantly, proves that the algorithm converges not only globally, but also superlinearly. The algorithm places minimal requirements on the underlying dynamical system, and applies to a multitude of real-world optimal control problems, such as autonomous landing of reusable rockets [8], drone path planning with obstacle avoidance [48], optimal power flow for smart grids [14], or, more generally, nonlinear Model Predictive Control (MPC) [35].

Methods to compute the solution to such non-convex optimal control problems have been the subject of much interest in recent years, and with the proliferation of self-driving cars, reusable rockets, and autonomous drone delivery systems, interest will only intensify. These methods can be broadly classified into indirect and direct methods [7]. Indirect methods use techniques from classical optimal control theory [6, 40, 31] to determine the necessary conditions of optimality, which are then solved as a two-point boundary value problem. However, the necessary conditions for complex problems, i.e., problems with nonlinear dynamics and non-convex state and control constraints, are very difficult to write down, let alone solve. Direct methods, on the other hand, offer a cleaner solution process and are less affected by problem complexity. They parameterize the state and/or control signal using a set of basis functions whose coefficients are found via parameter optimization [28, 34].

∗ Submitted to the editors February 28, 2019.
† University of Washington, Seattle, WA ([email protected], [email protected], [email protected], [email protected]).
As such, we will focus on direct methods in this paper.

For the aforementioned parameter optimization problem, finding a locally optimal solution, or even a dynamically feasible solution, quickly and reliably is often preferable to finding the globally optimal solution slowly and non-deterministically. This is particularly true for real-time control systems, where safety, stability, and determinism are typically prioritized over optimality. A common way to find locally optimal solutions is through nonlinear programming, for which there exist a number of readily available off-the-shelf software packages, including the Sequential Quadratic Programming (SQP) solver SNOPT [20], the Interior Point Method (IPM) solver IPOPT [50], and the SQP-based optimal control interface ACADO Toolkit [27]. However, the convergence behavior of generic nonlinear programming algorithms is often highly dependent on the initial guess provided to the solver. Furthermore, even when convergence is achieved, generic nonlinear programming techniques offer few bounds, if any, on the computational effort required to achieve it. As a result, these techniques are typically not applicable to real-time autonomous applications.

Convex optimization problems, on the other hand, can be reliably solved in polynomial time to global optimality [39]. More importantly, recent advances have shown that these problems can be solved in real time by both generic Second-Order Cone Programming (SOCP) solvers [15] and customized solvers that exploit the specific structure of the problem [38, 17]. To leverage the power of convex programming in solving non-convex optimal control problems, the non-convex problems must be convexified, i.e., transformed into convex optimization problems; this is the focus of this paper. Along this line, recent results on a technique known as lossless convexification proved that optimal control problems with a certain class of non-convex control constraints can be posed equivalently as relaxed convex optimal control problems [2, 3]. However, lossless convexification cannot handle nonlinear dynamics, non-convex state constraints, or more general non-convex control constraints. Hence, there is a large class of optimal control problems that are not convexifiable at this time.

Sequential Convex Programming (SCP) offers a way to handle more generic nonlinear dynamics and non-convex constraints, and has been applied to quad-rotor motion planning problems [5, 44]. While SCP usually performs well in practice, no general convergence results have been reported. To properly address the issue of convergence, [37] proposed the first version of the SCvx algorithm, which utilizes a successive convexification framework to handle nonlinear dynamics with safe-guarding features such as virtual controls and trust regions. More importantly, [37] gave a proof of global convergence for the SCvx algorithm. In this paper, we use SCP and successive convexification synonymously, but prefer the latter name as it suggests a complementary approach to lossless convexification.

Building on these previous results, this paper presents three main contributions. First, it develops a framework that extends the results from [37] to include non-convex state and control constraints. Unlike algorithms specialized to handle specific types of non-convex constraints (for instance, some SCP-type methods [32, 36] can handle problems where every constraint function itself is convex, e.g. the union of convex keep-out zones), the framework in this paper is more general, and applies to a larger class of non-convex constraints. Moreover, state and control constraints are handled using the same mechanism used to handle the nonlinear dynamics, thus reducing the implementation complexity of the algorithm.

Second, the convergence result in this paper is stronger than classical weak-convergence guarantees: we identify conditions, notably the Kurdyka–Łojasiewicz (KL) inequality, under which the converged solution is unique, and we establish the convergence of the
SCvx algorithm by verifying those conditions.

Third, under minimal assumptions, we prove that the convergence rate of the SCvx algorithm is superlinear, markedly faster than most nonlinear programming methods (e.g. linear convergence for SQP without perfect Hessian approximation [9, Theorem 3.3]). In contrast to generic nonlinear programming methods that are largely problem-agnostic, our algorithm leverages the structure of optimal control problems to obtain its superlinear rate of convergence. The simulation results in this paper also validate the convergence-rate claim. In practice, the SCvx algorithm has already been showcased in multiple agile quad-rotor obstacle avoidance flight demonstrations, where it was executed on an onboard embedded processor in real time [48, 47].

The remainder of this paper is organized as follows. In section 2, we outline the problem formulation and the SCvx algorithm. In section 3, we provide proofs of global convergence, including both weak and strong convergence. In section 4, we show the superlinear convergence rate under control-related conditions. In section 5, we present simulation results for a non-convex quad-rotor motion planning example problem, and compare our results to those produced by SQP- and IPM-based methods. Lastly, in section 6, we conclude with closing remarks.
2. Successive Convexification.

2.1. Problem Formulation.
A continuous-time optimal control problem has to be discretized before it can be solved using optimization methods on a digital computer [28]. One may use, for instance, the Gauss collocation method [23] or a pseudospectral method [19] to achieve this. Therefore, in this paper it is sufficient to consider the discrete-time finite-horizon optimal control problem stated in Problem 2.1.
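As a minimal illustration of this discretization step, a forward-Euler rollout of a double integrator yields a discrete-time update of the form used below (the scheme and the system are illustrative choices of ours, not the paper's; collocation or pseudospectral methods are preferred in practice):

```python
import numpy as np

def discretize_euler(f, x0, u_seq, dt):
    """Roll out x_{i+1} = x_i + dt * f(x_i, u_i) (forward Euler), the
    simplest way to obtain a discrete-time update x_{i+1} = f_d(x_i, u_i).
    Higher-order schemes (collocation, pseudospectral) are preferred
    in practice."""
    xs = [np.asarray(x0, dtype=float)]
    for u in u_seq:
        xs.append(xs[-1] + dt * f(xs[-1], u))
    return np.array(xs)

# Double integrator: state (position, velocity), scalar acceleration input.
f = lambda x, u: np.array([x[1], u])
traj = discretize_euler(f, x0=[0.0, 0.0], u_seq=[1.0] * 10, dt=0.1)
```

With N = 11 nodes, the rollout produces the state sequence x_1, …, x_N that the discrete-time problem below optimizes over.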
Problem 2.1 (Non-Convex Optimal Control Problem).

(2.1a) min_u C(x, u) := Σ_{i=1}^{N} φ(x_i, u_i),
subject to:
(2.1b) x_{i+1} = f(x_i, u_i), i = 1, 2, …, N − 1,
(2.1c) s(x_i, u_i) ≤ 0, i = 1, 2, …, N,
(2.1d) x_i ∈ X_i ⊆ R^{n_x}, i = 1, 2, …, N,
(2.1e) u_i ∈ U_i ⊆ R^{n_u}, i = 1, 2, …, N − 1.
In Problem 2.1, x_i and u_i represent the state and control vectors at the ith discrete time step, X_i and U_i are assumed to be convex and compact sets which include the boundary conditions, and N denotes the total number of time steps. For simplicity, and without loss of generality, we assume that N is a fixed integer (hence the final time is fixed; see [46] for a treatment of free-final-time problems). More concisely, these variables can be written as

x := [x_1^T, x_2^T, …, x_N^T]^T ∈ X ⊆ R^{n_x N},  u := [u_1^T, u_2^T, …, u_{N−1}^T]^T ∈ U ⊆ R^{n_u (N−1)},

where X is the Cartesian product of the X_i, and U is the Cartesian product of the U_i. We assume that the function φ : R^{n_x} × R^{n_u} → R in (2.1a) is convex and continuously differentiable. This is a reasonable assumption for many real-world optimal control problems; for example, the minimum-fuel problem has φ(x_i, u_i) = ‖u_i‖. The system dynamics are represented by (2.1b), where f : R^{n_x} × R^{n_u} → R^{n_x} is assumed to be a continuously differentiable nonlinear function. The state constraints are represented by (2.1c), where s : R^{n_x} × R^{n_u} → R^{n_s} is assumed to be a continuously differentiable non-convex function. Often, lossless convexification can be leveraged to handle non-convex control constraints (see e.g. [2, 3]); this should be done whenever possible. In summary, the non-convexity of the optimal control problem stated above is due to (2.1b) and (2.1c).
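A minimal sketch of the minimum-fuel stage cost just mentioned (the 1-norm is an illustrative choice; the text writes ‖u_i‖ without fixing the norm):

```python
import numpy as np

def fuel_cost(u_seq):
    """Minimum-fuel cost C(x, u) = sum_i phi(x_i, u_i) with phi = ||u_i||_1.
    (The 1-norm is an assumption; any convex norm fits the framework.)"""
    return sum(np.linalg.norm(np.atleast_1d(u), 1) for u in u_seq)

cost = fuel_cost([np.array([1.0, -2.0]), np.array([0.5, 0.0])])
```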
2.2. The SCvx Algorithm.
The basic operating principle of the SCvx algorithm involves linearizing the non-convex parts of Problem 2.1 about the solution of the kth iteration. This results in a convex subproblem that is solved to full optimality (which distinguishes it from standard trust-region based methods), producing a new solution for the (k+1)th iteration. This process is repeated in succession until convergence is achieved. In practice, the SCvx algorithm is simple to initialize, as will be demonstrated in section 5. In this paper, we will sometimes use the term succession to refer to an iteration of the
SCvx algorithm.

In general, the solution to the convex subproblem will not be optimal with respect to the original non-convex problem. To recover optimality, the algorithm must eventually converge to a solution that satisfies the first-order optimality conditions of Problem 2.1. To achieve this, we begin by denoting the solution to the kth iteration as (x^k, u^k). At each time step i, let

A_i^k := ∂f/∂x_i |_(x_i^k, u_i^k),  B_i^k := ∂f/∂u_i |_(x_i^k, u_i^k),  S_i^k := ∂s/∂x_i |_(x_i^k, u_i^k),  Q_i^k := ∂s/∂u_i |_(x_i^k, u_i^k),

and define d := x − x^k, d_i := x_i − x_i^k, w := u − u^k, and w_i := u_i − u_i^k in terms of the solution to the current iteration, (x, u). At the ith time step, the first-order approximation of (2.1b) and (2.1c) about (x_i^k, u_i^k) is given by

(2.2a) x_{i+1}^k + d_{i+1} = f(x_i^k, u_i^k) + A_i^k d_i + B_i^k w_i,
(2.2b) s(x_i^k, u_i^k) + S_i^k d_i + Q_i^k w_i ≤ 0,

which is a linear system with respect to the incremental state and control variables, d_i and w_i, of the convex subproblem. The linearization affords the benefit of convexity; however, the linearized constraints may admit no feasible (x, u), in which case it results in an infeasible convex subproblem. This undesirable phenomenon is often encountered during the early iterations of the process, and we refer to it as artificial infeasibility. To mitigate the effects of artificial infeasibility resulting from the linearization of the nonlinear dynamics, we augment the linearized dynamics in (2.2a) with an unconstrained virtual control term, v_i ∈ R^{n_v}:

(2.3) x_{i+1}^k + d_{i+1} = f(x_i^k, u_i^k) + A_i^k d_i + B_i^k w_i + E_i^k v_i,

where E_i^k ∈ R^{n_x × n_v} is selected such that im(E_i^k) = R^{n_x}, thus guaranteeing one-step controllability. Since v_i is left unconstrained, any state in the feasible region of the convex subproblem is reachable in finite time.
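The Jacobians A_i^k and B_i^k in (2.2a) are determined analytically in the paper; as a quick illustrative stand-in, they can be estimated by finite differences (the pendulum-like dynamics below are an assumption of ours):

```python
import numpy as np

def jacobians_fd(f, x, u, eps=1e-6):
    """Forward-difference estimates of A = df/dx and B = df/du at (x, u).
    (Illustrative stand-in; the paper assumes analytic Jacobians.)"""
    x, u = np.asarray(x, float), np.asarray(u, float)
    fx = f(x, u)
    A = np.column_stack([(f(x + eps * e, u) - fx) / eps for e in np.eye(x.size)])
    B = np.column_stack([(f(x, u + eps * e) - fx) / eps for e in np.eye(u.size)])
    return A, B

# Pendulum-like dynamics, nonlinear in the state.
f = lambda x, u: np.array([x[1], -np.sin(x[0]) + u[0]])
A, B = jacobians_fd(f, x=[0.3, 0.0], u=[0.0])
```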
For example, for a mass-spring-damper system, the virtual control can be interpreted as a synthetic force used to ensure feasibility with respect to state and control constraints. Consequently, the resulting augmented convex subproblem is no longer vulnerable to artificial infeasibility arising from the linearization of (2.1b).

While the virtual control term makes any state reachable in finite time, it does not allow the problem to retain feasibility when the state and control constraints in (2.2b) define an empty feasible set (e.g., consider the pair of state constraints [1, 0, …, 0] x ≥ ε and [1, 0, …, 0] x ≤ −ε, for ε > 0). To remedy this, we introduce a virtual buffer zone term, s'_i ∈ R_+^{n_s} (meaning s'_i ≥ 0), and relax (2.2b) to

(2.4) s(x_i^k, u_i^k) + S_i^k d_i + Q_i^k w_i − s'_i ≤ 0.

The term s'_i can be understood as a relaxation that allows the non-convex state and control constraints in (2.2b) to be violated. In an obstacle avoidance example, s'_i can be interpreted as a measure of the obstacle constraint violation necessary to retain feasibility at the ith time step.

To ensure that v_i and s'_i are used only when necessary, we define

v := [v_1^T, v_2^T, …, v_{N−1}^T]^T ∈ R^{n_v (N−1)},  s' := [s_1'^T, s_2'^T, …, s_{N−1}'^T]^T ∈ R^{n_s (N−1)},

and augment the linear cost function with the term Σ_{i=1}^{N−1} λ_i P(E_i^k v_i, s'_i), where the λ_i's are sufficiently large positive penalty weights, and P : R^{n_x} × R^{n_s} → R is an exact penalty function (which will be more precisely defined later in (3.3)). Thus, to obtain the solution for the (k+1)th iteration, we use the penalized cost given by

(2.5) L^k(d, w) := C(x^k + d, u^k + w) + Σ_{i=1}^{N−1} λ_i P(E_i^k v_i, s'_i).

Note that we omit the dependence of L^k on v_i and s'_i for simplicity.
The corresponding nonlinear penalized cost is given by

(2.6) J(x, u) := C(x, u) + Σ_{i=1}^{N−1} λ_i P(x_{i+1} − f(x_i, u_i), s(x_i, u_i)).
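The nonlinear penalized cost (2.6) can be sketched as follows. Here P(a, b) = ‖a‖_1 + ‖max(0, b)‖_1 is a representative exact penalty chosen for illustration; the paper pins down P later, in (3.3):

```python
import numpy as np

def penalty_P(a, b):
    """Representative exact penalty P(a, b) = ||a||_1 + ||max(0, b)||_1.
    (Illustrative assumption; the paper defines P precisely in (3.3).)"""
    return np.abs(a).sum() + np.maximum(0.0, b).sum()

def penalized_cost(xs, us, f, s, lam, C):
    """Nonlinear penalized cost (2.6):
    J(x, u) = C(x, u) + sum_i lam_i * P(x_{i+1} - f(x_i, u_i), s(x_i, u_i))."""
    J = C(xs, us)
    for i, u in enumerate(us):
        defect = xs[i + 1] - f(xs[i], u)   # dynamics violation at step i
        J += lam[i] * penalty_P(defect, s(xs[i], u))
    return J

# Scalar integrator with a (here inactive) constraint x <= 10.
f = lambda x, u: x + u
s = lambda x, u: np.array([x - 10.0])
C = lambda xs, us: sum(abs(u) for u in us)   # minimum-fuel cost
J_feasible = penalized_cost([0.0, 1.0, 3.0], [1.0, 2.0], f, s, [10.0, 10.0], C)
```

On a dynamically feasible, constraint-satisfying trajectory, as here, both penalty terms vanish and J reduces to the original cost C.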
The second adverse effect of linearization is artificial unboundedness. This phenomenon occurs when the local properties of the non-convex problem are extrapolated well beyond the neighborhood of the linearization point. For example, consider the simple non-convex optimization problem

min x_2, subject to x_2 = x_1^2,

whose solution is (x_1*, x_2*) = (0, 0). Linearizing the constraint about any point other than (x_1*, x_2*) renders the linearized problem unbounded. To mitigate the risk of artificial unboundedness, we impose the following trust region constraint to ensure that u does not deviate significantly from the control input obtained in the previous iteration, u^k:

‖w‖ ≤ r^k.

By selecting r^k appropriately, this constraint, in conjunction with (2.3), ensures that x remains sufficiently close to the state vector obtained in the previous iteration, x^k, thus keeping the solution within the region where the linearization is accurate.

We now present the final problem formulation and a summary of the SCvx algorithm. At the (k+1)th iteration, we solve the convex optimal control subproblem, Problem 2.2. For many applications, U and X are simple convex sets, e.g. second-order cones, in which case Problem 2.2 can be readily solved by SOCP solvers (e.g. [17]) in a matter of milliseconds.
Problem 2.2 (Convex Optimal Control Subproblem).

min_{d,w} L^k(d, w),
subject to: u^k + w ∈ U,  x^k + d ∈ X,  ‖w‖ ≤ r^k.
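The artificial unboundedness guarded against by the trust region in Problem 2.2 can be checked numerically on the toy problem above (the linearization point x̄ = (1, 1) and radius are illustrative choices):

```python
# Toy problem from the text: min x2 subject to x2 = x1**2 (solution (0, 0)).
# Linearizing the constraint about xbar = (1, 1) gives x2 = 1 + 2*(x1 - 1);
# along that line the objective x2 is unbounded below, while a trust region
# |x1 - xbar1| <= r restores a finite minimum (at an endpoint, since the
# linearized objective is affine in x1).
def linearized_obj(x1, xbar1=1.0):
    return xbar1**2 + 2.0 * xbar1 * (x1 - xbar1)

unbounded_samples = [linearized_obj(-10.0**k) for k in range(1, 5)]
r = 0.5
trust_region_min = min(linearized_obj(1.0 - r), linearized_obj(1.0 + r))
```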
The SCvx algorithm solves Problem 2.1 according to the steps outlined in Algorithm 2.1. It can be considered a trust-region-type algorithm, and follows standard trust-region radius update rules. However, it also differs from conventional trust-region methods in some important ways, as discussed later in subsection 2.3.

Algorithm 2.1 The SCvx Algorithm
1: procedure SCvx(x^1, u^1, λ, ε_tol)
2: input Select initial state x^1 ∈ X and control u^1 ∈ U. Initialize trust-region radius r^1 > 0. Select penalty weight λ > 0, and parameters 0 < ρ_0 < ρ_1 < ρ_2 < 1, r_l > 0, α > 1, β > 1.
3: while not converged, i.e. ∆J^k > ε_tol, do
4: step 1 At the (k+1)th succession, solve Problem 2.2 at (x^k, u^k, r^k) to get an optimal solution (d, w).
5: step 2 Compute the actual change in the penalty cost (2.6), (2.7) ∆J^k = J(x^k, u^k) − J(x^k + d, u^k + w), and the predicted change by the convex cost (2.5), (2.8) ∆L^k = J(x^k, u^k) − L^k(d, w).
6: if ∆J^k = 0 then
7: stop, and return (x^k, u^k);
8: else
9: compute the ratio (2.9) ρ^k := ∆J^k / ∆L^k.
10: end if
11: step 3
12: if ρ^k < ρ_0 then
13: reject this step, contract the trust-region radius, i.e. r^k ← r^k/α, and go back to step 1;
14: else
15: accept this step, i.e. x^{k+1} ← x^k + d, u^{k+1} ← u^k + w, and update the trust-region radius r^{k+1} by r^{k+1} = r^k/α if ρ^k < ρ_1; r^k if ρ_1 ≤ ρ^k < ρ_2; βr^k if ρ_2 ≤ ρ^k.
16: end if
17: r^{k+1} ← max{r^{k+1}, r_l}, k ← k + 1, and go back to step 1.
18: end while
19: return (x^{k+1}, u^{k+1}).
20: end procedure

In line 2, we initialize the algorithm with a somewhat arbitrary initial state x^1 and control u^1, which need not be dynamically feasible. This is important because finding a feasible solution is itself a difficult non-convex problem, and SCvx fortunately does not require one. There are no strict rules to follow in choosing the parameters; generally speaking, ρ_0 is very close to 0, ρ_1 is slightly greater than 0, ρ_2 can be marginally less than 1, and r_l can be slightly greater than 0.

The quality of the linear approximation used in Problem 2.2 can be understood by inspecting the ratio ρ^k in (2.9), which compares the realized (nonlinear) cost reduction ∆J^k in (2.7) to the predicted (linear) cost reduction ∆L^k in (2.8) based on the previous (i.e. kth) iteration. In line 13, when ρ^k is small (i.e. when ρ^k < ρ_0 ≪ 1), the linear approximation poorly predicts the realized nonlinear cost reduction. Consequently, the solution (d, w) is rejected, the trust region is contracted by a factor of 1/α < 1, and the step is repeated. As will be shown in section 3, the contraction of the trust region ensures that this condition can occur only a finite number of times, thus guaranteeing that the algorithm will not remain in this state indefinitely.

In line 15, if ρ^k is such that the linearization accuracy is deficient, yet acceptable (i.e. when ρ_0 ≤ ρ^k < ρ_1), then the solution (d, w) is accepted, but the trust region is contracted. The former is done to avoid unnecessarily discarding a solution that has already been computed, while the latter is done to improve the linearization accuracy of the next succession.

When ρ^k is sufficiently large, yet significantly less than unity (i.e. when ρ_1 ≤ ρ^k < ρ_2), the linearization is deemed sufficiently accurate. Consequently, the solution (d, w) is accepted, and the trust region size is retained.

Lastly, when ρ^k is close to unity, the linear cost reduction accurately predicts the realized nonlinear cost reduction. Moreover, if ρ^k is greater than unity, then the linear approximation under-predicts the cost reduction, and is thus conservative. These conditions indicate that the linearization is accurate or conservative enough to enlarge the trust region. Hence, when ρ^k ≥ ρ_2, the solution (d, w) is accepted, and the trust region size is increased by a factor of β > 1, allowing for a larger w, and therefore d, in the next succession.

One important distinction between
SCvx and typical trust-region-type algorithms lies in the subproblem solved at each iteration. Conventional trust-region algorithms usually perform a line search along the Cauchy arc to achieve a sufficient reduction [13]. However, in the SCvx algorithm, each convex subproblem is solved to full optimality, thus increasing the number of inner solver iterations (e.g. IPM iterations) at each succession, while decreasing the number of outer SCvx iterations. Qualitatively speaking, the number of successions is reduced by achieving a greater cost reduction at each succession. Thanks to the capabilities of existing IPM algorithms, and due to recent advancements in IPM customization techniques (e.g. [15, 17]), each convex subproblem can be solved quickly enough to outweigh the negative impacts of solving it to full optimality.

It is also crucial to point out some key differences between the SCvx algorithm and SQP-based methods (e.g. [9]). SQP-based methods typically make use of second-order information when approximating the Hessian of the cost function (and, in some cases, of the constraints). This requires techniques like the Broyden–Fletcher–Goldfarb–Shanno (BFGS) update, which can be computationally expensive. Furthermore, additional conditions must be satisfied to ensure that the computed Hessian is positive-definite, and therefore that the resulting subproblem is convex [18]. For these two reasons, SQP-based methods are not well suited for real-time autonomous applications. The SCvx algorithm, on the other hand, relies only on first-order information obtained through linearization. While the first-order approximation may be less accurate than its second-order counterpart, the resulting error is properly regulated by the trust-region updating mechanism outlined in Algorithm 2.1. Furthermore, since the Jacobian matrices can be determined analytically, very little computational effort is expended in setting up each succession. Most importantly, as a consequence of linearization, the resulting subproblems are automatically guaranteed to be convex, thus further ensuring the robustness of the convergence process.
3. Global Convergence.
This section presents two main convergence results. The first, weak convergence, means that the algorithm generates a set of convergent subsequences, all of which satisfy the first-order conditions. This is not convergence in the strict sense, due to potential oscillation between several limit points (i.e., accumulation points), but surprisingly most of the convergence claims for nonlinear optimization schemes fall into this category (see [1] and the references therein). It is also relatively easy to prove, thanks to the well-known Bolzano–Weierstrass theorem.
The second result shows that, if the Kurdyka–Łojasiewicz (KL) inequality holds at the converged solution, the SCvx algorithm is indeed strongly convergent.
In this section, we extend the convergence result from [37] to facilitate the addition of state and control constraints. Since the state and control variables become indistinguishable once the optimal control problem is implemented as a numerical parameter optimization problem, we perform the following analysis accordingly. In lieu of the state and control variables, x and u, we will use z := [x^T, u^T]^T as our optimization variable, where z ∈ R^{n_z} and n_z = n_x N + n_u (N − 1). Thus, we have the finite-dimensional non-convex optimization problem in Problem 3.1.
Problem 3.1 (Original Non-Convex Problem).

(3.1a) min_z g_0(z),
subject to:
(3.1b) g_i(z) = 0, ∀i ∈ I_eq,
(3.1c) g_i(z) ≤ 0, ∀i ∈ I_ineq,
(3.1d) h_j(z) ≤ 0, ∀j ∈ J_ineq.

In Problem 3.1, I_eq := {1, 2, …, e} represents the set of non-convex equality constraint indices, I_ineq := {e + 1, …, p} represents the set of non-convex inequality constraint indices, and J_ineq := {1, 2, …, q} represents the set of convex inequality constraint indices. Correspondingly, equations (3.1b)–(3.1d) represent the nonlinear system dynamics, the non-convex state and control constraints, and the convex state and control constraints, respectively. We assume that g_i(z) and h_j(z) are continuously differentiable for all i ∈ I_eq ∪ I_ineq and j ∈ J_ineq, respectively. To simplify the analysis that follows, we assume that g_0(z) ∈ C^2, but note that g_0(z) can be an element of C^1 in practice.

Since we are restricting our analysis to discrete-time systems, z is a finite-dimensional vector. Consequently, the 1-norm used in [37] manifests itself as a finite sum of absolute values. Therefore, to incorporate state and control inequality constraints, the exact penalty function γ(·) used in [37] can be extended as follows:

(3.2) J(z) := g_0(z) + Σ_{i∈I_eq} λ_i |g_i(z)| + Σ_{i∈I_ineq} λ_i max(0, g_i(z)) + Σ_{j∈J_ineq} τ_j max(0, h_j(z)),

where λ_i ≥ 0 and τ_j ≥ 0, and λ := [λ_1, λ_2, …, λ_p] and τ := [τ_1, τ_2, …, τ_q] represent the penalty weights. To facilitate subsequent proofs, we include the convex constraints, h_j(z), in J(z). However, in practice, these constraints can be excluded from the cost function and included as explicit convex constraints. We can now express the corresponding penalized problem in Problem 3.2.

Problem 3.2 (Non-Convex Penalty Problem).

min_z J(z), z ∈ R^{n_z}.

Note that J(z) is non-convex, and thus needs to be convexified. According to the SCvx algorithm, at the (k+1)th succession, we linearize g_0(z), g_i(z) for all i ∈ I_eq ∪ I_ineq, and h_j(z) for all j ∈ J_ineq about z^k. This procedure is repeated until convergence, and produces a sequence of convex penalty functions given by

(3.3) L^k(d) := g_0(z^k) + ∇g_0(z^k)^T d + Σ_{i∈I_eq} λ_i |g_i(z^k) + ∇g_i(z^k)^T d| + Σ_{i∈I_ineq} λ_i max(0, g_i(z^k) + ∇g_i(z^k)^T d) + Σ_{j∈J_ineq} τ_j max(0, h_j(z^k) + ∇h_j(z^k)^T d).

Note that L^k(d) is not exactly the linearized J(z), since the linearization is applied inside the |·| and max(0, ·) functions. L^k(d) is then incorporated into the linearized penalty problem in Problem 3.3.

Problem 3.3 (Linearized Penalty Problem).

min_d L^k(d),
subject to: ‖d‖ ≤ r^k.

Again, we remind the reader that the convex constraints are linearized only to simplify the notation in the analysis that follows; in practice, they are handled explicitly (in their original form) by the convex optimization solver. The corresponding actual cost reduction in (2.7) can be rewritten as

(3.4) ∆J(z, d) = J(z) − J(z + d),

while the predicted cost reduction in (2.8) becomes

(3.5) ∆L(d) = J(z) − L(d).

To proceed with the convergence analysis, we now introduce some preliminary results and assumptions.
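For concreteness, the exact penalty (3.2) can be evaluated on a small instance (the instance itself is an illustrative assumption):

```python
import numpy as np

def J_penalty(z, g0, g_eq, g_ineq, h_ineq, lam_eq, lam_ineq, tau):
    """Exact penalty (3.2): g0(z) plus weighted |g_i| over the equality
    constraints and max(0, .) over the two inequality families."""
    val = g0(z)
    val += sum(l * abs(g(z)) for l, g in zip(lam_eq, g_eq))
    val += sum(l * max(0.0, g(z)) for l, g in zip(lam_ineq, g_ineq))
    val += sum(t * max(0.0, h(z)) for t, h in zip(tau, h_ineq))
    return val

# Toy instance: min z1^2 + z2^2 s.t. z1*z2 = 1 (non-convex), z1 <= 2 (convex).
z = np.array([1.0, 1.0])                 # feasible, so no penalty accrues
Jz = J_penalty(z, g0=lambda z: z[0]**2 + z[1]**2,
               g_eq=[lambda z: z[0] * z[1] - 1.0], g_ineq=[],
               h_ineq=[lambda z: z[0] - 2.0],
               lam_eq=[5.0], lam_ineq=[], tau=[5.0])
```

At a feasible point all penalty terms vanish and J(z) coincides with g_0(z), consistent with the exactness discussion that follows.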
Theorem 3.4 (Local Optimum, Theorem 4.1 in [24]). Let N(z̄) denote an open neighborhood of z̄ that contains a feasible point of Problem 3.1. Then, if there exist λ̄ ≥ 0, τ̄ ≥ 0, and z̄ ∈ R^{n_z} such that J(z̄) ≤ J(z) for all penalty weights λ ≥ λ̄ and τ ≥ τ̄, and for all z ∈ N(z̄), then z̄ is a local optimum of Problem 3.1.
Assumption 3.5 (LICQ). Define the following sets of indices corresponding to the active inequality constraints:

I_ac(z̄) := {i | g_i(z̄) = 0, i ∈ I_ineq} ⊆ I_ineq,  J_ac(z̄) := {j | h_j(z̄) = 0, j ∈ J_ineq} ⊆ J_ineq.

Furthermore, define G_eq(z̄) as a matrix whose columns comprise ∇g_i(z̄) for all i ∈ I_eq, G_ac,g(z̄) as a matrix whose columns comprise ∇g_i(z̄) for all i ∈ I_ac(z̄), and G_ac,h(z̄) as a matrix whose columns comprise ∇h_j(z̄) for all j ∈ J_ac(z̄). Then, the Linear Independence Constraint Qualification (LICQ) is satisfied at z̄ if the columns of the following matrix are linearly independent:

(3.6) G_ac(z̄) := {G_eq(z̄); G_ac,g(z̄); G_ac,h(z̄)}.

The following theorem states the first-order necessary conditions (i.e. the KKT conditions) for a point z̄ to be a local optimum of Problem 3.1 (in the same sense as Theorem 3.4).

Theorem 3.6 (Karush–Kuhn–Tucker (KKT), Theorem 3.2 in [24]). If the constraints of Problem 3.1 satisfy the LICQ assumption (Assumption 3.5), and z̄ is a local optimum of the original problem, Problem 3.1, then there exist Lagrange multipliers μ̄_i for all i ∈ I_eq, μ̄_i ≥ 0 for all i ∈ I_ineq, and σ̄_j ≥ 0 for all j ∈ J_ineq such that

(3.7) ∇g_0(z̄) + Σ_{i∈I_eq} μ̄_i ∇g_i(z̄) + Σ_{i∈I_ac(z̄)} μ̄_i ∇g_i(z̄) + Σ_{j∈J_ac(z̄)} σ̄_j ∇h_j(z̄) = 0.

We refer to a point that satisfies the above conditions as a
KKT point. We say a penalty function is exact if there exist finite penalty weights λ̄_i and τ̄_j such that Problem 3.1 and Problem 3.2 produce equivalent optimality conditions. Since Theorem 3.6 already gives the optimality condition for Problem 3.1, we now examine the first-order conditions for Problem 3.2.

For fixed λ, the cost J(z) is not differentiable everywhere because of the non-smoothness of |·| and max(0, ·). However, since g_i(·) and h_j(·) are both continuously differentiable, J(z) is locally Lipschitz continuous. Hence we need the following definition:

Definition 3.7 (GDD, Definition 1.3 in [11]). If J(z) is locally Lipschitz continuous, then the Generalized Directional Derivative (GDD) of J(z) at some z̄ in any direction s exists, and is defined as follows:

(3.8) dJ(z̄, s) := limsup_{z → z̄, δ → 0^+} [J(z + δs) − J(z)] / δ.

Hence, dJ also satisfies the following implicit relationship, by reasoning similar to that used in [18, Lemma 14.2.5]:

(3.9) dJ(z̄, s) = max{ν^T s | ν ∈ ∂J(z̄)},

where ∂J(z̄) is the generalized differential, defined as

(3.10) ∂J(z̄) = {ν | J(z̄ + y) ≥ J(z̄) + ν^T y, ∀y ∈ R^{n_z}}.

Applying Theorem 1 from [12] with the above definitions, we have:
Lemma 3.8 (Necessary Condition for Local Optimality). If z̄ is a locally optimal solution of the non-convex penalty problem, Problem 3.2, then 0 ∈ ∂J(z̄).

Next, define the set of stationary points of Problem 3.2 as S := {z ∈ R^{n_z} | 0 ∈ ∂J(z)}.
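As a one-dimensional illustration of (3.8)–(3.10) and Lemma 3.8 (a worked example of ours, not taken from [11] or [12]), take J(z) = |z| and z̄ = 0:

```latex
dJ(0, s) \;=\; \limsup_{z \to 0,\ \delta \to 0^{+}} \frac{|z + \delta s| - |z|}{\delta} \;=\; |s|,
\qquad
\partial J(0) \;=\; \{\nu \mid |y| \ge \nu y,\ \forall y \in \mathbb{R}\} \;=\; [-1, 1],
```

so (3.9) is verified, since max{νs | ν ∈ [−1, 1]} = |s|; and because 0 ∈ ∂J(0) = [−1, 1], the minimizer z̄ = 0 satisfies the necessary condition of Lemma 3.8, i.e., 0 ∈ S.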
Then, Lemma 3.8 states that if z̄ solves Problem 3.2, then z̄ ∈ S. We now present the exactness result. Note that this theorem is fairly well known, and its proof can be found in, for example, Theorems 4.4 and 4.8 of [24].

Theorem 3.9 (Exactness of the Penalty Function). If z̄ is a KKT point of the original Problem 3.1 with multipliers μ̄_i for all i ∈ I_eq ∪ I_ineq and σ̄_j for all j ∈ J_ineq, and if the penalty weights λ and τ satisfy

λ_i > |μ̄_i|, ∀ i ∈ I_eq ∪ I_ineq,  and  τ_j > |σ̄_j|, ∀ j ∈ J_ineq,

then z̄ ∈ S, and by definition z̄ satisfies the stationarity condition 0 ∈ ∂J(z̄) for the penalty Problem 3.2. Conversely, if a stationary point z̄ ∈ S of the penalty Problem 3.2 is feasible for the original Problem 3.1, then it is also a KKT point of the original Problem 3.1, and therefore (3.7) holds at z̄.

Although Theorem 3.9 does not suggest a way to find such λ and τ, it nevertheless has important theoretical implications. In our current implementation, we select "sufficiently" large constant λ's and τ's, a strategy that has been shown to work well in practice. Alternatively, the values of λ and τ can be adjusted after each succession, based on the value of the dual solution obtained in the previously solved subproblem. However, it is the second (i.e., "converse") part of Theorem 3.9 that is particularly important to our subsequent convergence analysis. Specifically, it guarantees that as long as we can find a stationary point of Problem 3.2 that is also feasible for Problem 3.1, then that point also satisfies the KKT conditions of Problem 3.1.

Now, two important convergence results will be stated. The first deals with the case of finite convergence, while the second handles the situation where an infinite sequence {z^k} is generated by the SCvx algorithm.
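The exactness property of Theorem 3.9 can be illustrated numerically. The following sketch uses a scalar toy problem of our own (an assumption, not the paper's example): minimize z² subject to 1 − z ≤ 0, whose KKT point is z̄ = 1 with multiplier σ̄ = 2. Any penalty weight λ > σ̄ recovers z̄ as the unconstrained minimizer of the penalized cost J, while a weight below σ̄ does not.

```python
import numpy as np

# Toy illustration of exact penalization (cf. Theorem 3.9); problem and
# weights are our own assumptions.  min z^2 s.t. 1 - z <= 0 has KKT point
# z_bar = 1 with multiplier sigma_bar = 2.
def J(z, lam):
    return z**2 + lam * np.maximum(0.0, 1.0 - z)

zs = np.linspace(-2.0, 3.0, 500001)
for lam in (3.0, 10.0):                   # any lambda > 2 is exact
    print(round(zs[np.argmin(J(zs, lam))], 3))   # → 1.0 both times
print(round(zs[np.argmin(J(zs, 1.0))], 3))       # lambda = 1 < 2: → 0.5
```

With an insufficient weight the penalized minimizer drifts into the infeasible region, which is exactly why the theorem requires λ_i > |μ̄_i| and τ_j > |σ̄_j|.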
Theorem 3.10. Given the assumption that the feasible region is convex and compact, the predicted cost reductions ΔL^k defined in (3.5) are nonnegative for all k. Furthermore, ΔL^k = 0 implies that z^k ∈ S, and therefore that z^k is a stationary point of the non-convex penalty Problem 3.2.

Proof. Denote d as the solution to the convex subproblem (i.e., Problem 3.3), and note that d = 0 is always a feasible solution to said problem. Hence we have

L^k(d) ≤ L^k(0) = J(z^k).

Therefore, ΔL^k = J(z^k) − L^k(d) ≥ 0, and ΔL^k = 0 holds if and only if d = 0 is the minimizer of Problem 3.3. When ΔL^k = 0, one can directly apply Lemma 3.8 and [37, Lemma 2] to get z^k ∈ S (see [37] for a more detailed proof).

The case when {z^k} is an infinite sequence represents a greater challenge. The analysis of this limit process is made more difficult by the non-differentiability of the penalty function J(z). To facilitate further analysis, we first note that for any z,

(3.11)  J(z + d) = L^k(d) + o(‖d‖),

where o(‖d‖) denotes higher-order terms of ‖d‖, i.e., lim_{‖d‖→0} o(‖d‖)/‖d‖ = 0, uniformly in z. This can be verified by writing out the Taylor expansions of g_i(z) and h_j(z) in J(z), and then using the fact that o(‖d‖) terms can be taken out of |·| and max(0, ·).
The next lemma is a key preliminary result, and its proof also provides some geometric insight into the SCvx algorithm. The proof is similar to, and based on, that of [37, Lemma 3].
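Both claims of Theorem 3.10 can be checked on a one-dimensional toy problem (our own assumed example, not the paper's): because d = 0 is always feasible, the predicted reduction ΔL^k is nonnegative, and it vanishes exactly at a stationary point of the penalty problem.

```python
import numpy as np

# Sketch of Theorem 3.10 on an assumed scalar problem: cost (z-2)^2,
# nonconvex equality z^2 - 1 = 0, penalty weight lam.
lam = 10.0
g0, dg0 = lambda z: (z - 2.0)**2, lambda z: 2.0*(z - 2.0)
g1, dg1 = lambda z: z**2 - 1.0,   lambda z: 2.0*z

def J(z):        # nonconvex penalized cost, as in (3.29)
    return g0(z) + lam * abs(g1(z))

def dL(zk, r):   # convex subproblem: minimize the linearized cost on |d| <= r
    ds = np.linspace(-r, r, 20001)
    L = g0(zk) + dg0(zk)*ds + lam*np.abs(g1(zk) + dg1(zk)*ds)
    i = np.argmin(L)
    return ds[i], J(zk) - L[i]           # (d*, predicted reduction)

_, dLk = dL(3.0, 1.0)    # far from the feasible set: strict decrease predicted
print(dLk > 0)           # True
_, dL0 = dL(1.0, 1.0)    # z = 1 is a stationary point of the penalty problem
print(abs(dL0) < 1e-9)   # True
```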
Lemma 3.11. Let z̄ ∈ R^{n_z} be a feasible point, but not a stationary point, of Problem 3.2. Use N(z̄, ε̄) to denote an open neighborhood of z̄ with radius ε̄, and let P(z, r) denote Problem 3.3 with a linearization evaluated at z and a trust region of radius r. Then, for any c ∈ (0, 1), there exist r̄ > 0 and ε̄ > 0 such that for all z ∈ N(z̄, ε̄) and r ∈ (0, r̄], any optimal solution d* of P(z, r) satisfies

(3.12)  ρ(z, r) := [J(z) − J(z + d*)] / [J(z) − L^k(d*)] ≥ c.
Since ¯ z is feasible but not stationary, we know that 0 / ∈ ∂J (¯ z ) accordingto Lemma 3.8.From (3.10), it follows that the generalized differential ∂J (¯ z ) is the intersectionof half spaces, and hence is a closed convex set, which does not include the vector 0.Hence, using [41, Corollary 11.4.2] (a strong form of the separation theorem of twoclosed convex sets, i.e. 0 and ∂J (¯ z )), there exists a unit vector s ∈ R n z and a scalar κ > ν ∈ ∂J (¯ z ),(3.13) ν T s ≤ − κ < . Therefore, it follows that max { ν T s | ν ∈ ∂J (¯ z ) } ≤ − κ. The left hand side is exactly the expression for the GDD, as defined in (3.9). Therefore,we have dJ (¯ z, s ) := lim sup z → ¯ zr → + J ( z + rs ) − J ( z ) r ≤ − κ. This implies that there exist positive ¯ r and ¯ (cid:15) such that for all z ∈ N (¯ z, ¯ (cid:15) ) and r ∈ (0 , ¯ r ],(3.14) J ( z + rs ) − J ( z ) r < − κ . Now, assume Problem 3.3 is solved with z ∈ N (¯ z, ¯ (cid:15) ) and r ∈ (0 , ¯ r ], producing anoptimal solution d ∗ . By using (3.11), the (nonlinear) change realized in J is∆ J ( z, d ∗ ) = J ( z ) − J ( z + d ∗ )= J ( z ) − L k ( d ∗ ) − o ( (cid:107) d ∗ (cid:107) )= ∆ L k ( d ∗ ) − o ( r ) . (3.15)Thus, the ratio ρ ( z, r ) is given by ρ ( z, r ) = ∆ J ( z, d ∗ )∆ L k ( d ∗ ) = 1 − o ( r )∆ L k ( d ∗ ) . Y. MAO, M. SZMUK, X. XU, AND B. AC¸ ıKMES¸E
Next, let d′ = rs ∈ R^{n_z}. From the definition of s, it follows that ‖d′‖ = r, and therefore that d′ is within the trust region of radius r̄. Since d* is the optimal solution, we have L^k(d*) ≤ L^k(d′), which in turn means

(3.16)  ΔL^k(d*) ≥ ΔL^k(d′).

Now, as r → 0, we will have r ∈ (0, r̄], and from (3.14) we have

(3.17)  ΔJ(z, d′) = J(z) − J(z + d′) > (κ/2) r.

Replacing d* with d′ in (3.15) and substituting into (3.17),

ΔJ(z, d′) = ΔL^k(d′) − o(r) > (κ/2) r.

Combining this with (3.16), we obtain

(3.18)  ΔL^k(d*) − o(r) ≥ ΔL^k(d′) − o(r) > (κ/2) r.

Thus ΔL^k(d*) > (κ/2) r + o(r) > 0, and the ratio

(3.19)  ρ(z, r) = 1 − o(r)/ΔL^k(d*) > 1 − o(r)/[(κ/2) r + o(r)].

Therefore, as r → 0, ρ(z, r) → 1, and thus for any c ∈ (0, 1) there exists r̄ > 0 such that for all r ∈ (0, r̄], ρ(z, r) ≥ c holds.

Remark 3.12 (Rejected steps, ρ(z, r) < ρ_0). A step is rejected when the linearization is a poor predictor of the nonlinear cost, i.e., when the ratio falls below the acceptance threshold ρ_0. Lemma 3.11 provides an assurance that the SCvx algorithm will not reject steps indefinitely: by contracting r^k sufficiently, Lemma 3.11 guarantees that the ratio ρ^k will eventually exceed ρ_0, and the algorithm will stop rejecting steps.

We are now ready to present our main result. The proof of this theorem is based on that of [37, Theorem 4]. Theorem
3.13. Given the assumption that the feasible region is convex and compact, if the SCvx algorithm generates an infinite sequence {z^k}, then {z^k} is guaranteed to have limit points. Furthermore, any such limit point, z̄, is a stationary point of the non-convex penalty Problem 3.2.

Proof. Since we have assumed the feasible region to be convex and compact, by the Bolzano–Weierstrass theorem (see, e.g., [43]), there is at least one convergent subsequence {z^{k_i}} → z̄, which is a guaranteed limit point.
The proof of stationarity of z̄ is by contradiction, i.e., we assume that z̄ is not a stationary point. From Lemma 3.11, there exist positive r̄ and ε̄ such that

ρ(z, r) ≥ ρ_0, ∀ z ∈ N(z̄, ε̄) and r ∈ (0, r̄],

since ρ_0 can be chosen arbitrarily small. Without loss of generality, we can suppose the whole subsequence {z^{k_i}} is in N(z̄, ε̄), so that

(3.20)  ρ(z^{k_i}, r) ≥ ρ_0, ∀ r ∈ (0, r̄].

If the initial trust region radius is less than r̄, then (3.20) will be trivially satisfied. On the other hand, if the initial radius is greater than r̄, then the trust region radius may have to be contracted several times. At iteration k_i, let r̂^{k_i} denote the last radius that needs to be reduced, and note that r̂^{k_i} > r̄. Additionally, let r^{k_i} be the trust region radius selected after the last rejection step. Thus, we have Problem 3.3, which was obtained by linearizing about z^{k_i}, and which is subject to the trust region constraint ‖z − z^{k_i}‖ ≤ r^{k_i}. Then r^{k_i} = r̂^{k_i}/α > r̄/α. Since there is also a lower bound r_l on r, we have

(3.21)  r^{k_i} ≥ min{ r_l, r̄/α } ≜ δ > 0.

Note that condition (3.20) can be expressed as

(3.22)  J(z^{k_i}) − J(z^{k_i+1}) ≥ ρ_0 ΔL^{k_i}.

By Theorem 3.10, we have ΔL^{k_i} ≥
0. Thus, (3.22) implies that the penalized cost is monotonically non-increasing.
Our next goal is to find a lower bound for ΔL^{k_i}. Let d* be the solution of Problem 3.3, linearized about z̄, and with trust region radius δ/2, so that ‖d*‖ ≤ δ/2. Let ẑ = z̄ + d*. Then,

(3.23)  ‖ẑ − z̄‖ ≤ δ/2.

Since z̄ is not a stationary point, Theorem 3.10 implies

ΔL^k(d*) = J(z̄) − L^k(d*) ≜ θ > 0.

Consequently, by continuity of J and L^k, there exists an i_0 such that for all i ≥ i_0,

(3.24)  J(z^{k_i}) − L^{k_i}(ẑ − z^{k_i}) > θ/2,

and

(3.25)  ‖z^{k_i} − z̄‖ < δ/2.

From (3.23) and (3.25), for all i ≥ i_0, we have

(3.26)  ‖ẑ − z^{k_i}‖ ≤ ‖ẑ − z̄‖ + ‖z^{k_i} − z̄‖ < δ ≤ r^{k_i},

where the last inequality comes from (3.21).
Defining d̂^{k_i} = ẑ − z^{k_i}, (3.26) implies that d̂^{k_i} is a feasible solution for the convex subproblem (i.e., Problem 3.3) at (z^{k_i}, r^{k_i}) when i ≥ i_0. Then if d^{k_i} is the optimal solution to this subproblem, we have L^{k_i}(d^{k_i}) ≤ L^{k_i}(d̂^{k_i}), so that

(3.27)  ΔL^{k_i}(d^{k_i}) = J(z^{k_i}) − L^{k_i}(d^{k_i}) ≥ J(z^{k_i}) − L^{k_i}(d̂^{k_i}) > θ/2.

The last inequality is due to (3.24). Combining (3.22) and (3.27), we obtain the following for all i ≥ i_0:

(3.28)  J(z^{k_i}) − J(z^{k_i+1}) ≥ ρ_0 θ/2.

However, since k_i + 1 ≤ k_{i+1}, and the penalized cost is non-increasing due to (3.22), we have J(z^{k_i+1}) ≥ J(z^{k_{i+1}}), and thus

∑_{i=1}^{∞} [J(z^{k_i}) − J(z^{k_i+1})] ≤ ∑_{i=1}^{∞} [J(z^{k_i}) − J(z^{k_{i+1}})] = J(z^{k_1}) − J(z̄) < ∞.

Therefore, the series is convergent, and necessarily J(z^{k_i}) − J(z^{k_i+1}) → 0, which contradicts (3.28). This contradiction implies that every limit point z̄ is a stationary point of Problem 3.2.
Now, combining Theorem 3.10 and Theorem 3.13 and using the "converse" part of Theorem 3.9, we can summarize the final result as follows. Theorem
3.14 (Global Weak Convergence). Given the assumption that the feasible region is convex and compact, regardless of initial conditions, the SCvx algorithm (Algorithm 2.1) always has limit points, and any limit point, z̄, is a stationary point of the non-convex penalty Problem 3.2. Furthermore, if z̄ is feasible for the original non-convex Problem 3.1, then it is a KKT point of Problem 3.1.

This weak convergence result is in line with classical nonlinear optimization analysis, but does not prevent oscillation among several limit points. In the next section, we will see that this result can in fact be strengthened to single-point convergence under some mild additional assumptions. Further, on its own Theorem 3.14 does not guarantee feasibility of the converged solution, even if the original non-convex problem, Problem 3.1, has a nonempty feasible region. However, it is important to point out that this scenario is rarely observed in practice, and when it does occur, trying different initial guesses often resolves it.
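The pieces assembled so far (linearized subproblem, predicted reduction, accuracy ratio, trust-region update) can be sketched in a few lines. The following is a minimal, assumed rendering of the accept/reject logic of Algorithm 2.1 on a scalar toy problem; the parameter names (rho0, alpha, r_l, r_max) and the problem itself are our own choices, not the paper's.

```python
import numpy as np

# Minimal sketch of SCvx-style trust-region logic (cf. Algorithm 2.1).
lam = 10.0
g0, dg0 = lambda z: (z - 2.0)**2, lambda z: 2.0*(z - 2.0)   # cost
g1, dg1 = lambda z: z**2 - 1.0,   lambda z: 2.0*z           # equality constraint
J = lambda z: g0(z) + lam*abs(g1(z))                        # penalized cost

def solve_subproblem(zk, r):                 # convex subproblem on |d| <= r
    ds = np.linspace(-r, r, 4001)
    L = g0(zk) + dg0(zk)*ds + lam*np.abs(g1(zk) + dg1(zk)*ds)
    i = np.argmin(L)
    return ds[i], L[i]

z, r = 3.0, 1.0
rho0, alpha, r_l, r_max = 0.1, 2.0, 1e-6, 1.0               # assumed parameters
for _ in range(50):
    d, L_star = solve_subproblem(z, r)
    dL = J(z) - L_star                       # predicted reduction (3.5)
    if dL < 1e-12:
        break                                # Theorem 3.10: stationary point
    rho = (J(z) - J(z + d)) / dL             # accuracy ratio (3.12)
    if rho < rho0:
        r = max(r / alpha, r_l)              # reject: contract trust region
    else:
        z, r = z + d, min(r * alpha, r_max)  # accept: (re-)expand trust region
print(round(z, 3))  # converges near the feasible local optimum z = 1
```

Lemma 3.11 is what guarantees the reject branch cannot repeat forever: shrinking r drives ρ toward 1, so it eventually exceeds ρ_0.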
In this section, we will show that the SCvx algorithm indeed converges to a single limit point under some mild additional assumptions. For the purposes of this section, it is not necessary to distinguish non-convex inequality constraints g_i(z) ≤ 0, ∀ i ∈ I_ineq, from convex inequality constraints h_j(z) ≤ 0, ∀ j ∈ J_ineq. Thus we will use g_i(z) ≤ 0, ∀ i ∈ {e + 1, ..., p + q}, to denote all inequality constraints. For strong convergence, we first need a slightly stronger smoothness assumption.

Assumption 3.15. g_i(z), ∀ i = 0, ..., p + q, have Lipschitz continuous gradients, that is, ∃ L_i ≥ 0 s.t. ‖∇g_i(z_1) − ∇g_i(z_2)‖ ≤ L_i ‖z_1 − z_2‖.

Using the simplified notation, we can rewrite the original non-convex (penalty) cost function as

(3.29)  J(z) = g_0(z) + ∑_{i=1}^{e} λ_i |g_i(z)| + ∑_{i=e+1}^{p+q} λ_i max(0, g_i(z)),

and its convex linearization at the k-th iteration as

(3.30)  L^k(d) = g_0(z^k) + ∇g_0(z^k)ᵀd + ∑_{i=1}^{e} λ_i |g_i(z^k) + ∇g_i(z^k)ᵀd| + ∑_{i=e+1}^{p+q} λ_i max(0, g_i(z^k) + ∇g_i(z^k)ᵀd).

The key assumption used in [4] to establish single-point convergence is that the function being optimized satisfies the (nonsmooth) Kurdyka–Łojasiewicz (KL) property [4, Definition 2.4], which means, roughly speaking, that the function under consideration is "sharp up to a reparametrization". Real semi-algebraic functions provide a very rich class of functions satisfying the KL property. Many other functions may also satisfy this property, among which an important class is given by functions definable in an o-minimal structure [29]. One can verify that many real-world optimal control problems indeed have the KL property, in particular by establishing their semi-algebraicity. Therefore, it is reasonable to make the following assumption:
Assumption 3.16. The penalized cost function J(z) in (3.29) has the KL property [4, Definition 2.4].

Given Assumption 3.16, [4, Section 2.3] establishes three conditions (referred to as H1–H3) that one can check to ensure strong convergence of an optimization scheme. Note that H3, the continuity condition, has already been established in Theorem 3.13. Let us now proceed to verify the other two conditions in the context of the SCvx algorithm.
Condition 3.17 (Sufficient Decrease). At iteration k, let z^k be the current point, and z^{k+1} the solution accepted by the SCvx algorithm. Then the penalized cost function J(·) in (3.29) satisfies

(3.31)  J(z^k) − J(z^{k+1}) ≥ a ‖z^{k+1} − z^k‖²,

where a > 0 is some constant.

Condition 3.18 (Relative Error). At iteration k, let z^k be the current point, and z^{k+1} the solution accepted by the SCvx algorithm. Then, given Assumption 3.15, ∃ ω^{k+1} ∈ ∂J(z^{k+1}) s.t.

(3.32)  ‖ω^{k+1}‖ ≤ b ‖z^{k+1} − z^k‖,

where b > 0 is some constant, and ∂J(·) is the generalized differential defined in (3.10).

The intuition behind these two conditions can be found in [4, p. 92]: roughly speaking, Condition 3.17 measures the quality of a descent step, while Condition 3.18 reflects a relative inexactness tolerance in the optimality conditions of the subproblems. The proof of Condition 3.17 is relatively straightforward, and is given as follows.
Proof of Condition 3.17. Note that in (3.15), d* = z^{k+1} − z^k. Then from (3.15), (3.18), and the trust region constraint ‖d*‖ ≤ r, we have, for sufficiently small r,

(3.33)  J(z^k) − J(z^{k+1}) ≥ (κ/4) r ≥ (κ/4) ‖d*‖.

Since z ∈ Z = X × U, and Z is compact, ‖d*‖ = ‖z^{k+1} − z^k‖ is bounded. Let D = max_{z_1, z_2 ∈ Z} ‖z_1 − z_2‖; then ‖d*‖ ≤ D for all d*.

Now, let a = κ/(4D). Then from (3.33), we have

J(z^k) − J(z^{k+1}) ≥ (κ/4) ‖d*‖ = (κ/4) ‖d*‖²/‖d*‖ ≥ [κ/(4D)] ‖d*‖² = a ‖d*‖² = a ‖z^{k+1} − z^k‖²,

which is exactly (3.31).

The verification of Condition 3.18 requires a bit more work and a constructive approach. We will leverage convexity of the generalized differential and some special structure of the two types of non-smoothness, |·| and max(0, ·), present in (3.29) and (3.30). Namely, ∂|g_i(z)| is symmetric with respect to 0, and is in fact the convex hull spanned by −∇g_i(z) and ∇g_i(z). Thus we have, ∀ ω_i ∈ ∂|g_i(z)|, i = 1, ..., e, ∃ α_i ∈ [0, 1] s.t.

(3.34)  ω_i = α_i ∇g_i(z) + (1 − α_i)(−∇g_i(z)) = (2α_i − 1) ∇g_i(z).

Similarly, ∂max(0, g_i(z)) is the convex hull of 0 and ∇g_i(z), and we also have, ∀ ω_i ∈ ∂max(0, g_i(z)), i = e + 1, ..., p + q, ∃ α_i ∈ [0, 1] s.t.

(3.35)  ω_i = α_i ∇g_i(z) + (1 − α_i)·0 = α_i ∇g_i(z).

Next we present two technical lemmas asserting the close proximity of the generalized differentials at z^{k+1} and their linear approximations around z^k. Lemma
3.19. For any ν_i^k ∈ ∂_d |g_i(z^k) + ∇g_i(z^k)ᵀd|, i = 1, ..., e, there exist ω_i^{k+1} ∈ ∂|g_i(z^{k+1})| and a constant c_i ≥ 0 s.t.

‖ω_i^{k+1} − ν_i^k‖ ≤ c_i ‖z^{k+1} − z^k‖.

Proof. The distance between the two generalized differentials is captured by the distance between the two vectors ω_i^{k+1} and ν_i^k, and we have

(3.36)  ‖ω_i^{k+1} − ν_i^k‖ = ‖ω_i^{k+1} − ∇g_i(z^{k+1}) + ∇g_i(z^{k+1}) − ν_i^k + ∇g_i(z^k) − ∇g_i(z^k)‖ ≤ ‖ω_i^{k+1} − ∇g_i(z^{k+1}) − ν_i^k + ∇g_i(z^k)‖ + ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖.

Now, using (3.34), we have ω_i^{k+1} = (2α_i − 1) ∇g_i(z^{k+1}) and ν_i^k = (2ᾱ_i − 1) ∇g_i(z^k). Note that α_i ∈ [0, 1] is free, while ᾱ_i ∈ [0, 1] is fixed for a given ν_i^k. Therefore, we have

‖ω_i^{k+1} − ν_i^k‖ ≤ ‖(2α_i − 2) ∇g_i(z^{k+1}) − (2ᾱ_i − 2) ∇g_i(z^k)‖ + ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖.

As illustrated in Figure 1, we may choose α_i = ᾱ_i, and get

‖ω_i^{k+1} − ν_i^k‖ ≤ (2 − 2ᾱ_i) ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖ + ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖ = (3 − 2ᾱ_i) ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖.

By Assumption 3.15, we have

‖ω_i^{k+1} − ν_i^k‖ ≤ (3 − 2ᾱ_i) L_i ‖z^{k+1} − z^k‖ := c_i ‖z^{k+1} − z^k‖.
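The bound in Lemma 3.19 is easy to probe numerically. The sketch below uses a scalar example of our own (not from the paper), g(z) = z² − 1, whose gradient 2z is Lipschitz with constant L = 2, and checks the inequality for the parametrized subgradients used in the proof with a matched factor α = ᾱ.

```python
import numpy as np

# Numeric probe of the Lemma 3.19 bound with the parametrization (3.34):
# w = (2a-1) g'(z1), v = (2a-1) g'(z0), claimed constant c = (3 - 2a) L.
# Example g(z) = z^2 - 1 is our assumption: g'(z) = 2z, L = 2.
Lg = 2.0
grad = lambda z: 2.0 * z
rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    z0, z1 = rng.uniform(-5.0, 5.0, size=2)
    a = rng.uniform(0.0, 1.0)            # shared convex-combination factor
    w, v = (2*a - 1)*grad(z1), (2*a - 1)*grad(z0)
    ok &= abs(w - v) <= (3 - 2*a)*Lg*abs(z1 - z0) + 1e-12
print(ok)  # True: the bound holds for every random draw
```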
In the proof of Lemma 3.19, by choosing α_i = ᾱ_i, we are effectively setting ω_i^{k+1} to have the same convex combination factor as ν_i^k.

Lemma 3.20. For any ν_i^k ∈ ∂_d max(0, g_i(z^k) + ∇g_i(z^k)ᵀd), i = e + 1, ..., p + q, there exist ω_i^{k+1} ∈ ∂max(0, g_i(z^{k+1})) and a constant c_i ≥ 0 s.t.

‖ω_i^{k+1} − ν_i^k‖ ≤ c_i ‖z^{k+1} − z^k‖.

Proof. Inequality (3.36) still holds in this case. Now, using (3.35), we have ω_i^{k+1} = α_i ∇g_i(z^{k+1}) and ν_i^k = ᾱ_i ∇g_i(z^k). Again, note that α_i ∈ [0, 1] is free, while ᾱ_i ∈ [0, 1] is fixed for a given ν_i^k. Therefore, from (3.36) we have

‖ω_i^{k+1} − ν_i^k‖ ≤ ‖(α_i − 1) ∇g_i(z^{k+1}) − (ᾱ_i − 1) ∇g_i(z^k)‖ + ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖.

Again, choose α_i = ᾱ_i, and we get

‖ω_i^{k+1} − ν_i^k‖ ≤ (1 − ᾱ_i) ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖ + ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖ = (2 − ᾱ_i) ‖∇g_i(z^{k+1}) − ∇g_i(z^k)‖.

By Assumption 3.15, we have

‖ω_i^{k+1} − ν_i^k‖ ≤ (2 − ᾱ_i) L_i ‖z^{k+1} − z^k‖ := c_i ‖z^{k+1} − z^k‖.

Now we are ready to present the proof of Condition 3.18, which makes use of the optimality conditions of the convex subproblem.
Proof of Condition 3.18. First note that

(3.37)  ω^{k+1} = ∇g_0(z^{k+1}) + ∑_{i=1}^{e} λ_i ω_i^{k+1} + ∑_{i=e+1}^{p+q} λ_i ω_i^{k+1},

where ω_i^{k+1} ∈ ∂|g_i(z^{k+1})|, i = 1, ..., e, and ω_i^{k+1} ∈ ∂max(0, g_i(z^{k+1})), i = e + 1, ..., p + q. Now let us turn to the stationarity condition for the convex subproblem:

(3.38)  0 ∈ ∂_d L^k(d) + N_{‖d‖ ≤ r^k}(d),

where N_{‖d‖ ≤ r^k}(d) is the normal cone of the set ‖d‖ ≤ r^k at d, which turns out to be c d, i.e., c(z^{k+1} − z^k), where c ≥ 0. Given L^k(d) in (3.30), the condition in (3.38) is equivalent to

(3.39)  ∃ ν^k ∈ ∂_d L^k(d) s.t. 0 = ν^k + c(z^{k+1} − z^k),

where ν^k = ∇g_0(z^k) + ∑_{i=1}^{e} λ_i ν_i^k + ∑_{i=e+1}^{p+q} λ_i ν_i^k, with ν_i^k ∈ ∂_d |g_i(z^k) + ∇g_i(z^k)ᵀd|, i = 1, ..., e, and ν_i^k ∈ ∂_d max(0, g_i(z^k) + ∇g_i(z^k)ᵀd), i = e + 1, ..., p + q. Therefore, (3.39) can be written as

0 = ∇g_0(z^k) + ∑_{i=1}^{e} λ_i ν_i^k + ∑_{i=e+1}^{p+q} λ_i ν_i^k + c(z^{k+1} − z^k)

⇔ 0 = ∇g_0(z^k) − ∇g_0(z^{k+1}) + ∇g_0(z^{k+1}) + ∑_{i=1}^{e} λ_i (ν_i^k − ω_i^{k+1} + ω_i^{k+1}) + ∑_{i=e+1}^{p+q} λ_i (ν_i^k − ω_i^{k+1} + ω_i^{k+1}) + c(z^{k+1} − z^k)

⇔ −[∇g_0(z^{k+1}) + ∑_{i=1}^{e} λ_i ω_i^{k+1} + ∑_{i=e+1}^{p+q} λ_i ω_i^{k+1}] = ∇g_0(z^k) − ∇g_0(z^{k+1}) + ∑_{i=1}^{e} λ_i (ν_i^k − ω_i^{k+1}) + ∑_{i=e+1}^{p+q} λ_i (ν_i^k − ω_i^{k+1}) + c(z^{k+1} − z^k).

The left-hand side of the above equation is exactly −ω^{k+1}, with ω^{k+1} as in (3.37). Taking norms and applying Assumption 3.15, Lemma 3.19, and Lemma 3.20 to the right-hand side, we get

‖ω^{k+1}‖ ≤ ‖∇g_0(z^k) − ∇g_0(z^{k+1})‖ + ∑_{i=1}^{p+q} λ_i ‖ν_i^k − ω_i^{k+1}‖ + c ‖z^{k+1} − z^k‖ ≤ (L_0 + ∑_{i=1}^{p+q} λ_i c_i + c) ‖z^{k+1} − z^k‖,

where c_i = (3 − 2ᾱ_i) L_i for i = 1, ..., e, and c_i = (2 − ᾱ_i) L_i for i = e + 1, ..., p + q, with ᾱ_i the convex combination factor for ν_i^k. Letting b = L_0 + ∑_{i=1}^{p+q} λ_i c_i + c, we obtain (3.32).

Now that we have verified the strong convergence conditions, Condition 3.17 and Condition 3.18, by using [4, Theorem 2.9] and the weak convergence result of Theorem 3.14, we are in a position to claim the following: Theorem
3.21 (Global Strong Convergence). Suppose that, in addition to the assumptions of Theorem 3.14, Assumptions 3.15 and 3.16 hold. Then, regardless of initial conditions, the sequence {z^k} generated by the SCvx algorithm (Algorithm 2.1) always converges to a single limit point, z̄, and z̄ is a stationary point of the non-convex penalty Problem 3.2. Furthermore, if z̄ is feasible for the original Problem 3.1, then it is a KKT point of Problem 3.1.
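The sufficient-decrease condition underpinning this theorem can also be spot-checked numerically. The sketch below runs a few linearize-and-step iterations on a scalar toy problem of our own (an assumption, not the paper's example) and verifies that every step satisfies J(z_k) − J(z_{k+1}) ≥ a‖z_{k+1} − z_k‖² for some a > 0, as in (3.31).

```python
import numpy as np

# Spot-check of Condition 3.17 on an assumed scalar problem: cost (z-2)^2,
# nonconvex equality z^2 - 1 = 0, penalty weight lam, trust radius r.
lam, r = 10.0, 0.5
g0, dg0 = lambda z: (z - 2.0)**2, lambda z: 2.0*(z - 2.0)
g1, dg1 = lambda z: z**2 - 1.0,   lambda z: 2.0*z
J = lambda z: g0(z) + lam*abs(g1(z))

z, ratios = 3.0, []
for _ in range(20):
    ds = np.linspace(-r, r, 2001)
    L = g0(z) + dg0(z)*ds + lam*np.abs(g1(z) + dg1(z)*ds)
    d = ds[np.argmin(L)]
    if abs(d) < 1e-12:
        break                                  # stationary point reached
    ratios.append((J(z) - J(z + d)) / d**2)    # candidate constant a at step k
    z += d
print(min(ratios) > 0.0)  # True: (3.31) holds along the whole trajectory
```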
4. Superlinear Convergence Rate.
In this section, we will show that the SCvx algorithm converges not only globally, but also superlinearly, under some additional mild assumptions. Moreover, we will see that the superlinear rate of convergence is enabled by the structure of the underlying optimal control problem. In other words, the SCvx algorithm is specifically tailored to solve non-convex optimal control problems, and thus enjoys a faster convergence rate than generic nonlinear programming methods, which often converge linearly, if at all.
First, we note that at each succession of Algorithm 2.1 we solve a convex programming problem, which is best done using IPMs that employ the self-dual embedding technique introduced in [21]. A particular advantage of this approach is that it always produces a strictly complementary solution, thus satisfying the following assumption:
Assumption 4.1 (Strict Complementary Slackness). In addition to the KKT conditions in Theorem 3.6, we assume that the following conditions are satisfied at the local optimum z̄:

μ̄_i > 0, ∀ i ∈ I_ac(z̄),  and  σ̄_j > 0, ∀ j ∈ J_ac(z̄).

The next assumption leverages the structure of optimal control problems, and is crucial in the subsequent analysis.

Assumption 4.2 (Binding). Let z^k → z̄. There are at least n_u(N − 1) binding constraints among g_i(z̄) ≤ 0, i ∈ I_ineq, and h_j(z̄) ≤ 0, j ∈ J_ineq. That is, |I_ac(z̄)| + |J_ac(z̄)| ≥ n_u(N − 1).

Optimal control problems often obey the bang-bang principle, provided that the Hamiltonian is affine in the controls, the control set is a convex polyhedron, and there are no singular arcs [45]. Linear systems with these properties are referred to as normal (see Corollary 7.3 of [6]). For nonlinear systems, the non-singularity condition can be checked by sequentially examining the Lie brackets of the system dynamics (see Section 4.4.3 of [31]). If the optimal control is indeed bang-bang, then Assumption 4.2 is obviously satisfied. One classic example is the bang-bang solution obtained by solving a minimum-time optimal control problem [30]. The algorithm proposed in [26] can also be used to ensure the bang-bang property of the solutions.
Note that even if some control constraints are inactive at the optimal solution, as long as there are an equal or greater number of active state constraints, Assumption 4.2 still holds. An interesting example of such a case is the maximum-divert planetary landing problem [25], which contains both control constraints and velocity constraints. In this example, the control constraints are inactive only when the velocity constraints are active.
Nevertheless, given the special structure of optimal control problems and the properties of their solutions, we have the following lemma:
Lemma 4.3. Given Assumptions 3.5 and 4.2, the gradient set of active constraints G_ac(z̄) defined in (3.6) contains a basis of R^{n_z}, where n_z = n_x N + n_u(N − 1). In other words, there are at least n_z linearly independent vectors in G_ac(z̄).

Proof. From Assumptions 3.5 and 4.2, we have at least n_u(N − 1) linearly independent gradient vectors from active control constraints. In addition, from the formulation of the original optimal control problem, we know that there are at least n_x N equality constraints due to the system dynamics, and their gradient vectors are also linearly independent by Assumption 3.5. Therefore, G_ac(z̄) has at least n_z linearly independent vectors.

Next, we prove several technical lemmas that are instrumental in obtaining the final convergence rate result. In this section, we will use {z^k} → z̄ broadly to denote weak convergence (i.e., {z^k} is the subsequence converging to z̄), but if Assumptions 3.15 and 3.16 are also satisfied, then the same notation means strong convergence (to a single limit point). Lemma
4.4. Let {y^k} be a sequence in R^{n_z} such that y^k ≠ z̄ and {y^k} → z̄. Then there exists ξ ∈ R^{n_z}, ‖ξ‖ = 1, such that for any function g(·) ∈ C¹ : R^{n_z} → R, we have a subsequence, denoted by k_s, s = 1, 2, ..., such that

(4.1)  lim_{k_s → ∞} [g(y^{k_s}) − g(z̄)] / ‖y^{k_s} − z̄‖ = ∇g(z̄)ᵀξ.

Proof. Define ξ^k := (y^k − z̄)/‖y^k − z̄‖, so that ‖ξ^k‖ = 1. Clearly {ξ^k} is bounded, and by the Bolzano–Weierstrass theorem (see, e.g., [43]) this sequence has a convergent subsequence {ξ^{k_s}} → ξ, i.e., the limit on the left-hand side of (4.1) exists, and

(4.2)  ξ = lim_{k_s → ∞} ξ^{k_s} = lim_{k_s → ∞} (y^{k_s} − z̄)/‖y^{k_s} − z̄‖,

with ‖ξ‖ = ‖ξ^{k_s}‖ = 1. Let y^{k_s} = z̄ + ω^{k_s} ξ^{k_s}, where ω^{k_s} = ‖y^{k_s} − z̄‖ and ω^{k_s} → 0 as k_s → ∞. Then we have

lim_{k_s → ∞} [g(y^{k_s}) − g(z̄)] / ‖y^{k_s} − z̄‖ = lim_{k_s → ∞} [g(z̄ + ω^{k_s} ξ^{k_s}) − g(z̄)] / ‖ω^{k_s} ξ^{k_s}‖.

Let θ^{k_s} := ξ^{k_s} − ξ, so that θ^{k_s} → 0 as k_s → ∞. Now we have

(4.3)  lim_{k_s → ∞} [g(y^{k_s}) − g(z̄)] / ‖y^{k_s} − z̄‖ = lim_{k_s → ∞} [g(z̄ + ω^{k_s} ξ + ω^{k_s} θ^{k_s}) − g(z̄ + ω^{k_s} ξ)] / ω^{k_s} + lim_{k_s → ∞} [g(z̄ + ω^{k_s} ξ) − g(z̄)] / ω^{k_s}.

Since g(·) ∈ C¹, we can apply the mean value theorem, so that the first term of (4.3) becomes

lim_{k_s → ∞} ω^{k_s} (θ^{k_s})ᵀ ∇g(z̄ + ω^{k_s} ξ + ω^{k_s} θ̄^{k_s}) / ω^{k_s} = lim_{k_s → ∞} ∇g(z̄)ᵀ θ^{k_s} = 0,

where θ̄^{k_s} denotes a point that lies on the line segment between 0 and θ^{k_s}. Therefore, (4.3) becomes

lim_{k_s → ∞} [g(y^{k_s}) − g(z̄)] / ‖y^{k_s} − z̄‖ = lim_{k_s → ∞} [g(z̄ + ω^{k_s} ξ) − g(z̄)] / ω^{k_s} = ∇g(z̄)ᵀξ,

by the definition of the directional derivative. Lemma
4.5. Let {z^k} be the convergent sequence generated by the SCvx algorithm, with {z^k} → z̄. Assume z̄ is feasible for the original problem, Problem 3.1. Then, under Assumptions 3.5, 4.1, and 4.2, there exist β > 0 and δ > 0 such that, ∀ z ∈ N(z̄, δ) := { z | ‖z − z̄‖ ≤ δ },

(4.4)  J(z) − J(z̄) ≥ β ‖z − z̄‖,

where J(z) is the penalty cost defined in (3.2).

Proof. We prove the claim by contradiction. Assume the statement is false. This means that for a given diminishing sequence {ε^k} → 0, k = 1, 2, ..., there exists a sequence {y^k} ≠ z̄ with {y^k} → z̄, k = 1, 2, ..., such that

(4.5)  J(y^k) − J(z̄) ≤ ε^k ‖y^k − z̄‖.

Now let {y^{k_s}} denote a subsequence of {y^k} such that (4.2) holds and ‖ξ‖ = 1. Since z̄ is assumed to be feasible, we have J(z̄) = g_0(z̄). Therefore, (4.5) with respect to k_s becomes

g_0(y^{k_s}) + ∑_{i ∈ I_eq} λ_i |g_i(y^{k_s})| + ∑_{i ∈ I_ineq} λ_i max(0, g_i(y^{k_s})) + ∑_{j ∈ J_ineq} τ_j max(0, h_j(y^{k_s})) − g_0(z̄) ≤ ε^{k_s} ‖y^{k_s} − z̄‖,

which can be rewritten as

g_0(y^{k_s}) − g_0(z̄) + ∑_{i ∈ I_eq} λ_i |g_i(y^{k_s}) − g_i(z̄)| + ∑_{i ∈ I_ac(z̄)} λ_i max(0, g_i(y^{k_s}) − g_i(z̄)) + ∑_{j ∈ J_ac(z̄)} τ_j max(0, h_j(y^{k_s}) − h_j(z̄)) ≤ ε^{k_s} ‖y^{k_s} − z̄‖.

Dividing both sides by ‖y^{k_s} − z̄‖ and letting k_s → ∞, by Lemma 4.4 and the definition of ξ in (4.2), we have

∇g_0(z̄)ᵀξ + ∑_{i ∈ I_eq} λ_i |∇g_i(z̄)ᵀξ| + ∑_{i ∈ I_ac(z̄)} λ_i max(0, ∇g_i(z̄)ᵀξ) + ∑_{j ∈ J_ac(z̄)} τ_j max(0, ∇h_j(z̄)ᵀξ) ≤ 0.

Now recall the KKT condition in Theorem 3.6. Subtracting the inner product of (3.7) with ξ from the above inequality, we have

∑_{i ∈ I_eq} [λ_i |∇g_i(z̄)ᵀξ| − μ̄_i ∇g_i(z̄)ᵀξ] + ∑_{i ∈ I_ac(z̄)} [λ_i max(0, ∇g_i(z̄)ᵀξ) − μ̄_i ∇g_i(z̄)ᵀξ] + ∑_{j ∈ J_ac(z̄)} [τ_j max(0, ∇h_j(z̄)ᵀξ) − σ̄_j ∇h_j(z̄)ᵀξ] ≤ 0,

where μ̄_i and σ̄_j are the Lagrange multipliers associated with the constraints. Due to the exactness property in Theorem 3.9, these three terms are all nonnegative, and by the strict complementary slackness property in Assumption 4.1, we have

∇g_i(z̄)ᵀξ = 0, ∀ i ∈ I_eq ∪ I_ac(z̄),  and  ∇h_j(z̄)ᵀξ = 0, ∀ j ∈ J_ac(z̄),

and therefore [∇g_i(z̄)ᵀ; ∇h_j(z̄)ᵀ] ξ = 0, ∀ i ∈ I_eq ∪ I_ac(z̄), j ∈ J_ac(z̄). However, by Lemma 4.3 we know that the stacked gradients ∇g_i(z̄), ∇h_j(z̄) contain a basis of R^{n_z}. Since ξ ∈ R^{n_z}, this implies that ξ = 0, which contradicts the fact that ‖ξ‖ = 1. Thus (4.4) holds.

Lemma 4.5 provides an important condition that is satisfied by many optimal control problems. Next, we only need to show that, given this condition, the SCvx procedure will indeed converge superlinearly. To proceed, let us first denote the stack of g_0(·), g_i(·), and h_j(·) by G(·) ∈ C¹, and represent J(·) by the function composition ψ(G(·)); qualitatively, we can write

ψ(G(·)) = G_cost(·) + |G_eq(·)| + max(0, G_ineq(·)).

Note that ψ(·) is convex since both |·| and max(0, ·) are convex functions. Lemma
4.6. Under the assumptions of Lemma 4.5, there exists γ > 0 such that

(4.6)  ψ(G(z̄) + ∇G(z̄)ᵀd) ≥ ψ(G(z̄)) + γ ‖d‖, ∀ d ∈ R^{n_z}.

Proof. First we show that the statement is true for any small step d_δ with ‖d_δ‖ ≤ δ, where δ is as defined in Lemma 4.5 (shrinking δ if necessary so that the o(‖d_δ‖) term below is dominated). We have

ψ(G(z̄) + ∇G(z̄)ᵀd_δ) − ψ(G(z̄)) = ψ(G(z̄ + d_δ)) − ψ(G(z̄)) + ψ(G(z̄) + ∇G(z̄)ᵀd_δ) − ψ(G(z̄ + d_δ)) ≥ β ‖d_δ‖ + o(‖d_δ‖) ≥ (β/2) ‖d_δ‖.

Let γ := β/2; then (4.6) holds for a small step d_δ. Note that the first inequality is due to Lemma 4.5 and, again, the fact that o(‖d_δ‖) terms can be taken out of |·| and max(0, ·), together with G(·) ∈ C¹.
Now, to generalize this result to any d ∈ R^{n_z}, d ≠ 0, we first define

ζ := min(1, δ/‖d‖)

and z_ζ := z̄ + ζd. Denoting z_ζ − z̄ by d_ζ, we have d_ζ = ζd. With this definition of ζ, one can verify that ‖d_ζ‖ ≤ δ. Hence, (4.6) holds for d_ζ. Therefore, we have

γζ ‖d‖ = γ ‖d_ζ‖ ≤ ψ(G(z̄) + ∇G(z̄)ᵀd_ζ) − ψ(G(z̄))
= ψ(G(z̄) + ζ∇G(z̄)ᵀd) − ψ(G(z̄))
= ψ((1 − ζ)G(z̄) + ζ[G(z̄) + ∇G(z̄)ᵀd]) − ψ(G(z̄))
≤ (1 − ζ)ψ(G(z̄)) + ζψ(G(z̄) + ∇G(z̄)ᵀd) − ψ(G(z̄))
= ζ[ψ(G(z̄) + ∇G(z̄)ᵀd) − ψ(G(z̄))].

The second inequality is due to the convexity of ψ (note that 0 ≤ ζ ≤ 1). Dividing both sides by ζ, we obtain (4.6). Theorem
Theorem 4.7 (Superlinear Convergence). Let $\{z^k\} \to \bar z$ be a sequence generated by the SCvx algorithm, and suppose that the foregoing assumptions are satisfied. Then we have
\[
(4.7) \qquad \|z^{k+1} - \bar z\| = o\big(\|z^k - \bar z\|\big).
\]
Proof.
First, consider the case without trust region constraints. Since $d^k := z^{k+1} - z^k$ is the unconstrained optimal solution to the convex subproblem, we have, $\forall d \in \mathbb{R}^{n_z}$,
\[
\psi\big(G(z^k) + \nabla G(z^k)^T d\big) \ge \psi\big(G(z^k) + \nabla G(z^k)^T d^k\big).
\]
Let $d = \bar z - z^k$. Then we have
\[
\psi\big(G(z^k) + \nabla G(z^k)^T (\bar z - z^k)\big) - \psi\big(G(z^k) + \nabla G(z^k)^T (z^{k+1} - z^k)\big) \ge 0.
\]
Together with (4.6), we have
\[
\gamma \|z^{k+1} - \bar z\| \le \psi\big(G(\bar z) + \nabla G(\bar z)^T (z^{k+1} - \bar z)\big) - \psi\big(G(\bar z)\big)
+ \psi\big(G(z^k) + \nabla G(z^k)^T (\bar z - z^k)\big) - \psi\big(G(z^k) + \nabla G(z^k)^T (z^{k+1} - z^k)\big).
\]
Since $\psi$ is convex and has a compact domain, it is (locally) Lipschitz continuous (see, e.g., [42]). Therefore, we have
\[
\psi\big(G(z^k) + \nabla G(z^k)^T (\bar z - z^k)\big) - \psi\big(G(\bar z)\big) \le L \big\| G(z^k) - G(\bar z) + \nabla G(z^k)^T (\bar z - z^k) \big\|,
\]
where $L$ is the Lipschitz constant, and
\[
\psi\big(G(\bar z) + \nabla G(\bar z)^T (z^{k+1} - \bar z)\big) - \psi\big(G(z^k) + \nabla G(z^k)^T (z^{k+1} - z^k)\big)
\le L \big\| G(\bar z) - G(z^k) - \nabla G(z^k)^T (\bar z - z^k) + \big[\nabla G(\bar z) - \nabla G(z^k)\big]^T \big(z^{k+1} - \bar z\big) \big\|
\le L \big\| G(\bar z) - G(z^k) - \nabla G(z^k)^T (\bar z - z^k) \big\| + L \big\| \big[\nabla G(\bar z) - \nabla G(z^k)\big]^T \big(z^{k+1} - \bar z\big) \big\|.
\]
Combining the two parts, we obtain
\[
\|z^{k+1} - \bar z\| \le \frac{2L}{\gamma} \big\| G(\bar z) - G(z^k) - \nabla G(z^k)^T (\bar z - z^k) \big\| + \frac{L}{\gamma} \big\| \big[\nabla G(\bar z) - \nabla G(z^k)\big]^T \big(z^{k+1} - \bar z\big) \big\|.
\]
Since $G(\cdot) \in C^1$, we have
\[
\big\| G(\bar z) - G(z^k) - \nabla G(z^k)^T (\bar z - z^k) \big\| = o\big(\|z^k - \bar z\|\big).
\]
Given the fact that $\big(\nabla G(\bar z) - \nabla G(z^k)\big) \to 0$ as $k \to \infty$, we have
\[
\big\| \big[\nabla G(\bar z) - \nabla G(z^k)\big]^T \big(z^{k+1} - \bar z\big) \big\| = o\big(\|z^{k+1} - \bar z\|\big).
\]
Combining these two results, we obtain $\|z^{k+1} - \bar z\| = o\big(\|z^k - \bar z\|\big)$.

Now, consider the SCvx algorithm with trust region constraints. From (3.19) in the proof of Lemma 3.11, $\rho^k \to 1$, so there exists $k'$ large enough such that $\rho^k > \rho_1$ for all $k \ge k'$, where $\rho^k$ is the ratio defined in (2.9). This means that $r^k \ge r^{k'}$ for all $k \ge k'$, because we will not shrink the trust region radius from iteration $k'$ on. On the other hand, from the global convergence results (Theorems 3.14 and 3.21), we know that $d^k \to 0$ as $k$ increases. Hence there exists $k'' \ge k'$ such that
\[
\|d^k\| < r^{k'} \le r^k, \quad \forall k \ge k'',
\]
implying that the trust region constraints will eventually become inactive. Therefore, the above conclusion about the unconstrained problem holds for the trust region constrained case as well.

With Theorem 4.7, we have established, in theory, the superlinear convergence rate of the SCvx algorithm for many optimal control problems. Next, we verify this claim through numerical simulations.
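The trust-region mechanism invoked in this argument (the radius is only shrunk when the ratio $\rho^k$ falls below a threshold) can be sketched as follows. This is a schematic reconstruction, not the paper's exact update rule; the thresholds `rho0`, `rho1`, `rho2` and the shrink/growth factors `alpha`, `beta` are assumed placeholder values, not those of Table 1:

```python
def update_trust_region(rho, r, alpha=2.0, beta=2.0,
                        rho0=0.0, rho1=0.25, rho2=0.7):
    """Schematic SCvx trust-region update (assumed form).

    rho : ratio of actual to predicted cost decrease (cf. (2.9))
    r   : current trust-region radius r^k
    Returns (accept_step, new_radius).
    """
    if rho < rho0:
        # Poor agreement with the convex model: reject and shrink.
        return False, r / alpha
    if rho < rho1:
        # Marginal agreement: accept the step, but shrink the region.
        return True, r / alpha
    if rho < rho2:
        # Good agreement: accept and keep the radius.
        return True, r
    # Excellent agreement: accept and enlarge the radius.
    return True, r * beta
```

Note that for any `rho >= rho1` the radius is never reduced, which mirrors the step in the proof where $\rho^k > \rho_1$ for all $k \ge k'$ implies $r^k \ge r^{k'}$.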
5. Numerical Results.
In this section, we present a non-convex quad-rotor motion planning example problem to demonstrate the convergence rate of the SCvx algorithm. The problem was posed as a fixed-final-time minimum-fuel optimal control problem, and included non-convexities such as obstacle keep-out zones and nonlinear aerodynamic drag. See Figure 2 for a visual illustration of the problem setup.

Our algorithm was implemented in MATLAB using
CVX [22] and
SDPT3 [49]. For comparison, we also implemented the problem using the SQP solver SNOPT [20] and the general-purpose IPM solver fmincon from MATLAB. Specifically, for each method we compared the converged solutions and the convergence rate attained in computing said solutions.

We intentionally did not compare the computation time of the algorithms, since we lacked sufficient insight into the internal implementation of the underlying algorithms (i.e., CVX, SDPT3, SNOPT, and fmincon). However, on numerous occasions we have implemented the SCvx algorithm using our in-house generic IPM solver [17], whose customized variant was used in recent autonomous rocket landing experiments [16]. For example, we used the SCvx algorithm in conjunction with our in-house IPM to solve real-time onboard quad-rotor motion planning problems (similar to the example problem presented here) at rates greater than 8 Hz [48].

For tractability, we model the system using three-degree-of-freedom translational dynamics, as in [48], and assume that feedback controllers provide sufficiently fast attitude tracking. Unless otherwise stated, the notation defined in this section supersedes the notation defined in sections 2 and 3. The position, velocity,
and thrust vectors of the vehicle at time $t$ are given by $p(t) \in \mathbb{R}^3$, $v(t) \in \mathbb{R}^3$, and $T(t) \in \mathbb{R}^3$, respectively. We use $g \in \mathbb{R}^3$ to denote the gravity vector. Additionally, $m \in \mathbb{R}_{++}$ and $k_D \in \mathbb{R}_{+}$ are used to denote the vehicle's mass and drag constant, respectively. In defining $k_D$, we combine the effects of air density, drag reference area, and coefficient of drag, and assume that all are constant. The continuous-time dynamics of the system are thus given by
\[
\dot p(t) = v(t), \qquad \dot v(t) = \frac{1}{m} T(t) - k_D \|v(t)\|\, v(t) + g.
\]

Fig. 2: A perspective view of the converged trajectory. The obstacles are represented by the black circles. The red dots and blue lines represent the time-discretized positions and thrust vectors, respectively. The green dots represent the initial guess, i.e., a straight line. The motion of the vehicle is from left to right.

To implement this problem numerically, we discretize the fixed-final-time problem of duration $t_f$ into $N - 1$ subintervals of duration $\Delta t$. To achieve this, the system is linearized about a trajectory, and then the corresponding discrete-time $A$- and $B$-matrices are computed accordingly. To linearize, we define the state and control vectors as
\[
x(t) := [p^T(t), v^T(t)]^T, \qquad u(t) := T(t).
\]
Using this notation, the original nonlinear dynamics are expressed as
\[
\dot x(t) = f\big(x(t), u(t)\big),
\]
and the linearized system can be written as
\[
\dot x(t) = A(t)\, x(t) + B(t)\, u(t) + z(t), \qquad z(t) := f\big(\bar x(t), \bar u(t)\big) - A(t)\, \bar x(t) - B(t)\, \bar u(t),
\]
where $(\bar x(t), \bar u(t))$ denotes the trajectory about which we linearize. For the first succession, we evaluate this linearization about an initialization trajectory, and for all subsequent successions we evaluate it about the previous iteration.
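The continuous-time model above can be written as a short routine. The mass, drag constant, and gravity below are placeholder values (the corresponding entries of Table 1 are partly illegible), not the values used in the simulations:

```python
import math

def quadrotor_dynamics(x, u, m=1.0, k_D=0.1, g=(-9.81, 0.0, 0.0)):
    """Translational quad-rotor dynamics x_dot = f(x, u).

    x : state [p (3,), v (3,)] in an Up-East-North frame
    u : thrust vector T (3,)
    m, k_D, g are illustrative placeholders.
    """
    v = x[3:6]
    speed = math.sqrt(v[0]**2 + v[1]**2 + v[2]**2)
    p_dot = list(v)
    # v_dot = T/m - k_D * ||v|| * v + g
    v_dot = [u[i] / m - k_D * speed * v[i] + g[i] for i in range(3)]
    return p_dot + v_dot
```

For example, a thrust exactly opposing gravity at zero velocity yields a zero state derivative (hover).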
To complete the discretization process, we assume a first-order hold (direct collocation) on $u(t)$ over $t \in [t_i, t_{i+1}]$ for each $i \in \mathcal{I}^-$, where $\mathcal{I}^- := \{1, 2, \ldots, N-1\}$. That is, we define the control input $u(t)$ as
\[
u(t) = \beta^-(t)\, u_i + \beta^+(t)\, u_{i+1}, \qquad \beta^-(t) := (t_{i+1} - t)/\Delta t, \quad \beta^+(t) := (t - t_i)/\Delta t, \qquad t \in [t_i, t_{i+1}], \ \forall i \in \mathcal{I}^-.
\]
Noting that the state transition matrix (i.e., the discrete-time $A$-matrix) associated with the linearized system is governed by
\[
\dot\Phi_A(t, t_i) = A(t)\, \Phi_A(t, t_i),
\]
we perform numerical integration to obtain the following matrices for all $i \in \mathcal{I}^-$:
\[
A_{d,i} := \Phi_A(t_{i+1}, t_i), \qquad
B^-_{d,i} := \int_{t_i}^{t_{i+1}} \beta^-(\tau)\, \Phi_A(t_{i+1}, \tau)\, B(\tau)\, d\tau,
\]
\[
B^+_{d,i} := \int_{t_i}^{t_{i+1}} \beta^+(\tau)\, \Phi_A(t_{i+1}, \tau)\, B(\tau)\, d\tau, \qquad
z_{d,i} := \int_{t_i}^{t_{i+1}} \Phi_A(t_{i+1}, \tau)\, z(\tau)\, d\tau.
\]
We impose initial and final position, velocity, and thrust constraints; minimum and maximum thrust magnitude constraints; and thrust tilt constraints. The non-convex minimum thrust magnitude constraint can be imposed by introducing the relaxation variable $\Gamma(t) \in \mathbb{R}$ and applying the lossless convexification technique proposed in [2] and employed in [48]. Additionally, to simplify the presentation of the results, we add constraints that artificially restrict the motion of the vehicle to the horizontal plane. However, we emphasize that the algorithms are solving a three-dimensional problem, not a two-dimensional one.

We assume that $n_{obs}$ cylindrical obstacles are present. Defining the sets $\mathcal{I} := \{1, 2, \ldots, N\}$ and $\mathcal{J} := \{1, 2, \ldots, n_{obs}\}$, the obstacle avoidance constraint for the $j$th obstacle at the $i$th time instance is given by
\[
\|\Delta p_{j,i}\| \ge R_{obs,j}, \qquad \Delta p_{j,i} := p_i - p_{obs,j}, \qquad \forall i \in \mathcal{I}, \ \forall j \in \mathcal{J}.
\]
Since these constraints are non-convex, we linearize them about a trajectory $\bar p(t)$ and define $\Delta \bar p_{j,i} := \bar p_i - p_{obs,j}$.

Additionally, for notational convenience, we define the concatenated solution vector
\[
(5.1) \qquad X := [x_1^T, \ldots, x_N^T, u_1^T, \ldots, u_N^T]^T.
\]
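The first-order-hold discretization above can be illustrated on a scalar stand-in for the linearized model, $\dot x = a x + b u + c$, for which the state transition "matrix" reduces to $\Phi_A(t_{i+1}, \tau) = e^{a(t_{i+1}-\tau)}$. This is a schematic with midpoint quadrature, not the paper's MATLAB implementation:

```python
import math

def foh_discretize(a, b, c, dt, n_quad=2000):
    """First-order-hold discretization of the scalar model
    x_dot = a*x + b*u + c over one interval of length dt.

    Returns (A_d, Bm, Bp, z_d) such that
        x_{i+1} = A_d*x_i + Bm*u_i + Bp*u_{i+1} + z_d,
    mirroring the matrix integrals in the text.
    """
    A_d = math.exp(a * dt)
    Bm = Bp = z_d = 0.0
    h = dt / n_quad
    for k in range(n_quad):
        tau = (k + 0.5) * h              # midpoint quadrature node
        phi = math.exp(a * (dt - tau))   # Phi_A(t_{i+1}, tau)
        beta_m = (dt - tau) / dt         # beta^-(tau)
        beta_p = tau / dt                # beta^+(tau)
        Bm += beta_m * phi * b * h
        Bp += beta_p * phi * b * h
        z_d += phi * c * h
    return A_d, Bm, Bp, z_d
```

A quick consistency check is to propagate the resulting one-step map and compare it against a fine integration of the ODE with a linearly interpolated input.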
We use $r^1$ and $r^k$ to denote the initial and current (i.e., at the $(k+1)$th iteration) trust region radius, respectively. As outlined in section 3, we enforce the trust region on the entire solution vector, $X$. A summary of the convex subproblem solved at each succession is provided in Problem 5.1. Table 1 provides the SCvx algorithm parameters, the problem parameters, and the boundary conditions used to obtain our results. Note that the parameters and results are presented assuming an Up-East-North reference frame.

Problem 5.1: Convex Subproblem for the $(k+1)$th Succession of a Quad-Rotor Motion Planning Problem
\[
\min_{u}\ L \quad \text{subject to:}
\]
\[
L := \sum_{i \in \mathcal{I}} \Gamma_i\, \Delta t + \lambda \sum_{i \in \mathcal{I}^-} |\nu_i| + \lambda \sum_{j \in \mathcal{J}} \sum_{i \in \mathcal{I}} \max\{0, \eta_{j,i}\},
\]
\[
p_1 = p_{ic}, \quad p_N = p_{fc}, \qquad v_1 = v_{ic}, \quad v_N = v_{fc}, \qquad T_1 = T_{ic}, \quad T_N = T_{fc},
\]
\[
x_{i+1} = A_{d,i}\, x_i + B^-_{d,i}\, u_i + B^+_{d,i}\, u_{i+1} + z_{d,i} + \nu_i,
\]
\[
R_{obs,j} - \|\Delta \bar p_{j,i}\| - \frac{\Delta \bar p_{j,i}^T}{\|\Delta \bar p_{j,i}\|} \big( \Delta p_{j,i} - \Delta \bar p_{j,i} \big) \le \eta_{j,i},
\]
\[
[1\ 0\ 0]\, p_i = 0, \qquad \|T_i\| \le \Gamma_i, \qquad T_{min} \le \Gamma_i \le T_{max}, \qquad \cos\theta_{max}\, \Gamma_i \le [1\ 0\ 0]\, T_i, \qquad \|X - \bar X\| \le r^k,
\]
where $\bar X$ denotes the solution vector from the previous succession.

Aside from the problem setup, Figure 2 also shows the trajectory corresponding to the converged solution. The obstacle keep-out zones are shown as the black circles, and the optimized path is shown in red. The red dots indicate each of the $N$ discretization points, and the blue lines represent the thrust vectors at said time points. The large tilt angles generated by the optimization are necessary to counter the drag force dictated by our choice of $k_D$.

Figure 3 shows the positions and velocities of the converged trajectory. The transit time for the trajectory is largely dictated by the east velocity of the vehicle, which in turn is dictated by the prescribed travel distance of 10 [m] and prescribed final time of 3.0 [s].

Figure 4 shows the corresponding thrust magnitude and thrust tilt angle.
As seen in the figure, the thrust magnitude rides the maximum thrust constraint through most of the trajectory. The constant altitude constraint thus translates to a tilt angle that provides enough vertical thrust to balance out the force of gravity. The trajectory terminates with a quick reversal in thrust direction to bring the vehicle to the prescribed final velocity.

Table 1: SCvx Parameters, Problem Parameters, and B.C.'s

Parameter | Value          | Units
λ         | –              | -
r^1       | –              | -
α         | –              | -
β         | –              | -
ε_tol     | –              | -
ρ_0       | 0.00           | -
ρ_1       | 0.25           | -
ρ_2       | –              | -
N         | 31             | -
t_f       | 3.0            | [s]
T_min     | –              | [N]
T_max     | –              | [N]
θ_max     | 45             | [°]
k_D       | –              | -
g         | [−9.81 0 0]^T  | [m/s^2]
n_obs     | 2              | -
R_obs,1   | –              | [m]
R_obs,2   | –              | [m]
p_obs,1   | [0 3 –]^T      | [m]
p_obs,2   | [0 7 –]^T      | [m]
p_ic      | [0 0 0]^T      | [m]
v_ic      | [0 0 –]^T      | [m/s]
T_ic      | −mg            | [N]
p_fc      | [0 10 0]^T     | [m]
v_fc      | [0 0 –]^T      | [m/s]
T_fc      | −mg            | [N]

Figure 5 shows the differences in position, velocity, and thrust between the SCvx and SQP solutions. Evidently, the solutions are effectively identical.

Figure 6 shows a comparison between the convergence rates of the
SCvx and SQP algorithms, using the definition of $X$ given in (5.1). The same initial guess was used to initialize both methods. This result shows that the SCvx algorithm initially converged at a slower rate than that of the SQP algorithm, but that after a few iterations, the SCvx algorithm attains a far faster convergence rate, which is typical of superlinear convergence. Specifically, the SCvx algorithm converged to a tolerance of $\epsilon_{tol}$ (see Table 1) in 11 iterations, whereas the SQP algorithm obtained about the same level of convergence in approximately 16 iterations. It is worth mentioning that the first few iterations of the SCvx convergence process were spent decreasing the virtual control, $\nu_i$, and virtual buffer zone, $\eta_{j,i}$, terms (see Problem 5.1), effectively avoiding artificial infeasibility and recovering physical feasibility. The remaining iterations refined the physically feasible trajectory to the specified tolerance.

Lastly, Figure 7 also compares the convergence rates with the addition of the IPM. The IPM, i.e., fmincon's implementation, performs poorly in solving this problem. It struggles to find a descent direction, and takes a large number of iterations (around 75) to achieve the same level of accuracy.
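The slopes in Figures 6 and 7 can be read quantitatively through successive error ratios $\|X_{k+1} - X^*\| / \|X_k - X^*\|$: for a linearly convergent method they level off at a constant in $(0, 1)$, while for superlinear convergence they tend to zero. The routine below illustrates the diagnostic on synthetic sequences, not on the data of this section:

```python
def successive_error_ratios(errors):
    """Ratios e_{k+1}/e_k of iterate errors ||X_k - X*||.

    Decaying ratios indicate superlinear convergence; roughly
    constant ratios indicate linear convergence.
    """
    return [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
```

Applied to a synthetic superlinear sequence $e_{k+1} = e_k^{1.5}$, the ratios decrease monotonically toward zero, whereas a geometric sequence produces constant ratios.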
6. Conclusion.
In this paper, we have proposed an enhanced version of the SCvx algorithm, which is able to solve optimal control problems with nonlinear dynamics and non-convex state and control constraints. The algorithm computes solutions by solving a sequence of convex optimization subproblems, each obtained by linearizing the non-convexities around the solution of the previous iteration. Each subproblem employs virtual controls, virtual buffer zones, and a trust region as safeguarding mechanisms against artificial infeasibility and artificial unboundedness. Although the linearization acts as an approximation during the convergence process, the converged solution solves the original non-convex optimal control problem exactly and with local optimality.

Further, to set the SCvx algorithm apart from existing SCP-like methods, we have given a thorough analysis of its convergence properties, providing proofs of both global convergence (weak and strong) and superlinear convergence rate. These theoretical results were further validated by numerical simulations of a non-convex quad-rotor motion planning example problem. Notably, our simulation results showed that while the SCvx algorithm made slower progress early in the convergence process, it ultimately converged to the specified tolerance in fewer (and in some cases far fewer) iterations than competing solvers.

In summary, the main technical result of this paper is twofold: (1) the limit point of the SCvx algorithm is guaranteed to be a local optimum of the original non-convex optimal control problem (assuming it contains zero virtual control and virtual buffer zone terms), and (2) the SCvx algorithm converges superlinearly. Since the solution process is distilled into a sequence of subproblems that are guaranteed to be convex, the SCvx algorithm is well suited for real-time autonomous applications, as demonstrated in [48, 47].
Acknowledgments.
The authors gratefully acknowledge John Hauser of the University of Colorado for his valuable insights.
REFERENCES

[1]
P. Absil, R. Mahony, and B. Andrews , Convergence of the iterates of descent methods foranalytic cost functions , SIAM Journal on Optimization, 16 (2005), pp. 531–547, https://doi.org/10.1137/040605266.[2]
B. Açıkmeşe and L. Blackmore, Lossless convexification of a class of optimal control problems with non-convex control constraints, Automatica, 47 (2011), pp. 341–347.
[3] B. Açıkmeşe, J. Carson, and L. Blackmore, Lossless convexification of non-convex control bound and pointing constraints of the soft landing optimal control problem, IEEE Transactions on Control Systems Technology, 21 (2013), pp. 2104–2113.
[4]
H. Attouch, J. Bolte, and B. F. Svaiter , Convergence of descent methods for semi-algebraicand tame problems: proximal algorithms, forward–backward splitting, and regularizedgauss–seidel methods , Mathematical Programming, 137 (2013), pp. 91–129.[5]
F. Augugliaro, A. P. Schoellig, and R. D’Andrea , Generation of collision-free trajecto-ries for a quadrocopter fleet: A sequential convex programming approach , in InternationalConference on Intelligent Robots and Systems (IROS), IEEE, 2012, pp. 1917–1922.[6]
L. D. Berkovitz , Optimal control theory , Springer-Verlag, 1974.[7]
J. T. Betts , Survey of numerical methods for trajectory optimization , Journal of guidance,control, and dynamics, 21 (1998), pp. 193–207. Y. MAO, M. SZMUK, X. XU, AND B. AC¸ ıKMES¸E[8]
L. Blackmore , Autonomous precision landing of space rockets , The Bridge on Frontiers ofEngineering, 4 (2016), pp. 15–20.[9]
P. T. Boggs and J. W. Tolle , Sequential quadratic programming , Acta numerica, 4 (1995),pp. 1–51.[10]
J. Bolte, A. Daniilidis, and A. Lewis , The Lojasiewicz inequality for nonsmooth subanalyticfunctions with applications to subgradient dynamical systems , SIAM Journal on Optimiza-tion, 17 (2007), pp. 1205–1223.[11]
F. H. Clarke , Generalized gradients and applications , Transactions of the American Mathe-matical Society, 205 (1975), pp. 247–262.[12]
F. H. Clarke, A new approach to Lagrange multipliers, Mathematics of Operations Research, 1 (1976), pp. 165–174.
[13]
A. R. Conn, N. I. Gould, and P. L. Toint , Trust region methods , vol. 1, SIAM, 2000.[14]
E. Dall’Anese, H. Zhu, and G. B. Giannakis , Distributed optimal power flow for smartmicrogrids , IEEE Transactions on Smart Grid, 4 (2013), pp. 1464–1475, https://doi.org/10.1109/TSG.2013.2248175.[15]
A. Domahidi, E. Chu, and S. Boyd , ECOS: An SOCP solver for embedded systems , in Euro-pean Control Conference (ECC), IEEE, 2013, pp. 3071–3076.[16]
D. Dueri, B. Açıkmeşe, D. P. Scharf, and M. W. Harris, Customized real-time interior-point methods for onboard powered-descent guidance, Journal of Guidance, Control, and Dynamics, 40 (2016), pp. 197–212.
[17] D. Dueri, J. Zhang, and B. Açıkmeşe, Automated custom code generation for embedded, real-time second order cone programming, in 19th IFAC World Congress, 2014, pp. 1605–1612.
[18]
R. Fletcher , Practical methods of optimization, 2nd Edition , vol. 2, Wiley, 1987.[19]
D. Garg, M. Patterson, W. W. Hager, A. V. Rao, D. A. Benson, and G. T. Hunting-ton , A unified framework for the numerical solution of optimal control problems usingpseudospectral methods , Automatica, 46 (2010), pp. 1843–1851.[20]
P. E. Gill, W. Murray, and M. A. Saunders , SNOPT: An SQP algorithm for large-scaleconstrained optimization , SIAM Rev., 47 (2005), pp. 99–131.[21]
A. J. Goldman and A. W. Tucker , Theory of linear programming , Linear inequalities andrelated systems, 38 (1956), pp. 53–97.[22]
M. Grant and S. Boyd , CVX: Matlab software for disciplined convex programming, version2.1 . http://cvxr.com/cvx, Mar. 2014.[23]
W. W. Hager, J. Liu, S. Mohapatra, A. V. Rao, and X.-S. Wang , Convergence rate for agauss collocation method applied to constrained optimal control , SIAM Journal on Controland Optimization, 56 (2018), pp. 1386–1411, https://doi.org/10.1137/16M1096761.[24]
S. P. Han and O. L. Mangasarian , Exact penalty functions in nonlinear programming , Math-ematical programming, 17 (1979), pp. 251–269.[25]
M. W. Harris and B. Açıkmeşe, Maximum divert for planetary landing using convex optimization, Journal of Optimization Theory and Applications, 162 (2014), pp. 975–995.
[26] M. W. Harris and B. Açıkmeşe, Minimum time rendezvous of multiple spacecraft using differential drag, Journal of Guidance, Control, and Dynamics, 37 (2014), pp. 365–373.
[27]
B. Houska, H. Ferreau, and M. Diehl , ACADO Toolkit – An Open Source Frameworkfor Automatic Control and Dynamic Optimization , Optimal Control Applications andMethods, 32 (2011), pp. 298–312.[28]
D. Hull , Conversion of optimal control problems into parameter optimization problems , Jour-nal of Guidance, Control, and Dynamics, 20 (1997), pp. 57–60.[29]
K. Kurdyka, On gradients of functions definable in o-minimal structures, Annales de l'Institut Fourier, 48 (1998), pp. 769–784.
[30]
J. P. LaSalle , Study of the basic principles underlying the bang-bang servo , Tech. Rep. GER-5518, (1953).[31]
D. Liberzon , Calculus of variations and optimal control theory: a concise introduction , Prince-ton University Press, 2012.[32]
X. Liu and P. Lu , Solving nonconvex optimal control problems by convex optimization , Journalof Guidance, Control, and Dynamics, 37 (2014), pp. 750–765.[33]
S. Łojasiewicz, Sur les trajectoires du gradient d'une fonction analytique, Seminari di Geometria, 1983 (1982), pp. 115–117.
[34]
R. C. Loxton, K. L. Teo, V. Rehbock, and K. F. C. Yiu , Optimal control problems witha continuous inequality constraint on the state and the control , Automatica, 45 (2009),pp. 2250–2257.[35]
Y. Mao, D. Dueri, M. Szmuk, and B. Açıkmeşe, Convexification and Real-Time Optimization for MPC with Aerospace Applications, Springer International Publishing, Cham, 2019, pp. 335–358, https://doi.org/10.1007/978-3-319-77489-3_15.
[36] Y. Mao, D. Dueri, M. Szmuk, and B. Açıkmeşe, Successive convexification of non-convex optimal control problems with state constraints, IFAC-PapersOnLine, 50 (2017), pp. 4063–4069, https://doi.org/10.1016/j.ifacol.2017.08.789.
[37] Y. Mao, M. Szmuk, and B. Açıkmeşe, Successive convexification of non-convex optimal control problems and its convergence properties, in 2016 IEEE 55th Conference on Decision and Control (CDC), 2016, pp. 3636–3641.
[38]
J. Mattingley and S. Boyd , Cvxgen: A code generator for embedded convex optimization ,Optimization and Engineering, 13 (2012), pp. 1–27.[39]
Y. Nesterov and A. Nemirovskii , Interior-Point Polynomial Algorithms in Convex Program-ming , Society for Industrial and Applied Mathematics, 1994.[40]
L. S. Pontryagin , Mathematical theory of optimal processes , CRC Press, 1987.[41]
R. T. Rockafellar , Convex Analysis , vol. 28, Princeton University Press, 1970.[42]
W. S. U. M. D. C. Room, Every convex function is locally Lipschitz.
[43] W. Rudin, Principles of mathematical analysis, vol. 3, McGraw-Hill, 1964.
[44]
J. Schulman, Y. Duan, J. Ho, A. Lee, I. Awwal, H. Bradlow, J. Pan, S. Patil, K. Gold-berg, and P. Abbeel , Motion planning with sequential convex optimization and convexcollision checking , The International Journal of Robotics Research, 33 (2014), pp. 1251–1270.[45]
H. J. Sussmann , Lie brackets, real analyticity and geometric control , Differential geometriccontrol theory, 27 (1983), pp. 1–116.[46]
M. Szmuk and B. Açıkmeşe, Successive convexification for 6-dof Mars rocket powered landing with free-final-time, ArXiv e-prints, (2018). arXiv:1802.03827.
[47] M. Szmuk, C. A. Pascucci, and B. Açıkmeşe, Real-time quad-rotor path planning for mobile obstacle avoidance using convex optimization, in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2018, pp. 1–9, https://doi.org/10.1109/IROS.2018.8594351.
[48] M. Szmuk, C. A. Pascucci, D. Dueri, and B. Açıkmeşe, Convexification and real-time on-board optimization for agile quad-rotor maneuvering and obstacle avoidance, in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep 2017, pp. 4862–4868.
[49]
K. Toh, M. Todd, and R. Tutuncu , SDPT3 — a Matlab software package for semidefiniteprogramming , Optimization Methods and Software, 11 (1999), pp. 545–581.[50]
A. Wächter and L. T. Biegler, On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Mathematical Programming, 106 (2006), pp. 25–57, https://doi.org/10.1007/s10107-004-0559-y.
Fig. 3:
Positions and velocities of the converged trajectory.
Fig. 4:
Thrust magnitude and tilt versus time for the converged trajectory. Black lines show the minimum and maximum thrust bounds in the top plot, and the maximum thrust tilt angle in the bottom plot.

Fig. 5:
Position, velocity, and thrust differences between the converged
SCvx solution and the converged SQP solution.
Fig. 6:
Convergence history of the
SCvx (blue) and SQP (red) algorithms. The lines show the magnitude of the difference between $X$ at each iteration and the converged solution, $X^*$, and their slope indicates the rate of convergence.

Fig. 7:
Convergence history of the
SCvx (red), SQP (blue), and IPM (black) algorithms. The lines show the magnitude of the difference between $X$ at each iteration and the converged solution, $X^*$.