[PDF] Efficient Riccati recursion for optimal control problems with pure-state equality constraints

Abstract

A novel approach to efficiently treat pure-state equality constraints in optimal control problems (OCPs) using a Riccati recursion algorithm is proposed. The proposed method transforms a pure-state equality constraint into a mixed state-control constraint such that the constraint is expressed by variables at a certain previous time stage. It is showed that if the solution satisfies the second-order sufficient conditions of the OCP with the transformed mixed state-control constraints, it is a local minimum of the OCP with the original pure-state constraints. A Riccati recursion algorithm is derived to solve the OCP using the transformed constraints with linear time complexity in the grid number of the horizon, in contrast to a previous approach that scales cubically with respect to the total dimension of the pure-state equality constraints. Numerical experiments on the whole-body optimal control of quadrupedal gaits that involve pure-state equality constraints owing to contact switches demonstrate the effectiveness of the proposed method over existing approaches.

Full PDF

EEfﬁcient Riccati recursion for optimal control problemswith pure-state equality constraints

Sotaro Katayama and Toshiyuki Ohtsuka Abstract — A novel approach to efﬁciently treat pure-stateequality constraints in optimal control problems (OCPs) usinga Riccati recursion algorithm is proposed. The proposed methodtransforms a pure-state equality constraint into a mixed state-control constraint such that the constraint is expressed byvariables at a certain previous time stage. It is showed thatif the solution satisﬁes the second-order sufﬁcient conditions ofthe OCP with the transformed mixed state-control constraints,it is a local minimum of the OCP with the original pure-stateconstraints. A Riccati recursion algorithm is derived to solvethe OCP using the transformed constraints with linear timecomplexity in the grid number of the horizon, in contrastto a previous approach that scales cubically with respectto the total dimension of the pure-state equality constraints.Numerical experiments on the whole-body optimal control ofquadrupedal gaits that involve pure-state equality constraintsowing to contact switches demonstrate the effectiveness of theproposed method over existing approaches.

I. I

NTRODUCTION

Optimal control underlies the motion planning and con-trol of dynamical systems such as trajectory optimization(TO) and model predictive control (MPC) [1]. TO achievesversatile and dynamically consistent planning by solvingoptimal control problems (OCPs). MPC leverages the sameadvantages as TO in real-time control by solving an OCPonline within a particular sampling interval. It is essential,particularly for MPC, to solve OCPs within a short com-putational time, even if they involve highly complicateddynamics, a large dimensional state, and a long horizon.Newton-type methods are the most practical methods usedfor solving OCPs in terms of the convergence speed. Oneof the most efﬁcient algorithms that implement the Newton-type methods to solve the OCPs of large-scale systems is theRiccati recursion algorithm [2], [3]. The Riccati recursionalgorithm scales only linearly with respect to the number ofdiscretization grids of the horizon, in contrast to the directmethods (i.e., methods applying the Cholesky decompositiondirectly to the entire Hessian matrix) that scale cubically.For example, differential dynamic programming (DDP) [4]and the iterative linear quadratic regulator (iLQR) [5] com-pute the Newton direction of single-shooting OCPs usingthe Riccati recursion algorithm and successfully solve suchproblems even for signiﬁcantly complicated systems such aslegged robots with large degrees of freedom within very shortcomputational times [6], [7]. Moreover, the Riccati recursionalgorithm is also efﬁcient in solving multiple-shooting OCPs S. Katayama and T. Ohtsuka are with Department of SystemScience, Graduate School of Informatics, Kyoto University,Kyoto, Japan [email protected] , [email protected] [8], which are practical formulations particularly for MPCbecause they can leverage parallel computing and have strongconvergence properties [9], [10].However, there is a drawback of the Riccati recursionalgorithm: it cannot efﬁciently treat pure-state equality con-straints, which often arise, for example, in waypoint con-straints, terminal constraints, switching constraints in hybridsystems such as legged robots [11], [12], and inequality con-straints handled by active-set methods [13]. [13] extended theRiccati recursion algorithm for pure-state equality constraintsand illustrated its effectiveness over the direct method forcertain quadratic programming problems. However, the com-putational time of this method scales cubically with respectto the total dimension of pure-state equality constraints overthe horizon, and it is inefﬁcient when the total dimension islarge. [14] proposed a projection approach to treat pure-stateequality constraints in iLQR efﬁciently. However, this ap-proach can only treat the equality constraints whose relativedegree is 1, for example, velocity-level constraints of second-order systems, and cannot treat position-level constraints forsuch systems, which is a very common and practical classof constraints.Other popular constraint-handling methods used with theRiccati recursion algorithm are the penalty function methodand the augmented Lagrangian (AL) method. For example,[11] used the penalty function method and [12] used theAL method to treat pure-state equality constraints repre-senting the switching constraints arising in OCPs involvingquadrupedal gaits. However, the penalty function methodpractically yields only the approximated solution, as illus-trated by the numerical results obtained in [11], and thereis a trade-off between the accuracy of the solution and thenumerical stability. The AL method can treat constraintsbetter than the penalty function method, for example, itconverges to the optimal solution even if the penalty param-eter remains at a ﬁnite value. However, it generally lacksconvergence speed compared with the Newton-type methodsthat achieve superlinear or quadratic convergence. The ALmethod essentially achieves linear convergence and it yieldssuperlinear convergence only if the penalty parameter goesto inﬁnity [15], which is an impractical assumption. Forexample, [12] used the AL method with DDP to considerthe switching constraints in an OCP of quadruped bouncingmotion. The AL method required a large number of theiterations (up to 300), although a simple 2D robot modelwas used and only one cycle of the bouncing motion wasconsidered, which led to very low-dimensional (only fourdimensions in all) pure-state equality constraints. a r X i v : . [ m a t h . O C ] F e b n this paper, we propose a novel approach to efﬁcientlytreat pure-state equality constraints in OCPs using a Riccatirecursion algorithm. The proposed method transforms a pure-state equality constraint into a mixed state-control constraintsuch that the constraint is expressed by variables at a certainprevious time stage. We show a relationship between an OCPwith the original pure-state constraint (the original OCP) andan OCP with the transformed mixed state-control constraint(the transformed OCP); if the solution satisﬁes the ﬁrst-ordernecessary conditions (FONC) and/or second-order sufﬁcientconditions (SOSC) of the transformed OCP, then the solutionalso satisﬁes the FONC and/or SOSC of the original OCP.Therefore, if we ﬁnd a solution that satisﬁes the SOSC ofthe transformed OCP, it is a local minimum of the originalOCP. We then derive a Riccati recursion algorithm to solvethe transformed OCP with linear time complexity in the gridnumber of the horizon, in contrast with the previous approach[13] that scales cubically with respect to the total dimensionof pure-state equality constraints. Moreover, because theproposed method is in substance a Newton’s method for anoptimization problem with equality constraints, the proposedmethod achieves superlinear or quadratic convergence, whichdistinguishes our approach from the penalty function methodand the AL method in terms of convergence speed. Wepresent numerical experiments of the whole-body optimalcontrol of quadrupedal gaits that involve pure-state equalityconstraints owing to contact switches, which represent theposition-level constraints of a second-order system, anddemonstrate the effectiveness of the proposed method overthe existing approaches, that is, the approach of [13] and theAL method.This paper is organized as follows. In Section II, wetransform an OCP with a pure-state equality constraint intoan OCP with a mixed state-control equality constraint. InSection III, we derive a Riccati recursion algorithm to applyNewton’s method efﬁciently to the OCP with transformedmixed state-control equality constraints. In Section IV, thetheoretical properties of the proposed transformation ofOCPs are described. In Section V, the proposed methodis compared with existing methods and its effectivenessis demonstrated in terms of computational time and con-vergence speed. In Section VI, we conclude with a briefsummary and mention of future work. Notation:

We describe the partial derivatives of a differ-entiable function with respect to certain variables using afunction with subscripts; that is, f x ( x ) denotes ∂f∂x ( x ) and g xy ( x, y ) denotes ∂ g∂x∂y ( x, y ) .II. T RANSFORMATION OF O PTIMAL C ONTROL P ROBLEMWITH P URE -S TATE E QUALITY C ONSTRAINTS

A. Original Optimal Control Problem

We consider the following discrete-time OCP: Find thestate x , ..., x N ∈ R n x and the control input u , ..., u N − ∈ R n u minimizing the cost function J = ϕ ( x N ) + N − (cid:88) i =0 L ( x i , u i ) (1) subject to the state equation x i + f ( x i , u i ) − x i +1 = 0 , i ∈ { , ..., N − } , (2)a pure-state equality constraint φ ( x k ) = 0 , φ ( x k ) ∈ R n c , (3)and the initial state constraint x − ¯ x = 0 , ¯ x ∈ R n x . (4)In the following, we assume the form of the state as x i = (cid:2) q T i v T i (cid:3) T , where q i ∈ R n and v i ∈ R n represent the gen-eralized coordinates and velocity of the system, respectively,and assume the form of the state equation as follows: f ( x i , u i ) := (cid:20) f ( q ) ( x i ) f ( v ) ( x i , u i ) (cid:21) , f ( q ) ( x i ) , f ( v ) ( x i , u i ) ∈ R n . (5)We also assume that k ≥ , n u ≤ n , n c ≤ n , and that theconstraint (3) depends only on the generalized coordinate,that is, its Jacobian is expressed as follows: φ x ( x k ) = (cid:2) φ q ( q k ) O (cid:3) , φ q ( q k ) ∈ R n c × n . (6)The state equation (5) mainly represents a second-order La-grangian system with n degrees of freedom, and a constraint(3), whose Jacobian is of form (6) represents a position-levelconstraint, which is a very common and practical class ofproblem settings. B. Transformation of Optimal Control Problem

To solve the aforementioned OCP efﬁciently, we transformthe original pure-state equality constraint (3) into a mixedstate-control equality constraint that is equivalent to (3) aslong as (2) is satisﬁed. If (2) is satisﬁed, x k = x k − + f ( x k − , u k − )+ f ( x k − + f ( x k − , u k − ) , u k − ) holds and therefore φ ( x k − + f ( x k − , u k − )+ f ( x k − + f ( x k − , u k − ) , u k − )) = 0 (7)is equivalent to (3). Furthermore, because φ ( · ) only dependson the generalized coordinate, (7) is equivalent to φ ( x k − + f ( x k − , u k − ) + g ( x k − + f ( x k − , u k − ))) = 0 , (8)where we deﬁne g ( x ) := (cid:20) f ( q ) ( x )0 (cid:21) . (9)Therefore, the constraint (8) is equivalent to (3) if (2) issatisﬁed. Therefore, we consider (8) instead of (3) in thefollowing. We herein summarize the original and transformedOCPs as follows: Problem 1 – Original OCP:

Find the solution x , ..., x N , u , ..., u N − minimizing (1) subject to (2)–(4). Problem 2 - Transformed OCP:

Find the solution x , ..., x N , u , ..., u N − minimizing (1) subject to (2), (4),and (8).n fact, we have the following relations between thetransformed OCP and the original OCP: If the solution x , ..., x N , u , ..., u N − satisﬁes the FONC of the trans-formed OCP, it also satisﬁes the FONC of the original OCP.If the solution x , ..., x N , u , ..., u N − satisﬁes the SOSCof the transformed OCP, it also satisﬁes the SOSC of theoriginal OCP. Therefore, if we ﬁnd the solution x , ..., x N , u , ..., u N − that satisﬁes the SOSC of the transformed OCP,it is a local minimum of the original OCP. We show thesetheoretical points later in Section IV.It should be noted that it is trivial to apply the proposedapproach to constraints whose relative degree is 1. We thentransform (3) into a mixed state-control constraint repre-sented by x k − and u k − in the same manner as in theaforementioned discussion.It should also be noted that the proposed approach iscompletely different from the classical transformation ofpure-state equality constraints in continuous-time OCPs byconsidering their derivatives with respect to time, for exam-ple, in Section 3.4 of [16]. To explain this difference, weconsider that there is a pure-state constraint of the form of(3) over a time interval. The classical method then transformsthe pure-state constraint over the interval into a combinationof the pure-state equality constraints (3) and ddt φ ( x ) = 0 ata point on the interval and the mixed state-control constraint d dt φ ( x ) = 0 over the interval, where ddt and d dt yield a kindof Lie derivative. Therefore, the classical method still needsto consider the pure-state equality constraints, whereas ourapproach transforms all pure-state equality constraints intothe corresponding mixed state-control constraints. C. Optimality Conditions

We derive the optimality conditions, known as FONC, ofthe transformed OCP. We ﬁrst deﬁne the Hamiltonian H ( x, u, λ ) := L ( x, u ) + λ T f ( x, u ) and ˜ H ( x, u, λ, ν ):= H ( x, u, λ ) + ν T φ ( x + f ( x, u ) + g ( x + f ( x, u ))) . We also deﬁne the intermediate time stages in which the con-straint is not active as ¯ I := { , ..., k − , k − , ..., N − } .The optimality conditions are then derived as follows [16]: ϕ T x ( x N ) − λ N = 0 , (10) H T x ( x i , u i , λ i +1 ) + λ i +1 − λ i = 0 (11)and H T u ( x i , u i , λ i +1 ) = 0 (12)for i ∈ ¯ I , ˜ H T x ( x k − , u k − , λ k − , ν ) + λ k − − λ k − = H T x ( x k − , u k − , λ k − ) + λ k − − λ k − + ( I + f T x ( x k − , u k − ))( I + g T x ) φ T x ν = 0 , (13) and ˜ H T u ( x k − , u k − , λ k − , ν )= H T u ( x k − , u k − , λ k − )+ f T u ( x k − , u k − )( I + g T x ) φ T x ν = 0 , (14)where λ , ..., λ N are the Lagrange multipliers with respectto (4) and (2), and ν is that with respect to (8). It should benoted that we have omitted the arguments from φ x and g x in (13) and (14).III. R ICCATI R ECURSION

A. Linearization for Newton’s Method

To apply Newton’s method for the transformed OCP, welinearize the optimality conditions. It should be noted thatwe adopt the direct multiple shooting method [9], that is, weconsider x , ..., x N , u , ..., u N − , λ , ..., λ N , and ν as theoptimization variables.

1) Terminal stage:

At the terminal stage ( i = N ), wehave Q xx,N ∆ x N − ∆ λ N + ¯ l x,N = 0 , (15)where we deﬁne Q xx,N := ϕ xx ( x N ) . Further, we deﬁne ¯ l x,N using the left-hand side of (10).

2) Intermediate stages without equality constraint:

In theintermediate stages without an equality constraint ( i ∈ ¯ I ),we have A i ∆ x i + B i ∆ u i − ∆ x i +1 + ¯ x i = 0 , (16) Q xx,i ∆ x i + Q xu,i ∆ u i + A T i ∆ λ i +1 − ∆ λ i + ¯ l x,i = 0 , (17)and Q T xu,i ∆ x i + Q uu,i ∆ u i + B T i ∆ λ i +1 + ¯ l u,i = 0 , (18)where we deﬁne A i := I + f x ( x i , u i ) , B i := f u ( x i , u i ) , Q xx,i := H xx ( x i , u i , λ i +1 ) , Q xu,i := H xu ( x i , u i , λ i +1 ) , Q uu,i := H uu ( x i , u i , λ i +1 ) . Further, we deﬁne ¯ x i , ¯ l x,i ,and ¯ l u,i using the left-hand sides of (2), (11), and (12),respectively.

3) Intermediate stage with an equality constraint:

At theintermediate stages with an equality constraint ( i = k − ),we have (16) for i = k − and have Q xx,k − ∆ x k − + Q xu,k − ∆ u k − + A T k − ∆ λ k − − ∆ λ k − + C T ∆ ν + ¯ l x,k − = 0 , (19) Q T xu,k − ∆ x k − + Q uu,k − ∆ u k − + B T k − ∆ λ k − + D T ∆ ν + ¯ l u,k − = 0 , (20)and C ∆ x k − + D ∆ u k − + ¯ φ = 0 , (21)where we deﬁne Q xx,k − := ˜ H xx ( x k − , u k − , λ k − , ν ) , Q xu,k − := ˜ H xu ( x k − , u k − , λ k − , ν ) , Q uu,k − :=˜ H uu ( x k − , u k − , λ k − , ν ) , C := φ x ( I + g x ) A k − , D := φ x ( I + g x ) B k − . We further deﬁne ¯ l x,k − , ¯ l u,k − , and ¯ φ using the left-hand sides of (13), (14), and (8), respectively. ) Initial stage: Finally, we have ∆ x + x − ¯ x = 0 . (22)It should be noted that we can apply the Gauss-NewtonHessian approximation, which improves the computationalspeed when the constraints (2) and (8) are too complicatedfor their second-order derivatives to be computed. Q xx,N and Q xx,i , Q xu,i , and Q uu,i for i ∈ { , ..., N − } are thenapproximated using only the cost function (1) and do notdepend on the Lagrange multipliers. B. Derivation of Riccati Recursion

We derive a Riccati recursion algorithm to solve the linearequations for Newton’s method (15)–(22) efﬁciently. As thestandard Riccati recursion algorithm ([2], [3]), our goal is toderive a series of matrices P i and vectors s i such that ∆ λ i = P i ∆ x i − s i (23)holds.

1) Terminal stage:

At the terminal stage ( i = N ), wehave P N = Q xx,N , s N = − ¯ l N . (24)In the forward recursion, we have ∆ x N , and we compute ∆ λ N from (23).

2) Intermediate stages without an equality constraint:

Atthe intermediate stages without an equality constraint ( i ∈ ¯ I ),we have the following standard backward Riccati recursion([2], [3]) for given P i +1 and s i +1 satisfying (23): F i := Q xx,i + A T i P i +1 A i , (25) H i := Q xu,i + A T i P i +1 B i , (26) G i := Q uu,i + B T i P i +1 B i , (27) K i := − G − i H T i , k i := − G − i ( B T i P i +1 ¯ x i − B T i s i +1 +¯ l u,i ) , (28)and P i := F i − K T i G i K i , s i := A T i ( s i +1 − P i +1 ¯ x i ) − ¯ l x,i − H i k i . (29)In the forward recursion, for a particular value of ∆ x i , wecompute ∆ u i from ∆ u i = K i ∆ x i + k i , (30) ∆ λ i from (23), and ∆ x i +1 from (16).

3) Intermediate stage with an equality constraint:

At theintermediate stage with an equality constraint ( i = k − ), weﬁrst deﬁne (25)–(29) for i = k − for the speciﬁc values of P k − and s k − that satisﬁes (23). We then have the relationsthat are used in the forward recursion for k − as follows: (cid:20) ∆ u k − ∆ ν (cid:21) = (cid:20) K k − M (cid:21) ∆ x k − + (cid:20) k k − m (cid:21) , (31)where we deﬁne (cid:20) K k − M (cid:21) := − (cid:20) G k − D T D O (cid:21) − (cid:20) H T k − C (cid:21) (32) and (cid:20) k k − m (cid:21) := − (cid:20) G k − D T D O (cid:21) − (cid:20) B T k − P k − ¯ x k − − B T k − s k − + ¯ l u,k − ¯ φ (cid:21) . (33)We then obtain the backward recursions P k − := F k − − (cid:2) K T k − M T (cid:3) (cid:20) G k − D T D O (cid:21) (cid:20) K k − M (cid:21) (34)and s k − := A T k − ( s k − − P k − ¯ x k − ) − ¯ l x,k − − H k − k k − − C T m. (35) C. Algorithm, Convergence, and Computational Analysis

We summarize the single Newton iteration, that is, thecomputation of the Newton direction for a particular solution,using the proposed Riccati recursion algorithm (Algorithm1). In the ﬁrst step, we formulate the linear equations ofNewton’s method, that is, we compute the coefﬁcient matri-ces and residuals of (15)–(22) (line 1). This step can leverageparallel computing. Second, we perform the backward Ric-cati recursion and compute P i and z i for i ∈ { , ..., N } (lines4–11). Third, we perform the forward Riccati recursion andcompute the Newton directions for all the variables (lines12–20).Because the proposed method is in substance a Newton’smethod for an optimization problem with equality con-straints, it achieves superlinear or quadratic convergence, forexample, by Proposition 4.4.3 of [15], which distinguishesthe convergence behavior of the proposed method from thatof the AL method, popularly used to treat the pure-stateequality constraints with the Riccati recursion algorithm.The AL method achieves superlinear convergence only if thepenalty parameter goes to inﬁnity, which is an impracticalassumption; otherwise, its convergence rate is just linear.It should be noted that we can trivially apply the proposedmethod to OCPs with multiple pure-state equality constraintson the horizon. When there are multiple time stages involvingconstraint (3) on the horizon, we compute the coefﬁcientmatrices and residuals in (19)–(21) for each of the time stagesin line 1 of Algorithm 1, apply line 7 of Algorithm 1 foreach of the time stages in the backward Riccati recursion,and apply line 15 of Algorithm 1 for each of the time stagesin the forward Riccati recursion.The proposed method is particularly efﬁcient when thereare several stages with pure-state equality constraints onthe horizon. Suppose that there is an n c,i -dimensional pure-state equality constraint at each time stage i of the horizon( n c,i = 0 if there is no constraint at stage i ). The proposedmethod then computes the inverse of a matrix whose sizeis ( n u + n c,i ) × ( n u + n c,i ) at each time stage in thebackward recursion. In contrast, the previous approach of[13] requires the computation of the inverse of a matrixof size ( (cid:80) Ni =0 n c,i ) × ( (cid:80) Ni =0 n c,i ) . Broadly speaking, the lgorithm 1 Computation of Newton direction using pro-posed Riccati recursion

Input:

Initial state x ( t ) , the current solution x , ..., x N , u , ..., u N − , and Lagrange multipliers λ , ..., λ N , ν . Output:

Newton directions ∆ x , ..., ∆ x N , ∆ u , ..., ∆ u N − , ∆ λ , ..., ∆ λ N , and ∆ ν . for i = 0 , · · · , N do in parallel Computes the matrices and residuals of linear equa-tion (15)–(22). end for Compute P N and z N from (24). for i = N, · · · , do in serial if i = k − then Computes P k − and z k − from (25)–(27), (32),(33), and (35). else Computes P i and z i from (25)–(29). end if end for Compute ∆ x from (22). for i = 0 , · · · , N − do in serial if i = k − then Compute ∆ u k − , ∆ ν , ∆ λ k − , and ∆ x k − from(31), (23), and (16). else Compute ∆ u i , ∆ λ i , and ∆ x i +1 from (30), (23),and (16). end if end for Compute ∆ λ N from (23).computational burden of the proposed method with respectto the grid number N is O ( N ) , whereas that of the approachin [13] is O ( N ) .It should be noted that it is easy to apply the proposedmethod to the single-shooting methods, for example, DDPand iLQR, by considering only u , ..., u N − and ν as thedecision variables. In the single-shooting methods, beforeline 1 of Algorithm 1, we ﬁrst compute x , ..., x N based on x ( t ) and u , ..., u N − using the state equation (2) sequen-tially. Further, we compute λ N , ..., λ using (10), (11), and(13) based on u , ..., u N − , ν , and x , ..., x N , respectively, inthe backward recursion (lines 5–11 of Algorithm 1). We canthen compute the Newton directions ∆ u , ..., ∆ u N − and ∆ ν using the same (or a similar) forward recursion (lines12–20 of Algorithm 1). However, it is worth noting thatmultiple-shooting methods are practically preferred to single-shooting methods in terms of computational speed underparallel computing and convergence properties [9], [10].IV. T HEORETICAL P ROPERTIES OF O PTIMAL C ONTROL P ROBLEM T RANSFORMATION

We show the theoretical relationships between the trans-formed OCP and the original OCP in this section. The ﬁrsttheorem concerns a stationary point of the transformed OCPand a stationary point of the original OCP.

Theorem 4.1:

Suppose that x , ..., x N , u , ..., u N − , λ , ..., λ N , and ν satisfy the FONC of the transformed OCP.Then, there exist the Lagrange multipliers λ ∗ , ..., λ ∗ N and ν ∗ such that x , ..., x N , u , ..., u N − , λ ∗ , ..., λ ∗ N , and ν ∗ satisfythe FONC of the original OCP. Proof:

First, we deﬁne the intermediate time stageswithout the active constraints of the original OCP as ˜ I := { , ..., k − , k + 1 , ..., N − } . The FONC of the originalOCP is then expressed by (2)–(4), (10), and (11) for i ∈ ˜ I , H T x ( x k , u k , λ ∗ k +1 ) + φ T x ν ∗ + λ ∗ k +1 − λ ∗ k = 0 , (36)and (12) for i ∈ { , ..., N − } , in which (2)–(4) aretrivially satisﬁed because x , ..., x N and u , ..., u N − satisfythe FONC of the transformed OCP. It should be noted thatbecause x , ..., x N and u , ..., u N − satisfy (2) and φ x ( · ) only depends on the generalized coordinate, φ x ( x k ) = φ x ( x k − + f ( x k − , u k − )+ g ( x k − + f ( x k − , u k − ))) (37)holds. Therefore, we describe both the left- and right-handsides of (37) as φ x in this proof. Let λ ∗ i = λ i , i ∈ { , ..., k − , k + 1 , ..., N } . (38)Then, (10), (11) for i ∈ { , ..., k − , k + 1 , ..., N } , and (12)for i ∈ { , ..., k − , k + 1 , ..., N } of the original OCP arereduced to those of the transformed OCP and are, therefore,satisﬁed. Furthermore, let ν ∗ = ν, λ ∗ k = λ k + φ T x ν ∗ , λ ∗ k − = λ k − + ( I + g T x ) φ T x ν ∗ . (39)Then, (11) and (12) for i = k − , k − and (36) of theoriginal OCP are also reduced to (13), (14), (11) and (12)for i = k − , and (11) for i = k of the transformedOCP, respectively, noting that φ x f x ( x k − , u k − ) = φ x g x and φ x f u ( x k − , u k − ) = O , and are, therefore, satisﬁed,which completes the proof.From the proof of Theorem 4.1, we can obtain theLagrange multipliers at a stationary point of the originalOCP corresponding to those of the transformed OCP. Thefollowing theorem concerns the sufﬁciency of the optimality. Theorem 4.2:

Suppose that x , ..., x N , u , ..., u N − , λ , ..., λ N , and ν satisfy the SOSC of the transformed OCP.Then, there exist the Lagrange multipliers λ ∗ , ..., λ ∗ N , and ν ∗ such that x , ..., x N , u , ..., u N − , λ ∗ , ..., λ ∗ N , and ν ∗ satisfythe SOSC of the original OCP. Proof:

Because x , ..., x N , u , ..., u N − , λ , ..., λ N ,and ν also satisfy the FONC of the transformed OCP,we have λ ∗ , ..., λ ∗ N and ν ∗ deﬁned by (38) and (39) suchthat x , ..., x N , u , ..., u N − , λ ∗ , ..., λ ∗ N , and ν ∗ satisfy theFONC of the original OCP from Theorem 4.1. From theassumption of the SOSC of the transformed OCP, we have δx T N Q xx,N δx N + N − (cid:88) i =0 (cid:20) δx T i δu T i (cid:21) T (cid:20) Q xx,i Q xu,i Q T ux,i Q uu,i (cid:21) (cid:20) δx i δu i (cid:21) > (40)for arbitrary δx i and δu i satisfying δx = 0 , ( I + f x,i ) δx i + f u,i δu i − δx i +1 = 0 for i ∈ { , ..., N − } and φ x ( I + x )( I + f x,k − ) δx k − + φ x ( I + g x ) f u,k − δu k − = 0 , wherewe describe f x,i := f x ( x i , u i ) and f u,i := f u ( x i , u i ) for i ∈ { , ..., N − } . We introduce the Hessians of the originalOCP as Q ∗ xx,N := ϕ xx ( x N ) = Q xx,N and Q ∗ xx,i := H xx ( x i , u i , λ ∗ i +1 ) , Q ∗ xu,i := H xu ( x i , u i , λ ∗ i +1 ) , and Q ∗ uu,i := H uu ( x i , u i , λ ∗ i +1 ) for i ∈ ˜ I . Further, we introduce Q ∗ xx,k := H xx ( x k , u k , λ ∗ k +1 , ν ∗ ) , Q ∗ xu,k := H xu ( x k , u k , λ ∗ k +1 , ν ∗ ) ,and Q ∗ uu,k := H uu ( x k , u k , λ ∗ k +1 , ν ∗ ) . We can then completethe proof if δx ∗ N T Q ∗ xx,N δx ∗ N + N − (cid:88) i =0 (cid:20) δx ∗ i T δu ∗ i T (cid:21) T (cid:20) Q ∗ xx,i Q ∗ xu,i Q ∗ ux,i T Q ∗ uu,i (cid:21) (cid:20) δx ∗ i δu ∗ i (cid:21) > (41)holds for arbitrary δx ∗ i and δu ∗ i satisfying δx ∗ = 0 , ( I + f x,i ) δx ∗ i + f u,i δu ∗ i − δx ∗ i +1 = 0 for i ∈ { , ..., N − } and φ x δx ∗ k = 0 . First, we can see that the subspace of thefeasible variations δx ∗ i and δu ∗ i is identical to that of δx i and δu i because (37) holds. Then, we consider δx ∗ i and δu ∗ i as being identical to δx i and δu i . Next, by substituting (38)and (39) into the Hessians of the original OCP, we obtain Q ∗ xx,i = Q xx,i , Q ∗ xu,i = Q xu,i , and Q ∗ uu,i = Q uu,i for i ∈ { , ..., k − , k + 1 , ..., N − } , Q ∗ xx,k = Q xx,k + ν · φ xx , Q ∗ xu,k = Q xu,k , Q ∗ uu,k = Q uu,k , Q ∗ xx,k − = Q xx,k − +( φ T x ν ) · f xx,k − , Q ∗ xu,k − = Q xu,k − + ( φ T x ν ) · f xu,k − , Q ∗ uu,k − = Q uu,k − + ( φ T x ν ) · f uu,k − , Q ∗ xx,k − = Q xx,k − − ( I + f T x,k − )(( φ T x ν ) · g xx )( I + f x,k − ) − ( I + f T x,k − )( I + g T x )( ν · φ xx )( I + g x )( I + f x,k − ) ,Q ∗ xu,k − = Q xu,k − − ( I + f T x,k − )(( φ T x ν ) · g xx ) f u,k − − ( I + f T x,k − )( I + g T x )( ν · φ xx )( I + g x ) f u,k − , and Q ∗ uu,k − = Q uu,k − − f T u,k − ( I + g T x )( ν · φ xx )( I + g x ) f u,k − , where f xx,i := f xx ( x i , u i ) , f xu,i := f xu ( x i , u i ) , f uu,i := f uu ( x i , u i ) , and the notation “ · " denotes vector–tensor mul-tiplication. Since the FONC of the original OCP holds and ν · φ xx = (cid:20) ν · φ qq OO O (cid:21) , we have ( ν · φ xx )( I + f x,k − ) =( ν · φ xx )( I + g x ) and ( ν · φ xx ) f u,k − = O , which yields δx T k ( ν · φ xx ) δx k = δx T k − ( I + g T x )( ν · φ xx )( I + g x ) δx k − .In addition, from the structures of (5), (6), and (9), we have ( φ T x ν ) · f xx,k − = ( φ T x ν ) · g xx , ( φ T x ν ) · f xu,k − = O , and ( φ T x ν ) · f uu,k − = O . By substituting these relations, theleft-hand side of (41) is reduced to the left-hand side of(40), which completes the proof.We summarize the property of the proposed transformationin the next proposition: Proposition 4.3:

Suppose that x , ..., x N , u , ..., u N − , λ , ..., λ N , and ν satisfy the SOSC of the transformed OCP.Then, the solution x , ..., x N , u , ..., u N − is a strict localminimum of the original OCP. Proof:

Because the solution x , ..., x N , u , ..., u N − with the Lagrange multipliers satisﬁes the SOSC of the original OCP, as indicated by Theorem 4.2, the solution x , ..., x N , u , ..., u N − is a strict local minimum of theoriginal OCP.V. N UMERICAL E XPERIMENTS ON W HOLE -B ODY Q UADRUPEDAL G AITS O PTIMIZATION

A. Experimental Settings

To demonstrate the effectiveness of the proposed methodover existing methods, we conducted numerical experimentson the whole-body optimal control of a quadrupedal robotANYmal for various gaits. The equation of motion of thefull 3D model of the quadrupedal robot is of the form of(5). Moreover, a pure-state constraint whose Jacobian is ofthe form (6) is imposed just before each impact between theleg and the ground, which is termed the switching constraint[11], [12]. We compare the following three Riccati recursionalgorithms based on the direct-multiple shooting methodand Gauss-Newton Hessian approximation with various con-straint handling methods: • The proposed method • The Riccati recursion with pure-state constraints [13] • The AL methodWe implemented these three algorithms in C++ and usedPinocchio [17], an efﬁcient C++ library used for rigid-body dynamics and its analytical derivatives, to compute thedynamics and its derivatives of the quadrupedal robot. Weused OpenMP [18] for parallel computing (e.g., line 1 ofAlgorithm 1) and four threads through the following exper-iments. To consider the practical situation, we also imposedinequality constraints on the joint angle limits, joint angularvelocity limits, and joint torque limits of each joint andpolyhedral-approximated friction cone [19] for each contactand impact force. We used the primal-dual interior pointmethod [20] with ﬁxed barrier parameters for the inequalityconstraints. None of the three methods used line search; theyonly used the fraction-to-boundary rule [20] for step-sizeselection. We ﬁxed the instants of the impact between therobot and the ground in the following experiments and didnot treat them as optimization variables as in [11], [12], tofocus on the evaluation of the constraint-handling methods.All experiments were conducted on a laptop with a hexa-coreCPU Intel Core i9-8950HK @2.90 GHz.In the following two experiments, we considered that theOCP converges when the l -norm of the residuals in theKarush—Kuhn–Tucker (KKT) conditions, which we referto as the KKT error, becomes smaller than a prespeciﬁedthreshold. The KKT conditions are composed of the FONCand primal and dual residuals in the inequality constraints.For example, the KKT conditions of the proposed methodare composed of (2), (4), (8), (10)–(14), and the residuals inthe inequality constraints. The KKT conditions of the Riccatirecursion with pure-state constraints [13] and the AL methodare only slightly different depending on the method used totreat pure-state equality constraints. ABLE IS

ETTINGS OF

OCP

S FOR EACH NUMBER OF TROTTING STEPS

No. of trotting steps 2 4 6 8 10Horizon length T N

35 59 83 107 131Total dim. of (3) 12 24 36 48 60

B. Trotting Gait for Different Numbers of Steps

First, we evaluated the performances of the three meth-ods for different total dimensions of pure-state equalityconstraints by considering the trotting gaits of ANYmalwith different numbers of trotting steps . A six-dimensional(three-dimensional for each impact leg) pure-state equalityconstraint (switching constraint) is imposed on the OCPs foreach trotting step. We chose the number of trotting stepsfrom 2, 4, 6, 8, and 10 and measured the CPU time perNewton iteration and the total CPU time until convergence.We summarize the settings of the OCPs (horizon length T , number of grids N , and total dimension of equalityconstraints (3)) in Table I. We carefully tuned the parametersof the AL method; for example, we chose the initial penaltyparameter as p = 5 and the penalty update value as β = 8 ;that is, the AL method updates the penalty parameter as p ← βp . We chose . × − as the tolerance of the KKTerrors.Figure 1 depicts the CPU time per Newton iteration (leftﬁgure) and the total CPU time until convergence (rightﬁgure) of each method. As depicted in the left ﬁgure ofFig. 1, the CPU time per Newton iteration in the proposedmethod was almost the same as that in the AL method,whereas the Riccati recursion with pure-state constraints [13]took more computational time when compared with the othertwo methods. The right ﬁgure of Fig. 1 also indicates thatthe proposed method achieved the fastest convergence. Theproposed method took exactly the same number of iterations(approximately 20) until convergence as the Riccati recursionwith pure-state constraints [13] in all the cases. Therefore,the proposed method was faster than it in terms of thetotal CPU time as in the case of the per Newton iteration.The AL method was signiﬁcantly slower than the othertwo methods with respect to the total CPU time because itrequired approximately 80 iterations in all the cases, althoughwe carefully tuned the AL algorithm. C. Trotting, Jumping, and Running Gait Problems

Next, we investigated the performances of the proposedmethod in three different quadrupedal gaits: trotting, jump-ing, and running gaits , of which the jumping and runninggaits are particularly highly nonlinear and complicated prob-lems. Each jumping step imposes a 12-dimensional pure-state equality constraint, and each running step imposesa 6-dimensional one. We summarize the settings of eachproblem (horizon length T , number of grids N , number ofsteps, total dimension of equality constraints (3), tolerance A supplemental video including the quadrupedal gaits generated by theproposed method is available at https://youtu.be/uX1_58QvPUg.

No. of Trotting Steps C P U T i m e P e r I t e r a t i o n [ m s ] No. of Trotting Steps T o t a l C P U T i m e [ s ] Proposed Pure-state AL

Fig. 1. (Left) CPU time per Newton iteration and (right) total CPU timeuntil convergence for different numbers of trotting steps in the proposedmethod (Proposed), Riccati recursion with pure-state constraints [13] (purestate), and the AL method (AL).TABLE IIS

ETTINGS OF

OCP

S FOR TROTTING , JUMPING , AND RUNNING GAITS .Gait type Trotting Jumping RunningHorizon length T N

143 107 346No. of steps 11 3 26Total dim. of (3) 66 36 156KKT tolerance . × − . × − . × − p init of convergence, and initial penalty parameter of the ALmethod p init ) in Table II. As done in the preceding example,we carefully tuned the parameters of the AL method, forexample, the update rule of the penalty and the Lagrangemultiplier, for each problem. We measured the KKT errorswith respect to the number of iterations, and the total numberof iterations and CPU time until convergence.Figure 2 depicts the log -scaled KKT error of eachmethod for the three gait problems with respect to the numberof iterations. We can see that the convergence behavior of theproposed method was almost the same as that of the Riccatirecursion with pure-state constraints [13]. In contrast, the ALmethod resulted in signiﬁcantly slow convergence becauseit needs to update the penalty parameter and Lagrangemultiplier to reduce the constraint violation, which we cansee in the peaks in the KKT error of the AL method inFig. 2. Figure 3 indicates the number of iterations and totalCPU time until convergence of each method. We can seethat the total number of iterations of the proposed methodwas almost the same as that of the Riccati recursion withpure-state constraints [13], whereas the AL method requireda signiﬁcantly large number of iterations. In addition, aseach iteration of the proposed method was faster than that ofthe Riccati recursion with pure-state constraints [13], as inthe previous experiment, the proposed method achieved thefastest convergence.VI. C ONCLUSIONS

We proposed a novel approach to efﬁciently treat pure-state equality constraints in OCPs with a Riccati recursionalgorithm. The proposed method transforms a pure-stateequality constraint into a mixed state-control constraint such

10 20

No. of Iterations −10−50 l og ( KK T e rr o r ) Trotting

No. of Iterations −10−505

Jumping

No. of Iterations −7.5−5.0−2.50.02.5

Running

Proposed Pure-state AL

Fig. 2. log -scaled KKT errors of the proposed method (Proposed),the Riccati recursion with pure-state constraints [13] (Pure-state), and theAL method (AL) for the three gait problems with respect to the number ofiterations. It should be noted that the KKT errors include the violations ofthe switching constraints. The graphs of the AL method have peaks whenthe penalty parameter and Lagrange multipliers are updated.

80 1,000 2,80001020 N o . o f I t e r a t i o n s Trotting T o t a l C P U T i m e [ s ] Jumping

Running

Fig. 3. Number of iterations and total CPU time until convergenceof the proposed method (Proposed), the Riccati recursion with pure-stateconstraints [13] (Pure-state), and the AL method (AL) for the three gaitsproblems. that the constraint is expressed by variables at a certainprevious time stage. We derived a Riccati recursion algorithmto solve the transformed OCP with linear time complexity inthe grid number of the horizon, in contrast to the previousapproach [13], which scaled cubically with respect to the to-tal dimension of the pure-state equality constraints. Becausethe proposed method is essentially a Newton’s method for anoptimization problem with equality constraints, the proposedmethod achieves superlinear or quadratic convergence, whichdistinguishes our approach from the penalty function methodand the AL method in terms of the convergence property. Weshowed that if the solution satisﬁes the FONC and/or SOSCof the transformed OCP, then the solution also satisﬁes theFONC and/or SOSC of the original OCP. Therefore, if weﬁnd a solution that satisﬁes the SOSC of the transformedOCP, it is a local minimum of the original OCP. We per-formed numerical experiments on the whole-body optimalcontrol of quadrupedal gaits that involve pure-state equalityconstraints owing to contact switches and demonstrated theeffectiveness of the proposed method over the approach of [13] and the AL method.Our future work will include applying the proposedmethod with switching time optimization problems [11],[12]. We further extend the proposed method to constraintswhose relative degree is larger than 2.R

EFERENCES[1] L. Magni, D. Raimondo, and F. Allgöwer,

Nonlinear Model PredictiveControl: Towards New Challenging Applications . Springer, 2008.[2] G. Frison, “Algorithms and methods for high-performance model pre-dictive control,” Ph.D. dissertation, Technical University of Denmark,2016.[3] I. Nielsen, “Structure-exploiting numerical algorithms for optimalcontrol,” Ph.D. dissertation, Linköping University, 2017.[4] Y. Tassa, T. Erez, and E. Todorov, “Synthesis and stabilization ofcomplex behaviors through online trajectory optimization,” in , 2012, pp. 4906–4913.[5] E. Todorov and W. Li, “A generalized iterative LQG method forlocally-optimal feedback control of constrained nonlinear stochasticsystems,” in

Proceedings of the 2005, American Control Conference(ACC) , 2005, pp. 300–306.[6] J. Koenemann, A. Del Prete, Y. Tassa, E. Todorov, O. Stasse,M. Bennewitz, and N. Mansard, “Whole-body model-predictive con-trol applied to the HRP-2 humanoid,” in , 2015, pp. 3346–3351.[7] M. Neunert, M. Stäuble, M. Giftthaler, C. D. Bellicoso, J. Carius,C. Gehring, M. Hutter, and J. Buchli, “Whole-body nonlinear modelpredictive control through contacts for quadrupeds,”

IEEE Roboticsand Automation Letters , vol. 3, no. 3, pp. 1458–1465, 2018.[8] A. Z. D. Kouzoupis, G. Frison and M. Diehl, “Recent advancesin quadratic programming algorithms for nonlinear model predictivecontrol,”

Vietnam Journal of Mathematics , vol. 46, no. 4, pp. 863–882,2018.[9] H. Bock and K. Plitt, “A multiple shooting algorithm for direct solutionof optimal control problems,” in , 1984, pp.1603–1608.[10] J. Albersmeyer and M. Diehl, “The lifted Newton method and itsapplication in optimization,”

SIAM Journal on Optimization , vol. 20,no. 3, pp. 1655–1684, 2010.[11] F. Farshidian, M. Neunert, A. W. Winkler, G. Rey, and J. Buchli,“An efﬁcient optimal planning and control framework for quadrupedallocomotion,” in , 2017, pp. 93–100.[12] H. Li and P. M. Wensing, “Hybrid systems differential dynamicprogramming for whole-body motion planning of legged robots,”

IEEERobotics and Automation Letters , vol. 5, no. 4, pp. 5448–5455, 2020.[13] A. Sideris and L. A. Rodriguez, “A Riccati approach for constrainedlinear quadratic optimal control,”

International Journal of Control ,vol. 84, no. 2, pp. 370–380, 2011.[14] M. Giftthaler and J. Buchli, “A projection approach to equality con-strained iterative linear quadratic optimal control,” in ,2017, pp. 61–66.[15] D. P. Bertsekas,

Nonlinear Programming , 3rd ed. Athena Scientiﬁc,2016.[16] A. E. Bryson and Y.-C. Ho,

Applied Optimal Control: Optimization,Estimation, and Control . CRC Press, 1975.[17] J. Carpentier, G. Saurel, G. Buondonno, J. Mirabel, F. Lamiraux,O. Stasse, and N. Mansard, “The Pinocchio C++ library – A fastand ﬂexible implementation of rigid body dynamics algorithms andtheir analytical derivatives,” in

International Symposium on SystemIntegration (SII) , 2019, pp. 614 – 619.[18] L. Dagum and R. Menon, “OpenMP: An industry-standard APIfor shared-memory programming,”

IEEE Computational Science &Engineering , vol. 5, no. 1, p. 46–55, 1998.[19] S. Caron, Q.-C. Pham, and Y. Nakamura, “Leveraging cone doubledescription for multi-contact stability of humanoids with applicationsto statics and dynamics,” in

Robotics: Science and System (RSS) , 2015.[20] A. Wächter and L. Biegler, “On the implementation of an interior-pointﬁlter line-search algorithm for large-scale nonlinear programming,”