Optimization Based Planner Tracker Design for Safety Guarantees
He Yin*, Monimoy Bujarbaruah*, Murat Arcak, and Andrew Packard
Abstract — We present a safe-by-design approach to path planning and control for nonlinear systems. The planner uses a low-fidelity model of the plant to compute reference trajectories by solving an MPC problem, while the plant being controlled utilizes a feedback control law that tracks those trajectories with an upper bound on the tracking error. Our main goal is to allow for maximum permissiveness (that is, room for constraint feasibility) of the planner, while maintaining safety after accounting for the tracking error bound. We achieve this by parametrizing the state and input constraints imposed on the planner and deriving corresponding parametrized tracking control laws and tracking error bounds, which are computed offline through Sum-of-Squares programming. The parameters are then optimally chosen to maximize planner permissiveness, while guaranteeing safety.
I. INTRODUCTION
Path planning and control of automated systems is a highly researched topic and a number of approaches exist to tackle this problem [1], [2], [3]. A widely used approach for path planning and control is Model Predictive Control (MPC) [4], [5], [6], where a model is used to predict system states over a finite horizon, and a sequence of optimal inputs is synthesized by solving a constrained finite-time optimization problem minimizing a suitably chosen cost function. The first optimal input is applied to the system, and then the process is repeated, thus resulting in a receding horizon control strategy. If the "planning" model used for MPC predictions and the plant have no discrepancy, then the so-called recursive feasibility, as well as stability of such an MPC controller, are ensured by suitably choosing "terminal conditions" in the MPC problem [5, Chapter 12]. Such feasibility certificates are crucial for safety-critical applications, where constraint violations are intolerable at any time during operation. However, under mismatch of the planning model and the plant, the MPC optimization problem must be robustified.

Feasibility and stability properties of robust MPC have been studied in detail over the past few decades [7], [8]. For linear systems, Tube MPC [9], [10] is a widely used approach that solves a computationally efficient convex optimization problem for robust control synthesis. Although Tube MPC designs with feasibility and stability properties are proposed for nonlinear systems in [11], [12], the control synthesis problem becomes computationally demanding, due to non-convexity of the resulting optimization problem [13]. To alleviate this issue, a typical approach in the path planning community is a two-layer control architecture of planner–tracker design [14], [15], [16], [17], [18], [19].

* Authors contributed equally. E-mails: {he yin, monimoyb, arcak, apackard}@berkeley.edu.
The high-level planning controller is synthesized and implemented online using a low-fidelity planning model, and imposes appropriately chosen constraints on the variables of the planner. The low-level tracking controller applied to the plant (referred to as the tracker) simultaneously ensures robust constraint satisfaction in closed loop by bounding the tracking error in a set. The tracking controller is computationally expensive to synthesize, but it is typically a state feedback policy, making it cheap to implement. Hence, it is synthesized offline a priori, and the policy is invoked during run-time. However, if the constraints imposed on the planner are not chosen appropriately, either the planner can become infeasible, or the error bound can violate tolerable limits, resulting in the tracker violating safety constraints.

In this paper we propose an optimization-based approach to designing a path planning–tracking algorithm for nonlinear systems. We extend the class of applicable systems beyond the ones considered in [15], [18]. Our contributions are:

1) We introduce an approach for parametrizing the state and input constraints in the MPC planner with parameters $\theta$. Using SOS programming [20], [21] offline, we synthesize a parametric error bound $O_\theta$ for the containment of the error between planner and tracker states, and an associated feedback control policy $\kappa_\theta$.

2) We then solve an optimization problem offline to pick the optimal parameter $\theta^\star$ that gives the "widest" planner state constraint set $\hat{X}_{\theta^\star}$ such that, when enlarged by the error bound $O_{\theta^\star}$, it is contained in the constraint set $X$. Contrary to approaches such as [18], [19], this provides a systematic and optimal way of designing the planner and the associated error bound.

3) We solve an MPC problem for the planner imposing constraints $\hat{X}_{\theta^\star}$ and $\hat{U}_{\theta^\star}$, and use the input policy $\kappa_{\theta^\star}$ to control the plant. If the planner MPC problem is feasible, then satisfaction of all safety constraints is guaranteed for the plant. We demonstrate this with a detailed numerical example.

A. Notation
For $\xi \in \mathbb{R}^n$, $\mathbb{R}[\xi]$ represents the set of polynomials in $\xi$ with real coefficients, and $\mathbb{R}^m[\xi]$ and $\mathbb{R}^{m \times p}[\xi]$ denote all vector- and matrix-valued polynomial functions. The subset $\Sigma[\xi] := \{p = p_1^2 + p_2^2 + \dots + p_M^2 : p_1, \dots, p_M \in \mathbb{R}[\xi]\}$ of $\mathbb{R}[\xi]$ is the set of SOS polynomials in $\xi$. Row $k$ of a matrix $A$, and element $k$ of a vector $b$, are denoted by $(A)_k$ and $(b)_k$ respectively. Unless defined otherwise, the notation $x^j$ denotes a variable $x$ used in the $j$'th iteration of an iterative algorithm. The symbol "$\leq$" represents component-wise inequality.
Fig. 1: The control framework. An MPC planner uses the planning model to generate planner states; a tracking controller (synthesized by SOS) drives the tracking model so that the error between tracker and planner states stays bounded.

II. PROBLEM SETUP
In this paper, the control framework shown in Fig. 1 has two layers: (i) a planning layer (planner), where a planning trajectory with a long time horizon is generated using a planning model and Model Predictive Control (MPC); (ii) a tracking layer (tracker), where tracking control signals are computed for the true plant to track the planned trajectories with bounded error.
A. Tracking Model
The high-fidelity model of the plant is referred to as the tracking model. This is an uncertain, input-affine, nonlinear system with parametric uncertainty $\delta(t)$, and is given as
$$\dot{x}(t) = f(x(t), \delta(t)) + g(x(t), \delta(t))\, u(t), \quad \forall t \geq 0, \tag{1}$$
where $x(t) \in X \subseteq \mathbb{R}^n$, $u(t) \in U \subseteq \mathbb{R}^m$, $\delta(t) \in \Delta \subseteq \mathbb{R}^{n_\delta}$, $f : \mathbb{R}^n \times \mathbb{R}^{n_\delta} \to \mathbb{R}^n$, and $g : \mathbb{R}^n \times \mathbb{R}^{n_\delta} \to \mathbb{R}^{n \times m}$. The sets $X$ and $U$ are the state and control constraint sets imposed on the tracking model, and the set $\Delta := \{\delta \in \mathbb{R}^{n_\delta} : p_\delta(\delta) \leq 0\}$ defines the set of disturbances, where $p_\delta : \mathbb{R}^{n_\delta} \to \mathbb{R}$ is specified by the designer.

B. Planning Model
The low-fidelity model, also referred to as the planning model, is a simplified (e.g., linearized) and potentially low-dimensional version of the tracking model, given by
$$\dot{\hat{x}}(t) = \hat{f}(\hat{x}(t)) + \hat{g}(\hat{x}(t))\, \hat{u}(t), \quad \forall t \geq 0, \tag{2}$$
where $\hat{x}(t) \in \mathbb{R}^{\hat{n}}$, $\hat{u}(t) \in \mathbb{R}^{\hat{m}}$, $\hat{f} : \mathbb{R}^{\hat{n}} \to \mathbb{R}^{\hat{n}}$ and $\hat{g} : \mathbb{R}^{\hat{n}} \to \mathbb{R}^{\hat{n} \times \hat{m}}$. For now we assume the high- and low-fidelity models have the same state dimension, $\hat{n} = n$. The notation $\hat{n}$ is retained here for use in Section IV, where the case $\hat{n} \leq n$ is addressed.

C. Error Dynamics
Accounting for the difference between the states of the low- and high-fidelity models yields the error states $e(t) = x(t) - \hat{x}(t)$. The error dynamics are given, for all $t \geq 0$, as
$$\dot{e}(t) = f_e(e(t), \hat{x}(t), \hat{u}(t), \delta(t)) + g_e(e(t), \hat{x}(t), \delta(t))\, u(t), \tag{3}$$
where $f_e(e, \hat{x}, \hat{u}, \delta) := f(e + \hat{x}, \delta) - \hat{f}(\hat{x}) - \hat{g}(\hat{x})\hat{u}$, and $g_e(e, \hat{x}, \delta) := g(e + \hat{x}, \delta)$. Let $K_U := \{\kappa : \mathbb{R}^n \times \mathbb{R}^{\hat{n}} \times \mathbb{R}^{\hat{m}} \times \mathbb{R}^{n_\delta} \to U\}$ define the set of admissible error-state feedback control laws. Notice that (3) allows for dependence on $\hat{x}$, which is an extension to a richer class of systems than [18], [15].

Assumption 1:
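As a sanity check, the construction of $f_e$ and $g_e$ can be verified numerically: by definition, $\dot{e} = \dot{x} - \dot{\hat{x}}$, and this must coincide with the right-hand side of (3). A minimal sketch with toy scalar models (none of these dynamics are from the paper):

```python
# Toy tracker (1) and planner (2), chosen only to exercise definition (3).
def f(x, d):       return -x + d            # tracker drift (hypothetical)
def g(x, d):       return 1 + 0.1 * x       # tracker input gain (hypothetical)
def f_hat(xh):     return -xh               # planner drift (hypothetical)
def g_hat(xh):     return 1.0               # planner input gain (hypothetical)

# f_e and g_e exactly as defined under equation (3):
def f_e(e, xh, uh, d):  return f(e + xh, d) - f_hat(xh) - g_hat(xh) * uh
def g_e(e, xh, d):      return g(e + xh, d)

x, xh, u, uh, d = 0.7, 0.4, -0.3, 0.2, 0.05
e = x - xh
lhs = f_e(e, xh, uh, d) + g_e(e, xh, d) * u                    # RHS of (3)
rhs = (f(x, d) + g(x, d) * u) - (f_hat(xh) + g_hat(xh) * uh)   # xdot - xhat_dot
print(abs(lhs - rhs) < 1e-12)  # True
```

The identity holds for any point, since substituting $x = e + \hat{x}$ into (1) and subtracting (2) is exactly how (3) is derived.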
Assume that the initial condition of the error state, $e(0)$, starts within the set $\Omega := \{e \in \mathbb{R}^n : p_e(e) \leq 0\}$, that is, $e(0) \in \Omega$, where $p_e : \mathbb{R}^n \to \mathbb{R}$ is specified by the designer.

D. Planner Formulation
In the planner, the MPC that generates online planning trajectories solves
$$\begin{aligned} \min_{\hat{U}_t} \ & \sum_{k=0}^{N-1} \left(\hat{x}_{k|t}^\top Q \hat{x}_{k|t} + \hat{u}_{k|t}^\top R \hat{u}_{k|t}\right) + \hat{x}_{N|t}^\top P_N \hat{x}_{N|t} \\ \text{s.t. } \ & \hat{x}_{k+1|t} = \hat{F}_d(\hat{x}_{k|t}, \hat{u}_{k|t}, T_s), \\ & \hat{x}_{k|t} \in \hat{X}, \ \hat{u}_{k|t} \in \hat{U}, \ \forall k \in \{0, \dots, N-1\}, \\ & \hat{x}_{t|t} = \hat{x}(t), \ \hat{x}_{N|t} \in \hat{X}_N \subseteq \hat{X}, \end{aligned} \tag{4}$$
with $Q, R, P_N \succ 0$, where $\hat{F}_d$ is system (2) discretized with sampling time $T_s$. Let $\hat{x}_{k|t}$ be the predicted planner states at time $t$ under the predicted planner inputs $\hat{U}_t = [\hat{u}_{0|t}, \hat{u}_{1|t}, \dots, \hat{u}_{N-1|t}] \in \mathbb{R}^{\hat{m} \times N}$. Each prediction instant $k \in \{0, \dots, N\}$ represents a look-ahead time of $kT_s$. The planner constraint sets are defined as
$$\hat{X} := \{\hat{x} \in \mathbb{R}^{\hat{n}} : \hat{p}_x(\hat{x}) \leq \hat{h}_x\}, \tag{5a}$$
$$\hat{U} := \{\hat{u} \in \mathbb{R}^{\hat{m}} : \hat{p}_u(\hat{u}) \leq \hat{h}_u\}, \tag{5b}$$
with $\hat{p}_x : \mathbb{R}^{\hat{n}} \to \mathbb{R}$, $\hat{p}_u : \mathbb{R}^{\hat{m}} \to \mathbb{R}$, $\hat{h}_x \in \mathbb{R}$, $\hat{h}_u \in \mathbb{R}$ chosen by the designer. Terminal conditions $\hat{X}_N$ and $P_N$ are chosen to ensure feasibility and stability properties [5]. After solving (4) at any time $t$, we apply the first optimal input $\hat{u}^\star(t) = \hat{u}^\star_{0|t}$ only to the low-fidelity planner system (2). We then re-solve (4) at the next time instant $t + T_s$.

E. Tracker Formulation

Definition 1:
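The receding-horizon logic around (4) — solve, apply the first input, re-solve at $t + T_s$ — can be sketched with a toy discretized planner model. The dynamics, cost weights, and the coarse enumerated input grid below are all hypothetical stand-ins for $\hat{F}_d$, $Q$, $R$, $P_N$ and $\hat{U}$; a real implementation would use a nonlinear or quadratic programming solver rather than enumeration:

```python
import itertools

# Hypothetical discretized planner model F_d: a lightly damped
# second-order system (not the paper's model).
Ts = 0.1
def F_d(x, u):
    p, v = x
    return (p + Ts * v, v + Ts * (-p - v + u))

def stage_cost(x, u):                     # plays the role of x'Qx + u'Ru
    return x[0] ** 2 + 0.1 * x[1] ** 2 + 0.01 * u ** 2

def terminal_cost(x):                     # plays the role of x'P_N x
    return 5.0 * (x[0] ** 2 + x[1] ** 2)

def mpc_step(x0, N=5, inputs=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Enumerated stand-in for problem (4): search a coarse input grid
    over the horizon and return the first optimal input u*_{0|t}."""
    best_cost, best_u0 = float("inf"), 0.0
    for U in itertools.product(inputs, repeat=N):
        x, cost = x0, 0.0
        for u in U:
            cost += stage_cost(x, u)
            x = F_d(x, u)
        cost += terminal_cost(x)
        if cost < best_cost:
            best_cost, best_u0 = cost, U[0]
    return best_u0

# Receding-horizon loop: apply only the first input, then re-solve.
x = (1.0, 0.0)
for _ in range(80):
    x = F_d(x, mpc_step(x))
print(x)  # state regulated toward the origin
```

The key structural point mirrored here is that only $\hat{u}^\star_{0|t}$ is ever applied before the problem is solved again from the newly measured state.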
Robust Infinite-Time Forward Reachable Set: Consider the closed-loop error dynamics obtained from (3) for all $t \geq 0$, under a given control law $\kappa \in K_U$, as
$$\dot{e}(t) = f_e(e(t), \hat{x}(t), \hat{u}(t), \delta(t)) + g_e(e(t), \hat{x}(t), \delta(t))\, \kappa(e(t), \hat{x}(t), \hat{u}(t), \delta(t)), \tag{6}$$
with $\hat{x}(t)$ and $\hat{u}(t)$ constrained by (5) for all times $t \geq 0$. Then a robust infinite-time forward reachable set $O$ of $\Omega$, for a given feedback $\kappa$, is defined as
$$O := \{e(t) \in \mathbb{R}^n : \exists\, e(0) \in \Omega,\ \hat{x} : \mathbb{R}_+ \to \hat{X},\ \hat{u} : \mathbb{R}_+ \to \hat{U},\ \delta : \mathbb{R}_+ \to \Delta,\ t \geq 0, \ \text{s.t. } e(t) \text{ is a solution to (6)}\}.$$
We assume that $O$ is a compact set. The tracking control synthesizes an error-state feedback policy $u(t) = \kappa(e(t), \hat{x}(t), \hat{u}(t), \delta(t))$ with $\kappa \in K_U$ ensuring containment of the error states within such an $O$. We refer to that $O$ as an "error bound", and to $\kappa$ as the corresponding "tracking control" law; they can be obtained by following [19] using Sum-of-Squares (SOS) programming. $O$ is a function of $\hat{X}$, $\hat{U}$ and $\Delta$. As the volumes of $\hat{X}$, $\hat{U}$ and $\Delta$ increase, we tend to get a larger error bound $O$.

Remark 1:
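The defining property of such an error bound — once the error enters the set, it never leaves under any admissible disturbance — can be illustrated on a toy scalar error system. The dynamics, disturbance set, and level $\gamma$ below are illustrative, not the paper's:

```python
import random

# Toy error system: edot = -e + d with |d| <= 0.5, and candidate bound
# O = {e : V(e) <= gamma} with V(e) = e^2, gamma = 0.25 (i.e. |e| <= 0.5).
# On the boundary |e| = 0.5: Vdot = 2e(-e + d) <= 2|e|(|d| - |e|) <= 0,
# so O is robustly invariant. Crude Euler simulation as a spot check:
def invariant(gamma=0.25, dt=1e-3, steps=5000, trials=20):
    random.seed(0)
    for _ in range(trials):
        e = random.uniform(-0.5, 0.5)        # e(0) inside O
        for _ in range(steps):
            d = random.uniform(-0.5, 0.5)    # disturbance in Delta
            e += dt * (-e + d)
            if e * e > gamma + 1e-9:
                return False
    return True

print(invariant())  # True
```

A simulation of course proves nothing by itself; in the paper the invariance is certified algebraically via SOS conditions such as (11a) below.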
Note that since the planner MPC problem (4) is not posed in continuous time, the guarantees of feasibility of the planner constraints (5) hold only at sampled time instants, assuming perfect discretization (although this is exact for linear systems, for nonlinear systems variational methods can be used for obtaining arbitrarily low discretization errors [22]). We hereby assume that the planner sample frequency is chosen high enough that all continuous-time guarantees hold for this planner–tracker synthesis work.

III. PARAMETRIC APPROACH TO PLANNER–TRACKER DESIGN
The primary goal is to ensure constraint satisfaction on the state $x(t)$ of the tracker evolving according to (1) under the control law $\kappa$, i.e., $x(t) \in X$ for all $t \geq 0$. For this we must make sure
$$\hat{X} \oplus O \subseteq X. \tag{7}$$
Note that if $\hat{X}$ is chosen to be of small volume, the corresponding $O$ is small, and it is very likely that (7) will hold, but it might leave too little room for (4) to be feasible. If $\hat{X}$ is chosen to be too large, (7) might be violated. To address this trade-off between planner permissiveness and tracker safety, we propose a parametric approach, where we parametrize the planner constraint sets (5) as $\hat{X}_\theta$ and $\hat{U}_\theta$, $\theta \in \Theta$. The set $\Theta$ is defined as $\Theta := \{\theta \in \mathbb{R}^{n_\theta} : p_\theta(\theta) \leq 0\}$, where $p_\theta : \mathbb{R}^{n_\theta} \to \mathbb{R}$ is picked by the user. Correspondingly, we compute a parametric forward reachable set $O_\theta$ of $\Omega$, where
$$O_\theta := \{e(t) : \exists\, e(0) \in \Omega,\ \hat{x} : \mathbb{R}_+ \to \hat{X}_\theta,\ \hat{u} : \mathbb{R}_+ \to \hat{U}_\theta,\ \delta : \mathbb{R}_+ \to \Delta,\ t \geq 0, \ \text{s.t. } e(t) \text{ is a solution to (6)}\},$$
and its associated parametric control law $\kappa_\theta$. $O_\theta$ is referred to as a "parametric error bound". The existence of a specific parameter $\tilde{\theta} \in \Theta$ which satisfies
$$\hat{X}_{\tilde{\theta}} \oplus O_{\tilde{\theta}} \subseteq X, \tag{8}$$
and for which (4) is feasible, ensures safety: $x(t) \in X$ for all $t \geq 0$. We parametrize the constraint sets (5) using $\theta \in \Theta$ as
$$\hat{X}_\theta := \{\hat{x} \in \mathbb{R}^{\hat{n}} : \hat{p}_x(\hat{x}) \leq \hat{h}^\theta_x\}, \tag{9}$$
$$\hat{U}_\theta := \{\hat{u} \in \mathbb{R}^{\hat{m}} : \hat{p}_u(\hat{u}) \leq \hat{h}^\theta_u\}, \tag{10}$$
where $\hat{h}^\theta_x : \mathbb{R}^{n_\theta} \to \mathbb{R}$ and $\hat{h}^\theta_u : \mathbb{R}^{n_\theta} \to \mathbb{R}$. Given the parametrized constraint sets (9)–(10), we take two steps:

1) compute a parametric error bound $O_\theta$, and an associated feedback policy denoted by $\kappa_\theta$, which may vary as the parameter $\theta$ is varied;

2) solve an optimization problem to pick the "best" $\tilde{\theta}$ that gives the most permissive planner (widest $\hat{X}_{\tilde{\theta}}$) subject to the safety constraint (8).

These steps are elaborated in the following sections.

A. Parametric Error Bound $O_\theta$

We use Theorem 1 below to compute a parametric error bound $O_\theta$, as well as an associated feedback control policy, denoted as $\kappa_\theta$.
Note that we use the same symbol for a particular real variable in the algebraic statements as for the corresponding signal in the dynamical systems, after dropping the time argument. Consider the tracker input $u$ defined in (1). We assume the set of constraints on $u$ is a polytope $U = \{u \in \mathbb{R}^m : Hu \leq h\}$, where $H \in \mathbb{R}^{N \times m}$, $h \in \mathbb{R}^N$, and we overload the notation $K_U$ as $K_U := \{\kappa : \mathbb{R}^n \times \mathbb{R}^{\hat{n}} \times \mathbb{R}^{\hat{m}} \times \mathbb{R}^{n_\delta} \times \mathbb{R}^{n_\theta} \to U\}$.

Theorem 1:
Let Assumption 1 hold. Given the error dynamics with mappings $f_e : \mathbb{R}^n \times \mathbb{R}^{\hat{n}} \times \mathbb{R}^{\hat{m}} \times \mathbb{R}^{n_\delta} \to \mathbb{R}^n$, $g_e : \mathbb{R}^n \times \mathbb{R}^{\hat{n}} \times \mathbb{R}^{n_\delta} \to \mathbb{R}^{n \times m}$, $\gamma \in \mathbb{R}$, $\hat{X}_\theta \subseteq \mathbb{R}^{\hat{n}}$, $\hat{U}_\theta \subseteq \mathbb{R}^{\hat{m}}$, $\Theta \subseteq \mathbb{R}^{n_\theta}$, $\Delta \subseteq \mathbb{R}^{n_\delta}$, $\Omega \subseteq \mathbb{R}^n$, $H \in \mathbb{R}^{N \times m}$ and $h \in \mathbb{R}^N$, if there exist a $C^1$ function $V : \mathbb{R}^n \times \mathbb{R}^{n_\theta} \to \mathbb{R}$ and $\kappa : \mathbb{R}^n \times \mathbb{R}^{\hat{n}} \times \mathbb{R}^{\hat{m}} \times \mathbb{R}^{n_\delta} \times \mathbb{R}^{n_\theta} \to \mathbb{R}^m$, such that for all $\delta \in \Delta$, $\hat{x} \in \hat{X}_\theta$, $\hat{u} \in \hat{U}_\theta$, the following constraints hold:
$$\frac{\partial V(e, \theta)}{\partial e} \cdot \big(f_e(e, \hat{x}, \hat{u}, \delta) + g_e(e, \hat{x}, \delta)\, \kappa(e, \hat{x}, \hat{u}, \delta, \theta)\big) \leq 0, \quad \forall (e, \theta) \in \mathbb{R}^n \times \Theta \ \text{s.t.}\ V(e, \theta) = \gamma, \tag{11a}$$
$$\{e : V(e, \theta) \leq \gamma\} \subseteq \{e : H\kappa(e, \hat{x}, \hat{u}, \delta, \theta) \leq h\}, \quad \forall \theta \in \Theta, \tag{11b}$$
$$\Omega \times \Theta \subseteq \{(e, \theta) : V(e, \theta) \leq \gamma\}, \tag{11c}$$
then the $\theta$-dependent sub-level set $O_\theta := \{e : V(e, \theta) \leq \gamma\}$ is a parametric forward reachable set of $\Omega$ under the control policy $\kappa_\theta := \kappa(\cdot, \cdot, \cdot, \cdot, \theta) \in K_U$.

Proof:
Treat $\theta$ as a vector of uncertain parameters with dynamics $\dot{\theta} = 0$, and let $\theta(0) = \theta$ for all possible $\theta \in \Theta$. For all augmented initial states $(e(0), \theta(0)) \in \Omega \times \Theta \subseteq \{(e, \theta) : V(e, \theta) \leq \gamma\}$ (i.e., $e(0) \in \{e : V(e, \theta) \leq \gamma\}$ by (11c)), condition (11a) ensures that $V$ cannot increase through the level set $V = \gamma$, so the sub-level set is forward invariant, while (11b) guarantees that the applied policy remains admissible, $\kappa_\theta \in K_U$. Hence $e(t) \in \{e : V(e, \theta) \leq \gamma\}$ for all $t \geq 0$, implying $\{e : V(e, \theta) \leq \gamma\}$ is a parametric forward reachable set of $\Omega$.

Remark 2:
Since $O_\theta$ is also a positively invariant set for the error states for all $\theta \in \Theta$, once we obtain it based on $\Omega$, it can itself serve as the set of initial conditions for the error states. We use Sum-of-Squares (SOS) programming [20], [21] to find the storage function $V$ and control law $\kappa_\theta$ by solving the following non-convex optimization problem. We restrict $p_e \in \mathbb{R}[e]$, $p_\theta \in \mathbb{R}[\theta]$, $p_\delta \in \mathbb{R}[\delta]$, $\hat{p}_x \in \mathbb{R}[\hat{x}]$, $\hat{p}_u \in \mathbb{R}[\hat{u}]$, $f_e \in \mathbb{R}^n[(e, \hat{x}, \hat{u}, \delta)]$, $g_e \in \mathbb{R}^{n \times m}[(e, \hat{x}, \delta)]$, $V \in \mathbb{R}[(e, \theta)]$ and $\kappa \in \mathbb{R}^m[(e, \hat{x}, \hat{u}, \delta, \theta)]$.
$$\min_{V, \kappa, s} \ \text{volume}(O_\theta)$$
$$\text{s.t. } s_0 \in \mathbb{R}[(e, \hat{x}, \hat{u}, \delta, \theta)], \ s_{11}, s_{12} \in \Sigma[(e, \theta)], \ s_j \in \Sigma[(e, \hat{x}, \hat{u}, \delta, \theta)] \ \forall j \in \{2, \dots, 5\}, \ (s_l)_k \in \Sigma[(e, \hat{x}, \hat{u}, \delta, \theta)] \ \forall l \in \{6, \dots, 10\}, \tag{12a}$$
$$-\left(\frac{\partial V}{\partial \theta}\right)_i + (s_1)_i \cdot p_\theta \in \Sigma[(e, \theta)], \quad (s_1)_i \in \Sigma[(e, \theta)], \quad \forall i \in \{1, \dots, n_\theta\}, \tag{12b}$$
$$-\frac{\partial V}{\partial e} \cdot (f_e + g_e \kappa) - s_0 \cdot (V - \gamma) + s_2 \cdot (\hat{p}_x - \hat{h}^\theta_x) + s_3 \cdot (\hat{p}_u - \hat{h}^\theta_u) + s_4 \cdot p_\theta + s_5 \cdot p_\delta \in \Sigma[(e, \hat{x}, \hat{u}, \delta, \theta)], \tag{12c}$$
$$(h)_k - (H)_k \kappa + (s_6)_k \cdot (V - \gamma) + (s_7)_k \cdot (\hat{p}_x - \hat{h}^\theta_x) + (s_8)_k \cdot (\hat{p}_u - \hat{h}^\theta_u) + (s_9)_k \cdot p_\theta + (s_{10})_k \cdot p_\delta \in \Sigma[(e, \hat{x}, \hat{u}, \delta, \theta)], \ \forall k \in \{1, \dots, N\}, \tag{12d}$$
$$-(V - \gamma) + s_{11} \cdot p_e + s_{12} \cdot p_\theta \in \Sigma[(e, \theta)]. \tag{12e}$$
The SOS polynomials $s$ serve as S-procedure certificates, and are usually referred to as "multiplier polynomials". Constraints (12c)–(12e), when feasible, are sufficient conditions for (11a)–(11c), respectively. The rationale for constraint (12b) is elaborated in Proposition 1. Solving optimization (12) directly can be challenging, since it is bi-linear in the decision variables $V$ and $(\kappa, s_0, (s_6)_k)$. Similar to [23], in Algorithm 1 we decompose optimization (12) into two convex sub-problems and iteratively search between the two sets of decision variables. Note that the initialization $V^0$ of Algorithm 1 can be computed using [19, Algorithm 2].

Algorithm 1
Computing $O_\theta$ and $\kappa_\theta$

Input: a function $V^0$ such that (12a)–(12e) are feasible by proper choice of $s$, $\kappa$, $\gamma$, and such that the sub-level sets of $V^0$ are bounded. Maximum iteration count $N_{\text{iter}}$.
Output: $(\kappa, \gamma, V)$ such that the volume of $O_\theta$ has been shrunk.
for $j = 1 : N_{\text{iter}}$ do
  $\gamma$-step: decision variables $(s, \kappa, \gamma)$. Minimize $\gamma$ subject to (12a), (12c)–(12e) using $V = V^{j-1}$. This yields the multipliers of $(V - \gamma)$ and the policy $\kappa^j$, along with the optimal cost $\gamma^j$.
  $V$-step: decision variables $V$ and all the multiplier polynomials except those multiplying $(V - \gamma)$. Maximize the feasibility [23] subject to (12a)–(12e), as well as $s_{13}, s_{14} \in \Sigma[(e, \theta)]$ and
$$-s_{13} \cdot (V^{j-1} - \gamma^j) + (V - \gamma^j) + s_{14} \cdot p_\theta \in \Sigma[(e, \theta)], \tag{13}$$
using $\gamma = \gamma^j$, $\kappa = \kappa^j$, and the $(V - \gamma)$-multipliers from the $\gamma$-step. This yields $V^j$.
end for

The constraint (13) enforces the sub-level set certified by the $V$-step, $\{(e, \theta) : V^j(e, \theta) \leq \gamma^j\}$, to be contained in the sub-level set from the $\gamma$-step, $\{(e, \theta) : V^{j-1}(e, \theta) \leq \gamma^j\}$, for all $\theta \in \Theta$.

B. Optimal Parameter Selection
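The alternation in Algorithm 1 exploits the fact that a bilinear problem is convex in each group of decision variables when the other group is frozen, so each step can only improve the cost. A minimal numeric analogue of the $\gamma$-step/$V$-step pattern, on a made-up bilinear cost whose coordinate-wise minimizers are available in closed form:

```python
# f(a, b) = (a*b - 1)^2 + 0.1*(a^2 + b^2) is non-convex jointly in (a, b)
# (bilinear coupling a*b) but strictly convex in each variable alone.
def f(a, b):
    return (a * b - 1) ** 2 + 0.1 * (a * a + b * b)

def a_step(b):           # argmin_a f(a, b), from d f/d a = 0
    return b / (b * b + 0.1)

def b_step(a):           # argmin_b f(a, b), by symmetry
    return a / (a * a + 0.1)

a, b = 3.0, 0.2
costs = [f(a, b)]
for _ in range(50):      # analogue of the gamma-step / V-step sweep
    a = a_step(b)
    b = b_step(a)
    costs.append(f(a, b))

# Coordinate-wise minimization never increases the cost.
assert all(c1 <= c0 + 1e-12 for c0, c1 in zip(costs, costs[1:]))
print(round(costs[-1], 4))  # 0.19, the value at the fixed point a = b = sqrt(0.9)
```

As with Algorithm 1, monotone improvement is guaranteed, but convergence is only to a stationary point of the bilinear problem, not necessarily a global optimum.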
Next, we need to pick the "optimal" $\tilde{\theta}$, denoted by $\theta^\star$, which gives the "widest" $\hat{X}_{\tilde{\theta}}$ subject to (8). The Minkowski sum of $\hat{X}_\theta$ and $O_\theta$ in (8) can be expressed as follows:
$$\begin{aligned} \hat{X}_\theta \oplus O_\theta &= \{x \in \mathbb{R}^n : x = \hat{x} + e, \ \hat{p}_x(\hat{x}) \leq \hat{h}^\theta_x, \ V(e, \theta) \leq \gamma\} \\ &= \{x \in \mathbb{R}^n : \hat{p}_x(\hat{x}) \leq \hat{h}^\theta_x, \ V(x - \hat{x}, \theta) \leq \gamma\}, \ \text{or} \\ &= \{x \in \mathbb{R}^n : \hat{p}_x(x - e) \leq \hat{h}^\theta_x, \ V(e, \theta) \leq \gamma\}. \end{aligned} \tag{14}$$
We assume $X$ is a semi-algebraic set, given as a super-level set of a polynomial function $p(\cdot)$; that is, $X = \{x \in \mathbb{R}^n : p(x) \geq 0\}$.

Optimal Parameter Selection by Sum-of-Squares:
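For box-shaped sets, the Minkowski sum in (14) and the containment (8) reduce to interval arithmetic, which makes the trade-off easy to sketch numerically (all bounds below are hypothetical):

```python
# Box sets as per-coordinate intervals (lo, hi). For boxes, the
# Minkowski sum is the componentwise interval sum, and containment (8)
# reduces to per-coordinate inequalities.
def minkowski_box(A, B):
    return [(al + bl, ah + bh) for (al, ah), (bl, bh) in zip(A, B)]

def contained(A, X):
    return all(xl <= al and ah <= xh for (al, ah), (xl, xh) in zip(A, X))

X     = [(-0.6, 0.6), (-0.9, 0.9)]    # safety set X (made-up bounds)
X_hat = [(-0.4, 0.4), (-0.5, 0.5)]    # planner set, plays the role of X̂_theta
O     = [(-0.15, 0.15), (-0.3, 0.3)]  # error bound, plays the role of O_theta

print(contained(minkowski_box(X_hat, O), X))  # True: (8) holds here
```

Enlarging `X_hat` (a more permissive planner) while `O` also grows eventually breaks the containment, which is exactly the trade-off that the parameter selection below resolves optimally.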
Replacing the constraint (8) with the reformulation in (14), we pose the following optimization problem, obtained by applying the polynomial S-procedure to (14), to find $\theta^\star$. Assume that $\hat{h}^\theta_x$ and $\hat{h}^\theta_u$ are chosen such that they grow as $\theta$ grows, by making sure
$$\frac{\partial \hat{h}^\theta_x}{\partial \theta} \geq 0, \quad \frac{\partial \hat{h}^\theta_u}{\partial \theta} \geq 0, \quad \forall \theta \in \Theta. \tag{15}$$
Therefore, to find the most permissive constraint sets for the MPC in (4), the sum of all the elements of $\theta$ is chosen as the reward function in
$$\begin{aligned} \max_{\theta, s_a, s_b} \ & \sum_{i=1}^{n_\theta} (\theta)_i \\ \text{s.t. } \ & \theta \in \Theta, \ s_a, s_b \in \Sigma[(x, e)], \\ & p + s_a \cdot (\hat{p}_x(x - e) - \hat{h}^\theta_x) + s_b \cdot (V(e, \theta) - \gamma) \in \Sigma[(x, e)], \end{aligned} \tag{16}$$
where $s_a$ and $s_b$ are polynomial multipliers. However, (16) is bi-linear in two sets of decision variables: the multipliers $(s_a, s_b)$ and $(V(e, \theta), \hat{h}^\theta_x)$, which are nonlinear functions of $\theta$. Although (16) is convex in $(s_a, s_b)$ when $\theta$ is fixed, (16) is not necessarily convex in $\theta$ when fixing $(s_a, s_b)$. We resolve this issue with the following Proposition.

Proposition 1:
Imposing constraint (12b) on $V$, that is, $\frac{\partial V}{\partial \theta} \leq 0$ for all $(e, \theta) \in \mathbb{R}^n \times \Theta$, ensures
$$O_{\theta_a} \subseteq O_{\theta_b}, \quad \forall \theta_a \leq \theta_b, \tag{17}$$
where $\theta_a, \theta_b \in \Theta$.

Proof: Since $\frac{\partial V}{\partial \theta} \leq 0$ for all $(e, \theta) \in \mathbb{R}^n \times \Theta$, we have $V(e, \theta_a) \geq V(e, \theta_b)$ whenever $\theta_a \leq \theta_b$. If an error state $e$ satisfies $V(e, \theta_a) \leq \gamma$, then it also satisfies $V(e, \theta_b) \leq \gamma$.

Reformulation for Iterative Convex Optimization:
We can iteratively solve (16) with Linear Matrix Inequalities (LMIs) if we do not make $\theta$ a decision variable, and instead look for its maximum allowable box bound $\bar{\theta}$ such that $\theta \in [0, \bar{\theta}]$. Thus, we solve the following reformulated SOS optimization problem as a tractable relaxation of (16):
$$\begin{aligned} \max_{\bar{\theta}, s_a, s_b, s_c} \ & \sum_{i=1}^{n_\theta} (\bar{\theta})_i \\ \text{s.t. } \ & \bar{\theta} \in \Theta, \ s_a, s_b, (s_c)_i \in \Sigma[(x, e, \theta)], \ \forall i \in \{1, \dots, n_\theta\}, \\ & p + s_a \cdot (\hat{p}_x(x - e) - \hat{h}^\theta_x) + s_b \cdot (V(e, \theta) - \gamma) - \sum_{i=1}^{n_\theta} (s_c)_i \cdot (\theta)_i \big((\bar{\theta})_i - (\theta)_i\big) \in \Sigma[(x, e, \theta)]. \end{aligned} \tag{18}$$
When feasible, (18) is a sufficient condition for
$$\hat{X}_\theta \oplus O_\theta \subseteq X, \quad \forall \theta \in [0, \bar{\theta}]. \tag{19}$$
Most importantly, optimization problem (18) is only bi-linear in $(s_c)_i$ and $(\bar{\theta})_i$, and can be solved by iteratively searching between $(s_c)_i$ and $(\bar{\theta})_i$ using Algorithm 2.

Algorithm 2
Optimal θ Selection
Input: $\bar{\theta}^0$ such that the constraints in (18) are feasible by proper choice of $s_a$, $s_b$, $(s_c)_i$, for all $i \in \{1, \dots, n_\theta\}$. Maximum iteration count $N_{\text{iter}}$.
Output: $\bar{\theta}$ that has been maximized.
for $j = 1 : N_{\text{iter}}$ do
  $s$-step: decision variables $(s_a, s_b, (s_c)_i)$. Maximize the feasibility subject to the constraints in (18), using $\bar{\theta} = \bar{\theta}^{j-1}$. This yields $(s_c)^j_i$.
  $\bar{\theta}$-step: decision variables $(s_a, s_b, \bar{\theta})$. Maximize $\sum_i (\bar{\theta})_i$ subject to the constraints in (18) using $(s_c)_i = (s_c)^j_i$ for all $i \in \{1, \dots, n_\theta\}$. This yields an optimum $\bar{\theta}^j$ of (18).
end for

Assumption 2:
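Proposition 1 together with (15) makes the problem monotone: if a bound $\bar{\theta}$ is safe, so is anything smaller. In one dimension, this means the set of safe parameters is an interval $[0, \bar{\theta}^\star]$, and a simple bisection recovers the boundary that Algorithm 2 searches for via SOS certificates. A sketch with a made-up, monotone growth law for the sets (none of these numbers come from the paper):

```python
# Hypothetical 1-D growth laws: planner set X̂_theta = [-theta, theta],
# error bound O_theta = [-(0.2*theta + 0.1), 0.2*theta + 0.1],
# safety set X = [-1, 1]. Containment (8) is then a scalar inequality.
X_LIMIT = 1.0

def safe(theta):
    # max |x̂ + e| = theta + 0.2*theta + 0.1 must not exceed the limit
    return theta + 0.2 * theta + 0.1 <= X_LIMIT

def max_theta(lo=0.0, hi=5.0, iters=60):
    """Bisection on the monotone feasibility predicate safe(.)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if safe(mid) else (lo, mid)
    return lo

print(round(max_theta(), 4))  # 0.75, solving 1.2*theta + 0.1 = 1
```

The SOS machinery in (18) plays the role of `safe(.)` here: it certifies feasibility of a candidate box bound, over polynomial (not just box) sets and in multiple parameter dimensions.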
We assume $\Theta$ is a box constraint set, and without loss of generality we choose $0$ as the lower bound.

Proposition 2:
Let Proposition 1 and Assumption 2 hold, and let $\theta^\star$ and $\bar{\theta}^\star$ be the global optima of (16) and (18), respectively. Then the optimization problems (16) and (18) are equivalent; that is, $\theta^\star = \bar{\theta}^\star$.

Proof:
For all $\tilde{\theta} \leq \theta^\star$, (15) yields $\hat{X}_{\tilde{\theta}} \subseteq \hat{X}_{\theta^\star}$. It follows from (17) that $O_{\tilde{\theta}} \subseteq O_{\theta^\star}$. Since $\theta^\star$ satisfies $\hat{X}_{\theta^\star} \oplus O_{\theta^\star} \subseteq X$, we have $\hat{X}_{\tilde{\theta}} \oplus O_{\tilde{\theta}} \subseteq X$ for all $\tilde{\theta} \leq \theta^\star$. It follows from Assumption 2 that the feasible set of $\theta$ for (16) is $F_\theta := \{\theta \in \Theta : 0 \leq \theta \leq \theta^\star\}$, which from (19) implies $\theta^\star = \bar{\theta}^\star$. This proves the Proposition.

IV. MODEL REDUCTION
In practice it may be desirable to simplify the low-fidelity model further by reducing the state dimension. To make the reduced states comparable to the original states, we define an appropriate map $\pi : \mathbb{R}^{\hat{n}} \to \mathbb{R}^n$, $\hat{n} \leq n$, and redefine the error states as $e(t) = x(t) - \pi(\hat{x}(t))$. The map $\pi(\cdot)$ needs to be chosen with care according to the specific application and control objective. Accordingly, for the newly defined error states, $f_e$ and $g_e$ in (3) become
$$f_e(e, \hat{x}, \hat{u}, \delta) := f(e + \pi(\hat{x}), \delta) - \frac{\partial \pi}{\partial \hat{x}} \big(\hat{f}(\hat{x}) + \hat{g}(\hat{x})\hat{u}\big), \qquad g_e(e, \hat{x}, \delta) := g(e + \pi(\hat{x}), \delta).$$
Without any modification, optimization (12) can still be used to compute parametric error bounds and a control law for the error dynamics with model reduction. However, the optimization for finding the optimal parameter needs to change, since the constraint (8) now becomes $\pi(\hat{X}_\theta) \oplus O_\theta \subseteq X$, where $\pi(\hat{X}_\theta) := \{\eta \in \mathbb{R}^n : \eta = \pi(\hat{x}), \ \hat{p}_x(\hat{x}) \leq \hat{h}^\theta_x\}$. Then, the Minkowski sum of $\pi(\hat{X}_\theta)$ and $O_\theta$ amounts to
$$\pi(\hat{X}_\theta) \oplus O_\theta = \{x \in \mathbb{R}^n : x = \eta + e, \ \eta = \pi(\hat{x}), \ \hat{p}_x(\hat{x}) \leq \hat{h}^\theta_x, \ V(e, \theta) \leq \gamma\} = \{x \in \mathbb{R}^n : \hat{p}_x(\hat{x}) \leq \hat{h}^\theta_x, \ V(x - \pi(\hat{x}), \theta) \leq \gamma\}.$$
To render the parameter selection process tractable, we look for a maximum allowable box bound $\bar{\theta}$ that makes $\pi(\hat{X}_\theta) \oplus O_\theta \subseteq X$, $\forall \theta \in [0, \bar{\theta}]$, feasible by replacing the constraint in (18) with the following constraint:
$$p + s_d \cdot \big(\hat{p}_x(\hat{x}) - \hat{h}^\theta_x\big) + s_e \cdot \big(V(x - \pi(\hat{x}), \theta) - \gamma\big) - \sum_{j=1}^{n_\theta} (s_f)_j \cdot (\theta)_j \big((\bar{\theta})_j - (\theta)_j\big) \in \Sigma[(x, \hat{x}, \theta)],$$
where $s_d, s_e, (s_f)_j \in \Sigma[(x, \hat{x}, \theta)]$ for all $j \in \{1, \dots, n_\theta\}$.

V. PLANNER FEASIBILITY AND TRACKER CONSTRAINT SATISFACTION
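The reduced-order error dynamics can again be sanity-checked numerically: with $e = x - \pi(\hat{x})$, the chain-rule term $\frac{\partial \pi}{\partial \hat{x}}\dot{\hat{x}}$ must make $f_e + g_e u$ agree with $\dot{x} - \frac{\partial \pi}{\partial \hat{x}}\dot{\hat{x}}$ pointwise. A toy case with $n = 2$, $\hat{n} = 1$ and a linear lift (every model here is a hypothetical stand-in):

```python
# Tracker: xdot = f(x) + g*u with f linear; planner: xhdot = -xh + uh.
def f(x, d):      return [x[1], -x[0] - 0.5 * x[1] + d]
def g(x, d):      return [0.0, 1.0]
def f_hat(xh):    return -xh
def g_hat(xh):    return 1.0

def pi_map(xh):   return [xh, 0.5 * xh]   # linear lift pi: R^1 -> R^2
def dpi(xh):      return [1.0, 0.5]       # Jacobian d pi / d xh

def err_dyn(e, xh, u, uh, d):
    """f_e + g_e*u as defined in Section IV, with x reconstructed as e + pi(xh)."""
    x_rec = [e[i] + pi_map(xh)[i] for i in range(2)]
    xh_dot = f_hat(xh) + g_hat(xh) * uh
    return [f(x_rec, d)[i] + g(x_rec, d)[i] * u - dpi(xh)[i] * xh_dot
            for i in range(2)]

x, xh, u, uh, d = [0.3, -0.2], 0.1, 0.4, -0.1, 0.05
e = [x[i] - pi_map(xh)[i] for i in range(2)]
lhs = err_dyn(e, xh, u, uh, d)
xh_dot = f_hat(xh) + g_hat(xh) * uh
rhs = [f(x, d)[i] + g(x, d)[i] * u - dpi(xh)[i] * xh_dot for i in range(2)]
print(max(abs(lhs[i] - rhs[i]) for i in range(2)) < 1e-9)  # True
```

For a nonlinear $\pi$, `dpi` would be the state-dependent Jacobian, but the identity is the same.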
Once $\bar{\theta}^\star$ is fixed, the high-level planner, that is, the MPC, just has to solve the following reformulation of (4):
$$\begin{aligned} \min_{\hat{U}_t} \ & \sum_{k=0}^{N-1} \left(\hat{x}_{k|t}^\top Q \hat{x}_{k|t} + \hat{u}_{k|t}^\top R \hat{u}_{k|t}\right) + \hat{x}_{N|t}^\top P_N \hat{x}_{N|t} \\ \text{s.t. } \ & \hat{x}_{k+1|t} = \hat{F}_d(\hat{x}_{k|t}, \hat{u}_{k|t}, T_s), \\ & \hat{x}_{k|t} \in \hat{X}_{\bar{\theta}^\star}, \ \hat{u}_{k|t} \in \hat{U}_{\bar{\theta}^\star}, \ \forall k \in \{0, \dots, N-1\}, \\ & \hat{x}_{t|t} = \hat{x}(t), \ \hat{x}_{N|t} \in \hat{X}_N \subseteq \hat{X}_{\bar{\theta}^\star}, \end{aligned} \tag{20}$$
with $Q, R, P_N \succ 0$. We solve (20) at any time $t$ and then apply the first input
$$\hat{u}(t) = \hat{u}^\star_{0|t} \tag{21}$$
to (2). We then re-solve (20) at the next instant $t + T_s$ and repeat in receding horizon fashion.

Assumption 3:
We assume recursive feasibility of (20). That is, if (20) is feasible at time $t = 0$, it remains feasible for all times $t \geq 0$ when (21) is applied to (2). Recursive feasibility of a nonlinear planner can be achieved by picking a "long" prediction horizon $N$, as mentioned in [24], [8]. However, in this case problem (20) remains non-convex. An alternative way of ensuring recursive feasibility of (20) while solving a convex problem is to resort to linear time-invariant planner dynamics $\hat{x}(t + T_s) = \hat{A}\hat{x}(t) + \hat{B}\hat{u}(t)$ and then appropriately choose the terminal conditions $\hat{X}_N$ and $P_N$. The matrices $\hat{A}, \hat{B}$ can be chosen with an OLS approximation [25] of (2).

Proposition 3:
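The OLS route to the LTI planner model can be sketched directly: simulate a nonlinear system, collect state/input/next-state triples, and fit $\hat{x}(t + T_s) = \hat{A}\hat{x}(t) + \hat{B}\hat{u}(t)$ by least squares. The scalar system below is a made-up stand-in for (2), not the paper's model:

```python
import math, random

# Hypothetical nonlinear planner step: x+ = x + Ts*(-sin(x) + u).
Ts = 0.05
def step(x, u):
    return x + Ts * (-math.sin(x) + u)

random.seed(1)
data = []
for _ in range(400):                     # excite with random small states/inputs
    x = random.uniform(-0.3, 0.3)        # small angles, where sin(x) ~ x
    u = random.uniform(-1.0, 1.0)
    data.append((x, u, step(x, u)))

# Normal equations for x+ ~ a*x + b*u (two unknowns, closed form).
Sxx = sum(x * x for x, u, y in data); Sxu = sum(x * u for x, u, y in data)
Suu = sum(u * u for x, u, y in data)
Sxy = sum(x * y for x, u, y in data); Suy = sum(u * y for x, u, y in data)
det = Sxx * Suu - Sxu * Sxu
a = (Sxy * Suu - Suy * Sxu) / det        # plays the role of A_hat
b = (Suy * Sxx - Sxy * Sxu) / det        # plays the role of B_hat

print(round(a, 3), round(b, 3))  # close to (1 - Ts, Ts) = (0.95, 0.05)
```

The quality of such a fit depends on the excitation range; here the data are confined to the small-angle regime where the dynamics are nearly linear.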
Let problem (18) be feasible, let Assumption 3 hold, and assume the initial error states satisfy $e(0) \in \Omega$. Then the state $x(t)$ of the tracker evolving according to (1) satisfies $x(t) \in X$ for all times $t \geq 0$ under the policy $\kappa_{\bar{\theta}^\star}$.

Proof:
Let (18) be feasible and Assumption 3 hold. Since $O_{\bar{\theta}^\star}$ is a forward reachable set of $\Omega$ under the policy $\kappa_{\bar{\theta}^\star}$, we have $e(0) \in \Omega \implies e(t) \in O_{\bar{\theta}^\star}$ for all $t \geq 0$. Feasibility of (20) keeps $\hat{x}(t) \in \hat{X}_{\bar{\theta}^\star}$, and since (18) guarantees $\hat{X}_{\bar{\theta}^\star} \oplus O_{\bar{\theta}^\star} \subseteq X$, it follows that $x(t) = \hat{x}(t) + e(t) \in X$ for all $t \geq 0$.

VI. NUMERICAL EXAMPLE: DOUBLE PENDULUM
In this section we present a numerical example using our proposed Algorithm 1 and Algorithm 2. For the fully-actuated double pendulum example from [19], polynomial dynamics are obtained from a least-squares approximation over a box of joint angles $(x_1, x_2)$, yielding an input-affine model of the form
$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} x_3 \\ x_4 \\ f_3(x_1, x_2, x_3, x_4) \\ f_4(x_1, x_2, x_3, x_4) \end{bmatrix} + B \begin{bmatrix} u_1 \\ u_2 \end{bmatrix},$$
where $f_3$ and $f_4$ are polynomials in the states and $B$ is a constant input matrix. Here $x_1$ and $x_2$ are the angular positions of the first and second links (the second relative to the first link), $x_3$ and $x_4$ are the corresponding angular velocities, and $u_1$ and $u_2$ are the torques applied at joint 1 and joint 2. The angular positions and applied torques are shown in Fig. 2. The control objectives are: (i) to bring $x$ from a given initial condition to a target equilibrium and maintain it there, and (ii) to satisfy box state constraints on the two joint angles,
$$X := \{(x_1, x_2) : |x_1| \leq h_1, \ |x_2| \leq h_2\}, \tag{22}$$
for designer-specified bounds $h_1, h_2$.

A. Planner Parametrization
Based on the control objective, we use a single inverted pendulum as the low-fidelity model to generate the planning trajectories $(\hat{x}_1(t), \hat{x}_2(t))$ for the planner. The polynomial dynamics of this low-fidelity planner take the form
$$\begin{bmatrix} \dot{\hat{x}}_1 \\ \dot{\hat{x}}_2 \end{bmatrix} = \begin{bmatrix} \hat{x}_2 \\ \hat{f}_2(\hat{x}_1) \end{bmatrix} + \begin{bmatrix} 0 \\ \hat{b} \end{bmatrix} \hat{u},$$
where $\hat{f}_2$ is a polynomial fit and $\hat{b}$ a constant, $\hat{x}_1$ represents the angular position of the single inverted pendulum (shown in Fig. 2), $\hat{x}_2$ is its angular velocity, and $\hat{u}$ is the torque applied at joint 1. We want $(x_1(t), x_2(t))$ to track the planned trajectory $(\hat{x}_1(t), \hat{x}_2(t))$.
B. Error Bound Computation

In this example, $V$ is chosen to be a degree-2 polynomial in $(e, \theta)$, and $\kappa$ is chosen to be a degree-4 polynomial in $(e, \hat{x}, \hat{u}, \theta)$. The SOS optimizations in Algorithm 1 are formulated using the sum-of-squares module SOSOPT [26] in MATLAB, and solved with MOSEK [27]. After solving (12), we obtain the parametric error bound $O_\theta$ and the associated feedback controller for the tracker, $\kappa_\theta$.

C. Optimal Planner–Tracker Design
In this section we highlight the "safety by design" aspect of Algorithm 2, as a consequence of solving (18). Instead of fixing the planner constraint sets $\hat{X}$ and $\hat{U}$ heuristically as in [19], we enmesh the planner and tracker design phases, looking for the best parameter $\bar{\theta}^\star$ in (23) that satisfies (8). The inclusion of Algorithm 2 inherently ensures safety (satisfaction of the constraints (22) by the tracker states $x(t)$ for all times $t$) by design, while simultaneously allowing for the maximum permissiveness of the planner in (20). For the following simulations, the planner is initialized consistently with the tracker state, so that $e(0) \in \Omega$.
1) Failure of Heuristics:
In search of the most permissive planner, the first design scenario sets $\hat{X}_{\tilde{\theta}} = X$, i.e., the largest admissible $\tilde{\theta}$ in (23). As expected, the tracker can easily violate the safety constraints (22). To satisfy (22) with the tracker, we next pick two smaller heuristic values of $\tilde{\theta}$ in the next two cases. We see from Fig. 3 and Fig. 4 that the corresponding error bounds $O_{\tilde{\theta}}$ both cross the safety constraints $X$ given in (22). Hence both heuristic parameters are rendered invalid. In fact, in Fig. 3 we also see the tracker trajectory violating (22).

Fig. 3: Planner design using the first heuristic choice of $\tilde{\theta}$. The dashed purple curve denotes the planner trajectory and the solid black curve is the corresponding tracker trajectory.

Fig. 4: Planner design using the second heuristic choice of $\tilde{\theta}$.
2) Optimal Parametrization:
Using our Algorithm 2, we compute the optimal $\bar{\theta}^\star$; the most permissive planner state constraint set $\hat{X}_{\bar{\theta}^\star}$ is a box on $(\hat{x}_1, \hat{x}_2)$ with the bounds returned by (18). In Fig. 5, the planner uses $\hat{X}_{\bar{\theta}^\star}$ as the state constraint. We can see that the error bound $O_{\bar{\theta}^\star}$ around the planner trajectory remains within $X$, which guarantees the safety of the tracker trajectory; the tracker trajectory never violates $X$. This highlights that Algorithm 2 provides safety guarantees, and enables the designer to avoid repeated planner–tracker redesign in search of safety.

Fig. 5: Planner and tracker design with the optimal $\bar{\theta}^\star$.

VII. CONCLUSIONS
We presented an optimization-based, safe-by-design approach to trajectory planning and tracking for nonlinear systems. Instead of heuristically picking the constraints imposed on the planner, we parametrized them with additional design parameters. Consequently, the tracking error bound and the tracking control law are parametrized too, and are computed through Sum-of-Squares programming (Algorithm 1). The optimal design parameters are then chosen (Algorithm 2) to ensure tracker safety along with maximum permissiveness of the planner.

ACKNOWLEDGEMENTS

We thank Professor Francesco Borrelli for providing helpful comments. This work was supported in part by the grants ONR-N00014-18-1-2209, ONR-N00014-18-1-2833, AFOSR FA9550-18-1-0253, and NSF ECCS-1906164.

REFERENCES

[1] S. M. LaValle, Planning Algorithms. Cambridge University Press, 2006.
[2] B. Paden, M. Čáp, S. Z. Yong, D. Yershov, and E. Frazzoli, "A survey of motion planning and control techniques for self-driving urban vehicles," IEEE Transactions on Intelligent Vehicles, vol. 1, no. 1, pp. 33–55, March 2016.
[3] D. González, J. Pérez, V. Milanés, and F. Nashashibi, "A review of motion planning techniques for automated vehicles," IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 4, pp. 1135–1145, 2015.
[4] J. B. Rawlings and D. Q. Mayne, Model Predictive Control: Theory and Design. Nob Hill Publishing, 2009.
[5] F. Borrelli, A. Bemporad, and M. Morari, Predictive Control for Linear and Hybrid Systems. Cambridge University Press, 2017.
[6] B. Kouvaritakis and M. Cannon, Model Predictive Control: Classical, Robust and Stochastic. Springer, 2016.
[7] M. V. Kothare, V. Balakrishnan, and M. Morari, "Robust constrained model predictive control using linear matrix inequalities," Automatica, vol. 32, no. 10, pp. 1361–1379, 1996.
[8] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. Scokaert, "Constrained model predictive control: Stability and optimality," Automatica, vol. 36, no. 6, pp. 789–814, 2000.
[9] W. Langson, I. Chryssochoos, S. V. Rakovic, and D. Q. Mayne, "Robust model predictive control using tubes," Automatica, vol. 40, pp. 125–133, 2004.
[10] S. V. Rakovic, B. Kouvaritakis, R. Findeisen, and M. Cannon, "Homothetic tube model predictive control," Automatica, vol. 48, pp. 1631–1638, 2012.
[11] J. Köhler, R. Soloperto, M. A. Müller, and F. Allgöwer, "A computationally efficient robust model predictive control framework for uncertain nonlinear systems," submitted to IEEE Transactions on Automatic Control, 2019.
[12] T. Koller, F. Berkenkamp, M. Turchetta, and A. Krause, "Learning-based model predictive control for safe exploration," in Proc. IEEE Conference on Decision and Control (CDC), Dec 2018, pp. 6059–6066.
[13] S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge University Press, 2004.
[14] S. Di Cairano and F. Borrelli, "Reference tracking with guaranteed error bound for constrained linear systems," IEEE Transactions on Automatic Control, vol. 61, no. 8, pp. 2245–2250, Aug 2016.
[15] S. L. Herbert, M. Chen, S. Han, S. Bansal, J. F. Fisac, and C. J. Tomlin, "FaSTrack: A modular framework for fast and guaranteed safe motion planning," in Proc. IEEE Conference on Decision and Control (CDC), Dec 2017, pp. 1517–1522.
[16] S. Singh, A. Majumdar, J. Slotine, and M. Pavone, "Robust online motion planning via contraction theory and convex optimization," in Proc. IEEE International Conference on Robotics and Automation (ICRA), May 2017, pp. 5883–5890.
[17] S. Kousik, S. Vaskov, F. Bu, M. Johnson-Roberson, and R. Vasudevan, "Bridging the gap between safety and real-time performance in receding-horizon trajectory design for mobile robots," arXiv preprint arXiv:1809.06746, 2018.
[18] S. Singh, M. Chen, S. L. Herbert, C. J. Tomlin, and M. Pavone, "Robust tracking with model mismatch for fast and safe planning: an SOS optimization approach," arXiv preprint arXiv:1808.00649, 2018.
[19] S. Smith, H. Yin, and M. Arcak, "Continuous abstraction of nonlinear systems using sum-of-squares programming," arXiv preprint arXiv:1909.06468, 2019.
[20] P. Parrilo, "Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization," PhD thesis, California Institute of Technology, 2000.
[21] Z. Jarvis-Wloszek, R. Feeley, W. Tan, K. Sun, and A. Packard, "Controls applications of sum of squares programming," in Positive Polynomials in Control. Springer, Berlin, Heidelberg, 2005, vol. 312.
[22] S. H. Nair and R. N. Banavar, "Discrete optimal control of interconnected mechanical systems," arXiv preprint arXiv:1809.09191, 2018.
[23] H. Yin, M. Arcak, A. Packard, and P. Seiler, "Backward reachability for polynomial systems on a finite horizon," arXiv preprint arXiv:1907.03225, 2019.
[24] H. Michalska and D. Q. Mayne, "Robust receding horizon control of constrained nonlinear systems," IEEE Transactions on Automatic Control, vol. 38, no. 11, pp. 1623–1633, Nov 1993.
[25] S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu, "On the sample complexity of the linear quadratic regulator," arXiv preprint arXiv:1710.01688, 2017.
[26] P. Seiler, "SOSOPT: A toolbox for polynomial optimization," arXiv preprint arXiv:1308.1889, 2013.
[27] MOSEK ApS, "The MOSEK optimization toolbox for MATLAB manual," http://docs.mosek.com/8.1/toolbox/index.html.