Chance-Constrained Optimal Covariance Steering with Iterative Risk Allocation
Joshua Pilipovsky* Panagiotis Tsiotras†

Abstract—This paper extends the optimal covariance steering problem for linear stochastic systems subject to chance constraints to account for optimal risk allocation. Previous works have assumed a uniform risk allocation to cast the optimal control problem as a semi-definite program (SDP), which can be solved efficiently using standard SDP solvers. We adopt the Iterative Risk Allocation (IRA) formalism from [1], which uses a two-stage approach to solve the optimal risk allocation problem for covariance steering. The upper stage of IRA optimizes the risk, which is proved to be a convex problem, while the lower stage optimizes the controller with the new constraints. This is done iteratively so as to find the optimal risk allocation that achieves the lowest total cost. The proposed framework results in solutions that tend to maximize the terminal covariance, while still satisfying the chance constraints, thus leading to less conservative solutions than previous methodologies. We also introduce two novel convex relaxation methods to approximate quadratic chance constraints as second-order cone constraints. We finally demonstrate the approach on a spacecraft rendezvous problem and compare the results.
I. INTRODUCTION
In this paper we address the problem of finite-horizon stochastic optimal control of a discrete linear time-varying (LTV) system with time-independent white-noise Gaussian diffusion. The control task is to steer the state from an initial Gaussian distribution to a final Gaussian distribution with known statistics. In addition to the boundary conditions, we consider chance constraints that restrict the probability of violating the state constraints to be less than a certain threshold. Hard state constraints are difficult to impose in stochastic systems because the noise can be unbounded, so chance constraints deal with this problem by allowing a small, but finite, probability of violating the constraints. In the literature, there are two kinds of chance constraints: individual and joint [2]. Individual chance constraints limit the probability of violating each constraint, while joint chance constraints limit the probability of violating any constraint over the whole time horizon. In this paper, we consider the case of joint chance constraints, because they are a more natural choice for most applications.

The control of stochastic systems can be best formulated as a problem of controlling the distribution of trajectories over time. Moreover, Gaussian distributions are completely characterized by their first and second moments, so the control problem can be thought of as one of steering the mean and the covariance to their terminal values. The problem of covariance control has a history dating back to the '80s, with the works of Hotz and Skelton [3], [4].

* J. Pilipovsky is a Graduate Student at the School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0150, USA. Email: [email protected]
† P. Tsiotras is the David & Lewis Chair and Professor at the School of Aerospace Engineering and the Institute for Robotics & Intelligent Machines, Georgia Institute of Technology, Atlanta, GA 30332-0150, USA. Email: [email protected]
Much of the early work focused on the infinite-horizon problem, where the state covariances asymptotically approach their terminal values. Only recently has the finite-horizon problem drawn attention, with much of the early work focusing on the covariance steering (CS) problem, namely, the problem of steering an initial distribution to a final distribution at a specific final time step subject to LTV dynamics. The problem can be thought of as a linear-quadratic Gaussian (LQG) problem with a condition on the terminal covariance [5]. Moreover, it has been shown that the finite-horizon controller can be constructed as a state-feedback controller and the problem can be formulated as a convex program [5], [6], or as the solution of a pair of Lyapunov differential equations coupled through their boundary conditions [7], [8]. Alternatively, for certain special cases one can solve the CS problem directly by solving an LQ stochastic problem with a particular choice of cost weights [9]. Other approaches [10], [11] use an affine disturbance feedback controller having two components, one that steers the mean state and the other that steers the covariance.

In general, the theory of steering marginal distributions has a long history stemming from the problem of Schrödinger bridges and optimal mass transport [8], [12]–[14]. Recent work has focused on incorporating physical constraints on the system, such as state chance constraints [15], obstacles in path-planning environments [11], input hard constraints [16], incomplete state information [17], and extensions in the context of stochastic model predictive control [18] and nonlinear systems [19]–[21].

In this work, we extend the Covariance Steering Chance Constraint (CSCC) problem to account for optimal risk allocation. By risk allocation we mean allocating the probability of violating each individual chance constraint at each time step.
For example, if there are $M$ chance constraints and $N$ time steps, there would be $NM$ total allocations for the whole problem. Previous works [9], [11], [15], [16], [18] have assumed a constant risk allocation, so that the resulting problem can be turned into a semi-definite program (SDP). Here, however, we adopt a two-stage algorithm that optimizes the risk distribution over all time steps, and subsequently optimizes the controller by solving an SDP. Other works have tried to optimize the risk using techniques such as ellipsoidal relaxation [22] and particle control [23]. However, ellipsoidal relaxation techniques are overly conservative and lead to highly suboptimal solutions. Particle control methods are computationally too demanding, since the number of decision variables grows with the number of samples. The two-stage risk allocation scheme proposed in this paper is computed iteratively until the cost is within a given tolerance of the minimum, from which we get the optimal risk allocation for the problem, as well as the optimal controller.

Previous works on chance-constrained optimization and CS use polyhedral chance constraints, since they can be represented as intersections of linear inequalities [24]. This formulation results in some favorable properties that help with the optimization. However, in many applications the constraints are in the form of a conical region (e.g., line-of-sight (LOS) constraints). Approximating such cone constraints with intersecting planes would make the problem rather large for high-accuracy approximations. In this work, we also present a way to approximate such cone chance constraints (as special cases of general quadratic constraints) in terms of two-sided polyhedral constraints. We then apply this formulation to the case of LOS cone chance constraints, and compare with a polyhedral approximation.
Additionally, we present a geometric relaxation of the cone chance constraints, which is less conservative than the two-sided approximation. To illustrate the proposed risk allocation algorithm we use as an example a rendezvous problem between two spacecraft, in which the approaching spacecraft has to remain within a predetermined LOS region during the whole maneuver. Both polyhedral and cone LOS constraints are investigated and compared.

The paper is structured as follows: In Section II we define the general stochastic optimal control problem for steering a distribution from an initial Gaussian to a terminal Gaussian with joint state chance constraints. In Section III we review the two-stage risk allocation formalism, and formulate the SDP for the optimal controller as well as the proposed iterative risk allocation algorithm. In Section IV we present two different convex relaxations of quadratic chance constraints, one in terms of a two-sided linear constraint relaxation, and the other based on a geometric construction. Finally, in Section V we apply the theory to the spacecraft rendezvous and docking problem with both polyhedral and cone chance constraints.

II. PROBLEM STATEMENT
We consider the following discrete-time stochastic time-varying system subject to noise
$$x_{k+1} = A_k x_k + B_k u_k + D_k w_k, \quad (1)$$
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, with time steps $k = 0, \ldots, N-1$, where $N$ represents the finite horizon. The uncertainty $w \in \mathbb{R}^r$ is a zero-mean white Gaussian noise with unit covariance, i.e., $\mathbb{E}[w_k] = 0$ and $\mathbb{E}[w_{k_1} w_{k_2}^\top] = I_r \delta_{k_1,k_2}$. Additionally, we assume that $\mathbb{E}[x_{k_1} w_{k_2}^\top] = 0$ for $0 \le k_1 \le k_2 \le N$. The initial state $x_0$ is a random vector drawn from the normal distribution
$$x_0 \sim \mathcal{N}(\mu_0, \Sigma_0), \quad (2)$$
where $\mu_0 \in \mathbb{R}^n$ is the initial state mean and $\Sigma_0 \in \mathbb{R}^{n \times n}$, $\Sigma_0 > 0$, is the initial state covariance. The objective is to steer the trajectories of (1) from the initial distribution (2) to the terminal distribution
$$x_f \sim \mathcal{N}(\mu_f, \Sigma_f), \quad (3)$$
where $\mu_f \in \mathbb{R}^n$ and $\Sigma_f > 0$ are the state mean and covariance at time $N$, respectively. The cost function to be minimized is
$$J(u_0, \ldots, u_{N-1}) := \mathbb{E}\Big[\sum_{k=0}^{N-1} x_k^\top Q_k x_k + u_k^\top R_k u_k\Big], \quad (4)$$
where $Q_k \ge 0$ and $R_k > 0$ for all $k = 0, \ldots, N-1$. Additionally, and over the whole horizon, we impose the following joint chance constraint that limits the probability of state violation to be less than a pre-specified threshold, i.e.,
$$\mathbb{P}\Big(\bigvee_{k=0}^{N} x_k \notin \mathcal{X}\Big) \le \Delta, \quad (5)$$
where $\mathbb{P}(\cdot)$ denotes the probability of an event, $\mathcal{X} \subset \mathbb{R}^n$ is the state constraint set, and $\Delta \in (0, 0.5)$.

Remark 1: We assume that the system (1) is controllable, that is, for any $x_0, x_f \in \mathbb{R}^n$, and no noise ($w_k \equiv 0$, $k = 0, \ldots, N-1$), there exists a sequence of control inputs $\{u_k\}_{k=0}^{N-1}$ that steers the system from $x_0$ to $x_f$.

First, we provide an alternative description of the system (1) in order to solve the problem at hand.
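To make the stacked description used below concrete, the following sketch builds the block matrices $\mathcal{A}$, $\mathcal{B}$, $\mathcal{D}$ from the step matrices $A_k$, $B_k$, $D_k$. The helper name `lift_dynamics` and its interface are ours, for illustration; the paper only states that these matrices are defined accordingly.

```python
import numpy as np

def lift_dynamics(A_list, B_list, D_list):
    """Stack x_{k+1} = A_k x_k + B_k u_k + D_k w_k over k = 0..N-1
    into X = calA x0 + calB U + calD W, with X in R^{(N+1)n}."""
    N = len(A_list)
    n = A_list[0].shape[0]
    m = B_list[0].shape[1]
    r = D_list[0].shape[1]
    # Phi[k] = A_{k-1} ... A_0, the transition matrix from step 0 to k.
    Phi = [np.eye(n)]
    for k in range(N):
        Phi.append(A_list[k] @ Phi[-1])
    calA = np.vstack(Phi)
    calB = np.zeros(((N + 1) * n, N * m))
    calD = np.zeros(((N + 1) * n, N * r))
    for k in range(1, N + 1):          # block row: state at step k
        for j in range(k):             # input/noise applied at step j
            T = np.eye(n)              # transition from step j+1 to k
            for i in range(j + 1, k):
                T = A_list[i] @ T
            calB[k * n:(k + 1) * n, j * m:(j + 1) * m] = T @ B_list[j]
            calD[k * n:(k + 1) * n, j * r:(j + 1) * r] = T @ D_list[j]
    return calA, calB, calD
```

Propagating the recursion step by step and comparing against the stacked form recovers the same trajectory, which is a useful check when implementing the lifted dynamics.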
Using [9], [11], [15], [16], [18], we can reformulate (1) as
$$X = \mathcal{A} x_0 + \mathcal{B} U + \mathcal{D} W, \quad (6)$$
where $X := [x_0^\top, \ldots, x_N^\top]^\top \in \mathbb{R}^{(N+1)n}$, $U := [u_0^\top, \ldots, u_{N-1}^\top]^\top \in \mathbb{R}^{Nm}$, and $W := [w_0^\top, \ldots, w_{N-1}^\top]^\top \in \mathbb{R}^{Nr}$ are the state, input, and disturbance sequences, respectively. The matrices $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{D}$ are defined accordingly [9]. Using this notation, we can write the cost function compactly as
$$J(U) = \mathbb{E}[X^\top \bar{Q} X + U^\top \bar{R} U], \quad (7)$$
where $\bar{Q}$ and $\bar{R}$ are defined accordingly. Note that since $Q_k \ge 0$ and $R_k > 0$ for all $k = 0, \ldots, N-1$, it follows that $\bar{Q} \ge 0$ and $\bar{R} > 0$. The initial and terminal conditions (2) and (3) can be written as
$$\mu_0 = E_0 \mathbb{E}[X], \quad \Sigma_0 = E_0 \Sigma_X E_0^\top, \quad (8)$$
and
$$\mu_f = E_N \mathbb{E}[X], \quad \Sigma_f = E_N \Sigma_X E_N^\top, \quad (9)$$
where $\Sigma_X := \mathbb{E}[XX^\top] - \mathbb{E}[X]\mathbb{E}[X]^\top$, and $E_k := [0_{n,kn}, I_n, 0_{n,(N-k)n}]$ picks out the $k$th component of a vector. Consequently, the state chance constraints (5) can be written as
$$\mathbb{P}\Big(\bigvee_{k=1}^{N} E_k X \notin \mathcal{X}\Big) \le \Delta. \quad (10)$$
In summary, we wish to solve the following stochastic optimal control problem.

Problem 1: Given the system (6), find the optimal control sequence $U^*$ that minimizes the objective function (7), subject to the initial state (8), terminal state (9), and the state chance constraints (10).

III. CHANCE CONSTRAINED COVARIANCE STEERING WITH RISK ALLOCATION
A. Lower-Stage Covariance Steering
Borrowing from the work in [11], we adopt the control policy
$$u_k = v_k + K_k y_k, \quad (11)$$
where $v_k \in \mathbb{R}^m$, $K_k \in \mathbb{R}^{m \times n}$, and $y_k \in \mathbb{R}^n$ is given by
$$y_{k+1} = A_k y_k + D_k w_k, \quad (12a)$$
$$y_0 = x_0 - \mu_0. \quad (12b)$$

Remark 2: The proposed control scheme (11)-(12) leads to a convex programming formulation of Problem 1, as follows.

Using (11)-(12), we can write the control sequence as
$$U = V + K(\mathcal{A} y_0 + \mathcal{D} W), \quad (13)$$
where $V := [v_0^\top, \ldots, v_{N-1}^\top]^\top \in \mathbb{R}^{Nm}$ and $K \in \mathbb{R}^{Nm \times (N+1)n}$ is a matrix containing the gains $K_k$. It follows that the dynamics can be decoupled into a mean and an error state as follows
$$\bar{X} = \mathbb{E}[X] = \mathcal{A}\mu_0 + \mathcal{B}V, \quad (14)$$
$$\tilde{X} = X - \bar{X} = (I + \mathcal{B}K)(\mathcal{A} y_0 + \mathcal{D} W). \quad (15)$$
Additionally, the cost function takes the form
$$J(V, K) = (\mathcal{A}\mu_0 + \mathcal{B}V)^\top \bar{Q} (\mathcal{A}\mu_0 + \mathcal{B}V) + V^\top \bar{R} V + \mathrm{tr}\big[\big((I + \mathcal{B}K)^\top \bar{Q}(I + \mathcal{B}K) + K^\top \bar{R} K\big)\Sigma_Y\big], \quad (16)$$
where $\Sigma_Y := \mathcal{A}\Sigma_0\mathcal{A}^\top + \mathcal{D}\mathcal{D}^\top$. The terminal constraints can be reformulated as
$$\mu_f = E_N(\mathcal{A}\mu_0 + \mathcal{B}V), \quad (17a)$$
$$\Sigma_f = E_N(I + \mathcal{B}K)\Sigma_Y(I + \mathcal{B}K)^\top E_N^\top. \quad (17b)$$
Qualitatively speaking, $V$ steers the mean of the system to $\mu_f$, while $K$ steers the covariance to $\Sigma_f$. In order to make the problem convex, we relax the terminal covariance constraint (17b) to $\Sigma_f \ge E_N(I + \mathcal{B}K)\Sigma_Y(I + \mathcal{B}K)^\top E_N^\top$, which can be written as the linear matrix inequality (LMI)
$$\begin{bmatrix} \Sigma_f & E_N(I + \mathcal{B}K)\Sigma_Y^{1/2} \\ \Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_N^\top & I \end{bmatrix} \ge 0. \quad (18)$$

B. Polyhedral Chance Constraints
When dealing with the risk allocation problem, it is customary to assume that the state constraint set $\mathcal{X}$ is a convex polytope $\mathcal{X}_p$, so that
$$\mathcal{X}_p := \bigcap_{j=1}^{M} \{x : \alpha_j^\top x \le \beta_j\}, \quad (19)$$
where $\alpha_j \in \mathbb{R}^n$ and $\beta_j \in \mathbb{R}$. Under this assumption, the probability of violating the state constraints (10) can be written as
$$\mathbb{P}\Big(\bigvee_{k=1}^{N}\bigvee_{j=1}^{M} \alpha_j^\top E_k X > \beta_j\Big) \le \Delta. \quad (20)$$
Equation (20) encodes the requirement that the joint probability of violating any of the $M$ state constraints over the horizon $N$ is less than or equal to $\Delta$. Using Boole's inequality [25], [26], one can decompose the joint chance constraint into individual chance constraints as follows
$$\mathbb{P}\big(\alpha_j^\top E_k X \le \beta_j\big) \ge 1 - \delta_{jk}, \quad k = 1, \ldots, N, \; j = 1, \ldots, M, \quad (21)$$
with
$$\sum_{k=1}^{N}\sum_{j=1}^{M} \delta_{jk} \le \Delta, \quad (22)$$
where each $\delta_{jk}$ represents the probability of violating the $j$th constraint at time step $k$. Notice that the probability in (21) is of a random variable with mean $\alpha_j^\top E_k \bar{X}$ and variance $\alpha_j^\top E_k \Sigma_X E_k^\top \alpha_j$. Thus, (21) can be equivalently written as
$$\Phi\Bigg(\frac{\beta_j - \alpha_j^\top E_k \bar{X}}{\sqrt{\alpha_j^\top E_k \Sigma_X E_k^\top \alpha_j}}\Bigg) \ge 1 - \delta_{jk}, \quad (23)$$
where $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal distribution. Simplifying (23) and noting that $\Sigma_X = (I + \mathcal{B}K)\Sigma_Y(I + \mathcal{B}K)^\top$ yields
$$\alpha_j^\top E_k(\mathcal{A}\mu_0 + \mathcal{B}V) + \|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top \alpha_j\| \, \Phi^{-1}(1 - \delta_{jk}) \le \beta_j. \quad (24)$$

Remark 3: Since $\Sigma_0 > 0$, it follows that $\Sigma_Y > 0$, and $\Sigma_Y^{1/2}$ in (24) can be computed from its Cholesky decomposition.

The expression in (24) gives $NM$ inequality constraints for the optimization problem. In summary, Problem 1 is converted into a convex programming problem.
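As a sanity check of the deterministic reformulation (24), the snippet below tightens a single half-space constraint by $\Phi^{-1}(1-\delta_{jk})$ standard deviations and verifies the implied violation probability by Monte Carlo. The function name and the toy numbers are ours, for illustration only.

```python
import numpy as np
from scipy.stats import norm

def halfspace_ok(alpha, beta, xbar, S, delta):
    """Deterministic form of P(alpha' x <= beta) >= 1 - delta for
    x ~ N(xbar, S S'); S plays the role of the factor
    Sigma_Y^{1/2} (I + calB K)' E_k' appearing in (24)."""
    return alpha @ xbar + np.linalg.norm(S.T @ alpha) * norm.ppf(1 - delta) <= beta

rng = np.random.default_rng(1)
alpha = np.array([1.0, 0.5])
xbar = np.array([0.2, -0.1])
S = np.array([[0.3, 0.0], [0.1, 0.2]])
delta = 0.05
# Smallest beta for which the tightened constraint (24) holds with equality.
beta = alpha @ xbar + np.linalg.norm(S.T @ alpha) * norm.ppf(1 - delta)
x = xbar + rng.standard_normal((100_000, 2)) @ S.T   # samples of x
viol = np.mean(x @ alpha > beta)                     # empirical violation rate
```

When the tightened constraint is active, the empirical violation rate `viol` matches the allocated risk $\delta$ up to Monte Carlo error, which is exactly the equivalence between (21) and (24).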
Problem 2: Given the system (14) and (15), find the optimal control sequences $V^*$ and $K^*$ that minimize the cost function (16) subject to the terminal state constraints (17a) and (18), and the individual chance constraints (24).

Remark 4: Note that it is not possible to decouple the mean and covariance controllers in the presence of chance constraints, because of (24).
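The relaxation (18) rests on the standard Schur-complement argument: the block matrix is positive semidefinite exactly when $\Sigma_f - MM^\top \ge 0$, where $M = E_N(I+\mathcal{B}K)\Sigma_Y^{1/2}$. A quick numerical illustration with random placeholder matrices (not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 3, 5
M = rng.standard_normal((n, p))        # stands in for E_N (I + calB K) Sigma_Y^{1/2}
Sigma_f = M @ M.T + 0.5 * np.eye(n)    # chosen so Sigma_f - M M' > 0

def psd(X, tol=1e-9):
    """Positive semidefiniteness via the smallest eigenvalue."""
    return bool(np.linalg.eigvalsh((X + X.T) / 2).min() >= -tol)

lmi = np.block([[Sigma_f, M], [M.T, np.eye(p)]])   # the LMI of (18)
schur = Sigma_f - M @ M.T                          # its Schur complement
```

Both `psd(lmi)` and `psd(schur)` agree, and they flip together if $\Sigma_f$ is shrunk below $MM^\top$.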
C. Risk Allocation Optimization
Since the $\delta_{jk}$ are decision variables in (24), the constraints are bilinear, which makes this problem difficult to solve. As mentioned previously, in order to transform Problem 2 to a more tractable form, the allocation of the risk levels $\delta_{jk}$ may be assumed to be fixed to some pre-specified values, usually uniformly. In this case, the $\delta_{jk}$ are no longer decision variables and the problem can be efficiently solved as an SDP. However, a better approach is to allocate the $\delta_{jk}$ concurrently while solving the optimization Problem 2, so as to minimize the total cost. This gives rise to a natural two-stage optimization framework [1].

According to the approach in [1], the upper stage optimization finds the optimal risk allocation $\delta := [\delta_{11}, \delta_{21}, \ldots, \delta_{(M-1)N}, \delta_{MN}] \in \mathbb{R}^{NM}$, and the lower stage solves the CS problem for the optimal controller $U^*$ given the risk allocation $\delta$ from the upper stage.

Let the value of the objective function after the lower-stage optimization for a given risk allocation $\delta$ be $J^*$, that is,
$$J^*(\delta) = \min_{V,K} J(V, K), \quad (25)$$
where $J(V, K)$ is given in (16). The upper-stage optimization problem can then be formulated as follows.

Problem 3 (Risk Allocation):
$$\min_{\delta} J^*(\delta), \quad (26)$$
such that
$$\sum_{k=1}^{N}\sum_{j=1}^{M} \delta_{jk} \le \Delta, \quad (27)$$
$$\delta_{jk} > 0. \quad (28)$$
As shown in [1], Problem 3 is a convex optimization problem, given that the objective function $J(V, K)$ is convex and $\Delta \in (0, 0.5)$.

D. Iterative Risk Allocation Motivation
Even though we have formulated the solution of Problem 2 as a two-stage optimization problem, it is not yet clear how to solve Problem 3 efficiently in order to determine the optimal risk allocation. To gain insight into the solution, we first state a theorem about the monotonicity of $J^*(\delta)$.

Theorem 1. The optimal cost from solving Problem 2 is a monotonically decreasing function of $\delta_{jk}$, that is,
$$\frac{\partial J^*}{\partial \delta_{jk}} \le 0, \quad k = 1, \ldots, N, \; j = 1, \ldots, M. \quad (29)$$

Proof.
Let $\delta, \delta'$ be two risk allocations, and let $R(\delta), R(\delta')$ denote the corresponding feasible regions defined by the inequality constraints (24). If $\delta_{jk} \le \delta'_{jk}$ for all $j$ and $k$, then $R(\delta) \subseteq R(\delta')$. To see this, let us rewrite (24) as follows
$$\alpha_j^\top E_k \bar{X} \le \beta_j - \|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top \alpha_j\| \, \Phi^{-1}(1 - \delta_{jk}). \quad (30)$$
Next, and since $\Delta \in (0, 0.5)$, it follows that $\delta_{jk} \in (0, 0.5)$. Also note that in the domain $z \in [0.5, 1)$, the cumulative distribution function $\Phi(z)$ forms a convex region $M$, as shown in Figure 1. Additionally, $\Phi^{-1}(z)$ is a monotonically increasing function, and hence
$$\frac{\partial \Phi^{-1}(1 - \delta_{jk})}{\partial \delta_{jk}} < 0. \quad (31)$$
Thus, if $\delta'_{jk} \ge \delta_{jk}$, the right-hand side of (30) will be larger for $\delta'$ than it is for $\delta$. This implies that the inequality constraints are tighter for $\delta$ than for $\delta'$, which proves that $R(\delta) \subseteq R(\delta')$. This fact finally implies that $J^*(\delta) \ge J^*(\delta')$.

Fig. 1: Convex region $M$ of the inverse cumulative distribution function of the standard normal distribution.

Remark 5: The chance constraints can be written in yet another form that will prove useful below. Starting from (24), notice that we can write the chance constraints as
$$\delta_{jk} \ge 1 - \Phi\Bigg(\frac{\beta_j - \alpha_j^\top E_k \bar{X}^*}{\|\Sigma_Y^{1/2}(I + \mathcal{B}K^*)^\top E_k^\top \alpha_j\|}\Bigg) =: \bar{\delta}_{jk}. \quad (32)$$
The quantity $\bar{\delta}_{jk}$ represents the true risk experienced by the optimal trajectories, i.e., when using $(V^*, K^*)$. Clearly, the risk we have selected does not need to be equal to the actual risk once the optimization is completed. When these values are equal we will say that the constraint is active, and it is inactive otherwise. Good solutions correspond to cases where the true risk is within a small margin of the allocated risk. Many values of $\delta_{jk}$ smaller than their true counterparts would imply an overly conservative solution.

E. Iterative Risk Allocation Algorithm
We can exploit Theorem 1 in the context of CS to create an iterative risk allocation algorithm that simultaneously finds the optimal risk allocation $\delta^*$ and the optimal control pair $(V^*, K^*)$. To this end, suppose we start with some feasible risk allocation $\delta_{jk}^{(i)}$, for all $k, j$, where $i$ denotes the iteration number. Using this risk allocation, we then solve Problem 2 to get the optimal controller $(V^{*(i)}, K^{*(i)})$, which corresponds to the optimal mean trajectory $\bar{X}^{*(i)}$ at iteration $i$. Next, we construct a new risk allocation $\delta'^{(i)}_{jk}$ as follows: for all $k, j$ such that $\delta_{jk}^{(i)}$ is active, we keep the corresponding allocation the same, i.e., $\delta'^{(i)}_{jk} = \delta_{jk}^{(i)}$. However, for all $k, j$ such that $\delta_{jk}^{(i)}$ is inactive, we let $\delta'^{(i)}_{jk} < \delta_{jk}^{(i)}$, which corresponds to tightening the constraints. Since this new risk allocation is smaller, it follows from (31) that $\Phi^{-1}(1 - \delta'^{(i)}_{jk}) > \Phi^{-1}(1 - \delta_{jk}^{(i)})$. Furthermore, this implies that
$$\alpha_j^\top E_k \bar{X}^{*(i)} < \beta_j - \|\Sigma_Y^{1/2}(I + \mathcal{B}K^{*(i)})^\top E_k^\top \alpha_j\| \, \Phi^{-1}(1 - \delta'^{(i)}_{jk}) < \beta_j - \|\Sigma_Y^{1/2}(I + \mathcal{B}K^{*(i)})^\top E_k^\top \alpha_j\| \, \Phi^{-1}(1 - \delta_{jk}^{(i)}). \quad (33)$$
The constraint (33) ensures that the optimal solution for $\delta^{(i)}$ is feasible for $\delta'^{(i)}$. Furthermore, since $\delta'^{(i)}_{jk} < \delta_{jk}^{(i)}$, it follows that $R(\delta') \subseteq R(\delta)$, so the optimal solution for $\delta^{(i)}$ is also the optimal solution for $\delta'^{(i)}$ as well, hence $J^*(\delta') = J^*(\delta)$.

Next, we construct a new risk allocation $\delta_{jk}^{(i+1)}$ from $\delta'^{(i)}_{jk}$ as follows. For all $k, j$ such that $\delta'^{(i)}_{jk}$ is inactive, leave the new risk allocation the same. For all $k, j$ such that $\delta'^{(i)}_{jk}$ is active, let $\delta_{jk}^{(i+1)} > \delta'^{(i)}_{jk}$, which corresponds to relaxing the constraints.
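The tighten-and-relax iteration just described can be sketched in code. Below, `solve_lower` is a hypothetical stand-in for the lower-stage Problem 2: it returns the optimal cost and the true risks $\bar{\delta}_{jk}$ of (32) for a fixed allocation, with a toy model that caps the usable risk of each constraint to mimic active/inactive behavior. This is an illustration of the iteration, not the paper's implementation.

```python
import numpy as np

def ira(solve_lower, Delta, N, M, rho=0.7, eps=1e-8, max_iter=100):
    """Iterative risk allocation: tighten inactive constraints toward
    their true risk, then redistribute the residual risk to the
    active ones (cf. Algorithm 1 in the text)."""
    delta = np.full((N, M), Delta / (N * M))   # uniform initialization
    J, dbar = solve_lower(delta)
    history = [J]
    for _ in range(max_iter):
        active = np.isclose(delta, dbar, rtol=1e-6, atol=1e-12)
        n_act = active.sum()
        if n_act == 0 or n_act == N * M:
            break
        # tighten inactive constraints toward their true risk
        delta[~active] = rho * delta[~active] + (1 - rho) * dbar[~active]
        # redistribute the residual risk among the active constraints
        residual = Delta - delta.sum()
        delta[active] += residual / n_act
        J, dbar = solve_lower(delta)
        history.append(J)
        if abs(history[-2] - history[-1]) <= eps:
            break
    return delta, history

# Toy lower stage: constraint j can only "use" risk up to cap[j];
# the cost decreases in the usable risk. Purely illustrative.
cap = np.array([[0.002, 0.05, 0.05]])          # N = 1, M = 3
def solve_lower(delta):
    used = np.minimum(delta, cap)              # true risk, delta_bar
    return float(np.sum(1.0 / used)), used

delta_opt, history = ira(solve_lower, Delta=0.03, N=1, M=3)
```

On this toy problem the iteration drains risk away from the constraint that cannot use it (cap 0.002) and reallocates it to the binding ones, so the recorded costs decrease monotonically, consistent with Theorem 1.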
Following the same logic, Theorem 1 implies that $J^*(\delta'^{(i)}) \ge J^*(\delta^{(i+1)})$. Thus, we have laid out an iterative scheme for a sequence of risk allocations $\{\delta^{(0)}, \delta^{(1)}, \ldots, \delta^{(i)}\}$ that continually lowers the optimal cost.

This leads to Algorithm 1, which solves for the optimal risk allocation for the CS problem subject to chance constraints. Note that the algorithm is initialized with a constant risk allocation. To tighten the inactive constraints in Line 9, the corresponding risk is scaled by a parameter $0 < \rho < 1$ that weighs the current risk against the true risk from that solution. Additionally, to loosen the active constraints in Line 13, the corresponding risk is increased proportionally to the residual risk remaining.

Algorithm 1: Iterative Risk Allocation CS
Input: $\delta_{jk} \leftarrow \Delta/(NM)$, $\epsilon$, $\rho$
Output: $\delta^*$, $J^*$, $V^*$, $K^*$
1: while $|J^* - J^*_{\mathrm{prev}}| > \epsilon$ do
2:   $J^*_{\mathrm{prev}} \leftarrow J^*$
3:   Solve Problem 2 with current $\delta$ to obtain $\bar{\delta}$
4:   $\hat{N} \leftarrow$ number of indices where constraint is active
5:   if $\hat{N} = 0$ or $\hat{N} = MN$ then
6:     break
7:   end
8:   foreach $(k, j)$ such that the $j$th constraint at the $k$th time step is inactive do
9:     $\delta_{jk} \leftarrow \rho\,\delta_{jk} + (1 - \rho)\,\bar{\delta}_{jk}$
10:  end
11:  $\delta_{\mathrm{res}} \leftarrow \Delta - \sum_{k=1}^{N}\sum_{j=1}^{M} \delta_{jk}$
12:  foreach $(k, j)$ such that the $j$th constraint at the $k$th time step is active do
13:    $\delta_{jk} \leftarrow \delta_{jk} + \delta_{\mathrm{res}}/\hat{N}$
14:  end
15: end

IV. CONE CHANCE CONSTRAINTS
In many engineering applications polytopic constraints such as (19) are not realistic. Most often, the constraints have the form of a convex cone, namely, the feasible region is characterized by
$$\mathcal{X}_c := \{x \in \mathbb{R}^n : \|Ax + b\| \le c^\top x + d\}. \quad (34)$$
Cone constraints such as (34) are more realistic, as they better describe the feasible space. As with the case of a polyhedral feasible state space $\mathcal{X}_p$, we want the state to be inside $\mathcal{X}_c$ throughout the whole time horizon. However, since the dynamics are stochastic, and similar to (5), this requirement is relaxed to the condition that the probability that the state is not inside this set is less than or equal to $\Delta$. In the context of convex cone state constraints, this condition becomes
$$\mathbb{P}(\|Ax_k + b\| \le c^\top x_k + d) \ge 1 - \delta_k, \quad k = 1, \ldots, N, \quad (35a)$$
$$\sum_{k=1}^{N} \delta_k \le \Delta. \quad (35b)$$

Remark 6: Although the set $\mathcal{X}_c$ is convex, the chance constraint $\mathbb{P}(x \in \mathcal{X}_c) \ge 1 - \delta$ may not be convex. Specifically, for large $\delta$, it is possible that the chance constraint (35) is non-convex [27].

Since there is no guarantee that (35) will be a convex constraint, we need to make a convex approximation so that (35) holds for all $\Delta \in (0, 0.5)$.

A. Two-Sided Approximation of Cone Constraints
Recent work on two-sided affine chance constraints [27] has shown how to relax a general class of quadratic constraints of the form
$$\mathbb{P}\big((\hat{a}^\top \xi + \hat{b})^2 + (\hat{c}^\top \xi + \hat{d})^2 \le \kappa^2\big) \ge 1 - \epsilon, \quad (36)$$
where $\xi$ is a Gaussian random variable and $\epsilon \in (0, 0.5)$. In [27] the authors proved that (36) can be conservatively approximated by the following convex constraints
$$\mathbb{P}(|\hat{a}^\top \xi + \hat{b}| \le f_1) \ge 1 - \beta\epsilon, \quad (37a)$$
$$\mathbb{P}(|\hat{c}^\top \xi + \hat{d}| \le f_2) \ge 1 - (1 - \beta)\epsilon, \quad (37b)$$
$$f_1^2 + f_2^2 \le \kappa^2, \quad (37c)$$
where $\beta \in (0, 1)$ is a constant that balances the trade-off between violating either of the two constraints (37a)-(37b). In order to cast the cone chance constraint (35a) in the form (36), we first replace the constraint in (35a) with the chance constraint
$$\mathbb{P}(\|Ax_k + b\| \le c^\top \bar{x}_k + d) \ge 1 - \delta_k, \quad k = 1, \ldots, N. \quad (38)$$

Remark 7: The chance constraint (38) is a relaxation of the original chance constraint (35a). The proof of this result is given in Appendix A.

In order to write (38) in the form (36), square both sides of the inequality in (38) and rearrange terms to obtain
$$\mathbb{P}\big(x_k^\top A^\top A x_k + 2b^\top A x_k \le (c^\top \bar{x}_k + d)^2 - b^\top b\big) \ge 1 - \delta_k. \quad (39)$$
Letting now
$$A^\top A = \hat{a}\hat{a}^\top + \hat{c}\hat{c}^\top, \quad (40a)$$
$$A^\top b = \hat{b}\hat{a} + \hat{d}\hat{c}, \quad (40b)$$
$$(c^\top \bar{x}_k + d)^2 - b^\top b = \kappa^2 - \hat{b}^2 - \hat{d}^2, \quad (40c)$$
and identifying $\xi = x_k$ yields
$$\mathbb{P}\big(\xi^\top(\hat{a}\hat{a}^\top + \hat{c}\hat{c}^\top)\xi + 2(\hat{b}\hat{a}^\top + \hat{d}\hat{c}^\top)\xi \le \kappa^2 - \hat{b}^2 - \hat{d}^2\big) \ge 1 - \epsilon, \quad (41)$$
or, after rearranging terms and completing the squares,
$$\mathbb{P}\big((\hat{a}^\top \xi + \hat{b})^2 + (\hat{c}^\top \xi + \hat{d})^2 \le \kappa^2\big) \ge 1 - \epsilon, \quad (42)$$
which yields the desired result.

Remark 8: It should be noted that the set of equations (40) does not always have a solution. Specifically, (40a) implies that $A^\top A$ is the sum of two rank-one matrices, which is a restrictive condition.
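Whether the split (40a) exists can be checked, and the vectors $\hat{a}$, $\hat{c}$ recovered, from an eigendecomposition of $A^\top A$: the two-term split exists exactly when $A^\top A$ has rank at most two. A small sketch (the helper name is ours):

```python
import numpy as np

def split_rank_two(AtA, tol=1e-9):
    """Try to write AtA = a a' + c c' as in (40a).
    Returns (a_hat, c_hat), or raises if rank(AtA) > 2."""
    w, V = np.linalg.eigh(AtA)                 # eigenvalues in ascending order
    if (w[:-2] > tol).any():                   # more than two nonzero eigenvalues
        raise ValueError("A'A has rank > 2; no two-term split exists")
    a_hat = np.sqrt(max(w[-1], 0.0)) * V[:, -1]
    c_hat = np.sqrt(max(w[-2], 0.0)) * V[:, -2]
    return a_hat, c_hat
```

For a cone built from a $2 \times n$ matrix $A$, as in the planar LOS case, $A^\top A$ automatically has rank at most two, so the split always succeeds.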
However, it turns out that this condition holds for our problem. In the case when the cone is centered at the origin, we have $b = d = 0$, and a simple solution of equations (40) yields
$$A^\top A = \hat{a}\hat{a}^\top + \hat{c}\hat{c}^\top, \quad \kappa = c^\top \bar{x}_k, \quad \hat{b} = \hat{d} = 0. \quad (43)$$
In this sense, $\hat{a}$ and $\hat{c}$ denote the unit vectors that parametrize the orientation of the cone. In the context of CS, (37a)-(37c) then result in the following four affine chance constraints
$$\mathbb{P}(\hat{a}^\top E_k X + \hat{b} \le f_1) \ge 1 - \beta\delta_k, \quad (44a)$$
$$\mathbb{P}(\hat{a}^\top E_k X + \hat{b} \le -f_1) \le \beta\delta_k, \quad (44b)$$
$$\mathbb{P}(\hat{c}^\top E_k X + \hat{d} \le f_2) \ge 1 - (1 - \beta)\delta_k, \quad (44c)$$
$$\mathbb{P}(\hat{c}^\top E_k X + \hat{d} \le -f_2) \le (1 - \beta)\delta_k. \quad (44d)$$
These constraints are now in the standard affine form, and similar to (24), they can be converted to
$$\hat{a}^\top E_k \bar{X} + \hat{b} + \Phi^{-1}(1 - \beta\delta_k)\,\|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top \hat{a}\| \le f_1, \quad (45a)$$
$$\hat{a}^\top E_k \bar{X} + \hat{b} + \Phi^{-1}(\beta\delta_k)\,\|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top \hat{a}\| + f_1 \ge 0, \quad (45b)$$
$$\hat{c}^\top E_k \bar{X} + \hat{d} + \Phi^{-1}(1 - (1 - \beta)\delta_k)\,\|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top \hat{c}\| \le f_2, \quad (45c)$$
$$\hat{c}^\top E_k \bar{X} + \hat{d} + \Phi^{-1}((1 - \beta)\delta_k)\,\|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top \hat{c}\| + f_2 \ge 0. \quad (45d)$$
As a result, the approximation of the quadratic chance constraints has produced four cone constraints at each time step, or $4N$ total cone constraints for the whole problem. Since these constraints are now convex, the resulting problem is convex and can be solved using standard SDP solvers, similarly to the polyhedral chance constraint case.

B. Geometric Approximation
We limit the following discussion to the three-dimensional case, which often occurs when enforcing position constraints. However, the results can be generalized to $n$-dimensional convex cones. For simplicity, let $b = 0$ in (34), which corresponds to a cone centered at the origin. From a geometric point of view, one can think of the conical state space constraint (35a) as imposing, at each time step $k$, that the projection $\xi_k := AE_k X \in \mathbb{R}^2$ lies inside the disk of radius $r_k = c^\top E_k X + d$ with probability greater than $1 - \delta_k$. However, since $E_k X$ is a stochastic process, it follows that the radius of the disk is uncertain; therefore, and similar to Section IV-A, we relax the chance constraint such that the Gaussian vector $\xi_k$ lies within the mean radius of the disk $\bar{r}_k = c^\top E_k \bar{X} + d$.

Using this approximation, the chance constraints (35a) become
$$\mathbb{P}(\|\xi_k\| \le \bar{r}_k) \ge 1 - \delta_k. \quad (46)$$
Note that the random variable $\xi_k = AE_k X$ is Gaussian with $\xi_k \sim \mathcal{N}(\bar{\xi}_k, \Sigma_{\xi_k})$, mean $\bar{\xi}_k := AE_k \bar{X}$, and covariance $\Sigma_{\xi_k} := AE_k \Sigma_X E_k^\top A^\top$. So far, we have turned the convex cone chance constraint (35a) into the chance constraint (46), which requires the probability of a Gaussian random vector being inside a circle of given radius to be greater than $1 - \delta_k$. This problem can be analytically solved as follows.

Proposition 1. Let $\zeta \sim \mathcal{N}(0, \Sigma_\zeta)$ be a two-dimensional random vector. Then, for any $a > 0$,
$$\mathbb{P}\big(\zeta^\top \Sigma_\zeta^{-1} \zeta \le a^2\big) = 1 - e^{-a^2/2}. \quad (47)$$

Proof.
The probability density function (PDF) of $\zeta$ is given by
$$\mathcal{N}(0, \Sigma_\zeta) = \frac{1}{2\pi|\det \Sigma_\zeta|^{1/2}}\, e^{-\frac{1}{2}\zeta^\top \Sigma_\zeta^{-1} \zeta}. \quad (48)$$
Then, the probability in (47) is given explicitly by
$$\mathbb{P}(\zeta^\top \Sigma_\zeta^{-1} \zeta \le a^2) = \frac{1}{2\pi|\det \Sigma_\zeta|^{1/2}} \int_{\Omega_\zeta} e^{-\frac{1}{2}\zeta^\top \Sigma_\zeta^{-1} \zeta}\, \mathrm{d}\zeta, \quad (49)$$
where $\Omega_\zeta := \{\zeta : \zeta^\top \Sigma_\zeta^{-1} \zeta \le a^2\}$. Changing coordinates such that $\nu := \Sigma_\zeta^{-1/2}\zeta = (\rho\cos\phi, \rho\sin\phi)$, so that $\mathrm{d}\nu = |\det \Sigma_\zeta|^{-1/2}\,\mathrm{d}\zeta$, note that the sets $\{\zeta^\top \Sigma_\zeta^{-1}\zeta \le a^2\}$ and $\{\|\nu\|^2 \le a^2\}$ are equivalent. Thus, the integral in (49) becomes
$$\mathbb{P}(\zeta^\top \Sigma_\zeta^{-1}\zeta \le a^2) = \mathbb{P}(\|\nu\|^2 \le a^2) = \frac{1}{2\pi}\int_{\Omega_\nu} e^{-\frac{1}{2}\nu^\top \nu}\, \mathrm{d}\nu, \quad (50)$$
where $\Omega_\nu := \{\nu : \|\nu\|^2 \le a^2\}$. The last integral is straightforward to evaluate in two dimensions,
$$\mathbb{P}(\|\nu\|^2 \le a^2) = \frac{1}{2\pi}\int_0^{2\pi}\!\!\int_0^a e^{-\frac{\rho^2}{2}}\rho\,\mathrm{d}\rho\,\mathrm{d}\phi = 1 - e^{-a^2/2}, \quad (51)$$
which yields the desired result.

Lemma 1. Let $\zeta \sim \mathcal{N}(0, \Sigma_\zeta)$ be a two-dimensional random vector, let $\sigma_\zeta^2 = \lambda_{\max}(\Sigma_\zeta)$, and let $r > 0$. Then
$$\mathbb{P}(\|\zeta\| \le r) \ge 1 - e^{-r^2/2\sigma_\zeta^2}. \quad (52)$$

Proof.
Since the covariance matrix is positive definite, we can diagonalize it as $\Sigma_\zeta = PDP^\top$, where $D$ is a diagonal matrix containing the eigenvalues $\lambda_i$ of $\Sigma_\zeta$ and $P$ is an orthogonal matrix. Since $\sigma_\zeta^2 = \max_i \lambda_i$, it follows that
$$D^{-1} = \frac{1}{\sigma_\zeta^2}\,\mathrm{diag}(\sigma_\zeta^2/\lambda_i) \ge \frac{1}{\sigma_\zeta^2}\, I. \quad (53)$$
From the previous expression, it follows that
$$\zeta^\top \Sigma_\zeta^{-1}\zeta = \zeta^\top P D^{-1} P^\top \zeta \ge \frac{1}{\sigma_\zeta^2}\,\zeta^\top P P^\top \zeta = \frac{1}{\sigma_\zeta^2}\,\|\zeta\|^2. \quad (54)$$
Rearranging the previous inequality gives $\|\zeta\|^2/\sigma_\zeta^2 \le \zeta^\top \Sigma_\zeta^{-1}\zeta$, and using (47), it follows that
$$\mathbb{P}(\|\zeta\| \le \sigma_\zeta a) \ge \mathbb{P}(\zeta^\top \Sigma_\zeta^{-1}\zeta \le a^2) = 1 - e^{-a^2/2}. \quad (55)$$
Setting $r = \sigma_\zeta a$ achieves the desired result. Geometrically, the level sets $\{\zeta^\top \Sigma_\zeta^{-1}\zeta = a^2\}$ define the contours of ellipses having probability $1 - e^{-a^2/2}$, and the level sets $\{\|\zeta\| = r\}$ are the smallest circles that contain these ellipses.

Proposition 2. Let $\xi \sim \mathcal{N}(\bar{\xi}, \Sigma_\xi)$ be a two-dimensional random vector, let $\sigma_\xi^2 = \lambda_{\max}(\Sigma_\xi)$, and let $r > 0$. Then
$$\|\bar{\xi}\| + \sigma_\xi\sqrt{2\log\frac{1}{\delta}} \le r \;\Rightarrow\; \mathbb{P}(\|\xi\| \le r) \ge 1 - \delta. \quad (56)$$

Proof.
First, note the following chain of implications
$$\|\bar{\xi}\| + \sigma_\xi\sqrt{2\log\tfrac{1}{\delta}} \le r \;\Rightarrow\; \sigma_\xi\sqrt{2\log\tfrac{1}{\delta}} \le r - \|\bar{\xi}\| \quad (57a)$$
$$\Rightarrow\; 2\sigma_\xi^2\log\tfrac{1}{\delta} \le (r - \|\bar{\xi}\|)^2 \quad (57b)$$
$$\Rightarrow\; 2\sigma_\xi^2\log\delta \ge -(r - \|\bar{\xi}\|)^2 \quad (57c)$$
$$\Rightarrow\; \log\delta \ge -\frac{(r - \|\bar{\xi}\|)^2}{2\sigma_\xi^2} \quad (57d)$$
$$\Rightarrow\; \delta \ge \exp\Big(-\frac{(r - \|\bar{\xi}\|)^2}{2\sigma_\xi^2}\Big) \quad (57e)$$
$$\Rightarrow\; 1 - \delta \le 1 - \exp\Big(-\frac{(r - \|\bar{\xi}\|)^2}{2\sigma_\xi^2}\Big). \quad (57f)$$
Since $\{\xi : \|\bar{\xi}\| + \|\tilde{\xi}\| \le r\} \subseteq \{\xi : \|\xi\| \le r\}$, where $\tilde{\xi} := \xi - \bar{\xi}$, it follows that
$$\mathbb{P}(\|\xi\| \le r) \ge \mathbb{P}(\|\bar{\xi}\| + \|\tilde{\xi}\| \le r) = \mathbb{P}(\|\tilde{\xi}\| \le r - \|\bar{\xi}\|). \quad (58)$$
Since $\tilde{\xi}$ is a zero-mean Gaussian vector, applying Lemma 1 gives
$$\mathbb{P}(\|\tilde{\xi}\| \le r - \|\bar{\xi}\|) \ge 1 - \exp\Big(-\frac{(r - \|\bar{\xi}\|)^2}{2\sigma_\xi^2}\Big). \quad (59)$$
Finally, by (57) and (58), we obtain the desired result.

Using Proposition 2, we can now satisfy (46) by enforcing
$$\sigma_{\xi_k}\sqrt{2\log\frac{1}{\delta_k}} \le \bar{r}_k - \|\bar{\xi}_k\| =: \bar{R}_k. \quad (60)$$
Note that $\sigma_{\xi_k}^2 = \lambda_{\max}(\Sigma_{\xi_k}) = \lambda_{\max}(AE_k\Sigma_X E_k^\top A^\top)$. Therefore, using $\Sigma_X = (I + \mathcal{B}K)\Sigma_Y(I + \mathcal{B}K)^\top$, we get
$$\sigma_{\xi_k} = \|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top A^\top\|. \quad (61)$$
In summary, the convex cone chance constraints (35a) become
$$\sqrt{2\log\frac{1}{\delta_k}}\,\|\Sigma_Y^{1/2}(I + \mathcal{B}K)^\top E_k^\top A^\top\| \le \bar{R}_k, \quad k = 1, \ldots, N. \quad (62)$$

V. SPACECRAFT RENDEZVOUS EXAMPLE
A. IRA-CS with Polytopic Chance Constraints
In this section, we apply the preceding theory of CS with optimal risk allocation to the problem of spacecraft proximity operations in orbit. We consider the problem where one of the spacecraft, called the Deputy, approaches and docks with the second spacecraft, called the Chief, such that, in the process, the Deputy remains within the line-of-sight (LOS) region of the Chief, defined to be the polytopic region shown in Figure 2.
Fig. 2: Feasible state space region for spacecraft rendezvous problem.
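As a numerical aside, the tail bound of Lemma 1 and the shifted radius of Proposition 2, which underlie the tightened constraint (62) used in this example, can be sanity-checked by Monte Carlo. The covariance, mean, and risk level below are illustrative assumptions, not values from the example:

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # example covariance (assumed)
sigma = np.max(np.linalg.eigvalsh(Sigma))    # largest eigenvalue, sigma_zeta

# Lemma 1: for zero-mean zeta, P(||zeta|| <= r) >= 1 - exp(-r^2 / (2 sigma)).
zeta = rng.multivariate_normal(np.zeros(2), Sigma, size=200_000)
norms = np.linalg.norm(zeta, axis=1)
for r_test in (1.0, 2.0, 3.0):
    bound = 1.0 - np.exp(-r_test**2 / (2 * sigma))
    assert np.mean(norms <= r_test) >= bound - 1e-3

# Proposition 2: shifting by the mean, the radius
#   r = ||xi_bar|| + sqrt(2 sigma log(1/delta))
# certifies P(||xi|| <= r) >= 1 - delta.
xi_bar, delta = np.array([0.5, -0.3]), 0.05
r = np.linalg.norm(xi_bar) + np.sqrt(2 * sigma * np.log(1 / delta))
xi = xi_bar + zeta
assert np.mean(np.linalg.norm(xi, axis=1) <= r) >= 1 - delta
```

The bound of Lemma 1 is tight only for an isotropic covariance; for anisotropic covariances, as here, the empirical probability exceeds it with margin.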
Assuming that the Chief is in a circular orbit, the relative dynamics of the motion between the two spacecraft are given by the Clohessy-Wiltshire-Hill equations [28],

$\ddot{x} = 3\omega^2 x + 2\omega\dot{y} + F_x/m_c,$ (63a)
$\ddot{y} = -2\omega\dot{x} + F_y/m_c,$ (63b)
$\ddot{z} = -\omega^2 z + F_z/m_c,$ (63c)

where $m_c$ is the mass of the Chief, $\omega = \sqrt{\mu/R^3}$ is the orbital frequency, and $F := [F_x, F_y, F_z]^\top$ represents the thrust input to the spacecraft. These equations of motion are written in a relative coordinate system in which the Chief is located at the origin, and $x, y, z$ represent the position of the Deputy with respect to the Chief. Note that the $z$ dynamics are decoupled from the $x$–$y$ dynamics; furthermore, the $z$ dynamics are globally asymptotically stable, so in theory we only need to control the planar dynamics. In Figure 2 the blue area represents the planar region. To write the system in state-space form, let $\mathbf{x} := [x, y, z, \dot{x}, \dot{y}, \dot{z}]^\top \in \mathbb{R}^6$ to obtain the LTI system $\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$, where

$A = \begin{bmatrix} 0_{3\times 3} & I_3 \\ A_{21} & A_{22} \end{bmatrix}, \quad A_{21} = \begin{bmatrix} 3\omega^2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\omega^2 \end{bmatrix}, \quad A_{22} = \begin{bmatrix} 0 & 2\omega & 0 \\ -2\omega & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad B = [0, \; I]^\top,$ (64)

and $\mathbf{u} := m_c^{-1}[F_x, F_y, F_z]^\top \in \mathbb{R}^3$. To discretize the system, we divide the time interval into $N = 15$ steps of length $\Delta t$. Assuming a zero-order hold (ZOH) on the control yields the discrete system

$x_{k+1} = A_d x_k + B_d u_k + G w_k,$ (65)

where $A_d = e^{A\Delta t}$, $B_d = B\Delta t + AB\Delta t^2/2$, and the noise scaling matrix $G$ is chosen as in [29]. Starting from the given initial state mean $\mu_0$ and covariance $\Sigma_0$, we wish to steer the distribution to the final mean $\mu_f = 0$ with final covariance $\Sigma_f$, while minimizing the cost function (4) with state and control weight matrices $Q$ and $R$. We impose the joint probability of failure over the whole horizon to be $\Delta = 0.03$
, which implies that the probability of violating any state constraint over the whole horizon is less than 3%. The control inputs are subject to a hard bound on their infinity norm, $\|u_k\|_\infty \le u_{\max}$. Note that these bounds are hard constraints, as opposed to (soft) chance constraints. To implement this hard input constraint within the CS framework, the algorithm in [16] was used; the details are given in the Appendix. Lastly, in the iterative risk allocation algorithm, we use a scaling parameter $\rho(i)$ in Line 10 of the algorithm that shrinks with the iteration count $i$. The SDP in Problem 2 was implemented in MATLAB using YALMIP [30], along with MOSEK [31] to solve the relevant optimization problems.

Fig. 3: Optimal trajectories using covariance steering with iterative risk allocation, with 3-σ covariance ellipsoids.

Fig. 4: Terminal behavior of optimal trajectories using iterative risk allocation.

Fig. 5: Optimal trajectories using covariance steering with iterative risk allocation, with 3-σ covariance ellipsoids.

Figures 3 and 4 show the optimal trajectories with optimal risk allocation, and Figure 5 shows the two-dimensional planar motion. Figure 6 compares the terminal trajectories of CS with a uniform risk allocation against the proposed method. The two solutions look similar, and both satisfy the terminal constraints on the mean and the covariance. However, due to the relaxation $\Sigma_N \preceq \Sigma_f$, the uniform risk allocation leads to more conservative solutions, as shown in Figure 6. The volume of the final covariance ellipsoid, $V_N \propto \log\det \Sigma_N$, is considerably smaller for the uniform allocation solution compared to the optimal allocation solution (see Table I). In fact, we see that a consequence of optimal risk allocation is that it maximizes the final covariance given all the constraints, while still being bounded by $\Sigma_f$.

Figures 7 and 8 show the state trajectories and the optimal controls for the polyhedral chance constraints.
The control is almost linear but saturates at the first and last few time steps. Figure 9 shows the a priori allocation of risk, as well as the true risk $\bar\delta$ once the optimization is completed, where $\delta_r$ corresponds to the risk allocated to the right boundary and $\delta_u$ to the risk allocated to the top boundary.

Fig. 6: Comparison of terminal covariance steering using a uniform and the optimal risk allocation.

Fig. 7: Trajectories of the controlled system and their associated standard deviations using iterative risk allocation.

Fig. 8: Optimal control inputs using iterative risk allocation.

Notice that in Figure 9a the true risk exposure is much lower than the allocated risk, which confirms the conclusion that the solutions for the uniform allocation case are overly conservative. In fact, the true risk is nearly zero except at the initial and terminal times. Comparing this to Figure 9b, we see a close correspondence between the allocated risk and the true risk exposure over the whole horizon for the optimal risk allocation case. It should be noted that although the true risk is still slightly less than the allocated risk, the error between the two is much smaller than that of the uniform risk allocation strategy.

Fig. 9: Comparison of allocated risk and true risk using: (a) uniform risk allocation, (b) iterative risk allocation.

The iterative risk allocation algorithm is robust in the sense that it assigns risk in proportion to how close the solution trajectories are to the boundaries of the state space. Since the solutions are close to the right and top boundaries of the allowable LOS region for most of the horizon, the optimal allocation weighs these respective risks more heavily than those of the left and bottom boundaries. Thus, IRA assigns an extremely small risk to the right boundary during these time steps and only assigns a larger risk when the trajectories reach their terminal values.
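The reallocation behavior described above can be sketched in generic form. The following is a minimal sketch of one risk-reallocation step in the spirit of the two-stage scheme of [1]: inactive constraints have their risk tightened toward the measured true risk, and the freed-up budget is redistributed to the active constraints. The function name `ira_step` and all numbers are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def ira_step(delta, true_risk, active, Delta, rho):
    """One risk-reallocation iteration (sketch of the Ono-Williams scheme).

    delta:     current per-step risk allocation, shape (N,)
    true_risk: measured risk of the current optimal solution at each step
    active:    boolean mask of steps whose chance constraint is active
    Delta:     total risk budget
    rho:       interpolation parameter in (0, 1)
    """
    delta = delta.copy()
    n_active = active.sum()
    if n_active == 0 or n_active == len(delta):
        return delta  # nothing to redistribute; allocation has converged
    # Tighten inactive constraints toward their (lower) true risk.
    inactive = ~active
    delta[inactive] = rho * delta[inactive] + (1 - rho) * true_risk[inactive]
    # Redistribute the freed-up residual risk equally over active constraints.
    residual = Delta - delta.sum()
    delta[active] += residual / n_active
    return delta

delta = np.full(4, 0.03 / 4)                   # uniform initial allocation
true = np.array([0.001, 0.0005, 0.006, 0.007]) # hypothetical measured risks
active = np.array([False, False, True, True])
new = ira_step(delta, true, active, Delta=0.03, rho=0.7)
assert np.isclose(new.sum(), 0.03)             # total budget is preserved
assert np.all(new[active] > delta[active])     # active steps receive more risk
```

This illustrates the qualitative behavior noted above: steps whose trajectories hug a boundary (active constraints) are granted more risk, while steps far from the boundaries give their unused risk back to the pool.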
Table I shows the true joint probability of failure, defined as

$\bar\Delta := 1 - P\left[\bigwedge_{k=1}^{N} \bigwedge_{j=1}^{M} \alpha_j^\top E_k X^* \le \beta_j\right].$ (66)

It is clear that the uniform risk allocation does not even come close to the desired design value $\Delta = 0.03$, while IRA gives a true probability of failure very close to the desired one.

Fig. 10: Optimal cost $J^*(\delta)$ after every IRA iteration.

TABLE I: Comparison of total true risk $\bar\Delta$ and terminal volume $V_N$ between the uniform allocation and the optimal (IRA) allocations for the polytopic, two-sided, and geometric constraint formulations. For the two-sided approximation, the maximum true risk was used.

Finally, we examined the optimal cost function over the IRA iterations, shown in Figure 10. The convergence criterion in this example is a small tolerance $\epsilon$ on the cost decrease, or that all of the constraints are inactive, which is proved in [1] to be a sufficient condition for optimality of Problem 3. We see that (29) indeed holds, and the optimization even resulted in a slight decrease of the objective function, converging within 16 iterations. Thus, the iterative risk allocation algorithm optimizes the risk allocation at each time step without increasing the cost.

B. IRA-CS with Cone Chance Constraints
For the convex cone chance constraint case, we also implemented the method outlined in Section IV, namely the $N$ constraints in (45). For this example, the following representation of a cone was used:

$\mathcal{X}_c = \{(x, y, z) : x^2 + z^2 \le (\lambda y)^2\},$ (67)

where $\lambda = \tan 50^\circ \approx 1.19$, which corresponds to a $50^\circ$ cone half-angle. This requirement translates to the individual chance constraints

$P\left(x_k^2 + z_k^2 \le (\lambda y_k)^2\right) \ge 1 - \delta_k, \quad k = 1, \ldots, N.$ (68)

As discussed in Section IV, in order to put this in the form of (36), the probabilistic constraint in (68) is relaxed with $\kappa = \lambda\bar y_k$, so that at each time step the state is forced to stay inside a disk of radius $\lambda\bar y_k$. Comparing (68) with (34), we see that $A = \mathrm{diag}(1, 0, 1, 0, 0, 0)$, $b = d = 0$, and $c = [0, \lambda, 0]^\top$. From (67) and (36) we see that $\hat a = [1, 0_{1\times 5}]^\top$ and $\hat c = [0, 0, 1, 0_{1\times 3}]^\top$. The parameter $\beta = 0.5$ (constant weights) and $f^1_k = f^2_k = f^3_k = \lambda\bar y_k/\sqrt{2}$ were used in the numerical simulations.

Fig. 11: Optimal trajectories using covariance steering under two-sided chance constraints, with 3-σ covariance ellipses.

Fig. 12: Optimal trajectories using convex relaxation of conic chance constraints: (a) two-sided approximation, (b) geometric approximation.

Figures 11 and 12 show the optimal trajectories in the three-dimensional space and in the projection onto the $x$–$y$ plane, respectively. It should be noted that for the two-sided approximation of the cone constraint, since we approximated the quadratic constraints as four linear constraints, the IRA algorithm needs to be adjusted as follows. In Line 5 of Algorithm 1, a constraint is active at time step $k$ if any of the four affine constraints in (45) is active. Similarly, when tightening the constraints in Line 10, the maximum true risk $\bar\delta_k := \max_j \bar\delta_k^j$ is used. This adjustment is not needed for the geometric approximation, because it approximates each cone chance constraint as a single convex constraint for each $k$, so the standard IRA algorithm is applicable.

VI.
CONCLUSION
In this paper, we have incorporated an iterative risk allocation (IRA) strategy to optimize the probability of violating the state constraints at every time step within the covariance steering (CS) problem of a linear stochastic system subject to chance constraints. For the covariance steering problem, we showed that employing IRA not only leads to less conservative solutions that are more practical, but also tends to maximize the final covariance. Additionally, the use of IRA in the context of CS with chance constraints results in optimal solutions whose true risk is much closer to the intended design requirements, compared to the use of a uniform risk allocation. We also implemented quadratic chance constraints in the form of convex cones, which are more accurate and natural for many engineering applications. Using a two-sided affine approximation, the quadratic chance constraints can be made convex, and a slightly modified IRA algorithm was used to optimize the risk. Lastly, we also used a geometric approximation of the cone chance constraints, which is valid when the state space is three-dimensional, as is often the case when constraining the position of a vehicle, and which is less conservative than the two-sided affine approximation. Both relaxations result in convex programs to which the two-stage IRA algorithm is applicable.

REFERENCES

[1] M. Ono and B. C. Williams, "Iterative risk allocation: A new approach to robust model predictive control with a joint chance constraint," in IEEE Conference on Decision and Control, Cancun, Mexico, Dec 8–11, 2008, pp. 3427–3432.
[2] P. Li, M. Wendt, and G. Wozny, "A probabilistically constrained model predictive controller," Automatica, vol. 38, pp. 1171–1176, 2002.
[3] A. F. Hotz and R. E. Skelton, "A covariance control theory," in IEEE Conference on Decision and Control, Fort Lauderdale, FL, Dec 11–13, 1985, pp. 552–557.
[4] A. Hotz and R. E. Skelton, "Covariance control theory," International Journal of Control, vol. 46, no. 1, pp. 13–32, 1987.
[5] E. Bakolas, "Optimal covariance control for discrete-time stochastic linear systems subject to constraints," in IEEE Conference on Decision and Control, Las Vegas, NV, Dec 12–14, 2016, pp. 1153–1158.
[6] ——, "Finite-horizon covariance control for discrete-time stochastic linear systems subject to input constraints," Automatica, vol. 91, pp. 61–68, 2018.
[7] A. Halder and E. D. B. Wendel, "Finite horizon linear quadratic Gaussian density regulator with Wasserstein terminal cost," in American Control Conference, Boston, MA, July 6–8, 2016, pp. 7249–7254.
[8] Y. Chen, T. T. Georgiou, and M. Pavon, "Optimal steering of a linear stochastic system to a final probability distribution – Part I," IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1158–1169, 2016.
[9] M. Goldshtein and P. Tsiotras, "Finite-horizon covariance control of linear time-varying systems," in IEEE Conference on Decision and Control, Melbourne, Australia, Dec 12–15, 2017, pp. 3606–3611.
[10] E. Bakolas, "Finite-horizon separation-based covariance control for discrete-time stochastic linear systems," in IEEE Conference on Decision and Control, Miami Beach, FL, Dec 17–19, 2018, pp. 3299–3304.
[11] K. Okamoto and P. Tsiotras, "Optimal stochastic vehicle path planning using covariance steering," IEEE Robotics and Automation Letters, vol. 4, no. 3, pp. 2276–2281, 2019.
[12] Y. Chen, T. T. Georgiou, and M. Pavon, "Optimal steering of a linear stochastic system to a final probability distribution – Part II," IEEE Transactions on Automatic Control, vol. 61, no. 5, pp. 1170–1180, 2016.
[13] ——, "Optimal steering of a linear stochastic system to a final probability distribution – Part III," IEEE Transactions on Automatic Control, vol. 63, no. 9, pp. 3112–3118, 2018.
[14] Y. Chen, T. T. Georgiou, and A. Tannenbaum, "Vector-valued optimal mass transport," 2017, arXiv:1611.09946.
[15] K. Okamoto, M. Goldshtein, and P. Tsiotras, "Optimal covariance control for stochastic systems under chance constraints," IEEE Control Systems Letters, vol. 2, pp. 266–271, 2018.
[16] K. Okamoto and P. Tsiotras, "Input hard constrained optimal covariance steering," in IEEE Conference on Decision and Control, Nice, France, Dec 11–13, 2019, pp. 3497–3502.
[17] E. Bakolas, "Covariance control for discrete-time stochastic linear systems with incomplete state information," in American Control Conference, Seattle, WA, 2017, pp. 432–437.
[18] K. Okamoto and P. Tsiotras, "Stochastic model predictive control for constrained linear systems using optimal covariance steering," 2019, arXiv:1905.13296.
[19] J. Ridderhof, K. Okamoto, and P. Tsiotras, "Nonlinear uncertainty control with iterative covariance steering," in IEEE Conference on Decision and Control, Nice, France, Dec 11–13, 2019, pp. 3484–3490.
[20] E. Bakolas and A. Tsolovikos, "Greedy finite-horizon covariance steering for discrete-time stochastic nonlinear systems based on the unscented transform," 2020, arXiv:2003.03679.
[21] Z. Yi, Z. Cao, E. Theodorou, and Y. Chen, "Nonlinear covariance control via differential dynamic programming," 2019, arXiv:1911.09283.
[22] D. H. van Hessem, "Stochastic inequality constrained closed loop model predictive control with application to chemical process operation," Ph.D. dissertation, Delft University of Technology, 2004.
[23] L. Blackmore, "A probabilistic particle control approach to optimal, robust predictive control," in AIAA Guidance, Navigation, and Control Conference, Keystone, CO, Aug 21–24, 2006, pp. 1–15.
[24] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[25] A. Prékopa, "Boole-Bonferroni inequalities and linear programming," Operations Research, vol. 36, no. 1, pp. 145–162, 1988.
[26] L. Blackmore, H. X. Li, and B. C. Williams, "A probabilistic approach to optimal robust path planning with obstacles," in American Control Conference, Minneapolis, MN, June 14–16, 2006, pp. 1–7.
[27] M. Lubin, D. Bienstock, and J. Vielma, "Two-sided linear chance constraints and extensions," 2016, arXiv:1507.01995.
[28] W. Wiesel, Spaceflight Dynamics. McGraw-Hill Book Co., 1989.
[29] A. Vinod and M. Oishi, "Affine controller synthesis for stochastic reachability via difference of convex programming," in IEEE Conference on Decision and Control, Nice, France, 2019, pp. 7273–7280.
[30] J. Löfberg, "YALMIP: A toolbox for modeling and optimization in MATLAB," in IEEE International Symposium on Computer Aided Control Systems Design, Taipei, Taiwan, 2004, pp. 284–289.
[31] MOSEK ApS, The MOSEK Optimization Toolbox for MATLAB Manual, Version 8.1, 2017. [Online]. Available: http://docs.mosek.com.
[32] M. J. Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, 2019.

APPENDIX
A. CONE CHANCE CONSTRAINT RELAXATION
Theorem 2.
The quadratic chance constraint

$P(\|Ax + b\| \le c^\top\mu + d) \ge 1 - \delta,$ (A.1)

where $x \sim \mathcal{N}(\mu, \Sigma)$, is a relaxation of the cone chance constraint

$P(\|Ax + b\| \le c^\top x + d) \ge 1 - \delta.$ (A.2)

Proof.
Since $x \sim \mathcal{N}(\mu, \Sigma)$, it follows that $\xi := \|Ax + b\|$ follows a noncentral $\chi$ distribution with probability density function $f_\xi(x)$ [32]. Let $\eta := c^\top x + d$ and notice that $\eta \sim \mathcal{N}(c^\top\mu + d, \, c^\top\Sigma c)$. The chance constraint (A.2) then takes the form $P(\xi \le \eta) \ge 1 - \delta$. The probability that one random variable is less than or equal to another random variable is given by

$P(\xi \le \eta) = \int_{-\infty}^{\infty}\int_{-\infty}^{y} f_{\xi,\eta}(x, y) \, \mathrm{d}x \, \mathrm{d}y,$ (A.3)

where $f_{\xi,\eta}(x, y)$ is the joint probability density function of the random variables $\xi$ and $\eta$. Next, write $\eta = c^\top\mu + d + z\sqrt{c^\top\Sigma c}$, where $z \sim \mathcal{N}(0, 1)$ with probability density $f_z(y)$, and let $\bar\eta = c^\top\mu + d$. The inner integral in (A.3) then becomes

$\int_{-\infty}^{y} f_{\xi,\eta}(x, y) \, \mathrm{d}x = \int_{-\infty}^{\bar\eta + y\sqrt{c^\top\Sigma c}} f_{\xi,z}(x, y) \, \mathrm{d}x = \int_{-\infty}^{\bar\eta} f_{\xi,z}(x, y) \, \mathrm{d}x + \int_{\bar\eta}^{\bar\eta + y\sqrt{c^\top\Sigma c}} f_{\xi,z}(x, y) \, \mathrm{d}x.$

It follows that

$P(\xi \le \eta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\bar\eta} f_{\xi,z}(x, y) \, \mathrm{d}x \, \mathrm{d}y + \int_{-\infty}^{\infty}\int_{\bar\eta}^{\bar\eta + y\sqrt{c^\top\Sigma c}} f_{\xi,z}(x, y) \, \mathrm{d}x \, \mathrm{d}y \ge \int_{-\infty}^{\infty}\int_{-\infty}^{\bar\eta} f_{\xi,z}(x, y) \, \mathrm{d}x \, \mathrm{d}y.$ (A.4)

Noticing that $\int_{-\infty}^{\infty} f_{\xi,z}(x, y) \, \mathrm{d}y = f_\xi(x)$, the last expression in (A.4) implies that

$P(\xi \le \eta) \ge \int_{-\infty}^{\bar\eta} f_\xi(x) \, \mathrm{d}x = P(\xi \le \bar\eta),$

which achieves the desired result. In other words, if the relaxed chance constraint $P(\xi \le \bar\eta) \ge 1 - \delta$ is satisfied, then the original chance constraint $P(\xi \le \eta) \ge 1 - \delta$ is satisfied as well.

B. INPUT HARD CONSTRAINED COVARIANCE CONTROLLER
In the following, and similarly to the polytopic chance constraints, we assume that the hard input constraints on the control are affine, i.e., of the form

$\alpha_{u,s}^\top F_k U \le \beta_{u,s}, \quad s = 1, \ldots, N_c.$ (B.1)

Theorem 3 ([16]). The control law

$u_k = v_k + K_k z_k,$ (B.2)

where $z_k$ is governed by the dynamics

$z_{k+1} = A z_k + \varphi(w_k),$ (B.3)
$z_0 = \varphi(y_0), \quad y_0 = x_0 - \mu_0,$ (B.4)

where $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ is an element-wise symmetric saturation function with pre-specified saturation value $y_{\max_i} > 0$ for the $i$th entry of the input $y$,

$\varphi_i(y) = \max(-y_{\max_i}, \min(y_i, y_{\max_i})),$ (B.5)

converts Problem 1 into the following convex programming problem, which constrains the control to a maximum saturation value:

$\min_{V, K, \Omega} \; J(V, K, \Omega) = \mathrm{tr}\left(\bar Q \begin{bmatrix} I & BK \end{bmatrix} \Sigma_{XX} \begin{bmatrix} I \\ K^\top B^\top \end{bmatrix}\right) + \mathrm{tr}(\bar R K \Sigma_{UU} K^\top) + (A\mu_0 + BV)^\top \bar Q (A\mu_0 + BV) + V^\top \bar R V$

subject to

$P(E_k X \notin \mathcal{X}) \le \delta_k, \quad k = 1, \ldots, N,$ (B.6)
$\sum_{k=1}^{N} \delta_k \le \Delta,$ (B.7)
$H F_k V + \Omega^\top \sigma \le h,$ (B.8)
$H F_k K \begin{bmatrix} A & D \end{bmatrix} = \Omega^\top S,$ (B.9)
$\Omega \ge 0,$ (B.10)
$\mu_f = E_N (A\mu_0 + BV),$ (B.11)
$\Sigma_f \succeq E_N \begin{bmatrix} I & BK \end{bmatrix} \Sigma_{XX} \begin{bmatrix} I \\ K^\top B^\top \end{bmatrix} E_N^\top,$ (B.12)

where $\Omega \in \mathbb{R}^{2(N+1)n \times N_c}$ is a decision (slack) variable,

$\Sigma_{XX} = \begin{bmatrix} A & A \end{bmatrix} \begin{bmatrix} \Sigma_0 & \mathbb{E}[y_0 \varphi(y_0)^\top] \\ \mathbb{E}[\varphi(y_0) y_0^\top] & \mathbb{E}[\varphi(y_0)\varphi(y_0)^\top] \end{bmatrix} \begin{bmatrix} A^\top \\ A^\top \end{bmatrix} + \begin{bmatrix} D & D \end{bmatrix} \begin{bmatrix} I & \mathbb{E}[W\varphi(W)^\top] \\ \mathbb{E}[\varphi(W)W^\top] & \mathbb{E}[\varphi(W)\varphi(W)^\top] \end{bmatrix} \begin{bmatrix} D^\top \\ D^\top \end{bmatrix},$ (B.13)

$\Sigma_{UU} = A\,\mathbb{E}[\varphi(y_0)\varphi(y_0)^\top]\,A^\top + D\,\mathbb{E}[\varphi(W)\varphi(W)^\top]\,D^\top.$ (B.14)

Further,

$H = [\alpha_{u,1}, \ldots, \alpha_{u,N_c}]^\top \in \mathbb{R}^{N_c \times m},$ (B.15)
$h = [\beta_{u,1}, \ldots, \beta_{u,N_c}]^\top \in \mathbb{R}^{N_c}.$ (B.16)

In addition, $S \in \mathbb{R}^{2(N+1)n \times (N+1)n}$ and $\sigma \in \mathbb{R}^{2(N+1)n}$ are constant, given by

$S_{2i-1} = e_i^\top, \quad S_{2i} = -e_i^\top,$ (B.17)
$\sigma_{2i-1} = y_{\max_i}, \quad \sigma_{2i} = y_{\max_i},$ (B.18)

where $S_i$ denotes the $i$th row of $S$, and $e_i \in \mathbb{R}^{(N+1)n}$ is the unit vector with $i$th element equal to 1.
Lastly, the chance constraints (B.6) take the form

$\alpha_j^\top E_k (A\mu_0 + BV) + \Phi^{-1}(1 - \delta_k^j)\,\|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top \alpha_j\| \le \beta_j,$ (B.19)

when $\mathcal{X} = \mathcal{X}_p$ (19);

$\hat a^\top E_k \bar X + \hat b + \Phi^{-1}(1 - \beta\delta_k)\,\|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top \hat a\| \le f,$ (B.20a)
$\hat a^\top E_k \bar X + \hat b + \Phi^{-1}(\beta\delta_k)\,\|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top \hat a\| + f \ge 0,$ (B.20b)
$\hat c^\top E_k \bar X + \hat d + \Phi^{-1}(1 - (1-\beta)\delta_k)\,\|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top \hat c\| \le f,$ (B.20c)
$\hat c^\top E_k \bar X + \hat d + \Phi^{-1}((1-\beta)\delta_k)\,\|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top \hat c\| + f \ge 0,$ (B.20d)

when $\mathcal{X} = \mathcal{X}_c$ (34) with the substitution (40); and

$\sqrt{2\log(1/\delta_k)}\,\|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top A^\top\| \le \bar R_k,$ (B.21)

for $\mathcal{X} = \mathcal{X}_c$ with the geometric approximation.

Proof. It can be shown that

$\mathbb{E}[\tilde X \tilde X^\top] = \begin{bmatrix} I & BK \end{bmatrix} \Sigma_{XX} \begin{bmatrix} I \\ K^\top B^\top \end{bmatrix},$ (B.22)

from which it follows that the standard deviation of the state vector $x_k = E_k X$ is

$\sqrt{E_k\,\mathbb{E}[\tilde X \tilde X^\top]\,E_k^\top} = \|\Sigma_{XX}^{1/2}\begin{bmatrix} I & BK \end{bmatrix}^\top E_k^\top\|.$
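The norm identity in this proof (the standard deviation of a selected state entry equals the norm of the corresponding row of $\Sigma_{XX}^{1/2}[I \;\; BK]^\top$) can be checked numerically with random matrices. Here `M` is a generic stand-in for the block $[I \;\; BK]$, and the covariance is an arbitrary SPD matrix, not a value from the paper:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
M = rng.normal(size=(n, n))                 # stands in for the block [I  BK]
X = rng.normal(size=(n, n))
Sigma = X @ X.T + n * np.eye(n)             # a generic SPD "covariance" Sigma_XX
S = np.linalg.cholesky(Sigma)               # square-root factor with S S^T = Sigma

e = np.zeros(n)
e[2] = 1.0                                  # selector for one state entry (a row of E_k)
var = e @ M @ Sigma @ M.T @ e               # variance of that entry, as in (B.22)
std_norm = np.linalg.norm(S.T @ M.T @ e)    # norm expression from the right-hand side
assert np.isclose(np.sqrt(var), std_norm)
```

The identity holds because $\|S^\top M^\top e\|^2 = e^\top M S S^\top M^\top e = e^\top M \Sigma M^\top e$, i.e., any square-root factor of the covariance yields the same norm.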