Towards Integrated Perception and Motion Planning with Distributionally Robust Risk Constraints
Venkatraman Renganathan ∗, Iman Shames ∗∗, Tyler H. Summers ∗

∗ Department of Mechanical Engineering, University of Texas at Dallas, Richardson, TX 75080 USA (e-mails: {vrengana, tyler.summers}@utdallas.edu).
∗∗ Department of Electrical and Electronic Engineering and Melbourne Information, Decision and Autonomous Systems (MIDAS) Laboratory, University of Melbourne, Parkville, VIC 3010, Australia (e-mail: [email protected]).
Abstract:
Safely deploying robots in uncertain and dynamic environments requires a systematic accounting of various risks, both within and across layers in an autonomy stack from perception to motion planning and control. Many widely used motion planning algorithms do not adequately incorporate inherent perception and prediction uncertainties, often ignoring them altogether or making questionable assumptions of Gaussianity. We propose a distributionally robust incremental sampling-based motion planning framework that explicitly and coherently incorporates perception and prediction uncertainties. We design output feedback policies and consider moment-based ambiguity sets of distributions to enforce probabilistic collision avoidance constraints under the worst-case distribution in the ambiguity set. Our solution approach, called Output Feedback Distributionally Robust RRT* (OFDR-RRT*), produces asymptotically optimal risk-bounded trajectories for robots operating in dynamic, cluttered, and uncertain environments, explicitly incorporating mapping and localization error, stochastic process disturbances, unpredictable obstacle motion, and uncertain obstacle locations. Numerical experiments illustrate the effectiveness of the proposed algorithm.

Keywords:
Risk-Bounded Motion Planning, Distributional Robustness, Integrated Perception & Planning in Robotics.

⋆ This work is partially supported by the Defence Science and Technology Group, through agreement MyIP: ID9156 entitled "Verifiable Hierarchical Sensing, Planning and Control", the Australian Government, via grant AUSMURIB000001 associated with ONR MURI grant N00014-19-1-2571, and by the United States Air Force Office of Scientific Research under award number FA2386-19-1-4073.

1. INTRODUCTION

More sophisticated motion planning and control algorithms are needed for robots to operate in increasingly dynamic and uncertain environments and to ensure safe and effective autonomous behavior. Many widely used motion planning algorithms were developed in deterministic settings. However, since motion planning algorithms must be coupled with the outputs of inherently uncertain perception systems, there is a crucial need for more tightly coupled perception and planning frameworks that explicitly incorporate perception uncertainties.

Motion planning under uncertainty has been considered in several lines of recent research Blackmore et al. (2006); Agha-Mohammadi et al. (2014); Luders et al. (2010, 2013); Blackmore et al. (2011); Liu and Ang (2014); Zhu and Alonso-Mora (2019). Many approaches make questionable assumptions of Gaussianity and utilize chance constraints, ostensibly to maintain computational tractability. However, this can cause significant miscalculations of risk, and the underlying risk metrics do not necessarily possess desirable coherence properties Rockafellar (2007); Majumdar and Pavone (2017). The emerging area of distributionally robust optimization (DRO) shows that stochastic uncertainty can be handled in much more sophisticated ways without sacrificing computational tractability Goh and Sim (2010); Wiesemann et al. (2014). These approaches allow modelers to explicitly incorporate inherent ambiguity in probability distributions, rather than making overly strong structural assumptions on the distribution.

Traditionally, the perception and planning components in a robot autonomy stack are loosely coupled, in the sense that nominal estimates from the perception system may be used for planning, while inherent perception uncertainties are usually ignored. This paradigm is inherited in part from the classical separation of estimation and control in linear systems theory. However, in the presence of uncertainties and constraints, estimation and control should not be separated; there are needs and opportunities to explicitly incorporate perception uncertainties into planning, both to mitigate risks of constraint violation Blackmore et al. (2006); Florence et al. (2016); Luders et al. (2010); Summers (2018); Zhu and Alonso-Mora (2019) and to actively plan paths that improve perception Costante et al. (2016).

Contributions:
In this paper, we take steps toward a tighter integration of perception and planning in autonomous robotic systems. Our main contributions are:

• We propose a distributionally robust incremental sampling-based motion planning framework that explicitly and coherently incorporates perception and prediction uncertainties. Our solution approach, called Output Feedback Distributionally Robust RRT* (OFDR-RRT*) (Algorithm 1), produces asymptotically optimal risk-bounded trajectories for robots operating in dynamic, cluttered, and uncertain environments, explicitly incorporating mapping and localization error, stochastic process disturbances, unpredictable obstacle motion, and uncertain obstacle locations. We design output feedback policies and consider moment-based ambiguity sets of distributions to enforce probabilistic collision avoidance constraints under the worst-case distribution in the ambiguity set (Algorithm 2).

• We demonstrate via numerical simulations that our approach gives a more sophisticated and coherent risk quantification compared to an approach that accounts for uncertainty using a Gaussian assumption, without increasing the computational complexity.

The rest of the paper is organized as follows. The dynamical model of the robot and the uncertainty modeling in the motion planning problem are discussed in Section 2. Then, the proposed
OFDR-RRT* algorithm for motion planning is explained in Section 3. Simulation results using a double integrator model are presented in Section 4. The paper closes in Section 5 with a summary of results and directions for future research.

2. ROBOT & ENVIRONMENT MODELING

Consider a robot operating in an uncertain environment X ⊂ R^n cluttered with n obstacles. We denote the set of obstacles as B = {1, ..., n}. The robot and the obstacles are modeled as a stochastic discrete-time linear system

x_{t+1} = A x_t + B u_t + G w_{r,t},    (1)
O_t^i = O^i ⊕ c_t^i,  i ∈ B,    (2)
X_t^i = Φ(O_t^i),  i ∈ B,    (3)

where x_t ∈ R^n is the robot state at time t, u_t ∈ R^m is the input at time t, and A and B are the system dynamics matrix and input matrix, respectively. The process noise w_{r,t} ∈ R^n is a zero-mean random vector, independent and identically distributed across time. The initial condition x_0 is subject to an uncertainty model, with the distribution of x_0 belonging to an ambiguity set, P_{x_0} ∈ 𝒫_{x_0}. Moreover, O^i ⊂ R^n represents the shape of obstacle i, c_t^i ∈ R^n is a random vector that represents an uncertain obstacle location and motion, not necessarily zero-mean, with unknown distribution P_{c_t^i} ∈ 𝒫_{c_t^i}, and ⊕ denotes set translation. The obstacles O_t^i, i ∈ B are assumed to be convex polytopes. The state X_t^i of obstacle i is defined by a set function Φ : 2^{R^n} → R^l that maps the obstacle set O_t^i to a finite-dimensional vector describing the location, motion, and shape of each obstacle relative to the uncertain trajectory c_t^i.

We concatenate the robot's state and the obstacle states at time t to form the environmental state

Z_t = [x_t; X_t^O] ∈ R^{n+ln},    (4)

where X_t^O = [X_t^1, X_t^2, ..., X_t^n]^⊤ represents the concatenated states of all n obstacles at time t.
Then the dynamics of the environmental state can be written as

Z_{t+1} = A_z Z_t + B_z u_t + G_z w_t,  A_z = [[A, 0_{n×ln}], [0_{ln×n}, I_{ln}]],  B_z = [[B], [0_{ln×m}]],  w_t = [w_{r,t}; w_{O,t}],    (5)

where G_z = diag(G, I) and w_{O,t} is an obstacle process noise that can be derived from c_t^i. The distribution P_{w_t} of w_t is unknown and is assumed to belong to an ambiguity set 𝒫_w of distributions satisfying

𝒫_w = { P_{w_t} | E[w_t] = 0, E[(w_t − ŵ_t)(w_t − ŵ_t)^⊤] = Σ_{w_t} }.    (6)

At time t, the state of the robot can be extracted from the environmental state Z_t as

x_t = C_{xr} Z_t,  C_{xr} = [I_n, 0_{n×ln}].    (7)

In an autonomous robot, the environmental state Z_t must be estimated with a perception system from noisy onboard sensor measurements. We assume that a high-level perception system, such as the semantic SLAM system described in Sünderhauf et al. (2017), processes high-dimensional raw data Θ_t ∈ R^N to recognize obstacles and produce noisy joint measurements of their state and the robot state. In particular, we define feature vectors Y_t^x ∈ R^r and Y_t^i ∈ R^q for i ∈ B obtained through

Y_t^x = Υ_x(Θ_t),    (8)
Y_t^O = Υ_O(Θ_t) = [Y_t^1, Y_t^2, ..., Y_t^n]^⊤,    (9)

where Υ_x : R^N → R^r and Υ_O : R^N → R^{qn} are mappings defined by the SLAM algorithm to process the raw sensor data. We then represent these features as noisy measurements of the robot and obstacle states using an assumed linear (or linearized) output model

y_t = [y_{x,t}; y_{O,t}] = C([Y_t^x; Y_t^O]) = C Z_t + H v_t,    (10)
C = [C_r; C_O],  H = [H_r; H_O],  v_t = [v_{r,t}; v_{O,t}] ∼ P_{v_t} ∈ 𝒫_v,    (11)

where y_t ∈ R^{p+s}, and y_{x,t} ∈ R^p, y_{O,t} ∈ R^s are the output vectors corresponding to the robot and the obstacles, respectively.
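Concretely, the lifted matrices in (5) and the selector C_{xr} in (7) are simple block assemblies of the robot matrices from (1). A minimal sketch (the function name and NumPy layout are ours, not from the paper):

```python
import numpy as np

def environmental_dynamics(A, B, G, l, n_obs):
    """Assemble the lifted matrices A_z, B_z, G_z of equation (5).

    A, B, G : robot dynamics matrices from (1)
    l       : dimension of each obstacle state X_t^i
    n_obs   : number of obstacles
    """
    n, m = B.shape
    ln = l * n_obs
    # Obstacle states persist and are driven only by w_O,t,
    # so the lifted state matrix is block-diagonal: diag(A, I).
    A_z = np.block([[A, np.zeros((n, ln))],
                    [np.zeros((ln, n)), np.eye(ln)]])
    # Only the robot is actuated.
    B_z = np.vstack([B, np.zeros((ln, m))])
    # Process noise enters the robot through G, the obstacles directly.
    G_z = np.block([[G, np.zeros((n, ln))],
                    [np.zeros((ln, n)), np.eye(ln)]])
    # Selector C_xr of (7) extracting the robot state from Z_t.
    C_xr = np.hstack([np.eye(n), np.zeros((n, ln))])
    return A_z, B_z, G_z, C_xr
```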
The matrices C_r, C_O, H_r, H_O are of appropriate dimensions. The function C maps the feature vectors Y_t^x, Y_t^O to the outputs y_{x,t}, y_{O,t} as a linear function of the environmental state Z_t with additive measurement noise v_t, a zero-mean random variable. For simplicity, we assume that w_t and v_t are independent. The distribution P_{v_t} of v_t is assumed to belong to an ambiguity set 𝒫_v satisfying

𝒫_v = { P_{v_t} | E[v_t] = 0, E[(v_t − v̂_t)(v_t − v̂_t)^⊤] = Σ_{v_t} }.    (12)

The robot is nominally subject to constraints on the state and input of the form, for all t = 0, ..., T − 1,

x_t ∈ X_t^free = X \ ⋃_{i∈B} O_t^i,    (13)
u_t ∈ U,    (14)

where the environment X ⊂ R^n and the input set U ⊂ R^m are assumed to be convex polytopes, the obstacles O_t^i, i ∈ B are described by (2), and the operator \ denotes set subtraction. The set B represents the set of n obstacles in the environment to be avoided. In particular,

U = {u_t | A_u u_t ≤ b_u},  X = {Z_t | A_E C_{xr} Z_t ≤ b_E},  O_t^i = {Z_t | A_i C_{xr} Z_t ≤ b_t^i},  i ∈ B,    (15)

where b_u ∈ R^{n_u}, b_E ∈ R^{n_E}, b_t^i ∈ R^{n_i}, and A_u, A_E, and A_i are matrices of appropriate dimensions. The nonconvex obstacle avoidance constraint for obstacle i ∈ B can be expressed as the disjunction

¬(A_i C_{xr} Z_t ≤ b_t^i)  ⇔  ⋁_{j=1}^{n_i} (a_{ij}^⊤ C_{xr} Z_t ≥ a_{ij}^⊤ c_t^i),    (16)

where ⋁ denotes disjunction.

We seek a dynamic output feedback control policy π = [π_0, ..., π_{T−1}] with u_t = π_t(y_{0:t}, u_{0:t−1}), where y_{0:t} and u_{0:t−1} are the output and input histories available to make control decisions at time t, that produces a feasible and minimum-cost trajectory from an initial state x_0 to a goal set X_goal ⊂ R^n.
In particular, we seek to (approximately) solve the distributionally robust constrained stochastic optimal control problem

minimize_π  Σ_{t=0}^{T−1} ℓ_t(E[Z_t], X_goal, u_t) + ℓ_T(E[Z_T], X_goal)
subject to  Z_{t+1} = A_z Z_t + B_z u_t + G_z w_t,
            y_t = C Z_t + H v_t,
            Z_0 ∼ P_{Z_0} ∈ 𝒫_{Z_0},
            w_t ∼ P_w ∈ 𝒫_w,
            u_t ∈ U = {u_t | A_u u_t ≤ b_u},
            X_t^free = X \ ⋃_{i∈B} O_t^i,
            inf_{P_{Z_t} ∈ 𝒫_Z} P_{Z_t}(C_{xr} Z_t ∈ X_t^free) ≥ 1 − α,    (17)

where 𝒫_Z is an ambiguity set of marginal state distributions and α ∈ (0, 0.5] is a user-prescribed risk parameter. The stage cost functions ℓ_t(·) quantify the robot's distance to the goal set and actuator effort, and are assumed to be expressed in terms of the environmental state mean E[Z_t], so that all the stochasticity appears in the constraints.

Two key features distinguish our problem formulation. First, the state constraints are expressed as distributionally robust chance constraints. This means that the nominal constraints x_t ∈ X_t^free are enforced with probability 1 − α under the worst-case distribution in the ambiguity set. Second, since information about the environmental state is obtained only from noisy measurements, we optimize over dynamic output feedback policies. Our proposed solution framework, detailed in the next section, combines a dynamic state estimator with a full-state kinodynamic motion planning under uncertainty algorithm. This combination and the explicit incorporation of state estimation uncertainty into motion planning and control take a step toward tighter integration of perception, planning, and control, which are nearly always separated in state-of-the-art robotic systems.

3. OUTPUT FEEDBACK DISTRIBUTIONALLY ROBUST RRT* (OFDR-RRT*)

We propose to use a distributionally robust, kinodynamic variant of the RRT* motion planning algorithm with dynamic output feedback policies. RRT* adds a rewiring operation to RRT to obtain asymptotic optimality. Our proposed algorithm grows trees of state and state estimate distributions, rather than merely trees of states, and incorporates distributionally robust probabilistic constraints to build risk-constrained state trajectories and feedback policies.
Sampling-based motion planning algorithms require a steering law to drive the robot from a node in the tree to a feasible sampled point in the free space. Since the environmental state is not directly observed and must be estimated from noisy output measurements (10), our proposed steering law π = [π_0, ..., π_{T−1}] with u_t = π_t(y_{0:t}, u_{0:t−1}) comprises a combination of a dynamic state estimator and a state feedback control law. Here we utilize a Kalman filter (which has been used in seminal SLAM algorithms for joint estimation of robot and environmental states Dissanayake et al. (2001)) for state estimation, together with a finite-horizon optimal linear quadratic state feedback controller. It is also possible within our framework to use more sophisticated estimation and control components (e.g., extended/unscented Kalman filters or particle filters, and stochastic model predictive controllers), which will be explored in future work.

The output feedback control policy has the form

u_t = K_t \tilde{Z}_t + k_t,    (18)

where \tilde{Z}_t is the Kalman filter estimate of the environmental state, and K_t and k_t are linear and constant feedback gains derived with dynamic programming for a finite-horizon LQR problem.

Kalman Filter:
The Kalman filter equations with gain L_t are

L_{t+1} = Σ_{\tilde{Z}_t} C^⊤ (C Σ_{\tilde{Z}_t} C^⊤ + H Σ_{v_t} H^⊤)^{−1},    (19)
\tilde{Z}_{t+1} = (I − L_{t+1} C)(A_z \tilde{Z}_t + B_z u_t) + L_{t+1} y_{t+1},    (20)
Σ_{\tilde{Z}_{t+1}} = (I − L_{t+1} C)(A_z Σ_{\tilde{Z}_t} A_z^⊤ + G_z Σ_{w_t} G_z^⊤).    (21)

Together with the control law (18), the combined dynamics of the true unknown state Z_t and the state estimate \tilde{Z}_t can be written for the stacked vector \mathcal{Z}_t = [Z_t; \tilde{Z}_t] as

\mathcal{Z}_{t+1} = \bar{A}_t \mathcal{Z}_t + \bar{B}_t k_t + \bar{G}_t \mathcal{W}_t,    (22)

where \mathcal{W}_t = [w_t; v_{t+1}] and

\bar{A}_t = [[A_z, B_z K_t], [L_{t+1} C A_z, (I − L_{t+1} C) A_z + B_z K_t]],  \bar{B}_t = [[B_z], [B_z]],  \bar{G}_t = [[G_z, 0], [L_{t+1} C G_z, L_{t+1} H]].    (23)

Let us define \bar{\mathcal{Z}}_t = \mathcal{Z}_t − \hat{\mathcal{Z}}_t and \bar{\mathcal{W}}_t = \mathcal{W}_t − \hat{\mathcal{W}}_t. Then

E[\bar{\mathcal{Z}}_t \bar{\mathcal{Z}}_t^⊤] = Π_t = \tilde{A}_{t−1} Ψ_{t−1} \tilde{A}_{t−1}^⊤,    (24)
E[\bar{\mathcal{W}}_t \bar{\mathcal{W}}_t^⊤] = Σ_{\mathcal{W}_t} = [[Σ_{w_t}, 0], [0, Σ_{v_{t+1}}]],    (25)

where \tilde{A}_t = [\bar{A}_t, \bar{G}_t] and Ψ_t = E[[\bar{\mathcal{Z}}_t; \bar{\mathcal{W}}_t][\bar{\mathcal{Z}}_t; \bar{\mathcal{W}}_t]^⊤] = diag(Π_t, Σ_{\mathcal{W}_t}), using the independence of \bar{\mathcal{Z}}_t and \bar{\mathcal{W}}_t. With the initial estimates

\hat{\mathcal{Z}}_0 = [\hat{Z}_0; \hat{Z}_0],  Π_0 = [[Σ_{Z_0}, 0], [0, 0]],  Σ_{\mathcal{W}_0} = [[Σ_{w_0}, 0], [0, Σ_{v_0}]]

given (or estimated from historical data), the combined state and state estimate mean and covariance evolve as

\hat{\mathcal{Z}}_{t+1} = \bar{A}_t \hat{\mathcal{Z}}_t + \bar{B}_t k_t,    (26)
Π_{t+1} = \bar{A}_t Π_t \bar{A}_t^⊤ + \bar{G}_t Σ_{\mathcal{W}_t} \bar{G}_t^⊤.    (27)

For analysis purposes, the unknown environmental state mean and covariance can then be extracted as

\hat{Z}_{t+1} = [I, 0] \hat{\mathcal{Z}}_{t+1},    (28)
Σ_{Z_{t+1}} = [I, 0] Π_{t+1} [I, 0]^⊤.    (29)

Optimal Finite-Horizon LQR Control:
Define the error e_t = C_{xr} \tilde{Z}_t − x_s, where x_s represents a sample of the free space to be steered to. Then the optimal control gains in (18) can be obtained by minimizing the cost function

J = E[ Σ_{t=0}^{T−1} (e_t^⊤ Q e_t + u_t^⊤ R u_t) + e_T^⊤ Q e_T ],    (30)

via dynamic programming, with the following backward-in-time recursion for t = T − 1, ..., 0:

K_t = −β^{−1} B_z^⊤ P_{t+1} A_z,    (31)
k_t = −β^{−1} B_z^⊤ q_{t+1},    (32)
P_t = Q + A_z^⊤ P_{t+1} [I − B_z β^{−1} B_z^⊤ P_{t+1}] A_z,    (33)
q_t = (A_z + B_z K_t)^⊤ q_{t+1} + (K_t^⊤ β + A_z^⊤ P_{t+1} B_z) k_t − Q x_s,    (34)

with terminal values P_T = Q and q_T = −Q x_s, where β = R + B_z^⊤ P_{t+1} B_z, and Q and R are the state and control cost matrices, respectively.
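The recursion (31)-(34) is a few lines of linear algebra. Below is a minimal sketch (our own naming and lifting; we fold the tracking cost (30) into the environmental state via the weight C_{xr}^⊤ Q C_{xr} and the linear term −C_{xr}^⊤ Q x_s, which plays the role of −Q x_s in (34)):

```python
import numpy as np

def lqr_steering_gains(A_z, B_z, Q, R, x_s, C_xr, T):
    """Backward recursion (31)-(34) for the affine steering law
    u_t = K_t Zhat_t + k_t toward a sampled free-space point x_s."""
    Qz = C_xr.T @ Q @ C_xr          # lifted state weight
    qx = -(C_xr.T @ Q @ x_s)        # linear term from the offset -x_s
    P, q = Qz.copy(), qx.copy()     # terminal values P_T = Q, q_T = -Q x_s
    Ks, ks = [], []
    for _ in range(T):              # t = T-1, ..., 0
        beta = R + B_z.T @ P @ B_z
        beta_inv = np.linalg.inv(beta)
        K = -beta_inv @ B_z.T @ P @ A_z                                      # (31)
        k = -beta_inv @ B_z.T @ q                                            # (32)
        q = (A_z + B_z @ K).T @ q + (K.T @ beta + A_z.T @ P @ B_z) @ k + qx  # (34)
        P = Qz + A_z.T @ P @ (A_z + B_z @ K)                                 # (33), rearranged
        Ks.append(K); ks.append(k)
    return Ks[::-1], ks[::-1]       # gains ordered t = 0, ..., T-1
```

Note that (33) is written here in the algebraically equivalent form Q + A_z^⊤ P_{t+1}(A_z + B_z K_t), obtained by substituting (31).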
Unlike most stochastic motion planning algorithms, which often assume a functional form (typically Gaussian) for the probability distributions that model uncertainties, we focus here on uncertainty modeling using moment-based ambiguity sets. Based on the ambiguity sets for the primitive random variables (namely, the process noise w in (6), the measurement noise v in (12), and the analogous set for the initial state and state estimate \mathcal{Z}_0), and based on the estimator and control law, the combined environmental state and state estimate mean and covariance propagate according to (26) and (27). Since the primitive distributions are not assumed to be Gaussian, neither are the marginal state and state estimate distributions P_{\mathcal{Z}_t}. The ambiguity set for the combined environmental state and state estimate is

𝒫_{\mathcal{Z}} = { P_{\mathcal{Z}_t} | E[\mathcal{Z}_t] = \hat{\mathcal{Z}}_t, E[(\mathcal{Z}_t − \hat{\mathcal{Z}}_t)(\mathcal{Z}_t − \hat{\mathcal{Z}}_t)^⊤] = Π_t },    (35)

and the ambiguity set for the true environmental state is

𝒫_Z = { P_{Z_t} | E[Z_t] = \hat{Z}_t, E[(Z_t − \hat{Z}_t)(Z_t − \hat{Z}_t)^⊤] = Σ_{Z_t} }.    (36)

The control law returned by the steering function should also satisfy the state constraints, which are expressed as distributionally robust chance constraints. In particular, the nominal state constraints C_{xr} Z_t ∈ X_t^free are required to be satisfied with probability 1 − α under the worst-case probability distribution in the ambiguity set.
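These ambiguity sets are parameterized entirely by the moment pair produced by the recursions (26)-(27). One propagation step can be sketched as follows (matrix names follow (22)-(23); the function and its signature are our illustration, not code from the paper):

```python
import numpy as np

def propagate_joint_moments(Zhat, Pi, A_z, B_z, G_z, C, H,
                            K, k, L_next, Sigma_w, Sigma_v):
    """One step of the joint mean/covariance recursion (26)-(27) for the
    stacked vector [Z_t; Ztilde_t] under the policy u_t = K Ztilde_t + k."""
    nz = A_z.shape[0]
    # Abar from (22): true-state row and estimator closed-loop row.
    Abar = np.block([
        [A_z,              B_z @ K],
        [L_next @ C @ A_z, (np.eye(nz) - L_next @ C) @ A_z + B_z @ K]])
    Bbar = np.vstack([B_z, B_z])
    # Gbar from (23): noise enters the estimator through the filter gain.
    Gbar = np.block([
        [G_z,              np.zeros((nz, H.shape[1]))],
        [L_next @ C @ G_z, L_next @ H]])
    Sigma_W = np.block([
        [Sigma_w, np.zeros((Sigma_w.shape[0], Sigma_v.shape[1]))],
        [np.zeros((Sigma_v.shape[0], Sigma_w.shape[1])), Sigma_v]])
    Zhat_next = Abar @ Zhat + Bbar @ k                       # (26)
    Pi_next = Abar @ Pi @ Abar.T + Gbar @ Sigma_W @ Gbar.T   # (27)
    return Zhat_next, Pi_next
```

The true-state covariance Σ_{Z_t} used in (36) is then the upper-left block of Π_t, exactly as in (29).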
Let the moment-based ambiguity set for obstacle motion be defined by E[c_t^i] = ĉ_t^i and E[(c_t^i − ĉ_t^i)(c_t^i − ĉ_t^i)^⊤] = Σ_{c_t^i}. Now, under the moment-based ambiguity set defined by (36), a constraint on the worst-case probability of violating the j-th constraint of obstacle i ∈ B,

sup_{P_{Z_t} ∈ 𝒫_Z} P_{Z_t}(a_{ij}^⊤ C_{xr} Z_t ≤ a_{ij}^⊤ c_t^i) ≤ α_i,    (37)

is equivalent to the linear constraint on the state mean \hat{Z}_t

a_{ij}^⊤ C_{xr} \hat{Z}_t ≥ a_{ij}^⊤ ĉ_t^i + √((1 − α_i)/α_i) ‖(D_{\hat{x}_t} + Σ_{c_t^i})^{1/2} a_{ij}‖,    (38)

where D_{\hat{x}_t} = C_{xr} Σ_{Z_t} C_{xr}^⊤ and α_i is the user-prescribed risk parameter for obstacle i ∈ B. Obstacle risks are allocated such that their sum does not exceed the constraint risk α. The scaling constant √((1 − α_i)/α_i) in the deterministic tightening of the nominal constraint is larger than the Gaussian one, leading to a stronger tightening that reflects the weaker assumptions about the uncertainty distributions.
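The tightened half-space test (38) reduces to a few lines. A sketch (argument names are ours); note the distributionally robust scaling √((1 − α_i)/α_i) ≈ 4.36 at α_i = 0.05, versus ≈ 1.64 for the corresponding Gaussian quantile, which is the stronger tightening referred to above:

```python
import numpy as np

def dr_halfspace_satisfied(a_ij, c_hat, Z_hat, Sigma_Z, Sigma_c, C_xr, alpha_i):
    """Check the tightened constraint (38) for one obstacle half-space:
    a^T C_xr Zhat >= a^T chat + sqrt((1-a)/a) * sqrt(a^T (D + Sigma_c) a),
    where D = C_xr Sigma_Z C_xr^T is the extracted state covariance."""
    D = C_xr @ Sigma_Z @ C_xr.T
    scale = np.sqrt((1.0 - alpha_i) / alpha_i)      # DR tightening constant
    spread = np.sqrt(a_ij @ (D + Sigma_c) @ a_ij)   # worst-case spread along a_ij
    return a_ij @ (C_xr @ Z_hat) >= a_ij @ c_hat + scale * spread
```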
OFDR-RRT* tree expansion procedure, similar to the CC-RRT* algorithm developed in Luders et al. (2013), is presented in Algorithm 1. The OFDR-RRT* tree is denoted by T and consists of |T| nodes. Each node N of the tree T consists of a sequence of state distributions, characterized by a distribution mean x̂ and covariance D. A sequence of means and covariances is denoted by σ̄ and Π̄, respectively. The final mean and covariance of a node's sequence are denoted by x̂[N] and D[N], respectively. For the state distribution sequence (σ̄, Π̄), the notation ΔJ(σ̄, Π̄) denotes the cost of that sequence. If (σ̄, Π̄) denotes the trajectory of node N with parent N_parent, then we denote by J[N] the entire path cost from the starting state to the terminal state of node N, constructed recursively as

J[N] = J[N_parent] + ΔJ(σ̄, Π̄).    (39)

In the first step, a random sample x_rand is taken from the feasible state set X_t^free. Then the tree node N_nearest that is nearest to the sample is selected (line 3 of Algorithm 1) according to an optimal cost-to-go function without the obstacle constraints, in order to efficiently explore the reachable set of the dynamics and increase the likelihood of generating collision-free trajectories (similar to what is done in Frazzoli et al. (2002)). Attempts are then made to steer the robot from the nearest tree node to the random sample using the steering law explained in Section 3.1 (line 4). The control policy obtained is then used to propagate the state mean and covariance, and the entire trajectory (σ̄, Π̄) is returned by the steer function. Each state distribution in the trajectory is then checked for distributionally robust probabilistic constraint satisfaction given by (38), and the lines connecting subsequent state distributions in the trajectory are also checked for collision with the obstacle sets O_t^i, i ∈ B. An outline of the DR-Feasible subroutine is shown in Algorithm 2.

If the entire trajectory (σ̄, Π̄) is probabilistically feasible, a new node N_min with that distribution sequence (σ̄, Π̄) is created (line 7) but not yet added to T. Instead, nearby nodes are identified for possible connections via the NearNodes function (line 8), which returns the subset of nodes N_near ⊆ T within a search radius ensuring the probabilistic asymptotic optimality guarantees specified in Luders et al. (2013),

r = min{ γ (log(|N_t|)/|N_t|)^{1/d}, μ_max },    (40)

where N_t refers to the set of nodes in the tree at time t, μ_max is a maximum search radius, and γ is a planning constant based on the d-dimensional environment. Then we seek to identify the lowest-cost, probabilistically feasible connection from the N_near nodes to x_rand (lines 10-14). For each possible connection, a distribution sequence is simulated via the steering law (line 11). If the resulting sequence is probabilistically feasible, and the cost of that node, c_rand = J[N_near] + ΔJ(σ̄, Π̄), is lower than the cost of N_min, denoted by J[N_min], then a new node with this sequence replaces N_min (line 14). The lowest-cost node is ultimately added to T (line 15).

Finally, edges are rewired based on attempted connections from the new node N_min to nearby nodes N_near (lines 17-22), ancestors excluded (line 17). A distribution sequence is simulated via the steering law from N_min to the terminal state of each nearby node N_near ∈ N_near (line 18). If the resulting sequence is probabilistically feasible, and the cost of that node, c_near, is lower than the cost of N_near given by J[N_near] (line 19), then a new node with this distribution sequence replaces N_near (lines 21-22). The tree expansion procedure is repeated until a node from the goal set is added to the tree. At that point, a distributionally robust feasible trajectory is obtained from the tree root to X_goal.
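The shrinking connection radius (40) is a one-liner in practice. A sketch with the constants used later in Section 4 (γ = 20, μ_max = 1) as defaults; the guard for a nearly empty tree is our addition:

```python
import math

def near_radius(num_nodes, d, gamma=20.0, mu_max=1.0):
    """Search radius (40): shrinks like (log n / n)^(1/d) as the tree
    grows, capped at mu_max so early iterations are not unbounded."""
    if num_nodes < 2:               # log(1)/1 = 0 would give a zero radius
        return mu_max
    return min(gamma * (math.log(num_nodes) / num_nodes) ** (1.0 / d), mu_max)
```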
4. SIMULATION RESULTS

We demonstrate our proposed framework using a double integrator robot moving in a bounded and cluttered environment. While the proposed framework can handle dynamic and uncertain obstacles, for simplicity of illustration we assume the obstacles are static and deterministic (w_{O,t} = 0 for all t), so that all uncertainty in this example comes from the unknown initial state, the robot process disturbance, and the measurement noise. The robot dynamics matrices are those of a standard discretized double integrator,

A = [[I_2, dt·I_2], [0_2, I_2]],  B = [[(dt²/2)·I_2], [dt·I_2]],  G = B,

where dt is the sampling time; the states are the two-dimensional position and velocity, with two-dimensional force inputs. The environmental state dynamics matrices A_z, B_z, G_z are formed accordingly using (5) with the above robot dynamics matrices. We assume the robot starts from the origin with zero initial velocity, and that the initial state and noise covariance matrices Σ_{x_0}, Σ_w, and Σ_v are small scalar multiples of the identity. The robot is treated as a point mass without loss of generality, since a known geometry can be easily handled by a fixed tightening of the state constraints. The 2D position of the robot is sampled uniformly within the bounds of the feasible 2D environment, whose boundaries are not treated probabilistically. The search radius in Algorithm 1 uses a maximum radius μ_max = 1 and the environment-based planning constant γ = 20. The output model matrices in (10) are chosen with identity blocks of appropriate dimensions,

C_O = I,  H_r = I,    (41)

with C_r and H_O chosen compatibly.

We define the error e_t = C_{xr} Z_t − x_s for t = 0, ..., T, with T = 5. A dynamic output feedback policy u_t of the form given by (18) that minimizes the cost function

J = E[ Σ_{t=0}^{T−1} (e_t^⊤ Q e_t + u_t^⊤ R u_t) + e_T^⊤ Q e_T ],    (42)

is computed using dynamic programming to steer the robot from a tree node state to a random feasible sample x_s. The matrices Q = diag(I, 2I) and R, a scalar multiple of the identity, are used to penalize the state and control deviations, respectively. The distributionally robust state constraints are enforced with probabilistic satisfaction parameter α = 0.05.

Fig. 1. No Risk Constraints. An RRT* tree of 1200 nodes showing nominal states and 1-σ position uncertainty ellipses, generated without any constraint tightening. Since uncertainty is not explicitly incorporated into collision checking, it produces highly risky trajectories.

Algorithm 1 OFDR-RRT* Tree Expansion Procedure
 1: Inputs: current tree T, time step t
 2: x_rand ← Sample(X_t)
 3: N_nearest ← NearestNode(x_rand, T)
 4: (σ̄, Π̄) ← Steer(x̂[N_nearest], D[N_nearest], x_rand)
 5: // check whether the sequence (σ̄, Π̄) is DR-Feasible
 6: if DR-Feasible(σ̄, Π̄) then
 7:   create node N_min{σ̄, Π̄}
 8:   N_near ← NearNodes(T, x_rand, |T|)
 9:   // connect along a minimum-cost path
10:   for each N_near ∈ N_near \ N_nearest do
11:     (σ̄, Π̄) ← Steer(x̂[N_near], D[N_near], x_rand)
12:     c_rand ← J[N_near] + ΔJ(σ̄, Π̄)
13:     if DR-Feasible(σ̄, Π̄) and c_rand < J[N_min] then
14:       replace N_min with N_min{σ̄, Π̄}
15:   add N_min to T
16:   // rewire the tree
17:   for each N_near ∈ N_near \ Ancestors(N_min) do
18:     (σ̄, Π̄) ← Steer(x̂[N_min], D[N_min], x̂[N_near])
19:     c_near ← J[N_min] + ΔJ(σ̄, Π̄)
20:     if DR-Feasible(σ̄, Π̄) and c_near < J[N_near] then
21:       delete N_near from T
22:       add new node N_new{σ̄, Π̄} to T

Algorithm 2 DR-Feasible Subroutine
 1: Input: T-step distribution sequence (σ̄, Π̄)
 2: for t = 1 to T do
 3:   (x̂_t, D_{x̂_t}) ← t-th element of the (σ̄, Π̄) sequence
 4:   L ← line connecting the position block of x̂_{t−1} to that of x̂_t
 5:   for each i ∈ B do
 6:     if (x̂_t, D_{x̂_t}) violates (38) or L intersects O_t^i then
 7:       return false
 8: return true

Three different simulations using the above double integrator system are performed, namely: 1) a deterministic collision check, where uncertainties are not accounted for; 2) a chance-constrained collision check, where the system noises are assumed to be Gaussian; and 3) a distributionally robust collision check, assuming the noises belong to their respective ambiguity sets. For all the simulations, the
RRT* algorithm with the corresponding collision-check procedure is run for 1200 iterations, with 1-σ position uncertainty ellipses from the covariance matrix drawn at the end of each trajectory.

The tree generated by an RRT* algorithm using a deterministic collision check, where uncertainties are not accounted for, is shown in Figure 1. Highly risky trajectories around the obstacles are generated, since the uncertainty in the state due to the initial localization and system dynamics uncertainties is not explicitly incorporated. Assuming Gaussian noises and using risk parameter α = 0.05, the chance-constrained variant of RRT* generates more conservative trajectories, as shown in Figure 2. However, the actual perception uncertainties in robotic systems often do not match well with a Gaussian assumption, so this approach can still significantly underestimate the risk of constraint violation.

Fig. 2. Gaussian Risk Constraints. An RRT* tree of 1200 nodes showing nominal states and 1-σ position uncertainty ellipses, generated with Gaussian chance constraints with α = 0.05. Although the trajectories are less risky than when uncertainty is ignored, this approach can still significantly underestimate risks of constraint violation since the actual perception uncertainties in robotic systems often do not match well with a Gaussian assumption.

Fig. 3. DR Risk Constraints. An RRT* tree of 1200 nodes showing nominal states and 1-σ position uncertainty ellipses, generated with OFDR-RRT* using distributionally robust chance constraints with α = 0.05. This approach produces more risk-averse trajectories with a more sophisticated and coherent risk quantification.

The
OFDR-RRT* tree shown in Figure 3 generates more conservative trajectories around the obstacles than its Gaussian chance-constrained counterpart, by explicitly incorporating the uncertainty in the state due to the initial localization, system dynamics, and measurement uncertainties in the form of ambiguity sets. It produces trajectories that satisfy the chance constraints under the worst-case distribution in the ambiguity sets. Clearly, the feasible set is smaller with the distributionally robust constraints, and certain nominally feasible paths from the initial state to the goal are deemed too risky in the presence of the uncertainties, e.g., the relatively narrow gaps to the right of and below the goal region. These trajectories, with a more sophisticated and coherent quantification of risk, are generated with the same computational complexity as with Gaussian chance constraints.

5. CONCLUSION

In this paper, we presented a methodological framework aimed at tighter integration of perception and planning in autonomous robotic systems. The environmental state is estimated from sensor data to propagate both estimates and uncertainties of both the robot and the obstacles. Risk constraints are posed in a meaningful and coherent manner through distributionally robust chance constraints. Using a dynamic output feedback controller together with the distributionally robust risk constraints, a new algorithm called OFDR-RRT* is shown to produce risk-bounded trajectories with coherent risk assessment. Future research involves studying distribution propagation for nonlinear systems with higher-order moments and considering Kalman filter variations (e.g., unscented) along with more sophisticated steering methods to explicitly incorporate nearby obstacle constraints. Also, the combination of moment- and data-based distribution parameterizations for uncertainty modeling could be used to combine their relative advantages.

REFERENCES

Agha-Mohammadi, A.A., Chakravorty, S., and Amato, N.M. (2014).
Firm: Sampling-based feedback motion planning under motion uncertainty and imperfect measurements. The International Journal of Robotics Research, 33(2), 268–304.
Blackmore, L., Li, H., and Williams, B. (2006). A probabilistic approach to optimal robust path planning with obstacles. In Proceedings of the American Control Conference, 7 pp. IEEE.
Blackmore, L., Ono, M., and Williams, B.C. (2011). Chance-constrained optimal path planning with obstacles. IEEE Transactions on Robotics, 27(6), 1080–1094.
Costante, G., Forster, C., Delmerico, J., Valigi, P., and Scaramuzza, D. (2016). Perception-aware path planning. arXiv preprint arXiv:1605.04151.
Dissanayake, M.G., Newman, P., Clark, S., Durrant-Whyte, H.F., and Csorba, M. (2001). A solution to the simultaneous localization and map building (SLAM) problem. IEEE Transactions on Robotics and Automation, 17(3), 229–241.
Florence, P., Carter, J., and Tedrake, R. (2016). Integrated perception and control at high speed: Evaluating collision avoidance maneuvers without maps. In Workshop on the Algorithmic Foundations of Robotics (WAFR).
Frazzoli, E., Dahleh, M.A., and Feron, E. (2002). Real-time motion planning for agile autonomous vehicles. Journal of Guidance, Control, and Dynamics, 25(1), 116–129.
Goh, J. and Sim, M. (2010). Distributionally robust optimization and its tractable approximations. Operations Research, 58(4-part-1), 902–917.
Liu, W. and Ang, M.H. (2014). Incremental sampling-based algorithm for risk-aware planning under motion uncertainty. In IEEE International Conference on Robotics and Automation (ICRA), 2051–2058. IEEE.
Luders, B., Kothari, M., and How, J. (2010). Chance constrained RRT for probabilistic robustness to environmental uncertainty. In AIAA Guidance, Navigation, and Control Conference, 8160.
Luders, B.D., Karaman, S., and How, J.P. (2013). Robust sampling-based motion planning with asymptotic optimality guarantees. In AIAA Guidance, Navigation, and Control (GNC) Conference, 5097.
Majumdar, A. and Pavone, M. (2017). How should a robot assess risk? Towards an axiomatic theory of risk in robotics. arXiv preprint arXiv:1710.11040.
Rockafellar, R.T. (2007). Coherent approaches to risk in optimization under uncertainty. In OR Tools and Applications: Glimpses of Future Technologies, 38–61. INFORMS.
Summers, T. (2018). Distributionally robust sampling-based motion planning under uncertainty. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 6518–6523. IEEE.
Sünderhauf, N., Pham, T.T., Latif, Y., Milford, M., and Reid, I. (2017). Meaningful maps with object-oriented semantic mapping. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 5079–5085. IEEE.
Wiesemann, W., Kuhn, D., and Sim, M. (2014). Distributionally robust convex optimization. Operations Research, 62(6), 1358–1376.
Zhu, H. and Alonso-Mora, J. (2019). Chance-constrained collision avoidance for MAVs in dynamic environments. IEEE Robotics and Automation Letters, 4(2), 776–783.