Risk-Aware Motion Planning for a Limbed Robot with Stochastic Gripping Forces Using Nonlinear Programming
Yuki Shirai, Xuan Lin, Yusuke Tanaka, Ankur Mehta, Dennis Hong
IIEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JUNE, 2020 1
Risk-Aware Motion Planning for a Limbed Robotwith Stochastic Gripping Forces Using NonlinearProgramming
Yuki Shirai , Xuan Lin , Yusuke Tanaka , Ankur Mehta , and Dennis Hong Abstract —We present a motion planning algorithm with prob-abilistic guarantees for limbed robots with stochastic grippingforces. Planners based on deterministic models with a worst-case uncertainty can be conservative and inflexible to considerthe stochastic behavior of the contact, especially when a grip-per is installed. Our proposed planner enables the robot tosimultaneously plan its pose and contact force trajectories whileconsidering the risk associated with the gripping forces. Ourplanner is formulated as a nonlinear programming problemwith chance constraints, which allows the robot to generate avariety of motions based on different risk bounds. To modelthe gripping forces as random variables, we employ GaussianProcess regression. We validate our proposed motion planningalgorithm on an 11.5 kg six-limbed robot for two-wall climbing.Our results show that our proposed planner generates varioustrajectories (e.g., avoiding low friction terrain under the low riskbound, choosing an unstable but faster gait under the high riskbound) by changing the probability of risk based on variousspecifications.
Index Terms —Legged Robots, Motion and Path Planning,Optimization and Optimal Control
I. INTRODUCTION P LANNING complex motions for limbed robots is chal-lenging because planners need to design footsteps and abody trajectories while considering the robot kinematics andreaction forces. Motion planning for limbed robots has beenstudied by a number of researchers. Sampling-based planning,such as the Probabilistic-Roadmap (PRM), samples the en-vironment and generates the motion while satisfying staticequilibrium and kinematics for a robot [1], [2]. Optimization-based planning, such as Mixed-Integer Convex Programming(MICP) and Nonlinear Programming (NLP), solves the solu-tion given constraints using optimization algorithms such asgradient descent [3]-[9].
Manuscript received: February, 24, 2020; Revised May, 8, 2020; AcceptedMay, 30, 2020.This paper was recommended for publication by Editor AbderrahmaneKheddar upon evaluation of the Associate Editor and Reviewers’ comments.(
Corresponding author: Yuki Shirai. ) This work was partially supported byONR through a grant N00014-15-1-2064. The work of Y. Shirai was partiallysupported by the Funai Foundation for Information Technology. Y. Shirai, X. Lin, Y. Tanaka, and D. Hong are with the Roboticsand Mechanisms Laboratory, Department of Mechanical and AerospaceEngineering, University of California, Los Angeles, CA 90095, USA. [email protected] , [email protected] , [email protected] , [email protected] A. Mehta is with the Laboratory for Embedded Machines and UbiquitousRobotics, Department of Electrical and Computer Engineering, University ofCalifornia, Los Angeles, CA 90095, USA. [email protected]
Fig. 1:
A planned trajectory for wall climbing that considers riskarising from slippery terrain. The black area shows a high frictionarea, the green area shows a low friction area, and the red areashows a zero friction area. Blue and red dots show the planned footpositions, and the hexagons show the body of the robot.
While many papers discuss motion planning for the robot,few studies have investigated how planning is affected bystochastic gripping forces. One of the open problems inmotion planning of a limbed robot equipped with grippersis the stochastic nature of gripping [10], [11]. For example,the gripping forces caused by spine grippers depend on thestochastically distributed asperity strength (Fig. 2). Thus, risk results from the randomness of the gripping force. By consid-ering risk in a probabilistic manner, the planner can design avariety of trajectories based on various specifications.The stochastic planning problem can be categorized intotwo approaches: robust approaches [5]-[7], [12] and risk-bounded approaches [7], [8], [13]-[15]. In robust approaches,the planners design trajectories that guarantee the feasibilityof the motion given the uncertainty bounds. A soft constraints-based robust planning was investigated in [6], where theplanner allows the solution to be at the boundary of stability.Tas showed the planner to remain collision-free for the worst-case uncertainty for automated driving [12].On the other hand, the risk-bounded approach designs tra-jectories that guarantee the feasibility of the motion given theprobability density function (PDF): it prevents the probabilityof violating state constraints (violation probability) from beinghigher than a pre-specified probability. Prete formulated achance-constrained optimization problem of a bipedal robot c (cid:13) a r X i v : . [ c s . R O ] J un IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JUNE, 2020 by approximating a joint chance constraints with linear in-equality constraints [7]. Planning on slippery terrain was in[8], where the planner utilizes the prediction of the coefficientof friction to design the motion of the body and footsteps,respectively. Our approach is similar to [8], but we model thestochastic contact force of the robot and formulate the planningalgorithm considering the trajectory of a body and footstepssimultaneously.For tasks with a higher probability of failure (e.g., climbingon slippery terrain) [15], the risk-bounded approach has advan-tages over the robust approach. Because the robust approachoften uses a much less informative deterministic model, itis likely to generate conservative solutions with the worst-case uncertainty bound. For demanding tasks, this may beinfeasible, with such a planner generating no possible solutionand failing to achieve specified goals. In contrast, because therisk-bounded approach can be more aggressive, the problemmay be feasible, generating trajectories that carry a probabilityof failure through risk-taking alongside a non-zero chanceof successfully achieving the goal. The violation probabilityprovides a tuning knob to define a Pareto boundary on therisk between failure while finding a trajectory vs. failure whileexecuting a found trajectory. This user-defined parameter canbe task- and environment-specific, in contrast to the rigidityof the robust approach.In this paper, we address a motion planning algorithmformulated as NLP for a limbed robot with stochastic grippingforces. Our proposed planner solves for stable postures andforces simultaneously with guaranteed bounded risk. In addi-tion, chance constraints are introduced into the planner thatrestrict contact forces in a probabilistic manner. We employ aGaussian Process (GP), a non-parametric Bayesian regressiontool, to acquire the PDF of the gripping force. Our proposedmotion planning algorithm is validated on an 11.5 kg hexapodrobot with spine grippers for multi-surface climbing. While wefocus on multi-surface robotic climbing with spine grippersin this paper, our proposed planner can be applied to otherrobots with any type of grippers for performing any task (e.g.,planning of walking, grasping) as long as the robot has contactpoints with stochastic models.The contributions of this paper are as follows:1) We formulate risk-bounded NLP-based planning thatconsiders the stochasticity of gripping forces.2) We employ the Gaussian Process to model grippingforces as random variables.3) We validate the algorithm in hardware experiments.II. PROBLEM FORMULATIONThis section describes the friction cone considering maxi-mum gripping forces, model of a position-controlled limbedrobot with multi-contact surfaces, and the modeling processof a gripping force through GP.
A. Friction Cone with Stochastic Gripping Forces
With grippers, the friction cone constraint can be relaxedon the contact point. For our spine-based gripper, even undera zero normal load, the spines insert into the microscopic gaps
Fig. 2: Deflection of a multi-limbed robot bracing between walls (cid:1)(cid:2)(cid:3)(cid:4)(cid:5)(cid:6)(cid:7)(cid:8)(cid:5)(cid:9)(cid:3) (cid:1) (cid:2) (cid:3) (cid:4) (cid:5) (cid:6)(cid:7) (cid:8) (cid:2) (cid:3) (cid:9) (cid:10) (cid:1) (cid:1) (cid:1) (cid:2) (cid:1) (cid:3) (cid:10)(cid:3)(cid:4)(cid:11)(cid:12)(cid:13)(cid:14)(cid:3)(cid:15)(cid:16)(cid:7)(cid:3)(cid:4)(cid:11)(cid:12)(cid:13)(cid:14)(cid:3) (cid:15)(cid:16)(cid:7)(cid:3)(cid:4)(cid:11)(cid:12)(cid:13)(cid:14)(cid:3) (cid:1) (cid:1)
Fig. 3: Friction cone considering stochastic gripping forces on the surface (Fig. 2), generating a significant amount of shearforce (Fig. 6) [19]. For a magnet-based gripper, the reactionforces includes the additional magnetic force imposed by thegripper itself, offsetting the friction cone as seen by the restof the robot.Thus, we modify the regular friction cone, adding in anoffset shear force when a normal force is zero to accountfor the gripping force. As the normal force increases, themaximal allowable shear force increase in the same way asa regular frictional force, with a coefficient of friction λ thatis assumed to be a constant only depending on the property ofthe contact surface. This contact model is illustrated in Fig. 3,where f r is the reaction force between the surface and thegripper. f g,m is the maximum gripping force from grippersunder a zero normal force. Note that f g,m is measured pergripper as a unit. In general, f g,m can have both normal andshear components. However, for our spine grippers, the normalcomponent of f g,m is relatively small, so we assume thatthe gripper generates only shear adhesion. The gripper doesnot slip when f r is within this friction cone, as indicatedby the shaded region in Fig. 3. Since the interaction betweenthe micro-spines and the surface is highly random, f g,m isnaturally modeled as a Gaussian random variable. However,the orientation of the spine and the number of spines in contactwith the surface also change as the orientation of the gripperchanges, which leads to a shift of the mean and standarddeviation of f g,m . We learn this model from data by GP. WithGP, our proposed planner is able to deal with the stochasticnature of gripping taking into account the gripper orientation. B. Model of Reaction Force Using Limb Compliance
During multi-surface locomotion, the robot leverages thecompliance from its motors in order to squeeze itself be-tween multi-surfaces, as depicted in Fig. 2. One difficultymulti-limbed robots have is that reaction forces are staticallyindeterminate [16]. Consequently, reaction forces cannot bedetermined by static equilibrium equations when the robotsupports its weight more than three contact points. Hence, inorder to calculate the reaction force under this condition, thedeformation of the robotic system should be considered.
HIRAI et al. : RISK-AWARE MOTION PLANNING FOR A LIMBED ROBOT WITH STOCHASTIC GRIPPING FORCES USING NONLINEAR PROGRAMMING 3
From the standard elasticity theory, f r can be described asthe spring force using the Virtual Joint Method [17]: f r = K ( δ wall − δ CoM ) (1) K = (cid:0) J k − J (cid:62) (cid:1) − , k = diag ( k i ) , i = 1 , . . . , H (2)where K is the stiffness matrix for H degree-of-freedom limb. k is a diagonal matrix that has k i diagonal elements, and k i isthe spring coefficients of the position-controlled servos. J isa × H Jacobian matrix. The deflection is imposed by terrainwhere δ wall is the displacement of the terrain and δ CoM is thebody deflection, sag-down, due to the compliance as shown inFig. 2.
C. Model of Gripping Force Using Gaussian Process
The objective of using GP is to predict the maximumgripping force f g,m in a probabilistic way.There are many design decisions that go into the formulationof the GP problem, including choice of kernel, distance metric,and associated weighting between state variables [18]. Wecan start with the simplest formulation with all state variablesequally weighted under the Euclidean distance metric using thesquared exponential kernel as a starting point. In practice, thischoice was observed to work well enough to not necessitatefurther design. A more general characterization of the effectsof these hyperparameters can be found in [18]. In this work, weassume that the maximum gripping forces by spine grippers isa function of the gripper orientation and the coefficient of fric-tion [10], [11], [19], [20]. This is because with a microscopicview, the spine-asperity interaction is different depending onhow a spine is inclined with respect to the asperity as shownin Fig. 2. GP can handle the effects of other unmodeledparameters by treating them into uncertainty. Hence, the state s is a four-dimensional vector with s = [ α, β, γ, λ ] (cid:62) where α , β , γ are the Euler angles along x , y , z axis defined in Fig. 4.Here, we assume that the maximum shear force followsGaussian distribution. Given a data set S = { s , · · · , s n } with the measured shear forces y g,m = [ y g,m , . . . , y g,mn ] (cid:62) ,the maximum shear force f g,m by a gripper can therefore bemodeled as: f g,m ( s ) ∼ GP ( µ g,m ( s ) , κ g,m ( s , s ∗ )) (3)where, f g,m = [ f g,m , . . . , f g,mn ] (cid:62) , n is the number of samplesfrom a GP. µ g,m ( s ) = [ µ g,m ( s ) , . . . , µ g,mn ( s )] (cid:62) is the meanand [ κ g,m ] i,j = κ g,m ( s i , s j ) is the covariance matrix, where κ g,m ( · , · ) is a positive definite kernel. In this work, we employthe squared exponential kernel as follows: κ g,m ( s i , s j ) = σ f exp (cid:32) − | s i − s j | (cid:96) (cid:33) (4)where σ f represents the amplitude parameter and l defines thesmoothness of the function f g,m .Here, let D = [ s , · · · , s n ] (cid:62) be the matrix of the inputs.In order to predict the mean and variance matrix at D ∗ , weobtain the predictive mean and variance of the maximum shearforce by assuming that it is jointly Gaussian as follows: ˆ f g,m = E [ f g,m ( D ∗ )] = κ (cid:62)∗ (cid:0) K D + σ n I (cid:1) − y g (5) (cid:1)(cid:2)(cid:2)(cid:3) (cid:4)(cid:3)(cid:3)(cid:5)(cid:6)(cid:7)(cid:8)(cid:9)(cid:10)(cid:3)(cid:11)(cid:12)(cid:13)(cid:14)(cid:10)(cid:15)(cid:11)(cid:12)(cid:13)(cid:14)(cid:10)(cid:15)(cid:16)(cid:17)(cid:2)(cid:5)(cid:18)(cid:9)(cid:18)(cid:16)(cid:11)(cid:12)(cid:14)(cid:10)(cid:9)(cid:16)(cid:19)(cid:5)(cid:20)(cid:8)(cid:11)(cid:20)(cid:14)(cid:18)(cid:9)(cid:13)(cid:21)(cid:9)(cid:6)(cid:7)(cid:5)(cid:10)(cid:14)(cid:22)(cid:8)(cid:17)(cid:14)(cid:10)(cid:23)(cid:17)(cid:14)(cid:10)(cid:23) (cid:1) (cid:2)(cid:3)(cid:4) (cid:11)(cid:12)(cid:13)(cid:14)(cid:10)(cid:15)(cid:24)(cid:9)(cid:9)(cid:18)(cid:20)(cid:9) (cid:2)(cid:3) (cid:4)(cid:5) (cid:6) (cid:1) Fig. 4: Mechanical design of the spine gripper (cid:1)(cid:2)(cid:3)(cid:4)(cid:5)(cid:6)(cid:7)(cid:8)(cid:3)(cid:9) (cid:1)(cid:10)(cid:3)(cid:11)(cid:11)(cid:5)(cid:10) (cid:12)(cid:13)(cid:10)(cid:14)(cid:5)(cid:6)(cid:1)(cid:8)(cid:2)(cid:15)(cid:5)(cid:16)(cid:8)(cid:17)(cid:4)(cid:11)(cid:8)(cid:11)(cid:5)(cid:10) (cid:18)(cid:8)(cid:10)(cid:10)(cid:3)(cid:5)(cid:10)
Fig. 5: Experiment setup to evaluate maximum gripping forces on sandpaper ˆ Σ g,m = V [ f g,m ( D ∗ )] = κ ∗∗ − κ (cid:62)∗ (cid:0) K D + σ n I (cid:1) − κ ∗ (6)where κ ∗ = κ g,m ( D ∗ , D ) , K D = κ g,m ( D, D ) , κ ∗∗ = κ g,m ( D ∗ , D ∗ ) , and σ n is the variance of the Gaussian ob-servation noise with zero mean.Our GP procedure can be generalizable to model other grip-ping forces as long as the gripping force changes continuouslyas the orientation of the gripper changes. For instance, the GPapproach can be used to model the gecko gripper force [21]using the detachment angle as the state of the GP. D. Spine Gripper for Wall Climbing
A three-finger spine-based gripper was designed (Fig. 4)using spine cells based on [20]. Each finger consists of a spinecell with 25 machine needles loaded with 5 mN/mm springs,and a slider mechanism holds the cell with one compliantplastic spring. The diameter of the needle at the tip is 0.93mm, and it is made of carbon steel. The gripper center moduleincludes one spine palm with the same spine configurationsas the cells. The attachment component is fixed at the tip ofthe robot limb at ◦ from the limb axis to maximize thecontact area. The finger, center, and attachment members areassembled with a one-slider, two linkage mechanism (Fig. 4).This linkage system is designed to provide a passive microgrip as the center palm presses up against a wall. The threefingers are located at ◦ apart from each other in z -axis andtilted about ◦ from z -axis. E. Data Collection
To collect a dataset, maximum gripping forces f g,m wereevaluated with a minimal normal force at varied orientationsas summarized in Table I. We collected 20 data sequences forevery orientation as a training dataset and 10 data sequencesas a testing dataset. The coefficient of friction between spinesand environments was measured by loading a constant mass onthe gripper. A small activation force is necessary to compressspine springs and ensure that the spines are touching thewall, but is assumed to be negligible. These orientationswere selected to cover possible gripper angles during regularwall climbing. The gripper was fixed to a linear slider at anorientation and pulled by a force gauge on 36-grit and 80-grit IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JUNE, 2020
TABLE I: Varied orientations for collecting datasets of GPTraining α, β = − ◦ , ◦ , ◦ , γ = 0 ◦ , ◦ , ◦ , λ = 1 . , . Testing α = − ◦ , − ◦ , · · · , ◦ |{ β = − ◦ , γ = 30 ◦ , λ = 2 . } β = − ◦ , − ◦ , · · · , ◦ |{ α = − ◦ , γ = 30 ◦ , λ = 2 . } γ = 0 ◦ , ◦ , ◦ , · · · , ◦ |{ α = − ◦ , β = − ◦ , λ = 2 . } λ = 1 . , . , . , . |{ α = 0 ◦ , β = 0 ◦ , γ = 0 ◦ } (a) β = − ◦ , γ = 30 ◦ , λ = 2 . (b) α = − ◦ , γ = 30 ◦ , λ = 2 . (c) α = − ◦ , β = − ◦ , λ =2 . (d) α = 0 ◦ , β = 0 ◦ , γ = 0 Fig. 6: The predicted maximum gripping force PDF from GP, the trainingdata PDF, and the testing dataset sandpapers that are commonly used to emulate rough surfacewith microscopic asperities [20], as shown in Fig. 5. The GPhyperparameters were optimized using the BFGS algorithm[22]. The obtained testing data with the predicted PDF of themaximum gripping force and the PDF of the training data isillustrated in Fig. 6. The predicted maximum gripping forceand the training data are displayed as a mean ± with a 95 %confidence interval. Overall, we show that the GP predictionworks well with different states.III. RISK-AWARE MOTION PLANNINGIn this section, we present a complete risk-aware motionplanning algorithm formulated as (8a)-(8k). The objective ofour proposed planner is to find the optimal trajectory forthe Center of Mass (CoM) position, its orientation, the footposition, and the reaction force for each foot in order to arriveat the destination while satisfying constraints. Our proposedplanner enables the robot to find feasible trajectories thatconsider risk from the grippers under various environments.We define one round of movement made by a robot when itsbody and all of its limbs have moved onto the next footholds.Note that for each round, the planner investigates severalcritical instants between two postures with pre-defined gaitas explained in detail in Section IV. At j -th round, Γ is thedecision variables that are given as: Γ = { p i,j , P CoM ,j , Θ CoM ,j , θ i,j , f ri,j , ˆ f g,mi,j , ˆ Σ g,mi,j } (7)where p i,j is the foot i position, P CoM ,j is the position of thebody, Θ CoM ,j is the orientation of the body, θ i,j are the jointangles for the limb i , and f ri,j is defined in Section II-A. In thisstudy, f g,mi,j is treated as a random variable based on the model minimize Γ Ψ tot (8a)s.t., for each round j = 1 , . . . , N and for each limb i = 1 , . . . , L | ∆ P CoM | ≤ ∆ P Th (linear stride) (8b) | ∆ Θ CoM | ≤ ∆ Θ Th (angular stride) (8c) | ∆ p i | ≤ ∆ p Th (foot stride) (8d) p i,j ∈ R ( P CoM ,j , Θ CoM ,j , θ i,j ) (kinematics) (8e) p i,j ∈ T (contact region) (8f) L (cid:88) i =1 f ri,j + F tot = (force eqm) (8g) L (cid:88) i =1 (cid:0) p i,j × f ri,j (cid:1) + M tot = (moment eqm) (8h) τ i,j = J ( θ i,j ) (cid:62) f ri,j (joint torque) (8i) (cid:107) τ i,j (cid:107) ≤ τ Th (torque limit) (8j) f ri,j ∈ F (cid:0) λ i,j ( p i,j ) , n i,j , f g,mi,j (cid:1) (friction cone) (8k)of GP, which follows f g,mi,j ∼ N (cid:16) ˆ f g,mi,j ( s i,j ) , ˆ Σ g,mi,j ( s i,j ) (cid:17) .Equation (8a) is the cost function that depends on the robot’sstate. Equation (8b), (8c), and (8d) bound the range oftravel between rounds. Equation (8e) represents the forwardkinematics constraints. In (8f), it ensures that p i,j is withinthe feasible terrain where the robot is able to put its limb.In this paper, we assume that the robot generates a quasi-static motion. Hence, the planner has the static equilibriumconstraints expressed by (8g) and (8h), where F tot and M tot is the external force and moment, respectively. In this work,only gravity is considered as the external force. Equation(8i) and (8j) ensure that the motor torque is lower than themaximum motor torque where J ( θ i ) is a Jacobian matrix.The reaction force f ri is constrained by (8k), which describesthe friction cone constraints to prevent the robot from slippingwhere λ i,j ( p i,j ) denotes the coefficient of friction at p i,j .Note that this constraints (8k) is also stochastic constraintsdue to f ri . Equation (8k) can be converted into deterministicconstraints, which is explained in Section III-B.Compared to sampling-based approaches such as RRT, NLPis able to formulate relatively complicated constraints suchas friction cone constraints (8k), which are typically difficultfor the sampling-based approaches to handle in terms ofcomputation. In addition, MICP approaches such as [3], [5],[9] can increase the computation speed by decoupling the posestate from wrench states. However, they potentially limit therobot’s mobility. The robot may not choose the trajectory onthe low friction terrain in case the planner first solves thepose problem and then solves the wrench problem later sincethe pose optimization problem does not consider the wrenchinformation. Although MICP can plan the trajectories consid-ering both wrench and pose state simultaneously, it needs tosacrifice the accuracy by assuming an envelope approximationon bilinear terms [3] or allow relatively expensive computationas the number of the integer variables increases, which is HIRAI et al. : RISK-AWARE MOTION PLANNING FOR A LIMBED ROBOT WITH STOCHASTIC GRIPPING FORCES USING NONLINEAR PROGRAMMING 5 intractable for high degree-of-freedom (DoF) robots (e.g., ourrobot has 24 DoF). In contrast, NLP can simultaneously solvethe trajectory reasoning both the pose and the wrench withrelatively less computation [4].
A. Deterministic Constraints
Here, we explain two deterministic constraints (8e), (8f),that are not explicitly shown in (8a)-(8k).
1) Kinematics:
Forward kinematics (8e) is given as: p i,j = R ( Θ CoM ,j ) p bi,j + P CoM ,j (9)where R ( Θ CoM ) is the rotation matrix from the world frameto the body frame, p bi is the foot position relative to the bodyframe.
2) Feasible Contact Regions:
We utilize NLP to formulatethe planning algorithm so that any nonlinear terrain (i.e., non-flat terrain), such as tube and curve, can be directly described.If a robot traverses on the flat terrain, the footstep regionsare convex polygons as follows: C r p i,j ≤ D r (10) B. Chance Constraints
Here, we show that the friction cone constraints in (8k)can be expressed using chance constraints, which allow theplanner to convert the stochastic constraints into deterministicconstraints.One key characteristic of robotic climbing is that climbingis a highly risky operation: a robot can easily fall withoutplanning its motion correctly. Hence, it needs to restrictreaction forces using the friction cone constraints given as: n (cid:62) i,j f ri,j ≥ (11) (cid:13)(cid:13) f ri,j − (cid:0) n (cid:62) i,j f ri,j (cid:1) n (cid:62) i,j (cid:13)(cid:13) ≤ λ i (cid:0) n (cid:62) i,j f ri,j (cid:1) + f g,mi,j (12)To decrease the computation of solving for the NLP solver,we simplify the (12) by linearizing them as follows: (cid:12)(cid:12)(cid:12) ζ (cid:62) i,j f ri,j (cid:12)(cid:12)(cid:12) ≤ λ i (cid:0) n (cid:62) i f ri,j (cid:1) + f g,mi,j (13) (cid:12)(cid:12)(cid:12) ξ (cid:62) i,j f ri,j (cid:12)(cid:12)(cid:12) ≤ λ i (cid:0) n (cid:62) i f ri,j (cid:1) + f g,mi,j (14)where ζ i,j , ξ i,j are any tangential direction vectors on the wallplane.Regarding (8k) formulated as (11), (13), (14), we rearrangethe equations and the joint chance constraint is given by: Pr (cid:94) j =1 ,...,N (cid:94) k =1 ,...,M α k (cid:62) i,j f g,mi,j ≤ β ki,j ≥ − ∆ (15)where α ki,j are coefficient vectors, and β ki,j are coefficientscalars that consist of the convex polytopes defined in (11),(13), (14). In (15), M denotes the number of constraints defin-ing the polytopes. ∆ is the user-defined violation probability,where the probability of violating constraints is under the ∆ .We can regard ∆ as relating to the likelihood that gripper slipwill be responsible for the failure of the robot. For example, if ∆ is high, the planner can explore a larger space because the feasible region expands in optimization. As a result, the robottends to plan a trajectory with a high violation probability byassuming that the gripper generates enough force. For a roboticclimbing task, these chance constraints enable the robot toperform challenging motions that would be infeasible withoutconsidering the gripping force. In contrast, if ∆ is small, theplanner tends to generate more conservative motions becausethe robot assumes that the gripper does not output enoughforce to support the weight of the robot.Imposing (15) is computationally intractable. Thus, usingBoole’s inequality, Blackmore [13], showed that the feasiblesolution to (15) is the feasible solution to the followingequations: Pr (cid:0) α k (cid:62) i,j f g,mi,j ≤ β ki,j (cid:1) ≥ − ∆ j,k (16) N (cid:88) j =1 M (cid:88) k =1 ∆ j,k ≤ ∆ (17)for all j = 1 , . . . , N, k = 1 , . . . , M . The violation probabil-ity for each constraint per round ∆ j,k is constrained in (17),in order not to exceed the given ∆ . Because non-uniform riskallocation (17) is also computationally expensive [14], we usethe following relation: ∆ j,k = ∆ N M (18) α k (cid:62) i,j f g,mi,j is a multivariate Gaussian distribution such that α k (cid:62) i,j f g,mi,j ∼ N (cid:16) α k (cid:62) i,j ˆ f g,mi,j , α i,j,k (cid:62) ˆ Σ g,mi,j α ki,j (cid:17) . Thus, thestochastic constraints (16) can be then converted into a de-terministic constraint as given by: Pr (cid:0) α k (cid:62) i,j f g,mi,j ≤ β ki,j (cid:1) = Φ β ki,j − α k (cid:62) i,j ˆ f g,mi,j (cid:113) α k (cid:62) i,j ˆ Σ g,mi,j α ki,j ≥ − ∆ j,k (19)where Φ is the cumulative distribution function of the standardnormal distribution. It can be transformed further as follows: α k (cid:62) i,j ˆ f g,mi,j + (cid:113) α k (cid:62) i,j ˆ Σ g,mi,j α ki,j Φ − (1 − ∆ j,k ) ≤ β ki,j (20)where Φ − is the inverse function of Φ . C. Cost Function
The cost function consists of intermediate costs and aterminal cost. In this work, the target mission is to arrive atthe destination. Thus, the terminal cost is the distance fromthe position of the last pose to the destination. Ψ D = ( q N − q D ) (cid:62) W D ( q N − q d ) (21)where W D is the weighting matrix and q N = (cid:2) p ,N , . . . , p L,N (cid:3) while q d is the configuration at thedestination. The intermediate costs restrict a large amount of IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JUNE, 2020 shifting in terms of linear and rotational motion of a bodyand the foot position as follows: Ψ BPos = ∆ P (cid:62) CoM W BPos ∆ P CoM Ψ Foot = L (cid:88) i =1 ∆ p (cid:62) i W Foot ∆ p i Ψ BRot = ∆Θ (cid:62)
CoM W BRot ∆Θ CoM (22)where W BPos , W Foot , and W BRot are the weighting matrix.
D. Two Step Optimization for a Position-Controlled Robot
Although our proposed motion planner works for anylimbed robot, there is a drawback for a position-controlledrobot when wall-climbing. For the position-controlled robot,it is necessary to compute how much δ wall is necessary togenerate the planned reaction forces. Therefore, the plannerneeds to include additional constraints from (1), (2) to realizethe planned trajectory. However, we observed that the nonlin-ear solver has a numerical issue with (2), so it is intractable forthe solver to solve our proposed NLP in (8a)-(8k) with (1), (2).To avoid this problem, we decouple the optimization probleminto two-step problems shown in (23a)-(23l) and (24a)-(24d): minimize Γ Ψ D + N − (cid:88) j =1 (Ψ BPos + Ψ
Foot + Ψ
BRot ) (23a)s.t. | P CoM ,j +1 − P CoM ,j | ≤ ∆ P Th (23b) | Θ CoM ,j +1 − Θ CoM ,j | ≤ ∆ Θ Th (23c) | p i,j +1 − p i,j | ≤ ∆ p Th (23d) p i,j = R ( Θ CoM ,j ) p bi,j + P CoM ,j (23e) C r p i,j ≤ D r (23f) L (cid:88) i =1 f ri,j + F tot = (23g) L (cid:88) i =1 (cid:0) p i,j × f ri,j (cid:1) + M tot = (23h) τ i,j = J ( θ i,j ) (cid:62) f ri,j (23i) (cid:107) τ i,j (cid:107) ≤ ∆ τ (23j) α k (cid:62) i,j ˆ f g,mi,j + (cid:113) α k (cid:62) i,j ˆ Σ g,mi,j α ki,j Φ − (1 − ∆ j,k ) ≤ β ki,j (23k) ∆ j,k = ∆ N M (23l) find δ wall ,i,j , δ CoM ,i,j (24a)s.t. (cid:107) δ wall ,i,j (cid:107) ≤ δ Th , wall ,i,j (24b) f ri,j = K i ( δ wall ,i,j − δ CoM ,i,j ) (24c) K i,j = (cid:16) J ( θ i,j ) k − J ( θ i,j ) (cid:62) (cid:17) − (24d)the first planner is in charge of the pose and the reaction forceof the robot, and the second planner finds δ wall ,i,j , δ CoM ,i,j ,which are the control inputs to a position-controlled robot.In (24a)-(24d), the constraint (24b) ensures that δ wall ,i,j isbounded under a certain threshold. TABLE II: NLP specifications for climbing on non-uniform walls N Variables Constraints Average T-solve (Ipopt)1 1744 779 0.4 minutes2 3937 1680 6 minutes4 11761 4994 16 minutes7 23479 9965 248 minutes
We argue that this decoupling is reasonable because thefirst planner solves the "essential" problem (e.g., How muchreaction force is necessary? What is the footstep trajectory?)to plan the force and pose trajectory. The second planner onlycomputes the control input to the motors, and it does not have asignificant effect on the entire motion planning. As explained,if the robot is force controlled, the planner does not need toconsider (1), (2). As a result, the second optimization is notnecessary for a force-controlled robot, and the whole motionis planned only based on the first optimization problem.IV. RESULTSIn this section, we evaluate our proposed planner by test-ing the robot’s performance in three different tasks: energy-efficient climbing, climbing on non-uniform terrains, andclimbing with a tripod gait.We utilize Ipopt solver [23] to solve the planning problemon an Intel Core i7-8750H machine. The derivative of con-straints are provided by CasAdi [24]. The optimizer is initial-ized with the default configuration of the robot (Fig. 1, bottomconfiguration), and the specifications of the computation forSection IV-B is summarized in Table II.We implement the results of our proposed planning algo-rithm (i.e., the motion plan), on a six-limbed robot, each limbof which has three DoF. Each joint uses pairs of DynamixelMX-106 motors, providing a maximum torque at 27 Nm. Therobot is equipped with a battery, computer, and IMU. Therobot runs a PID loop to regulate its body orientation. No othersensor is used to control its linear position. The robot weighs11.5 kg. The width of the robot’s body is 442 mm while itsheight at its standing state is 180 mm. In each experiment,the robot climbs between two walls at a distance of 1200mm, where the wall is covered with sandpapers of differentgrit size to adjust the coefficient of friction. All hardwaredemonstrations can be viewed in the accompanied video . A. Energy Efficient Planning
The objective of this task is to assess the consumed energyof climbing with two different violation probabilities. Whilethe robot can grip the wall with a low violation probability(e.g., ∆ = 0 . ), there is a disadvantage of consuming moreenergy. On the other hand, the robot may perform an energy-efficient motion with a higher violation probability (e.g., ∆ =0 . ). Here, we set N = 7 , M = 6 to compute ∆ j,k . To showthe trade-off between the consumed energy and the violationprobability, we let the robot climb on the walls with one leggait where the robot first lifts its right front limb, puts it onthe next position, pushes its body up, lifts its right middlelimb, and so on. Within each round, the planner investigates12 critical instants for one leg gait: 6 instants after the robot Video of hardware experiments: https://youtu.be/ZDqvf1J4nS4
HIRAI et al. : RISK-AWARE MOTION PLANNING FOR A LIMBED ROBOT WITH STOCHASTIC GRIPPING FORCES USING NONLINEAR PROGRAMMING 7 (a) Time history of the consumed power for right front limb(b) Time history of the consumed power for right middle limb
Fig. 7: Time history of the consumed power under the different violationprobabilities. The shaded regions are when the robot lifts a specific limb andputs it on the next position, and white regions are when the robot pushesits body up. The figure shows that the consumed power of a particular limbdecreases when the limb is in the air, while it increases when the limb is onthe wall to generate the normal force on the wall. -3 -2 -1 violation probability c on s u m edene r g y [ J ] Fig. 8: Consumed energy with different ∆ during t = 0 − s lifts one limb, and 6 instants after the robot places the limb onthe next position and pushes its body. The planner solves theoptimization problem for these 12 instants. We measure thecurrent I i,t and the voltage V i,t of each limb i online whenthe robot climbs on the wall covered by the 36-grit sandpaperswith one leg gait and estimated the power per one limb at everysampling time t . The power P i,t is estimated as follows: P i,t = V i,t × I i,t (25)We plot the consumed power for two consecutive limbsfrom the hardware experiment in Fig. 7. Fig. 7 shows thatthe consumed power of a limb decreases when the limb is inthe air while the other limbs increase the consumed power toincrease the reaction force. Furthermore, the robot consumesmore power with smaller ∆ , which means that the robot needsto push the wall to increase f r . In contrast, if ∆ = 0 . ,the solution requires less power, but has a larger probabilityof slipping. In Fig. 8, the total consumed energy from theselimbs was calculated by integrating their power over time spentclimbing. In our robot, the robot could decrease the energyby 46.5 % under ∆ = 0 . compared with the energy under ∆ = 0 . . Fig. 9: Side view of planned footsteps on non-uniform walls under: left ∆ =0 . , right ∆ = 0 . . In the left panel, the robot puts its feet on low andhigh friction terrain by taking a high risk bound. In the right panel, the robotputs its feet only on high friction terrain. B. Climbing on Non-uniform Walls
This scenario demonstrates that the robot designs differenttrajectories under the different violation probability to climbon walls with varying coefficients of friction. The plannedtrajectories are shown in Fig. 9. In this example, the robotclimbs between two walls where the terrain shown in blackis covered by 36-grit sandpapers ( λ = 2 . ), the terrain shownin green is covered by 80-grit sandpapers ( λ = 1 . ), and theterrain shown in red is covered by the material with λ = 0 as shown in Fig. 9. The varying coefficients of friction aremodeled by a parabola function, which encourages the solverto converge on a solution. In the left panel of Fig. 9, theviolation probability ∆ is 0.1 while in the right panel, theviolation probability ∆ is 0.001 for M = 6 and N = 7 .The left panel of Fig. 9 illustrates that the robot avoids thered area (zero friction) and puts its foot mostly in the blackarea (high friction), but sometimes also in the green area (lowfriction) to minimize the trajectory length. In the right panelof Fig. 9, the violation probability is decreased, and the robotfootsteps completely remain inside the high friction area. As aresult, our proposed NLP-based planner operates the pose andforces together and makes a trade-off between a shorter butmore risky trajectory and a longer but safer trajectory. Thiscannot be achieved if the planner decouples the footstep andforce planning, such as in [9]. Fig. 10 shows the trajectory withhigher risk bound ∆ = 0 . and compares the foot location at t = 146 with the foot location with ∆ = 0 . in the hardwareexperiment. We notice that at t = 146 s, the foot touches thewhite area where the coefficient of friction is 0, which neverhappened with ∆ = 0 . . Since the robot only controls itsbody orientation based on IMU feedback and does not controlits linear position, the implemented trajectory does not strictlyfollow the planned one. We observe that lower risk bound isbeneficial in this situation to avoid failure since it compensatesfor the tracking error by the imperfect controller. C. Climbing with Less Stable Gait: Tripod Gait
In this scenario, we demonstrate that the robot can conducta tripod gait, when it lifts three legs simultaneously, bysetting the violation probability much higher. Before installingthe gripper on the current six-limbed robot, it was almostinfeasible to climb on the walls with the tripod gait becauseof the torque limits of the motors. With the grippers installed,
IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JUNE, 2020 (cid:1)(cid:2)(cid:3)(cid:4)(cid:5)(cid:6)(cid:7) (cid:1)(cid:2)(cid:8)(cid:9)(cid:4)(cid:5)(cid:6)(cid:7)(cid:1)(cid:2)(cid:10)(cid:10)(cid:4)(cid:5)(cid:6)(cid:7) (cid:1)(cid:2)(cid:11)(cid:9)(cid:8)(cid:4)(cid:5)(cid:6)(cid:7)
Δ (cid:2) 0.001Δ (cid:2) 0.1 (cid:1)(cid:2)(cid:11)(cid:9)(cid:8)(cid:4)(cid:5)(cid:6)(cid:7)(cid:1)(cid:2)(cid:11)(cid:9)(cid:8)(cid:4)(cid:5)(cid:6)(cid:7)
Fig. 10: Snapshots of climbing experiments on non-uniform walls under thedifferent violation probabilitiesFig. 11: Climbing with tripod gait. Left: A planned trajectory of the tripod gaitunder ∆ = 0 . . Red arrows indicate the reaction forces from the walls. Right:A snapshot of climbing experiments with the tripod gait under ∆ = 0 . . however, the robot has a greater chance to climb on the wallswith a tripod gait. If we set ∆ = 0 , the problem is infeasiblesince the constraints under the worst-case uncertainty areconservative. This result would be equivalent to the results ofother robust algorithm such as [12], where the optimization-based robust approach with the worst-case uncertainty isproposed. However, by utilizing the chance constraints andincreasing the violation probability, the planner generates afeasible solution. In our trial, we set the violation probability ∆ = 0 . for M = 6 and N = 3 , and allowed the robot toclimb on a wall covered by 36-grit sandpapers. The plannerinvestigates 4 critical instants: 2 instants after the robot liftsthree limbs, and 2 instants after the robot places them downand pushes its body. The planned trajectory is illustrated inthe left panel of Fig. 11. As shown in the right under thecondition, the robot succeeded in climbing on the walls withthe tripod gait and its climbing velocity was 2.5 cm/s, whichis three times faster than the one leg gait.V. CONCLUSION AND
FUTURE WORKSIn this paper, we presented a motion planning algorithm forlimbed robots with stochastic gripping forces. Our proposedplanner exploits NLP to simultaneously plan a pose and forcewith guaranteed bounded risk. Maximum gripping forces aremodeled as a Gaussian distribution by employing the GP, which provides the planner with the mean and the covarianceinformation to formulate the chance constraints. We showedthat under our planning framework, the robot demonstratesrich - sometimes drastically different - behaviors, includingplanning a risky but energy-efficient motion versus a safe butexhausting motion, avoiding danger zones like low frictionenvironments, and choosing fast but less stable motions (i.e.,a tripod gait) based on the different violation probabilities ∆ in hardware experiments.The current limitation in this work is that the actualprobability of failure is not strictly equal to pre-defined ∆ because other sources of uncertainty exist, such as sensornoises. In future work, we will extend our planner to takeinto consideration these sources.R EFERENCES[1] T. Bretl, S. Lall, J. C. Latombe, and S. Rock, “Multi-step motion plan-ning for free-climbing robots,” in
Algorithmic Foundations of RoboticsVI , Springer, pp. 59–74, 2004.[2] A. Shkolnik, M. Levashov, I. R. Manchester, and R. Tedrake, “Boundingon rough terrain with the LittleDog robot,”
Int. J. Rob. Res. , vol. 30,issue 2, pp. 192–215, 2011.[3] A. K. Valenzuela, “Mixed-integer convex optimization for planningaggressive motions of legged robots over rough terrain,”
PhD Thesis ,Massachusetts Institute of Technology, 2016.[4] A. W. Winkler, C. D. Bellicoso, M. Hutter, and J. Buchli, “Gait andtrajectory optimization for legged systems through phase-based end-effector parameterization,”
IEEE Robot. Autom. Lett. , vol. 3, no. 3, pp.1560–1567, 2018.[5] B. Aceituno-Cabezas et al., “Simultaneous contact, gait, and motionplanning for robust multilegged locomotion via mixed-integer convexoptimization,”
IEEE Robot. Autom. Lett. , vol. 3, no 3, pp. 2531–2538,2018.[6] A. W. Winkler et al., “Fast trajectory optimization for legged robotsusing vertex-based ZMP constraints,”
IEEE Robot. Autom. Lett. , vol. 2,no 4, pp. 2201-2208, 2017.[7] A. D. Prete and N. Mansard, “Robustness to joint-torque-tracking errorsin task-space inverse dynamics,”
IEEE Trans. Robot. , vol. 32, no 5, pp.1091-1105, 2016.[8] M. Brandão et al., “Material recognition CNNs and hierarchical planningfor biped robot locomotion on slippery terrain,” in
Proc. 2016 IEEE-RASConf. Humanoid Robots , pp. 81–88, 2016.[9] X. Lin, J. Zhang, J. Shen, G. Fernandez, and D. W. Hong, "Optimizationbased motion planning for multi-limbed vertical climbing robots," in
Proc. 2019 IEEE/RSJ Conf. Intell. Robots Syst. , pp. 1918-1925, 2019.[10] S. Wang, H. Jiang, and M. R Cutkosky, “Design and modeling oflinearly-constrained compliant spines for human-scale locomotion onrocky surfaces,”
Int. J. Rob. Res. , vol. 36, issue 9, pp. 985-999, 2017.[11] H. Jiang, S. Wang, and M. R Cutkosky, "Stochastic models of compliantspine arrays for rough surface grasping,“
Int. J. Rob. Res. , vol. 37, no.7, pp.669-687, 2018.[12] O. S. Tas and C. Stiller, “Limited visibility and uncertainty aware motionplanning for automated driving,” in
Proc. 2018 IEEE Intelligent VehiclesSymposium , pp. 1171–1178, 2018.[13] L. Blackmore and M. Ono, ”Convex chance constrained predictivecontrol without sampling,” in
Proc. AIAA Guid. Navigat. Control Conf. ,2009.[14] M. Ono and B. C. Williams, “An efficient motion planning algorithm forstochastic dynamic systems with constraints on probability of failure,”in
Proc. 2008 AAAI Conf. Artif. Intell. , pp. 3427–3432, 2008.[15] M. Ono, M. Pavone, Y. Kuwata, and J. B. Balaram, “Chance-constraineddynamic programming with application to risk-aware robotic spaceexploration,”
Autom. Robots , vol. 39, no. 4, pp. 555–571, 2015.[16] X. Lin, H. Krishnan, Y. Su, and D. W. Hong, “Multi-limbed robotvertical two wall climbing based on static indeterminacy modeling andfeasibility region analysis,” in
Proc. 2018 IEEE/RSJ Conf. Intell. RobotsSyst. , pp. 4355-4362, 2018.[17] A. Pashkevich, A. Klimchik, and D. Chablat, “Enhanced stiffnessmodeling of manipulators with passive joints, Mechanism and machinetheory,”
Mech. Mach. Theory , vol. 46, pp. 662-679, 2011.
HIRAI et al. : RISK-AWARE MOTION PLANNING FOR A LIMBED ROBOT WITH STOCHASTIC GRIPPING FORCES USING NONLINEAR PROGRAMMING 9 [18] C. Rasmussen and C. Williams, “Gaussian processes for machinelearning,”
MIT press , vol. 1, 2006.[19] K. Nagaoka et al., "Passive spine gripper for free-climbing robot inextreme terrain,"
IEEE Robot. Autom. Lett. , vol. 3, no 3, pp. 1765-1770, 2018.[20] A. Asbeck et al., "Scaling hard vertical surfaces with compliant mi-crospine arrays,"
Int. J. Rob. Res. , vol. 25, no. 12, pp. 1165-1179, 2006.[21] K. Autumn et al., "Frictional adhesion: a new angle on gecko attach-ment,"
J. Exp. Biol. , vol. 209, issue 18, pp. 3569-3579, 2006.[22] R. Fletcher, “Practical methods of optimization,”
NY: John Wiley Sons ,vol. 2, 1981.[23] A. Waechter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,”
Mathematical Programming , vol. 106, no. 1, pp. 25–57, 2006.[24] J. A. E. Andersson et al., “CasADi – A software framework for nonlinearoptimization and optimal control,”