Walking with Confidence: Safety Regulation for Full Order Biped Models
IIEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JULY, 2019 1
Walking with Confidence:Safety Regulation for Full Order Biped Models
Nils Smit-Anseeuw , C. David Remy , and Ram Vasudevan Abstract —Safety guarantees are valuable in the control ofwalking robots, as falling can be both dangerous and costly.Unfortunately, set-based tools for generating safety guarantees(such as sums-of-squares optimization) are typically restricted tosimplified, low-dimensional models of walking robots. For morecomplex models, methods based on hybrid zero dynamics canensure the local stability of a pre-specified limit cycle, but providelimited guarantees. This paper combines the benefits of bothapproaches by using sums-of-squares optimization on a hybridzero dynamics manifold to generate a guaranteed safe set fora 10-dimensional walking robot model. Along with this set, thispaper describes how to generate a controller that maintains safetyby modifying the manifold parameters when on the edge of thesafe set. The proposed approach, which is applied to a bipedalRabbit model, provides a roadmap for applying sums-of-squarestechniques to high dimensional systems. This opens the door fora broad set of tools that can generate flexible and safe controllersfor complex walking robot models.
Index Terms —Legged Robots; Robot Safety; UnderactuatedRobots
I. I
NTRODUCTION A VOIDING falls is a safety critical and challenging taskfor legged robotic systems. This challenge is com-pounded by strong limits on the available actuation torques;particularly at the ankle or ground contact point. These limitsin actuation mean that the motion of a legged robot is oftendominated by its mechanical dynamics, which are hybrid,nonlinear, and unstable. A consequence of these limitations isthat a controller might be required to take a safety preservingaction well before the moment a failure occurs.Consider, for example, a bipedal robot that just enteredsingle stance during a fast walking gait. The robot is pivotingdynamically over the stance foot and can only apply limitedankle torques to control its motion. To catch the robot again,the swing foot needs to be brought forward rapidly and beplaced well in front of the robot. If the forward velocity of therobot and hence the pivoting motion is too fast, there will not
Manuscript received: February, 24, 2019; Revised June, 4, 2019; AcceptedJuly, 8, 2019.This paper was recommended for publication by Editor Nikos Tsagarakisupon evaluation of the Associate Editor and Reviewers’ comments. This workwas supported by the National Science Foundation under Grant No. 1562612.Any opinion, findings, and conclusions or recommendations expressed in thismaterial are those of the authors and do not necessarily reflect the views ofthe National Science Foundation. Nils Smit-Anseeuw and Ram Vasudevan are with the Departmentof Mechanical Engineering at the University of Michigan, Ann Arbor [email protected], [email protected] C. David Remy is with the Institute for Nonlinear Mechanicas at theUniversity of Stuttgart [email protected]
Digital Object Identifier (DOI): see top of this page. -q q -q -q q Z S STQ ΔΔ -α-θ ZZ F Z F V ^ ^^ Fig. 1. Generating safety guarantees for a high dimensional robot (illustratedon Rabbit [2]). The state-space of the full robot is given in the top right figure,where
T Q is the tangent space on Q , S is the hybrid guard representing foottouchdown, and ∆ is the corresponding discrete reset map. Using feedbacklinearization, we restrict our states to lie on a low-dimensional manifold Z ,reducing the state-space dimension to an amenable size for sums-of-squaresanalysis. This manifold is parameterized by the underactuated degrees offreedom of the robot θ , as well as a set of shaping parameters α . The shapingparameters can be modified in real-time by a control input, allowing for abroad range of behaviours on Z . To guarantee safety on Z we find the set ofunsafe states Z F from which the state may leave the manifold (for instancedue to motor torque limits). We then use sums-of-squares tools [3] to finda control invariant set ˆ V ⊂ Z \ Z F . This control invariant set can be usedto define a semi-autonomous, guaranteed safe controller for the full robotdynamics. be enough time to complete this foot placement far enough infront of the stance leg to slow the robot down [1]. As a result,the robot’s speed increases further, leaving even less time forleg swing in the subsequent steps. The robot might manage tocomplete another couple of strides, but at this point a fall isinevitable and no control action can prevent it.Knowing the limits of safe operation is akin to knowing theset of states from which falls, even in the distant future, canbe avoided. Such knowledge is valuable for many reasons.Knowing that a fall is inevitable is useful in itself, as itallows a robot to brace for the imminent impact. Knowingthe distance from the border of the safe set could allow arobot to estimate the set of impulses that can be withstoodwithout failing. This would allow it to judge whether or not itcan safely interact with the environment in a given situation;for example, to push a cart while walking. Most importantly,this knowledge is valuable due to the flexibility it can create.Rather than stabilizing the robot motion along a specifiedtrajectory, one could imagine controllers that are adaptive toadjust to the environment, to maximize performance, or to a r X i v : . [ c s . R O ] J u l IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JULY, 2019 fulfill a secondary task such as pointing a sensor onto a target.Any of these secondary tasks can be pursued as long as thestate of the robot is within the safe set.In this context, a representation of the set of safe statesenables the construction of a regulator that monitors the systemstate and takes safety preserving actions only when the robotis at risk of failure [4]. Such a regulator could guaranteesafe operation, while allowing a secondary control system tobehave flexibly as long as safety is not threatened.Identifying such safety limits, however, is a challengingproblem for nonlinear and hybrid systems. A promising toolfor identifying the safety limits of a legged robotic system issums-of-squares (SoS) optimization [3]. This approach usessemi-definite programming to identify the limits of safety inthe state space of a system as well as associated controllers fora broad class of nonlinear [5], [6], [7] and hybrid systems [8],[9]. These safe sets can take the form of reachable sets (setsthat can reach a known safe state) [10], [9], [5] or invariantsets (sets whose members can be controlled to remain in theset indefinitely) in state space [11], [8], [12]. However, therepresentation of each of these sets in state space severelyrestricts the size of the problem that can be tackled by theseapproaches. To accommodate this limitation, sums-of-squaresanalysis has been primarily applied to reduced models of walk-ing robots: ranging from spring mass models [13], to invertedpendulum models [10], [14] and to inverted pendulum modelswith an offset torso mass [12]. The substantial differencesbetween these simple models and real robots causes difficultywhen applying these results to hardware.A contrasting approach to designing stable controllers forhigh dimensional, underactuated robot models uses hybrid zerodynamics (HZD) [15]. In this approach, feedback linearizationis used to drive the actuated degrees of freedom of therobot towards a lower dimensional hybrid zero dynamicsmanifold. This manifold is specified as the zero levelset ofa configuration-dependent output vector and represents themotion of the robot in its underactuated degrees of freedom.Significant progress has been made in the generation ofsafety certificates for HZD controllers. Much of this work[16], [17], [18], [19], [20] relies on the Poincar´e stability ofa periodic limit cycle in order to generate safety guarantees.This reliance is restrictive, as it precludes behaviors that wouldleave the neighborhood of the limit cycle. Recent work hasbeen done to extend the range of safe HZD behaviours beyonda single limit cycle neighborhood [21], [22], [23]. In [21] and[22], the controller is allowed to discretely switch betweena family of periodic gaits. Safety is then ensured using adwell time constraint that limits how frequently switchingcan occur. In [23], a combination of HZD and finite stateabstraction is used to safely regulate forward speed of a fully-actuated bipedal robot in continuous time. Our approachshares similarities with these recent papers, but allows forcontinuous-time variation of behaviour (instead of discreteswitching), and applies to underactuated robotic systems.In this paper, we build on both of these broad approachesto safety and control synthesis for legged robotic systems. Tocombine the full-model accuracy of hybrid zero dynamics andthe set-based safety guarantees of sums-of-squares program- ming, we propose the following approach (Fig. 1). First, weuse hybrid zero dynamics to map the full order dynamics toa low dimensional hybrid manifold. We control the dynamicson the manifold using a set of shaping parameters , whichare modified in continuous time to modify robot behaviour.We then use sums-of-squares programming to find a subset ofthis manifold which can be rendered forward control invariant.Once this subset is found on the low dimensional manifold, aregulator can be constructed that allows for free control of themanifold dynamics when safety is not at risk, but switches toa safety preserving controller when safety is threatened.The approach is presented in a general form that extendsto a large class of underactuated bipedal robots. Throughoutthe paper, an example implementation is given for a 10-dimensional model of the robot Rabbit [2] and a trackingtask is used to illustrate semi-autonomous safe control. To thebest of our knowledge, this is the highest dimensional walkingrobot model for which set-based safety guarantees have beengenerated thus far.The rest of this paper is organized as follows: Section II for-mally defines the assumptions and objective of this paper. Thenext two sections describe our method. Section III constructs alow dimensional zero dynamics manifold with control input. InSection IV we present a sums-of-squares optimization whichfinds a control invariant subset of the manifold that avoids adesignated set of unsafe states. Section V describes the resultsof our implementation on the robot Rabbit [2], and conclusionsare presented in Section VI.II. P
ROBLEM S ETUP
A. Robot Model
For simplicity, we apply similar modeling assumptions tothose made in [15]. That is, the robot is modeled as a planarchain of rigid links with mass. Each joint is directly torqueactuated except for the point of contact with the ground,leading to one degree of underactuation for a planar model.The full configuration of the robot is given by the set of jointangles q = { q , . . . , q n q } ∈ Q ⊂ R n q . We next define the setof feasible configurations ˜ Q ⊂ Q (similarly to [11]): Definition 1.
A configuration is feasible if the joint anglessatisfy actuator limits, and only foot points are touching theground (i.e. the robot has not fallen over).
Using the method of Lagrange, we can obtain a continuousdynamic model of the robot during swing phase: ˙ x ( t ) = f ( x ( t )) + g ( x ( t )) u ( t ) . (1)where x ( t ) = [ q (cid:62) ( t ) , ˙ q (cid:62) ( t )] (cid:62) ∈ T Q ⊂ R n q denotes thetangent space of Q , u ( t ) ∈ U , U describes the permitted inputsto the system, and t denotes time.We assume that an instantaneous and impulsive impactoccurs each time the swing foot hits the ground, with thestance leg leaving the ground immediately after impact. As in[15], we can construct a reset map for the state after impact: x ( t + ) := ∆( x ( t − )) (2) = (cid:20) ∆ q q ( t − )∆ ˙ q ( q ( t − )) ˙ q ( t − ) (cid:21) . (3) MIT-ANSEEUW et al. : WALKING WITH CONFIDENCE 3
Here the superscript plus indicates the time just after the eventand the superscript minus indicates the time just before theevent. ∆ :
T Q → T Q is the reset map of the robot state. ∆ q ∈ R n q × n q is a coordinate transformation matrix that swaps theswing leg and the stance leg after impact. ∆ ˙ q : Q → R n q × n q ,is the configuration-dependent reset map of the configurationvelocities.This equation holds true for all states in S ⊂ T Q , whichis called the guard of the hybrid system, and represents thestates of the robot with zero swing foot height and downwardsswing foot velocity. Any time the state of the robot enters S ,the reset event must occur. Example 1.
The configuration q for Rabbit is shown inFigure 1 (top left). ˜ Q is the set of robot configurations inwhich only foot points intersect the ground and all joints arewithin the limits: q , q , q ∈ [ − π/ , π/ , q , q ∈ [ − π/ , . When the swing foot intersects with the ground, we enterthe guard S . This causes an impulse to be transmitted tothe colliding foot, and the swing and stance feet swap. Theimpulse and coordinate swap are given by ∆ . The jointtorques torques are saturated to take values in the interval U = [ −
30 Nm ,
30 Nm] . All kinematic and inertial propertiesof the model are given in [2].B. Safety In this paper, safety is defined as keeping the configuration feasible for all time (i.e. q ( t ) ∈ ˜ Q, ∀ t ). To guarantee safety,this paper finds a viability domain [11]: Definition 2. A viability domain V ⊂ R n q is any setsatisfying V ⊂ T ˜ Q which is also forward control invariant.That is, there exists a Lipschitz state feedback controller u s : T ˜ Q → U , such that for every initial condition x ∈ V , theexecution of the system from the initial condition remains in V for all time t ∈ [0 , ∞ ) . We refer to any feedback controller thatis able to ensure that the system is forward control invariantas an Autonomous Viable Controller . The forward control invariance property ensures that any statethat begins within a viability domain V can be controlled toremain within the domain. Since V contains only feasibleconfigurations ( V ⊂ T ˜ Q ), we know that safety can bemaintained by at least one controller from all states in V .Once a viability domain is found, we use it to construct asemi-autonomous, safety preserving controller. Given an initialstate within V , a user defined control input is applied withoutmodification to the system. The state of the system is thencontinuously monitored. If the state approaches the boundaryof the viability domain, the control input is overridden by anautonomous viable controller. This gives the user full controlover the system until safety is threatened, at which point,safety is automatically enforced. Once safety is no longer atrisk, control is returned to the user. C. Goal
Using these definitions, we state our objective as:1) Find a viability domain and a corresponding autonomousviable controller. 2) Use this domain and autonomous viable controller toconstruct a semi-autonomous viable controller.III. C
ONTROLLED H YBRID Z ERO D YNAMICS M ANIFOLD
We intend to use sums-of-squares optimization to achievethese objectives. However, the state-space dimension of re-alistic robot models far exceeds the limits of this tool. Forinstance, the state-space of the benchmark model Rabbit [2]has dimension 10, while many sums-of-squares problemsbecome computationally challenging above dimension 6 [12].In this section, we show how the the state-space dimensioncan be reduced to a feasible size using the idea of hybrid zerodynamics [15].
A. Shaping Parameters
The hybrid zero dynamics approach uses feedback lin-earization to drive the actuated degrees of freedom onto alow-dimensional manifold specified by a set of user-chosenoutputs, which depend on the robot configuration q ∈ Q . Wemodify this approach by making these outputs also depend ona set of time varying shaping parameters α ( t ) ∈ A ⊂ R n α . Theshaping parameters α are used in this paper to provide an inputwithin the manifold dynamics. By varying α continuouslyover time, the user can change the hybrid zero dynamicsmanifold to modify the robot behaviour in real-time. The ideaof modifying the HZD manifold in real-time is similar to [23]in which the desired hip velocity v acts as an input to themanifold.We define the dynamics of α as: ˙ x α ( t ) = f α ( x α ( t )) + g α ( x α ( t )) u α ( t ) , (4)where x α ( t ) = [ α (cid:62) ( t ) , ˙ α (cid:62) ( t )] (cid:62) ∈ T A , u α ( t ) ∈ U α ⊂ R n α are the shaping parameter inputs (with permitted values U α ),and t denotes time. We require that α has vector relative degreetwo under these dynamics. We assume a trivial discrete updatefor the shaping parameters when the robot state hits a guard: x α ( t + ) = x α ( t − ) . Example 2.
As shown in the bottom left of Figure 1, we usea single shaping parameter α ( t ) ∈ [ − π/ , π/ to modifythe desired pitch angle of Rabbit. Note that this choice issomewhat arbitrary; α could instead modify properties suchas step length or center of mass height. We define the dynamicsof α as follows: ddt (cid:20) α ( t )˙ α ( t ) (cid:21) = (cid:20) ˙ α ( t )0 (cid:21) + (cid:20) (cid:21) u α ( t ) , (5) where u α represents the user-controlled pitch acceleration.B. Constructing the Manifold In this subsection, we incorporate these shaping parametersin the construction of the hybrid zero dynamics manifolddescribed in [15]. Throughout the section, we use L xf and L xg to represent the Lie derivatives in T Q with respect to f and g , and L x α f α and L x α g α to represent the Lie derivatives in T A with respect to f α and g α (where we drop the arguments). IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JULY, 2019
We begin by using a set of outputs: h : Q × A → R n u toimplicitly define the hybrid zero dynamics manifold as: Z := { ( q, ˙ q, α, ˙ α ) ∈ T Q × T A | h ( q, α ) = 0 , ( L xf h )( q, α, ˙ q ) + ( L x α f α h )( q, α, ˙ α ) = 0 } (6)These outputs must satisfy hypotheses similar to HH 1-4in [15], and the resulting manifold Z must satisfy the hybridinvariance condition : (cid:20) ∆( x ( t − )) x α ( t − ) (cid:21) ∈ Z ∀ (cid:20) x ( t − ) x α ( t − ) (cid:21) ∈ Z ∩ ( S × T A ) . (7)Provided these conditions are met, we can use the results in[24, Chapter 9.3.2] to show that Z is a smooth submanifoldin T Q × T A of dimension n z = 2( n q − n u + n α ) . In addition,the control input u ∗ : T Q × T A × U α → U given by: u ∗ ( x, x α , u α ) = − ( L xg ( L xf h + L x α f α h )) − (cid:16) L xf ( L xf h + L x α f α h )++ L x α g α ( L xf h + L x α f α h ) u α + L x α f α ( L xf h + L x α f α h ) (cid:17) (8)renders Z invariant under the hybrid dynamics of the robot(note the right hand side arguments are suppressed to simplifypresentation).As in hypothesis HH 3 in [15], we define a set of phas-ing coordinates θ : Q → R n q − n u which represent theunderactuated degrees of freedom of the robot. Using thesecoordinates, we can parameterize the on-manifold state of therobot ˆ x ( t ) ∈ Z as: ˆ x ( t ) = [ θ ( q ) (cid:62) , ˙ θ ( q, ˙ q ) (cid:62) , α (cid:62) , ˙ α (cid:62) ] (cid:62) (wherewe have suppressed the time dependence on the right handside). The continuous dynamics under this parameterizationare then: ˙ˆ x = ˙ θ L xf L xf θ + L xg L xf θu ∗ f α + g α u α = ˆ f (ˆ x ) + ˆ g (ˆ x ) u α , (9)where we have suppressed the time dependence. The discretemanifold dynamics are given by: ˆ x ( t + ) = ˆ∆(ˆ x ( t − )) , ∀ ˆ x ( t − ) ∈ ˆ S, (10)where t − is the state before impact, and the manifold guardand reset ( ˆ S and ˆ∆ ) are defined as: ˆ S = Z ∩ ( S × A ) (11) ˆ∆(ˆ x ( t − )) = θ (∆ q ( q ( t − ))) ∂θ∂q (∆ q ( q ( t − )))∆ ˙ q ( q ( t − )) ˙ q ( t − ) x α ( t − ) . (12) Example 3.
We begin by using the trajectory optimizationtoolbox FROST [25] to find a time-varying, periodic walkingtrajectory: q F r : [0 , t max ] → Q . For this trajectory, the stanceleg angle of the robot: θ ( q ) = − q − q − q is monotonic intime and varies from θ min to θ max . This allows us to define aphasing function t θ : [ θ min , θ max ] → [0 , t max ] which satisfies q F r ( t θ ( θ ( q F r ( t )))) = q F r ( t ) (i.e. t θ maps from points in thestate space to points along the trajectory). We modify the pitch angle of the FROST trajectory usingthe shaping parameter α , giving us the output function: h ( q, α ) = q − q F r ( t θ ( θ ( q ))) − αq − q F r ( t θ ( θ ( q ))) q − q F r ( t θ ( θ ( q ))) + αq − q F r ( t θ ( θ ( q ))) + h m ( θ ( q ) , α ) . (13) Here we also added the function h m : Q × A → R which ischosen to ensure satisfaction of the hybrid invariance condition(7) . This technique for ensuring hybrid invariance is similarto the procedure given in [26].The guard of our HZD manifold Z is given as ˆ S = { ˆ x | θ = θ max , ˙ θ > } and the reset is defined as in (12) .C. Safety on the Manifold We now revisit the safety criteria from Section II-B underthe assumption that our state is controlled to lie on Z . For thebiped to be safe, we require that the manifold state remainsin the feasible set ˜ Q , and that the state does not leave themanifold (either by leaving the manifold boundary, or byencountering actuator limits when trying to stay on Z ). Wedefine the unsafe states Z F ⊂ Z as the union of: • The infeasible states: (( T Q \ T ˜ Q ) × T A ) ∩ Z • The states that leave the manifold boundary , i.e. allmembers of the boundary set ( ∂Z = { ˆ x ∈ Z | q (ˆ x ) ∈ ∂Q or α ∈ ∂A } ) which do not lie on a guard, and thathave an outward velocity. • The states requiring unattainable actuation to remain on Z , i.e. all states ( x, x α ) ∈ Z for which u ∗ ( x, x α , u α ) / ∈ U, ∀ u α ∈ U α .Additionally we define the state-dependent set of realizableshaping parameter inputs ˆ U : T Q × T A → U α , as ˆ U ( x, x α ) = { u α ∈ U α | u ∗ ( x, x α , u α ) ∈ U } (where U α denotes the set ofall subsets of U α ).Provided we constrain the manifold state to avoid Z F , andconstrain the shaping parameter input to lie within ˆ U , oursafety criteria is maintained.Our goal from Section II-C can now be re-stated as:1) Find a viability domain on Z that does not intersect Z F ,and an autonomous viable controller ˆ u s : Z → ˆ U .2) Use this domain and autonomous viable controller toconstruct a semi-autonomous controller. Example 4.
For the Rabbit example, the set of states thatleave the manifold boundary are given by Z LMB = { ˆ x | α = π/ , ˙ α > }∪{ ˆ x | α = − π/ , ˙ α < } . All other states on themanifold boundary either lie on a guard ( θ = θ max , ˙ θ > ),or flow inwards. We use sampling and fitting to find a region Z Lim ⊂ Z where the actuator torque limits can be satisfiedfor some u α . We then define our unsafe set (see Fig. 2): Z F = ((( T Q \ T ˜ Q ) × T A ) ∩ Z ) ∪ Z LMB ∪ ( Z \ Z Lim ) . (14) The set of attainable inputs ˆ U is given by the minimum andmaximum values of u α at each sample point ( x, x α ) ∈ Z thatsatisfy u ∗ ( x, x α , u α ) ∈ U . MIT-ANSEEUW et al. : WALKING WITH CONFIDENCE 5
IV. H
YBRID C ONTROL I NVARIANT S ET This section outlines how the low dimensional safety prob-lem from Section III-C can be solved using sums-of-squaresoptimization [3]. Broadly, the sums -of-squares approach en-forces constraints of the form p ≥ (where p is a function)by constraining p to be a sum-of-squares polynomial, i.e. p = (cid:80) i p i (where p i are polynomials). We refer to thisconstraint as p ∈ SoS .We begin by showing how the sets and dynamics fromthe preceding section can be represented using polynomials.We next define a bilinear semi-definite program for findinga viability domain, and describe the alternation used to solveit. Finally, we construct a guaranteed safe semi-autonomouscontroller for the full robot, based on this viability domain.
A. Polynomial Representation
For the dynamics of the system to be used inside our sums-of-squares program, they must be represented in a polynomialform. In particular, we require polynomial representations ofthe functions ˆ f, ˆ g, ˆ∆ and the sets ˆ S, Z F , ˆ U . Since these setsand functions can contain trigonometric as well as rationalterms in their definition, we rely on approximate representa-tions. It is important to take care to ensure that the safetyguarantee is preserved under approximation.To generate polynomial approximations and verify boundingrelations, we use sampling to obtain the exact function valuesover a dense grid in the state space. This sampling approach ismade tractable by the reduction in dimension of the previoussection. In our example, this reduces the dimension that mustbe sampled from 10 to 4. We use a × × × samplegrid to fit and bound the polynomials. The bounds are thenverified using a dense set of randomly generated test points.We begin by sampling ˆ f : Z → R n z and ˆ g : Z → R n z × n α over our grid of points in Z . Least-squares fitting can then beused to obtain the corresponding polynomial representations: ˆ f p and ˆ g p . To account for the approximation error in thecontinuous dynamics functions, we introduce a set of error-bounding polynomials ˆ e p : Z → R n z which satisfy: ˆ e p (ˆ x ) ≥ (cid:12)(cid:12)(cid:12) ˆ f (ˆ x ) − ˆ f p (ˆ x ) + (ˆ g (ˆ x ) − ˆ g p (ˆ x )) ˆ u (cid:12)(cid:12)(cid:12) , (15)for all ˆ x ∈ Z and ˆ u ∈ ˆ U where the inequality and absolutevalue are taken element-wise. These polynomials can be foundusing a linear program that minimizes the integral of ˆ e p subjectto (15) enforced at our set of sample points.To represent sets in polynomial form, we require them totake the form of semi-algebraic sets (i.e. a set X ⊂ Y isdefined as X = { y ∈ Y | h i ( y ) ≥ , ∀ i = 1 , . . . n } , where h : Y → R n is a collection of polynomials). We use a boundingset to approximate the reset map ˆ∆ : ˆ S → Z in a conservativemanner. That is, we find a set R p ⊂ Z × Z that bounds allpossible reset behaviours: (ˆ x, ˆ∆(ˆ x )) ∈ R p , ∀ ˆ x ∈ ˆ S. (16)The sets ˆ S and Z F are represented with semi-algebraic outerapproximations as follows: Z Fp ⊃ Z F , ˆ S p ⊃ ˆ S . We definethe sets R p , Z Fp , ˆ S p using the respective polynomials: h R : Z × Z → R n hr , h F : Z → R n hf , h S : Z → R n hs . Thespace of feasible inputs ˆ U can be approximated using a state-dependent box constraint: ˆ u min (ˆ x ) ≤ ˆ u (ˆ x ) ≤ ˆ u max (ˆ x ) , ∀ ˆ x ∈ Z \ Z Fp (17)where ˆ u min , ˆ u max : Z \ Z Fp → U α are polynomial inputbounds, and the inequality is taken element-wise. The set ofinputs that satisfy this box constraint is denoted by ˆ U p . B. Optimization Formulation
We use an optimization similar to [12] to find the largestpossible viability domain ˆ V ⊂ Z \ Z F for our hybrid zerodynamics system. We represent ˆ V as the zero super-levelsetof a polynomial function ˆ v : Z → R (i.e. ˆ V = { ˆ x ∈ Z | ˆ v (ˆ x ) ≥ } ), and represent the autonomous viable controllerusing a polynomial function ˆ u s : Z → R n α . To enforce theviability of ˆ V according to Definition 2, we require ˆ v and ˆ u s to satisfy four conditions: Viability Conditions. ˆ V does not intersect Z F (i.e. ˆ v (ˆ x ) < , ∀ ˆ x ∈ Z F )2) All states that are contained in both the guard and ˆ V must be mapped to a state in ˆ V (i.e. ˆ v ( ˆ∆(ˆ x )) ≥ , ∀ ˆ x ∈{ ˆ x ∈ ˆ S | ˆ v (ˆ x ) ≥ } )3) At the boundary of ˆ V (i.e. where ˆ v (ˆ x ) = 0 ), the stateflows inward under the controller ˆ u s (i.e. d ˆ vdt > )4) The autonomous safe controller must satisfy the inputbounds within the safe set (i.e. ˆ u s (ˆ x ) ∈ ˆ U, ∀ ˆ x ∈ ˆ V ) Condition 1 ensures that states can not leave the viabilitydomain by simply leaving the space Z . Condition 2 ensuresthat states can not leave the viability domain when traversing aguard. Condition 3 ensures that states cannot leave the viabilitydomain under the continuous dynamics of the system. Finally,Condition 4 ensures that our controller respects the robottorque constraints. Each of these conditions are ensured witha corresponding sums-of-squares constraint, giving us: SoS Constraint 1. (Viability Condition 1) − ˆ v − σ h F ∈ SoS
Here σ : Z → R × n hf ∈ SoS are sums-of-squarespolynomials that relax the positivity constraint outside Z Fp .We refer to such polynomials as s-functions . SoS Constraint 2. (Viability Condition 2) ˆ v + − ˆ v − − σ h R − σ h − S ∈ SoS
Here σ : Z × Z → R × n hr , σ : Z → R × n hs ∈ SoS are s-functions. The superscripts − and + indicate whether afunction is evaluated using the first ( − ) or second ( + ) argu-ment of h R : Z × Z → R n hr . That is, this constraint enforces: ˆ v (ˆ x + ) − ˆ v (ˆ x − ) − h R (ˆ x − , ˆ x + ) σ (ˆ x − , ˆ x + ) − h S (ˆ x − ) σ (ˆ x − ) > , ∀ (ˆ x − , ˆ x + ) ∈ Z × Z . Note that the addition of the σ termis not strictly necessary, since points in ˆ S p must lie in R p .However, this term can help relax the constraint when pointsin R p lie outside ˆ S p . IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JULY, 2019
SoS Constraint 3. (Viability Condition 3) L ˆ x ˆ f p ˆ v + L ˆ x ˆ g p ˆ v ˆ u s + n z (cid:88) j =1 q j + ˆ vλ + σ h F ∈ SoS q − L ˆ x ˆ e p ˆ v + σ h F ∈ SoS q + L ˆ x ˆ e p ˆ v + σ h F ∈ SoS
Here σ : Z → R × n hf ∈ SoS and σ , σ : Z → R n d × n hf are s-functions that relax the constraint inside Z Fp , and λ : Z → R is a slack polynomial that can relax the constraintwhenever ˆ v (cid:54) = 0 . The polynomials q : Z → R n z are usedto bound the effects of the dynamics error ˆ e p on the timederivative of ˆ v . SoS Constraint 4. (Viability Condition 4) ˆ u s − ˆ u min + h F σ ∈ SoS − ˆ u s + ˆ u max + h F σ ∈ SoS
Here σ , σ : Z → R n α ∈ SoS are s-functions that relax theconstraint inside Z Fp .The desired objective of our optimization is to maximizethe volume of ˆ V . This volume is difficult to compute exactlyfor an arbitrary ˆ v , since the domain of integration is givenby a semi-algebraic set. We propose an analytically tractableapproximation to this objective: (cid:90) Z ˆ v (ˆ x ) d ˆ x. (18)This objective is combined with the following constraint inorder to approximate the volume of ˆ V : SoS Constraint 5. (Objective Constraint) − ˆ v ∈ SoS . To understand how this objective and constraint approximatethe volume of ˆ V , take a continuous function ˆ v that satisfies theconstraints of the previous section. For every point ˆ x not in theset Z F , the value ˆ v (ˆ x ) is constrained only by Constraint (5).This means that ˆ v (ˆ x ) can increase to a value of 1 for pointsinside ˆ V , and ˆ v (ˆ x ) increases to a value of 0 for points outsidethis set. As a result, ˆ v approaches the indicator function over ˆ V , and the integral in the objective function approaches thevolume of ˆ V .Combining the constraints and objective, we arrive at thefollowing sums-of-squares problem: sup ˆ v, ˆ u s ,q,λσ ,...,σ (cid:90) Z ˆ v (ˆ x ) d ˆ x (19) s . t . SoS Constraints 1 − ,σ , . . . , σ ∈ SoS
To express this problem as a semi-definite program or SDP(which can be solved with commercial solvers), all
SoS constraints must be linear functions of the decision variablepolynomials. However, Constraint 3 in the above problemincludes the terms L ˆ x ˆ g p ˆ v ˆ u s and λ ˆ v which are bilinear in ˆ u s , ˆ v and in λ, ˆ v respectively. Problems of this form are referredto as bilinear sums-of-squares problems. The bilinear nature of the constraints means that these problems are non-convex,and we can no longer guarantee a globally optimal solution tothis problem.To solve this nonconvex bilinear sums-of-squares programwe turn to a strategy called alternation. This strategy breaks(19) into a pair of linear sums-of-squares programs which caneach be solved using a commercial solver. In each programone of the bilinear variables is kept fixed while the otheris optimized over. The variables that were optimized arethen fixed while the other pair of variables are optimized.If the final solution satisfies the constraints of the originalprogram, the solution is guaranteed to be a viability domain.Computationally, each SDP is formulated in spotless andsolved using Mosek. C. Guaranteed Safe Semi-autonomous Controller
We use a feasible solution to the above optimization prob-lem to generate a guaranteed safe semi-autonomous controller.This controller modifies user input to ensure that the ViabilityConditions 3 and 4 are always satisfied. Condition 4 can beenforced by saturating the user inputs to always lie withinthe input bounds. To enforce condition 3, we note that itis only active on the boundary of ˆ V . This means that wecan ensure safety so long as we use the autonomous safecontroller ˆ u s when the state lies on the boundary of ˆ V , i.e. { ˆ x ∈ Z | ˆ v (ˆ x ) = 0 } .Since a controller that is discontinuous on the boundaryof the safe set would pose difficulties for systems with finitebandwidth, we additionally must ensure that the new controlleris continuous near the boundary. To achieve this, we smoothlyinterpolate between the user input ˆ u and the guaranteed safecontroller ˆ u s (which satisfies the safety condition when ˆ v (ˆ x ) =0 ) to get the regulated input ˆ u r : ˆ u r = ˆ u + (ˆ u s (ˆ x ) − ˆ u ) w s (ˆ v (ˆ x ) , (cid:15) ) , (20)where ˆ u s and ˆ v are computed using (19), w s : R → [0 , isa smooth step-like function that satisfies w s ( v, (cid:15) ) = 0 , ∀ v ≥ (cid:15) , and w s ( v, (cid:15) ) = 0 , ∀ v ≤ (cid:15)/ , and (cid:15) ∈ (0 , controls thesmoothness of the interpolation.When ˆ x satisfies ˆ v (ˆ x ) > (cid:15) , the user input is unmodified,as we are sufficiently removed from the boundary of the safeset. When ≤ ˆ v (ˆ x ) ≤ (cid:15)/ , the safe controller is fully active,keeping the state in the safe set.V. R ESULTS
We used the proposed approach to compute a viabilitydomain for the robot Rabbit [2]. The viability domain is rep-resented using a set of 8 degree-4 polynomials, each coveringan interval within the full range of θ . A two-dimensional sliceof the viability domain ˆ V is shown in Fig. 2.To demonstrate the semi-autonomous safe controller, weused it to ensure safety while performing a reference fol-lowing task. The task is to track a time-varying pitch angle α d : [0 , ∞ ) → A . To follow the target, we set a desired pitchacceleration u α using a ”na¨ıve” PD controller: u dα ( x α , t ) = k p ( α d ( t ) − α ) + k d ( ˙ α d ( t ) − ˙ α ) + ¨ α d ( t ) . (21) https://github.com/spot-toolbox/spotless MIT-ANSEEUW et al. : WALKING WITH CONFIDENCE 7 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 0.200.511.52 V^ S ^Δ^ Z (Actuator Limits Exceeded) Z F q Fr Z F (Falling Backwards) Fig. 2. A 2D slice (along α = ˙ α = 0 ) of the four-dimensional viabilitydomain ˆ V (shown in green) for Rabbit. The border at the right correspondsto the hybrid guard ˆ S of foot touchdown, where the state is reset under themap ˆ∆ to the left of the figure. The unsafe set Z F is shown in red. We avoidthe lower region ( ˙ θ < ) in order to conservatively prevent backwards falls.The upper region conservatively approximates the region in which the controlinput (8) violates the torque limits of the robot. By modifying the controlinput whenever Rabbit is at the edge of ˆ V , Z F can be avoided indefinitely.This is illustrated in the attached video. Finally, the periodic trajectory usedto generate our targets q Fr is shown in dashed black. Note that our viabilitydomain is able to guarantee robot safety even for states far away from thisnominal trajectory. We used the feedback controller (8) to map the desiredacceleration to the four motor torques of the Rabbit model.For the feedback controller to respect Rabbit’s actuatortorque limits, we first saturated u α with a real-time QuadraticProgram (QP) to get the input to our safety regulator: ˆ u ( x, x α , t ) =min u α (cid:12)(cid:12) u α − u dα ( x α , t ) (cid:12)(cid:12) (22) s . t . u ∗ ( x, x α , u α ) ∈ [ −
30 Nm ,
30 Nm] Using a QP to satisfy the actuator constraints of the systemis similar to many state of the art approaches for high-dimensional robot control [17], [18], [19], [20] . A majorlimitation of these approaches is the inability to guarantee thefeasibility of the QP. That is, for some states, there may notbe an input that satisfies the actuator constraints (the set ofsuch states is shown in red in Fig. 2).Our approach guarantees the feasibility of (22) by con-straining the state of the robot to be within the QP-feasibleregion (i.e. outside of Z F in Fig. 2). To maintain this stateconstraint, we modified the input ˆ u using the guaranteedsafe semi-autonomous controller defined in (20). In Fig. 3,we compare the results of the na¨ıve controller (21) and thesafe controller (20) using a simulation of the full dynamics ofthe robot Rabbit.When tracking the backwards pitch target, the na¨ıve con-troller slows to the point of falling backwards, while the safecontroller deviates slightly to maintaint forward walking. Forthe forward pitch target, the na¨ıve controller speeds up as itleans forward. At a certain speed, it cannot longer stay on thelow dimensional manifold under the torque limits and falls.The safe controller recognizes this risk early and deviatesfrom the desired forward pitch before reaching this speed. Thebottom-left figure shows how the set of torque-limit satisfyingcontrol inputs disappears for the na¨ıve controller.This task demonstrates that robot safety can be maintainedeven for states that are far away from any periodic limit cycle. Time (s) -0.200.20.4 P i t c h A ng l e (r ad ) ReferenceSafe TrackingNaive Tracking -20020-20020 I npu t (r ad / s ) Attainable (U)RequestedExecuted
Naive TrackingNaive TrackingSafe TrackingSafe TrackingNaive TrackingSafe Tracking
Time (s) ^ Failure(step back)Safety controller activates Failure(swing leg too slow)
Fig. 3. Tracking performance of the safe (20) and na¨ıve (21) controllersfollowing two reference trajectories under the full rabbit dynamics. The pitchangles are shown in the top left. For both references, the safe controllermodifies the input before safety is at risk, while the na¨ıve controller followsthe reference even as it leads to failure. Failure for the upper trajectorycorresponds to stepping backwards, and in the lower trajectory correspondsto moving too fast for the swing leg to reach its target. The bottom left figureshows desired input u dα and executed input for both nai¨ıve and safe trackingcontrollers following the second reference target. The state-dependent regionof inputs that satisfy the torque constraints are shown in grey. Note that underthe na¨ıve controller, this region vanishes as the forward walking speed of therobot becomes too high. Stills from the simulation trajectories are shownon the right. The dotted line is the desired pitch, and the faded line is thenearest on-manifold state q ( θ, α ) . See the attached video for an animatedpresentation of these results. Indeed, the only periodic limit cycle used in our approachkeeps the body pitch relatively upright ( α = 0 ). As such, ourapproach broadens the set of real-time safe behaviours thatcan be executed by Rabbit, since previous methods [16], [18],[20], [21] would all require a pre-computed limit cycle foreach new reference trajectory.VI. C ONCLUSION
This paper presents a method to construct a guaranteedsafe semi-autonomous controller for high-dimensional walkingrobots. The resulting controller guarantees viability and allowsfor flexible input when viability is not at risk. The method isevaluated on a model of the robot Rabbit, and a tracking taskis used to illustrate its capabilities. With a 10-dimensional statespace, this model is larger than any known model for whichcontinuous-time safety guarantees have been generated.Despite this increase in model dimension, our example isstill somewhat simplified: the dynamics are two dimensional,the terrain is flat, and the range of behaviour is limited tomodifying the torso pitch angle. In contrast, bipedal robots inthe world must traverse three dimensional, varied terrain whileperforming a wide range of tasks.When extending our method to these cases, a trade-offarises between the degree of underactuation of the model, thegenericity of the behaviour (i.e. the number of shaping param-eters), and the computational complexity of the optimizationproblem. From Section III-B, the dimension of the reducedorder manifold (our state space) is twice the sum of thedegree of underaction and the number of shaping parameters.In [12], the authors show that a 6 dimensional state space is
IEEE ROBOTICS AND AUTOMATION LETTERS. PREPRINT VERSION. ACCEPTED JULY, 2019 tractable for similar sums-of-squares programs. Our approachcan thus currently handle a maximum of three degrees ofunderactuation and/or shaping parameters.Under this constraint, we can directly extend our method to3D. For instance, take the 3D biped with controlled steeringgiven in [21]. This application has two degrees of underactu-ation (pitch and yaw) and would have one shaping parametercontrolling yaw rate (i.e. turning left or right). Using ourmethod, we could construct a safe steering controller for therobot that avoids the risk of turning too quickly and falling.An extension to rough terrain, however, will likely requireimprovements in scaling of the sums-of-squares problem. Suchscaling improvements are an active research target [27], [28].The core insight behind our approach is that sums-of-squares and hybrid zero dynamics are remarkably complemen-tary tools. Sums-of-squares analysis generates the set basedguarantees needed to render hybrid zero dynamics safe, andhybrid zero dynamics provides the dimensionality reductionneeded for sums-of-squares analysis to be tractable. The keyinnovation for combining these two tools was the introductionof a set of shaping parameters which control the dynamicson the manifold. The ability to combine sums-of-squaresand hybrid zero dynamics presents a promising path forwardfor building guaranteed safe walking controllers for complexlegged robots. R
EFERENCES[1] J. Pratt, J. Carff, S. Drakunov, and A. Goswami, “Capture point: A steptoward humanoid push recovery,” in , pp. 200–207, Dec 2006.[2] C. Chevallereau, G. Abba, Y. Aoustin, F. Plestan, E. Westervelt, C. C.de Wit, and J. Grizzle, “Rabbit: A testbed for advanced control theory,”
IEEE Control Systems Magazine , vol. 23, no. 5, pp. 57–79, 2003.[3] P. A. Parrilo,
Structured semidefinite programs and semialgebraic ge-ometry methods in robustness and optimization . PhD thesis, CaliforniaInstitute of Technology, 2000.[4] P. Wieland and F. Allg¨ower, “Constructive safety using control barrierfunctions,”
IFAC Proceedings Volumes , vol. 40, no. 12, pp. 462–467,2007.[5] A. Majumdar, R. Vasudevan, M. M. Tobenkin, and R. Tedrake, “Convexoptimization of nonlinear feedback controllers via occupation measures,”
The International Journal of Robotics Research , p. 0278364914528059,2014.[6] D. Henrion and M. Korda, “Convex computation of the region of attrac-tion of polynomial control systems,”
IEEE Transactions on AutomaticControl , vol. 59, no. 2, pp. 297–312, 2014.[7] M. Korda, D. Henrion, and C. N. Jones, “Controller design and valuefunction approximation for nonlinear dynamical systems,”
Automatica ,vol. 67, pp. 54–66, 2016.[8] S. Prajna and A. Jadbabaie, “Safety verification of hybrid systemsusing barrier certificates,” in
International Workshop on Hybrid Systems:Computation and Control , pp. 477–492, Springer, 2004.[9] V. Shia, R. Vasudevan, R. Bajcsy, and R. Tedrake, “Convex computationof the reachable set for controlled polynomial hybrid systems,” in , pp. 1499–1506, Dec 2014.[10] T. Koolen, M. Posa, and R. Tedrake, “Balance control using centerof mass height variation: limitations imposed by unilateral contact,”in
Humanoid Robots (Humanoids), 2016 IEEE-RAS 16th InternationalConference on , pp. 8–15, IEEE, 2016.[11] P.-B. Wieber, “On the stability of walking systems,” in
Proceedings ofthe international workshop on humanoid and human friendly robotics ,2002.[12] M. Posa, T. Koolen, and R. Tedrake, “Balancing and step recovery cap-turability via sums-of-squares optimization,” in , 2017.[13] P. Zhao, S. Mohan, and R. Vasudevan, “Optimal control for nonlinear hy-brid systems via convex relaxations,” arXiv preprint arXiv:1702.04310 ,2017. [14] J. Z. Tang, A. M. Boudali, and I. R. Manchester, “Invariant funnelsfor underactuated dynamic walking robots: New phase variable andexperimental validation,” in , pp. 3497–3504, May 2017.[15] E. Westervelt, J. Grizzle, and D. Koditschek, “Hybrid zero dynamics ofplanar biped walkers,”
IEEE Transactions on Automatic Control , vol. 48,no. 1, pp. 42–56, 2003.[16] A. D. Ames, K. Galloway, K. Sreenath, and J. W. Grizzle, “Rapidlyexponentially stabilizing control lyapunov functions and hybrid zerodynamics,”
IEEE Transactions on Automatic Control , vol. 59, no. 4,pp. 876–891, 2014.[17] S. C. Hsu, X. Xu, and A. D. Ames, “Control barrier function basedquadratic programs with application to bipedal robotic walking,” in , pp. 4542–4548, July 2015.[18] Q. Nguyen and K. Sreenath, “Optimal robust control for bipedalrobots through control lyapunov function based quadratic programs.,”in
Robotics: Science and Systems , 2015.[19] Q. Nguyen and K. Sreenath, “Exponential control barrier functions forenforcing high relative-degree safety-critical constraints,” in
AmericanControl Conference (ACC), 2016 , pp. 322–328, IEEE, 2016.[20] Q. Nguyen, A. Hereid, J. W. Grizzle, A. D. Ames, and K. Sreenath, “3ddynamic walking on stepping stones with control barrier functions,” in , pp. 827–834, Dec 2016.[21] M. S. Motahar, S. Veer, and I. Poulakakis, “Composing limit cycles formotion planning of 3d bipedal walkers,” in , pp. 6368–6374, IEEE, 2016.[22] S. Veer and I. Poulakakis, “Safe adaptive switching among dynamicalmovement primitives: Application to 3d limit-cycle walkers,” arXivpreprint arXiv:1810.00527 , 2018.[23] A. D. Ames, P. Tabuada, A. Jones, W.-L. Ma, M. Rungger,B. Sch¨urmann, S. Kolathaya, and J. W. Grizzle, “First steps towardformal controller synthesis for bipedal robots with experimental imple-mentation,”
Nonlinear Analysis: Hybrid Systems , vol. 25, pp. 155–173,2017.[24] S. Sastry,
Nonlinear systems: analysis, stability, and control , vol. 10.Springer Science & Business Media, 2013.[25] A. Hereid and A. D. Ames, “Frost*: Fast robot optimization andsimulation toolkit,” in , pp. 719–726, IEEE, 2017.[26] K. Sreenath, H.-W. Park, I. Poulakakis, and J. W. Grizzle, “A complianthybrid zero dynamics controller for stable, efficient and fast bipedalwalking on mabel,”
The International Journal of Robotics Research ,vol. 30, no. 9, pp. 1170–1193, 2011.[27] D. Papp and S. Yildiz, “Sum-of-squares optimization without semidef-inite programming,”
SIAM Journal on Optimization , vol. 29, no. 1,pp. 822–851, 2019.[28] A. A. Ahmadi and A. Majumdar, “Dsos and sdsos optimization: Lp andsocp-based alternatives to sum of squares optimization,” in2014 48thannual conference on information sciences and systems (CISS)