Evolved embodied phase coordination enables robust quadruped robot locomotion
Jørgen Nordmoen, Tønnes F. Nygaard, Kai Olav Ellefsen, Kyrre Glette
Jørgen Nordmoen
University of Oslo, [email protected]
Tønnes F. Nygaard
University of Oslo, [email protected]
Kai Olav Ellefsen
University of Oslo, [email protected]
Kyrre Glette
RITMO, University of Oslo, [email protected]
ABSTRACT
Overcoming robotics challenges in the real world requires resilient control systems capable of handling a multitude of environments and unforeseen events. Evolutionary optimization using simulations is a promising way to automatically design such control systems; however, if the disparity between simulation and the real world becomes too large, the optimization process may result in dysfunctional real-world behaviors. In this paper, we address this challenge by considering embodied phase coordination in the evolutionary optimization of a quadruped robot controller based on central pattern generators. With this method, leg phases, and indirectly also inter-leg coordination, are influenced by sensor feedback. By comparing two very similar control systems we gain insight into how the sensory feedback approach affects the evolved parameters of the control system, and how the performances differ in simulation, in transferal to the real world, and to different real-world environments. We show that evolution enables the design of a control system with embodied phase coordination which is more complex than previously seen approaches, and that this system is capable of controlling a real-world multi-jointed quadruped robot. The approach reduces the performance discrepancy between simulation and the real world, and displays robustness towards new environments.
KEYWORDS
TEGOTAE, CPG, evolutionary robotics
ACM Reference Format:
Jørgen Nordmoen, Tønnes F. Nygaard, Kai Olav Ellefsen, and Kyrre Glette. 2019. Evolved embodied phase coordination enables robust quadruped robot locomotion. In Genetic and Evolutionary Computation Conference (GECCO ’19), July 13–17, 2019, Prague, Czech Republic. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3321707.3321762
GECCO ’19, July 13–17, 2019, Prague, Czech Republic. © 2019 Copyright held by the owner/author(s). Publication rights licensed to the Association for Computing Machinery. This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Genetic and Evolutionary Computation Conference (GECCO ’19), July 13–17, 2019, Prague, Czech Republic, https://doi.org/10.1145/3321707.3321762.

Legged robots are an important means for increasing robot presence in everyday life and can be a valuable tool in difficult tasks such as search and rescue. Because of their increased mobility, they promise to aid on the user’s terms instead of requiring the user to accommodate the robot. To achieve this vision legged robots need to be able to adapt to unknown and changing environments, a feat that is made more difficult by the robot’s own morphology when compared to simpler wheeled robots.

A promising avenue of research is Evolutionary Robotics (ER), which aims to adapt both a robot’s shape and its control system to new challenges [7]. In ER, adaptation takes place over many trials in a setup which often leverages software simulation to allow complete oversight and reduce experiment time. One of the biggest challenges in ER is the transition from simulation to the real world. The optimization process taking place in simulation, adapting the controller to the robot and the environment, may deviate from useful real-world behaviors [13]. This difference in behavior between the simulated and the real robot is often called the reality gap [16].

Figure 1: Ground Reaction Force acting on the legs of the robot (shown with orange arrows) aids the Central Pattern Generator (CPG) control system in handling different environments through sensor feedback and embodied phase coordination (illustrated with purple arrows). [Diagram labels: Controller; Sensor feedback; Simulation; Real world; Robot & environment.]
Many different approaches to tackling the reality gap have been proposed in the literature. One possible way to deal with the challenge would be to accept that there are differences between the simulator and the environment, in the same way as the robot will encounter environmental differences in the real world. In other words, the simulation is treated as just another environment that needs controller adaptation [33].

One way to achieve this is to sense the environment and have the control system react based on information gathered from sensors. For complex legged robots this task is difficult because of the need for both intralimb and interlimb coordination, where each joint might be dependent on disparate sensor inputs [4]. Indeed, the use of sensory feedback for controller adaptation in the ER field has mostly been seen in wheeled robot applications and is less common when legged robots are the application domain. Some examples include Morse et al. [15] and Tarapore and Mouret [32], which both utilize touch sensors to trigger an instantaneous phase reset. Another example is Gay et al. [8], which combined a Central Pattern Generator with neural network sensor feedback, achieving continuous adaptation to sensor input but requiring comprehensive engineering of both the Central Pattern Generator and the neural network.

An intriguing approach to incorporating sensor feedback in robotics is the TEGOTAE approach [24, 25]. TEGOTAE relies on the concept of embodiment, leveraging the robot body to simplify control [27]. Specifically, TEGOTAE utilizes Ground Reaction Force (GRF) sensors to adapt phase between legs without explicit coordination. The phase of each leg is controlled individually and is slowed down or sped up depending on sensor feedback, implicitly allowing for embodied control of phase coordination. In this way, the TEGOTAE approach is less complex than explicit phase coordination because it does not require phase differences to be optimized or specified up front.
Because TEGOTAE only affects a small portion of the overall control structure it can easily be combined with different control approaches, widening its appeal for ER research. However, most of the work dealing with TEGOTAE so far has focused on analyzing locomotion to show the advantages of the TEGOTAE sensory feedback mechanism [3], using hand-tuned parameters and inverse kinematics or single-jointed legs with low-complexity controllers.

In this paper, we demonstrate that we can use ER to incorporate embodied phase coordination for a complex control task: we use the hard-to-balance quadruped ‘DyRET’ robot, shown in simulation in Figure 1, with a Central Pattern Generator (CPG)-based control system for directly controlling three joints per leg. We evolve two similar CPG control systems, where the first system does not incorporate feedback and the other enables embodied phase coordination through sensor feedback. We compare the simulation results of the two control systems with real-world re-evaluation to understand the effect of sensor feedback on the controllers in the context of the reality gap. Lastly, we perform a case study where the two control systems are tested in two different and more difficult real-world environments, giving insight into the adaptability of the different controller approaches.

Our results show that embodied phase coordination can readily be combined with a complex CPG control system and achieve well-performing gaits through evolutionary optimization. Whereas the plain CPG controller achieves high performance in simulation, it suffers from a significant reduction in performance when transferred to the real world. With embodied phase coordination enabled, the control system does not achieve unrealistically high performance in the simulation, and when re-evaluated in the real world the simulated performance is almost fully retained. Stability is better than for the plain controller, and the speed is similar to the plain controller after a warm-up period.
Through our case study, we can also observe that sensor feedback makes the control system more robust than the plain controller when transitioning to new and more difficult surfaces.

The contributions of our paper are the following. Firstly, we show that embodied phase coordination can readily be combined with an extensive CPG control system on a difficult-to-control quadruped robot. Secondly, we show that the full control system can be evolved and results in robust controllers with little reality gap. In addition, our case study demonstrates how embodied phase coordination allows our control system to transition to unknown environments. Our contributions aid the understanding of TEGOTAE both as a control system mechanism and as a technique in ER for environmental adaptation and reduced reality gap.
In this section we will review related and relevant work regarding ER and the reality gap, control systems in the field of ER, and TEGOTAE.
ER draws on principles like selection, variation and hereditary traits found in biological evolution to design robots with embodied intelligence [7]. In the early days of the field, experiments were often conducted in the real world; however, this trend has shifted in recent years to favor evolution in simulation. Simulation allows for more control of the environment and rapid verification, but it also brings with it some challenges. The reality gap is the discrepancy that often occurs between the performance of a robot in software simulation and real-world testing [12]. This problem is challenging in ER since robots evolved in simulation tend to become finely tuned to the simulator, including the areas where the simulator disagrees with the real world [16, 29].

The easiest way to avoid the reality gap is to simply not evolve the controller in simulation [19, 21, 23]. This approach avoids the reality gap, but hardware evolution still has its limitations, including the limited number of evaluations due to time constraints, the fact that hardware will wear and deteriorate, and the problem of breakage while exploring suboptimal movement during controller adaptation [7]. For these reasons other approaches are still sought after, but always with real-world testing as the final verification [18].

Approaches to solving the reality gap include the introduction of noise in simulation [12], adding obstacles to promote robustness [9], optimizing the simulation to better replicate the real world [36], or ensuring that simulation and the real world agree on the evaluations [13]. Another solution is to include sensor feedback so that the algorithm can adapt online to the current environment [33]; however, this approach is seldom used because of the difficulty in integrating sensor feedback and the time-consuming task of calibrating simulated sensors to the real world [29].
ER has motivated many different control systems for legged robots [4]. Artificial Neural Networks (ANNs) have been used extensively because of their ability to represent complex functions, given the right evolved structure and connection weights [5, 14, 34]. Extensions to these control systems have also been proposed, like the SUPG approach [15], which combines the advantages of neuroevolution with sensor feedback. At the other end of the complexity scale are simpler controllers that directly calculate joint angles based on splines or sine waves [6, 20]. These simpler control systems are often used because of the inherent complexity involved in evolving ANNs [31].

Another alternative is the CPG control system [11]. This biologically inspired controller is popular because of the wide diversity in implementation, from a mathematical model [8] to generative encoding [32], and its inherent flexibility in combining phase-coupling with traditional kinematic control [35]. CPG control systems have also been investigated in relation to incorporating sensor feedback [1, 2]; however, these approaches require extensive engineering to incorporate sensor feedback, which can be unsatisfactory if this is not the main focus of the intended work.
TEGOTAE is a minimalistic approach to utilizing sensor feedback for emergent phase-coupling between the legs of the body [25]. Instead of explicitly forcing a given phase-coupling between legs, the system relies on GRF sensors and an explicit decoupling of legs to adapt each leg to the environment while simultaneously coordinating the legs through the body. Because TEGOTAE feedback utilizes GRF sensors, simulator calibration is limited to accurate weight estimation of the robot and correct physics simulation. This is in contrast to other sensor-based approaches, which often require extensive calibration [12].

TEGOTAE has several advantageous properties, such as spontaneous gait transitioning [24] and robustness to rough terrain [17], in addition to the emergent phase adaptation. Because of its minimal definition, TEGOTAE can also be combined with several different control systems, making it an interesting candidate for control system research within the ER community. Most of the work dealing with TEGOTAE has focused on analyzing locomotion to show the advantages of the TEGOTAE sensory feedback mechanism [3]. In this paper we extend that research to encompass ER and argue that TEGOTAE sensor feedback is versatile, can readily be combined with a complex CPG control system without the use of inverse kinematics, is robust in light of controller evolution, and can work for complex quadruped robots.
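To make the mechanism concrete, the following is a minimal sketch of the per-leg phase update that this paper later gives as Equation 13, integrated with a simple forward-Euler step under a constant ground reaction force. The function names, the integration scheme and all numeric values are illustrative assumptions and are not taken from the experiments.

```python
import math


def tegotae_phase_rate(phi, omega, alpha, grf):
    """Phase velocity of one leg: 2*pi*omega - alpha * N * cos(phi)."""
    return 2.0 * math.pi * omega - alpha * grf * math.cos(phi)


def integrate_phase(phi0, omega, alpha, grf, duration, dt=0.001):
    """Forward-Euler integration of the phase under a constant GRF (assumed scheme)."""
    phi = phi0
    for _ in range(int(round(duration / dt))):
        phi += dt * tegotae_phase_rate(phi, omega, alpha, grf)
    return phi


# Illustrative values only (hypothetical, not from the paper).
OMEGA, ALPHA = 0.5, 2.0

# Without load the phase advances at the constant rate 2*pi*omega.
unloaded = integrate_phase(0.0, OMEGA, ALPHA, grf=0.0, duration=2.0)

# With a large constant load (alpha * N > 2*pi*omega) the phase is held
# at a fixed point where the phase velocity is zero.
loaded = integrate_phase(0.0, OMEGA, ALPHA, grf=3.0, duration=5.0)
```

In this sketch a sufficiently loaded leg stops advancing its phase, while an unloaded leg keeps cycling at the base frequency. On a real body the load on each leg changes as the other legs move, which is how the legs influence each other without explicit coupling terms.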
In this section we will describe the four-legged robot, the control systems and the evolutionary setup utilized for the experiments in the paper. For additional material and software download, see https://folk.uio.no/jorgehn/tegotae/
Figure 2: (a) shows a visual representation of a leg of the robot with joints marked (Joint 0, Joint 1, Joint 2). (b) shows an example control curve for Joint 0 and Joint 1, while (c) shows an example control curve for Joint 2. [Panels: (a) Joint configuration; (b) CPG output for Joint 0 and Joint 1; (c) CPG output for Joint 2. Horizontal axis: time in seconds; vertical axis ticks from -1.0 to 1.0.]
We use the custom developed ‘DyRET’ platform [22], shown in simulation in Figure 1, together with Robot Operating System (ROS) [28] and Gazebo/ODE for simulation. For more information on the platform, see https://github.com/dyret-robot/dyret_documentation/. ‘DyRET’ is a four-legged robot with a mammalian morphology. Each of the four legs contains three rotational joints as illustrated in Figure 2a. Each of the joints contains a PID controller to which our control system periodically sends desired joint angles. The simulation and the real-world robot operate on the same set of inputs, which allows us to rapidly change between simulation and real-world experiments.

To measure GRF, force sensors of type ‘OptoForce OMD-20-SE-40N’ are attached to each leg. Simulated versions of the GRF sensors are also utilized during evolution. Of note is that the simulated GRF sensors are not calibrated to the real-world sensors; they work through weight and gravity simulation alone. To measure the pose of the robot we utilize an ‘OptiTrack’ motion capture system for real-world evaluations, while direct measurements of the body are used in simulation.

The control system for our robot is based on a network of oscillators, a CPG [11]. The CPG is based on the work of Gay et al. [8] and variations published in related work [30]. The CPG is optimized for producing a quadruped gait with joint 2 containing a swing and stance phase, as shown in Figure 2c. This is advantageous since it makes the foot trajectory capable of level tracing during ground contact. The two equations used to produce the motion of all three joints, for each leg, are given below. Here we follow the same nomenclature as Gay et al. [8].

$$\dot{a}_{\{0,1\}} = \gamma \left( \mu_{a_{\{0,1\}}} - a_{\{0,1\}} \right) \tag{1}$$

$$\dot{o}_{\{0,1\}} = \gamma \left( \mu_{o_{\{0,1\}}} - o_{\{0,1\}} \right) \tag{2}$$

$$\dot{\phi}_{\{0,1\}} = 2\pi\omega \tag{3}$$

$$\theta_{\{0,1\}} = a_{\{0,1\}} \cos\left( F_L(\phi_{\{0,1\}}) \right) + o_{\{0,1\}} \tag{4}$$
where $F_L$ is a filter applied on the phase, given by

$$F_L(\phi_i) = \begin{cases} \dfrac{\phi_{2\pi}}{2d} & \text{if } \phi_{2\pi} < 2\pi d \\[2ex] \dfrac{\phi_{2\pi} + 2\pi(1 - 2d)}{2(1 - d)} & \text{otherwise} \end{cases}$$

and $\phi_{2\pi} = \phi_i \bmod 2\pi$.

To achieve the swing and stance phase, joint 2 utilizes the following equations:

$$\dot{a}_{2,st} = \gamma \left( \mu_{a_{2,st}} - a_{2,st} \right) \tag{5}$$

$$\dot{a}_{2,sw} = \gamma \left( \mu_{a_{2,sw}} - a_{2,sw} \right) \tag{6}$$

$$\dot{o}_2 = \gamma \left( \mu_{o_2} - o_2 \right) \tag{7}$$

$$\theta_2 = a_2 F_\Gamma(\phi_2) + o_2 \tag{8}$$

with

$$a_2 = \begin{cases} a_{2,st} & \text{if } F_L(\phi_2) < \pi \\ a_{2,sw} & \text{otherwise} \end{cases} \tag{9}$$

$$F_\Gamma(\phi_i) = \begin{cases} -[\ldots]\,\phi_N^{[\ldots]} + [\ldots]\,\phi_N^{[\ldots]} & \text{if } \phi_N < [\ldots] \\ [\ldots]\,(\phi_N - [\ldots])^{[\ldots]} - [\ldots]\,(\phi_N - [\ldots])^{[\ldots]} + [\ldots] & \text{otherwise} \end{cases} \tag{10}$$

$$\phi_N = \left( \frac{F_L(\phi_i)}{[\ldots]\,\pi} \bmod 0.[\ldots] \right) \tag{11}$$

$a_i$ represents the amplitude of the $i$-th joint, $o_i$ is the static offset for each joint, and $a_{2,st}$ and $a_{2,sw}$ are the stance and swing amplitudes for joint 2. For each of these there is a corresponding target, $\mu_i$, which describes the desired value of the variable. $\theta_i$ is the output value of each oscillator and $\omega$ is the frequency. $\gamma$ is a positive gain defining the convergence speed of the oscillator and, lastly, $d$ is a virtual duty parameter. Joint 1 utilizes the same equations as joint 0, and both joint 1 and joint 2 are internally connected to joint 0 according to $\phi_n = \phi_{n-1} + \psi_n$, where $\psi_n$ is the desired phase shift between oscillators within a leg. A visual representation of the output of Equations 4 and 8 can be seen in Figure 2b and 2c, respectively, where we have set $a = [\ldots]$, $o = [\ldots]$, $d = [\ldots]$, $a_{2,st} = [\ldots]$, $a_{2,sw} = [\ldots]$ and $o_2 = [\ldots]$.

[…] $i$ and $j$ is obtained by the following changes to Equation 3:

$$\dot{\phi}_i = 2\pi\omega + \sum_{j = [\ldots],\, j \neq i}^{n} w_{ij} \sin\left( \phi_j - \phi_i - \varphi_{ij} \right) \tag{12}$$

where $\varphi_{ij}$ is the desired phase difference between each oscillator and $w_{ij}$ is a positive gain defining the coupling strength.

For the closed-loop controller with TEGOTAE sensor feedback, the static phase-coupling is not utilized and GRF is instead sensed to slow down or speed up the phase of the oscillators [25].
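As a brief aside, the open-loop building blocks defined so far (the duty filter $F_L$, the joint output of Equation 4, the convergence dynamics of Equations 1 and 2, and the phase coupling of Equation 12) can be sketched as below. This is a hedged illustration rather than the controller used in the experiments: the function names, the Euler step and the use of a single coupling strength are assumptions, and the duty filter is written on the assumption that it maps the first fraction $d$ of the cycle onto $[0, \pi)$ and the remainder onto $[\pi, 2\pi)$.

```python
import math


def duty_filter(phi, d):
    """F_L: warp the phase so a fraction d of the cycle maps onto [0, pi)."""
    p = phi % (2.0 * math.pi)
    if p < 2.0 * math.pi * d:
        return p / (2.0 * d)
    return (p + 2.0 * math.pi * (1.0 - 2.0 * d)) / (2.0 * (1.0 - d))


def joint_output(a, o, phi, d):
    """Equation 4: theta = a * cos(F_L(phi)) + o."""
    return a * math.cos(duty_filter(phi, d)) + o


def relax(value, target, gamma, dt):
    """One Euler step of Equations 1 and 2: d(value)/dt = gamma * (target - value)."""
    return value + dt * gamma * (target - value)


def coupled_phase_rate(i, phases, omega, w, phase_diff):
    """Equation 12 with one shared coupling strength w (assumed for brevity)."""
    coupling = sum(
        w * math.sin(phases[j] - phases[i] - phase_diff[i][j])
        for j in range(len(phases))
        if j != i
    )
    return 2.0 * math.pi * omega + coupling


# Hypothetical two-oscillator example that already sits at its desired offset.
phases = [0.0, math.pi / 2.0]
phase_diff = [[0.0, math.pi / 2.0], [-math.pi / 2.0, 0.0]]
rate = coupled_phase_rate(0, phases, omega=0.5, w=1.0, phase_diff=phase_diff)
```

When the oscillators already hold the desired phase difference the coupling term vanishes and each phase advances at $2\pi\omega$, which is the behaviour Equation 12 is designed to converge to.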
The changes to Equation 3 are as follows:

$$\dot{\phi}_i = 2\pi\omega - \alpha N_i \cos(\phi_i) \tag{13}$$

where $N_i$ is the magnitude of the GRF sensed for leg $i$ and $\alpha$ is the attraction coefficient. The equation will speed up or slow down the phase of all CPGs in the leg, trying to stabilize the body with the leg in stance position when force is detected at the foot sensor.

Table 1: Control parameters for the two control systems. All parameters are shared between the two control systems with identical implications, except for the attraction coefficient, which is only applicable to the closed-loop system (†), while coupling strength and phase difference are only applicable to the open-loop control system (‡).

Category   Name                       Variable          Value range
Global     Frequency                  $\omega$          […]
           […]                        $\gamma$          [[…], […]]
           Duty Cycle                 $d$               [[…], […]]
           Attraction coefficient †   $\alpha$          [[…], […]]
           Coupling strength ‡        $w$               [[…], […]]
           Phase difference ‡         $\varphi_{ij}$    […]
[…]        […]                        $\mu_r$           […]
           […]                        $\mu_o$           […]
[…]        […]                        $\mu_r$           [[…], […]]
           Target offset              $\mu_o$           [[…], […]]
           Phase shift                $\psi$            $\pi$[−[…], […]]
Joint 2    Target swing               $\mu_{r,[…]}$     [[…], […]]
           Target stance              $\mu_{r,[…]}$     [[…], […]]
           Target offset              $\mu_o$           [[…], […]]
           Phase shift                $\psi$            $\pi$[−[…], […]]

For the rest of this paper we will refer to the CPG controller without
TEGOTAE sensor feedback as open-loop and will refer to the CPG controller with TEGOTAE sensor feedback as closed-loop, borrowing the semantics from control theory literature [26].

The parameters for the gait are shown in Table 1. To limit the search space for the two control systems we have reduced the number of parameters to only represent the control of one leg. This control is then copied and mirrored for the three other legs. The effect of this restriction is that all legs have the same movement, only separated by phase. In turn, this intentionally limits the behavior of the robot to only moving forwards or backwards. For the open-loop control system, we have additionally forced the phase difference to be a regular walking gait, more specifically a static L-S walk [24], and used a single coupling strength variable, $w_{ij} = w$. These static limitations were put in place to ensure that evolution optimized for gaits that do not put unnecessary strain on the real-world robot.

To evolve the controllers, single-objective Covariance Matrix Adaptation Evolutionary Strategy (CMA-ES) [10] was utilized. The parameters for the evolutionary algorithm are shown in Table 2. For each controller we ran the evolutionary algorithm 20 times to gather statistics about the expected performance.

Table 2: Parameters for the evolutionary algorithm.

Name                 Value
Algorithm            CMA-ES
Repetitions          20
Evaluations          2510 (250 generations)
Genome               Real-valued [[…], […]] with 10 parameters
Evaluation time      20 seconds
$\lambda$            […]
N X                  […]
$\sigma_{initial}$   […]
$\angle_{max}$       […].35 radians

Due to its morphology, with tall and heavy legs and a high center of gravity, our robot is more prone to falling during evolution compared to other robotic platforms usually utilized in ER [18]. Because of this, it is important to include some form of stability measure in the fitness function. For the experiments in this paper we utilized the maximum angular deviation of the body from an upright pose as a stability measure. This measure allows for small rapid movements of the body and will act as a force to minimize large angles that are, from experience, often a precursor to falling. To further discourage falling behavior we used distance walked as a fitness measure since it will tend to favor stable walking patterns over the quick sprint-and-fall behavior often experienced when using speed as a measure. Straight-line distance is also advantageous since it should promote gaits that walk in a straight line, not turning or doubling back. We compose distance and stability in such a way as to favor distance with stability as an additional reward. The fitness function is given below:

$$F = F_{distance} \left( 1 + F_{stability} \right) \tag{14}$$

where

$$F_{distance} = dir \, \| P_{end} - P_{start} \| \tag{15}$$

$$F_{stability} = \begin{cases} 1 - \dfrac{\max_t \| \angle Z_t \|}{\angle_{max}} & \text{if } \| \angle Z_t \| < \angle_{max} \\[2ex] [\ldots] & \text{otherwise} \end{cases} \tag{16}$$

$P_{start}$ and $P_{end}$ are the positions of the robot at the start and end of the evaluation, $dir$ is either $-$[…] $Y$-axis, and $\angle Z_t$ is the angle between the world up vector $\vec{Z}$ and the up vector of the robot pose at time $t$. The angle is normalized to $\angle_{max}$, which is used as the maximum allowable angle deviation; see Table 2 for the specific value used.

The experiments in this paper are focused on evolving gait controllers in simulation for a four-legged robot before evaluating the controllers in the real world. During evolution, we compare solutions for their capability to walk continuously through the composed single-objective fitness measure described in the previous section, Equation 14. We first compare the fitness of the two different control systems throughout evolution to evaluate if they are capable of generating continuous gaits for the whole evaluation period without falling. From the evolutionary runs, we select 5 gait controllers from each control system, which are re-evaluated in software and tested in the real world. Lastly, we perform a case study of two selected controllers in two different environments. By
Bytesting the control systems in different environments we can assess
123 0 500 1000 1500 2000 2500
Evaluation F i t n e ss Open-loop Closed-loop
Figure 3: Mean fitness, and confidence interval, for thepopulation best individual across evolutionary runs. how robust the evolved behavior is to external changes and assessthe behavioral difference between distinct environments. Figure 3 shows, for both control systems, the mean fitness of thepopulation best controller across 20 evolutionary runs. From thefigure it is clear that the open-loop control system has convergedwhile the closed-loop control system displays more variation. Partof the reason for this variation seems to be individual controllervariability where repetitions of the same controller or very similarcontrollers can display noticeable differences in fitness. Duringevolution, the open-loop control system is capable of attainingmuch better fitness compared with the closed-loop system and itis also interesting to note that the open-loop system walks furtherfrom the beginning of evolution.To better understand the evolution of parameters for the twocontrol systems we have plotted the whole genome of each controlsystem over time in Figure 4. From the two graphs, we can seethat CMA-ES is able to search the whole parameter space initiallyindicating early exploration of the search space. It is also clearthat for the open-loop control system evolution converges to asmall range of values for most variables. This is in contrast tothe closed-loop control system which has a wider distribution ofvalues for many variables, which converge for the open-loop system.Of note is that the target amplitude of joint 1 ( µ r ) converges tothe maximum value for both control systems. This could indicatethat evolution would utilize a larger value of the variable. Testsconfirm that this does give longer distance, however, at the costof movement unsuitable for the real-world robot – justifying therestriction. Another interesting observation is that some variables,e.g. µ o and µ o , converge to similar values for both control systems,while others, e.g. 
$\mu_{r_{[\ldots]}}$, have diverged to different values for each control system.

To better understand how the evolved control systems behave across the reality gap, the 5 best solutions for each control system were selected from the last evaluation for re-evaluation in simulation and real-world tests. Re-evaluation in software is done to get an impression of the controller robustness. Note that the variation observed in simulation is due to timing and dynamics of
the ROS/Gazebo setup. The following experiments are carried out with 10 repetitions per individual controller, in total 100 real-world evaluations.

Figure 4: Genotype value distribution throughout evolution of all 20 repetitions; (a) shows parameters for the open-loop (‡) controllers and (b) shows parameters for the closed-loop (†) control system. Note that $w$ and $\alpha$ are unique for each control system and all other variables have the same interpretation. [Each panel plots the parameter distribution against evaluation for ten parameters: two target offsets $\mu_o$, two phase shifts $\psi$, three target amplitudes $\mu_r$, $\gamma$, $d$, and $w$ (‡) in panel (a) or $\alpha$ (†) in panel (b).]

In Figure 5a the fitness is shown for both control systems. The plot shows, on the left, the fitness after re-evaluation in simulation and, on the right, the fitness after real-world evaluation. From the plot we can see that the open-loop control system is able to attain better fitness both in simulation and in real-world evaluations. For the closed-loop control system there is more individual variance in simulation compared to real-world evaluations.

Since the fitness measure is composed of two different metrics it can be interesting to separate them out and see how the two control systems differ on each. Figure 5b shows the distance component while Figure 5c shows stability. For distance we can see more or less the same performance difference as with fitness: the open-loop system achieves longer distances both in simulation and in the real world. With regard to stability, the difference between the two controllers is smaller than for distance, and the closed-loop system performs better than the open-loop control system in the real world. In contrast to distance, the individual variation is lower for the closed-loop system both in simulation and in real-world evaluations.

One reason why the open-loop controller achieves longer distances compared to the closed-loop control system is the TEGOTAE system’s apparent need for a warm-up period before it begins to walk. This period is used to build a phase difference between the legs and can be seen in both simulated and real-world evaluations.
To gauge the effect of this warm-up period we have plotted the distance traveled in the last half of the evaluation in Figure 5d. From the graph we can see that the difference in simulation is quite considerable; however, in the real world the two control systems perform equally well. Videos of real-world evaluations: https://folk.uio.no/jorgehn/tegotae/video/
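For reference, the composed fitness of Equation 14 and the distance and stability components discussed above can be sketched as follows. This is an illustrative reading rather than the evaluation code used in the experiments: the function names are hypothetical, the stability reward is assumed to be zero once the tilt limit is reached, and the limit of 0.35 radians in the example is likewise an assumption.

```python
import math


def distance_fitness(p_start, p_end, direction):
    """Equation 15: signed straight-line distance between start and end."""
    return direction * math.dist(p_start, p_end)


def stability_fitness(tilt_angles, max_angle):
    """Equation 16: 1 - (largest tilt / limit) while the limit is respected."""
    worst = max(abs(angle) for angle in tilt_angles)
    if worst < max_angle:
        return 1.0 - worst / max_angle
    return 0.0  # assumed reward when the tilt limit is reached


def fitness(p_start, p_end, direction, tilt_angles, max_angle):
    """Equation 14: distance scaled by (1 + stability)."""
    return distance_fitness(p_start, p_end, direction) * (
        1.0 + stability_fitness(tilt_angles, max_angle)
    )


# Hypothetical evaluation: 2 m walked, largest tilt at half the assumed limit.
MAX_ANGLE = 0.35  # radians, assumed for illustration
steady = fitness((0.0, 0.0), (0.0, 2.0), 1, [0.0, 0.1, -0.175], MAX_ANGLE)

# The same distance with a tilt beyond the limit earns no stability reward.
tipped = fitness((0.0, 0.0), (0.0, 2.0), 1, [0.5], MAX_ANGLE)
```

Because stability enters as a multiplicative bonus on distance, a gait that does not move scores zero however stable it is, which matches the stated intent of favoring distance with stability as an additional reward.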
To test if the differences in the previous results, Figures 5a, 5b, 5c and 5d, are significant, we performed a Mann-Whitney U test for each of the four combinations (comparing both control systems in simulation and in the real world, and comparing simulation with the real world for both control systems) for each of the four performance metrics. We select a threshold of significance of $\alpha = 0.05 / 16 \approx 0.003$ using Bonferroni correction with 16 comparisons. In summary, every combination is significant except open-loop versus closed-loop in ‘Real World’ for ‘Distance - last half’, shown in Figure 5d ($\alpha =$ […]).

As a case study, we selected the best controller for each control system, based on fitness, from real-world testing to test in new environments. By understanding how the controllers perform in varied real-world environments we can further characterize the transferability of the embodied phase coordination mechanism in addition to gaining insight into the robustness of our controllers. To create challenging terrains within the confines of our laboratory setup we added two carpets with different characteristics. The first one is a rough hard carpet with large woven knots simulating rough gravel, while the other is a soft thick pile carpet with high friction and a more sand-like texture (see https://folk.uio.no/jorgehn/tegotae/environments/).

For the case study, we focus on stability and distance in the last half of the evaluation, which should show minimal impact from the TEGOTAE warm-up. In Figure 6, on top, the distance in the last half of the evaluation is plotted for each controller for each of the 10 repetitions across all environments tested. From the figure, the performance of both controllers is about equal in the three real-world environments and performance decreases as the environmental difficulty increases.

Figure 5: Data from re-evaluation in simulation and real-world evaluations. The boxplots show aggregate data over all individuals while the mean of each individual controller is shown as larger circles along with confidence intervals. (a) Fitness score for re-evaluated individuals. (b) Total distance traversed, in meters. (c) Maximum deviated angle over the whole evaluation, in radians. Note the reversed Y-axis since 0.0 represents perfect stability. (d) Distance traversed for the last half of the evaluation period, in meters. [Each panel compares Open-loop and Closed-loop.]
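The significance procedure used for these comparisons can be illustrated with a small, dependency-free sketch. It assumes a conventional base level of 0.05 (consistent with the rounded threshold of 0.003 for 16 comparisons) and uses a two-sided Mann-Whitney U test with a normal approximation and no tie correction. The paper does not state which variant of the test was used, so this choice, the function names and the sample data are all assumptions.

```python
import math


def bonferroni_threshold(alpha, comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / comparisons


def mann_whitney_u(x, y):
    """Return (U, two-sided p) via a normal approximation, no tie correction."""
    n, m = len(x), len(y)
    # U for sample x: count pairs where x beats y, ties count as one half.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u = min(u_x, n * m - u_x)
    mean = n * m / 2.0
    sd = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


threshold = bonferroni_threshold(0.05, 16)

# Made-up distances in meters, purely to exercise the functions.
open_loop = [1.0, 2.0, 3.0, 4.0, 5.0]
closed_loop = [6.0, 7.0, 8.0, 9.0, 10.0]
u, p = mann_whitney_u(open_loop, closed_loop)
significant = p < threshold
```

With only five made-up samples per group, even fully separated samples pass the uncorrected 0.05 level but not the corrected threshold, which shows how much stricter the Bonferroni-corrected comparison is.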
Stability, shown at the bottom of Figure 6, shows a slightly different trend compared to distance. For both controllers performance decreases when going from simulation to the real world; however, the two additional real-world environments do not see decreasing performance. If we see the result in relation to distance traversed, on top in Figure 6, it is still clear that the different surfaces are difficult to traverse, as stability remains almost the same while the distance is halved for the most difficult environment.

We performed the same Mann-Whitney U statistical comparison as before, this time only between the open-loop control system and the closed-loop system, with $\alpha = [\ldots] \approx [\ldots]$. […]

The difference in performance between simulation and real-world experiments, shown in Figure 5a, illustrates that the reality gap is present in our experiments. For both systems the performance decreases, mainly due to lower walking speed, shown in Figure 5b, but also lower stability. The open-loop control system still seems to outperform the closed-loop system after the transition; however, the difference is much smaller, and in regard to stability, Figure 5c, the sensor feedback seems to be able to overcome some of the challenges encountered in the real world and outperform the open-loop controller. The main source of the reality gap in both systems is most likely simulation inaccuracies, as the simulator is not able to model the slight bending of real-world materials nor the umbilical cord and free-hanging wires needed for the real-world robot. However, as the stability results show in Figure 5c, the closed-loop control system seems to adapt to the changes in environmental circumstances, emphasizing the advantage of sensor feedback.
Interestingly, for the closed-loop system the variance is reduced between simulation and ‘Real World’ for the distance metric, again pointing to the gap between reality and simulation.

Because the closed-loop system requires some time to adapt the phase difference between legs, it is also interesting to compare distance in the latter half of the evaluation, shown in Figure 5d. The result shows that the real-world performance of the two control systems is similar once TEGOTAE has adapted the phase differences, as they are both able to achieve the same walking speed. This result further illustrates the adaptiveness of the closed-loop control system, as it is able to retain much higher stability while walking at the same speed as the open-loop control system. It should also be noted that the difference between simulation and real-world testing is much lower for the closed-loop control system than for the open-loop system. This indicates that the simulation results for the closed-loop system are more realistic, which in turn reduces the reality gap.

With regard to the case study, we can see in Figure 6 that the distance covered decreases for both systems, indicating that the rough and soft real-world surfaces are more difficult to traverse, as is also evident in the large increase in variance compared to the ‘Real World’ environment. The graphs also illustrate that the two control systems behave differently through these environmental transitions, and that the reduction in performance seems to be smaller for the closed-loop system than for the open-loop control system. This difference points to the ability of the TEGOTAE sensor feedback to adapt to the environment, giving increased robustness. Because of the limited sample size of the case study, further comparisons between the two control systems are difficult to validate. Another observation made during the case study is the heavy strain the open-loop system put on the robot joints in the more difficult environments. During the evaluations, both control systems would sometimes get stuck, unable to move one or more legs, explaining the larger variance in Figure 6. For the closed-loop system, this presented less of a problem, since sensor feedback would detect the additional leg load and slow the movement, thus avoiding excessive load on the joints.

[Figure 6 plots: ‘Distance - last half (m)’ (top) and ‘Stability (radians)’ (bottom); groups: Open-loop, Closed-loop]
Figure 6: The performance of the two case study controllers for the metrics ‘Distance - last half’ and ‘Stability’ for all environments tested. The boxes summarize all 10 evaluations and the dashed lines illustrate the trend of the median. The order is based on median performance on ‘Distance - last half’, corresponding with the difficulty of the environment.

Because of the similarity of the two control systems, it is interesting to compare the evolution of control parameters, as shown in Figure 4. One apparent property of the graph is that the open-loop system seems to have converged for a large number of parameters compared to the closed-loop system.
In light of the performance of the two systems, both in simulation and, more importantly, in real-world tests, one interpretation of the parameter convergence is that the open-loop control system has overfitted to the simulation. With sensor feedback, the closed-loop control system must adapt to the GRF sensors and is not able to overfit to the perfect conditions of the simulation. This could be the reason for the more robust controllers observed in the transition from simulation to real-world tests. The effect could be related to the technique of adding noise in the simulation to reduce the reality gap [12]. Another interpretation is that, because of sensor feedback, the closed-loop control system needs a longer time to converge and is still in the process of converging. We plan to address these two interpretations in future studies by including noise during evolution, hopefully mitigating the problem of overfitting.
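The pairwise statistical comparison between the open-loop and closed-loop systems referred to in this discussion can be illustrated with a minimal sketch of a two-sided Mann-Whitney U test. The sketch uses the normal approximation with tie and continuity corrections, which may differ from the exact procedure used for the reported results. The Bonferroni-style correction, the base level of 0.05, the number of comparisons, and the sample values are all assumptions made for illustration; none of them are taken from the experiments.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])

    # Assign average ranks to tied values and accumulate the tie correction.
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        ties = j - i
        tie_term += ties ** 3 - ties
        i = j

    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a

    n = n_a + n_b
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0.0:  # every observation is identical
        return min(u_a, u_b), 1.0

    # Continuity-corrected z score; two-sided p-value from the normal tail.
    z = max(abs(u_a - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    p_value = math.erfc(z / math.sqrt(2.0))
    return min(u_a, u_b), p_value


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_comparisons


# Placeholder numbers for illustration only; NOT measurements from the paper.
open_loop = [1.0, 2.0, 3.0]
closed_loop = [4.0, 5.0, 6.0]
u, p = mann_whitney_u(open_loop, closed_loop)
alpha = bonferroni_alpha(0.05, 4)  # 0.05 and 4 are assumed values
significant = p < alpha
```

With samples as small as those of the case study, an exact permutation-based p-value would usually be preferable to the normal approximation used here.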
In this paper we investigated how embodied phase coordination through sensor feedback would affect the evolution, performance and robustness of a CPG control system. We demonstrated that the addition of embodied phase coordination allows the robot to adapt to its environment and produces continuous, coordinated gaits for a complex quadrupedal robot. The reduced difference in performance between simulation and the real world, in addition to robustness to new environments, allows for increased confidence in simulation results. Because TEGOTAE sensor feedback can easily be implemented in physics simulators and requires no sensor calibration, it can efficiently be adopted in other ER research.

Because of the ease with which TEGOTAE can be integrated with complex CPG control systems, future research should look into the possibility of first evolving the CPG and later adding sensor feedback to the same controller. This would shed light on the differences in parameter convergence observed in this paper. Additionally, integrating TEGOTAE sensor feedback with a completely different control system should be attempted to broaden the applicability of the embodied phase coordination mechanism. Another topic to investigate is how to reduce the phase adaptation period observed for the TEGOTAE approach. Ideally, the adaptation should occur over a minimal timespan, avoiding the need for a ‘warm-up’ period and maximizing the distance traversed. Since the phase coordination mechanism depends on the body of the robot, it would also be interesting to discover how the TEGOTAE system handles changes to the body. A change to the morphology of the robot would be an interesting experiment, along with changes in body characteristics such as joint velocity.
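To illustrate why this kind of feedback is cheap to add to a simulator, the following minimal sketch shows a Tegotae-style phase update for decoupled leg oscillators. It assumes the basic rule commonly associated with [24], in which each leg's phase velocity is its intrinsic frequency minus a term proportional to that leg's ground reaction force times the cosine of its phase. The controller evaluated in this paper is more complex, so the function names, parameter values and sign convention here are hypothetical and are not the implementation used in the experiments.

```python
import math


def tegotae_phase_rate(phase, omega, sigma, grf):
    """Phase velocity of one leg oscillator: omega - sigma * N * cos(phase).

    omega: intrinsic angular frequency (rad/s)
    sigma: feedback gain (assumed scalar)
    grf:   ground reaction force N sensed on this leg (0 when unloaded)
    """
    return omega - sigma * grf * math.cos(phase)


def step_phases(phases, grfs, omega, sigma, dt):
    """Advance every leg phase by one Euler step, each using only its own load."""
    two_pi = 2.0 * math.pi
    return [
        (phase + dt * tegotae_phase_rate(phase, omega, sigma, grf)) % two_pi
        for phase, grf in zip(phases, grfs)
    ]


# Hypothetical single step: one unloaded leg and one loaded leg.
phases = [0.0, math.pi]  # rad
grfs = [0.0, 1.0]        # normalised load readings (assumed units)
phases = step_phases(phases, grfs, omega=2.0, sigma=4.0, dt=0.01)
```

Under this sign convention an unloaded leg simply advances at `omega`, while a loaded leg is drawn toward a phase of 3π/2 and slows down once it has passed that point. No explicit inter-leg coupling term appears in the update; coordination arises only through how the body distributes load between the legs, and the raw force reading enters the rule directly.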
ACKNOWLEDGMENTS
This work is partially supported by The Research Council of Norway under grant agreement 240862 and through its Centers of Excellence scheme, project number 262762. The simulations were performed on resources provided by UNINETT Sigma2.
REFERENCES
[1] Mostafa Ajallooeian, Sébastien Gay, Alexandre Tuleu, Alexander Spröwitz, and Auke J Ijspeert. 2013. Modular control of limit cycle locomotion over unperceived rough terrain. In Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on. IEEE, 3390–3397.
[2] Mostafa Ajallooeian, Soha Pouya, Alexander Sproewitz, and Auke J Ijspeert. 2013. Central pattern generators augmented with virtual model control for quadruped rough terrain locomotion. In Robotics and Automation (ICRA), 2013 IEEE International Conference on. IEEE, 3321–3328.
[3] Yuichi Ambe, Shinya Aoi, Timo Nachstedt, Poramate Manoonpong, Florentin Wörgötter, and Fumitoshi Matsuno. 2018. Simple analytical model reveals the functional role of embodied sensorimotor interaction in hexapod gaits. PloS one 13, 2 (2018), e0192469.
[4] Shinya Aoi, Poramate Manoonpong, Yuichi Ambe, Fumitoshi Matsuno, and Florentin Wörgötter. 2017. Adaptive control strategies for interlimb coordination in legged robots: a review. Frontiers in neurorobotics 11 (2017), 39.
[5] Randall D Beer and John C Gallagher. 1992. Evolving dynamical neural networks for adaptive behavior. Adaptive behavior 1, 1 (1992), 91–122.
[6] Antoine Cully and Jean-Baptiste Mouret. 2016. Evolving a behavioral repertoire for a walking robot. Evolutionary computation 24, 1 (2016), 59–88.
[7] Stephane Doncieux, Nicolas Bredeche, Jean-Baptiste Mouret, and Agoston E Gusz Eiben. 2015. Evolutionary robotics: what, why, and where to. Frontiers in Robotics and AI
[8] […] IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[9] Kyrre Glette, Andreas Leret Johnsen, and Eivind Samuelsen. 2014. Filling the reality gap: Using obstacles to promote robust gaits in evolutionary robotics. In Evolvable Systems (ICES), 2014 IEEE International Conference on. IEEE, 181–186.
[10] Nikolaus Hansen, Sibylle D Müller, and Petros Koumoutsakos. 2003. Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evolutionary computation 11, 1 (2003), 1–18.
[11] Auke Jan Ijspeert. 2008. Central pattern generators for locomotion control in animals and robots: a review. Neural networks 21, 4 (2008), 642–653.
[12] Nick Jakobi, Phil Husbands, and Inman Harvey. 1995. Noise and the reality gap: The use of simulation in evolutionary robotics. In European Conference on Artificial Life. Springer, 704–720.
[13] Sylvain Koos, Jean-Baptiste Mouret, and Stéphane Doncieux. 2013. The transferability approach: Crossing the reality gap in evolutionary robotics. IEEE Transactions on Evolutionary Computation 17, 1 (2013), 122–145.
[14] Suchan Lee, Jason Yosinski, Kyrre Glette, Hod Lipson, and Jeff Clune. 2013. Evolving gaits for physical robots with the HyperNEAT generative encoding: The benefits of simulation. In European Conference on the Applications of Evolutionary Computation. Springer, 540–549.
[15] Gregory Morse, Sebastian Risi, Charles R Snyder, and Kenneth O Stanley. 2013. Single-unit pattern generators for quadruped locomotion. In Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 719–726.
[16] Jean-Baptiste Mouret and Konstantinos Chatzilygeroudis. 2017. 20 years of reality gap: a few thoughts about simulators in evolutionary robotics. In Proceedings of the Genetic and Evolutionary Computation Conference Companion. ACM, 1121–1124.
[17] M Mutlu, S Hauser, A Bernardino, and AJ Ijspeert. 2018. Effects of passive and active joint compliance in quadrupedal locomotion. Advanced Robotics 32, 15 (2018), 809–824.
[18] Andrew L. Nelson, Gregory J. Barlow, and Lefteris Doitsidis. 2009. Fitness functions in evolutionary robotics: A survey and analysis. Robotics and Autonomous Systems 57, 4 (2009), 345–370.
[19] Stefano Nolfi, Dario Floreano, Orazio Miglino, and Francesco Mondada. 1994. How to evolve autonomous robots: Different approaches in evolutionary robotics. In Artificial life iv: Proceedings of the fourth international workshop on the synthesis and simulation of living systems. MIT Press, 190–197.
[20] Jørgen Nordmoen, Kai Olav Ellefsen, and Kyrre Glette. 2018. Combining MAP-Elites and Incremental Evolution to Generate Gaits for a Mammalian Quadruped Robot. In Applications of Evolutionary Computation. Springer, 719–733.
[21] Tønnes F Nygaard, Charles P Martin, Eivind Samuelsen, Jim Torresen, and Kyrre Glette. 2018. Real-world evolution adapts robot morphology and control to hardware limitations. In Proceedings of the Genetic and Evolutionary Computation Conference. ACM, 125–132.
[22] Tønnes F. Nygaard, Charles P. Martin, Jim Torresen, and Kyrre Glette. 2019. Self-Modifying Morphology Experiments with DyRET: Dynamic Robot for Embodied Testing. In .
[23] Tønnes F. Nygaard, Jim Tørresen, and Kyrre Glette. 2016. Multi-objective evolution of fast and stable gaits on a physical quadruped robotic platform. In .
[24] Dai Owaki and Akio Ishiguro. 2017. A quadruped robot exhibiting spontaneous gait transitions from walking to trotting to galloping. Scientific reports 7, 1 (2017), 277.
[25] Dai Owaki, Takeshi Kano, Ko Nagasawa, Atsushi Tero, and Akio Ishiguro. 2013. Simple robot suggests physical interlimb communication is essential for quadruped walking. Journal of The Royal Society Interface 10, 78 (2013), 20120669.
[26] H. Ozbay. 1999. Introduction to Feedback Control Theory. Taylor & Francis. https://books.google.no/books?id=AUGmN6L6u1AC
[27] Rolf Pfeifer and Josh Bongard. 2006. How the body shapes the way we think: a new view of intelligence. MIT press.
[28] Morgan Quigley, Ken Conley, Brian Gerkey, Josh Faust, Tully Foote, Jeremy Leibs, Rob Wheeler, and Andrew Y Ng. 2009. ROS: an open-source Robot Operating System. In ICRA workshop on open source software, Vol. 3. Kobe, Japan, 5.
[29] Fernando Silva, Miguel Duarte, Luís Correia, Sancho Moura Oliveira, and Anders Lyhne Christensen. 2016. Open issues in evolutionary robotics. Evolutionary Computation 24, 2 (2016), 205–236.
[30] Alexander Spröwitz, Alexandre Tuleu, Massimo Vespignani, Mostafa Ajallooeian, Emilie Badri, and Auke Jan Ijspeert. 2013. Towards dynamic trot gait locomotion: Design, control, and experiments with Cheetah-cub, a compliant quadruped robot. The International Journal of Robotics Research 32, 8 (2013), 932–950.
[31] Danesh Tarapore, Jeff Clune, Antoine Cully, and Jean-Baptiste Mouret. 2016. How Do Different Encodings Influence the Performance of the MAP-Elites Algorithm?. In Genetic and Evolutionary Computation Conference. ACM.
[32] Danesh Tarapore and Jean-Baptiste Mouret. 2015. Evolvability signatures of generative encodings: beyond standard performance benchmarks. Information Sciences 313 (2015), 43–61.
[33] Joseba Urzelai and Dario Floreano. 2001. Evolution of adaptive synapses: Robots with fast adaptive behavior in new environments. Evolutionary computation 9, 4 (2001), 495–524.
[34] Vinod K Valsalam and Risto Miikkulainen. 2008. Modular neuroevolution for multilegged locomotion. In Proceedings of the 10th annual conference on Genetic and evolutionary computation. ACM, 265–272.
[35] Rui Vasconcelos, Simon Hauser, Florin Dzeladini, Mehmet Mutlu, Tomislav Horvat, Kamilo Melo, Paulo Oliveira, and Auke Ijspeert. 2017. Active stabilization of a stiff quadruped robot using local feedback. In Intelligent Robots and Systems (IROS), 2017 IEEE/RSJ International Conference on. IEEE, 4903–4910.
[36] Viktor Zykov, Josh Bongard, and Hod Lipson. 2004. Evolving dynamic gaits on a physical robot. In