Excuse me! Perception of Abrupt Direction Changes Using Body Cues and Paths on Mixed Reality Avatars
Nicholas Katzakis
Department of Informatics, Universität Hamburg
[email protected]
Frank Steinicke
Department of Informatics, Universität Hamburg
[email protected]
Figure 1: (1) and (2) Robot pointing using its body. (3) Red path visualised on the ground. (4) Snapshot from the user study with three occluding robots.
ABSTRACT
We evaluate two methods of signalling abrupt direction changes of a robotic platform using a Mixed Reality avatar. The "Body" method uses gaze, gesture and torso direction to point to upcoming waypoints. The "Path" method visualises the change in direction using an angled path on the ground. We compare these two methods in a controlled user study and show that each method has its strengths depending on the situation. Overall, the "Path" technique was slightly more accurate in communicating the robot's direction change, but participants preferred the "Body" technique.
CCS CONCEPTS
• Human-centered computing → Empirical studies in visualization; User studies; • Computing methodologies → Robotic planning;
KEYWORDS
Robot, Interaction, Perception, Intentions, Gait, Externalizing, AR, VR
INTRODUCTION
Humans have the expectation that robotic agents behave with a degree of social intelligence [3, 13] so that bystanders can interpret their state and predict their actions. The diverse forms and locomotion methods [15] of modern robots, however, make it challenging to impart social characteristics. Some research to date has attempted to make robots more predictable, primarily by exploring humanoid robot designs, by adapting their gestures and gaze [23], and by adapting their locomotion to be more expressive and predictable [1, 8, 12]. All of these approaches, however, impose constraints on the design of the robot or on its locomotion method: if the robot must perform expressive movements to cater to bystanders' perception, then it cannot travel to its next waypoint using the shortest possible trajectory, or with the most energy-efficient locomotion method.
Mixed Reality (MR) offers a possible solution to this problem. Mixed reality agents [10] separate the physical form of the robot from its outward appearance. The MR approach assumes that robots broadcast their future trajectories, state, etc. to bystanders. Bystanders can then use their choice of MR display technology to display the robots, and their actions, in a way that is easier to perceive, thus enhancing their comfort in the vicinity of robots. An example of the envisioned scenario can be seen in Figure 2.

This work specifically examines direction changes because they are a potential cause of discomfort to bystanders. When humans make significant changes in walking direction, they convey this change with a number of body cues, including gaze [9], slowing down, etc. Contrary to humans, robots can change the parameters of their locomotion instantly [24] to account for a new heading or to avoid obstacles. Therefore, even if bystanders trust that robots will not collide with them, these abrupt direction changes might be a source of distraction and discomfort.

Figure 2: Illustration of an envisioned application of MR agents. A humanoid avatar is superimposed on top of a real, physical wheeled robot.

The proliferation of autonomous agents and vehicles raises the question: Is it possible, with MR agents, to warn bystanders of abrupt direction changes? We explore the design space and present two potential solutions: one uses body cues of a humanoid MR avatar, the other displays the direction change as a path on the ground (Figure 1). The two approaches are evaluated in a controlled user study that manipulates cue onset timing, cue expressiveness and robustness to occlusion.

RELATED WORK
Shoji et al. superimposed an avatar on a robot to increase attractiveness [21], while Shimizu et al. [20] superimposed an avatar that attempts to mimic the movements of the robot. Aspects of intent or predictability, however, remain largely unexplored.

The approach of Dragone and O'Hare [4, 5, 10] makes some headway towards overcoming these limitations. In the context of a Mixed Reality Agent (MiRA), they display a humanoid avatar that appears sitting on a Roomba robot to communicate the state of the robot. Such an avatar can indicate direction changes by pointing, yet the authors have not evaluated their proposed system in this context.

Young et al. explored a similar concept using cartooning [25]. They separated behavioural likeness from visual likeness, in an attempt to avoid the "uncanny valley" [14]. These cartoon-like illustrations offer a simple, yet powerful means of expressing various states of the robot. It remains unclear how this approach could be applied to express spatial intent and direction changes.

The idea of externalizing the internal state of a robot using AR visualisations has been explored by Collett et al. [2] in the context of debugging the various sensors of the robot. More recently, Hönig et al. [11] also used an augmented reality setup with the goal of creating a safer environment for debugging algorithms where robots are to be used in close proximity to humans. Other works have explored path planning [6, 22, 26], also in an attempt to assess accuracy and diagnose waypoint errors.

Hoffman and Ju [7] have suggested that, rather than visual likeness to humanoids, motion can be a powerful tool to communicate the state of the robot. Although this approach is ideal for communication when the robot is stationary, instant waypoint change situations leave little time for expressive movements.
DESIGN
The design goal of the two proposed methods is to inform bystanders that the robot will change its direction and to allow them to predict where the robot is going next, the assumption being that bystanders will feel more comfortable knowing where the robot is heading. Rather than superimposing a realistic human avatar, we chose the "Kyle" robot model in Unity because of its torso: the slim torso reduces limb occlusion from various angles, and the elbow and shoulder guards make arm rotations more salient.
Body
When choosing cues for the body method, we opted for cues that are easy for humans to interpret. Previous studies on human behavior in steering situations found that participants typically align their head [9, 18, 19] with the next waypoint immediately following delivery of a direction change cue. This head alignment occurs before torso orientation [18]. The fact that people expect such cues from other humanoids in their vicinity inspired the use of a humanoid avatar and the first two expressivity levels (Figure 3). The medium expressivity level is inspired by how, in some cultures, people extend their forearm forward as a sign of politeness. A third level of expressivity was added by exaggerating the turn of the avatar's torso, in the hope that it would make the waypoint change cues more salient.
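To make the mapping from expressivity level to body parts concrete, here is a minimal sketch of how such cues could be driven. It assumes a hypothetical avatar rig with independently orientable head, arm and torso; the names and structure are ours, not the study's actual Unity implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class AvatarRig:
    """Hypothetical stand-in for the avatar's orientable parts."""
    position: tuple        # (x, z) position on the ground plane
    head_yaw: float = 0.0
    arm_yaw: float = 0.0
    torso_yaw: float = 0.0

def yaw_towards(position, waypoint):
    """Yaw angle (radians) from 'position' to 'waypoint' on the ground plane."""
    return math.atan2(waypoint[0] - position[0], waypoint[1] - position[1])

def apply_body_cue(avatar, waypoint, expressivity):
    """Orient body parts toward the next waypoint by expressivity level:
    1 = head only, 2 = head + pointing arm, 3 = head + arm + full torso."""
    target = yaw_towards(avatar.position, waypoint)
    avatar.head_yaw = target           # low: gaze cue, which humans show first
    if expressivity >= 2:
        avatar.arm_yaw = target        # medium: the arm closest to the target points
    if expressivity >= 3:
        avatar.torso_yaw = target      # high: exaggerated full torso turn
```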
Path
Earlier works have explored visualising the path of the robot [17, 22], yet these designs extend lines far into the future. In an MR setting where all the robots in the vicinity broadcast their paths, the visual field would therefore quickly become cluttered. We were instead interested in exploring the potential of a shorter path with a pointed tip to indicate direction.

Figure 3: Expressivity levels. (a) Body low expressivity: only the head orients towards the waypoint. (b) Body medium expressivity: in addition to the head, the arm closest to the target points. (c) Body high expressivity: in addition to the other two cues, the entire torso orients towards the waypoint. (d) Path low expressivity: one meter length. Point A is the start of the path; point B is the point where the waypoint change stride will occur. The other two levels are identical except with two and three meters length.

A path displayed at the feet of the avatar is possibly difficult to perceive: when a person is fixating forward while walking, the path of an adjacent robot will be outside his or her field of view. Bringing the path up to the view frustum of the person might introduce unwanted view obstruction. A further limitation of a path or arrow displayed along the line of sight is that it is difficult to judge its magnitude due to perspective [16].

The displayed path begins between the feet of the robot, joins the point where the robot will change direction, and extends towards the next waypoint without reaching it (Figure 3d). The path maintains its length the entire time it is visible, i.e. it extends towards the next waypoint at the same speed the robot is walking. We manipulate the length of the path as a means of manipulating expressivity in the Path method.
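The geometry of this cue is simple enough to state precisely. The sketch below is our own reconstruction from the description above, with hypothetical names: because the total length is fixed while the first leg shrinks as the robot approaches the turn point, the tip advances toward the next waypoint at exactly the robot's walking speed.

```python
import math

def path_polyline(robot_pos, turn_point, next_waypoint, total_length):
    """Polyline for the direction-change path, as 2D ground-plane points.

    The path starts under the robot, runs to the turn point, and whatever
    remains of 'total_length' extends toward the next waypoint without
    reaching it. Assumes turn_point != next_waypoint.
    """
    def dist(a, b):
        return math.hypot(b[0] - a[0], b[1] - a[1])

    first_leg = dist(robot_pos, turn_point)
    remaining = max(0.0, total_length - first_leg)
    leg2 = dist(turn_point, next_waypoint)
    ux = (next_waypoint[0] - turn_point[0]) / leg2   # unit direction of 2nd leg
    uy = (next_waypoint[1] - turn_point[1]) / leg2
    reach = min(remaining, 0.95 * leg2)              # never touch the waypoint
    tip = (turn_point[0] + ux * reach, turn_point[1] + uy * reach)
    return [robot_pos, turn_point, tip]
```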
USER STUDY
We conducted a controlled study to evaluate the robustness of the two proposed methods with regards to:
◦ Cue onset timing: How long do these methods need to be displayed in order to be effective in communicating the robot's waypoint?
◦ Cue expressiveness: How subtle do these cues need to be?
◦ Robustness to occlusion: How do these cues perform in a situation with numerous occluding robots?
Conducting this study in Virtual Reality (VR) allows for studying the perception of the robotic avatar isolated from unwanted environmental confounding factors.
Participants - Apparatus
Fourteen unpaid participants (ages 22-38, mean age 31) were recruited from the faculty and students of Universität Hamburg.
The virtual environment consisted of a 100 x 100 meter plain tiled floor on which participants were free to roam. At the beginning of the experiment, participants were allowed to familiarize themselves with the VR environment and the controls.

When the experiment started, the avatar walked segments of between 1.5 and 3.0 meters. At three preset onset distances before reaching the end of these segments, the waypoint change cue would appear (Body or Path). Participants were instructed to point using their wand to indicate where they thought the next waypoint of the robot would be (following the direction change; Figure 5). Participants received visual feedback about their pointing location via a ray extending from their wand, in addition to a sphere at the location where the ray intersected the floor. Participants were free to walk around and teleport using the thumb button (pad) of the Vive controller with a standard VR teleportation technique.
Figure 5: Experiment task. Participants had to guess where the next waypoint of the robot would be.
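The pointing feedback reduces to a ray-plane intersection. A minimal sketch in our own notation (not the study's Unity code), with the floor at y = 0:

```python
def ray_floor_intersection(origin, direction):
    """Intersect a wand ray with the ground plane y = 0.

    origin, direction: (x, y, z) tuples; returns the (x, z) hit point used
    to place the feedback sphere, or None if the ray misses the floor.
    """
    dy = direction[1]
    if dy >= 0.0:                      # parallel to or pointing away from floor
        return None
    t = -origin[1] / dy                # ray parameter at y = 0
    return (origin[0] + t * direction[0], origin[2] + t * direction[2])
```

The recorded Error is then simply the Euclidean distance between this hit point and the robot's actual waypoint.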
The robot continued cycling waypoints at a walking speed of 1 meter/sec and participants had to keep guessing waypoints. If participants failed to indicate the correct waypoint in time, the trial was queued at the end of the trials queue to be repeated later.

Method was treated as one factor with two levels, Body and Path. In addition to the cue presentation method, there were three factors with three levels each, plus the turning angle with six levels:
◦ Cue onset distance, manipulated with three levels: 0.8 m, 1 m and 1.5 m.
◦ Cue expressivity, manipulated with three levels: low, medium and high.
◦ Number of occluding robots, manipulated with three levels: no robots, 3 robots and 6 robots.
◦ Angle at which the robot turned, manipulated at six levels: -30, -20, -10, 10, 20, 30 degrees.

The occluding robots spawned at a random location within a circle of 3 m radius around the main robot avatar. These robots chose random waypoints within a circle of 3 m radius centered two waypoints ahead of the main robot avatar. The result was random occlusion and intersection of the main avatar's path. In a real situation the robots would perform collision avoidance rather than passing through one another; forgoing collision avoidance here was a conscious design decision on our part, which aimed to disentangle visual cues from behavioral prediction cues formed by participants' prior lifetime experience. That is, we did not want participants to be able to predict that robots would change direction to avoid a collision; they should rely only on the proposed direction change cues. The main avatar had a white outline attached in all conditions so that it could be distinguished from the rest of the robots (Figure 6).
Figure 6: Six occluding robots in the Path condition. The main robot avatar has a faint white outline for disambiguation.
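The spawning scheme can be read as a small sampling routine. The following sketch (hypothetical names, uniform sampling within a disc) mirrors the description above:

```python
import math
import random

def random_point_in_disc(center, radius):
    """Uniformly sample a 2D point inside a disc on the ground plane."""
    r = radius * math.sqrt(random.random())   # sqrt keeps the density uniform
    theta = random.uniform(0.0, 2.0 * math.pi)
    return (center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))

def spawn_occluders(main_robot_pos, point_two_waypoints_ahead, count, radius=3.0):
    """Start positions and goals for the occluding robots.

    Occluders appear within 'radius' of the main avatar and walk toward
    random goals within 'radius' of the point two waypoints ahead of it,
    yielding random occlusion and intersection of the main avatar's path.
    """
    return [(random_point_in_disc(main_robot_pos, radius),
             random_point_in_disc(point_two_waypoints_ahead, radius))
            for _ in range(count)]
```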
All factors were randomized except cue method and number of occluding robots; those were repeated in counterbalanced blocks, e.g. the Body method was tested with 3 occluding robots, followed by 0 robots, followed by 6 robots. Following one Body x Occluding Robots block, the Path method was tested with all occluding robot levels. This order was counterbalanced across participants.

This design was chosen because the experiment was too challenging if robot numbers and technique were alternated with every trial (robots appearing and disappearing etc.). Given that the robot travelled randomly between 1.5 m and 3 m in every path segment, this resulted in an average of 2.25 seconds per segment. The occluding robots travelled at the same speed (1 m/sec) as the main avatar. Because of this, if they randomly disappeared and spawned with every single trial, there would be insufficient time for them to walk around and form a random pattern, and the appearance and disappearance of robots would be too distracting and detached from the envisioned application scenario of this work.

We recorded the reaction time (RT), i.e. the time between cue onset and trigger press by the participant. We also recorded the prediction accuracy of participants by recording the Error, i.e. the Euclidean distance between the actual robot waypoint and the location the participant indicated with the wand.

The experiment was a within-subjects design with 2 methods x 3 onset distances x 3 expressivity levels x 3 occluding robot levels x 6 angles x 2 trials = 648 trials. 14 participants x 648 trials per participant = 9072 total trials collected. The data was analyzed using a repeated-measures ANOVA in R.
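For concreteness, a few lines of Python reproduce the factorial design count stated above (the variable names are ours):

```python
from itertools import product

methods = ["body", "path"]
onset_distances_m = [0.8, 1.0, 1.5]
expressivity = ["low", "medium", "high"]
occluding_robots = [0, 3, 6]
angles_deg = [-30, -20, -10, 10, 20, 30]
repetitions = range(2)

trials = list(product(methods, onset_distances_m, expressivity,
                      occluding_robots, angles_deg, repetitions))
assert len(trials) == 648          # trials per participant
assert 14 * len(trials) == 9072    # total trials collected
```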
RESULTS
As expected, method had a significant effect on both Error (F = …, p < .01) and RT (F = …, p < …). Cue onset distance had no significant effect on Error (F = …, p = …) but, as expected, a highly significant effect on RT (F = …, p < …). Expressivity level had no statistical effect on RT (F = …, p = …) but had a significant effect on Error (F = …, p < …), as did Angle on Error and RT (F = …, p < …).

Figure 7: Effect of number of distractor robots on Error (lower is better). The body method was more sensitive to the number of distractors.
Figure 8: Effect of cue onset distance on RT.
Figure 9: Effect of cue expressivity level on Error.
DISCUSSION
As expected, the cueing method (Body or Path) affected participants' ability to predict the robot's waypoint accurately. Using the Body method, participants were on average 1.21 m away from the robot's waypoint vs. 1.04 m with the Path method, a difference of approximately 17 cm. On the other hand, at the low expressivity level, when the path was shortest and the avatar only turned its head, the Body technique performed significantly better (F = …, p < .001; Figure 9). Summarizing the responses from the questionnaires shows that participants preferred the Body technique (Figure 10).

Figure 10: Participants' ratings of the methods on a 5-point Likert scale. The Body method was preferred.

It is also notable that the length of the path (cue expressivity) resulted in a significant improvement in performance from the low to the high condition: the 3 m path resulted in an average Error of 0.84 m vs. 1.32 m for the 1 m path. The improvement, however, is not proportional to the path length, which suggests diminishing returns: MR interface designers should not extend the line too far into the future of the robot's planned path. This makes additional sense when considering that paths might be updated in response to external factors.

The Path method also proved more robust to visual occlusion than the Body method: the Path technique's Error remained around 1.03 m regardless of the number of robots present in the scene (Figure 7). In a busy environment with many robots present, designers might use the Path technique, whereas with fewer robots they could signal direction changes using the body cues alone. This is especially important considering that when robots are near, the path might be outside the person's field of view.

Finally, cue onset distance results translate directly to time (in seconds) because the robot was travelling at a speed of 1 meter/second. Results from our experiment suggest that the Path method had a marginally shorter reaction time than the Body method. This is another result designers can consider when choosing the appropriate cue for the moment. Although we assume that robots would always take care to avoid collisions with humans, mechanical errors could occur. In such a situation a robot might be aware of its own malfunction and broadcast its direction change for bystanders to avoid it. When the robot is very near, the participant's visor could choose the Path technique, since reaction time is of the essence.

One participant commented that when there were many occlusions from the other robots, they occasionally observed the shadow to find the direction. This makes sense because an arm extended from the robot forms, in the robot's shadow, a perfectly aligned 2D vector on the ground pointing to the desired waypoint; shadows are perhaps an additional design space for this domain.

The results from this experiment suggest that each technique has its own strengths, and robotics MR interface designers can take these strengths and weaknesses into account when choosing which method to use.

A limitation of this study is that direction changes were considered only for locomotion across the ground plane. What if the robot travels along other axes? Can these cues be extended to convey direction changes when robots perform actions other than walking, such as drones or more exotic robots that jump around (like a spider robot)?
CONCLUSION
We have presented two methods for communicating direction changes using superimposed MR avatars. Our findings suggest that both methods have strong points and that interface designers should consider these strengths when choosing which technique to use for their application.

This study is a pilot exploration of a design space that we believe offers potential for rich communication. Many interesting questions remain open for future work: Can we detach the MR avatar from the actual robot, so that even though the physical robot changes position abruptly, the MR avatar starts turning in advance? Can the MR avatar follow a curved turning path even though the physical robot turns at a sharp angle? How would such a mapping be accomplished? These remain open questions for future work.
REFERENCES
[1] Cynthia Breazeal, Cory D. Kidd, Andrea Lockerd Thomaz, Guy Hoffman, and Matt Berlin. 2005. Effects of nonverbal communication on efficiency and robustness in human-robot teamwork. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005). IEEE, 708–713.
[2] Toby Hartnoll Joshua Collett and Bruce Alexander Macdonald. 2010. An Augmented Reality Debugging System for Mobile Robot Software Engineers. Journal of Software Engineering for Robotics (2010), 18–32. https://doi.org/10.6092/JOSER_2010_01_01_p18
[3] K. Dautenhahn. 2004. Robots we like to live with?! - a developmental perspective on a personalized, life-long robot companion. In RO-MAN 2004: 13th IEEE International Workshop on Robot and Human Interactive Communication. 17–22. https://doi.org/10.1109/ROMAN.2004.1374720
[4] Mauro Dragone, Thomas Holz, and Gregory M.P. O'Hare. 2006. Mixing Robotic Realities. In Proceedings of the 11th International Conference on Intelligent User Interfaces (IUI '06). ACM, New York, NY, USA, 261–263. https://doi.org/10.1145/1111449.1111504
[5] Mauro Dragone, Thomas Holz, and Gregory M.P. O'Hare. 2007. Using Mixed Reality Agents as Social Interfaces for Robots. In RO-MAN 2007: The 16th IEEE International Symposium on Robot and Human Interactive Communication. IEEE, 1161–1166.
[6] Björn Giesler, Tobias Salb, Peter Steinhaus, and Rüdiger Dillmann. 2004. Using augmented reality to interact with an autonomous mobile platform. In ICRA '04: 2004 IEEE International Conference on Robotics and Automation, Vol. 1. IEEE, 1009–1014.
[7] Guy Hoffman and Wendy Ju. 2014. Designing Robots with Movement in Mind. Journal of Human-Robot Interaction 3, 1 (2014), 89–122.
[8] Guy Hoffman, Oren Zuckerman, Gilad Hirschberger, Michal Luria, and Tal Shani Sherman. 2015. Design and Evaluation of a Peripheral Robotic Conversation Companion. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI '15). ACM, New York, NY, USA, 3–10. https://doi.org/10.1145/2696454.2696495
[9] Mark A. Hollands, Aftab E. Patla, and Joan N. Vickers. 2002. Look where you're going: gaze behaviour associated with maintaining and changing the direction of locomotion. Experimental Brain Research (2002).
[10] Thomas Holz, Abraham G. Campbell, Gregory M.P. O'Hare, John W. Stafford, Michael Martin, and Mauro Dragone. 2011. MiRA—Mixed Reality Agents. International Journal of Human-Computer Studies 69, 4 (2011), 251–268. https://doi.org/10.1016/j.ijhcs.2010.10.001
[11] W. Hönig, C. Milanes, L. Scaria, T. Phan, M. Bolas, and N. Ayanian. 2015. Mixed Reality for Robotics. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 5382–5387. https://doi.org/10.1109/IROS.2015.7354138
[12] Mohammed Hoque. 2012. An integrated approach of attention control of target human by nonverbal behaviors of robots in different viewing situations. In IROS. IEEE, 1399–1406.
[13] Cory D. Kidd and Cynthia Breazeal. 2005. Sociable Robot Systems for Real-World Problems. In ROMAN 2005: IEEE International Workshop on Robot and Human Interactive Communication.
[14] Masahiro Mori. 1970. The uncanny valley. Energy 7, 4 (1970), 33–35.
[15] Satoshi Murata, Eiichi Yoshida, Akiya Kamimura, Haruhisa Kurokawa, Kohji Tomita, and Shigeru Kokaji. 2002. M-TRAN: Self-reconfigurable modular robotic system. IEEE/ASME Transactions on Mechatronics (2002).
[16] American Journal of Ophthalmology 32, 8 (1949), 1069–1087. https://doi.org/10.1016/0002-9394(49)90649-2
[17] Shayegan Omidshafiei, Ali-Akbar Agha-Mohammadi, Yu Fan Chen, N. Kemal Ure, Jonathan P. How, John Vian, and Rajeev Surati. 2015. MAR-CPS: Measurable Augmented Reality for Prototyping Cyber-Physical Systems. In AIAA Infotech@Aerospace Conference, 2015.
[18] Aftab Patla. 1999. Online steering: coordination and control of body center of mass, head and body reorientation. Experimental Brain Research (1999).
[19] Neuroreport 8, 17 (1997), 3661–3665.
[20] Noriyoshi Shimizu, Maki Sugimoto, Dairoku Sekiguchi, Shoichi Hasegawa, and Masahiko Inami. 2008. Mixed Reality Robotic User Interface: Virtual Kinematics to Enhance Robot Motion. In Proceedings of the 2008 International Conference on Advances in Computer Entertainment Technology (ACE '08). ACM, New York, NY, USA, 166–169. https://doi.org/10.1145/1501750.1501789
[21] Michihiko Shoji, Kanako Miura, and Atsushi Konno. 2006. U-Tsu-Shi-O-Mi: the Virtual Humanoid you can Reach. In ACM SIGGRAPH 2006 Emerging Technologies. ACM, 34.
[22] Michael Stilman. 2005. Augmented reality for robot development and experimentation. Robotics Institute, CMU, Tech Report 2, 3 (2005).
[23] Daniel Szafir, Bilge Mutlu, and Terry Fong. 2015. Communicating Directionality in Flying Robots. In Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction. ACM, 19–26.
[24] Keigo Watanabe, Yamato Shiraishi, Spyros G. Tzafestas, Jun Tang, and Toshio Fukuda. 1998. Feedback control of an omnidirectional autonomous platform for mobile service robots. Journal of Intelligent and Robotic Systems 22, 3-4 (1998), 315–330.
[25] James E. Young, Min Xin, and Ehud Sharlin. 2007. Robot Expressionism Through Cartooning. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI '07). ACM, New York, NY, USA, 309–316. https://doi.org/10.1145/1228716.1228758
[26] Stefanie Zollmann, Christof Hoppe, Tobias Langlotz, and Gerhard Reitmayr. 2014. FlyAR: Augmented Reality Supported Micro Aerial Vehicle Navigation. IEEE Transactions on Visualization and Computer Graphics 20, 4 (2014), 560–568.