Motion Planning Combines Psychological Safety and Motion Prediction for a Sense Motive Robot
Hejing Ling, Guoliang Liu, Member, IEEE, and Guohui Tian
Abstract—Human safety is the most important demand in human-robot interaction and collaboration (HRIC), and it refers not only to physical safety but also to psychological safety. Although many robots with different configurations have entered our living and working environments, human safety remains an ongoing research problem in human-robot coexistence scenarios. This paper addresses the human safety issue by covering both the physical and psychological safety aspects. First, we introduce an adaptive robot velocity control and step size adjustment method based on human facial expressions, such that the robot can adjust its movement to maintain safety when the human emotion is unusual. Second, we predict human motion by detecting sudden changes of human head pose and gaze direction, such that the robot can infer whether the human attention is distracted, predict the next move of the human, and rebuild a repulsive force to avoid potential collisions. Finally, we demonstrate our idea using a 7-DOF TIAGo robot in a dynamic HRIC environment, which shows that the robot becomes sense motive and responds to human action and emotion changes quickly and efficiently.
Index Terms—Motion planning, human-robot interaction, psychological safety, motion prediction, sense motive robot.
I. INTRODUCTION
In the past few decades, industrial robots have been widely used in isolated workspaces out of consideration for worker safety. With the booming development of robotics and artificial intelligence, the research and design of interactive and collaborative robots have attracted the attention of many researchers [1] [2]. What follows is the safety problem of human-robot coexistence: due to the uncertainty of human actions, robots need to avoid humans in a timely manner and move properly to avoid potential harm. To ensure the physical safety of workers, people usually set up a safe workspace for industrial robots and prohibit workers from approaching them. Nowadays, humans and robots need to coexist in a shared workspace, so it is necessary to track people's motion in real time and to research efficient and stable collision avoidance strategies for the robot. Furthermore, in order to improve the social experience of human-robot interaction, it is necessary to ensure people's psychological safety. As mentioned in [3] [4], psychological states have a great impact on our perception of risk and the behaviours we take. Therefore, we must fully consider human psychology and optimize the control of the robot to achieve better HRIC [5] [6].
Hejing Ling, Guoliang Liu and Guohui Tian are with the School of Control Science and Engineering, Shandong University, Jinan, 250014 China (e-mail: [email protected], [email protected], [email protected]). (Corresponding author: Guoliang Liu.) This work was supported in part by the National Natural Science Foundation of China under Grant 91748115, and in part by the National Key Research and Development Program of China under Grant 2018YFB1306500.
Fig. 1. Overall process of our motion planning system considering human psychological safety and motion prediction based on the TIAGo platform.
In this paper, we propose an advanced motion planning algorithm that uses facial expressions, head pose and gaze angle to estimate human psychological states and predict the future move direction, and an optimized control strategy based on SDAPF (sampling and danger-index based artificial potential field) [7] is proposed for physical and psychological safety in HRIC scenarios. Our approach is convenient and real-time, and does not require a complicated environment setup. Specifically, we use OpenFace 2.0 [8] to detect the facial action units (AUs) of the person in each camera frame, and then classify human psychological states according to the different combinations of AUs, which can be used to control the velocity and step size of the robot. In addition, we predict the human motion according to the head pose and gaze angle output by OpenFace, such that an improved repulsive force is proposed to avoid potential collisions and optimize the robot trajectory in an uncertain environment. The overall diagram of our motion planning system for a sense motive robot is shown in Fig. 1. The main advantages of our approach include:
• To ensure human psychological safety, we propose an adaptive robot velocity and step size control method using a convenient and real-time detection of human facial expressions.
• Our method can optimize the repulsive force function to avoid potential collisions by predicting the movement of the human based on the head pose and gaze angle information.
We verify the effectiveness of the safe interaction between the 7-DOF TIAGo robot and a moving human arm, and demonstrate the advantages of our dynamic collision avoidance algorithm considering human physical and psychological safety. The rest of the paper is organized as follows. In Section II, we conduct a brief survey of previous work. Section III describes how we use OpenFace 2.0 to obtain facial expressions, head pose and gaze angle. How the collision avoidance algorithm is optimized based on the data from OpenFace 2.0 is described in Section IV. The experimental results of our approach are analyzed in Section V. Finally, we summarize our work in Section VI.
II. RELATED WORKS
A. Facial Expression Recognition
Facial expressions contain rich human behavior information and are a form of expression of human psychology. In daily life, people can fully and subtly express their thoughts and feelings through facial expressions, and they can also distinguish others' mental activities through their facial expressions. Therefore, facial expression recognition has attracted a large number of researchers. There are different methods to recognize emotion, such as electroencephalography (EEG), galvanic skin response (GSR), speech analysis, facial expressions and visual scanning behavior. With the popularity of deep learning, facial expression recognition based on images has made great progress.
Traditional facial expression recognition methods are based on Support Vector Machines (SVM) [9] and Hidden Markov Models (HMM) [10]. First, people manually design features such as Gabor, LGBP and HOG to extract appearance features from images, and then classifiers such as SVM or Adaboost are used for facial expression classification. Feature extraction and classification are two separate processes and cannot be integrated into an end-to-end model. Zhang et al. [11] use an SVM-based method to carry out experiments on emotion recognition. The experiments show that SVM performs poorly on multi-emotion classification problems and is sensitive to the choice of kernel function.
In recent years, neural networks have been used for facial expression recognition. Ozdemir et al. [12] use the CNN-based LeNet architecture to recognize emotion by merging three different datasets (KDEF, JAFFE and their custom dataset) and then training LeNet to obtain higher accuracy for emotion recognition. Chang et al. [13] presented a simple and efficient CNN model to extract facial features, and proposed a complexity perception classification (CPC) algorithm for facial expression recognition. The CPC algorithm divides the dataset into a simple sample subspace and a complex sample subspace by evaluating the complexity of the facial features that are suitable for classification.
B. Motion Planning
For safe and efficient HRI, ensuring human safety is of paramount importance. Therefore, robots are required to be able to perform motion planning smoothly in real time. The traditional artificial potential field (APF) method [14] designs a virtual force field in the environment: the target point generates an attractive force F_att on the mobile robot, and obstacles generate a repulsive force F_rep. Finally, the resultant force F of the attractive and repulsive forces is used to control the movement of the robot. The APF method is simple, practical and convenient for real-time control by the robot controller, and it is widely used in real-time obstacle avoidance and smooth control. However, it has defects such as local minima and the goal nonreachable with obstacle nearby (GNRON) problem. Therefore, many improved APF methods have been proposed. For the defect that the attractive force becomes too large when the robot is far away from the goal and for the GNRON problem, new repulsive or attractive force functions have been introduced [7], [15]–[17].
The dynamic window approach (DWA) [18] is a velocity-based local planner, which samples multiple velocities in the velocity space, simulates the robot's trajectory at these velocities for a certain period of time, and obtains multiple sets of trajectories. The trajectories are then evaluated, and the velocity corresponding to the optimal trajectory is selected to drive the robot. The rapidly exploring random tree (RRT) [19] uses collision detection of sampled points in the space and avoids explicitly modelling the space, so it can explore the space faster and more effectively. It can solve path planning problems with high-dimensional spaces and complex constraints, which makes it suitable for the path planning of multi-degree-of-freedom robots in complex and dynamic environments. One disadvantage of RRT is that it is difficult to find a path in an environment with narrow passages: because the area of the narrow space is small, the probability of it being sampled is low, so the convergence speed of RRT is slow and its efficiency is greatly reduced. Pan et al. [20] used the results of previous collision detections to predict the collision probability of new sampling points, and then improved the performance of sampling-based motion planning by combining probabilistic collision checking methods with probabilistic roadmaps (PRM) and RRT.
In certain HRI scenarios, the robot cannot plan an appropriate trajectory or does not have enough time to replan, and the movement of the human may even conflict with the initially planned trajectory of the robot, such that it is difficult to ensure the safety of the human in a complex dynamic environment. Therefore, predicting human actions is very important in a dynamic HRI environment. Kanazawa et al. [21] determine the current task and compute the next position of the robot by modelling human movement. Park et al. [22] use offline learning of human actions along with temporal coherence to predict human actions, and use a Gaussian distribution to predict human motion and calculate collision probability for safe motion planning.
For safe HRI, maintaining physical safety is the focus of the research and industrial community, but ensuring psychological safety is also crucial. Maintaining psychological safety means that humans believe that the interaction with the robot is safe and that the robot's velocity and trajectory will not cause any psychological discomfort to them.
Simply preventing impending collisions to maintain physical safety can lead to discomfort for human psychology [23]. Psychological safety therefore needs to be ensured in HRI, which means the velocity and trajectory of the robot can be adjusted by detecting the human facial expressions, head pose and gaze angle. Yamamoto et al. [24] consider emotions in robot control and verify the effectiveness of robot control with emotion recognition. Williams et al. [25] use the learning classifier system (LCS) to analyze emotions for better navigation, i.e., a navigation system that considers emotion reduces the total number of collisions and shortens the navigation time compared with a non-adaptive navigation system.
Fig. 2. The estimation results of facial landmarks, head pose and gaze angle of a sample image from a camera: (a) the estimation of facial landmarks, head pose and gaze angle; (b) facial action units (AUs).
III. FACIAL BEHAVIOR ANALYSIS
A. A Facial Behavior Analysis Tool: OpenFace 2.0
There are many ways to detect emotional states, such as EEG and GSR. However, these methods require setting up a complex experimental environment in order to achieve effective results. In contrast, this paper tries to simulate human social interactions, and uses facial expressions to infer emotional states in order to ensure psychological safety. For facial expression recognition, OpenFace 2.0 is one of the most complete and easy-to-use tools, which is capable of accurate facial landmark detection, head pose estimation, facial action unit (AU) recognition, and eye-gaze estimation, as shown in Fig. 2.
The estimated head pose and gaze angle are represented as two vectors: a head pose vector and a gaze vector. The first three elements of the head pose vector are the X, Y, Z components of the distance between the head and the camera. The remaining three elements of the head pose vector are the pitch, yaw and roll angles. The gaze angle is composed of two items, which represent the gaze orientation in the left-right and up-down directions respectively. OpenFace 2.0 recognizes AUs with optimized linear-kernel Support Vector Machines using person-specific normalization and prediction correction. Experimental results demonstrate that OpenFace 2.0 has better performance and a distinct speed advantage over recent deep learning methods.
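As an illustration of how these quantities can be consumed downstream, the following minimal sketch reads the per-frame CSV produced by the OpenFace 2.0 FeatureExtraction tool. The column names follow the standard OpenFace 2.0 output format to the best of our knowledge, and the file path is a placeholder; this is an assumption-laden sketch, not part of the paper's implementation.

```python
import pandas as pd

# Minimal sketch (not from the paper): read per-frame OpenFace 2.0 output.
# Column names follow the standard OpenFace 2.0 CSV format; path is a placeholder.
df = pd.read_csv("openface_output.csv", skipinitialspace=True)
frame = df.iloc[-1]  # most recent frame

head_pose = frame[["pose_Tx", "pose_Ty", "pose_Tz",   # head-to-camera offset (mm)
                   "pose_Rx", "pose_Ry", "pose_Rz"]]  # pitch, yaw, roll (rad)
gaze_angle = frame[["gaze_angle_x", "gaze_angle_y"]]  # left-right, up-down (rad)
au_cols = [c for c in df.columns if c.startswith("AU") and c.endswith("_r")]
au_intensities = frame[au_cols]                       # AU intensity estimates

print(head_pose.to_dict(), gaze_angle.to_dict(), au_intensities.to_dict())
```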
B. Facial Expression Recognition
In order to analyze emotional states, it is necessary to obtain facial expressions based on AUs. Wikipedia gives a combination of AUs for six general expression states (including anger, disgust, fear, happiness, sadness and surprise) according to the Facial Action Coding System (FACS), which plays an important role in the field of face recognition. FACS was proposed with the purpose of finding a list of muscles that can respond separately according to changes of facial appearance. Therefore, we can analyze facial actions on the basis of the AUs (the result of muscular action). However, each person's appearance is different, so it is difficult to generalize. Although OpenFace 2.0 takes personal differences into account in AU detection, pre-experiments are still needed to find the most obvious action units for the different facial expressions of each person, which helps to obtain stable and reliable facial expression recognition. Considering the similarity of emotions and the relevance to HRIC safety, we only consider three facial expressions that can affect HRIC: happy, sad and surprise.
TABLE I
THE CORRESPONDENCE BETWEEN FACIAL EXPRESSIONS AND AUS
Facial Expressions    AUs
Happy                 AU06 + AU07 + AU12
Sad                   AU04 + AU15 + AU17
Surprise              AU01 + AU02 + AU25
Fig. 3. The AU values of the four facial expressions: expressionless, happy, sad, surprise.
To get the most representative AUs, we collect a group of facial images at different distances and angles from the camera for each facial expression, then remove the maximum and minimum value for each AU and average the remaining data, as shown in Fig. 3. Furthermore, we select the three most responsive AU values for each facial expression as the characteristic vector, e.g., AU06, AU07 and AU12 for happy, as shown in Table I.
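A minimal rule-based sketch of how the Table I combinations could be turned into a classifier is given below; the OpenFace AU column names and the intensity threshold are assumptions for illustration, not values from the paper.

```python
import numpy as np

# Characteristic AU combinations from Table I; "_r" denotes AU intensity columns.
EXPRESSION_AUS = {
    "happy":    ["AU06_r", "AU07_r", "AU12_r"],
    "sad":      ["AU04_r", "AU15_r", "AU17_r"],
    "surprise": ["AU01_r", "AU02_r", "AU25_r"],
}

def classify_expression(au_row, threshold=1.5):
    """Return the expression whose characteristic AUs are all active,
    or 'expressionless' if no combination fires. threshold is an assumed value."""
    scores = {}
    for name, aus in EXPRESSION_AUS.items():
        values = np.array([au_row[a] for a in aus])
        if np.all(values > threshold):       # all three AUs must be active
            scores[name] = values.mean()
    return max(scores, key=scores.get) if scores else "expressionless"

# Example with a hypothetical AU intensity reading.
print(classify_expression({"AU06_r": 2.1, "AU07_r": 1.8, "AU12_r": 2.6,
                           "AU04_r": 0.1, "AU15_r": 0.0, "AU17_r": 0.2,
                           "AU01_r": 0.0, "AU02_r": 0.1, "AU25_r": 0.3}))
```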
C. Human Motion Prediction
Human motion prediction is important for HRI. It can be based on human skeleton detection [26] [27], human geometric features, or deep recurrent neural networks (RNN) that model human motion with a probabilistic model [28]. However, detection based on the human skeleton can be unstable when the human body is occluded or changing rapidly, and a neural network depends on the training dataset: the larger the dataset, the better the prediction. In contrast, this paper proposes to detect head pose and gaze angle to predict human motion, since our eyes or head turn in advance toward the direction in which we plan to move, as shown in Fig. 4. Therefore, we can detect head pose and gaze angle to predict human movement; this prediction method is simple and effective.
Fig. 4. Head turning can lead to a specific hand motion which can be used for motion prediction: (a) and (b) show the person turning to the left, whereas (c) and (d) show the person turning to the right. The clear portrait in each image is the current position, whereas the blurred one is the last or next position. The person is holding a milk box.
OpenFace 2.0 cannot directly provide the real turning angles of the head and eyes, so it is necessary to convert the detection result of OpenFace 2.0 into the real angles. The transformation from the data of the two pose vectors to real-world data (the angle of head and eye turning) can be achieved by:
A_t = (P_t[0] − P_c) / P_max · π    (1)
• P_t[0] refers to gaze_angle[0] (the left-right horizontal direction), P_c is the middle value of gaze_angle[0], and P_max is the maximum of gaze_angle[0]; from these we obtain the true eye turning angle A_t.
• The transformations of the head pose and of gaze_angle[1] (the up-down vertical direction) are similar to that of gaze_angle[0], so we only list the transformation of gaze_angle[0].
After obtaining the head pose and gaze angle, we need to map them to the distance that the person plans to move. Taking the scene of people turning left and right as an example, ten volunteers took part in our experiment. Each volunteer turned randomly to the left or right 20 times with different angles, and we measured the distance of the volunteer's arm movement. For the 20 groups of data of each volunteer, we take 5 degrees as the starting point and every 10 degrees as a scale. We thus obtain nine scales, [5, 15, 25, 35, 45, 55, 65, 75, 85], and the 20 groups of data are divided into the different scales according to the angle value. If a certain scale has multiple values, the median value is taken as the value of that scale. The corresponding scale values of the ten volunteers are finally averaged to get the ultimate nine scale values.
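The conversion of equation (1) can be written compactly as in the sketch below; the calibration values P_c and P_max depend on the pre-experiments described above and are placeholders here.

```python
import math

def real_turning_angle(p_t0, p_c, p_max):
    """Equation (1): map a raw OpenFace gaze (or head) reading p_t0 to the
    real turning angle A_t, given the middle value p_c and maximum p_max."""
    return (p_t0 - p_c) / p_max * math.pi

# Example with placeholder calibration values (not measured in the paper).
A_t = real_turning_angle(p_t0=0.6, p_c=0.0, p_max=1.2)  # ~0.5*pi rad
```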
Fig. 5. A linear least squares estimation of the statistical model between hand movement distance D_arm and head turning angle A_t.
We then plot the human arm movement distance D_arm corresponding to the nine scales of the turning angle A_t in Fig. 5, and use a linear ordinary least squares estimation to fit the curve:
y = α + βx    (2)
where α and β are the fitting coefficients to be solved. The fitted curve can be seen in Fig. 5, such that we can estimate the move distance according to different head poses and gaze angles.
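A sketch of the ordinary least squares fit of equation (2) over the nine scale values follows; the distances used below are placeholders, not the averaged measurements plotted in Fig. 5.

```python
import numpy as np

# The nine angle scales (degrees) from the volunteer experiment.
angles = np.array([5, 15, 25, 35, 45, 55, 65, 75, 85], dtype=float)
# Placeholder averaged arm-movement distances (m); the real values come from Fig. 5.
distances = np.array([0.05, 0.09, 0.14, 0.18, 0.22, 0.27, 0.31, 0.36, 0.40])

# Ordinary least squares fit of y = alpha + beta * x (equation (2)).
beta, alpha = np.polyfit(angles, distances, deg=1)

def predict_arm_distance(turn_angle_deg):
    """Estimate the arm movement distance D_arm for a given turning angle."""
    return alpha + beta * turn_angle_deg
```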
IV. PSYCHOLOGICAL SAFETY AND MOTION PREDICTION FOR MOTION PLANNING
In this paper, safeguarding human psychological safety is achieved by controlling the robot's velocity and trajectory. When the robot's velocity and trajectory change greatly, they can cause discomfort or stress to the human. Therefore, it is necessary to constrain the robot's velocity and trajectory to ensure human psychological safety. Based on our previous SDAPF algorithm for motion planning, we here introduce an improved version, P-SDAPF, that takes psychological safety into account according to the human facial expression and attention changes. Therefore, we first introduce SDAPF briefly, and then introduce the P-SDAPF that incorporates psychological safety.
A. Sampling and Danger-Index Based Artificial Potential Field (SDAPF)
Our previous work introduced the sampling and danger-index based artificial potential field (SDAPF) [7] algorithm to optimize motion planning. Compared to the traditional APF, SDAPF introduces a velocity repulsive force function F_rev with improved impact factors and an adaptive adjustment of the step size S to overcome the shortcomings of APF. The impact factors include a distance impact factor and a speed impact factor.
The distance impact factor is affected by the distance between the robot and the dynamic obstacle: the closer the distance, the greater the factor. The distance influence factor function is:
f_d = η (1/p(X) − 1/p_max),  if p(X) ≤ p_max;  f_d = 0, otherwise    (3)
where η is the distance scale factor, which can be expressed as:
η = p_max p_min / (p_max − p_min)    (4)
where p_max and p_min are the maximum and minimum influence radii of the moving obstacle, and p(X) represents the distance between the robot and the obstacle at position X.
In the same way, the speed influence factor depends on the relative speed between the robot and the dynamic obstacle, which can be expressed as:
k_v = sgn(γ|v_o| − |v_r|)    (5)
where |v_o| is the magnitude of the velocity of the dynamic obstacle, |v_r| is the magnitude of the robot's velocity, and γ is a speed scale factor. sgn() is the sign function, which makes k_v positive when γ|v_o| is greater than |v_r| and vice versa. k_v > 0 means the dynamic obstacle is moving fast, which influences the velocity repulsive force; SDAPF can then adaptively adopt a dynamic obstacle avoidance strategy for this situation.
The expression of the velocity repulsive force is:
F_rev(v) = k_ro f_d (v_r − v_o),  if k_v ≤ 0 ∩ f_d > 0 ∩ α ∈ (−π/2, π/2)
F_rev(v) = k_ro f_d (v_r + v_o),  if k_v > 0 ∩ f_d > 0 ∩ α ∈ (−π/2, π/2)
F_rev(v) = 0,  otherwise    (6)
where k_ro is a scale factor, k_v is the speed influence factor, f_d is the distance influence factor, v_r is the velocity of the end-effector, v_o is the velocity of the obstacle, and α is the angle between the relative velocity vector v_or = v_r − v_o and the displacement vector from the end-effector to the obstacle. Whether k_v is greater than 0 only affects the addition or subtraction relationship between v_r and v_o. When k_v is less than or equal to 0, v_o is subtracted from v_r. Otherwise v_o is added to v_r, which can be seen as the relative velocity between the robot and a dynamic obstacle moving in the opposite direction at −v_o.
The step size is chosen as:
S = v_r Δt_m,  if P(X_t, X_ot) > p_d ∩ ∇P(X_t, X_ot) ≤ 0
S = d,  if P(X_t, X_ot) ≤ p_d
S = toGoal,  if P(X_t, X_ot) > p_d ∩ ∇P(X_t, X_ot) > 0    (7)
where d represents the initial step size, v_r is the velocity of the end-effector, and t_m is the maximum time that the manipulator can move when the distance between the end-effector and the obstacle is greater than p_d, a parameter that should be guaranteed to be greater than or equal to the maximum influence radius of the moving obstacle. P(X_t, X_ot) is the distance between the robot's end-effector and the obstacle, and ∇P(X_t, X_ot) is its derivative. toGoal indicates that we can directly select the goal point as the next position when the end-effector is outside the influence range of the obstacle and the relative distance is increasing.
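To make equations (3)–(6) concrete, a minimal sketch of the distance factor, speed factor and velocity repulsive force is given below; the scale factors and radii are placeholder values, and the angle test follows the (−π/2, π/2) condition as reconstructed above.

```python
import numpy as np

def distance_factor(p_x, p_max=0.6, p_min=0.1):
    """Equations (3)-(4): distance influence factor f_d (radii are placeholders)."""
    if p_x > p_max:
        return 0.0
    eta = p_max * p_min / (p_max - p_min)
    return eta * (1.0 / p_x - 1.0 / p_max)

def speed_factor(v_o, v_r, gamma=1.0):
    """Equation (5): k_v = sgn(gamma * |v_o| - |v_r|)."""
    return np.sign(gamma * np.linalg.norm(v_o) - np.linalg.norm(v_r))

def velocity_repulsive_force(x_r, x_o, v_r, v_o, k_ro=1.0):
    """Equation (6): velocity repulsive force F_rev acting on the end-effector."""
    f_d = distance_factor(np.linalg.norm(x_o - x_r))
    k_v = speed_factor(v_o, v_r)
    v_or = v_r - v_o                        # relative velocity
    d_ro = x_o - x_r                        # displacement end-effector -> obstacle
    cos_a = np.dot(v_or, d_ro) / (np.linalg.norm(v_or) * np.linalg.norm(d_ro) + 1e-9)
    approaching = cos_a > 0.0               # i.e. alpha in (-pi/2, pi/2)
    if f_d <= 0.0 or not approaching:
        return np.zeros(3)
    return k_ro * f_d * (v_r - v_o) if k_v <= 0 else k_ro * f_d * (v_r + v_o)
```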
B. Adaptive Adjustment of Velocity and Step Size Based on Facial Expression Recognition
An adaptive adjustment method for velocity and step size according to facial expressions, head pose and gaze angle is required to ensure psychological safety in the dynamic environment. When the detected expression is unusual or the head and eyes turn in another direction, the robot's velocity and step size need to be reduced, which can be defined as:
V_p = V_t + (P_t − P_{t−1}) V_max A_h / π    (8)
V_{t+1} = (1 − H)(V_t − E_x V_b) + H V_p    (9)
• The setting function of the step size is similar to that of the velocity, so we only list and explain the velocity function.
• V_t refers to the velocity of the robot at time t, which is a 3D vector including the X, Y, Z axes. V_max is the maximum velocity change that the robot can apply at the next moment. A_h is the turning angle of the head and eyes. P_t is the human head and gaze pose at time t in HRI, so P_t − P_{t−1} represents the orientation in which the head and gaze will turn; it takes effect only when the distance between the human and the robot is small. Summing these two parts, we obtain the robot velocity V_p affected by the head and gaze pose. V_b is the basic change range, and E_x is a flag indicating whether the expression is abnormal, i.e., its value is 1 when the facial expression is abnormal and 0 otherwise. H is a flag indicating whether the head and eyes turn, which is 1 when the head and eyes turn more than a certain angle and 0 otherwise.
To control the step size of the end-effector according to the state of the human, we set the initial step size to the maximum value and decrease the step size each period if an abnormal facial expression is detected, or if the head pose and gaze angle exceed a threshold, until it reaches the minimum value. When the facial expression, head pose and gaze angle are normal, we increase the step size each period until it reaches the maximum value.
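A sketch of the adaptive velocity update of equations (8)–(9) is shown below; the flag conventions (E_x = 1 for an abnormal expression, H = 1 when the head/gaze turn exceeds the threshold) follow the description above, and the numeric limits are assumed placeholder magnitudes.

```python
import numpy as np

def update_velocity(v_t, p_t, p_prev, a_h, expr_abnormal, head_turned,
                    v_max=0.05, v_b=0.02):
    """Equations (8)-(9): adapt the robot velocity from the facial expression
    and the head/gaze turning. v_max and v_b are placeholder magnitudes."""
    e_x = 1.0 if expr_abnormal else 0.0               # E_x flag
    h = 1.0 if head_turned else 0.0                   # H flag
    v_p = v_t + (p_t - p_prev) * v_max * a_h / np.pi  # equation (8)
    return (1.0 - h) * (v_t - e_x * v_b) + h * v_p    # equation (9)
```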
C. Motion Prediction Based on Head Pose and Gaze Angle Estimation for Trajectory Optimization
A human takes action when the head and gaze change, such that the robot's original trajectory needs to be optimized to ensure human safety. The net force of SDAPF is given by
F = F_att + F_rep + F_rev    (10)
F_rep1 = k_r (1/p(X) − 1/p_0) (1/p(X)^2) (X − X_g)^n
F_rep2 = −(n/2) k_r (1/p(X) − 1/p_0)^2 (X − X_g)^{n−1}
F_rep = F_rep1 + F_rep2    (11)
where n is a constant greater than zero, k_r is a scale factor, (X − X_g) represents the distance between the robot and the target, p(X) represents the distance between the end-effector and the obstacle at position X, and p_0 is the influence radius of each obstacle.
We optimize F_rep and F_rev by using the head pose and gaze angle to predict human motion. For F_rep and F_rev, we use the predicted human pose P_pre of equation (12) and the predicted velocity V_pre of equation (13) to replace p(X) of equation (3) and v_o of equation (5) respectively, which are defined as:
P_pre = P_ot + (P_t − P_{t−1}) P_max A_h / π    (12)
V_pre = (P_pre − P_ot) / Δt    (13)
where P_ot is the human arm pose at the present moment, and (P_t − P_{t−1}) represents the orientation in which the head and gaze will turn, as in equation (8). P_max is the maximum amplitude of the human arm movement when the head and gaze turn by π, A_h is the turning angle of the head and gaze, and Δt is the cycle time of the program execution.
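The prediction of equations (12)–(13) and its substitution into the repulsive terms can be sketched as follows; `distance_factor` and `speed_factor` refer to the earlier sketch, and the amplitude P_max and cycle time below are placeholders.

```python
import numpy as np

def predict_human_arm(p_ot, p_t, p_prev, a_h, p_max_amp=0.4, dt=0.05):
    """Equations (12)-(13): predict the human arm pose and velocity from the
    head/gaze turning. p_max_amp and dt are placeholder values."""
    p_pre = p_ot + (p_t - p_prev) * p_max_amp * a_h / np.pi   # equation (12)
    v_pre = (p_pre - p_ot) / dt                               # equation (13)
    return p_pre, v_pre

# p_pre then replaces p(X) in equation (3) and v_pre replaces v_o in
# equation (5) when evaluating F_rep and F_rev against the human arm.
```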
V. EXPERIMENT AND ANALYSIS
This paper proposes a motion planning algorithm that combines psychological safety and motion prediction. Here, psychological safety is mainly affected by the robot's velocity and trajectory, so our experiments analyse the velocity and trajectory of the robot while the facial behavior is changing. To demonstrate the proposed idea, we first use Gazebo to build a simulated HRI environment to show how the robot can capture human emotion changes and predict the motion of the human hand, and then a real TIAGo robot is used to show the practical performance of the proposed P-SDAPF method.
A. Dynamic Interaction Experiment in Gazebo Environment
In order to facilitate the experiment, we choose Gazebo to simulate a dynamic interaction environment. In Gazebo, the TIAGo robot needs to avoid the moving hand of another robot (a simulated human hand) to pick a box from the table. In order to track the dynamic motion of the simulated human hand precisely, we use an ArUco marker detection method. The ArUco markers are attached to the simulated human hand and the box, such that the robot can estimate the relative pose using these visual markers. The human facial expression, head pose and gaze angle are estimated by OpenFace 2.0 using a real camera.
Firstly, we explore the influence of facial expressions on TIAGo's trajectory, velocity and step size. As shown in Fig. 6, when a change of facial expression is detected (t = 2 s, 13 s), TIAGo's velocity becomes slower and its step size becomes shorter (the lower limit is 0.02), as shown in Fig. 6 (b) and (d), compared to the original SDAPF algorithm without psychology consideration shown in Fig. 6 (a) and (c), which helps to ensure human psychological safety. In Fig. 6, (a) and (b) are the 3D trajectories of the gripper (blue) and the simulated human hand (green), and (c) and (d) show the projected trajectories on the YZ plane, the step size, the velocities and the detected facial expressions (0 means expressionless, 1 means surprise, 2 means sadness and 3 means happiness). Subfigures (e) and (f) are sampled facial images showing happiness and sadness. TIAGo's velocity and step size return to normal values when the detected facial expression is normal. From the experimental results, TIAGo's trajectory and velocity with facial expression recognition are smoother and more comfortable than those without facial expression recognition, which can cause stress and danger for the human.
Fig. 6. Motion planning considering the facial expression of the human: (a) and (c) are control results of the original SDAPF, whereas (b) and (d) are results of the proposed P-SDAPF method considering facial expressions; (e) and (f) are sample images of different facial expressions corresponding to happy and sad respectively. (a) and (b) are 3D trajectories of the robot and the moving obstacle (human hand). (c) and (d) show the projected 2D trajectories on the YZ plane, step size S, velocity V, and facial expression E.
The ability to predict the moving direction of the human can improve the performance of the path planner of the TIAGo robot. As shown in Fig. 7, without human motion prediction, the projected trajectory on the YZ plane in Fig. 7 (c) has several sharp turning points due to the dynamic obstacle (human hand), whereas our method improves this situation by predicting the human motion using the human head pose and gaze direction information, as shown in Fig. 7 (d). At t = 2.5 s, TIAGo's velocity slows down and the step size is decreased when the head and gaze are turning. The predicted trajectory of the dynamic human hand helps TIAGo to optimize its trajectory to avoid the dynamic obstacle effectively. In our experiment, when the turning angle of the human head is less than a threshold, we only consider the gaze angle and set the head angle to 0. Otherwise we only consider the head angle and set the gaze angle to the head angle, as shown in the last curve of Fig. 7 (d). Fig. 7 (e) and (f) are the sample facial images of turning around.
Fig. 7. Motion planning considering the head pose and gaze angle of the human: (a) and (c) are control results of the original SDAPF, whereas (b) and (d) are results of the proposed P-SDAPF method considering head pose and gaze angle; (a) and (b) are 3D trajectories, (c) and (d) show the projected 2D trajectories, step size, velocity and turning angle of head and gaze, and (e) and (f) are sample images of different head poses and gaze angles.
B. Dynamic Interaction Experiment in Real Environment
To verify the effectiveness and applicability of our approach, we use a real dynamic interaction scene with a TIAGo robot, as shown in Fig. 9. TIAGo needs to grab a stapler from the table and then place it at the right side of the human. At the same time, the human picks up a water cup on the table and places it at the left side of the human. In this scenario, the operation workspaces of the human arm and TIAGo's end-effector overlap. TIAGo needs to recognize the pose of the water cup and the human arm, and then avoid possible collisions during the interaction.
The numerical results can be seen in Fig. 8, which shows the 3D trajectory, 2D trajectory, step size, velocity, turning angle of head or gaze and facial expression using the SDAPF and our P-SDAPF methods respectively. The trajectory of the end-effector using SDAPF has several sharp turning points and falls back due to the sudden movement of the human arm. In contrast, our P-SDAPF method achieves a smoother trajectory since we can predict the human motion by detecting the human head pose and gaze angle. Meanwhile, we also show that TIAGo's velocity becomes slower and the step size becomes shorter when the facial expression, head pose and gaze angle change, which helps to ensure human physical and psychological safety. Fig. 9 shows the captured images of the demonstration using the two different algorithms.
Fig. 8. Experimental results of dynamic obstacle avoidance using SDAPF and our P-SDAPF respectively: (a) and (b) are 3D trajectories by SDAPF and P-SDAPF, and (c) and (d) show the projected 2D trajectories, step size, velocity, turning angle of head and gaze, and facial expression by SDAPF and P-SDAPF. The trajectory using SDAPF has sharp turning points and falls back, while our P-SDAPF yields a smoother trajectory due to its motion prediction ability.
VI. CONCLUSION
In this paper, we propose a motion planning algorithm combining psychological safety and motion prediction for a sense motive robot. Safety is a core problem of human-robot interaction and collaboration, which includes not only physical safety but also psychological safety. Our method aims to solve safety problems by optimizing the velocity control and step size adjustment using facial expression information and head pose and gaze angle estimation in a sampling-based APF scheme. From the experimental results using a 7-DOF TIAGo robot in the 3D Gazebo environment and in a real HRI environment, we show that our robot can recognize the human emotional state and predict human motion, such that it can control its velocity and step size accordingly. In this way, both physical safety and psychological safety can be ensured, and the interaction experience is improved. In the future, we would like to introduce more safety factors into our framework to reinforce the psychological and physical safety of HRIC, e.g., social customs and speech interaction.
Fig. 9. The captured image sequences for dynamic obstacle avoidance using our P-SDAPF algorithm.
REFERENCES
[1] C. Balaguer, A. Gimenez, A. Jardon, R. Cabas, and R. Correal, "Live experimentation of the service robot applications for elderly people care in home environments," pp. 2345–2350, 2005.
[2] J. Yi and S. Yi, "Mobile manipulation for the HSR intelligent home service robot," pp. 169–173, IEEE, 2019.
[3] D. Herrero-Fernández, P. Parada-Fernández, M. Oliva-Macías, and R. Jorge, "The influence of emotional state on risk perception in pedestrians: A psychophysiological approach," Safety Science, vol. 130, p. 104857, 2020.
[4] S. Bhandari, M. R. Hallowell, L. Van Boven, J. Gruber, and K. M. Welker, "Emotional states and their impact on hazard identification skills," in Construction Research Congress 2016, pp. 2831–2840, 2016.
[5] P. A. Lasota, T. Fong, J. A. Shah, et al., A Survey of Methods for Safe Human-Robot Interaction. Now Publishers, 2017.
[6] Y.-H. Weng, C.-H. Chen, and C.-T. Sun, "Toward the human–robot co-existence society: On safety intelligence for next generation robots," International Journal of Social Robotics, vol. 1, no. 4, p. 267, 2009.
[7] G. Liu, H. He, G. Tian, J. Zhang, and Z. Ji, "Online collision avoidance for human-robot collaborative interaction concerning safety and efficiency," pp. 1667–1672, 2020.
[8] T. Baltrusaitis, A. Zadeh, Y. C. Lim, and L.-P. Morency, "OpenFace 2.0: Facial behavior analysis toolkit," pp. 59–66, IEEE, 2018.
[9] L. Fu, H. Zhai, Y. Zhang, and D. Yu, "Binary tree SVM-based emotion recognition from speech signal," International Journal of Advancements in Computing Technology, vol. 5, no. 1, 2013.
[10] A. Bansal, S. Chaudhary, and S. D. Roy, "A novel LDA and HMM-based technique for emotion recognition from facial expressions," in IAPR Workshop on Multimodal Pattern Recognition of Social Signals in Human-Computer Interaction, pp. 19–26, Springer, 2012.
[11] B. Zhang, C. Quan, and F. Ren, "Study on CNN in the recognition of emotion in audio and images," pp. 1–5, IEEE, 2016.
[12] M. A. Ozdemir, B. Elagoz, A. Alaybeyoglu, R. Sadighzadeh, and A. Akan, "Real time emotion recognition from facial expressions using CNN architecture," pp. 1–4, IEEE, 2019.
[13] T. Chang, G. Wen, Y. Hu, and J. Ma, "Facial expression recognition based on complexity perception classification algorithm," arXiv preprint arXiv:1803.00185, 2018.
[14] O. Khatib, "Real-time obstacle avoidance for manipulators and mobile robots," in Autonomous Robot Vehicles, pp. 396–404, Springer, 1986.
[15] E. Shi, T. Cai, C. He, and J. Guo, "Study of the new method for improving artificial potential field in mobile robot obstacle avoidance," pp. 282–286, IEEE, 2007.
[16] Y. Du, X. Zhang, and Z. Nie, "A real-time collision avoidance strategy in dynamic airspace based on dynamic artificial potential field algorithm," IEEE Access, vol. 7, pp. 169469–169479, 2019.
[17] J. Sun, G. Liu, G. Tian, and J. Zhang, "Smart obstacle avoidance using a danger index for a dynamic environment," Applied Sciences, vol. 9, no. 8, p. 1589, 2019.
[18] D. Fox, W. Burgard, and S. Thrun, "The dynamic window approach to collision avoidance," IEEE Robotics & Automation Magazine, vol. 4, no. 1, pp. 23–33, 1997.
[19] S. M. LaValle, "Rapidly-exploring random trees: A new tool for path planning," The Annual Research Report, 1998.
[20] J. Pan and D. Manocha, "Fast probabilistic collision checking for sampling-based motion planning using locality-sensitive hashing," The International Journal of Robotics Research, vol. 35, no. 12, pp. 1477–1496, 2016.
[21] A. Kanazawa, J. Kinugawa, and K. Kosuge, "Adaptive motion planning for a collaborative robot based on prediction uncertainty to enhance human safety and work efficiency," IEEE Transactions on Robotics, vol. 35, no. 4, pp. 817–832, 2019.
[22] J. S. Park, C. Park, and D. Manocha, "Intention-aware motion planning using learning based human motion prediction," in Robotics: Science and Systems, 2017.
[23] P. A. Lasota and J. A. Shah, "Analyzing the effects of human-aware motion planning on close-proximity human–robot collaboration," Human Factors, vol. 57, no. 1, pp. 21–33, 2015.
[24] T. Yamamoto, M. Jindai, S. Shibata, and A. Shimizu, "An avoidance planning of robots using fuzzy control considering human emotions," in IEEE SMC'99 Conference Proceedings, 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No. 99CH37028), vol. 6, pp. 976–981, IEEE, 1999.
[25] H. Williams, C. Lee-Johnson, W. N. Browne, and D. A. Carnegie, "Emotion inspired adaptive robotic path planning," pp. 3004–3011, IEEE, 2015.
[26] J. Butepage, M. J. Black, D. Kragic, and H. Kjellstrom, "Deep representation learning for human motion prediction and classification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6158–6166, 2017.
[27] L.-Y. Gui, K. Zhang, Y.-X. Wang, X. Liang, J. M. Moura, and M. Veloso, "Teaching robots to predict human motion," pp. 562–567, IEEE, 2018.
[28] J. Martinez, M. J. Black, and J. Romero, "On human motion prediction using recurrent neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.