Adversarial Interaction Attack: Fooling AI to Misinterpret Human Intentions
Nodens Koren, Qiuhong Ke, Yisen Wang, James Bailey, Xingjun Ma
The University of Melbourne, Victoria, Australia; Peking University, Beijing, China; Deakin University, Geelong, Australia
{nkoren@student., qiuhong.ke@, baileyj@}unimelb.edu.au, [email protected], [email protected]

Abstract
Understanding the actions of both humans and artificial intelligence (AI) agents is important before modern AI systems can be fully integrated into our daily life. In this paper, we show that, despite their current huge success, deep learning based AI systems can be easily fooled by subtle adversarial noise to misinterpret the intention of an action in interaction scenarios. Based on a case study of skeleton-based human interactions, we propose a novel adversarial attack on interactions, and demonstrate how DNN-based interaction models can be tricked to predict the participants' reactions in unexpected ways. From a broader perspective, the scope of our proposed attack method is not confined to problems related to skeleton data but can also be extended to any type of problem involving sequential regression. Our study highlights potential risks in the interaction loop between AI and humans, which need to be carefully addressed when deploying AI systems in safety-critical applications.
1 Introduction

In recent years, Artificial Intelligence (AI) has become much more closely connected to human activity. Many tasks that once required human labor are now gradually being automated and shifted to AI. For instance, in order to cope with the COVID-19 pandemic, the use of robot workers has been suggested to minimize physical contact between humans. These robot technologies heavily depend on the accuracy of action recognition/prediction and the consequent interaction between humans and machines.

State-of-the-art action recognition and prediction models are deep neural networks (DNNs), due to their capability of modeling complex problems accurately (Si et al. 2019; Li et al. 2019a,b). Nonetheless, it has also been shown that these models are prone to adversarial examples (or attacks) (Biggio et al. 2013; Szegedy et al. 2013; Goodfellow, Shlens, and Szegedy 2014). DNNs can behave erratically when processing inputs with carefully crafted perturbations, even though such perturbations are imperceptible to humans (Carlini and Wagner 2017; Madry et al. 2018; Croce and Hein 2020; Jiang et al. 2020; Wang et al. 2021). This has raised security concerns about the deployment of DNN-powered AI systems in security-critical applications such as autonomous driving (Eykholt et al. 2018; Duan et al. 2020) and medical diagnosis (Finlayson et al. 2019; Ma et al. 2020). Investigating and understanding these abnormalities is a crucial task before machine learning based AI agents can become practical.

In this work, we investigate the adversarial vulnerability of DNN reaction prediction (i.e., regression) models in skeleton-based interactions. Skeleton signals are among the most commonly used representations for human or robot motion (Zhang et al. 2016; Wang et al. 2018). While adversarial attacks have been extensively studied on images (Goodfellow, Shlens, and Szegedy 2014; Su, Vargas, and Sakurai 2019; Brown et al. 2017; Duan et al. 2020), very few works have been proposed for skeletons (Liu, Akhtar, and Mian 2019; Wang et al. 2019a; Zheng et al. 2020). In comparison to the image space, which is continuous and where pixels can be perturbed freely without raising obvious attack suspicions, the skeleton space is sparse and discrete, and it has a temporal nature that needs to be taken into account. Consequently, attacking skeleton-based models requires many more constraints than the image space.

Existing works on attacking skeleton-based models have only considered the single-person scenario, and have all focused on recognition (i.e., classification) models (Liu, Akhtar, and Mian 2019; Wang et al. 2019a; Zheng et al. 2020). However, interaction scenarios involving two or more characters are essential to the interaction between humans and AI. They should not be overlooked if our ultimate goal is to build AI agents that can fit into our daily life. Neglecting possible attacks might lead to AI agents malfunctioning or behaving aggressively when they are not supposed to.

To close this gap, we propose an Adversarial Interaction Attack (AIA) to test the vulnerability of regression DNNs in skeleton-based interactions involving two characters. Being able to accurately recognize a person's action is important, but it is equally important to be able to go a step further and respond to the action in an appropriate way. In light of this, the use of regression models is necessary. We hence modified the output layers of two previous state-of-the-art models on action recognition.
One model was based on a Temporal Convolutional Network (TCN) (Bai, Kolter, and Koltun 2018) and the other was based on Gated Recurrent Units (GRUs) (Maghoumi and LaViola Jr 2019). The models were modified to return reactor sequences instead of class labels, and we trained them on skeleton-based interaction data. We examine the performance of the AIA attack under both white-box and black-box settings. We show that our AIA attack can easily fool the two regression models to misinterpret the actor's intentions and predict unexpected reactions. Such reactions can have detrimental effects on either the actor or the reactor. Overall, our work reveals potential threats of subtle adversarial attacks on interactions involving AI.

In summary, our contributions are:
• We propose an adversarial attack approach, the Adversarial Interaction Attack (AIA), that is domain-independent and works for general sequential regression models.
• We propose an evaluation metric that can be applied to evaluate the performance of sequential regression attacks. Such a metric is currently missing from the literature.
• We empirically show that our AIA attack can generate targeted adversarial action sequences with small perturbations, which fool DNN regression models into making incorrect (possibly dangerous) predictions.
• We demonstrate via three case studies how our AIA attack may affect human and AI interactions in real scenarios, which motivates the need for effective defense strategies.

We highlight that our work is the first work on targeted sequential regression attack in a strict sense (i.e., purely numerical outputs without labels of any kind). We do not compare our work to previous works on skeleton-based action recognition, as the focus of our work is fundamentally different. Specifically, the goal of our work is to design a new type of attack and evaluation metric that is capable of handling any type of regression-based problem in general. We thus leave the compatibility between our work and the previously proposed anthropomorphic constraints (Liu, Akhtar, and Mian 2019; Wang et al. 2019a; Zheng et al. 2020) as a future area of interest.

2 Related Work

2.1 Adversarial Attack

Adversarial attacks can be either white-box or black-box, depending on the attacker's knowledge about the target model. A white-box attack has full knowledge about the target model, including parameters and training details (Goodfellow, Shlens, and Szegedy 2014; Zheng, Chen, and Ren 2019; Croce and Hein 2020; Jiang et al. 2020), while a black-box attack can only query the target model (Chen et al. 2017; Ilyas, Engstrom, and Madry 2018; Bhagoji et al. 2018; Dong et al. 2019b; Jiang et al. 2019; Bai et al. 2020) or use a surrogate model (Liu et al. 2016; Tramèr et al. 2017; Dong et al. 2018, 2019a; Andriushchenko et al. 2020; Wu et al. 2020; Wang et al. 2021). Adversarial attacks can also be targeted or untargeted. Under the classification setting, untargeted attacks aim to fool the model such that its output is different from the correct label, whereas targeted attacks aim to fool the model into returning a target label of the attacker's interest. White-box attacks can be achieved by solving either the targeted or untargeted adversarial objective using first-order gradient methods (Goodfellow, Shlens, and Szegedy 2014; Kurakin, Goodfellow, and Bengio 2016). Optimization-based methods have also been proposed to achieve the adversarial objective and, at the same time, minimize the perturbation size (Carlini and Wagner 2017; Chen et al. 2018).
Most of the above existing attacks were proposed for images and classification models, and the perturbation is usually constrained to be small (e.g., ‖ε‖∞ = 8 for pixel values in [0, 255]) so as to be imperceptible to human observers. Defenses against adversarial attacks have also been explored on image datasets (Madry et al. 2018; Zhang et al. 2019; Wang et al. 2019b; Bai et al. 2019; Wang et al. 2020; Wu, Xia, and Wang 2020; Bai et al. 2021).

Attacking Regression Models.
Untargeted regression attacks can be derived from classification attacks by simply attacking the regression loss (Balda, Behboodi, and Mathar 2019). However, it is more difficult to perform a targeted regression attack such that the model outputs a target sequence. This is because, unlike classification models that contain a finite set of discrete labels, regression models can have infinitely many possible outcomes. Hence, most existing attacks on regression models have focused on the untargeted setting. Meng et al. (2019) proposed a univariate regression loss with the goal of changing the outputs of EEG-based BCI regression models to a value that is at least t away from the natural outcome. This loss function guarantees only that the adversarial output will be at a specified distance away from the natural output; it does not constrain how large or small the output can actually become. In natural language processing (NLP), Cheng et al. (2020) proposed a targeted attack towards recurrent language models. This work aims to replace arbitrary words in the output sequence with a small set of target adversarial keywords, regardless of their order and occurrence position. Compared to Cheng et al. (2020), the order of target sequences is more significant for our problem. While word embeddings can be used to evaluate attack performance on language models, an appropriate performance metric is still lacking in the field of interaction prediction, making it difficult to evaluate the effectiveness of an attack. None of the existing works have implemented an attack that is able to change the whole output sequence completely. In our work, we propose such an attack, which can change the entire output sequence with target frames appearing in our desired order.

Adversarial Attack on Action Recognition.
Previous attacks on skeleton-based action recognition have proposed several constraints based on extensive study of anthropomorphism and motion. These include postural constraints on the maximum changes in joint angles, and inter-frame constraints based on the notions of velocity, acceleration, and jerk (Liu, Akhtar, and Mian 2019; Wang et al. 2019a; Zheng et al. 2020). Additionally, Liu, Akhtar, and Mian (2019) utilized a Generative Adversarial Network (GAN) loss to model anthropomorphic plausibility. These constraints are distinct from our work, but could potentially be employed in combination with our proposed attack to improve the naturalness of adversarial action sequences.

2.2 Interaction Recognition and Prediction
The use of skeleton data has gained popularity in action recognition and prediction research. Owing to the fact that reliable skeleton data can be easily extracted from modern RGB-D sensors or RGB camera images, these techniques can be easily extended to practical applications (Yun et al. 2012). One benchmark interaction dataset is the SBU Kinect Interaction Dataset. Different from most skeleton-based action recognition datasets, which focus on studying single-person activities, the SBU Kinect Interaction Dataset captures various activities with two characters involved. Predicting interactions is a much harder task in comparison to predicting single-person activities, due to the complexity and the non-periodicity of the problem (Yun et al. 2012). Specifically, in the interaction scenario, two characters are involved, but the contribution from each character may not be equal. For instance, interactions such as approaching and departing have only one active character; the other character remains steady over all time frames.

Convolutional Neural Networks (CNNs) (Du, Fu, and Wang 2015; Nunez et al. 2018; Li et al. 2017) and Recurrent Neural Networks (RNNs) (Du, Wang, and Wang 2015) are two popular choices to tackle the interaction recognition problem. Models from the RNN family, such as the Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM), are commonly chosen for interaction recognition because it is natural for them to handle sequential data. Maghoumi and LaViola Jr (2019) proposed a recurrent model, namely DeepGRU, that was able to reach state-of-the-art performance. Temporal Convolutional Networks (TCNs) are also a common choice of model when dealing with spatio-temporal data. TCNs, just like RNNs, can take sequences of any length. TCNs rely on a causal convolution operation to ensure no information leaks from the future to the past (Bai, Kolter, and Koltun 2018). The TCN is also a previous state-of-the-art model (Kim and Reiter 2017) and a component adopted by many recent works on skeleton-based action recognition (Meng, Liu, and Wang 2018; Yan, Xiong, and Lin 2018). In this paper, we modify the DeepGRU network proposed by Maghoumi and LaViola Jr (2019) and the TCN network proposed by Bai, Kolter, and Koltun (2018) for interaction prediction, and examine their vulnerability to our proposed attack on the SBU Kinect Interaction Dataset.
3 Adversarial Interaction Attack

In this section, we first provide a mathematical formulation of the targeted adversarial sequence attack problem. We then introduce the loss functions used by our AIA attack.
Overview.
Intuitively, the goal of our AIA attack is to deceive the reactor AI agent into thinking that the actor is doing a different, specific action by making minor changes to the positions of the actor's joints or the angles between joints. The reactor agent will consequently respond by performing the reaction that is targeted by the attack.
Problem Formulation.
A skeleton sequence with T frames can be represented mathematically as the vector X = (x_1, x_2, ..., x_T), where x_i is the skeleton representation of the i-th frame, a vector consisting of the 3D coordinates of the human skeleton joints. More specifically, x_i ∈ R^{N×3}, where N denotes the number of joints. In our approach, we flatten x_i into R^{3N}.

First, we define the formal notion of interaction. Suppose the two characters in a two-person interaction scenario are actor A and reactor B. The task of an interaction prediction model f is to predict an appropriate reaction (i.e., skeleton) y_t at each time step t for reactor B, based on the observed skeleton sequence (x_1, ..., x_t) of actor A. This can be written mathematically as f(x_1, ..., x_{t-1}, x_t) = y_t.

Given an input skeleton sequence X = (x_1, x_2, ..., x_T), an adversarial target skeleton sequence Y' = (y'_1, y'_2, ..., y'_T), and a prediction model f: R^{T×3N} → R^{T×3N}, the goal of our AIA attack is to find an adversarial input sequence X' = (x'_1, ..., x'_T) by solving the following optimization problem:

\min_{X'} \sum_{t \in T} \|x'_t - x_t\|_\infty \quad \text{s.t.} \quad \sum_{t \in T} \|f(x'_1, \cdots, x'_t) - y'_t\|_2 < \kappa,   (1)

where ‖·‖_p is the L_p norm and κ ≥ 0 is a tolerance factor, which serves as a cutoff that distinguishes whether the output sequence is recognizable as the target reaction. This gives us more flexibility when crafting the adversarial input sequence X', because the set of acceptable target sequences is non-singular: the output sequence does not need to be exactly the same as the target sequence to resemble a particular action. We empirically determine this factor based on an informal user survey in Section 5.1. Intuitively, the above objective is to find a sequence X' with minimum perturbation from X, such that the distance between the output and the target is less than κ/T on average at each time step.

Our goal is to develop a mechanism that crafts an adversarial input sequence solving the above optimization problem for any given target output sequence, while also maintaining the naturalness of the adversarial input sequence. In order to achieve this goal, we propose the following adversarial loss function:

L_{adv} = L_{spatial} + \lambda L_{temporal},   (2)

where the L_{spatial} loss term minimizes the spatial distance between the output sequence and the target sequence, and the L_{temporal} loss term maximizes the coherence of the perturbed input sequence so as to maintain the naturalness of the adversarial input sequence.
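To make the formulation concrete, the following minimal sketch (PyTorch-style Python; all names are illustrative and not taken from the authors' code) spells out the tensor shapes involved and the causal prediction interface assumed for f:

```python
import torch

T, N = 40, 15                 # frames and joints (SBU skeletons use 15 joints)
X = torch.rand(T, N, 3)       # actor sequence: T frames of N joints in 3D
X_flat = X.reshape(T, 3 * N)  # flatten each frame x_t into R^{3N}

# A prediction model f: R^{T x 3N} -> R^{T x 3N}. With a causal
# architecture (e.g., a TCN), the reaction y_t emitted at step t
# depends only on the observed actor frames (x_1, ..., x_t).
def predict_reactions(f, X_flat):
    return f(X_flat.unsqueeze(0)).squeeze(0)  # (T, 3N) reactor skeletons y_t
```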
Spatial Loss.
The spatial loss term aims to generate adversarial output sequences that are visually similar to the target reaction sequences; that is, its objective is to minimize the spatial distance between the output joint locations and a neighbourhood of the target joints at every time step. Following the formulation of the relaxed optimization problem in (1), we use the L_2 norm to measure the distance between two sets of joint locations:

L_{spatial} = \sum_{t \in T} \inf \{ \|f(x'_1, \cdots, x'_t) - p_t\|_2 \mid p_t \in S_t \}   (3)

with S_t being a (3N-1)-sphere defined by:

S_t(y'_t, \eta) = \{ p_t \in \mathbb{R}^{3N} \mid \|p_t - y'_t\|_2 = \eta \}.   (4)

Here, η = κ/T is the mean of the enabling tolerance factor κ in equation (1) over the T time steps.
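Since S_t is a sphere of radius η centred at y'_t, the infimum in (3) has a closed form: the distance from an output frame o_t to S_t is |‖o_t − y'_t‖_2 − η|. Below is a hedged PyTorch sketch of L_spatial built on this identity (function and variable names are ours, not the authors'):

```python
import torch

def spatial_loss(outputs, targets, eta):
    """L_spatial from Eqs. (3)-(4).

    outputs: (T, 3N) model outputs f(x'_1, ..., x'_t), one per time step
    targets: (T, 3N) adversarial target skeletons y'_t
    eta:     sphere radius kappa / T

    The distance from a point to the sphere of radius eta centred at
    y'_t is | ||o_t - y'_t||_2 - eta |, which realises the infimum in
    Eq. (3) in closed form.
    """
    dists = torch.norm(outputs - targets, dim=1)  # ||o_t - y'_t||_2 per frame
    return (dists - eta).abs().sum()
```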
Temporal Loss.
The temporal loss term guarantees the naturalness of the generated adversarial input sequence. Specifically, the movement of each joint should be continuous in time, and motions with abrupt large changes or teleportation should be penalised. The L_{temporal} term achieves this goal by maximizing the coherence of each element in the perturbed input sequence with respect to its neighboring elements in the temporal dimension. This gives:

L_{temporal} = \sum_{t \in T} \left( \|x'_t - x'_{t-1}\|_2 + \|x'_t - x'_{t+1}\|_2 \right)   (5)

Note that a scaling factor 0 ≤ λ ≤ 1 is introduced in front of L_{temporal} to balance the two loss terms.

We use the first-order method Projected Gradient Descent (PGD) (Madry et al. 2018) to minimize the combined adversarial loss iteratively as follows:

X'_0 = X,
X'_{m+1} = \Pi_{X,\epsilon} \left( X'_m - \alpha \cdot \mathrm{sign}(\nabla_{X'_m} L_{adv}(X'_m, Y')) \right),   (6)

where Π_{X,ε}(·) is the projection operation that clips the perturbation back to within ε-distance of X when it goes beyond, ∇_{X'_m} L_{adv}(X'_m, Y') is the gradient of the adversarial loss with respect to the input sequence, m is the current perturbation step out of a total of M steps, α is the step size, and ε is the maximum perturbation factor. The target sequence Y' for a target reaction can be either customized or sampled from the original dataset.
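Putting the pieces together, a minimal sketch of the optimization in (6) might look as follows (PyTorch-style; the step size alpha and the eps default below are illustrative placeholders, while lam = 0.1 and steps = 400 follow the experimental settings reported later):

```python
import torch

def temporal_loss(X_adv):
    """L_temporal from Eq. (5): penalise abrupt frame-to-frame motion.
    Each adjacent difference is counted twice, matching the two terms
    of Eq. (5) up to the boundary frames."""
    diffs = torch.norm(X_adv[1:] - X_adv[:-1], dim=1)
    return 2 * diffs.sum()

def aia_attack(f, X, Y_target, eta, eps=0.3, alpha=0.01, lam=0.1, steps=400):
    """Projected signed-gradient descent on L_adv = L_spatial + lam * L_temporal
    (Eqs. 2 and 6). X and Y_target are (T, 3N) tensors."""
    X_adv = X.clone()
    for _ in range(steps):
        X_adv.requires_grad_(True)
        outputs = f(X_adv.unsqueeze(0)).squeeze(0)
        loss = spatial_loss(outputs, Y_target, eta) + lam * temporal_loss(X_adv)
        grad = torch.autograd.grad(loss, X_adv)[0]
        with torch.no_grad():
            X_adv = X_adv - alpha * grad.sign()       # descent step on L_adv
            X_adv = X + (X_adv - X).clamp(-eps, eps)  # project into the eps-ball
    return X_adv.detach()
```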
4 Case Studies

In this section, we conduct case studies on three selected sets of attack objectives that can be easily associated with real scenarios and that serve as motivation for our approach. Detailed experimental settings can be found in Section 5. Dynamic versions of the case studies and more examples are provided in the supplementary materials.

Case Study 1: Handshaking to Punching.
Figure 1 illustrates a successful AIA attack that fools the model into predicting a 'punching' action for the reactor (the green character) as a response to the adversarially perturbed 'handshaking' action of the actor (the blue character). Note that the perturbation only slightly changed the actor's action. This reveals an important safety risk that needs to be carefully addressed before machine learning based AI agents can be widely used in human daily life. Suppose that, at an AI interactive exhibition, a participant would like to shake hands with an AI robot agent. He gradually extends his hand, sending out an interaction request to the AI agent, and expects the AI agent to respond to his handshaking invitation by shaking hands with him. However, instead of reaching its hand out gently, the AI agent decides to punch the participant in the face because the participant's body does not stay straight. It would be extremely hazardous if the human character unintentionally wiggled his body in a pattern similar to the adversarial perturbation introduced in this case study. While the actual chance of this happening is extremely low, due to the high complexity of the data in both the spatial and the temporal dimensions, this threat might nevertheless materialize if AI workers become widely deployed worldwide. In this case, the human is a victim who inadvertently performs an adversarial attack (wiggling their body).

Figure 1: Side-by-side comparison of Case Study 1, 'handshaking' to 'punching'. Top-bottom: original prediction, adversarial prediction. Blue character: input; green character: output.
Case Study 2: Punching to Handshaking.
In this case study we consider a case opposite to the previous one, where human exploiters are capable of attacking AI agents actively and derive benefit from being active attackers. In the future, it could become common practice to utilize AI agents to complete dangerous tasks so as to lower the chance of human operators incurring injuries or fatalities. Security guard is one such job that might be taken over by an AI agent. Imagine that a secret agency that hires AI security guards is invaded by intruders and is placed in a scenario where combat becomes necessary. The AI guard will fail in its role if the invaders know how to apply effective adversarial attacks against it. This is the case in Figure 2, where the model was fooled into suggesting 'handshaking' for the reactor (the green character) rather than 'punching'.

Figure 2: Side-by-side comparison of Case Study 2, 'punching' to 'handshaking'. Top-bottom: original prediction, adversarial prediction. Blue character: input; green character: output.
Case Study 3: Approaching to Remaining.
Finally, Case Study 3, demonstrated in Figure 3, examines how a cheater might be able to bypass an AI agent's detection. Whilst automatic ticket checkers have been widely adopted, manual ticket checking is still required in numerous situations. For instance, public transportation companies may want to check whether a passenger has paid the upgrade fee if he or she is in a first class seat. Now suppose that a public transportation company decides to hire AI agents to do the ticket checking job. The company will lose a huge amount of income if passengers know how to stop the ticket checkers from 'approaching' as in Figure 3, or even change their 'approaching' response to 'departing'.

Figure 3: Side-by-side comparison of Case Study 3, 'approaching' to 'remaining'. Top-bottom: original prediction, adversarial prediction. Blue character: input; green character: output.

5 Evaluation

5.1 Tolerance Factor κ

The objective of the AIA attack is defined with respect to a tolerance factor κ (see (1), (3) and (4)), which is a flexible metric that distinguishes whether the output sequence is close to the targeted adversarial reaction. Because there are many factors involved, such as the character's height, handedness, and the direction the character is facing, conventional distance metrics such as the L_1 and L_2 norms are not suitable to define precisely what the pattern of a specific action should look like. Therefore, we determine the value of κ based on human perception via an informal user survey.

In order to obtain appropriate values of κ for evaluating whether an attack is successful, we randomly sampled 5 out of our 8 sets of attack objectives and presented them to 82 human judges, including computer science faculty and students. Each objective set is composed of an action-reaction pair and contains output sequences generated from 6 different values of ε (from left to right in ascending order). For each sample set, we asked the human judges to choose the leftmost sequence they believe is performing the target reaction. The sampled objectives and the responses from the 82 human judges are recorded in Table 1.

Based on the responses from the 82 human judges, we computed the tolerance factor κ in the optimization problem defined in (1) based on the average of

\sum_{t \in T} \|f(x'_1, \cdots, x'_t) - y'_t\|_2   (7)

over the 5 sample objective sets. The calculation of (7) for each objective set is based on the minimum ε polled from the 82 human judges, and the corresponding value of κ is then selected as the optimal value (boldfaced in Table 1). Note that κ serves as a topological boundary between the natural and the adversarial outputs, whereas ε is a maximum perturbation constraint that we do not want the input perturbation to exceed.

Table 1: Responses from the 82 human judges. The optimal κ for each attack objective is highlighted in bold.

Target       ε (6 values, ascending)
?            ? (κ = 90.?)    4 (κ = 84.?)    *44 (κ = ?)*    ? (κ = 74.?)    12 (κ = 45.?)   14 (κ = 35.?)
Punching     *58 (κ = ?)*    13 (κ = 47.?)   6 (κ = 43.?)    3 (κ = 41.?)    0 (κ = 39.?)    2 (κ = 34.?)
Kicking      3 (κ = 100.?)   *71 (κ = ?)*    ? (κ = 86.?)    1 (κ = 80.?)    0 (κ = 47.?)    0 (κ = 35.?)
Departing    0 (κ = 85.?)    7 (κ = 76.?)    *26 (κ = ?)*    12 (κ = 67.?)   1 (κ = 41.?)    10 (κ = 32.?)
Pushing      6 (κ = 28.?)    3 (κ = 26.?)    2 (κ = 25.?)    14 (κ = 23.?)   *49 (κ = ?)*    ? (κ = 21.?)
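In code, the evaluation reduces to a single comparison per test sequence: compute the accumulated per-frame deviation (7) and compare it against the human-derived κ. A small sketch (names are ours; it reuses the conventions of the earlier snippets):

```python
import torch

def attack_succeeded(f, X_adv, Y_target, kappa):
    """Success criterion from Eqs. (1)/(7): the summed per-frame L2
    distance between the adversarial outputs and the target reaction
    must stay below the human-determined tolerance kappa."""
    outputs = f(X_adv.unsqueeze(0)).squeeze(0)             # (T, 3N)
    deviation = torch.norm(outputs - Y_target, dim=1).sum()
    return bool(deviation < kappa)
```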
5.2 Effect of the Temporal Constraint

Here, we study the effect of the temporal constraint L_{temporal} defined in (5) on the naturalness of the generated adversarial input action sequence. Specifically, we investigate how the input skeleton sequence changes along the depth axis, as that is the only perturbed dimension throughout our experiments. Our hypothesis is that this additional factor will enable our AIA attack to find adversarial input sequences that change more smoothly with respect to time. We demonstrate visually a comparison between adversarial sequences generated with and without the temporal constraint in Figure 4. The top sequence is an adversarial input sequence generated with the L_{temporal} term removed, whereas the bottom sequence is an adversarial input sequence generated with a scaling factor of λ = 0.1 applied to the L_{temporal} term. In contrast to the previous experiment, we plot the skeletons from the depth-y point of view, as we are more interested in visualizing the perturbation.

Figure 4: Adversarial input action sequences generated by our AIA attack with (bottom row, λ = 0.1) or without (top row) the temporal constraint L_{temporal}.

As shown in Figure 4, the top sequence in general has more abrupt changes in body position between time steps. This almost never happens in the bottom sequence. More specifically, in the bottom sequence, when a larger change to the body posture is necessary, the change is always preceded by smaller changes in the same direction. In contrast, in the top sequence, large changes can take place in just one time step. This type of aggressive change should be avoided as much as possible, as it could make the attack more easily detectable.

6 Experiments

We conduct two sets of experiments to evaluate the effectiveness (white-box attack success rate) and the transferability (black-box attack success rate) of our AIA attack.

6.1 Experimental Setup
Dataset.
We conduct our experiments on the benchmark SBU Kinect Interaction Dataset, which is composed of interactions from eight different categories, namely 'approaching', 'departing', 'kicking', 'punching', 'pushing', 'hugging', 'handshaking', and 'exchanging'. It contains 21 sets of data sampled from 7 participants using a Microsoft Kinect sensor, with approximately 300 interactions in total. Each character's information is encoded into 15 joints with x, y, and depth dimensions. The values of x and y fall within [0, 1], and depth within [0, 7.8125].

In order to extract action sequences and reaction sequences, we partitioned each interaction into two individual sequences, one per character. One sequence is used as the action (input) and the other as the reaction (output). Due to the limited amount of data in this dataset, we trained our response predictors from the perspectives of both characters; that is, we used the skeleton sequences of both characters as input data independently. For each interaction sequence x = x_1 ⌢ x_2 (the concatenation of the two characters' sequences), we create two input/target pairs (x_1, x_2) and (x_2, x_1).
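As a concrete illustration of this pairing scheme (a hypothetical helper, not the authors' preprocessing script):

```python
def make_pairs(interactions):
    """Each interaction holds two aligned skeleton sequences, one per
    character. Using both characters as actor and as reactor doubles
    the number of input/target pairs."""
    pairs = []
    for seq_a, seq_b in interactions:
        pairs.append((seq_a, seq_b))  # character 1 acts, character 2 reacts
        pairs.append((seq_b, seq_a))  # roles swapped
    return pairs
```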
Models and Training.
We adopted one convolutional model, the TCN (Bai, Kolter, and Koltun 2018), and one recurrent model, DeepGRU (Maghoumi and LaViola Jr 2019), and modified them so that the models predict sequences instead of categorical labels. Our TCN model has 10 hidden layers with 256 units in each layer, and our DeepGRU model follows Maghoumi and LaViola Jr (2019) exactly, with the output being a linear layer instead of the attention-classifier framework. We trained each model on the pre-processed dataset for 1,000 epochs using the Adam optimizer with a learning rate of 0.001. We held out sets s01s02, s03s04, s05s02, s06s04 from the original dataset as our test set.
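The modification to both backbones is conceptually the same: drop the classification head and attach a per-frame linear projection back to skeleton space. A hedged sketch (the backbone interface and layer sizes are assumptions consistent with the description above, not the released code):

```python
import torch.nn as nn

class ReactionRegressor(nn.Module):
    """Wrap a sequence backbone (e.g., a 10-layer causal TCN with 256
    units) and replace the classifier with a linear layer that emits
    one 3N-dimensional reactor skeleton per frame."""

    def __init__(self, backbone, hidden_dim=256, num_joints=15):
        super().__init__()
        self.backbone = backbone  # maps (B, T, 3N) -> (B, T, hidden_dim)
        self.head = nn.Linear(hidden_dim, 3 * num_joints)

    def forward(self, x):
        return self.head(self.backbone(x))  # (B, T, 3N) reactor frames
```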
Attack Setting.
In all experiments, we used the same step size α and ran our AIA attack for M = 400 iterations. In addition, we used the Adam optimizer to minimize the adversarial loss function L_adv. The scaling factor λ for the temporal loss term L_{temporal} was set to 0.1. The tolerance factor κ was selected for each target reaction based on our informal user survey in Section 5.1 (the exact values can be found in Table 1).

6.2 Effectiveness of our AIA Attack

In this experiment, we examine the effectiveness of our AIA attack under the white-box setting with different values of the maximum allowed perturbation ε. In order for an attack to be considered successful, it has to satisfy two conditions: 1) the adversarial output sequence needs to be recognizable as the target reaction (related to κ), and 2) the adversarial input sequence needs to be visually similar enough to the natural input sequence that it can circumvent security detection (related to ε). Hence, the smaller the ε the attack can work under, the more effective the attack is.

Without loss of generality, and in order to control the overall change to the input sequence, we perturbed only the depth dimension of each joint. This makes it much easier to visualize perturbations. On a side note, this is a more strictly constrained optimization problem in comparison to the originally proposed problem; the outcome of this experiment is thus applicable to the original problem as well.
We created 8 sets of target reactions, corresponding to all 8 interactions in the SBU Kinect Interaction Dataset. The objective of each set of targets is to change the output reactions of all test data into one specific target reaction. We then perform targeted adversarial attacks based on these objectives over a range of ε values. We consider an attack successful if the sum term in (1), computed on the test datum, is less than the human-determined κ obtained from the sample sets; otherwise, we consider the attack to have failed. The average attack success rates over all 8 target sets under various ε are reported for both models in the left subfigure of Figure 5. We used the κ sampled from the human judges to evaluate attack success rates for objectives 1 to 5. We expect κ to generalize to unseen reactions, so we used the average κ over the 5 objective sets to evaluate the remaining 3 attack objectives.
Figure 5: Average white-box (left) and black-box (right) attack success rate of our AIA attack on TCN and DeepGRU.
Results.
On average, with a perturbation factor ε of 0.225 to 0.3, our AIA attack is able to alter almost all output sequences of the DeepGRU model into any target sequence. On the other hand, a larger ε of 0.375 to 0.45 is necessary for AIA to achieve a similar level of performance on the TCN model. In general, the TCN model is more robust to our attack than the DeepGRU model. However, under this white-box setting, we were able to achieve a 100% attack success rate on almost all target sets for both models. We conclude that when model parameters are available, our AIA attack is very effective against deep sequential regression models. Note that the depth value falls within [0, 7.8125]; this indicates that our AIA algorithm is able to accomplish most attack objectives with small perturbations of 2% to 5% relative to the natural input range. More generally, our attack works for any target sequence, not only those corresponding to specific interactions from the dataset. This enables our attack method to work for both targeted and untargeted goals: for untargeted goals, the attacker simply needs to pick an arbitrary target sequence that is far enough from the original sequence.
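For instance, the effective ε values of 0.225 (DeepGRU) and 0.375 (TCN) correspond to relative perturbations of

```latex
\frac{0.225}{7.8125} \approx 2.9\%, \qquad \frac{0.375}{7.8125} \approx 4.8\%
```

of the depth range, consistent with the 2% to 5% figure above.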
6.3 Transferability of our AIA Attack

In addition to white-box effectiveness, we examine how transferable our attack is. An adversarial example generated on one model is said to be transferable if it can also fool other, independently trained models. In this experiment, we examine the robustness of the TCN model and the DeepGRU model to adversarial examples generated on each other.
Black-box Setting.
We employed the same criterion established in Section 5.1 to determine whether an attack is successful. To evaluate how strong our attack is under the black-box setting, we reused the adversarial input sequences from the previous experiment: we fed all adversarial sequences generated on one model into the other and inspected their effectiveness when used to attack the unseen model. In other words, we fed adversarial sequences generated on the DeepGRU model into the TCN model, and vice versa. The average black-box attack success rates over a range of ε are reported for both models in the right subfigure of Figure 5.
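The transfer evaluation itself is only a few lines (a sketch with illustrative names, reusing attack_succeeded from above):

```python
def transfer_success_rate(adv_examples, target_model, kappa_per_objective):
    """Feed adversarial sequences crafted on a source model into an
    unseen target model and measure how often they still succeed."""
    hits = 0
    for X_adv, Y_target, objective in adv_examples:
        if attack_succeeded(target_model, X_adv, Y_target,
                            kappa_per_objective[objective]):
            hits += 1
    return hits / len(adv_examples)
```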
Results.
Surprisingly, adversarial examples generated from the TCN model are remarkably strong. With an ε value of 0.375 to 0.45, adversarial actions generated from the TCN model successfully fooled the DeepGRU model more than 80% of the time for almost all attack objectives. Along with the results in Section 6.2, this substantiates that our AIA attack has high transferability in addition to being effective. We also observed that adversarial actions generated from the DeepGRU model are rather weak against the TCN model under the black-box setting: they only achieve an average success rate of 30%, irrespective of the maximum perturbation ε permitted. The TCN model is also more robust than DeepGRU in the white-box setting. We suspect that this is because the convolutional layers used in the TCN are more robust than the gated recurrent units of DeepGRU. Specifically, in order to fool the TCN model, the attack needs to take into account the high-level feature maps between the convolutional layers. Adversarial examples generated from the DeepGRU model might not be able to fool the convolutional layers of the TCN because these high-level features were not taken into consideration in the first place. Note that, while being relatively more robust, the TCN also leads to more transferable attacks. We leave further inspection of this disparity as future work.
7 Conclusion

In this paper, we presented a framework for attacking general spatio-temporal regression models. We proposed the Adversarial Interaction Attack (AIA), the first targeted sequential regression attack that is capable of completely altering the entire output sequence. On top of that, we also defined an evaluation metric that can be adopted to evaluate the performance of adversarial attacks on sequential regression problems. We demonstrated, on variants of two previous state-of-the-art action recognition models, TCN and DeepGRU, that our AIA attack is very effective. Additionally, we showed that our AIA attacks are highly transferable when generated from a suitable surrogate model. We also discussed, through three case studies, how AIA might impact interactions between humans and AI in real scenarios. We hope this motivates careful consideration of how to effectively incorporate AI-based agents into human daily life.
References
Andriushchenko, M.; Croce, F.; Flammarion, N.; and Hein, M. 2020. Square Attack: A Query-Efficient Black-Box Adversarial Attack via Random Search. In ECCV, 484–501. Springer.
Bai, S.; Kolter, J. Z.; and Koltun, V. 2018. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. arXiv preprint arXiv:1803.01271.
Bai, Y.; Feng, Y.; Wang, Y.; Dai, T.; Xia, S.-T.; and Jiang, Y. 2019. Hilbert-Based Generative Defense for Adversarial Examples. In ICCV.
Bai, Y.; Zeng, Y.; Jiang, Y.; Wang, Y.; Xia, S.-T.; and Guo, W. 2020. Improving Query Efficiency of Black-Box Adversarial Attack. In ECCV.
Bai, Y.; Zeng, Y.; Jiang, Y.; Xia, S.-T.; Ma, X.; and Wang, Y. 2021. Improving Adversarial Robustness via Channel-wise Activation Suppressing. In ICLR.
Balda, E. R.; Behboodi, A.; and Mathar, R. 2019. Perturbation Analysis of Learning Algorithms: Generation of Adversarial Examples From Classification to Regression. IEEE Transactions on Signal Processing.
Bhagoji, A. N.; He, W.; Li, B.; and Song, D. 2018. Practical Black-Box Attacks on Deep Neural Networks Using Efficient Query Mechanisms. In ECCV, 158–174. Springer.
Biggio, B.; Corona, I.; Maiorca, D.; Nelson, B.; Šrndić, N.; Laskov, P.; Giacinto, G.; and Roli, F. 2013. Evasion Attacks against Machine Learning at Test Time. In ECML/PKDD, 387–402. Springer.
Brown, T. B.; Mané, D.; Roy, A.; Abadi, M.; and Gilmer, J. 2017. Adversarial Patch. arXiv preprint arXiv:1712.09665.
Carlini, N.; and Wagner, D. 2017. Towards Evaluating the Robustness of Neural Networks. In S&P, 39–57.
Chen, P.-Y.; Sharma, Y.; Zhang, H.; Yi, J.; and Hsieh, C.-J. 2018. EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples. In AAAI, volume 32, 10–17.
Chen, P.-Y.; Zhang, H.; Sharma, Y.; Yi, J.; and Hsieh, C.-J. 2017. ZOO: Zeroth Order Optimization Based Black-Box Attacks to Deep Neural Networks without Training Substitute Models. In AISec, 15–26.
Cheng, M.; Yi, J.; Zhang, H.; Chen, P.; and Hsieh, C.-J. 2020. Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples. arXiv preprint arXiv:1803.01128.
Croce, F.; and Hein, M. 2020. Reliable Evaluation of Adversarial Robustness with an Ensemble of Diverse Parameter-Free Attacks. In ICML.
Dong, Y.; Liao, F.; Pang, T.; Su, H.; Zhu, J.; Hu, X.; and Li, J. 2018. Boosting Adversarial Attacks with Momentum. In CVPR, 9185–9193.
Dong, Y.; Pang, T.; Su, H.; and Zhu, J. 2019a. Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks. In CVPR, 4312–4321.
Dong, Y.; Su, H.; Wu, B.; Li, Z.; Liu, W.; Zhang, T.; and Zhu, J. 2019b. Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition. In CVPR, 7714–7722.
Du, Y.; Fu, Y.; and Wang, L. 2015. Skeleton Based Action Recognition with Convolutional Neural Network. In ACPR, 579–583.
Du, Y.; Wang, W.; and Wang, L. 2015. Hierarchical Recurrent Neural Network for Skeleton Based Action Recognition. In CVPR, 1110–1118.
Duan, R.; Ma, X.; Wang, Y.; Bailey, J.; Qin, A. K.; and Yang, Y. 2020. Adversarial Camouflage: Hiding Physical-World Attacks with Natural Styles. In CVPR, 1000–1008.
Eykholt, K.; Evtimov, I.; Fernandes, E.; Li, B.; Rahmati, A.; Xiao, C.; Prakash, A.; Kohno, T.; and Song, D. 2018. Robust Physical-World Attacks on Deep Learning Visual Classification. In CVPR, 1625–1634.
Finlayson, S. G.; Bowers, J. D.; Ito, J.; Zittrain, J. L.; Beam, A. L.; and Kohane, I. S. 2019. Adversarial Attacks on Medical Machine Learning. Science 363(6433): 1287–1289.
Goodfellow, I. J.; Shlens, J.; and Szegedy, C. 2014. Explaining and Harnessing Adversarial Examples. In ICLR.
Ilyas, A.; Engstrom, L.; and Madry, A. 2018. Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors. In ICLR.
Jiang, L.; Ma, X.; Chen, S.; Bailey, J.; and Jiang, Y.-G. 2019. Black-Box Adversarial Attacks on Video Recognition Models. In ACM MM, 864–872.
Jiang, L.; Ma, X.; Weng, Z.; Bailey, J.; and Jiang, Y.-G. 2020. Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness. arXiv preprint arXiv:2006.13726.
Kim, T. S.; and Reiter, A. 2017. Interpretable 3D Human Action Analysis with Temporal Convolutional Networks. In CVPRW, 1623–1631.
Kurakin, A.; Goodfellow, I.; and Bengio, S. 2016. Adversarial Examples in the Physical World. In ICLR.
Li, B.; Li, X.; Zhang, Z.; and Wu, F. 2019a. Spatio-Temporal Graph Routing for Skeleton-Based Action Recognition. In AAAI, volume 33, 8561–8568.
Li, C.; Zhong, Q.; Xie, D.; and Pu, S. 2017. Skeleton-Based Action Recognition with Convolutional Neural Networks. In ICMEW, 597–600.
Li, M.; Chen, S.; Chen, X.; Zhang, Y.; Wang, Y.; and Tian, Q. 2019b. Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition. In CVPR, 3595–3603.
Liu, J.; Akhtar, N.; and Mian, A. 2019. Adversarial Attack on Skeleton-Based Human Action Recognition. arXiv preprint arXiv:1909.06500.
Liu, Y.; Chen, X.; Liu, C.; and Song, D. 2016. Delving into Transferable Adversarial Examples and Black-Box Attacks. arXiv preprint arXiv:1611.02770.
Ma, X.; Niu, Y.; Gu, L.; Wang, Y.; Zhao, Y.; Bailey, J.; and Lu, F. 2020. Understanding Adversarial Attacks on Deep Learning Based Medical Image Analysis Systems. Pattern Recognition.
Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; and Vladu, A. 2018. Towards Deep Learning Models Resistant to Adversarial Attacks. In ICLR.
Maghoumi, M.; and LaViola Jr, J. J. 2019. DeepGRU: Deep Gesture Recognition Utility. In International Symposium on Visual Computing, 16–31. Springer.
Meng, B.; Liu, X.; and Wang, X. 2018. Human Action Recognition Based on Quaternion Spatial-Temporal Convolutional Neural Network and LSTM in RGB Videos. Multimedia Tools and Applications.
Meng, L.; Lin, C.-T.; Jung, T.-P.; and Wu, D. 2019. White-Box Target Attack for EEG-Based BCI Regression Problems. In Neural Information Processing, 476–488. Cham: Springer International Publishing.
Nunez, J. C.; Cabido, R.; Pantrigo, J. J.; Montemayor, A. S.; and Velez, J. F. 2018. Convolutional Neural Networks and Long Short-Term Memory for Skeleton-Based Human Activity and Hand Gesture Recognition. Pattern Recognition 76: 80–94.
Si, C.; Chen, W.; Wang, W.; Wang, L.; and Tan, T. 2019. An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition. In CVPR, 1227–1236.
Su, J.; Vargas, D. V.; and Sakurai, K. 2019. One Pixel Attack for Fooling Deep Neural Networks. IEEE Transactions on Evolutionary Computation.
Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; and Fergus, R. 2013. Intriguing Properties of Neural Networks. In ICLR.
Tramèr, F.; Papernot, N.; Goodfellow, I.; Boneh, D.; and McDaniel, P. 2017. The Space of Transferable Adversarial Examples. arXiv preprint arXiv:1704.03453.
Wang, H.; He, F.; Peng, Z.; Yang, Y.; Shao, T.; Zhou, K.; and Hogg, D. 2019a. SMART: Skeletal Motion Action Recognition aTtack. arXiv preprint arXiv:1911.07107.
Wang, P.; Li, W.; Ogunbona, P.; Wan, J.; and Escalera, S. 2018. RGB-D-Based Human Motion Recognition with Deep Learning: A Survey. Computer Vision and Image Understanding.
Wang, X.; Ren, J.; Lin, S.; Zhu, X.; Wang, Y.; and Zhang, Q. 2021. A Unified Approach to Interpreting and Boosting Adversarial Transferability. In ICLR.
Wang, Y.; Ma, X.; Bailey, J.; Yi, J.; Zhou, B.; and Gu, Q. 2019b. On the Convergence and Robustness of Adversarial Training. In ICML.
Wang, Y.; Zou, D.; Yi, J.; Bailey, J.; Ma, X.; and Gu, Q. 2020. Improving Adversarial Robustness Requires Revisiting Misclassified Examples. In ICLR.
Wu, D.; Wang, Y.; Xia, S.-T.; Bailey, J.; and Ma, X. 2020. Skip Connections Matter: On the Transferability of Adversarial Examples Generated with ResNets. In ICLR.
Wu, D.; Xia, S.-T.; and Wang, Y. 2020. Adversarial Weight Perturbation Helps Robust Generalization. In NeurIPS.
Yan, S.; Xiong, Y.; and Lin, D. 2018. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. arXiv preprint arXiv:1801.07455.
Yun, K.; Honorio, J.; Chattopadhyay, D.; Berg, T. L.; and Samaras, D. 2012. Two-Person Interaction Detection Using Body-Pose Features and Multiple Instance Learning. In CVPRW, 28–35.
Zhang, H.; Yu, Y.; Jiao, J.; Xing, E.; El Ghaoui, L.; and Jordan, M. I. 2019. Theoretically Principled Trade-off between Robustness and Accuracy. In ICML.
Zhang, J.; Li, W.; Ogunbona, P. O.; Wang, P.; and Tang, C. 2016. RGB-D-Based Action Recognition Datasets: A Survey. Pattern Recognition 60: 86–105.
Zheng, T.; Chen, C.; and Ren, K. 2019. Distributionally Adversarial Attack. In AAAI, volume 33, 2253–2260.
Zheng, T.; Liu, S.; Chen, C.; Yuan, J.; Li, B.; and Ren, K. 2020. Towards Understanding the Adversarial Vulnerability of Skeleton-Based Action Recognition. arXiv preprint arXiv:2005.07151.