Towards More Practical Adversarial Attacks on Graph Neural Networks
Black-Box Adversarial Attacks on Graph Neural Networks with Limited Node Access
Jiaqi Ma∗† [email protected]   Shuangrui Ding∗‡ [email protected]   Qiaozhu Mei†‡ [email protected]

∗ Equal contribution. † School of Information, University of Michigan, Ann Arbor, Michigan, USA. ‡ Department of EECS, University of Michigan, Ann Arbor, Michigan, USA. Preprint. Under review.

Abstract
We study black-box attacks on graph neural networks (GNNs) under a novel and realistic constraint: attackers have access to only a subset of nodes in the network, and they can only attack a small number of them. A node selection step is essential under this setup. We demonstrate that the structural inductive biases of GNN models can be an effective source for this type of attacks. Specifically, by exploiting the connection between the backward propagation of GNNs and random walks, we show that the common gradient-based white-box attacks can be generalized to the black-box setting via the connection between the gradient and an importance score similar to PageRank. In practice, we find attacks based on this importance score indeed increase the classification loss by a large margin, but they fail to significantly increase the mis-classification rate. Our theoretical and empirical analyses suggest that there is a discrepancy between the loss and the mis-classification rate, as the latter presents a diminishing-return pattern when the number of attacked nodes increases. Therefore, we propose a greedy procedure to correct the importance score, taking the diminishing-return pattern into account. Experimental results show that the proposed procedure can significantly increase the mis-classification rate of common GNNs on real-world data without access to model parameters or predictions.
1 Introduction

Graph neural networks (GNNs) [20], the family of deep learning models on graphs, have shown promising empirical performance on various applications of machine learning to graph data, such as recommender systems [25], social network analysis [11], and drug discovery [15]. Like other deep learning models, GNNs have also been shown to be vulnerable under adversarial attacks [28], which has recently attracted increasing research interest [8]. Indeed, adversarial attacks have been an efficient tool to analyze both the theoretical properties and the practical accountability of graph neural networks. As graph data have more complex structures than image or text data, researchers have come up with diverse adversarial attack setups. For example, there are different tasks (node classification and graph classification), assumptions about the attacker's knowledge (white-box, grey-box, and black-box), strategies (node feature modification and graph structure modification), and corresponding budget or other constraints (norm of feature changes or number of edge changes). Despite these research efforts, there is still a considerable gap between the existing attack setups and reality. It is unreasonable to assume that an attacker can alter the input of a large proportion of nodes, and even with a budget limit, it is unreasonable to assume that they can attack any node they wish. For example, in a real-world social network, the attackers usually only have access to a few bot accounts, which are unlikely to be among the top nodes in the network; it is difficult for the attackers to hack and alter the properties of celebrity accounts. Moreover, an attacker usually has limited knowledge about the underlying machine learning model used by the platform (e.g., they may roughly know what types of models are used but have no access to the model parameters or training labels). Motivated by this real-world scenario, in this paper we study a new type of black-box adversarial attack for node classification tasks, which is more restricted and more realistic, assuming that the attacker has no access to the model parameters or predictions. Our setup differs from existing work with a novel constraint on node access: attackers only have access to a subset of nodes in the graph, and they can only manipulate a small number of them.

The proposed black-box adversarial attack requires a two-step procedure: 1) selecting a small subset of nodes to attack under the limits of node access; 2) altering the node attributes or edges under a per-node budget. In this paper, we focus on the first step and study the node selection strategy. The key insight of the proposed strategy lies in the observation that, with no access to the GNN parameters or predictions, the strong structural inductive biases of the GNN models can be exploited as an effective information source of attacks. The structural inductive biases encoded by various neural architectures (e.g., the convolution kernel in convolutional neural networks) play important roles in the success of deep learning models. GNNs have even more explicit structural inductive biases due to the graph structure and their heavy weight sharing design.
Theoretical analyses have shown that the understanding of structural inductive biases could lead to better designs of GNN models [23, 10]. From a new perspective, our work demonstrates that such structural inductive biases can turn into security concerns in a black-box attack, as the graph structure is usually exposed to the attackers. Following this insight, we derive a node selection strategy with a formal analysis of the proposed black-box attack setup. By exploiting the connection between the backward propagation of GNNs and random walks, we first generalize the gradient norm in a white-box attack into a model-independent importance score similar to PageRank. In practice, attacking the nodes with high importance scores increases the classification loss significantly but does not generate the same effect on the mis-classification rate. Our theoretical and empirical analyses suggest that this discrepancy is due to the diminishing-return effect of the mis-classification rate. We further propose a greedy correction procedure for calculating the importance scores. Experiments on three real-world benchmark datasets and popular GNN models show that the proposed attack strategy significantly outperforms baseline methods. We summarize our main contributions as follows:

1. We propose a novel setup of black-box attacks for GNNs with a constraint of limited node access, which is by far the most restricted and realistic compared to existing work.
2. We demonstrate that the structural inductive biases of GNNs can be exploited as an effective information source of black-box adversarial attacks.
3. We analyze the discrepancy between classification loss and mis-classification rate and propose a practical greedy method of adversarial attacks for node classification tasks.
4. We empirically verify the effectiveness of the proposed method on three benchmark datasets with popular GNN models.

2 Related Work

The study of adversarial attacks on graph neural networks has surged recently. A taxonomy of existing work has been summarized by Jin et al. [8], and we give a brief introduction here. First, there are two types of machine learning tasks on graphs that are commonly studied, node-level classification and graph-level classification. We focus on node-level classification in this paper. Next, there are a couple of choices of the attack form. For example, the attack can happen either during model training (poisoning) or during model testing (evasion); the attacker may aim to mislead the prediction on specific nodes (targeted attack) [28] or damage the overall task performance (untargeted attack) [27]; the adversarial perturbation can be done by modifying node features, adding or deleting edges, or injecting new nodes [16]. Our work belongs to untargeted evasion attacks. For the adversarial perturbation, most existing works of untargeted attacks apply global constraints on the proportion of node features or the number of edges to be altered. Our work sets a novel local constraint on node access, which is more realistic in practice: perturbation on top (e.g., celebrity) nodes is prohibited and only a small number of nodes can be perturbed.
Finally, depending on the attacker's knowledge about the GNN model, existing work can be split into three categories: white-box attacks [21, 4, 19] have access to full information about the model, including model parameters, input data, and labels; grey-box attacks [27, 28, 16] have partial information about the model, and the exact setups vary in a range; in the most challenging setting, black-box attacks [5, 1, 3] can only access the input data and sometimes the black-box predictions of the model. In this work, we consider an even stricter black-box attack setup, where model predictions are invisible to the attackers. As far as we know, the only existing works that conduct untargeted black-box attacks without access to model predictions are those by Bojchevski and Günnemann [1] and Chang et al. [3]. However, both of them require access to node embeddings, which are prohibited as well in our setup.
While having an extremely restricted black-box setup, we demonstrate that effective adversarial attacks are still possible due to the strong and explicit structural inductive biases of GNNs. Structural inductive biases refer to the structures encoded by various neural architectures, such as the weight sharing mechanisms in convolution kernels of convolutional neural networks, or the gating mechanisms in recurrent neural networks. Such neural architectures have been recognized as a key factor for the success of deep learning models [26], which (partially) motivates some recent developments in neural architecture search [26], Bayesian deep learning [18], the Lottery Ticket Hypothesis [6], etc. The natural graph structure and the heavy weight sharing mechanism grant GNN models even more explicit structural inductive biases. Indeed, GNN models have been theoretically shown to share similar behaviors with Weisfeiler-Lehman tests [13, 22] or random walks [23]. On the positive side, such theoretical analyses have led to better GNN model designs [23, 10]. Our work instead studies the negative impact of the structural inductive biases in the context of adversarial attacks: when the graph structure is exposed to the attacker, such structural information can turn into the knowledge source for an attack. While most existing attack strategies more or less utilize some structural properties of GNNs, they are utilized in a data-driven manner that requires querying the GNN model, e.g., learning to edit the graph via a trial-and-error interaction with the GNN model [5]. We formally establish connections between the structural properties and attack strategies without any queries to the GNN model.
3 Black-Box Attacks with Limited Node Access

In this section, we derive principled strategies to attack GNNs under the novel black-box setup with limited node access. We first analyze the corresponding white-box attack problem in Section 3.2 and then adapt the theoretical insights from the white-box setup to the black-box setup, proposing a black-box attack strategy in Section 3.3. Finally, in Section 3.4, we correct the proposed strategy by taking into account the diminishing-return effect on the mis-classification rate.
We first introduce the necessary notation. We denote a graph as G = (V, E), where V = {1, 2, ..., N} is the set of N nodes, and E ⊆ V × V is the set of edges. For a node classification problem, the nodes of the graph are collectively associated with node features X ∈ R^{N×D} and labels y ∈ {1, 2, ..., K}^N, where D is the dimensionality of the feature vectors and K is the number of classes. Each node i's local neighborhood, including itself, is denoted as N_i = {j ∈ V | (i, j) ∈ E} ∪ {i}, and its degree as d_i = |N_i|. To ease the notation, for any matrix A ∈ R^{D_1×D_2} in this paper, we refer to A_j as the transpose of the j-th row of the matrix, i.e., A_j ∈ R^{D_2}.

GNN models.
Given the graph G, a GNN model is a function f_G : R^{N×D} → R^{N×K} that maps the node features X to output logits for each node. We denote the output logits of all nodes as a matrix H ∈ R^{N×K}, with H = f_G(X). A GNN f_G is usually built by stacking a certain number (L) of layers, with the l-th layer, 1 ≤ l ≤ L, taking the following form:

    H^{(l)}_i = σ( Σ_{j ∈ N_i} α_{ij} W_l H^{(l-1)}_j ),    (1)

where H^{(l)} ∈ R^{N×D_l} is the hidden representation of nodes with D_l dimensions, output by the l-th layer; W_l is a learnable linear transformation matrix; σ is an element-wise nonlinear activation function; and different GNNs have different normalization terms α_{ij}. For instance, α_{ij} = 1/√(d_i d_j) or α_{ij} = 1/d_i in Graph Convolutional Networks (GCN) [9]. In addition, H^{(0)} = X and H = H^{(L)}.
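To make the propagation rule in Eq. (1) concrete, the following is a minimal dense NumPy sketch of an L-layer GCN forward pass. It is not the implementation used in the paper; the function name and arguments are illustrative, and the weight matrices are applied on the right, which is equivalent to Eq. (1) up to a transpose of W_l.

```python
import numpy as np

def gcn_forward(adj, X, weights, alpha="row"):
    """Minimal dense GCN forward pass following Eq. (1).

    adj: (N, N) binary adjacency matrix without self-loops.
    X: (N, D) node features; weights: list of (D_in, D_out) matrices.
    alpha="row" uses alpha_ij = 1/d_i; alpha="sym" uses 1/sqrt(d_i d_j).
    """
    A = adj + np.eye(adj.shape[0])          # add self-loops, as in N_i
    d = A.sum(axis=1)                        # degrees d_i = |N_i|
    if alpha == "row":
        A_hat = A / d[:, None]               # alpha_ij = 1 / d_i
    else:
        d_inv_sqrt = 1.0 / np.sqrt(d)
        A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    H = X
    for l, W in enumerate(weights):
        H = A_hat @ H @ W                    # sum over j of alpha_ij * H_j^{(l-1)} W_l
        if l < len(weights) - 1:             # last layer outputs the logits H
            H = np.maximum(H, 0.0)           # sigma = ReLU
    return H
```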
Random walks. A random walk [12] on G is specified by the matrix of transition probabilities, M ∈ R^{N×N}, where

    M_{ij} = 1/d_i  if (i, j) ∈ E or j = i,  and  M_{ij} = 0  otherwise.

Each M_{ij} represents the probability of transiting from i to j at any given step of the random walk, and powering the transition matrix by t gives the t-step transition matrix M^t.

Given a classification loss L : R^{N×K} × {1, ..., K}^N → R, the problem of white-box attack with limited node access can be formulated as the following optimization problem:

    max_{S ⊆ V}  L(H, y)    (2)
    subject to  |S| ≤ r,  d_i ≤ m  ∀ i ∈ S,  H = f(τ(X, S)),

where r, m ∈ Z_+ respectively specify the maximum number of nodes and the maximum degree of nodes that can be attacked. Intuitively, we treat high-degree nodes as a proxy for celebrity accounts in a social network. For simplicity, we have omitted the subscript G of the learned GNN classifier f_G. The function τ : R^{N×D} × 2^V → R^{N×D} perturbs the feature matrix X based on the selected node set S (i.e., the attack set). Under the white-box setup, theoretically τ can also be optimized to maximize the loss. However, as our goal is to study the node selection strategy under the black-box setup, we set τ as a pre-determined function. In particular, we define the j-th row of the output of τ as τ(X, S)_j = X_j + 1[j ∈ S] ε, where ε ∈ R^D is a small constant noise vector constructed from the attackers' domain knowledge about the features. In other words, the same small noise vector is added to the features of every attacked node.

We use the Carlini-Wagner loss for our analysis, which is a close approximation of the cross-entropy loss and has been used in the analysis of adversarial attacks on image classifiers [2]:

    L(H, y) ≜ Σ_{j=1}^N L_j(H_j, y_j) ≜ Σ_{j=1}^N ( max_{k ∈ {1,...,K}} H_{jk} − H_{j y_j} ).    (3)
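As a small illustration of the quantities just defined, the sketch below builds the transition matrix M with self-loops, its L-step power M^L, and the Carlini-Wagner loss of Eq. (3). This is a minimal NumPy sketch with illustrative names and a toy graph, not code from the paper; labels are 0-indexed in the code.

```python
import numpy as np

def transition_matrix(adj):
    """M_ij = 1/d_i if (i, j) is an edge or j = i, else 0 (row-stochastic)."""
    A = adj + np.eye(adj.shape[0])     # self-loops cover the j = i case
    return A / A.sum(axis=1, keepdims=True)

def cw_loss(H, y):
    """Carlini-Wagner loss of Eq. (3): sum over j of max_k H_jk - H_{j, y_j}."""
    return float(np.sum(H.max(axis=1) - H[np.arange(len(y)), y]))

# Toy example: a 4-node path graph and random logits for K = 3 classes.
adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
M = transition_matrix(adj)
M_L = np.linalg.matrix_power(M, 2)     # two-step transition matrix M^2
rng = np.random.default_rng(0)
H, y = rng.normal(size=(4, 3)), np.array([0, 1, 2, 1])
print(M_L.sum(axis=0))                 # column sums, used later as importance scores
print(cw_loss(H, y))
```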
The change of loss under perturbation. Next we investigate how the overall loss changes when we select and perturb different nodes. We define the change of loss when perturbing node i as a function of the perturbed feature vector x:

    Δ_i(x) = L(f(X′), y) − L(f(X), y),  where X′_i = x and X′_j = X_j, ∀ j ≠ i.

To concretize the analysis, we consider the GCN model with α_{ij} = 1/d_i in our following derivations. Suppose f is an L-layer GCN. With the connection between GCN and random walks [23] and Assumption 1 on the label distribution, we can show that, in expectation, the first-order Taylor approximation

    Δ̃_i(x) ≜ Δ_i(X_i) + (∇_x Δ_i(X_i))^T (x − X_i)

is related to the sum of the i-th column of the L-step random walk transition matrix M^L. We formally summarize this finding in Proposition 1.

Assumption 1 (Label Distribution). Assume the distribution of the labels of all nodes follows the same constant categorical distribution, i.e.,
    Pr[y_j = k] = q_k,  ∀ j = 1, 2, ..., N,

where 0 < q_k < 1 for k = 1, 2, ..., K and Σ_{k=1}^K q_k = 1. Moreover, since the classifier f has been well-trained and fixed, the prediction of f should capture certain relationships among the K classes. Specifically, we assume the chance of f predicting any node j as any class k ∈ {1, ..., K}, conditioned on the node label y_j = l ∈ {1, ..., K}, follows a certain distribution p(k | l), i.e.,

    Pr[ (argmax_{c ∈ {1,...,K}} H_{jc}) = k | y_j = l ] = p(k | l).

Proposition 1. For an L-layer GCN model, if Assumption 1 and a technical assumption about the GCN (made by Xu et al. [23], which we list as Assumption 5 in Appendix A.1) hold, then

    δ_i ≜ E[ Δ̃_i(x) | x = τ(X, {i})_i ] = C Σ_{j=1}^N [M^L]_{ji},

where C is a constant independent of i.

Now we turn to the black-box setup, where we have no access to the model parameters or predictions. This means we are no longer able to evaluate the objective function L(H, y) of the optimization problem (2). Proposition 1 shows that the relative ratio δ_i/δ_j between different nodes i ≠ j only depends on the random walk transition matrix, which we can easily calculate based on the graph G. This implies that we can still approximately optimize problem (2) in the black-box setup.
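The following paragraph turns this observation into a concrete selection rule; a minimal NumPy sketch of computing the column-sum scores of Proposition 1 and picking the top-r nodes under the degree limit m might look as follows (the function and argument names are ours, not from the paper):

```python
import numpy as np

def rwcs_select(adj, r, m, L):
    """Select r attack nodes by the score I_i = sum_j [M^L]_{ji}, subject to d_i <= m."""
    A = adj + np.eye(adj.shape[0])                # self-loops, so M is row-stochastic
    M = A / A.sum(axis=1, keepdims=True)          # M_ij = 1/d_i on edges and the diagonal
    scores = np.linalg.matrix_power(M, L).sum(axis=0)   # column sums of M^L
    degrees = A.sum(axis=1)                       # d_i = |N_i| (includes the node itself)
    candidates = np.flatnonzero(degrees <= m)     # respect the degree limit m
    ranked = candidates[np.argsort(-scores[candidates])]
    return ranked[:r]
```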
Node selection with importance scores. Consider the change of loss under the perturbation of a set of nodes S. If we write the change of loss as a function of the perturbed features and take the first-order Taylor expansion, which we denote as δ, we have δ = Σ_{i ∈ S} δ_i. Therefore δ is maximized by the set of r nodes with degrees at most m and the largest possible δ_i, where m, r are the limits of node access defined in problem (2). We can thus define an importance score for each node i as the sum of the i-th column of M^L, i.e., I_i = Σ_{j=1}^N [M^L]_{ji}, and simply select the nodes with the highest importance scores to attack. We denote this strategy as RWCS (Random Walk Column Sum). We note that RWCS is similar to PageRank; the difference is that the latter uses the stationary transition matrix M^∞ of a random walk with restart.

Empirically, RWCS indeed significantly increases the classification loss (as shown in Section 4.2). The nonlinear loss actually increases linearly w.r.t. the perturbation strength (the norm of the perturbation noise ε) over a wide range, which indicates that Δ̃_i is a good approximation of Δ_i. Surprisingly, however, RWCS fails to continue to increase the mis-classification rate (which matters more in real applications) when the perturbation strength becomes larger. Details of this empirical finding are shown in Figure 1 in Section 4.2. We conduct additional formal analyses of the mis-classification rate in the following section and find a diminishing-return effect of adding more nodes to the attack set when the perturbation strength is adequate. Our analysis is based on investigating whether each target node i ∈ V will be mis-classified as we increase the attack set.

To assist the analysis, we first define the concepts of vulnerable function and vulnerable set below.

Definition 1 (Vulnerable Function). We define the vulnerable function g_i : 2^V → {0, 1} of a target node i ∈ V as follows: for a given attack set S ⊆ V,

    g_i(S) = 1 if i is mis-classified when attacking S, and g_i(S) = 0 if i is correctly classified when attacking S.

Definition 2 (Vulnerable Set). We define the vulnerable set of a target node i ∈ V as the set of all attack sets that could lead i to being mis-classified: A_i ≜ {S ⊆ V | g_i(S) = 1}.

We also make the following assumption about the vulnerable function.
Assumption 2. g_i is non-decreasing for all i ∈ V, i.e., if T ⊆ S ⊆ V, then g_i(T) ≤ g_i(S).

With the definitions above, the mis-classification rate can be written as the average of the vulnerable functions: h(S) = (1/N) Σ_{i=1}^N g_i(S). By Assumption 2, h is also clearly non-decreasing. We further define the basic vulnerable set to characterize the minimal attack sets that can lead a target node to being mis-classified.

Definition 3 (Basic Vulnerable Set). ∀ i ∈ V, we call B_i ⊆ A_i a basic vulnerable set of i if:
1) ∅ ∉ B_i; if ∅ ∈ A_i, then B_i = ∅;
2) if ∅ ∉ A_i, for any nonempty S ∈ A_i, there exists a T ∈ B_i s.t. T ⊆ S;
3) for any distinct S, T ∈ B_i, |S ∩ T| < min(|S|, |T|).

The existence of such a basic vulnerable set is guaranteed by Proposition 2.
Proposition 2.
For any i ∈ V, there exists a unique B_i.

The distribution of the sizes of the element sets of B_i is closely related to the perturbation strength on the features. When the perturbation is small, we may have to perturb multiple nodes before the target node is mis-classified, and thus the element sets of B_i will be large. When the perturbation is relatively large, we may be able to turn a target node mis-classified by perturbing a single node, if chosen wisely; in this case B_i will contain many singleton sets.

Our following analysis (Proposition 3) shows that h has a diminishing-return effect if the vulnerable sets of nodes on the graph present homophily (Assumption 3), which is common in real-world networks, and the perturbation on features becomes considerably large (Assumption 4).

Assumption 3 (Homophily). ∀ S ∈ ∪_{i=1}^N A_i with |S| > 1, there are b(S) ≥ 1 nodes s.t., for any node j among these nodes, S ∈ A_j. Intuitively, the vulnerable sets present strong homophily if the b(S)'s are large.

Assumption 4 (Considerable Perturbation). ∀ S ∈ ∪_{i=1}^N A_i with |S| > 1, there are ⌈p(S) · b(S)⌉ nodes s.t., for any node j among these nodes, there exists a set T ⊆ S with |T| = 1 and T ∈ A_j, where r/(r+1) < p(S) ≤ 1.
If Assumptions 3 and 4 hold, h is γ-approximately submodular for some 0 < γ < 1/r, i.e., there exists a non-decreasing submodular function h̃ : 2^V → R_+ s.t. ∀ S ⊆ V,

    (1 − γ) h̃(S) ≤ h(S) ≤ (1 + γ) h̃(S).

As greedy methods are guaranteed to enjoy a constant approximation ratio for such approximately submodular functions [7], Proposition 3 motivates us to develop a greedy correction procedure to compensate for the diminishing-return effect when calculating the importance scores.
The greedy correction procedure.
We propose an iterative node selection procedure and apply two greedy correction steps on top of the RWCS strategy, motivated by Assumptions 3 and 4. To accommodate Assumption 3, after each node is selected into the attack set, we exclude a k-hop neighborhood of the selected node from the next iteration, for a given constant integer k. The intuition is that nodes in a local neighborhood may contribute to similar target nodes due to homophily. To accommodate Assumption 4, we adopt an adaptive version of the RWCS scores. First, we binarize the L-step random walk transition matrix M^L into M̃, i.e.,

    [M̃]_{ij} = 1 if [M^L]_{ij} is among the top-l entries of the i-th row [M^L]_i and [M^L]_{ij} ≠ 0, and [M̃]_{ij} = 0 otherwise,    (4)

where l is a given constant integer. Next, we define a new adaptive influence score as a function of a matrix Q: Ĩ_i(Q) = Σ_{j=1}^N [Q]_{ji}. In the iterative node selection procedure, we initialize Q as M̃ and repeatedly select the node with the highest score Ĩ_i(Q). After each iteration, supposing we have selected node i in this iteration, we update Q by setting to zero all the rows whose elements in the i-th column are 1. The underlying assumption of this operation is that adding i to the selected set is likely to mis-classify all the target nodes corresponding to those rows, which complies with Assumption 4. We name this iterative procedure the GC-RWCS (Greedily Corrected RWCS) strategy and summarize it in Algorithm 1 in Appendix A.3; a small code sketch of the binarization and score update follows below.

Finally, we want to mention that, while the derivation of RWCS and GC-RWCS requires knowledge of the number of layers L of the GCN, we find that the empirical performance of the proposed attack strategies is not sensitive to the choice of L. Therefore, the proposed methods are applicable to the black-box setup where we do not know the exact L of the model.
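The following is a minimal NumPy sketch of the binarization in Eq. (4) and of one score-update step. The names are illustrative and this is not the authors' code; the k-hop exclusion step is omitted here and appears in the full Algorithm 1 sketch in Appendix A.3.

```python
import numpy as np

def binarize_topl(M_L, l):
    """Eq. (4): keep, per row, the l largest nonzero entries of M^L as ones."""
    M_bin = np.zeros_like(M_L)
    for i in range(M_L.shape[0]):
        top = np.argsort(-M_L[i])[:l]          # indices of the l largest entries in row i
        top = top[M_L[i, top] > 0]             # only keep genuinely nonzero entries
        M_bin[i, top] = 1.0
    return M_bin

def pick_and_update(Q):
    """One GC-RWCS step: pick argmax_i of the column sums, then zero covered rows."""
    scores = Q.sum(axis=0)                     # adaptive score I~_i(Q) = sum_j [Q]_{ji}
    z = int(np.argmax(scores))                 # selected node
    covered = Q[:, z] == 1                     # target rows assumed to be mis-classified
    Q = Q.copy()
    Q[covered, :] = 0.0                        # remove their contribution to future scores
    return z, Q
```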
4 Experiments

We evaluate the proposed attack strategies on two common GNN models, GCN [9] and JK-Net [23]. For JK-Net, we test its two variants, JKNetConcat and JKNetMaxpool, which apply concatenation and element-wise max at the last layer, respectively. We set the number of layers of GCN as 2 and the number of layers of both JKNetConcat and JKNetMaxpool as 7. The hidden size of each layer is 32. For training, we closely follow the hyper-parameter setup in Xu et al. [23].
Datasets.
We adopt three citation networks, Citeseer, Cora, and Pubmed, which are standard node classification benchmark datasets [24]. Following the setup of JK-Net [23], we randomly split each dataset into training, validation, and test sets, and we draw 40 random splits.

Baseline methods for comparison.
As summarized in Section 2, our proposed black-box adversarial attack setup is by far the most restricted, and none of the existing attack strategies for GNNs can be applied. We therefore compare the proposed attack strategies with baselines that select the nodes with top centrality metrics. We compare with three well-known network metrics capturing different aspects of node centrality, Degree, Betweenness, and PageRank, and name the attack strategies correspondingly. In the classical network analysis literature [14], real-world networks are shown to be fragile under attacks on high-centrality nodes. Therefore, we believe these centrality metrics serve as reasonable baselines under our restricted black-box setup. As a sanity check, we also include a trivial baseline, Random, which randomly selects the nodes to be attacked.
Hyper-parameters for GC-RWCS.
For the proposed GC-RWCS strategy, we fix the number of steps L = 4, the neighbor-hop parameter k = 1, and the parameter l = 30 for the binarized M̃ in Eq. (4) for all models on all datasets. Note that L = 4 differs from the number of layers of both GCN and JK-Nets in our experiments, yet we still achieve effective attack performance. We also conduct a sensitivity analysis in Appendix A.5 and demonstrate that the proposed method is not sensitive w.r.t. L.

Nuisance parameters of the attack procedure.
For each dataset, we fix the limit on the number of nodes to attack, r, as a small fixed percentage of the graph size. After the node selection step, we also need to specify how to perturb the node features, i.e., the design of ε in the τ function in the optimization problem (2). In a real-world scenario, ε should be designed with domain knowledge about the classification task, without access to the GNN models. In our experiments, we have to simulate this domain knowledge due to the lack of semantic meaning of each individual feature in the benchmark datasets. Formally, we construct the constant perturbation ε ∈ R^D as follows: for j = 1, 2, ..., D,

    ε_j = λ · sign( Σ_{i=1}^N ∂L(H, y)/∂X_{ij} )  if  j ∈ arg top-J( [ |Σ_{i=1}^N ∂L(H, y)/∂X_{il}| ]_{l=1,2,...,D} ),  and  ε_j = 0  otherwise,    (5)

where λ is the magnitude of the modification. We fix J as a small fixed fraction of D for all datasets. While gradients of the model are involved, we emphasize that we only use extremely limited information from the gradients: determining a small number of important features and the binary direction in which to perturb each selected feature, only at the global level by averaging gradients over all nodes. We believe such coarse information is usually available from domain knowledge about the classification task. The perturbation magnitude for each feature is fixed as a constant λ and is irrelevant to the model. In addition, the same perturbation vector is added to the features of all the selected nodes. The construction of the perturbation is thus totally independent of the selected nodes; a small sketch of this construction is given after Figure 1 below.

We first provide empirical evidence for the discrepancy between the classification loss (cross-entropy) and the mis-classification rate. We compare the RWCS strategy to baseline strategies with varying perturbation strength as measured by λ in Eq. (5). The results shown in Figure 1 are obtained by attacking GCN on Citeseer. First, we observe that RWCS increases the classification loss almost linearly as λ increases, indicating that our approximation of the loss by a first-order Taylor expansion works well in practice. Not surprisingly, RWCS performs very similarly to PageRank, and RWCS performs much better than the other centrality metrics in increasing the classification loss, showing the effectiveness of Proposition 1. However, the decrease in classification accuracy under attack by RWCS (and PageRank) quickly saturates as λ increases. The GC-RWCS strategy, which is proposed to correct the importance scores, decreases the classification accuracy the most as λ becomes larger, even though it increases the classification loss the least.

Figure 1: Experiments of attacking GCN on Citeseer with increasing perturbation strength λ. (a) Loss on the test set; (b) accuracy on the test set. Results are averaged over 40 random trials and error bars indicate the standard error of the mean.
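For completeness, a minimal NumPy sketch of the perturbation construction in Eq. (5) is shown below. The names are illustrative; the array `grad_sum`, with shape (D,), stands in for Σ_i ∂L/∂X_{ij}, which in practice would come from coarse domain knowledge rather than from the attacked model.

```python
import numpy as np

def build_perturbation(grad_sum, J, lam):
    """Eq. (5): perturb only the J globally most influential features by +/- lam."""
    eps = np.zeros_like(grad_sum)
    top_J = np.argsort(-np.abs(grad_sum))[:J]      # arg top-J of |sum_i dL/dX_ij|
    eps[top_J] = lam * np.sign(grad_sum[top_J])    # same signed constant for every node
    return eps

# Usage: add the same vector eps to the feature rows of all selected nodes, e.g.
# X[selected_nodes] += build_perturbation(grad_sum, J, lam)
```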
Full experiment results. We then provide the full experiment results of attacking GCN, JKNetConcat, and JKNetMaxpool on all three datasets in Table 1. The perturbation strength is set as λ = 1. The two threshold settings indicate that we set the limit on the maximum degree m as the lowest degree among the corresponding top fraction of nodes.

The results clearly demonstrate the effectiveness of the proposed GC-RWCS strategy. GC-RWCS achieves the best attack performance in almost all experiment settings, and the difference from the second-best strategy is significant in almost all cases. It is also worth noting that the proposed GC-RWCS strategy is able to decrease the node classification accuracy by a large margin, and GC-RWCS achieves a larger decrease of the accuracy than the Random baseline in most cases (see Table 4 in Appendix A.5). And this is achieved by merely adding the same constant perturbation vector to the features of a small fraction of the nodes in the graph. This verifies that the explicit structural inductive biases of GNN models make them vulnerable even in the extremely restricted black-box attack setup.

Table 1: Summary of the attack performance on Cora, Citeseer, and Pubmed for GCN, JKNetConcat, and JKNetMaxpool, comparing None, Random, the centrality baselines, RWCS, and GC-RWCS. The lower the accuracy (in %) the better the attack. The bold marker denotes the best performance. The asterisk (*) means the difference between the best strategy and the second-best strategy is statistically significant by a t-test at significance level 0.05. The error bar (±) denotes the standard error of the mean over 40 independent trials.

5 Conclusion

In this paper, we propose a novel black-box adversarial attack setup for GNN models with a constraint of limited node access, which we believe is by far the most restricted and realistic black-box attack setup. Nonetheless, through both theoretical analyses and empirical experiments, we demonstrate that the strong and explicit structural inductive biases of GNN models make them still vulnerable to this type of adversarial attacks. We also propose a principled attack strategy, GC-RWCS, based on our theoretical analyses of the connection between the GCN model and random walks, which corrects for the diminishing-return effect of the mis-classification rate. Our experimental results show that the proposed strategy significantly outperforms competing attack strategies under the same setup.

Broader Impact
For the potential positive impacts, we anticipate that this work may raise public attention to the security and accountability issues of graph-based machine learning techniques, especially when they are applied to real-world social networks. Even without access to any information about the model training, the graph structure alone can be exploited to damage a deep learning framework with a rather executable strategy.

On the potential negative side, as our work demonstrates that there is a chance to attack existing GNN models effectively with no knowledge but the simple graph structure, this may pose a serious alert to technology companies who maintain the platforms and operate various applications based on the graphs. However, we believe making this security concern transparent can help practitioners detect potential attacks in this form and better defend the machine-learning-driven applications.
References

[1] Aleksandar Bojchevski and Stephan Günnemann. Adversarial attacks on node embeddings via graph poisoning. arXiv preprint arXiv:1809.01093, 2018.
[2] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.
[3] Heng Chang, Yu Rong, Tingyang Xu, Wenbing Huang, Honglei Zhang, Peng Cui, Wenwu Zhu, and Junzhou Huang. A restricted black-box adversarial framework towards attacking graph embedding models. In AAAI Conference on Artificial Intelligence, 2020.
[4] Jinyin Chen, Yangyang Wu, Xuanheng Xu, Yixian Chen, Haibin Zheng, and Qi Xuan. Fast gradient attack on network embedding. arXiv preprint arXiv:1809.02797, 2018.
[5] Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. Adversarial attack on graph structured data. arXiv preprint arXiv:1806.02371, 2018.
[6] Jonathan Frankle and Michael Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635, 2018.
[7] Thibaut Horel and Yaron Singer. Maximization of approximately submodular functions. In Advances in Neural Information Processing Systems, pages 3045–3053, 2016.
[8] Wei Jin, Yaxin Li, Han Xu, Yiqi Wang, and Jiliang Tang. Adversarial attacks and defenses on graphs: A review and empirical study. arXiv preprint arXiv:2003.00653, 2020.
[9] Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
[10] Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. Predict then propagate: Graph neural networks meet personalized PageRank. arXiv preprint arXiv:1810.05997, 2018.
[11] Cheng Li, Jiaqi Ma, Xiaoxiao Guo, and Qiaozhu Mei. DeepCas: An end-to-end predictor of information cascades. In Proceedings of the 26th International Conference on World Wide Web, pages 577–586, 2017.
[12] László Lovász et al. Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2(1):1–46, 1993.
[13] Christopher Morris, Martin Ritzert, Matthias Fey, William L Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and Leman go neural: Higher-order graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4602–4609, 2019.
[14] Mark Newman. Networks. Oxford University Press, 2018.
[15] Chence Shi, Minkai Xu, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, and Jian Tang. GraphAF: a flow-based autoregressive model for molecular graph generation. arXiv preprint arXiv:2001.09382, 2020.
[16] Yiwei Sun, Suhang Wang, Xianfeng Tang, Tsung-Yu Hsieh, and Vasant Honavar. Node injection attacks on graphs via reinforcement learning. arXiv preprint arXiv:1909.06543, 2019.
[17] Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, et al. Deep Graph Library: Towards efficient and scalable deep learning on graphs. arXiv preprint arXiv:1909.01315, 2019.
[18] Andrew Gordon Wilson. The case for Bayesian deep learning. arXiv preprint arXiv:2001.10995, 2020.
[19] Huijun Wu, Chen Wang, Yuriy Tyshetskiy, Andrew Docherty, Kai Lu, and Liming Zhu. Adversarial examples for graph data: Deep insights into attack and defense. In IJCAI, 2019.
[20] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020.
[21] Kaidi Xu, Hongge Chen, Sijia Liu, Pin-Yu Chen, Tsui-Wei Weng, Mingyi Hong, and Xue Lin. Topology attack and defense for graph neural networks: An optimization perspective. arXiv preprint arXiv:1906.04214, 2019.
[22] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826, 2018.
[23] Keyulu Xu, Chengtao Li, Yonglong Tian, Tomohiro Sonobe, Ken-ichi Kawarabayashi, and Stefanie Jegelka. Representation learning on graphs with jumping knowledge networks. arXiv preprint arXiv:1806.03536, 2018.
[24] Zhilin Yang, William W Cohen, and Ruslan Salakhutdinov. Revisiting semi-supervised learning with graph embeddings. arXiv preprint arXiv:1603.08861, 2016.
[25] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L Hamilton, and Jure Leskovec. Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 974–983, 2018.
[26] Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
[27] Daniel Zügner and Stephan Günnemann. Adversarial attacks on graph neural networks via meta learning. In International Conference on Learning Representations (ICLR), 2019.
[28] Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. Adversarial attacks on neural networks for graph data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2847–2856, 2018.
Appendix
A.1 Proof of Proposition 1
We first remind the reader of some notation: a GCN model is denoted as a function f, the feature matrix is X ∈ R^{N×D}, and the output logits are H = f(X) ∈ R^{N×K}. The L-step random walk transition matrix is M^L. More details can be found in Section 3.1.

We give in Lemma 1 the connection between GCN models and random walks. Lemma 1 relies on a technical assumption about the GCN model (Assumption 5) and the proof can be found in Xu et al. [23].

Assumption 5 (Xu et al. [23]). All paths in the computation graph of the given GCN model are independently activated with the same probability of success ρ.

Lemma 1 (Xu et al. [23]). Given an L-layer GCN with averaging as α_{ij} = 1/d_i in Eq. (1), assume that all paths in the computation graph of the model are activated with the same probability of success ρ (Assumption 5). Then, for any nodes i, j ∈ V,

    E[ ∂H_j / ∂X_i ] = ρ · ( Π_{l=L}^{1} W_l ) [M^L]_{ji},    (6)

where W_l is the learnable parameter of the l-th layer.

Then we are able to prove Proposition 1 below.
Proof.
First, we derive the gradient of the loss L(H, y) w.r.t. the feature X_i of node i:

    ∇_{X_i} L(H, y) = ∇_{X_i} Σ_{j=1}^N L_j(H_j, y_j) = Σ_{j=1}^N ∇_{X_i} L_j(H_j, y_j) = Σ_{j=1}^N ( ∂H_j / ∂X_i )^T ∂L_j(H_j, y_j) / ∂H_j,    (7)

where H_j is the j-th row of H transposed as a column vector and y_j is the true label of node j. Note that ∂L_j(H_j, y_j)/∂H_j ∈ R^K and ∂H_j/∂X_i ∈ R^{K×D}.

Next, we plug Eq. (7) into Δ̃_i(x) | x = τ(X, {i})_i. For simplicity, we write Δ̃_i(x) | x = τ(X, {i})_i as Δ̃_i in the rest of the proof:

    Δ̃_i = ( ∇_{X_i} L(H, y) )^T ε = Σ_{j=1}^N ( ∂L_j(H_j, y_j) / ∂H_j )^T ( ∂H_j / ∂X_i ) ε.    (8)

Denote a_j ≜ ∂L_j(H_j, y_j)/∂H_j ∈ R^K. From the definition of the loss, L_j(H_j, y_j) = max_{k ∈ {1,...,K}} H_{jk} − H_{j y_j}, we have

    a_{jk} = −1, if k = y_j and y_j ≠ argmax_{c ∈ {1,...,K}} H_{jc};
    a_{jk} = 1, if k ≠ y_j and k = argmax_{c ∈ {1,...,K}} H_{jc};
    a_{jk} = 0, otherwise,

for k = 1, 2, ..., K. Under Assumption 1, the expectation of each element of a_j is

    E[a_{jk}] = −q_k (1 − p(k | k)) + Σ_{w=1, w≠k}^K p(k | w) q_w,  k = 1, 2, ..., K,

which is a constant independent of H_j and y_j. Therefore, we can write E[a_j] = c, ∀ j = 1, 2, ..., N, where c ∈ R^K is a constant vector independent of j.

Taking the expectation of Eq. (8) and plugging in the result of Lemma 1,

    E[Δ̃_i] ≈ E[ Σ_{j=1}^N ( ∂L_j(H_j, y_j) / ∂H_j )^T ( ∂H_j / ∂X_i ) ε ]
           = Σ_{j=1}^N E[a_j]^T ( ρ Π_{l=L}^{1} W_l [M^L]_{ji} ) ε
           = ( ρ c^T Π_{l=L}^{1} W_l ε ) Σ_{j=1}^N [M^L]_{ji}
           = C Σ_{j=1}^N [M^L]_{ji},

where C = ρ c^T ( Π_{l=L}^{1} W_l ) ε is a constant scalar independent of i.

A.2 Proofs for Propositions in Section 3.4

Proof of Proposition 2.
Proof. If A_i = ∅, then B_i ⊆ A_i so B_i = ∅, and the three conditions of Definition 3 are trivially true. Below we investigate the case A_i ≠ ∅.

The existence is given by a constructive proof. We check the nonempty elements in A_i one by one in any order. If an element is a superset of any other element in A_i, we skip it; otherwise, we put it into B_i. Then we verify that the resulting B_i is a basic vulnerable set for i. Clearly B_i ⊆ A_i. For condition 1), clearly ∅ ∉ B_i, and if ∅ ∈ A_i, all nonempty elements in A_i are skipped so B_i = ∅. For condition 2), given ∅ ∉ A_i, for any nonempty S ∈ A_i, if S ∈ B_i, the condition holds. If S ∉ B_i, by construction there exists a nonempty strict subset S_1 ⊂ S with S_1 ∈ A_i. If S_1 ∈ B_i, the condition holds. If S_1 ∉ B_i, we can similarly find a nonempty strict subset S_2 ⊂ S_1 with S_2 ∈ A_i. Recursively, we get a chain S ⊃ S_1 ⊃ S_2 ⊃ ⋯. As S is finite, we eventually obtain a set S_k that has no strict subset in A_i, so S_k ∈ B_i, and the condition holds. Condition 3) means that no set in B_i is a subset of another set in B_i; this holds by construction.

Now we prove the uniqueness. Suppose there are two distinct basic vulnerable sets B_i ≠ C_i. Without loss of generality, assume S ∈ B_i but S ∉ C_i. Since B_i ≠ ∅, we have ∅ ∉ A_i. Further S ∈ A_i, hence C_i ≠ ∅. As S ∈ B_i ⊆ A_i, S ≠ ∅, and C_i satisfies condition 2), there is a nonempty T ∈ C_i s.t. T ⊂ S. If T ∈ B_i, then condition 3) is violated for B_i. If T ∉ B_i, there is a nonempty T′ ∈ B_i s.t. T′ ⊂ T. But T′ ⊂ S also violates condition 3). By contradiction we prove the uniqueness.

In order to prove Proposition 3, we first construct a submodular function that is close to h, with the help of Lemma 2 below.

Lemma 2. If ∀ i ∈ V, B_i is either empty or only contains singleton sets, then h is submodular.

Proof. We first prove the case where ∀ i ∈ V, A_i ≠ ∅.

First, we show that ∀ i ∈ V with A_i ≠ ∅, for any nonempty S ⊆ V, g_i(S) = 1 if and only if B_i = ∅ or ∃ T ∈ B_i, T ⊆ S. On one hand, if g_i(S) = 1, then S ∈ A_i. If ∅ ∈ A_i, then B_i = ∅. If ∅ ∉ A_i, by condition 2) of the basic vulnerable set, ∃ T ∈ B_i, T ⊆ S. On the other hand, if ∃ T ∈ B_i, T ⊆ S, then g_i(T) = 1, and by Assumption 2, g_i(S) ≥ g_i(T), so g_i(S) = 1. If B_i = ∅, as A_i ≠ ∅, then ∅ ∈ A_i (otherwise condition 2) of Definition 3 would be violated), so g_i(∅) = 1. Still by Assumption 2, g_i(S) ≥ g_i(∅), so g_i(S) = 1.

Define a function e : V → 2^V s.t. for any node i ∈ V, e(i) = {j ∈ V | {i} ∈ B_j}. Given that B_i is either empty or only contains singleton sets for any i ∈ V, for any nonempty S ⊆ V,

    h(S) = (1/N) Σ_{i=1}^N g_i(S)    (9)
         = (1/N) |{ j ∈ V | B_j = ∅ or ∃ T ∈ B_j, T ⊆ S }|
         = (1/N) |{ j ∈ V | B_j = ∅ or ∃ {i} ∈ B_j, i ∈ S }|
         = (1/N) |{ j ∈ V | B_j = ∅ or ∃ i ∈ S, {i} ∈ B_j }|
         = (1/N) ( |∪_{i ∈ S} e(i)| + |{ j ∈ V | B_j = ∅ }| ).

|{ j ∈ V | B_j = ∅ }| is a constant independent of S. Therefore, maximizing h(S) over S with |S| ≤ r is equivalent to maximizing |∪_{i ∈ S} e(i)| over S with |S| ≤ r, which is a maximum coverage problem. Therefore h is submodular.

The case allowing some nodes to have empty vulnerable sets can be easily proved by removing such nodes in Eq. (9), as their corresponding vulnerable functions always equal zero.
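To illustrate the maximum-coverage form of Eq. (9), a tiny Python sketch with made-up basic vulnerable sets (the mapping e and the counts below are purely illustrative, not from the paper) is shown below:

```python
def misclassification_rate(attack_set, e, n_empty, N):
    """h(S) of Eq. (9): coverage of the e(i) sets plus nodes with empty B_j."""
    covered = set().union(*(e.get(i, set()) for i in attack_set)) if attack_set else set()
    return (len(covered) + n_empty) / N

# Toy example: e(i) maps each candidate attack node to the targets it mis-classifies.
e = {0: {1, 2}, 3: {2, 4}, 5: {6}}
print(misclassification_rate({0, 3}, e, n_empty=1, N=8))  # (|{1, 2, 4}| + 1) / 8 = 0.5
```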
Proof of Proposition 3.

For simplicity, we assume A_i ≠ ∅ for any i ∈ V. The proof below can easily be adapted to the general case without this assumption by removing the nodes with empty vulnerable sets, similarly to the proof of Lemma 2.

Proof. ∀ i ∈ V, define B̃_i ≜ {S ∈ B_i | |S| = 1}. We can then define a new group of vulnerable sets Ã_i on V for i ∈ V. Let

    Ã_i = 2^V, if B_i = ∅;
    Ã_i = ∅, if B_i ≠ ∅ but B̃_i = ∅;
    Ã_i = {S ⊆ V | ∃ T ∈ B̃_i, T ⊆ S}, otherwise.

Then it is clear that B̃_i is a valid basic vulnerable set corresponding to Ã_i, for i ∈ V. If we define g̃_i : 2^V → {0, 1} as

    g̃_i(S) = 1, if B_i = ∅ or ∃ T ∈ B̃_i, T ⊆ S; and g̃_i(S) = 0, otherwise,

we can easily verify that g̃_i is a valid vulnerable function corresponding to Ã_i, for i ∈ V. Further let h̃ : 2^V → R_+ be

    h̃(S) = (1/N) Σ_{i=1}^N g̃_i(S).

By Lemma 2, as ∀ i ∈ V, B̃_i is either empty or only contains singleton sets, we know h̃ is submodular.

Next we investigate the difference between h and h̃. First, for any S ⊆ V, if S ∉ ∪_{i=1}^N A_i, clearly h(S) = h̃(S) = 0; if |S| ≤ 1, it is easy to show h(S) = h̃(S). Second, for any S ∈ ∪_{i=1}^N A_i with |S| > 1, by Assumption 3, there are exactly b (omitting the S in b(S)) nodes whose vulnerable set contains S. Without loss of generality, assume the indices of these b nodes are 1, 2, ..., b. Then, for any node i > b, g_i(S) = 0 and g̃_i(S) = 0. For nodes i = 1, 2, ..., b, g_i(S) = 1, and

    g̃_i(S) = 1, if B_i = ∅ or ∃ T ⊆ S with |T| = 1 and T ∈ B̃_i; and g̃_i(S) = 0, otherwise.

By Assumption 4, there are at least ⌈pb⌉ (omitting the S in p(S)) nodes j s.t. g̃_j(S) = 1. Therefore, h(S) = b/N and ⌈pb⌉/N ≤ h̃(S) ≤ b/N. Hence

    1 − 1/r < 1 ≤ h(S)/h̃(S) ≤ 1/p < 1 + 1/r.

A.3 Algorithm Details of GC-RWCS

We summarize the GC-RWCS strategy in Algorithm 1.
Algorithm 1:
The GC-RWCS Strategy for Node Selection.
Input: number-of-nodes limit r; maximum degree limit m; neighbor hops k; binarized transition matrix M̃; the adaptive influence score function Ĩ_i, ∀ i ∈ V.
Output: the set S to be attacked.
Initialize the candidate set P = {i ∈ V | d_i ≤ m}, and the score matrix Q = M̃;
Initialize S = ∅;
for t = 1, 2, ..., r do
    z ← argmax_{i ∈ P} Ĩ_i(Q);
    S ← S ∪ {z};
    P ← P \ {i ∈ P | shortest-path(i, z) ≤ k};
    q ← Q_{·,z};
    for i ∈ V do
        if q_i is 1 then set the i-th row of Q to 0;
return S;
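A compact Python sketch of Algorithm 1 might look as follows. It is an illustrative reimplementation under the stated notation, not the authors' released code; it assumes an undirected binary adjacency matrix and reuses the Eq. (4) binarization, with the paper's default hyper-parameters as default arguments.

```python
import numpy as np

def gc_rwcs(adj, r, m, L=4, k=1, l=30):
    """Greedily Corrected RWCS node selection (Algorithm 1), as a dense sketch."""
    N = adj.shape[0]
    A = adj + np.eye(N)
    M = A / A.sum(axis=1, keepdims=True)          # random walk transition matrix
    M_L = np.linalg.matrix_power(M, L)
    # Eq. (4): binarize M^L by keeping the top-l nonzero entries of each row.
    Q = np.zeros_like(M_L)
    for i in range(N):
        top = np.argsort(-M_L[i])[:l]
        Q[i, top[M_L[i, top] > 0]] = 1.0

    degrees = A.sum(axis=1)                       # d_i = |N_i|
    candidates = set(np.flatnonzero(degrees <= m))
    # Nodes within k hops are exactly those reachable via powers of (adj + I).
    reach = np.linalg.matrix_power(A, k) > 0
    selected = []
    for _ in range(r):
        if not candidates:
            break
        scores = Q.sum(axis=0)                    # adaptive score I~_i(Q), column sums
        z = max(candidates, key=lambda i: scores[i])
        selected.append(int(z))
        candidates -= set(np.flatnonzero(reach[z]))   # drop nodes within k hops of z
        Q[Q[:, z] == 1, :] = 0.0                  # zero rows "covered" by node z
    return selected
```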
A.4 Additional Experiment Details

Datasets. We adopt the Deep Graph Library [17] versions of Cora, Citeseer, and Pubmed in our experiments. The summary statistics of the datasets are given in Table 2. The number of edges does not include self-loops.

Table 2: Summary statistics of datasets.
Dataset    Nodes    Edges    Classes    Features
Citeseer   3,327    4,552    6          3,703
Cora       2,708    5,278    7          1,433
Pubmed     19,717   44,324   3          500
A.5 Additional Experiment Results