A comparative study of similarity-based and GNN-based link prediction approaches
Md Kamrul Islam, Sabeur Aridhi, and Malika Smail-Tabbone
Universite de Lorraine, CNRS, Inria, LORIA, 54000 Nancy, France
{kamrul.islam, sabeur.aridhi, malika.smail}@loria.fr

Abstract.
The task of inferring missing links in a graph based on its current structure is referred to as link prediction. Link prediction methods based on pairwise node similarity are well-established approaches in the literature. They show good prediction performance in many real-world graphs, although they are heuristics and lack universal applicability. On the other hand, the success of neural networks for classification tasks in various domains has led researchers to study them on graphs. A neural network that can operate directly on a graph is termed a graph neural network (GNN). GNNs are able to learn hidden features from graphs, which can be used for the link prediction task. Link prediction based on GNNs has gained much attention from researchers due to its convincingly high performance on many real-world graphs. This appraisal paper studies several similarity-based and GNN-based link prediction approaches in the domain of homogeneous graphs, which consist of a single type of (attributed) nodes and a single type of pairwise links. We evaluate the studied approaches on several benchmark graphs with different properties from various domains.
Keywords:
Neural network · Homogeneous graph · Graph labelling · Node embedding.
One of the most interesting and long-standing problems in the field of graph mining is link prediction, which predicts the probability of a link between two unconnected nodes based on available information in the current graph, such as node attributes or graph structure [1]. The prediction of missing or potential links helps us toward a deep understanding of the structure, evolution and functions of real-world complex graphs [2]. Some applications of link prediction include friend recommendation in social networks [3], product recommendation in e-commerce [4], knowledge graph completion [5], and finding interactions between proteins [6]. A large category of link prediction methods is based on heuristics that measure the proximity between nodes to predict whether they are likely to have a link. Though these heuristics can predict links with high accuracy in many graphs, they lack universal applicability to different kinds of graphs. For example, the common neighbour heuristic assumes that two nodes are more likely to connect if they have many common neighbours. This assumption may be correct in social networks, but it is shown to fail in protein-protein interaction (PPI) networks, where two proteins sharing many common neighbours are actually less likely to interact [7]. When using these heuristics, one must manually choose different heuristics for different graphs based on prior beliefs or an expensive trial-and-error process. On the other hand, learning-based link prediction approaches are able to learn suitable heuristics from the graph itself. The success of neural networks for machine learning tasks is well known in many real-world applications such as image classification [8], speech recognition [9], video processing [10], and natural language processing [11]. These applications can represent their data in Euclidean space, and a neural network is able to extract hidden features from that data space.
However, neural networks cannot be applied directly in the graph domain due to two important challenges [12]. Firstly, a graph contains unordered nodes and a variable number of neighbours for each node. Secondly, the assumption of data independence no longer holds for graphs, as each node is linked to some other nodes. The first attempt to study neural networks in the graph domain was made in [14]. Since then, Graph Neural Networks (GNNs) have become a powerful tool for learning hidden features in graphs. In recent decades, researchers have developed many GNN-based methods for several tasks such as graph classification [15], node classification [16], and link prediction [17]. In this paper, we first introduce the link prediction problem and highlight similarity-based and GNN-based methods. Then, we choose a few approaches from both link prediction categories to evaluate their performance on different types of graphs, namely simple or homogeneous graphs and node-attributed graphs. We compare their performance with respect to prediction accuracy and computational time.
Consider an undirected graph at a particular time t where nodes represent entities and links represent the relationships between pairs of entities (or nodes). The link prediction problem is defined as discovering or inferring a set of missing links (existing but not observed) in the graph at time t + ∆t. The problem can be illustrated with a simple undirected graph in Fig. 1, where circles represent nodes and lines represent links between pairs of nodes. Black solid lines represent observed links and red dashed lines represent missing links in the current graph. Fig. 1a shows the snapshot of the graph at time t, where two missing links exist between node pairs (x, y) and (g, i). The link prediction problem aims to predict the appearance of these two missing links as observed links in the graph in the near future t + ∆t, as illustrated in Fig. 1b.

Fig. 1: Illustration of the link prediction problem: (a) graph at time t, (b) graph at time t + ∆t.

The similarity-based approach is the most commonly used approach for link prediction; it is based on the assumption that two nodes in a graph interact if they are similar. Defining similarity is a crucial and non-trivial task that varies from domain to domain and even from graph to graph within the same domain [18]. As a result, numerous similarity-based approaches have been proposed in the literature to predict links in small to large graphs. Similarity-based approaches that use local neighbourhood information to compute the similarity score are known as local approaches. Another category is global approaches, which use the global topological information of the graph. The computational complexity of global approaches makes them unfeasible to apply to large graphs, as they use global structural information such as the adjacency matrix [18].
For this reason, we consider only the local similarity-based approaches in the current study. We have studied 13 popular similarity-based approaches for link prediction. Table 1 summarizes the approaches with their basic principle and similarity function. These approaches, except CCLP, use node degree, common neighbourhood, or links among the common neighbourhood to compute similarity scores. CCLP uses the clustering coefficient (CC) of each common neighbour to compute its contribution to the similarity score. The clustering coefficient is defined as the ratio of the number of triangles to the expected number of triangles passing through a node. If t_z is the number of triangles passing through node z and Γ_z is the neighbourhood of z, then the clustering coefficient CC_z of node z is defined as

CC_z = (2 × t_z) / (|Γ_z| × (|Γ_z| − 1))    (1)

Overall, these local similarity-based approaches, except PA, work well when the graph has a high number of common neighbours between pairs of nodes. However, SA, HDI and LLHN suffer from an outlier case when one of the two nodes has no neighbour. In addition, some approaches such as JA, SO and HPI suffer from an outlier case when both nodes have no neighbour.
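To make these heuristics concrete, the following is a minimal Python sketch of a few of the local scores discussed above (CN, JA, AA, RA, PA); the function names and the toy graph are ours, not from the studied papers:

```python
import math
from collections import defaultdict

def neighbours(edges):
    """Build an adjacency map (node -> neighbour set) from an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def scores(adj, x, y):
    """Local similarity scores for the node pair (x, y)."""
    cn = adj[x] & adj[y]                      # common neighbours
    union = adj[x] | adj[y]
    return {
        "CN": len(cn),
        "JA": len(cn) / len(union) if union else 0.0,
        # skip degree-1 common neighbours to avoid division by log(1) = 0
        "AA": sum(1.0 / math.log(len(adj[z])) for z in cn if len(adj[z]) > 1),
        "RA": sum(1.0 / len(adj[z]) for z in cn),
        "PA": len(adj[x]) * len(adj[y]),
    }
```

On the small graph {(0,1), (0,2), (1,2), (1,3), (2,3)}, the pair (0, 3) has two common neighbours, so CN = 2 and PA = 2 × 2 = 4.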
A graph neural network (GNN) is an extension of the neural network that can be applied to graph data. A GNN computes a node representation based on the available node information. Each node aggregates information from its neighbours to compute its final representation, and this representation is fed into a multi-layer neural network for downstream tasks such as node classification, link prediction and graph classification.

Table 1: Summary of the studied similarity-based approaches. The similarity function is defined to predict a link between two nodes x and y. Γ_x and Γ_y denote the neighbour sets of nodes x and y respectively; r_{x,y} denotes the link between nodes x and y.

Approach | Principle | Similarity function
Adamic-Adar (AA) [3] | Variation of CN where each common neighbour is logarithmically penalized by its degree | S_AA(x, y) = Σ_{z ∈ Γ_x ∩ Γ_y} 1 / log|Γ_z|
Common Neighbours (CN) [19] | Two nodes are more likely to be linked if they share more neighbours | S_CN(x, y) = |Γ_x ∩ Γ_y|
Resource Allocation (RA) [20] | Based on the resource allocation process, further penalizing high-degree common neighbours | S_RA(x, y) = Σ_{z ∈ Γ_x ∩ Γ_y} 1 / |Γ_z|
Preferential Attachment (PA) [21] | Based on the rich-get-richer concept, where the link probability between two high-degree nodes is higher than between two low-degree nodes | S_PA(x, y) = |Γ_x| × |Γ_y|
Jaccard Index (JA) [22] | Normalization of CN where the score is penalized for each non-common neighbour | S_JA(x, y) = |Γ_x ∩ Γ_y| / |Γ_x ∪ Γ_y|
Salton Index (SA) [23] | Motivated by cosine similarity: link probability based on the cosine angle between the adjacency vectors of the node pair | S_SA(x, y) = |Γ_x ∩ Γ_y| / sqrt(|Γ_x| × |Γ_y|)
Sørensen Index (SO) [24] | Describes the overall proportion of common neighbours from a local perspective | S_SO(x, y) = 2|Γ_x ∩ Γ_y| / (|Γ_x| + |Γ_y|)
Hub Promoted Index (HPI) [25] | Promotes link formation between high-degree nodes and hubs | S_HPI(x, y) = |Γ_x ∩ Γ_y| / max(|Γ_x|, |Γ_y|)
Hub Depressed Index (HDI) [25] | Promotes link formation between low-degree nodes and hubs | S_HDI(x, y) = |Γ_x ∩ Γ_y| / min(|Γ_x|, |Γ_y|)
Local Leicht-Holme-Newman (LLHN) [26] | Uses both the actual and the expected number of common neighbours of a node pair to define their similarity | S_LLHN(x, y) = |Γ_x ∩ Γ_y| / (|Γ_x| × |Γ_y|)
Individual Attraction (IA) [27] | Maximizes the likelihood of link formation for highly interlinked node pairs | S_IA(x, y) = Σ_{z ∈ Γ_x ∩ Γ_y} (|r_{z, Γ_x ∩ Γ_y}| + 2) / |Γ_z|
Cannistraci-Alanis-Ravasi (CAR) [28] | Uses level-2 links along with common-neighbourhood information to compute the pairwise similarity score | S_CAR(x, y) = Σ_{z ∈ Γ_x ∩ Γ_y} |Γ_x ∩ Γ_y ∩ Γ_z|
Clustering Coefficient-based Link Prediction (CCLP) [29] | Quantifies the contribution of each common neighbour using its local clustering coefficient | S_CCLP(x, y) = Σ_{z ∈ Γ_x ∩ Γ_y} CC_z

Based on the architecture, GNNs are broadly categorized into four categories: recurrent graph neural networks (RecGNNs), convolutional graph neural networks (ConvGNNs), graph auto-encoders (GAEs), and spatial-temporal graph neural networks (STGNNs) [12].
RecGNNs are the pioneers of GNNs; they work on the assumption that the nodes constantly exchange information with their neighbours until a stable state is reached. Motivated by the convolution operation of neural networks in the image domain, ConvGNNs compute the embedding of a node by aggregating its own information and its neighbours' information. GAEs are the unsupervised version of GNNs; they encode the nodes into a latent vector space and reconstruct the graph to learn the embeddings. STGNNs are used to learn hidden features in a spatio-temporal graph based on spatial and temporal dependencies over time. Recently, researchers have studied the attention mechanism in RecGNNs and ConvGNNs to improve prediction performance by allowing them to focus on the most relevant parts of the graph [13]. ConvGNNs have become popular in recent years due to their efficient graph convolution operation [12,30]. In this paper, we focus on link prediction approaches based on ConvGNNs. A ConvGNN starts by defining the neighbourhood Γ_{v_i} of each node v_i in the graph G(V, E), which is a crucial task as it can affect both accuracy and computational time. Some popular neighbourhood definitions include immediate neighbours, multi-hop neighbours [17,31], and sampling-based neighbours [32,33]. The feature vector x_i of each node v_i is then computed based on its attribute and structural information. The feature vectors of the nodes are fed into a stack of layers to learn the hidden features of the graph. A simple ConvGNN updates the node representations in each layer in the following three basic steps [30,34]:

1. Computation of neural messages:
The neural message of each link for the next layer is computed based on the current representations of both end nodes of the link. If h_i^l and h_j^l are the current representations of a node pair (v_i, v_j), the message of the link is defined as

m_ij^{l+1} = MSG(h_i^l, h_j^l, r_ij)    (2)

Here, l denotes the current layer, r_ij ∈ E is the relation between the node pair, and MSG is the message computation function for the links. Many GNN models use the link type [35] or link weight [36] to encode r_ij. The initial representations of nodes v_i and v_j are x_i and x_j respectively (i.e. h_i = x_i and h_j = x_j).

2. Aggregating the neighbour information:
The next operation of the layer is to aggregate the neighbour messages m_ij^{l+1} for node v_i. An aggregation function is defined as

M_i^{l+1} = AGGR({m_ij^{l+1} : v_j ∈ Γ_{v_i}})    (3)

Here, Γ_{v_i} is the set of neighbours of node v_i and AGGR is the aggregation function. Some popular aggregation functions in the literature include mean/max pooling [37], sort pooling [38], and permutation-invariant aggregation [39].
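Combined with the update step of Eq. (4) below, these operations define one layer. The following is a minimal pure-Python sketch under our own simplifying assumptions (a linear message function, mean aggregation, and a ReLU update); the weight matrices W_msg and W_upd are illustrative, not from any specific model:

```python
def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def conv_layer(h, neigh, W_msg, W_upd):
    """One simplified ConvGNN layer: message, aggregate (mean), update."""
    out = []
    for i, h_i in enumerate(h):
        # step 1: compute a message from each neighbour j (cf. Eq. 2)
        msgs = [matvec(W_msg, h[j]) for j in neigh[i]]
        # step 2: mean-aggregate the neighbour messages (cf. Eq. 3)
        if msgs:
            m_i = [sum(col) / len(msgs) for col in zip(*msgs)]
        else:
            m_i = [0.0] * len(W_msg)
        # step 3: update from own embedding and aggregated message (cf. Eq. 4),
        # here via a linear map of the concatenation followed by ReLU
        out.append([max(v, 0.0) for v in matvec(W_upd, h_i + m_i)])
    return out
```

Stacking several such layers and feeding the final embeddings to a classifier yields the generic ConvGNN pipeline described in the text.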
3. Updating the node representation:
In this step, the representation or embedding of node v_i in the next layer is updated based on the current embedding h_i^l and the aggregated message M_i^{l+1}:

h_i^{l+1} = UPDATE(h_i^l, M_i^{l+1})    (4)

Here, UPDATE is a non-linear function such as the sigmoid, rectified linear unit (ReLU), or hyperbolic tangent (TanH). The output embedding h_i^{l+1} is the input for the next layer. Each layer in the model follows these three steps and generates node embeddings. The embeddings from the last layer are fed into a standard classifier, such as a multilayer perceptron (MLP) with a softmax layer, for downstream tasks. The parameters of the classifier are optimized by backpropagation using an optimizer such as Adam or stochastic gradient descent (SGD) and a loss function such as cross-entropy, mean absolute error (MAE) or mean squared error (MSE).

There are many link prediction approaches based on ConvGNNs in the literature. Most of them are applicable to homogeneous graphs, and a few are applicable to heterogeneous graphs, which consist of multiple types of nodes and links, node and link attributes, and multiple links between pairs of nodes. We study two recent GNN-based link prediction approaches that are applicable to homogeneous graphs only, as our study is confined to those graphs. The first one is WLNM (Weisfeiler-Lehman Neural Machine), which uses only the structural information of nodes for the link prediction task. SEAL (learning from Sub-graphs, Embeddings and Attributes) is the second one, which uses the structural, latent and attribute information of nodes for the same task. The approaches are briefly described below.

Weisfeiler-Lehman Neural Machine (WLNM)
Based on the well-known Weisfeiler-Lehman canonical labelling algorithm [40], Zhang & Chen developed a link prediction approach for graphs called the Weisfeiler-Lehman Neural Machine (WLNM) [32]. WLNM learns structural features from the graph and uses them in the prediction task. WLNM is a three-step link prediction approach that starts with extracting sub-graphs, then labels and encodes the nodes, and ends with training and evaluating a neural network. Fig. 2 illustrates the training process of WLNM with one existent link (A, B) and one non-existent link (c, d). The three steps of the WLNM link prediction approach are described as follows.

1.
Sub-graph extraction:
WLNM starts by extracting the k-vertex neighbouring sub-graph of a link, called the enclosing sub-graph. k is a user-defined parameter that sets the size of the sub-graph. For a given link, 1-hop neighbours are added to the sub-graph, then 2-hop neighbours and so on, until the number of neighbours is greater than or equal to k. If there are k′ nodes in the sub-graph such that k′ > k, then the k′ − k nodes with the highest hop numbers are removed from the sub-graph.

2. Node labelling and encoding:
Fig. 2: Illustration of the WLNM approach [32]

Weisfeiler-Lehman (WL) is a popular graph labelling algorithm that uses the concept of a signature string for each node to compute node labels [40]. Instead of using the classical WL algorithm, WLNM develops a hashing-based colour refinement process for faster node labelling. If two nodes still have the same label, WLNM uses the Nauty canonical labelling algorithm to break the tie [41]. The nodes are sorted according to their labels in increasing order, and an upper-triangular adjacency matrix is computed.

3.
Neural network training and evaluation:
WLNM uses a fully connected multi-layer perceptron (MLP) neural network to learn structural features from the sub-graph. The output layer of the MLP is a softmax layer that classifies the link into two classes (existent and non-existent). The upper-triangular adjacency matrix of the sub-graph is vectorized and fed into the MLP to train and evaluate the WLNM approach. The neural network is trained on both existent and non-existent links. WLNM is a simple GNN-based link prediction approach that is able to learn link prediction heuristics from a graph. In contrast to similarity-based heuristics, WLNM has the property of universal applicability. However, WLNM truncates some neighbours to limit the number of nodes in the sub-graph to a user-defined size, and the truncated neighbours may be informative for the prediction task.
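The sub-graph extraction step can be sketched as follows. This is our illustrative reading of the procedure (hop-by-hop growth around the link, then truncation of the highest-hop nodes down to k); the real method breaks truncation ties using the node labels, which we omit here:

```python
def enclosing_subgraph(adj, x, y, k):
    """Grow the neighbourhood of link (x, y) hop by hop until at least
    k nodes are collected, then truncate the highest-hop nodes."""
    nodes = {x: 0, y: 0}          # node -> hop number
    frontier = [x, y]
    hop = 0
    while len(nodes) < k and frontier:
        hop += 1
        nxt = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in nodes:
                    nodes[v] = hop
                    nxt.append(v)
        frontier = nxt
    # keep the k nodes with the smallest hop numbers
    return sorted(nodes, key=lambda v: nodes[v])[:k]
```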
Learning from Sub-graphs, Embeddings and Attributes (SEAL)
Zhang et al. [17] developed a ConvGNN-based link prediction approach named SEAL to learn from the latent and explicit features of nodes along with the structural information of the graph. Unlike WLNM, SEAL is able to handle neighbourhoods of variable size. SEAL replaces the fully connected neural network of WLNM with a graph neural network to learn the graph features efficiently. The overall architecture of the approach is shown in Fig. 3. Like WLNM, SEAL also consists of three major steps, which are described as follows:

1.
Sub-graph extraction and node labelling:
Like WLNM, the SEAL approach uses the concept of a local sub-graph instead of the whole graph for a link in the prediction task. SEAL defines the sub-graph of a link as its h-hop neighbourhood, built by the union of the h-hop neighbourhoods of the two end nodes of the link.

Fig. 3: Architecture of the SEAL approach

For example, a 1-hop enclosing sub-graph contains all first-order or immediate neighbours, while a 2-hop enclosing sub-graph contains all first-order and second-order neighbours. In Fig. 3, the sub-graph for link (A, B) consists of 7 nodes (5 neighbours and the 2 end nodes) and the sub-graph for link (C, D) consists of 8 nodes (6 neighbours and the 2 end nodes). The authors show that setting a small h can still provide good prediction performance. Then a unique label is assigned to each node of the sub-graph to indicate its importance in the prediction task. SEAL introduces a new node labelling algorithm, DRNL (double-radius node labelling), based on the topological distances of a node from both ends of the link in the sub-graph.

2.
Node information matrix construction:
The information matrix of a node in SEAL is defined based on its structural label (structural feature), embedding (latent feature) and attributes (explicit feature). A one-hot encoding technique is applied to the labelled sub-graph to compute the structural vector of each node. The structural feature vector of the node is then concatenated with its latent feature vector. The latent feature is the low-dimensional latent representation/embedding of a node, which can be obtained by factorizing the adjacency matrix of the graph. SEAL uses the Node2Vec [42] algorithm to learn the latent feature vector of each node in the sub-graph. The last part of the information vector of the node is the explicit feature vector, which is computed from the continuous or discrete attributes of the node, again using one-hot encoding.

3.
Neural network training and evaluation:
The learned node information matrix of the sub-graph is fed into a GNN called DGCNN (Deep Graph Convolutional Neural Network) [38] to perform the link prediction task. DGCNN consists of a propagation-based convolution layer and an aggregation layer that aggregates the neighbours' information vectors. DGCNN uses a sort-pooling layer to unify the size of the representation of the sub-graph. SEAL exploits the available information in the graph to improve prediction performance. However, SEAL is limited to homogeneous graphs, though many real-world graphs are heterogeneous. Moreover, the use of latent features increases the computational time of SEAL.
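As an illustration of the node labelling idea in step 1, the following simplified sketch labels each sub-graph node by its pair of shortest-path distances to the two ends of the target link. The actual DRNL additionally removes the opposite end node before each distance computation and maps the pair to a single integer label; we keep the plain distance pair for brevity:

```python
from collections import deque

def bfs_dist(adj, src, nodes):
    """Shortest-path distances from src, restricted to the sub-graph nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in nodes and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def drnl_labels(adj, nodes, x, y):
    """Label each node by (distance to x, distance to y)."""
    dx = bfs_dist(adj, x, nodes)
    dy = bfs_dist(adj, y, nodes)
    inf = float("inf")
    return {v: (dx.get(v, inf), dy.get(v, inf)) for v in nodes}
```

Nodes that are topologically close to both ends of the candidate link receive small distance pairs, which signals their importance to the prediction.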
We perform a comparative study of the above-discussed similarity-based and GNN-based link prediction approaches on graphs from different domains. To evaluate and describe the performance of the link prediction approaches, we choose ten benchmark graphs from different areas: Ecoli [43], FB15K [44], NS [45], PB [46], Power [47], Router [48], USAir [49], WN18 [50], YAGO3-10 [51], and Yeast [52]. Ecoli and Yeast are two biological graphs that represent, respectively, the biological relations between operons in the Escherichia coli bacterium and protein-protein interactions in yeast. The PB (Political Blog) graph represents the network among political blog pages in the US, where the blog pages are the nodes and hyperlinks between blog pages are the links of the graph; we consider the original directed links as undirected. The Net Science (NS) graph represents a collaboration network of researchers who publish papers on network science. Power is an electrical grid network of the western US describing high-voltage transmission among generators, transformers and substations. The Router graph represents the router-level Internet, where each router has an identifier and undirected links to other routers. The USAir graph represents the US air transportation system, which consists of attributed nodes (airports) and links between pairs of airports. FB15K, WN18 and YAGO3-10 are simplified knowledge graphs. The original FB15K is a Freebase knowledge graph which was extracted from Wikidata and DBPedia. This knowledge graph contains 540188 triples, where each triple consists of the identifiers of two Freebase entities and the name of the relationship between them. WN18 is another knowledge graph, a large lexical graph of English. The last knowledge graph is YAGO3-10, which was prepared at the Max Planck Institute for Computer Science in Saarbrücken in 2015.
These knowledge graphs consist of subject-relationship type-object triples. However, as most of the studied approaches are applicable to homogeneous graphs only, we simplify these knowledge graphs by ignoring the types of relationships and reducing multiple links to a single link between nodes/entities. All of the graphs are considered undirected. In this study, we treat them as large graphs rather than knowledge graphs. We use the Gephi tool [53] to extract the topological statistics of the graphs. The characteristics of the graph datasets are summarized in Table 2. Based on the number of nodes, the graphs are categorized into small/medium graphs with at most 10000 nodes and large graphs with more than 10000 nodes.
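The simplification described above amounts to dropping relation types and merging duplicate and reciprocal triples into a single undirected edge per entity pair; a minimal sketch (the function name is ours):

```python
def simplify_triples(triples):
    """Reduce (subject, relation, object) triples to an undirected,
    homogeneous edge set: relation types ignored, duplicates merged."""
    edges = set()
    for head, _rel, tail in triples:
        if head != tail:  # ignore self-loops
            edges.add((min(head, tail), max(head, tail)))
    return edges
```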
We follow a random-sampling validation protocol to evaluate the performance of the studied approaches [32,54]. The train and test datasets are prepared from a graph G(V, E), where V is the set of vertices and E is the set of existent links. Both positive and negative versions of the training and test datasets are prepared from the graph. The positive training dataset contains a randomly selected 90% of the observed links, and an equal number of non-existent links form the negative training dataset. The remaining 10% of the existent links form the positive test dataset, and an equal number of non-existent links form the negative test dataset. At the same time, the graph connectivity of the training set and the test set is guaranteed. We prepare five train and five test datasets for evaluating the performance of the approaches.

Table 2: Topological statistics of the graph datasets: number of nodes (|V|), average node degree, and clustering coefficient.

For evaluating the similarity-based approaches, the graph is built from the positive training dataset, whereas for the graph neural network-based approaches, the graph is built from the original graph, which contains both the positive train and test datasets. However, a link is temporarily removed from the graph when training the GNN-based approaches on it or predicting its existence. The approaches are evaluated on the positive and negative test datasets. For WLNM we set the neighbourhood size to 10, and for SEAL we set the hop number to 1, for all graphs. The similarity scores of the similarity-based approaches for test links are computed on the training graphs, which contain only training links. The performance of a link prediction approach is quantified using two standard evaluation metrics, precision and AUC (Area Under the Curve). All of the approaches are run on a Dell Latitude 5400 machine with 32GB of primary memory and a Core i7 (1.90GHz) processor.
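The sampling protocol described above can be sketched as follows; this version does not enforce the training-graph connectivity mentioned in the text, and all names are illustrative:

```python
import random

def split_links(nodes, links, test_frac=0.1, seed=42):
    """Split observed links 90/10 into positive train/test sets and
    sample equal numbers of non-existent links as negative sets."""
    rng = random.Random(seed)
    links = list(links)
    rng.shuffle(links)
    n_test = int(len(links) * test_frac)
    pos_test, pos_train = links[:n_test], links[n_test:]
    existing = set(links)

    def sample_negatives(n):
        neg = set()
        while len(neg) < n:
            u, v = rng.sample(nodes, 2)
            edge = (min(u, v), max(u, v))
            if edge not in existing:     # keep only non-existent links
                neg.add(edge)
        return sorted(neg)

    return (pos_train, pos_test,
            sample_negatives(len(pos_train)), sample_negatives(len(pos_test)))
```

Repeating this with five different seeds yields the five train/test splits used in the evaluation.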
Precision describes the fraction of missing links that are accurately predicted as existent [55,56,57]. To compute the precision, all predicted links from a test set are ranked in decreasing order of their scores. If L_r is the number of existing links (from the positive test set) among the top-L ranked predicted links, then precision is defined as

Precision = L_r / L    (5)

Precision is a measure of result relevance: the higher the precision, the higher the accuracy of the prediction approach. An ideal prediction approach has a precision of 1.0, meaning that all the missing links are accurately predicted. We set L to the number of existent links in the test set. The AUC metric, on the other hand, measures the ability of an approach to distinguish between existent and non-existent links. It is defined as the probability that a randomly chosen missing link has a higher similarity score than a randomly chosen non-existent link [56]. Suppose n pairs of existent and non-existent links are drawn from the positive and negative test sets. If n′ is the number of times the existent link has a higher score than the non-existent link and n″ is the number of times the two scores are equal, then AUC is defined as

AUC = (n′ + 0.5 × n″) / n    (6)

An AUC greater than 0.5 indicates that the prediction index performs better than choosing links at random, and vice versa. Generally, the degree to which the AUC exceeds 0.5 indicates how good the prediction approach is. We consider half of the total links in the positive and negative test sets to compute the AUC. The prediction approaches are evaluated on each of the five train/test splits of each graph, and the performance metrics (precision, AUC) are recorded. The maximum and minimum similarity scores are computed from the top-L links for each test set of each graph.
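The two metrics can be computed directly from their definitions in Eqs. (5) and (6); a short sketch (function names are ours):

```python
def precision_at_L(scored_links, positives, L):
    """Fraction of the top-L scored links that are true positives (Eq. 5)."""
    top = sorted(scored_links, key=lambda t: t[1], reverse=True)[:L]
    return sum(1 for link, _score in top if link in positives) / L

def auc(pos_scores, neg_scores):
    """Probability that a random positive link outscores a random
    negative link, counting ties as 0.5 (Eq. 6)."""
    wins = ties = 0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1
            elif p == q:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))
```

This exhaustive pairwise AUC matches the sampled definition in expectation; in practice a fixed number of random pairs is drawn instead.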
Table 3 shows the mean maximum (Max Score) and minimum (Min Score) similarity scores for each similarity-based approach on each graph. We measure precision in two different ways based on the top-L test links. Firstly, we use Equation 5 as is, where L_r is the number of positive links among the top-L test links. However, the minimum similarity scores of many similarity-based approaches are very low (close to 0), which makes it difficult to separate some positive and negative test links. To overcome this problem, we define a threshold when computing L_r. However, setting a threshold for similarity-based approaches is again a non-trivial task, as the maximum and minimum scores vary for different graphs and even for different test sets. We therefore define the threshold as the average of the maximum and minimum scores in the top-L links, and we count as L_r the positive test links in the top-L whose similarity scores are above the threshold. We compute the threshold-based precision only for the similarity-based approaches, as the GNN-based approaches learn the threshold themselves. The threshold-based precision is shown in parentheses in Table 3; each value in the table is the mean over the five test sets. The precision and AUC values of the studied approaches on the seven small to medium-size graphs are tabulated in Table 3. Table 3 shows that, overall, the similarity-based approaches give high precision (without a threshold) and AUC values in well-connected (high clustering coefficient, high node degree) graphs, while the GNN-based approaches show good precision and AUC in all graphs. In the Ecoli graph, CCLP shows the highest precision (0.96), while the lowest precision (0.78) is recorded for the PA approach. The precisions of the other similarity-based approaches are close to the highest score. The high clustering coefficient contributes to the success of CCLP in terms of precision in Ecoli.
However, the precision of the similarity-based approaches drops drastically when computed with the threshold, as many positive links have very low similarity scores (even 0) compared to the threshold. The precisions of the WLNM and SEAL approaches are lower than those of the similarity-based approaches, at 0.867 and 0.807 respectively. The highest and lowest AUC values in Ecoli are found for the SEAL and PA approaches respectively. The AUC value of the other GNN-based approach, WLNM, is also very high and close to the highest AUC value. The high values of these two GNN-based approaches indicate that they are highly efficient at distinguishing between existent and non-existent links in the Ecoli graph. Similar performance is found for the other well-connected graphs (NS, PB, USAir and Yeast). In the NS graph, SEAL performs best, with precision 0.96 and AUC 0.99, and PA is the worst approach, with the lowest precision and AUC values of 0.69 and 0.66 respectively. The precision scores of the other approaches lie between 0.8 and 0.9, while the AUC values are between 0.9 and 0.95. A remarkable (highest) threshold-based precision is found for HPI in the NS graph, while the precision scores of some similarity-based approaches such as AA, CN, PA and RA remain very low when the threshold method is applied. Overall, the AUC values of the GNN-based approaches are higher than those of the similarity-based approaches in the NS graph. In the PB graph, the highest precision score is recorded for the similarity-based approaches RA, CAR and CCLP, whereas the highest AUC value is found for the GNN-based approach SEAL. LLHN performs worst in PB on both metrics. The precisions of the other approaches are near or above 0.8. The high average node degree helps most of the similarity-based approaches perform better than the GNN-based approaches in terms of precision in the PB graph. However, the precision of the similarity-based approaches drops below 0.2 when the threshold is applied.
The similarity-based approaches show very low precision and low AUC in the two sparse graphs, Power and Router, whereas the GNN-based approaches are still able to provide high precision and AUC in both. In the USAir and Yeast graphs, SEAL shows the best results, with precisions of 0.94 and 0.89 and AUCs of 0.96 and 0.98 respectively, while the lowest precision and AUC values are recorded for WLNM and LLHN respectively. The use of node attributes by SEAL in USAir during the prediction task contributes to the improvement of its performance metrics. Overall, SEAL shows the highest AUC values in all graphs; the use of latent features along with structural features is the main reason behind this success.

Table 3: AUC and precision values with max and min similarity scores in the small/medium graphs (Ecoli, NS, PB, Power, Router, USAir, Yeast). Precision in parentheses is computed with the threshold on the top-L links. Graph-wise highest/lowest metrics are indicated in bold, approach-wise highest/lowest metrics in italic.

Table 3 shows that the GNN-based approaches provide high performance metrics in all graphs, while the similarity-based approaches perform well only in some graphs.

The approaches are further evaluated on three large graphs, FB15K, WN18 and YAGO3-10, and the results are presented in Table 4. Some similarity-based approaches (AA, CN, PA, RA, IA, CAR) show higher metric values than the GNN-based approaches in the FB15K graph, while others (JA, SA, SO, HDI, LLHN) show lower values. The highest precision score is found for the CN, IA and CAR approaches and the highest AUC value for SEAL. LLHN is the worst performing approach on both metrics in FB15K. However, the precision drops below 0.1 for all similarity-based approaches when the threshold is applied to the similarity scores in FB15K.

As shown in Table 2, WN18 is a sparse graph with a low average node degree (3.709) and clustering coefficient (0.077). This sparsity affects the performance of the similarity-based approaches, as all of them except PA depend heavily on common-neighbourhood information. The precision scores of all similarity-based approaches are below 0.2 except for PA, which shows a comparatively good precision of 0.63. The precision drops further when the threshold is applied to the similarity scores in the top-L links. Compared to the similarity-based approaches, the GNN-based approaches show higher precision and AUC values in WN18; the highest precision and AUC values are recorded for WLNM and SEAL respectively. In the YAGO3-10 graph, PA performs surprisingly well, with precision and AUC values of 0.83 and 0.88 respectively. However, the highest precision and AUC values are found for SEAL. Overall, the GNN-based approaches are more suitable across graphs from several domains with respect to precision and AUC.

From Tables 3 and 4, the node-degree-based approach PA shows higher performance compared to the other neighbourhood-based similarity approaches.
The highest precision for PA is found in USAir (0.92) and the lowest in Router (0.41). The similarity-based approaches based on the common neighbourhood show impressive performance in graphs with high average node degree and clustering coefficient: their precision is above or near 0.9 in the two well-connected Ecoli and USAir graphs, but below 0.2 in the two large graphs WN18 and YAGO3-10. On the other hand, the GNN-based approaches show very high precision and AUC across all of the experimental graphs, from small to large.
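As a concrete illustration of why PA is so cheap and why the common-neighbourhood heuristics degrade on sparse graphs, the scores can be sketched as below. These are the standard textbook definitions; the dict-of-sets graph representation is ours, not part of the evaluated implementations:

```python
import math

# Graph as a dict mapping each node to its set of neighbours.

def pa(G, u, v):
    """Preferential attachment: degree product; needs no neighbourhood
    intersection, so it works even when u and v share no neighbours."""
    return len(G[u]) * len(G[v])

def cn(G, u, v):
    """Common neighbours."""
    return len(G[u] & G[v])

def aa(G, u, v):
    """Adamic-Adar: common neighbours weighted by inverse log-degree."""
    return sum(1.0 / math.log(len(G[w])) for w in G[u] & G[v])

def ra(G, u, v):
    """Resource allocation: inverse-degree weighting instead of log."""
    return sum(1.0 / len(G[w]) for w in G[u] & G[v])

G = {
    "a": {"c", "d"}, "b": {"c", "d"},
    "c": {"a", "b", "d"}, "d": {"a", "b", "c"},
}
print(pa(G, "a", "b"))  # 4
print(cn(G, "a", "b"))  # 2
print(ra(G, "a", "b"))  # 0.666...
```

In a sparse graph the intersection `G[u] & G[v]` is empty for most candidate pairs, so CN, AA and RA all return 0 while PA still produces a usable score, consistent with PA's relatively good behaviour on WN18 and YAGO3-10.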
The performance is further analysed in terms of computational time. Every approach is executed on each test set of each graph and its computational time is recorded. The computational time for a similarity-based heuristic is the average time required per test link to compute the node-pair similarity score. On the other hand, the computational time for a GNN-based prediction approach is the accumulated time for training the GNN and predicting the classes of the links (existent or non-existent) in the test sets.

Table 4: AUC and precision values with max and min similarity scores in the large graphs (FB15K, WN18, YAGO3-10); layout as in Table 3.

Table 5: Computational time (milliseconds) of each approach in the Ecoli, FB15K, NS, PB, Power, Router, USAir, WN18, YAGO3-10 and Yeast graphs. The graph-wise highest and lowest mean computational times are indicated in bold and the approach-wise highest and lowest in italic.

Table 5 shows the mean computational time in milliseconds. PA has the lowest mean computational time among the similarity-based approaches in half of the graphs, as it requires only a simple multiplication of the degrees of the two end nodes of a link. The computational time of the simple CN approach is close to that of PA in all graphs. Similarity approaches that quantify the role of each neighbour or of level-2 links, such as JA, RA and IA, require more processing time. The highest computational times among the similarity-based approaches are found for CCLP in all graphs, as CCLP explores level-3 links to compute the similarity score. However, CAR requires the minimum computational time in the sparse graphs (Power, Router, WN18), as these graphs have much lower clustering coefficients than the others. The computational times of these approaches are affected by graph properties such as the average node degree, the numbers of nodes and links, and the average clustering coefficient. For example, the computational time of all similarity-based approaches is higher in the NS graph than in USAir, as NS is larger than USAir in terms of the numbers of nodes and links. The computational time in the PB graph is higher than in NS because PB has a higher average node degree than NS, even though NS has more nodes.

Compared to the similarity-based approaches, the computational times of the GNN-based ones are higher, as they learn the heuristics from the graph during training. Table 5 shows that the computational times for SEAL are greater than those for WLNM in all graphs, as SEAL utilizes the structural, latent and explicit features of the graph whereas WLNM utilizes only the structural features. One noticeable point is that the computational time of WLNM is higher in the PB and NS graphs than in USAir, as USAir is the smallest graph, whereas SEAL reverses the case because it uses the node attributes of USAir. The highest computational time among all the studied approaches is recorded for SEAL.
We also see that the computational times of the GNN-based approaches grow by a larger amount than those of the similarity-based approaches. For example, the minimum computational time for PA grows by 629 milliseconds from USAir to the YAGO3-10 graph, whereas for SEAL it grows by 2189 milliseconds. Overall, the similarity-based approaches are more efficient than the GNN-based approaches with respect to computational time. Except for SEAL, the approach-wise comparison shows that every approach has its highest and lowest computational times in the largest experimental graph (YAGO3-10) and the smallest graph (USAir) respectively, as expected. SEAL shows a higher computational time in USAir than in the two sparse graphs (Router and Power), as it uses the attribute features of USAir and the latter graphs have low average node degrees.
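The per-test-link timing protocol used for the similarity heuristics can be sketched as follows. This is our own hypothetical harness (with preferential attachment as the example heuristic), not the measurement code itself; for a GNN-based approach, the training and inference times would instead be accumulated into a single figure:

```python
import time

def mean_time_per_link_ms(score_fn, G, test_links, repeats=100):
    """Average wall-clock time (milliseconds) to score one test link,
    averaged over several passes to smooth out timer noise."""
    start = time.perf_counter()
    for _ in range(repeats):
        for u, v in test_links:
            score_fn(G, u, v)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / (repeats * len(test_links))

def pa(G, u, v):
    # Degree product: a single multiplication, hence PA's low cost.
    return len(G[u]) * len(G[v])

G = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}}
t = mean_time_per_link_ms(pa, G, [("b", "c"), ("a", "b")])
print(f"{t:.6f} ms per link")
```

Using a monotonic high-resolution clock (`time.perf_counter`) and averaging over many links is what makes sub-millisecond per-link figures such as those in Table 5 meaningful.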
In this paper, we have studied several link prediction approaches for homogeneous graphs from the similarity-based and GNN-based learning categories, along with their working principles and limitations. The approaches were evaluated on ten benchmark graphs with different properties from various domains. The precision of the similarity-based approaches was computed in two different ways to overcome the difficulty of tuning a threshold for deciding link existence based on a similarity score.

The experimental results show the superiority of the GNN-based approaches over the similarity-based ones with respect to prediction performance across the various graphs. In contrast, the GNN-based approaches are less suitable when graphs need fast processing, and their computational time is further affected when they are applied to large graphs. In addition, the 'black box' problem of conventional neural networks remains unsolved with GNNs, as it is very difficult to retrace the internal process of a GNN. This work could help a new user to study similarity-based and GNN-based link prediction approaches as well as the corresponding evaluation protocols.

One perspective of this work is to achieve a good trade-off between prediction accuracy and computational time by developing a GNN-based link prediction approach in a distributed and parallel environment. In addition, such an approach is expected to be applicable to heterogeneous graphs such as knowledge graphs.
References
1. Xu, Z., Pu, C., Yang, J.: Link prediction based on path entropy. Physica A: Statistical Mechanics and its Applications, pp. 294–301 (2016).
2. Shen, Z., Wang, W. X., Fan, Y., Di, Z., Lai, Y. C.: Reconstructing propagation networks with natural diversity and identifying hidden sources. Nature Communications, (1), pp. 1–10 (2014).
3. Adamic, L. A., Adar, E.: Friends and neighbors on the web. Social Networks, (3), pp. 211–230 (2003).
4. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems. Computer, (8), pp. 30–37 (2009).
5. Nickel, M., Murphy, K., Tresp, V., Gabrilovich, E.: A review of relational machine learning for knowledge graphs. Proceedings of the IEEE, (1), pp. 11–33 (2015).
6. Airoldi, E. M., Blei, D. M., Fienberg, S. E., Xing, E. P.: Mixed membership stochastic blockmodels. Journal of Machine Learning Research, pp. 1981–2014 (2008).
7. Kovács, I. A., Luck, K., Spirohn, K., Wang, Y., Pollis, C., Schlabach, S., ..., Calderwood, M. A.: Network-based prediction of protein interactions. Nature Communications, (1), pp. 1–8 (2019).
8. Paoletti, M. E., Haut, J. M., Plaza, J., Plaza, A.: A new deep convolutional neural network for fast hyperspectral image classification. ISPRS Journal of Photogrammetry and Remote Sensing, pp. 120–147, Elsevier (2018).
9. Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A. R., Jaitly, N., ..., Kingsbury, B.: Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, (6), pp. 82–97, IEEE (2012).
10. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016).
11. Luong, M. T., Pham, H., Manning, C. D.: Effective approaches to attention-based neural machine translation. In Proceedings of Empirical Methods in Natural Language Processing, pp.
1412–1421 (2015).
12. Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S. Y.: A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, pp. 1–21 (2020).
13. Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. In Proceedings of the International Conference on Learning Representations, pp. 1–12 (2018).
14. Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., Monfardini, G.: The graph neural network model. IEEE Transactions on Neural Networks, (1), pp. 61–80 (2008).
15. Ying, Z., You, J., Morris, C., Ren, X., Hamilton, W., Leskovec, J.: Hierarchical graph representation learning with differentiable pooling. In Advances in Neural Information Processing Systems, pp. 4800–4810 (2018).
16. Kipf, T. N., Welling, M.: Semi-supervised classification with graph convolutional networks. In Proceedings of the International Conference on Learning Representations, pp. 4700–4708 (2016).
17. Zhang, M., Chen, Y.: Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems, pp. 5165–5175 (2018).
18. Martínez, V., Berzal, F., Cubero, J. C.: A survey of link prediction in complex networks. ACM Computing Surveys (CSUR), (4), pp. 1–33 (2016).
19. Lorrain, F., White, H. C.: Structural equivalence of individuals in social networks. The Journal of Mathematical Sociology, (1), pp. 49–80, Taylor & Francis (1971).
20. Zhou, T., Lü, L., Zhang, Y. C.: Predicting missing links via local information. The European Physical Journal B, (4), pp. 623–630, Springer (2009).
21. Barabási, A. L., Albert, R.: Emergence of scaling in random networks. Science, (5439), pp. 509–512, American Association for the Advancement of Science (1999).
22. Jaccard, P.: Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bull Soc Vaudoise Sci Nat, pp. 547–579 (1901).
23.
Salton, G., McGill, M.: Introduction to modern information retrieval, p. 448, McGraw-Hill, New York (1983).
24. Sørensen, T.: A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyses of the vegetation on Danish commons (1948).
25. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N., Barabási, A. L.: Hierarchical organization of modularity in metabolic networks. Science, (5586), American Association for the Advancement of Science (2002).
26. Leicht, E. A., Holme, P., Newman, M. E.: Vertex similarity in networks. Physical Review E, (2), p. 026120 (2006).
27. Dong, Y., Ke, Q., Wang, B., Wu, B.: Link prediction based on local information. In 2011 International Conference on Advances in Social Networks Analysis and Mining, pp. 382–386, IEEE (2011).
28. Cannistraci, C. V., Alanis-Lobato, G., Ravasi, T.: From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks. Scientific Reports, (1), pp. 1–14, Nature (2013).
29. Wu, Z., Lin, Y., Wang, J., Gregory, S.: Link prediction with node clustering coefficient. Physica A: Statistical Mechanics and its Applications, pp. 1–8 (2016).
30. Zhang, Z., Cui, P., Zhu, W.: Deep learning on graphs: A survey. IEEE Transactions on Knowledge and Data Engineering (2020).
31. Xu, K., Li, C., Tian, Y., Sonobe, T., Kawarabayashi, K. I., Jegelka, S.: Representation learning on graphs with jumping knowledge networks. In International Conference on Machine Learning, pp. 5453–5462 (2018).
32. Zhang, M., Chen, Y.: Weisfeiler-Lehman neural machine for link prediction. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 575–583 (2017).
33. Huang, W., Zhang, T., Rong, Y., Huang, J.: Adaptive sampling towards fast graph representation learning.
In Advances in Neural Information Processing Systems, pp. 4558–4567 (2018).
34. Ying, Z., Bourgeois, D., You, J., Zitnik, M., Leskovec, J.: GNNExplainer: Generating explanations for graph neural networks. In Advances in Neural Information Processing Systems, pp. 9240–9251 (2019).
35. Zitnik, M., Agrawal, M., Leskovec, J.: Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics, (13), pp. 457–466 (2018).
36. Ying, R., He, R., Chen, K., Eksombatchai, P., Hamilton, W. L., Leskovec, J.: Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 974–983 (2018).
37. Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems, pp. 1024–1034 (2017).
38. Zhang, M., Cui, Z., Neumann, M., Chen, Y.: An end-to-end deep learning architecture for graph classification. In 32nd AAAI Conference on Artificial Intelligence (2018).
39. Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? In International Conference on Learning Representations (2019).
40. Weisfeiler, B., Lehman, A. A.: A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsia, (9), pp. 12–16 (1968).
41. McKay, B. D., Piperno, A.: Practical graph isomorphism, II. Journal of Symbolic Computation, pp. 94–112, Elsevier (2014).
42. Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 855–864 (2016).
43. Salgado, H., Santos-Zavaleta, A., Gama-Castro, S., Millán-Zárate, D., Díaz-Peredo, E., Sánchez-Solano, F., ..., Collado-Vides, J.: RegulonDB (version 3.2): transcriptional regulation and operon organization in Escherichia coli K-12.
Nucleic Acids Research, (1), pp. 72–74 (2001).
44. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems, pp. 2787–2795 (2013).
45. Newman, M. E.: Finding community structure in networks using the eigenvectors of matrices. Physical Review E, (3), p. 036104 (2006).
46. Ackland, R.: Mapping the US political blogosphere: Are conservative bloggers more prominent? In BlogTalk Downunder 2005 Conference, Sydney (2005).
47. Watts, D. J., Strogatz, S. H.: Collective dynamics of small-world networks. Nature, (6684), p. 440 (1998).
48. Spring, N., Mahajan, R., Wetherall, D.: Measuring ISP topologies with Rocketfuel. ACM SIGCOMM Computer Communication Review, (2), pp. 233–259 (2002).
51. Mahdisoltani, F., Biega, J., Suchanek, F. M.: YAGO3: A knowledge base from multilingual Wikipedias. In 7th Biennial Conference on Innovative Data Systems Research, Asilomar, United States (2013).
52. Von Mering, C., Krause, R., Snel, B., Cornell, M., Oliver, S. G., Fields, S., Bork, P.: Comparative assessment of large-scale datasets of protein-protein interactions. Nature, (6887), pp. 399–403 (2002).
53. Bastian, M., Heymann, S., Jacomy, M.: Gephi: an open source software for exploring and manipulating networks. In 3rd International AAAI Conference on Weblogs and Social Media, pp. 17–20 (2009).
54. Wang, M., Yu, L., Zheng, D., Gan, Q., Gai, Y., Ye, Z., ..., Huang, Z.: Deep Graph Library: Towards efficient and scalable deep learning on graphs. In ICLR Workshop on Representation Learning on Graphs and Manifolds (2019).
55. Yang, J., Zhang, X. D.: Predicting missing links in complex networks based on common neighbors and distance. Scientific Reports, p. 38208, Nature Publishing Group (2016).
56. Pan, L., Zhou, T., Lü, L., Hu, C. K.: Predicting missing links and identifying spurious links via likelihood analysis. Scientific Reports, (1), pp.
1–10, Nature Publishing Group (2016).
57. Wu, Z., Lin, Y., Zhao, Y., Yan, H.: Improving local clustering based top-L link prediction methods via asymmetric link clustering information. Physica A: Statistical Mechanics and its Applications, 492