Hyperbolic Node Embedding for Signed Networks
Wenzhuo Song, Hongxu Chen, Xueyan Liu, Hongzhe Jiang, Shengsheng Wang
Wenzhuo Song a,b, Hongxu Chen c, Xueyan Liu a,b, Hongzhe Jiang d, Shengsheng Wang a,b,∗

a College of Computer Science and Technology, Jilin University, Changchun, 130012, China
b Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China
c Advanced Analytics Institute, School of Computer Science, Faculty of Engineering and IT, University of Technology Sydney, NSW, 2007, Australia
d College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing, 210037, China
Abstract
Signed network embedding methods aim to learn vector representations of nodes in signed networks. However, existing algorithms only embed networks into low-dimensional Euclidean spaces, whereas many intrinsic features of signed networks are reported to be more suitable for non-Euclidean spaces. For instance, previous works did not consider the hierarchical structure of networks, which is widely observed in real-world networks. In this work, we address the open question of whether the hyperbolic space is a better choice to accommodate signed networks and to learn embeddings that preserve the corresponding special characteristics. We also propose a non-Euclidean signed network embedding method based on structural balance theory and Riemannian optimization, which embeds signed networks into a Poincaré ball in a hyperbolic space. This space enables our approach to capture the underlying hierarchy of nodes in signed networks because it can be seen as a continuous tree. We empirically compare our method against six Euclidean-based baselines in three tasks on seven real-world datasets, and the results show the effectiveness of our method.
Keywords: network embedding, signed networks, hyperbolic geometry
∗ The corresponding author.
Email addresses: [email protected] (Wenzhuo Song),
[email protected] (Hongxu Chen), [email protected] (Xueyan Liu), [email protected] (Hongzhe Jiang), [email protected] (Shengsheng Wang)
Preprint submitted to Neurocomputing, September 1, 2020

1. Introduction

The rapid development of the World Wide Web has enabled millions of people around the world to communicate, collaborate and share content on the web. To analyze such complex and heterogeneous data, researchers often represent this ubiquitous networked data as networks, where nodes and links represent entities and their relationships, respectively [1]. To facilitate machine learning-based network analysis, Network Representation Learning (NRL) is widely studied to automatically learn low-dimensional vector representations of nodes (a.k.a. node embeddings) while preserving the main structural properties of the original network [2, 3]. A common assumption of NRL is that the proximities among the vectors reflect the relationships among the corresponding nodes, such as similarity, type and polarity.

Recently, Signed Network Representation Learning (SNRL) methods have gained considerable attention because the polarity of links, i.e., positive and negative relationships among entities in a complex networked system [4], can be naturally modelled in a signed network. It is reported that link polarity information can improve the performance of traditional tasks [5, 6], and thus signed networks have a wide range of application scenarios, such as support/opposition relationships in social networks, synergistic/antagonistic drugs in healthcare, and symbiotic/competitive animals in ecosystems.

Scale-freeness is an important property that widely exists in real-world signed networks. In Figure 1, we show that both the positive and negative degree distributions of the real-world signed networks in Table 1 follow power-law distributions. This result suggests that hierarchy may be ubiquitous in many signed social networks, because the scale-free property is a consequence of an underlying hierarchy [7, 8].
For example, nodes at a higher level of the hierarchy are more likely to be connected to by other nodes, and vice versa. (See https://en.wikipedia.org/wiki/Scale-free_network. We omit Epinions2 and Slashdot2 from Figure 1 because they have similar results w.r.t. the reported versions.)

Figure 1: The degree distributions of positive and negative links of four real-world signed networks (Wiki-editor, Wiki-rfa, Epinions, Slashdot). All of them follow power-law degree distributions, which implies that these networks underlie hierarchical organizations.

However, none of the previous methods considers this intrinsic property of signed networks. Besides, existing works learn node embeddings in Euclidean space, where geometric constraints are imposed, and they may not be sufficient to model data with latent hierarchies such as text, social networks and the web [9, 10]. For example, consider the task of projecting a tree (which can be seen as a simplified network) into a low-dimensional Euclidean space where the distance between each pair of nodes is larger than a threshold. When the depth of the tree increases, the dimensionality of the embedding space needs to be enlarged dramatically, because the volume of Euclidean space grows only polynomially with the radius while the size of the tree grows exponentially.

On the other hand, non-Euclidean NRL methods such as hyperbolic embedding methods have been proposed to model network-structured data as well as unstructured data such as text [9, 11, 12]. As a non-Euclidean space, hyperbolic space is suitable for modeling datasets with power-law distributions while capturing latent hierarchical structures. One reason is that hyperbolic space has negative curvature, so the space expands exponentially with the radius. Hyperbolic space can be seen as a continuous tree, such that the positions of nodes in the hyperbolic space can reflect the underlying hierarchical pattern of the network (i.e., nodes closer to the center can serve as the root nodes of a network, while nodes far from the center are the leaf nodes). Thus, in this work, we aim to answer the question: Is the hyperbolic space a better choice to represent signed networks?
We propose a non-Euclidean representation learning method for signed networks named Hyperbolic Signed Network Embedding (HSNE). Specifically, we employ the structural balance theory from social theory to construct an effective objective function. The key idea is that nodes connected by positive links are more similar than those connected by negative links. For example, in a politician social network, a positive link between two politicians represents that they support each other, while a negative link implies that they are foes. Structural balance theory is consistent with many signed social networks and can provide guidance for learning the node embeddings [13]. Secondly, we develop an efficient learning framework based on Riemannian stochastic gradient descent [14]. The gradient calculation in hyperbolic space is more complex and time-consuming than in Euclidean space. In this work, we sample a batch of nodes together with their positive and negative neighbor nodes to train the model. We assume that these triples are independent of each other, so that HSNE can scale to large-scale datasets. Finally, we perform extensive experiments to evaluate the effectiveness of HSNE. We compare HSNE with Euclidean-based baselines on link sign prediction and reconstruction tasks, and the results show that our method achieves similar or better performance in terms of generalization ability and capacity.

In summary, the main contributions of this work are: 1) To the best of our knowledge, HSNE is the first SNRL method based on non-Euclidean geometry. It embeds the nodes of a signed network into a Poincaré ball (see https://en.wikipedia.org/wiki/Hyperbolic_space).
2. The Proposed Method
In this section, we discuss the proposed SNRL method, named Hyperbolic Signed Network Embedding (HSNE), which represents the nodes of a signed network with vectors in a Poincaré ball. Since hyperbolic space can be seen as a smooth tree, the locations of the corresponding nodes can capture the underlying hierarchical structure of the network, and the distances between the vectors reflect their relationships, such as proximity and polarity.

Let G = (V, E) denote an undirected signed network, where V = {v_1, ..., v_N} and E = {e_ij} are the sets of the N nodes and M edges of G. We can use an adjacency matrix A ∈ R^(N×N) to represent G, where

    A_ij = 1, if e_ij is positive;  −1, if e_ij is negative;  0, otherwise.   (1)

HSNE aims to learn a mapping function

    U = f(G)   (2)

where the i-th item of U = {u_i}, i = 1...N, is a vector in a K-dimensional Poincaré ball D^K = {u_i ∈ R^K : ‖u_i‖ < 1} used to represent node v_i.

To learn the node embeddings of a signed network, HSNE needs a) an objective function that measures how well the model fits the data, and b) an efficient learning framework that maps the nodes of the signed network into the Poincaré ball.

According to the objective function, SNRL methods can be roughly divided into similarity-based and social theory-based methods. Similarity-based methods first define the similarity between nodes based on random walks or higher-order neighbor context, and then use the distances between the node vectors to fit the similarity [15, 16, 17]. These methods can consider higher-order dependencies between nodes and often need an additional mapping function to obtain edge embeddings. The other approach usually employs social theories, such as status theory and structural balance: "the friend of my friend is my friend" and "the enemy of my enemy is my enemy" [18].
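The signed adjacency matrix of Eq. (1) can be sketched as follows; this is a hypothetical helper, not the authors' code, and a dense NumPy matrix is used for clarity (real networks of this scale would use a sparse format).

```python
import numpy as np

def signed_adjacency(num_nodes, pos_edges, neg_edges):
    """Build the signed adjacency matrix A of Eq. (1):
    A[i, j] = +1 for a positive edge, -1 for a negative edge, 0 otherwise.
    The network is undirected, so A is kept symmetric."""
    A = np.zeros((num_nodes, num_nodes), dtype=np.int8)
    for i, j in pos_edges:
        A[i, j] = A[j, i] = 1
    for i, j in neg_edges:
        A[i, j] = A[j, i] = -1
    return A

# Small example: 4 nodes, two positive edges and one negative edge.
A = signed_adjacency(4, pos_edges=[(0, 1), (1, 2)], neg_edges=[(0, 3)])
```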
Unlike similarity-based algorithms, structural balance usually treats negative links as the negation of positive ones, so the corresponding nodes should be far away from each other [19, 20, 21].

Recently, structural balance theory has been extended to "one of my friends should be closer to me than my enemies" [20]. This assumption relaxes the original definition and has been successfully applied to embed the nodes of signed networks into a Euclidean space [21, 19]. Mathematically, given a node triple, i.e., three nodes v_i, v_j and v_k in G where A_ij = +1 and A_ik = −1, the extended structural balance theory can be written as:

    d(u_i, u_j) ≤ d(u_i, u_k) − λ   (3)

where d : (R^K, R^K) → R^+ denotes the distance function between two vectors and λ > 0 is the margin. The objective function of HSNE is then

    min_U Σ_{(v_i, v_j, v_k) ∈ T} max(0, d(u_i, u_j) − d(u_i, u_k) + λ)   (4)

where T = {(v_i, v_j, v_k) | v_i, v_j, v_k ∈ V, A_ij > 0, A_ik < 0} is the triple set sampled from G.

Next, by minimizing the above objective function through gradient descent algorithms, we can obtain the node representations U = {u_i}, i = 1...N. However, the optimization process has the following two problems. First, the embeddings of some nodes in the network cannot be optimized, since many nodes in signed networks have only positive links. For example, if v_j's unique anchor node v_i has only positive neighbor nodes, v_j will not appear in T, so its vector representation will not be optimized by HSNE. Second, in HSNE, U and d are defined in a non-Euclidean space, so the traditional Euclidean gradient-based methods of previous works cannot be used to optimize Equation 4.

To solve the first problem, we relax the concepts of positive/negative neighbor nodes, i.e., a friend/enemy of v_i is not necessarily directly connected to v_i:

    T̂ = {(v_i, v_j, v_k) | v_i, v_j, v_k ∈ V, Â_ij > 0, Â_ik < 0}   (5)

where Â denotes the extended adjacency matrix. It contains the links of the original training set, plus a few inferred links used to complement T. We can construct Â through the following methods:

• random sampling: if v_i has no positive/negative neighbors, we randomly select a node from the original network;
• virtual node: suppose there is a node v that is the enemy (or friend) of all other nodes [19];
• social theory: we utilize structural balance theory or status theory to predict the relationship of unknown pairs of nodes [18, 22].

For the second problem, we employ the Riemannian manifold gradient algorithm detailed in the next subsection.
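The triple construction of Eqs. (4)-(5) and the per-triple hinge term can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it enumerates all triples for a small dense matrix and uses the paper's "random sampling" fallback for nodes lacking a positive or negative neighbor.

```python
import numpy as np

def sample_triples(A, rng):
    """Collect triples (i, j, k) with A[i, j] > 0 and A[i, k] < 0 (the set T of
    Eq. (4)). For nodes missing a positive or negative neighbor, fall back to
    picking a random node as a stand-in friend/enemy (the 'random sampling'
    strategy for the extended set T-hat of Eq. (5))."""
    n = A.shape[0]
    triples = []
    for i in range(n):
        pos = np.where(A[i] > 0)[0]
        neg = np.where(A[i] < 0)[0]
        if len(pos) == 0:
            pos = rng.choice(n, size=1)
        if len(neg) == 0:
            neg = rng.choice(n, size=1)
        for j in pos:
            for k in neg:
                triples.append((i, j, k))
    return triples

def hinge_loss(d_ij, d_ik, lam):
    """Per-triple term of Eq. (4): max(0, d(u_i,u_j) - d(u_i,u_k) + lambda)."""
    return max(0.0, d_ij - d_ik + lam)

# 3-node example: edge (0,1) positive, edge (0,2) negative.
A = np.array([[0, 1, -1],
              [1, 0,  0],
              [-1, 0, 0]])
triples = sample_triples(A, np.random.default_rng(0))
```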
Different from existing SNRL methods, the parameters and the distance in HSNE are defined on a Riemannian manifold, which means that we cannot directly utilize the optimization approaches of previous methods. Instead, in this work we employ Riemannian stochastic gradient descent [14] to minimize Equation 4. The optimization process of HSNE includes the following three steps.

1) Computing the stochastic Euclidean gradient of the objective function. Specifically, for each triple (v_i, v_j, v_k) ∈ T̂ in the training set, the stochastic Euclidean gradient for u_j is

    ∇_E = ∂L(U)/∂d(u_i, u_j) · ∂d(u_i, u_j)/∂u_j   (6)

where

    L(U) = max(0, d(u_i, u_j) − d(u_i, u_k) + λ)   (7)

and

    d(u_i, u_j) = cosh⁻¹(1 + 2‖u_i − u_j‖² / ((1 − ‖u_i‖²)(1 − ‖u_j‖²)))   (8)

is the Poincaré distance between u_i and u_j. Expanding the derivative, we get

    ∇_E = (4 / (β √(γ² − 1))) · ( ((‖u_i‖² − 2⟨u_i, u_j⟩ + 1) / α²) u_j − u_i / α ), if λ > d(u_i, u_k) − d(u_i, u_j);  0, otherwise,   (9)

where α = 1 − ‖u_j‖², β = 1 − ‖u_i‖² and γ = 1 + (2/(αβ))‖u_i − u_j‖². In a similar way, we can get the partial derivatives with respect to u_k.

2) Deriving the Riemannian gradient from the Euclidean gradient. Since the Poincaré ball model is conformal to Euclidean space, the Poincaré metric tensor g^H_θ satisfies

    g^H_θ = λ_θ² g^E   (10)

where θ is a point in D^K, λ_θ = 2/(1 − ‖θ‖²) is the conformal factor and g^E is the Euclidean metric tensor. The Riemannian gradient ∇_H can then be calculated by rescaling with the inverse metric:

    ∇_H = ((1 − ‖θ‖²)² / 4) ∇_E   (11)

3) Applying Riemannian stochastic gradient descent (RSGD). To estimate the parameters in Equation 4, the update formula of HSNE is

    θ_{t+1} = R_{θ_t}(−η_t ∇_H)   (12)

where η_t is the learning rate at iteration t and R_{θ_t}(s) = θ_t + s is a retraction operation.
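The distance of Eq. (8) and one RSGD step (the metric rescaling of Eq. 11, the simple retraction of Eq. 12, and the proj operator of Eq. 14) can be sketched in NumPy. This is an illustrative re-implementation, not the authors' code; in particular, the proj step here rescales the point to norm 1 − ε, one common reading of the operator.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Poincare distance of Eq. (8):
    d(u, v) = arccosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))."""
    sq_dist = np.sum((u - v) ** 2)
    alpha = 1.0 - np.sum(u ** 2)
    beta = 1.0 - np.sum(v ** 2)
    return np.arccosh(1.0 + 2.0 * sq_dist / max(alpha * beta, eps))

def rsgd_step(theta, euclidean_grad, lr=0.01, eps=1e-5):
    """One RSGD update: rescale the Euclidean gradient by (1 - ||theta||^2)^2 / 4
    (inverse Poincare metric, Eq. 11), apply the retraction
    R_theta(s) = theta + s (Eq. 12), then project back inside the ball (Eq. 14)."""
    scale = (1.0 - np.sum(theta ** 2)) ** 2 / 4.0
    theta = theta - lr * scale * euclidean_grad
    norm = np.linalg.norm(theta)
    if norm >= 1.0:                      # proj: pull the point back inside D^K
        theta = theta / norm * (1.0 - eps)
    return theta
```

As a sanity check, the distance from the origin to a point u reduces to 2 · artanh(‖u‖), and a step from the origin is simply the Euclidean step scaled by 1/4.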
We can also choose R_{θ_t} to be the exponential map derived in [9]:

    R_θ(s) = [ λ_θ (cosh(λ_θ‖s‖) + ⟨θ, s/‖s‖⟩ sinh(λ_θ‖s‖)) θ + (1/‖s‖) sinh(λ_θ‖s‖) s ] / [ 1 + (λ_θ − 1) cosh(λ_θ‖s‖) + λ_θ ⟨θ, s/‖s‖⟩ sinh(λ_θ‖s‖) ]   (13)

but in the experiments we find that the results of the two operations are close, and the former is less computationally intensive.

Finally, we employ a proj operator to keep the embeddings inside the ball:

    proj(θ) = θ/‖θ‖ − ε, if ‖θ‖ ≥ 1;  θ, otherwise,   (14)

where ε is a hyperparameter with a small value.

Note that HSNE can easily scale to large datasets, because (1) we assume that the triples in Equation 4 used to train the model are independent of each other, and (2) the proposed learning framework adopts stochastic gradient optimization. At each step, our method randomly samples a batch of node triples to calculate the Riemannian gradients of the parameters, and then updates the model with Equation 12. For a large network, we can specify a small batch size to reduce the cost of each gradient computation, so that HSNE scales to large networks.

3. Experiments

In this section, we use two groups of experiments to verify the effectiveness of HSNE. In the first group, we compare the performance of HSNE with Euclidean SNRL baselines on two real-world tasks, i.e., link sign prediction and reconstruction. Link sign prediction tests the generalization ability of the SNRL methods. It is a widely used evaluation task and can predict the polarity of relationships in complex systems such as social networks, e-commerce, and the web. Link sign reconstruction is to predict the signs of known links from the results of the methods.
We design this task to test the capacity of the SNRL methods to extract and store information. The second group of experiments evaluates whether the learned node vectors can reflect the latent hierarchical structure of signed networks. HSNE embeds the nodes of a real-world network into a 2-dimensional Poincaré disk. We refer to node vectors near the center as root nodes of the network, and to those far from the center as leaf nodes of the network.
To evaluate the performance of the SNRL algorithms in real-world tasks, we conduct experiments on the following seven real-world networks:

• Wiki-editor [16] is a collaborative social network where each node is an editor of Wikipedia. A positive link between two editors represents that most of the pages they co-edited are from the same category, and vice versa.

• Wiki-rfa [23] was originally crawled for person-to-person sentiment analysis by SNAP (https://snap.stanford.edu/index.html); the dataset is available at https://snap.stanford.edu/data/wiki-RfA.html. Each node in this network represents a Wikipedia editor who wants to become an administrator, and a positive/negative link from editor i to j means that i voted for/against j.
It uses the eigenvectors corresponding to the K smallest eigenvalues of the Laplacian matrix of the network as the low-dimensional representations of the nodes.

(Dataset sources: https://snap.stanford.edu/data/soc-sign-epinions.html, https://snap.stanford.edu/data/soc-sign-Slashdot081106.html, http://konect.uni-koblenz.de/networks/slashdot-zoo, http://mrvar.fdv.uni-lj.si/pajek/SVG/CoW/)

Table 1: The summary of the seven real-world signed networks used in the experiments. [Table body not recovered from the source.]

• SNE [16] adopts a random walk approach to obtain the context of a node, i.e., the signs of links and the nodes along the path. The node embeddings are then calculated based on the similarity of node contexts using the log-bilinear model [29].

• SiNE [19] uses a structural balance-based objective function and employs multi-layer neural networks to measure the distances between nodes.

• SLF [30] considers neutral and no relationship between node pairs in addition to observed positive and negative links. It is designed for the sign prediction task and for networks of any sparsity.

• SIDE [17] develops a link direction- and sign-aware random walk framework to preserve the information along multi-step connections. In our experiments, all terms, e.g., signed proximity and bias terms, are used to represent the nodes.

• BESIDE [31] combines social balance and status theories in a joint neural network. The basic idea is that these two social-psychological theories can complement each other.

• HSNE is the proposed SNRL method. It embeds each node into a hyperbolic space where the nodes are spontaneously organized hierarchically.

These methods can be roughly divided into similarity-based and social theory-based methods, according to the way they organize nodes in the vector space.

Table 2: The functions which map two node vectors, u_i and u_j, to an edge vector B_ij. All functions are element-wise and output a low-dimensional vector.

    Operator     Definition
    hadamard     B_ij = u_i ∗ u_j
    l1-weight    B_ij = |u_i − u_j|
    l2-weight    B_ij = |u_i − u_j|²
    concate      B_ij = u_i : u_j
    average      B_ij = (u_i + u_j)/2

SiNE and HSNE are social theory-based methods and assume that a node should be closer to its friends than to its foes. Thus, we directly use the Euclidean distance and the Poincaré distance to predict the signs of links for SiNE and HSNE, respectively. Mathematically, we use s(i, j) to represent the score that A_ij is a positive link, defined as s(i, j) = −d(i, j), where d(i, j) = u_i u_j^T for SiNE and d(i, j) = cosh⁻¹(1 + 2‖u_i − u_j‖²/((1 − ‖u_i‖²)(1 − ‖u_j‖²))) for HSNE.

For the other methods, such as SC and SNE, the distance between two points only represents their similarity. In other words, two nodes being farther away does not mean that a negative link connects them. To predict the type of links, these methods first design several functions to map node vectors to edge vectors and then train a classifier to predict the type of links. Following the settings of previous works [32, 15, 16], we test the five mapping functions listed in Table 2. We employ logistic regression as the classifier and use its predicted confidence scores to evaluate the results.

Link sign prediction and reconstruction are essentially binary classification tasks. Since the numbers of the two types of links in each network are imbalanced (see Table 1), we use Macro-F1, Micro-F1 and the Area Under the Curve (AUC) score on the test set to evaluate the performance of each method.
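The five node-to-edge mapping functions of Table 2 can be sketched directly as a lookup table of element-wise operators (an illustrative helper, not the authors' code):

```python
import numpy as np

# The five mapping functions of Table 2: each takes two node vectors u_i, u_j
# and returns one edge vector B_ij.
EDGE_OPERATORS = {
    "hadamard":  lambda u, v: u * v,
    "l1-weight": lambda u, v: np.abs(u - v),
    "l2-weight": lambda u, v: (u - v) ** 2,
    "concate":   lambda u, v: np.concatenate([u, v]),
    "average":   lambda u, v: (u + v) / 2.0,
}
```

Note that "concate" doubles the dimensionality of the edge vector, while the other four preserve it.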
Macro-F1 and Micro-F1:
Let tp, fp, tn and fn denote true positives, false positives, true negatives and false negatives. Precision and recall are defined as:

    precision = tp / (tp + fp),   recall = tp / (tp + fn)

The F1 score is defined as the harmonic mean of precision and recall:

    F1 = 2 · precision · recall / (precision + recall)

Let F_+ and F_− be the F1 scores for positive and negative links, respectively. Macro-F1 and Micro-F1 are defined as

    Macro-F1 = Σ_{s ∈ {+,−}} (1/2) · F_s

and

    Micro-F1 = Σ_{s ∈ {+,−}} (c_s / M) · F_s

where c_s is the number of links with label s and M is the total number of links.

To calculate the F1 scores, SNRL algorithms need a binary classifier to predict the type of each link in the test set. For SiNE and HSNE, we adopt a grid search on the validation set to obtain the threshold with the best results and then predict the type of each link in the test set as:

    A_ij = 1, if s(i, j) > threshold;   A_ij = −1, if s(i, j) < threshold

For the other algorithms, we follow their setup to calculate edge embeddings from node embeddings via the five operators in Table 2, and then train a logistic regression classifier on the validation set with the edge embeddings and the corresponding signs as labels. Finally, we use the classifier to predict the signs of the links in the test set.
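These metrics can be sketched directly from their definitions (a hypothetical helper, not the authors' evaluation code; the AUC follows the pairwise ranking definition of Eqs. 15-16 given later):

```python
def f1(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_micro_f1(f1_pos, f1_neg, n_pos, n_neg):
    """Macro-F1 weights the two classes equally; Micro-F1, as defined in the
    text, weights each class F1 by its share c_s / M of the links."""
    macro = 0.5 * f1_pos + 0.5 * f1_neg
    micro = (n_pos * f1_pos + n_neg * f1_neg) / (n_pos + n_neg)
    return macro, micro

def pairwise_auc(pos_scores, neg_scores):
    """AUC of Eqs. (15)-(16): average over all (positive, negative) link pairs
    of 1 if the positive link scores higher, 0.5 on ties, 0 otherwise."""
    total = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            total += 1.0 if sp > sn else (0.5 if sp == sn else 0.0)
    return total / (len(pos_scores) * len(neg_scores))
```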
AUC Score: The value of AUC depends on the rankings of positive and negative links [33]. For each pair of links in the test set {(e_ij, e_mn) | e_ij ∈ E⁺, e_mn ∈ E⁻}, we first compute

    score(e_ij, e_mn) = 1, if s(i, j) > s(m, n);  0.5, if s(i, j) = s(m, n);  0, if s(i, j) < s(m, n)   (15)

and the final score is obtained by averaging over all pairs of links, i.e.,

    AUC = (1 / (|E⁺| · |E⁻|)) Σ_{e_ij ∈ E⁺} Σ_{e_mn ∈ E⁻} score(e_ij, e_mn)   (16)

In the experiments, we use grid search to tune the hyperparameters of each method on the validation set. For SNE, we vary the sample size ss and the path length pl. For SiNE, we test the real-triplet and virtual-triplet parameters δ. For HSNE, we test the three methods of constructing Â and find that adding a virtual node performs best and is efficient. In the visualization task, we use the random sampling method because the results are clearest. We set λ separately for Wiki-editor, Wiki-rfa, Epinions1, Epinions2, Slashdot1 and Slashdot2 in the link sign prediction task. The embedding dimension K is set to 20 for all methods. For the other parameters, we use the default settings suggested by the authors in their papers or source code.

This section contains two groups of experiments, i.e., link sign prediction and reconstruction. In the first group, we hide 20% of the links of each network (10% as the validation set and the other 10% as the test set) and use the remaining links as the training set. The network representation learning methods learn the node embeddings from the training set and then predict the signs of the hidden links.

Table 3: The average results of the SNRL methods in the link sign prediction task. For SiNE and HSNE, we directly use the distance between two nodes. For the other methods, we report the best result among the five mapping functions, and bold numbers represent the best results of the algorithms on the test set.
    Methods  Metric   Wiki-editor  Epinions1  Epinions2  Slashdot1  Slashdot2  Wiki-rfa
    SC       mac F1   0.567        0.504      0.556      0.480      0.482      0.523
             mic F1   0.765        0.616      0.741      0.750      0.745      0.647
             AUC      0.619        0.576      0.612      0.527      0.515      0.619
    SNE      mac F1   0.538        0.465      0.469      0.494      0.483      0.476
             mic F1   0.587        0.557      0.552      0.531      0.514      0.519
             AUC      0.614        0.553      0.558      0.538      0.526      0.529
    SiNE     mac F1   0.793        0.619      0.641      0.627      0.622      0.564
             mic F1   0.850        0.794      0.803      0.723      0.708      0.706
             AUC      0.890        0.709      0.723      0.694      0.677      0.587
    SIDE     mac F1   0.807        0.719      0.703      0.691      0.684      0.628
             mic F1   0.845        0.814      0.800      0.737      0.731      0.732
             AUC      0.918        0.865      0.838      0.791      0.775      0.722
    SLF      mac F1   0.817        0.774      0.782
             mic F1   0.880
             AUC      0.910        0.909
    [The remaining entries for SLF, BESIDE and HSNE were not recovered from the source.]

Better prediction results mean that the algorithm has better generalization ability. In the second group, we use the same experimental setup as the link sign prediction task, except that the training set and the test set are both the entire network. Intuitively, a higher reconstruction score shows that the corresponding model can extract and preserve more information from the data. We use the first five datasets because the small size of CoW leads to unstable results after hiding links. We run each method 5 times and report the average performance in Table 3 and Table 4.

Table 4: The results of the SNRL methods in the link sign reconstruction task. The experimental setup is the same as the link sign prediction task, except that the training set and the test set are both the entire dataset. This group of experiments aims to compare the capacity of each algorithm to extract and preserve the information of the original networks.

    Methods  Metric   Wiki-editor  Epinions1  Epinions2  Slashdot1  Slashdot2  Wiki-rfa
    SC       mac F1   0.557        0.513      0.569      0.520      0.521      0.530
             mic F1   0.766        0.645      0.723      0.767      0.632      0.650
             AUC      0.643        0.570      0.626      0.578      0.561      0.619
    SNE      mac F1   0.921        0.797      0.468      0.717      0.749      0.495
             mic F1   0.942        0.875      0.533      0.767      0.791      0.542
             AUC      0.988        0.946      0.556      0.846      0.878      0.554
    SiNE     mac F1   0.806        0.665      0.647      0.630      0.647      0.570
             mic F1   0.862        0.836      0.812      0.744      0.752      0.720
             AUC      0.908        0.750      0.724      0.698      0.705      0.598
    SIDE     mac F1   0.820        0.691      0.714      0.717      0.726      0.662
             mic F1   0.857        0.787      0.793      0.762      0.768      0.722
             AUC      0.913        0.848      0.872      0.837      0.840      0.771
    SLF      mac F1   0.813        0.799      0.796      0.825      0.824      0.794
             mic F1   0.850        0.881      0.864      0.860      0.860      0.835
             AUC      0.926        0.934      0.926      0.932      0.933      0.923
    BESIDE   mac F1   0.857        0.848      0.863      0.863      0.858      0.822
             mic F1   0.899        0.931      0.927
    [The remaining entries for BESIDE and HSNE were not recovered from the source.]
In this group of experiments, we are interested in the ability of HSNE to capture the latent hierarchical structure in signed networks. We use HSNE to embed the nodes of CoW into a 2-dimensional Poincaré disk. Since the Poincaré disk can be seen as a continuous version of a tree, we refer to the nodes close to the center of the disk as the root nodes of the network, and to those far from the center as leaf nodes. We divide the space into five areas according to the radius, with the same number of nodes in each area; the results can be found in Figure 2. We also summarize some essential statistical characteristics of each group in Figure 3.

From Figures 2(a) and 3, we can find that some neutral countries, such as Luxembourg and Iceland, lie inside the group closest to the center. These countries have many friendly countries and few unfriendly ones, and can therefore be seen as a bridge or hub between many countries. As the distance from the center increases, the countries tend to form large alliances and are hostile to some countries. Some countries that are hostile to many countries are located at the margin of the space.

Figure 2: Visualization of countries in the CoW network. Each point represents a node of the network, and the red and green lines represent positive and negative links between nodes, respectively. a) We divide the Poincaré disk into five parts by radius, with the same number of nodes in each. We can find that some neutral countries, such as Luxembourg and Iceland, are mapped inside the smallest circle. We refer to these nodes as the root nodes of the network. b) The network generally consists of two communities where the positive links within each community are dense, i.e., the red areas in the upper left and lower right. The distance between nodes in different communities is large, meaning that they are more likely to be connected by negative links. On the other hand, all nodes are close to the root nodes.

Figure 3: Comparison of basic information of the five groups of nodes in Figure 2, sorted by the average Poincaré norm: a) average positive degree d⁺; b) average negative degree d⁻; c) the ratio d⁺/d⁻; and d) average distance from the center, i.e., Poincaré norm.

From Figure 2(b) we can find that this network generally contains two communities, where the positive links within each community are dense, i.e., the two sectors at the upper left and lower right. This is because HSNE utilizes an objective function based on structural balance theory, i.e., nodes connected by positive links are close to each other, and vice versa. If we consider these two groups as the two "subtrees" of the network and then regard the nodes near the center as the "root" of a tree, we can imagine the whole network as a binary tree. In this hierarchical structure, neutral countries have a higher level and serve as a bridge between countries. The countries with lower levels are not friendly to many countries except the neutral countries.
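The equal-size radial grouping used in Figure 2(a) can be sketched as follows (a hypothetical helper, not the paper's plotting code): rank the nodes by Poincaré norm and cut the ranking into five equally populated rings.

```python
import numpy as np

def radius_groups(embeddings, n_groups=5):
    """Split nodes into n_groups rings by Poincare norm (distance from the
    center of the disk), with the same number of nodes in each group, as in
    Figure 2(a). Returns one group index per node (0 = innermost ring)."""
    norms = np.linalg.norm(embeddings, axis=1)
    ranks = np.argsort(np.argsort(norms))      # rank of each node by its norm
    return ranks * n_groups // len(norms)

# 10 toy points on a ray from the center, norms 0.0, 0.1, ..., 0.9.
emb = np.array([[0.1 * i, 0.0] for i in range(10)])
groups = radius_groups(emb)
```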
We can further recursively embed the nodes in eachcommunity, and finally, get the hierarchical structure of the network.In the above experiments, we can conclude that HSNE can capture theunderlying hierarchical structure in the signed networks. One may raise thequestion of why HSNE can represent the network hierarchically? To answerthis question, we plot the nodes of CoW in Figure 4 where the x-axis and y-axis represent the distance from the center and the average distance from othernodes, respectively. We can find that, in Poincar ball, if a point is closer toother points, it is more likely to have a high level in the hierarchy, i.e., small21 .0 0.5 1.0 1.5 2.0 distance to the center d i s t a n c e t o o t h e r p o i n t s Points in Poincaré Ball
Figure 4: Visualization of the nodes in COW dataset. The x-axis represents the distancebetween a node and the center, and the y-axis is the average distance to other nodes. Thisfigure illustrates how HSNE organizes nodes in a network in hyperbolic space.
Poincaré norm, and vice versa. Recall that a positive link in this network represents friendship between two countries, while a negative link represents hostility. Thus, we can conclude that: 1) the countries near the center are more likely to be friendly with other countries because their average distances from other nodes are small; 2) as the number of unfriendly countries increases, the nodes gradually move away from the center, and these countries also tend to form large alliances because nodes connected by positive links are more likely to be close to each other; 3) nodes with many negative links are more likely to occupy lower levels in the hierarchy because they should be far away from many nodes in the network.
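The geometric intuition above can be checked directly on a set of embeddings. The following is a minimal sketch (the 2-D coordinates are hypothetical, not taken from the actual CoW embedding) that computes each point's Poincaré norm and its average geodesic distance to the other points, using the standard Poincaré-ball distance d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))):

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_dist / denom)

# Hypothetical embeddings: one point near the origin (a "root"-like node)
# and two points near the boundary ("leaf"-like nodes).
points = {
    "root": np.array([0.05, 0.0]),
    "leaf_a": np.array([0.9, 0.0]),
    "leaf_b": np.array([-0.9, 0.0]),
}

for name, p in points.items():
    norm = np.linalg.norm(p)  # Poincare norm, i.e., distance proxy to the center
    avg_dist = np.mean([poincare_distance(p, q)
                        for other, q in points.items() if other != name])
    print(f"{name}: norm={norm:.2f}, avg distance to others={avg_dist:.2f}")
```

Running this sketch shows the pattern described above: the point with the smallest Poincaré norm also has the smallest average distance to the other points, which is why nodes near the center behave like the roots of the hierarchy.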
4. Related works
To analyze complex and non-linear systems [34, 35, 36], researchers often represent this ubiquitous networked data as networks, where nodes and links represent entities and their relationships, respectively [1]. To facilitate machine learning-based network analysis algorithms, network representation learning (NRL) methods aim to represent the nodes of a network as low-dimensional vectors [2, 3]. The distances between the vectors reflect the relationships (such as similarity and weight) between the nodes, so that they can be used to visualize the network [37] and to perform machine learning-based network analysis tasks such as node classification [38], link prediction [39] and clustering [40]. NRL methods can be divided into three categories: factorization methods, random walk-based techniques, and deep learning-based methods. Factorization methods such as GraRep [41] and HOPE [42] apply dimensionality reduction algorithms to a matrix representation of the network, such as the node adjacency matrix, the node transition probability matrix, or the Laplacian matrix. Random walk-based NRL methods such as DeepWalk [43] and Node2vec [32] use random walks to obtain the context of each node and then embed nodes into a low-dimensional space according to the similarity of their contexts. This approach is especially useful when the network is large or only locally visible. Recently, deep learning-based methods have become popular due to their powerful ability to model highly non-linear data [44, 45, 46].

In recent years, signed network representation learning (SNRL) has gained considerable attention and proved effective in many tasks, such as node classification and sign prediction [15]. Compared with NRL, SNRL needs to consider more complex semantic information, such as the polarity of links. Recent works point out that negative links have added value over positive links and can improve the performance of traditional tasks [5, 6].
In order to obtain effective node embeddings, SNRL methods follow two basic steps: 1) designing an objective function for learning low-dimensional node embeddings; in this step, negative links can be interpreted either as the negation of positive links or as another type of link [4, 47, 15]; 2) learning the node embeddings with an efficient optimization framework; this step is essential to learn a mapping from node proximity to a low-dimensional Euclidean space, and popular approaches include word2vec [48] and eigenvalue decomposition [28].

Many recent works utilize advanced techniques to further improve the performance of the learning framework, such as deep neural networks [19, 15], attention mechanisms [49], graph convolutional operations [50] and negative sampling methods [51]. For example, SiNE [19] and nSNE [15] apply deep neural networks and metric learning to structural balance-based and node similarity-based objective functions, respectively. SIGNet [51] proposes a novel social theory-based negative sampling technique to optimize classic similarity-based functions efficiently. However, most existing algorithms aim to map nodes to Euclidean space, which is the essential difference from our proposed algorithm.

Two general statistical characteristics are widely found in real-world networks: a) scale-free, meaning that the degree distribution of a network follows a power law p(k) = k^(−γ), typically with 2 < γ < 3.

5. Conclusion and Future Work

In this paper, we develop a novel signed network embedding method based on hyperbolic space. The method automatically learns low-dimensional vector representations of nodes in a signed network to facilitate network analysis algorithms such as visualization, sign prediction and link reconstruction.
We employ structural balance theory from the social theory field to construct the objective function, because many works have reported that most signed networks are balanced or tend to become balanced. This theory guarantees that similar nodes are mapped to nearby locations in the embedding space, while dissimilar nodes are mapped to distant locations. Since the learning algorithms of previous SNRL methods cannot be applied to non-Euclidean spaces, we develop an efficient learning framework based on Riemannian stochastic gradient descent. This framework allows HSNE to scale to large datasets. We empirically use link sign prediction and reconstruction tasks to compare the performance of HSNE and Euclidean-based SNRL methods, and the results show that hyperbolic space can be a better choice than its Euclidean counterpart for representing signed networks. We also use HSNE to embed the real-world dataset CoW. We find that our method places neutral countries near the center of the Poincaré disk. These countries have many friends and few enemies and can therefore be seen as bridges or hubs in the network. On the other hand, as the distance from the center increases, the countries tend to form alliances and are hostile to other countries. These results suggest that HSNE can extract a meaningful latent hierarchical structure from signed networks.

However, HSNE does not consider rich node attribute information, which could intuitively further improve performance. Besides, advanced techniques such as attention mechanisms and graph convolutional operations are also not integrated into HSNE; they may provide a good boost in some domains or tasks. In the future, we will study the problem of recommendation systems based on hyperbolic embedding methods.

Acknowledgements
This work was funded by the Science & Technology Development Project of Jilin Province, China [grant numbers 20190302117GX, 20180101334JC]; the Innovation Capacity Construction Project of Jilin Province Development and Reform Commission [grant number 2019C053-3]; and the China Scholarship Council [grant numbers 201906170208, 201906170205].

Declarations of interest: none.
References

[1] M. Newman, Networks: An Introduction, Oxford University Press, 2010.

[2] D. Zhang, J. Yin, X. Zhu, C. Zhang, Network representation learning: A survey, IEEE Transactions on Big Data.

[3] P. Goyal, E. Ferrara, Graph embedding techniques, applications, and performance: A survey, Knowledge-Based Systems 151 (2018) 78–94.

[4] J. Tang, X. Hu, H. Liu, Is distrust the negation of trust?: the value of distrust in social media, in: Proceedings of the 25th ACM Conference on Hypertext and Social Media, ACM, 2014, pp. 148–157.

[5] J. Leskovec, D. Huttenlocher, J. Kleinberg, Predicting positive and negative links in online social networks, in: Proceedings of the 19th International Conference on World Wide Web, WWW '10, ACM, New York, NY, USA, 2010, pp. 641–650. doi:10.1145/1772690.1772756.

[6] P. Victor, C. Cornelis, M. D. Cock, A. M. Teredesai, Trust- and distrust-based recommendations for controversial reviews, IEEE Intelligent Systems 26 (1) (2011) 48–55. doi:10.1109/MIS.2011.22.

[7] A. Clauset, C. Moore, M. E. Newman, Hierarchical structure and the prediction of missing links in networks, Nature 453 (7191) (2008) 98–101.

[8] E. Ravasz, A.-L. Barabási, Hierarchical organization in complex networks, Physical Review E 67 (2) (2003) 026112.

[9] O.-E. Ganea, G. Becigneul, T. Hofmann, Hyperbolic entailment cones for learning hierarchical embeddings, in: International Conference on Machine Learning, 2018, pp. 1632–1641.

[10] D. Krioukov, F. Papadopoulos, M. Kitsak, A. Vahdat, M. Boguñá, Hyperbolic geometry of complex networks, Physical Review E 82 (3) (2010) 036106.

[11] A. Tifrea, G. Bécigneul, O.-E. Ganea, Poincaré