Hypergraph Laplacians in Diffusion Framework

A Preprint

Mehmet Emin Aktas

Department of Mathematics and Statistics, University of Central Oklahoma, Edmond, OK 73034 [email protected]

Esra Akbas

Department of Computer Science, Oklahoma State University, Stillwater, OK 74078 [email protected]

February 18, 2021

Abstract

Networks are important structures used to model complex systems where interactions take place. In a basic network model, entities are represented as nodes, and interactions and relations among them are represented as edges. However, in a complex system, we cannot describe all relations as pairwise interactions; rather, we should describe them as higher-order interactions. Hypergraphs are successfully used to model higher-order interactions in complex systems. In this paper, we present two new hypergraph Laplacians based on a diffusion framework. Our Laplacians take the relations between higher-order interactions into consideration and hence can be used to model diffusion on hypergraphs not only between vertices but also between higher-order structures. These Laplacians can be employed in different network mining problems on hypergraphs, such as social contagion models on hypergraphs, influence studies on hypergraphs, and hypergraph classification, to list a few.

Keywords Hypergraph · simplicial complex · Laplacian · diffusion

Networks are important structures used to model complex systems where interactions take place. In a basic network model, entities are represented as nodes, and interactions and relations among them are represented as edges. Many different areas, such as biology, chemistry, finance, and the social sciences, use network data to model complex relations. Diffusion on networks is an important concept in network science that models how a quantity, such as information or heat, diffuses between vertices based on the network topology, i.e., the pattern of who is connected to whom. For example, in a social network, modeling information diffusion can be useful in rumor control [1]. The graph Laplacian has been used to model diffusion on basic networks that only consider pairwise relations between entities. However, as we see in different real-world applications, such as human communication, chemical reactions, and ecological systems, interactions can occur in groups of three or more nodes.
They cannot be simply described as pairwise relations [2]; rather, they should be described as higher-order interactions. For example, in ecological systems, interactions can occur in groups of three or more nodes. As another example, in a coauthorship network, nodes represent authors and edges represent coauthorship between authors. An article with three authors results in an edge between each pair of the authors. However, these edges cannot be distinguished from an edge that corresponds to an article with two authors. Hence, the rich higher-order interactions are lost in the basic network model, and we need to take higher-order interactions into consideration for a more accurate representation of complex systems. As one solution to this problem, hypergraphs are used to model complex systems [3, 4]. In a hypergraph, nodes again represent entities, as in graphs, but a hypergraph additionally has hyperedges that represent the higher-order interactions in the network.

Despite the extreme success of the graph Laplacian in the network mining area, there are limited studies on hypergraph Laplacians [5, 6, 7, 8, 9, 10, 11]. Besides, existing studies mainly have three issues. Firstly, the hypergraph Laplacians in these studies are only defined for special hypergraphs, such as uniform hypergraphs, i.e., hypergraphs whose hyperedges all have the same size. Secondly, these hypergraph Laplacians neglect the relations between higher-order structures, i.e., they are not suitable for modeling diffusion on hypergraphs. Thirdly, the hypergraph Laplacians are only used for computing graph-theoretic measures such as the average minimal cut, the isoperimetric number, the max-cut, and the independence number. These studies are mostly about the spectral theory of hypergraph Laplacians and have not found a place in the applied network science area. For example, there are no studies that employ the hypergraph Laplacian to model diffusion between higher-order structures or that apply it in the network mining area.

To address the issues discussed above, we develop new and more general hypergraph Laplacians in this paper. We first represent hypergraphs as simplicial complexes. A simplicial complex is a topological object built as a union of vertices, edges, triangles, tetrahedra, and higher-dimensional polytopes, i.e., simplices. In our representation, simplices represent hyperedges. We then develop two hypergraph Laplacians: one is based on diffusion between simplices of a fixed dimension, and the other is based on diffusion between all simplices. Our Laplacians do not require the hypergraph to be uniform. Our objective here is to take the relations between hyperedges, i.e., simplices, into consideration in hypergraph Laplacians, which is, for instance, crucial in modeling diffusion on hypergraphs.

The paper is organized as follows. In Section 2, we first give the necessary preliminaries and background on graphs, hypergraphs, and Laplacians. In Section 3, we develop two new hypergraph Laplacians to address these needs. Our final remarks and future work directions are given in Section 4.

In this section, we discuss the preliminary concepts for graphs, hypergraphs, the graph Laplacian, and the hypergraph Laplacian. We also elaborate on related work, with a particular focus on the hypergraph Laplacian that uses simplicial complexes.

Graphs, also called networks in the literature, are structured data representing relationships between objects [12, 13]. They are formed by a set of vertices (also called nodes) and a set of edges that are connections between pairs of vertices. Formally, a network $G$ is a pair of sets $G = (V, E)$ where $V$ is the set of vertices and $E \subset V \times V$ is the set of edges of the network.

There are various types of networks that represent the differences in the relations between vertices. In an undirected network, edges link two vertices $v, w$ symmetrically, whereas in a directed network, edges link two vertices asymmetrically. If there is a score for the relationship between vertices that represents the strength of the interaction, we can represent this type of relationship by a weighted network. In a weighted network, a weight function $w : E \to \mathbb{R}$ is defined to assign a weight to each edge.

A hypergraph $H$, denoted by $H = (V, E = (e_i)_{i \in I})$, on the vertex set $V$ is a family $(e_i)_{i \in I}$ ($I$ is a finite set of indexes) of subsets of $V$ called hyperedges. We say that a hypergraph is regular if all its vertices have the same degree and uniform if all its hyperedges have the same cardinality.

In the rest of the paper, we study weighted undirected graphs and hypergraphs unless otherwise stated. Let $G$ be a weighted undirected graph with the vertex set $V$ and a weight function $w : E \to \mathbb{R}_{\geq 0}$. Here we assume that $w(v, v) = 0$ for all $v \in V$ and that if two vertices $u$ and $v$ are not adjacent, then $w(u, v) = 0$. An unweighted graph can be viewed as a special weighted graph with weight 1 on all edges and 0 otherwise.

The adjacency matrix $A$ of $G$ is defined as the $n \times n$ matrix with $A(i, j) = w(v_i, v_j)$ for $i, j \in \{1, \dots, n\}$, with $n$ being the number of vertices of $G$.
Furthermore, let $D$ be the $n \times n$ diagonal matrix with $D(i, i) = \sum_j w(v_i, v_j)$, i.e., the weighted degree of the vertex $v_i$, $i \leq n$.

The graph Laplacian, which first appeared in [14], where the author analyzed flows in electrical networks, is an operator on real-valued functions on the vertices of a graph. We can define the graph Laplacian $L$ as $L = D - A$, where $D$ is the weighted degree matrix and $A$ is the weighted adjacency matrix. It has been shown that the spectrum of the Laplacian is related to many graph features such as connected components, spanning trees, centralities, and diffusion [15].

In this paper, we develop hypergraph Laplacians inspired by the Laplacian of the simplicial complex representation of hypergraphs, namely the simplicial Laplacian (or Hodge Laplacian). That is why, in this section, we explain the simplicial Laplacian. For other hypergraph Laplacians, we refer readers to [5, 6, 7, 8, 9, 11]. We start with the definition of a simplicial complex.

A simplicial complex is a topological object built as a union of points, edges, triangles, tetrahedra, and higher-dimensional polytopes, namely simplices. A 0-simplex is a point, a 1-simplex is two points connected by a line segment, a 2-simplex is a filled triangle, etc. (see Figure 1).

Figure 1: 0-, 1-, 2-, and 3-simplex from left to right (borrowed from [16]).

More formally, a simplicial complex $K$ is a finite collection of simplices, i.e., points, edges, triangles, tetrahedra, and higher-dimensional polytopes, such that every face of a simplex of $K$ belongs to $K$ and the intersection of any two simplices of $K$ is a common face of both of them. In graphs, 0-simplices correspond to vertices, 1-simplices to edges, 2-simplices to triangles, and so on. We denote an $i$-simplex as $\sigma = [v_0, \dots, v_i]$ where $v_j \in V$ for all $j \in \{0, \dots, i\}$.

Let $S_p(K)$ be the set of all $p$-simplices of $K$. An $i$-chain of a simplicial complex $K$ over the field $\mathbb{R}$ is a formal sum of its $i$-simplices, and the $i$-th chain group of $K$ with real coefficients, $C_i(K) = C_i(K, \mathbb{R})$, is a vector space over $\mathbb{R}$ with basis $S_i(K)$. The $i$-th cochain group $C^i(K) = C^i(K, \mathbb{R})$ is the dual of the chain group, which can be defined by $C^i(K) := \mathrm{Hom}(C_i(K), \mathbb{R})$. Here $\mathrm{Hom}(C_i, \mathbb{R})$ is the set of all homomorphisms of $C_i$ into $\mathbb{R}$. For an $(i+1)$-simplex $\sigma = [v_0, \dots, v_{i+1}]$, the coboundary operator $\delta_i : C^i(K) \to C^{i+1}(K)$ is defined as

$$(\delta_i f)(\sigma) = \sum_{j=0}^{i+1} (-1)^j f([v_0, \dots, \hat{v}_j, \dots, v_{i+1}]),$$

where $\hat{v}_j$ denotes that the vertex $v_j$ has been omitted.
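The coboundary formula can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: cochains are stored as Python dictionaries keyed by vertex tuples, the sum runs over $j = 0, \dots, i+1$ (the usual convention for a simplex written $[v_0, \dots, v_{i+1}]$), and the names `coboundary`, `f0`, and `f1` are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def coboundary(f, simplex):
    """Evaluate (delta_i f) on an (i+1)-simplex given as an ordered vertex tuple.

    f is an i-cochain: a dict mapping i-simplices (vertex tuples) to reals.
    """
    total = 0.0
    for j in range(len(simplex)):
        face = simplex[:j] + simplex[j + 1:]  # omit the j-th vertex
        total += (-1) ** j * f[face]
    return total


# A 0-cochain on three vertices (illustrative values).
f0 = {("a",): 1.0, ("b",): 4.0, ("c",): 9.0}

# delta_0 f0 is a 1-cochain: on an edge [u, v] it equals f0(v) - f0(u).
f1 = {edge: coboundary(f0, edge) for edge in combinations(("a", "b", "c"), 2)}

# Applying the coboundary twice gives zero on the triangle [a, b, c].
double = coboundary(f1, ("a", "b", "c"))
```

On an edge the operator reduces to a difference of vertex values, which is why the resulting Laplacians behave like discrete diffusion operators.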
The boundary operators, $\delta_i^*$, are the adjoints of the coboundary operators,

$$\cdots \; C^{i-1}(K) \underset{\delta_{i-1}^*}{\overset{\delta_{i-1}}{\rightleftarrows}} C^{i}(K) \underset{\delta_{i}^*}{\overset{\delta_{i}}{\rightleftarrows}} C^{i+1}(K) \; \cdots$$

satisfying $(\delta_i a, b)_{C^{i+1}} = (a, \delta_i^* b)_{C^i}$ for every $a \in C^i(K)$ and $b \in C^{i+1}(K)$, where $(\cdot, \cdot)_{C^i}$ denotes the scalar product on the cochain group.

In [10], three simplicial Laplacian operators for higher-dimensional simplices are defined using the boundary and coboundary operators between cochain groups:

$$L^{down}_p(K) = \delta_{p-1} \delta^*_{p-1} \quad \text{(down Laplacian)},$$
$$L^{up}_p(K) = \delta^*_p \delta_p \quad \text{(up Laplacian)},$$
$$L_p(K) = L^{up}_p + L^{down}_p \quad \text{(Laplacian)}.$$

These operators are self-adjoint, non-negative, compact, and have different spectral properties [10].

To make the expression of the Laplacian explicit, each coboundary operator $\delta_p$ is identified with an incidence matrix $D_p$ in [10]. The incidence matrix $D_p \in \mathbb{R}^{n_{p+1} \times n_p}$ encodes which $p$-simplices are incident to which $(p+1)$-simplices, where $n_p$ is the number of $p$-simplices. It is defined as

$$D_p(i, j) = \begin{cases} 1 & \text{if } \sigma^p_j \text{ is on the boundary of } \sigma^{p+1}_i, \\ 0 & \text{otherwise.} \end{cases}$$

Here, we assume the simplices are not oriented. One can incorporate the orientations by simply adding "$D_p(i, j) = -1$ if $\sigma^p_j$ is not coherent with the induced orientation of $\sigma^{p+1}_i$" to the definition if needed.

Furthermore, we assume that the simplices are weighted, i.e., there is a weight function $z$ defined on the set of all simplices of $K$ whose range is $\mathbb{R}^+$. Let $W_p$ be an $n_p \times n_p$ diagonal matrix with $W_p(j, j) = z(\sigma^p_j)$ for all $j \in \{1, \dots, n_p\}$. Then, the $i$-dimensional up Laplacian can be expressed as the matrix

$$L^{up}_i = W_i^{-1} D_i^T W_{i+1} D_i.$$

Similarly, the $i$-dimensional down Laplacian can be expressed as the matrix

$$L^{down}_i = D_{i-1} W_{i-1}^{-1} D_{i-1}^T W_i.$$

$L^{down}_i$ is only defined for $i \geq 1$ and is equal to 0 for $i = 0$. Then, to express the $i$-dimensional Laplacian $L_i$, we add these two matrices.

In this paper, we develop two new hypergraph Laplacians motivated by the diffusion framework. Using the simplicial Laplacian defined in Section 2.2.2 for diffusion has two issues. First, for a fixed simplex dimension $k$, the up simplicial Laplacian models diffusion through $(k+1)$-simplices only, and the down simplicial Laplacian through $(k-1)$-simplices only. However, in the diffusion framework, a quantity on a simplex, such as heat or information, can diffuse through other simplices regardless of their dimension. For instance, in a coauthorship network, the simplicial Laplacians assume that an article with three authors can affect other articles with three authors only through articles with two or four authors. This is not realistic, since that article may also affect other articles through an article with one author. Second, when we use the simplicial Laplacians to model diffusion, we need to assume that a quantity only diffuses between $k$-simplices. However, a simplex can affect other simplices regardless of their dimension. For instance, in a coauthorship network, the simplicial Laplacians assume that an article with three authors can affect only articles with three authors. This is again not realistic, for a similar reason.

To address these two issues, we define two new hypergraph Laplacians over simplices. The first hypergraph Laplacian allows defining diffusion between simplices of a fixed dimension through any simplices, which addresses the first issue. The second hypergraph Laplacian allows defining diffusion between any simplices through any simplices, which addresses the second issue. Here, we construct these two Laplacians.
In this section, we propose a Laplacian between simplices of a fixed dimension through any simplices. Let $H$ be a hypergraph with maximum simplex dimension $n$. In the simplicial Laplacian in Section 2.2.2, the incidence matrix is only defined between $p$-simplices and $(p+1)$-simplices for $0 \leq p < n$. In order to define the Laplacian between $p$-simplices through other simplices, not only $(p-1)$- and $(p+1)$-simplices, we define a new incidence matrix as follows.

Definition 1

The incidence matrix between $p$- and $r$-simplices, $D_{p,r} \in \mathbb{R}^{n_r \times n_p}$ for $p < r$, encodes which $p$-simplices are incident to which $r$-simplices, where $n_p$ is the number of $p$-simplices. It is defined as

$$D_{p,r}(i, j) = \begin{cases} 1 & \text{if } \sigma^p_j \text{ is on the boundary of } \sigma^r_i, \\ 0 & \text{otherwise.} \end{cases}$$

The incidence matrix in the definition above allows us to define the Laplacian between $k$-simplices through any simplices as follows.

Definition 2

The Laplacian between $k$-simplices through $l$-simplices with $k \neq l$ in a hypergraph is defined as

$$L_{k,l} = \begin{cases} W_k^{-1} D_{k,l}^T W_l D_{k,l} & \text{if } k < l, \\ D_{l,k} W_l^{-1} D_{l,k}^T W_k & \text{if } k > l. \end{cases}$$

In the definition above, we follow the idea of the up and down Laplacians defined in Section 2.2.2; the weight matrices are placed as in $L^{up}_i$ and $L^{down}_i$, so that each product is an $n_k \times n_k$ matrix. Now, to define the hypergraph Laplacian between $k$-simplices through all simplices, we add up all these Laplacians as follows.

Definition 3

Let $H$ be a hypergraph with maximum simplex dimension $n$. Then, the Laplacian between $k$-simplices through other simplices in $H$ is defined as

$$L_k = L_{k,0} + L_{k,1} + \cdots + L_{k,k-1} + L_{k,k+1} + \cdots + L_{k,n}$$

for $k \in \{0, \dots, n\}$.

Here we provide an example of the hypergraph Laplacian in Definition 3 on a toy graph.
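Definitions 1 to 3 amount to a few matrix products, which the following sketch spells out. It is a minimal illustration, not the authors' implementation. It assumes a hypothetical labeling of a complex with four vertices, five edges, and two triangles sharing an edge; it places the weights as in the up and down Laplacians of Section 2.2.2 ($W_k^{-1} D_{k,l}^T W_l D_{k,l}$ for $k < l$); and it uses unit weights by default. The names `simplices`, `incidence`, `laplacian_through`, and `laplacian` are illustrative.

```python
import numpy as np

# Hypothetical labeling: triangles (1,2,3) and (1,3,4) share the edge (1,3).
simplices = {
    0: [(1,), (2,), (3,), (4,)],
    1: [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4)],
    2: [(1, 2, 3), (1, 3, 4)],
}


def incidence(p, r):
    """D_{p,r} (Definition 1): rows index r-simplices, columns p-simplices, p < r."""
    D = np.zeros((len(simplices[r]), len(simplices[p])))
    for i, big in enumerate(simplices[r]):
        for j, small in enumerate(simplices[p]):
            D[i, j] = float(set(small) <= set(big))
    return D


def weight_vector(p, z):
    """Diagonal of W_p; all ones when no weights are supplied."""
    return np.ones(len(simplices[p])) if z is None else np.asarray(z[p], dtype=float)


def laplacian_through(k, l, z=None):
    """L_{k,l} (Definition 2), with weights placed as in the up/down Laplacians."""
    wk, wl = weight_vector(k, z), weight_vector(l, z)
    if k < l:
        D = incidence(k, l)
        return np.diag(1.0 / wk) @ D.T @ np.diag(wl) @ D
    D = incidence(l, k)
    return D @ np.diag(1.0 / wl) @ D.T @ np.diag(wk)


def laplacian(k, z=None):
    """L_k (Definition 3): sum of L_{k,l} over all dimensions l != k."""
    n = max(simplices)
    return sum(laplacian_through(k, l, z) for l in range(n + 1) if l != k)


L0, L1, L2 = laplacian(0), laplacian(1), laplacian(2)
# With this labeling, L0[0, 0] == 5 (three edges and two triangles contain
# vertex 1) and L0[0, 1] == 2 (vertices 1 and 2 share one edge and one triangle).
```

Passing `z` as a dictionary from dimension to a list of positive weights gives the weighted case; every $L_{k,l}$ is then still an $n_k \times n_k$ matrix.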

Example 4

The simplicial complex in Figure 2 has four vertices (0-simplices), five edges (1-simplices), and two triangles (2-simplices).

Figure 2: An unweighted simplicial complex with four vertices (0-simplices), five edges (1-simplices), and two triangles (2-simplices).

The corresponding incidence matrices as in Definition 1 are $D_{0,1}$ (rows indexed by the five edges, columns by the four vertices), $D_{0,2}$ (rows indexed by the two triangles, columns by the four vertices), and $D_{1,2}$ (rows indexed by the two triangles, columns by the five edges). [Matrix entries missing in source.]

Then, following Definitions 2 and 3, we get the Laplacians between 0-, 1-, and 2-simplices as follows:

$$L_0 = L_{0,1} + L_{0,2}, \qquad L_1 = L_{1,0} + L_{1,2}, \qquad L_2 = L_{2,0} + L_{2,1}.$$

[Matrix entries missing in source.]

We can interpret the hypergraph Laplacians for a fixed dimension as follows. For a fixed dimension $k$, the diagonal entries show the number of neighboring simplices of each $k$-simplex. For example, in the toy graph, the first vertex has five neighboring simplices, namely three edges and two triangles. That is why $L_0(1, 1) = 5$. Furthermore, the off-diagonal entries show the number of neighboring simplices shared between $k$-simplices. For example, the first and second vertices share two neighboring simplices, namely one edge and one triangle. That is why $L_0(1, 2) = L_0(2, 1) = 2$.

The hypergraph Laplacian in Definition 3 extends the simplicial Laplacian by allowing diffusion between simplices of a fixed dimension through any simplices. However, this Laplacian is not able to capture diffusion between simplices of different dimensions. In order to define the generalized hypergraph Laplacian, we define a new incidence matrix that encodes the relation between $p$- and $r$-simplices through $q$-simplices with $p < r$ and $q \notin \{p, r\}$ as follows.

Definition 5

The incidence matrix between $p$- and $r$-simplices through $q$-simplices, $D^q_{p,r} \in \mathbb{R}^{n_r \times n_p}$ with $p < r$, encodes which $p$-simplices are incident to which $r$-simplices through $q$-simplices, where $n_p$ is the number of $p$-simplices. For $q \notin \{p, r\}$, it is defined as

$$D^q_{p,r}(i, j) = s,$$

where $s$ is the number of $q$-simplices that are adjacent to both $\sigma^p_j$ and $\sigma^r_i$. For $q \in \{p, r\}$, we take $D^p_{p,r} = D^r_{p,r} = D_{p,r}$ as in Definition 1.

Now, using the incidence matrix defined above, we define a new incidence matrix between $p$- and $r$-simplices through all simplices as follows.

Definition 6

The incidence matrix between $p$- and $r$-simplices through all simplices, $D_{p,r} \in \mathbb{R}^{n_r \times n_p}$ with $p < r$, encodes which $p$-simplices are incident to which $r$-simplices through any simplex, where $n_p$ is the number of $p$-simplices. It is defined as

$$D_{p,r} = \sum_{i=0}^{n} D^i_{p,r}.$$

As the final step, we define the generalized hypergraph Laplacian between any simplices through any simplices as follows.

Definition 7

Let $H$ be a hypergraph with maximum simplex dimension $n$. Then we define the hypergraph Laplacian of $H$, $L_H$, as the following block matrix:

$$L_H = \begin{pmatrix} L_0 & D^T_{0,1} & D^T_{0,2} & \cdots & D^T_{0,n} \\ D_{0,1} & L_1 & D^T_{1,2} & \cdots & D^T_{1,n} \\ D_{0,2} & D_{1,2} & L_2 & \cdots & D^T_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ D_{0,n} & D_{1,n} & D_{2,n} & \cdots & L_n \end{pmatrix}$$

where $D_{p,r}$ is the incidence matrix between $p$- and $r$-simplices through all simplices of $H$ (Definition 6) and $L_k$ is the Laplacian between $k$-simplices through other simplices of $H$ (Definition 3).

Here we continue the example in the previous section, this time showing how to define the generalized hypergraph Laplacian.

Example 8

In the toy graph in Figure 2, the corresponding incidence matrices as in Definition 5 are $D^2_{0,1}$ (rows indexed by the five edges, columns by the four vertices), $D^1_{0,2}$ (rows indexed by the two triangles, columns by the four vertices), and $D^0_{1,2}$ (rows indexed by the two triangles, columns by the five edges). [Matrix entries missing in source.]

Then, the incidence matrices as in Definition 6 are $D_{0,1}$, $D_{0,2}$, and $D_{1,2}$. [Matrix entries missing in source.]

Finally, if we combine these incidence matrices with the hypergraph Laplacians as in Definition 7, we get the generalized Laplacian $L_H$ for the hypergraph in Figure 2. [Matrix entries missing in source.]

As in the previous hypergraph Laplacian, the diagonal entries show the number of neighboring simplices of each $k$-simplex, and the off-diagonal entries show the number of neighboring simplices shared with other simplices. In the generalized Laplacian, diffusion between simplices happens based on the number of neighboring simplices they share.
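The construction in Definitions 5 to 7 can be sketched in the same way. The code below is an illustration under several assumptions that are not confirmed by the paper: the complex uses a hypothetical labeling (two triangles sharing an edge), all weights are 1, "adjacent" is read as "one simplex is a face of the other", and Definition 6 is read literally, so the terms $q = p$ and $q = r$ each contribute $D_{p,r}$ from Definition 1. All function names are illustrative.

```python
import numpy as np

# Hypothetical labeling: triangles (1,2,3) and (1,3,4) share the edge (1,3).
simplices = {
    0: [(1,), (2,), (3,), (4,)],
    1: [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4)],
    2: [(1, 2, 3), (1, 3, 4)],
}
n = max(simplices)


def incident(a, b):
    """Assumed meaning of 'adjacent': one simplex is a proper face of the other."""
    a, b = set(a), set(b)
    return a != b and (a <= b or b <= a)


def incidence(p, r):
    """D_{p,r} of Definition 1 (rows: r-simplices, columns: p-simplices)."""
    return np.array([[int(incident(small, big)) for small in simplices[p]]
                     for big in simplices[r]])


def laplacian(k):
    """Unweighted L_k of Definition 3."""
    L = np.zeros((len(simplices[k]), len(simplices[k])), dtype=int)
    for l in range(n + 1):
        if l < k:
            D = incidence(l, k)
            L += D @ D.T
        elif l > k:
            D = incidence(k, l)
            L += D.T @ D
    return L


def incidence_through(p, r, q):
    """D^q_{p,r} of Definition 5: count q-simplices incident to both simplices."""
    if q in (p, r):
        return incidence(p, r)
    return np.array([[sum(incident(s, big) and incident(s, small)
                          for s in simplices[q])
                      for small in simplices[p]]
                     for big in simplices[r]])


def incidence_all(p, r):
    """D_{p,r} of Definition 6: sum of D^q_{p,r} over q = 0, ..., n."""
    return sum(incidence_through(p, r, q) for q in range(n + 1))


def generalized_laplacian():
    """L_H of Definition 7 assembled as a block matrix."""
    blocks = [[laplacian(r) if p == r
               else incidence_all(p, r) if p < r
               else incidence_all(r, p).T
               for p in range(n + 1)]
              for r in range(n + 1)]
    return np.block(blocks)


L_H = generalized_laplacian()
```

For this complex `L_H` is an 11 x 11 symmetric matrix whose rows and columns are ordered as vertices, then edges, then triangles. A different reading of "adjacent" or of the sum in Definition 6 would change the off-diagonal blocks but not the assembly.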

In this paper, we develop two new hypergraph Laplacians based on a diffusion framework. These Laplacians can be employed in different network mining problems on hypergraphs, such as social contagion models on hypergraphs, influence studies on hypergraphs, and hypergraph classification, to list a few.

References

[1] Mei Li, Xiang Wang, Kai Gao, and Shanshan Zhang. A survey on information diffusion in online social networks: Models and methods. Information, 8(4):118, 2017.
[2] Federico Battiston, Giulia Cencetti, Iacopo Iacopini, Vito Latora, Maxime Lucas, Alice Patania, Jean-Gabriel Young, and Giovanni Petri. Networks beyond pairwise interactions: structure and dynamics. arXiv preprint arXiv:2006.01764, 2020.
[3] Dengyong Zhou, Jiayuan Huang, and Bernhard Schölkopf. Learning with hypergraphs: Clustering, classification, and embedding. Advances in Neural Information Processing Systems, 19:1601–1608, 2006.
[4] Jin Huang, Rui Zhang, and Jeffrey Xu Yu. Scalable hypergraph learning and processing. Pages 775–780. IEEE, 2015.
[5] Fan Chung. The Laplacian of a hypergraph. Expanding Graphs (DIMACS Series), pages 21–36, 1993.
[6] Linyuan Lu and Xing Peng. High-ordered random walks and generalized Laplacians on hypergraphs. In International Workshop on Algorithms and Models for the Web-Graph, pages 14–25. Springer, 2011.
[7] Shenglong Hu and Liqun Qi. The Laplacian of a uniform hypergraph. Journal of Combinatorial Optimization, 29(2):331–366, 2015.
[8] Joshua Cooper and Aaron Dutle. Spectra of uniform hypergraphs. Linear Algebra and its Applications, 436(9):3268–3292, 2012.
[9] Keqin Feng et al. Spectra of hypergraphs and applications. Journal of Number Theory, 60(1):1–22, 1996.
[10] Danijela Horak and Jürgen Jost. Spectra of combinatorial Laplace operators on simplicial complexes. Advances in Mathematics, 244:303–336, 2013.
[11] T-H Hubert Chan, Anand Louis, Zhihao Gavin Tang, and Chenzi Zhang. Spectral properties of hypergraph Laplacian and approximation algorithms. Journal of the ACM (JACM), 65(3):1–48, 2018.
[12] Charu C Aggarwal and Haixun Wang. Managing and Mining Graph Data, volume 40. Springer, 2010.
[13] Diane J. Cook and Lawrence B. Holder. Mining Graph Data. John Wiley & Sons, 2006.
[14] Gustav Kirchhoff. Ueber die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird. Annalen der Physik, 148(12):497–508, 1847.
[15] Bojan Mohar. Some applications of Laplace eigenvalues of graphs. In Graph Symmetry, pages 225–275. Springer, 1997.
[16] Mehmet E Aktas, Esra Akbas, and Ahmed El Fatmaoui. Persistence homology of networks: methods and applications.