The iterated local transitivity model for hypergraphs
Natalie C. Behague, Anthony Bonato, Melissa A. Huggan, Rehan Malik, Trent G. Marbach
arXiv [cs.DM]
Abstract.
Complex networks are pervasive in the real world, capturing dyadic interactions between pairs of vertices, and a large corpus has emerged on their mining and modeling. However, many phenomena are composed of polyadic interactions between more than two vertices. Such complex hypergraphs range from emails among groups of individuals, to scholarly collaboration, to joint interactions of proteins in living cells. Complex hypergraphs and their models form an emergent topic, requiring new models and techniques.

A key generative principle within social and other complex networks is transitivity, where friends of friends are more likely to be friends. The previously proposed Iterated Local Transitivity (ILT) model incorporated transitivity as an evolutionary mechanism. The ILT model provably satisfies many observed properties of social networks, such as densification, low average distances, and high clustering coefficients.

We propose a new, generative model for complex hypergraphs based on transitivity, called the Iterated Local Transitivity Hypergraph (or ILTH) model. In ILTH, we iteratively apply the principle of transitivity to form new hypergraphs. The resulting model generates hypergraphs simulating properties observed in real-world complex hypergraphs, such as densification and low average distances. We consider properties unique to hypergraphs not captured by their 2-section. We show that certain motifs, which are specified subhypergraphs of small order, have faster growth rates in ILTH hypergraphs than in random hypergraphs with the same order and expected average degree. We show that the graphs admitting a homomorphism into the 2-section of the initial hypergraph appear as induced subgraphs in the 2-section of ILTH hypergraphs. We consider new and existing hypergraph clustering coefficients, and show that these coefficients have larger values in ILTH hypergraphs than in comparable random hypergraphs.

1. Introduction
Complex networks are an effective paradigm for pairwise interactions between objects in real-world systems. Such networks capture dyadic interactions in many phenomena, ranging from friendship ties in Facebook, to Bitcoin transactions, to interactions between proteins in living cells. Complex networks evolve via a number of mechanisms such as preferential attachment or copying that predict how links between vertices are formed over time.
Structural balance theory cites mechanisms to complete triads (that is, subgraphs consisting of three vertices) in social and other complex networks [16, 19]. A central mechanism in balance theory is transitivity: if x is a friend of y, and y is a friend of z, then x is a friend of z; see, for example, [24]. The Iterated Local Transitivity (ILT) model, introduced in [11, 12] and further studied in [8, 9, 25], simulates structural properties in complex networks emerging from transitivity. Transitivity gives rise to the notion of cloning, where an introduced vertex x is adjacent to all of the neighbors of some pre-existing vertex y. Note that in the ILT model, the vertices have local influence within their neighbor sets. Although graphs generated by the model evolve over time, there is a memory of the initial graph hidden in the structure. The ILT model simulates many properties of social networks. For example, as shown in [11], graphs generated by the model densify over time and exhibit bad spectral expansion. In addition, the ILT model generates graphs with the small-world property, which requires graphs to have low diameter and high clustering coefficient compared to random graphs with the same number of vertices and expected average degree.

Key words and phrases. hypergraphs, transitivity, clustering coefficient, 2-section, motifs.
The authors are funded by NSERC. The third author was funded by an NSERC Postdoctoral Fellowship.

Dyadic relationships do not always fully capture the dynamics of interactions between larger groups of vertices. For example, interactions among groups of vertices occur in scholarly collaborations, tags attached to the same web post, or metabolic interactions between more than two reactants. In these examples, a polyadic view of interactions is more accurate, giving rise to hypergraphs. A hypergraph is a discrete structure with vertices and hyperedges, which are sets of vertices. Graphs are special cases of hypergraphs, where each hyperedge has cardinality two. While hypergraph theory is less developed than graph theory, it is an emerging topic in the study of complex, real-world systems; see, for example, [2, 4, 15, 17, 18, 21, 27]. For a recent article discussing the important role of hypergraphs and other higher-order methods for studying complex networks, see [3].

In the present paper, we consider a deterministic model for complex hypergraph networks based on transitivity. The model is analogous to the ILT model, although it has its own unusual features. While every hypergraph can be reduced to its 2-section graph, replacing each hyperedge by a clique, not all hypergraph properties are captured by the 2-section.
As we demonstrate, the ILT hypergraph model we introduce has properties not evident in its 2-section. Further, the model simulates several properties, such as clustering and motif evolution, more robustly when compared to random hypergraphs with analogous characteristics. For simplicity, we consider throughout k-uniform hypergraphs, where each hyperedge has cardinality $k$ for a fixed positive integer $k \ge 2$.

The Iterated Local Transitivity Hypergraph (ILTH) model is defined formally as follows. The model is deterministic and generates $k$-uniform hypergraphs over discrete time-steps. The sole parameter of the model is the initial $k$-uniform hypergraph $H = H_0$. For a nonnegative integer $t$, the hypergraph $H_t$ represents the hypergraph at time-step $t$. To form $H_{t+1}$, for each $x \in V(H_t)$, add a new vertex $x'$ called the clone of $x$. We refer to $x$ as the parent of $x'$, and $x'$ as the child of $x$. For every hyperedge $e$ of $H_t$ containing $x$, we add the hyperedge $e'$ to $H_{t+1}$ formed by replacing $x$ with $x'$. Observe that $e' = (e \setminus \{x\}) \cup \{x'\}$; we simply write $e' = e - x + x'$. Note that all existing hyperedges in $H_t$ are also included in $H_{t+1}$. See Figure 1. We refer to $H_t$ as an ILTH hypergraph, and we sometimes write $H_t = \mathrm{ILTH}_t(H)$ to emphasize the initial hypergraph $H$. Note that $\mathrm{ILTH}_t(H)$ is $k$-uniform for all $t \ge 0$. We sometimes refer to the formation of the hypergraphs $H_t$ as the ILTH process.

The clones form an independent set in $H_{t+1}$, resulting in a doubling of the order of $H_t$. Unlike in the ILT model, a clone and its parent are not in a hyperedge. For a vertex $x$ in $H_t$, we will sometimes use the notation $x^*$ to mean any descendant of $x$; that is, $x^*$ is either $x$ or $x'$ in $H_{t+1}$. Similarly, if $e$ is a hyperedge in $H_t$, then $e^*$ represents one of the descendant hyperedges $e$ or $e - x + x'$ in $H_{t+1}$.

As we will demonstrate, the ILTH model simulates many properties observed in complex hypergraphs, including the small-world property and motif counts. In Section 2, we derive a densification power law for ILTH hypergraphs, and show that distance and spectral properties follow from properties of the 2-section. We then consider subhypergraphs and motifs in Section 3. Motifs are certain hypergraphs with a small number of vertices and hyperedges.
In [21], it was shown that several real-world, complex hypergraphs have motif counts dramatically higher than comparable random hypergraphs. We show that for certain motifs arising in $k$-uniform hypergraphs from the list in [21] of 26 motifs formed from three hyperedges, ILTH has a provably higher count than a random hypergraph with the same average degree. In Theorem 3, we prove that the 2-sections of ILTH hypergraphs contain isomorphic copies of all graphs admitting a homomorphism to the 2-section of $H_0$, and contain only such graphs; as a consequence, certain motifs will be excluded in the ILTH process unless they appear in $H_0$.

Figure 1. The ILTH model with $H_0$ a hyperedge with $k = 3$. (The figure shows the vertices x, y, z and their clones x', y', z'.)

In Section 4, we provide a rigorous analysis of various clustering coefficients for ILTH hypergraphs. Our study of clustering coefficients further validates the small-world property of ILTH hypergraphs, and leads to interesting combinatorial analysis. We consider two clustering coefficients $\mathrm{HC}_1$ and $\mathrm{HC}_2$ and their asymptotic order in ILTH. The clustering coefficient $\mathrm{HC}_1$ was first studied in [17]. We introduce the new parameter $\mathrm{HC}_2$ that is a variant of one that first appeared in [27], although we argue it is more natural and amenable to analysis. In the case of $\mathrm{HC}_1$, we show that this clustering coefficient provides higher clustering than is expected in random hypergraphs with the same average degrees. We show an analogous result for $\mathrm{HC}_2$ in a variation of the ILTH model, where clones and parents are adjacent. We finish with a summary of our results along with open problems on the ILTH model.

Throughout the paper, we consider finite, simple, undirected graphs and hypergraphs. For a general reference on graph theory, see [26]. For a reference on hypergraphs, see [5, 6]. For background on social and complex networks, see [7, 13, 14]. We define terms and notation for hypergraphs when they first appear throughout the article.

2. Densification, eigenvalues, and distances
Many examples of complex networks densify in the sense that the ratio of their number of edges to vertices tends to infinity over time; see [22]. In this section, we show that the ILTH model always generates hypergraphs that densify, and we give a precise statement below of its densification power law.

Let $n(t)$ be the number of vertices in $H_t$ and let $e(t)$ be the number of hyperedges in $H_t$. We establish elementary though important recursive formulas for these parameters.

Theorem 1. For a nonnegative integer $t$, we have the following.
(1) $n(t) = 2^t n(0)$.
(2) $e(t) = (k+1)^t e(0)$.
In particular, we have that $e(t) = \Theta\big(n(t)^{\log_2(k+1)}\big)$.

Proof. For item (1), for each vertex $v$ in $H_t$, there are two vertices $v$ and $v'$ in $H_{t+1}$. Hence, $n(t+1) = 2n(t)$. For item (2), notice that for each hyperedge $e$ in $H_t$, we add to $H_{t+1}$ the hyperedge $e$ and each of the $k$ hyperedges $e - x + x'$ where $x$ is a vertex in $e$. We then have that $e(t+1) = (k+1)e(t)$ for all $t$. The result follows. □

As a consequence, the average vertex degree of $\mathrm{ILTH}_t(H)$ is given by
\[ \frac{k\,e(t)}{n(t)} = \left(\frac{k+1}{2}\right)^{t} \frac{k\,e(0)}{n(0)}, \]
which increases exponentially with $t$. Hence, we have a densification power law for ILTH hypergraphs.

We next turn to the 2-section of ILTH hypergraphs. For this, we consider a variant on the ILT model for graphs, which we call $\mathrm{ILT}'$. Given a graph $G = G_0$, iteratively construct $\mathrm{ILT}'_t(G)$, where $t \ge 0$, starting from $\mathrm{ILT}'_0(G) = G$. For each $v \in V(\mathrm{ILT}'_t(G))$, the vertices $v$ and $v'$ are included in $\mathrm{ILT}'_{t+1}(G)$. For each $uv \in E(\mathrm{ILT}'_t(G))$, the edges $uv$, $uv'$ and $u'v$ are included in $\mathrm{ILT}'_{t+1}(G)$. We have the following lemma, whose proof is immediate.

Lemma 2.
For a nonnegative integer $t$, we have that $\mathrm{ILT}'_t(G)$ is the 2-section of $\mathrm{ILTH}_t(H)$, where $G$ is the 2-section of $H$.

We use the notation $n_t$ and $e_t$ for the order and size of $\mathrm{ILT}'_t(G)$. Observe that $n_t = 2^t n_0$ and $e_t = 3^t e_0$. An implication of Lemma 2 is that any hypergraph property that depends solely on the 2-section behaves the same way for the hypergraph model ILTH as it does for the graph model $\mathrm{ILT}'$. Such properties are not truly exploiting the hypergraph structures evident in ILTH. We briefly discuss some of these properties, including the adjacency matrix, the diameter, and the average distance.

The adjacency matrix $A(H)$ for a hypergraph $H$ has rows and columns indexed by the vertices of $H$, and its $(u,v)$ entry is 1 if $u \ne v$ and there is some hyperedge of $H$ containing both $u$ and $v$, and 0 otherwise. It is evident that this is the same as the adjacency matrix of the 2-section of $H$. In particular, to analyze the adjacency matrix of $\mathrm{ILTH}_t(H)$ we need only consider the adjacency matrix of $\mathrm{ILT}'_t(G)$, where $G$ is the 2-section of $H$.

If $\mathrm{ILT}'_t(G)$ has $n \times n$ adjacency matrix $A$, then $\mathrm{ILT}'_{t+1}(G)$ has $2n \times 2n$ adjacency matrix
\[ \begin{pmatrix} A & A \\ A & 0 \end{pmatrix}, \]
where $0$ is the $n \times n$ all-zeros matrix. It is straightforward to verify that if $A$ has eigenvalue $\rho$ with associated eigenvector $v$, then $\begin{pmatrix} A & A \\ A & 0 \end{pmatrix}$ has eigenvalues $\frac{1 \pm \sqrt{5}}{2}\rho$ with associated eigenvectors
\[ \begin{pmatrix} \frac{1 \pm \sqrt{5}}{2}\, v \\ v \end{pmatrix}. \]
Indeed, writing $\varphi = \frac{1 \pm \sqrt{5}}{2}$, we have $\varphi^2 = \varphi + 1$, so the top block of the product is $(\varphi + 1)\rho v = \varphi \cdot \varphi\rho v$ and the bottom block is $\varphi \rho v$. In particular, given the eigenvalues for the graph $G$, one can calculate the eigenvalues for $\mathrm{ILT}'_t(G)$.

We next consider distance in ILTH hypergraphs. A walk of length $\ell$ connecting two vertices $u$ and $v$ in a hypergraph is a sequence of hyperedges $e_1, e_2, \ldots, e_\ell$ such that $u \in e_1$, $v \in e_\ell$, and $e_i \cap e_{i+1} \ne \emptyset$ for all $1 \le i < \ell$. We say that the distance between two vertices $u, v$, written $d(u, v)$, is the minimum length of a walk connecting $u$ and $v$. This is the same as the distance between two vertices $u$ and $v$ in the 2-section of the hypergraph.
To analyze distances within $\mathrm{ILTH}_t(H)$, it would suffice to consider distances in $\mathrm{ILT}'_t(G)$, where $G$ is the 2-section of $H$, but it is equally convenient to analyze ILTH directly.

Consider vertices $u, v$ in $H_t$ with $u \ne v$. Let $d = d(u, v)$ and let $e_1, e_2, \ldots, e_d$ be a minimum length walk connecting them. We then have that in $H_{t+1}$,
(1) $d(u, v) = d$, using the walk $e_1, e_2, \ldots, e_d$;
(2) $d(u, v') = d$, using the walk $e_1, e_2, \ldots, e_d - v + v'$;
(3) $d(u', v) = d$, using the walk $e_1 - u + u', e_2, \ldots, e_d$;
(4) $d(u', v') = d$ if $d \ge 2$, using the walk $e_1 - u + u', e_2, \ldots, e_d - v + v'$; and
(5) $d(u', v') = 2$ if $d = 1$, using the walk $e_1 - u + u', e_1 - v + v'$, so long as $k \ge 3$.
In cases (1) to (4), no shorter walk exists, or else the predecessors of its hyperedges would form a walk from $u$ to $v$ in $H_t$ of length less than $d$; in case (5), no hyperedge contains two clones.

The diameter of a hypergraph is the maximum distance between any pair of vertices. We find immediately that the diameter of $H_{t+1}$ is the maximum of 2 and the diameter of $H_t$, and, iterating this, is the maximum of 2 and the diameter of $H_0$. In either case, the diameter is a constant, independent of $t$.

To end this section, we determine the average distance between any pair of vertices in $H_t$. Let $W(t)$ be the sum of the distances in $H_t$, or Wiener index, written
\[ W(t) = \sum_{u,v \in V(H_t)} d(u, v). \]
Assume that $H_0$ has no isolated vertices, and so $H_t$ has no isolated vertices for all $t \ge 1$. Writing $e_t = 3^t e_0$ for the number of edges of the 2-section of $H_t$, which is the number of unordered pairs of vertices at distance 1, our calculations pertaining to distances above give that
\begin{align*}
W(t+1) &= \sum_{u,v \in V(H_{t+1})} d(u, v) \\
&= \sum_{u \ne v \in V(H_t)} \big( d(u, v) + d(u', v) + d(u, v') + d(u', v') \big) + \sum_{u \in V(H_t)} \big( d(u, u') + d(u', u) \big) \\
&= 4\Big( \sum_{u,v \in V(H_t)} d(u, v) \Big) + \big|\{ u \ne v \in V(H_t) : d(u, v) = 1 \}\big| + 4n(t) \\
&= 4W(t) + 2e_t + 4n(t).
\end{align*}
Here the sums run over ordered pairs, and we used that $d(u, u') = 2$ for every vertex $u$. Solving this recurrence gives that
\begin{align*}
W(t) &= 4^t \big( W(0) + 2e_0 + 2n(0) \big) - 2e_t - 2n(t) \\
&= 4^t \big( W(0) + 2e_0 + 2n(0) \big) - 2 \cdot 3^t e_0 - 2^{t+1} n(0).
\end{align*}
Thus, the average distance is given by
\[ \frac{W(t)}{n(t)(n(t) - 1)} = \frac{4^t \big( W(0) + 2e_0 + 2n(0) \big) - 2 \cdot 3^t e_0 - 2^{t+1} n(0)}{4^t n(0)^2 - 2^t n(0)}, \]
which tends to $\frac{W(0) + 2e_0 + 2n(0)}{n(0)^2}$ as $t$ tends to infinity. We therefore have that ILTH hypergraphs exhibit a constant average distance, as is found in many real-world hypergraphs; see [15].

3. Subhypergraphs and motifs
We next consider subhypergraphs of the ILTH model, and our first approach is to consider the induced subgraphs of the 2-section. In Theorem 3, it is shown that a graph appears in the 2-section of an ILTH hypergraph exactly when it admits a homomorphism to the 2-section of $H_0$. The theorem guarantees the absence of many kinds of induced subhypergraphs; for example, no hypergraph clique appears in an ILTH hypergraph with larger order than those in $H_0$. We then turn to counting certain small order subhypergraphs, or motifs. Motifs are important in complex networks, as they are one measure of similarity for graphs. For example, the counts of 3- and 4-vertex subgraphs give a similarity measure for distinct graphs; see [10, 23] for implementations of this approach using machine learning. Hypergraph motifs were studied by several authors; see, for example, [1, 4, 21]. In [21], motif counts were analyzed across various real-world complex hypergraphs and compared to random hypergraphs. We show in this section that in ILTH hypergraphs, the growth rate for certain motifs is higher than in comparable random hypergraphs.

3.1. Induced subgraphs of the 2-section.
For all $t \ge 0$, $H_t$ is an induced subhypergraph of $H_{t+1}$. There exists a homomorphism $f_t$ from $H_{t+1}$ to $H_t$ obtained by mapping each clone to its parent and fixing all other vertices. Note that $F_t = f_0 \circ f_1 \circ \cdots \circ f_{t-1}$ is a homomorphism from $H_t$ to $H_0$. As a result, the clique and chromatic numbers of $H_t$ are bounded above by those of $H_0$. This observation puts limitations on the kinds of subgraphs that $H_t$ contains. For additional background on graph homomorphisms, the reader is directed to [20].

The age of a hypergraph is its set of isomorphism types of induced subhypergraphs. As each $F_t$ is a homomorphism, we have that no $H_t$ contains $k$-uniform cliques larger than those in $H_0$. In particular, the set of ages of an ILTH hypergraph does not contain all hypergraphs. This contrasts with the ILT model, where all graphs occur in the set of ages of ILT-graphs; see [8].

Characterizing the ages of ILTH hypergraphs remains an open problem. The next result solves the analogous problem for the ages of 2-sections of ILTH hypergraphs. For a fixed graph $G$ and family of graphs $\mathcal{G}$, we say that $\mathcal{G}$ is $G$-hom-universal if the set of ages of $\mathcal{G}$ consists of all finite graphs admitting a homomorphism to $G$.

Theorem 3.
A graph $G$ admits a homomorphism to $G(H_0)$ if and only if $G$ is an induced subgraph of $G(H_t)$ for some integer $t \ge 0$, where $G(H)$ is the 2-section of $H$. In particular, the set of ages of 2-sections of hypergraphs in $\mathrm{ILTH}(H)$ is $G(H_0)$-hom-universal.

Proof. The reverse direction follows since for an induced subgraph $G$ of $G(H_t)$, the inclusion map is a homomorphism from $G$ to $G(H_t)$. Composing with $F_t$ gives a homomorphism from $G$ to $G(H_0)$.

For the forward direction, suppose that $G$ admits a homomorphism $f$ to $G(H_0)$. Let $u, v \in V(G)$ be two vertices such that $f(u) = f(v)$. Define the homomorphism $f'$ to $G(H_1)$ by $f'(x) = f(x)$ if $x \ne v$, and $f'(v)$ is the clone of the vertex $f(u)$. We then note that the number of vertices in the image of $f'$ is one larger than the number of vertices in the image of $f$. We may repeat this procedure until we find an injective homomorphism $f_i$ from $G$ to $G(H_i)$, for some $i \ge 0$.

Next, suppose that there are vertices $u, v$ in $G$ which are not neighbors but such that $f_i(u) f_i(v)$ is an edge in $G(H_i)$. We can define a new injective homomorphism $f_{i+1}$ to $G(H_{i+1})$ by $f_{i+1}(x) = f_i(x)$ if $x \notin \{u, v\}$, $f_{i+1}(u)$ is the clone of $f_i(u)$, and $f_{i+1}(v)$ is the clone of $f_i(v)$. We then have that the subgraphs induced by $f_i(G)$ and $f_{i+1}(G)$ differ only in the edge $f_i(u) f_i(v)$, as the corresponding edge does not exist in $f_{i+1}(G)$. We can repeat this procedure to construct an injective homomorphism $f_j$ from $G$ to $G(H_j)$ for some $j$, with the property that for all $u, v \in V(G)$, if $f_j(u) f_j(v)$ is an edge in $G(H_j)$, then $uv$ is an edge in $G$. Hence, the subgraph induced by the vertices in $f_j(G)$ in $G(H_j)$ is isomorphic to $G$. □

3.2. Motifs.
We now turn to counting motifs, which are certain types of subhypergraphs. In [21], 26 distinct motifs were studied for three interacting hyperedges $e_1$, $e_2$, and $e_3$. Motif counts may be viewed as a similarity measure for hypergraphs, such as when we are comparing real-world hypergraphs and synthetic ones derived from models.
The different types of motifs emerge by considering which of the following seven regions are nonempty:
\[ e_1 \setminus (e_2 \cup e_3), \quad e_2 \setminus (e_1 \cup e_3), \quad e_3 \setminus (e_1 \cup e_2), \quad e_1 \cap e_2 \setminus e_3, \quad e_2 \cap e_3 \setminus e_1, \quad e_3 \cap e_1 \setminus e_2, \quad e_1 \cap e_2 \cap e_3. \]
We may compactly reference motifs by a binary sequence $i_1 i_2 i_3 i_4 i_5 i_6 i_7$, so that for all $j$, $i_j = 1$ if and only if the $j$th region is nonempty. We refer to the resulting patterns as motif types. See Figure 2 for an example.

Figure 2. The motif type 11 or 1011101. On the left, we represent this motif via a Venn diagram, where the vertex in a region implies it is nonempty. On the right, we have an example of a 3-uniform hypergraph realizing this motif.

We may generalize this notation to a tuple of nonnegative integers, quantifying the number of elements in each region. The cardinality vector of a motif composed of the three hyperedges $e_1, e_2, e_3$ is defined as the 7-tuple
\[ (a, b, c, d, e, f, g) = \big( |e_1 \setminus (e_2 \cup e_3)|,\ |e_2 \setminus (e_1 \cup e_3)|,\ |e_3 \setminus (e_1 \cup e_2)|,\ |e_1 \cap e_2 \setminus e_3|,\ |e_2 \cap e_3 \setminus e_1|,\ |e_3 \cap e_1 \setminus e_2|,\ |e_1 \cap e_2 \cap e_3| \big). \]
Note that a motif contains $a + b + c + d + e + f + g$ vertices. Further, we have that $|e_1| = a + d + f + g = k$, $|e_2| = b + d + e + g = k$, and $|e_3| = c + e + f + g = k$.

In general hypergraphs, there are 26 non-isomorphic motif types; however, we note that only 11 motif types occur in $k$-uniform hypergraphs. With numbering taken from [21], these motif types are:
(1) Motif type 2: 1110001,
(2) Motif type 6: 1110101,
(3) Motif type 11: 1011101,
(4) Motif type 12: 1111101,
(5) Motif type 13: 0001111,
(6) Motif type 14: 1001111,
(7) Motif type 15: 1011111,
(8) Motif type 16: 1111111,
(9) Motif type 24: 1001110,
(10) Motif type 25: 1011110,
(11) Motif type 26: 1111110.
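The cardinality vector and the binary sequence of three given hyperedges can be computed directly from the seven regions. The following is a minimal sketch, assuming hyperedges are given as Python sets; the function names are illustrative and not from [21]. Since the binary sequence depends on the order in which the three hyperedges are listed, the helper `patterns_over_orderings` (an assumption of this sketch, not the paper's convention) collects the sequences over all six orderings, so that a motif type can be recognized up to relabeling of the hyperedges.

```python
from itertools import permutations


def cardinality_vector(e1, e2, e3):
    """Return (a, b, c, d, e, f, g) for the hyperedges e1, e2, e3."""
    e1, e2, e3 = set(e1), set(e2), set(e3)
    return (
        len(e1 - (e2 | e3)),    # a: only in e1
        len(e2 - (e1 | e3)),    # b: only in e2
        len(e3 - (e1 | e2)),    # c: only in e3
        len((e1 & e2) - e3),    # d: in e1 and e2, not e3
        len((e2 & e3) - e1),    # e: in e2 and e3, not e1
        len((e3 & e1) - e2),    # f: in e3 and e1, not e2
        len(e1 & e2 & e3),      # g: in all three
    )


def pattern(e1, e2, e3):
    """Binary sequence i_1...i_7: a digit is 1 exactly when the region is nonempty."""
    return "".join("1" if size else "0" for size in cardinality_vector(e1, e2, e3))


def patterns_over_orderings(e1, e2, e3):
    """All binary sequences obtained by relabeling the three hyperedges."""
    return {pattern(*order) for order in permutations((e1, e2, e3))}


if __name__ == "__main__":
    # A 3-uniform realization of motif type 11 (1011101).
    print(cardinality_vector({1, 2, 3}, {2, 3, 4}, {3, 4, 5}))
    print(pattern({1, 2, 3}, {2, 3, 4}, {3, 4, 5}))
```

For example, with $k = 3$ the hyperedges $\{x, y, z\}$, $\{x', y, z\}$, $\{x, y', z\}$ of $H_1$ in Figure 1 realize motif type 11 once they are listed in a suitable order.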
We keep the numbering from [21] for brevity; for example, we refer to motif 11 rather than 1011101. We focus on these motif types since they always occur in the ILTH model and have higher counts when compared to random hypergraphs, as we describe below. Interestingly, motifs 11 and 12 are more prevalent in the co-authorship hypergraphs compared to random hypergraphs, as shown in [21]. The same conclusion holds for motif 16 for tag hypergraphs. These observations lend credence to the view that ILTH hypergraphs simulate properties observed in real-world, complex hypergraphs.

Let $\alpha_i$ be the maximum number of vertices that can occur in a motif of type $i$ in a $k$-uniform hypergraph. Each value of $\alpha_i$ can be calculated explicitly, and each calculation is straightforward. For example, we may calculate $\alpha_{14}$ as follows. Suppose that the motif in question has cardinality vector $(a, 0, 0, d, e, f, g)$. Without loss of generality we have that
\[ k = a + d + f + g = d + e + g = e + f + g, \]
as each hyperedge contains $k$ vertices. It therefore immediately follows that $d = f$. The total number of vertices is
\[ a + d + e + f + g = k + (k - d - g), \]
which is maximized when $d = f = g = 1$, as $d, g, f > 0$. Hence, $\alpha_{14} = 2k - 2$. The values of $\alpha_i$ are given in Table 1.

i      2      6      11     12     13            14     15     16     24     25     26
α_i    3k−2   3k−3   2k−1   3k−4   ⌊(3k−1)/2⌋    2k−2   2k−2   3k−5   2k−1   2k−1   3k−3

Table 1. The maximum number of vertices $\alpha_i$ in a motif of type $i$ possible in a $k$-uniform hypergraph.

Lemma 4. If $H_t$ contains $x$ motifs of type $i$ with cardinality vector $(a, b, c, d, e, f, g)$, then $H_{t+1}$ contains at least
\[ x\big( g + (c+1)d + (b+1)f + (a+1)e + (a+1)(b+1)(c+1) \big) \]
motifs of type $i$ with cardinality vector $(a, b, c, d, e, f, g)$.

Proof. For a motif in $H_t$ of type $i$ with cardinality vector $(a, b, c, d, e, f, g)$ formed by the hyperedges $e_1, e_2, e_3$, we choose a set $S$ of up to three vertices contained in the motif to clone such that each hyperedge of the motif contains at most one cloned vertex. Consider the motif in $H_{t+1}$ formed by the hyperedges $e'_1, e'_2, e'_3$, where $e'_i$ is the hyperedge obtained from $e_i$ by replacing each vertex that is also in $S$ with its clone and leaving other vertices unchanged. This motif is of type $i$ and has cardinality vector $(a, b, c, d, e, f, g)$. Each motif developed in this way is unique. We must therefore find how many ways there are of choosing $S$, which is
\[ g + (c+1)d + (b+1)f + (a+1)e + (a+1)(b+1)(c+1), \]
and the proof follows. □

We have the following theorem.
Theorem 5. If the initial hypergraph contains at least one hyperedge, then the number of motifs of type 11 in the ILTH model is $\Omega(k^{2t})$.

Proof. A motif of type 11 has cardinality vector $(a, 0, c, d, e, 0, g)$, where $a + d + g = d + e + g = c + e + g = k$, which yields $a = e$ and $c = d$. By Lemma 4, for each motif of type 11 in $H_t$, there will be at least
\[ g + (c+1)d + (a+1)e + (a+1)(c+1) = g + a^2 + c^2 + 2c + 2a + ac + 1 = (k-g)^2 - ac + 2k - g + 1 \]
motifs of type 11 in $H_{t+1}$, which is maximized when $g = a = 1$ and $c = k - 2$ (recall that $a, c, g > 0$), where it equals $k^2 - k + 3$.

Let $e_1$ be a hyperedge in $H_0$. For some $u \in e_1$, there is a hyperedge $e_2 = e_1 \cup \{u'\} \setminus \{u\}$ in $H_1$. For some $v \in e_1 \setminus \{u\}$, there is a hyperedge $e_3$ in $H_{k-1}$ with $e_1 \cap e_2 \cap e_3 = \{v\}$ and $e_2 \cap e_3 = \{u', v\}$; it is obtained from $e_2$ by replacing, one per time-step, each of the $k - 2$ vertices of $e_2 \setminus \{u', v\}$ with its clone. These three hyperedges form a motif of type 11 in $H_{k-1}$ with cardinality vector $(1, 0, k-2, k-2, 1, 0, 1)$. As such, by Lemma 4 there are at least $(k^2 - k + 3)^{t-k+1} = \Omega(k^{2t})$ motifs of type 11 in $H_t$ with cardinality vector $(1, 0, k-2, k-2, 1, 0, 1)$. □

We can also perform a similar analysis of the other motif types that grow rapidly.
Theorem 6. If the initial hypergraph contains at least one hyperedge, then the number of each motif of types 2, 6, 12, 16, and 26 in the ILTH model is $\Omega(k^{3t})$.

Proof. It is straightforward to verify that $H_k$ contains a motif of type $i$ containing $\alpha_i$ vertices, for $i \in \{2, 6, 12, 16, 26\}$. Suppose that the motif in question has cardinality vector $(a, b, c, d, e, f, g)$, and so $\alpha_i = a + b + c + d + e + f + g$. As $\alpha_i = 3k - O(1)$ for these values of $i$, each of $a$, $b$, and $c$ is $k - O(1)$, and so by Lemma 4 the number of motifs of this type and cardinality vector is multiplied by at least $(a+1)(b+1)(c+1) = \Omega(k^3)$ in each iteration of the ILTH process. Hence, there are $\Omega(k^{3(t-k)}) = \Omega(k^{3t})$ motifs of this type in $H_t$, and the result follows. □

Our analysis so far does not apply to motif types 13, 14, 15, 24, and 25, as none of these motif types is generated in the ILTH process on one hyperedge. However, if one of these motif types occurs within the starting hypergraph, then we will have exponential growth of these, as shown in the following theorem.

Theorem 7. If $H_0$ contains a motif of type $i \in \{13, 14, 15, 24, 25\}$ that contains $m$ vertices, then motif $i$ occurs at least $(m+1)^t$ times in $H_t$.

Proof. The proof follows by Lemma 4. □
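The counting step in Lemma 4, on which Theorems 5 to 7 rely, can be sanity-checked by brute force. The following is a minimal sketch; the function names are illustrative, and the representation of a motif by the list of hyperedge indices containing each vertex is an assumption of the sketch rather than anything in the paper. It compares the closed formula of Lemma 4 with a direct enumeration of the sets $S$ in which each hyperedge contains at most one cloned vertex.

```python
from itertools import combinations


def lemma4_multiplier(a, b, c, d, e, f, g):
    """Number of admissible clone sets S according to Lemma 4."""
    return g + (c + 1) * d + (b + 1) * f + (a + 1) * e + (a + 1) * (b + 1) * (c + 1)


def count_clone_sets(a, b, c, d, e, f, g):
    """Brute force: count vertex sets S (including the empty set) such that
    each of the three hyperedges contains at most one vertex of S."""
    # For each region, the indices of the hyperedges containing its vertices.
    regions = [(a, (0,)), (b, (1,)), (c, (2,)), (d, (0, 1)),
               (e, (1, 2)), (f, (0, 2)), (g, (0, 1, 2))]
    vertices = [edges for size, edges in regions for _ in range(size)]
    total = 0
    # |S| <= 3, since every vertex lies in at least one of the three hyperedges.
    for r in range(4):
        for S in combinations(range(len(vertices)), r):
            hits = [0, 0, 0]
            for i in S:
                for j in vertices[i]:
                    hits[j] += 1
            if max(hits) <= 1:
                total += 1
    return total


if __name__ == "__main__":
    for k in range(3, 8):
        v = (1, 0, k - 2, k - 2, 1, 0, 1)  # vector used in the proof of Theorem 5
        print(k, lemma4_multiplier(*v), count_clone_sets(*v), k * k - k + 3)
```

For the cardinality vector $(1, 0, k-2, k-2, 1, 0, 1)$, both counts agree with $k^2 - k + 3$. The multiplier is always at least $m + 1$ for a motif on $m$ vertices, since the empty set and every single vertex are admissible, which is the bound used in Theorem 7.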
We contrast the motif counts for ILTH with comparable random hypergraphs. Let $G(n, k, p)$ be the random hypergraph on $n$ vertices where each possible $k$-set is included as a hyperedge with probability $p$. If we fix two vertices $u$ and $w$, then the expected number of hyperedges $e$ containing both $u$ and $w$ is $\binom{n-2}{k-2} p$.

We consider the $k$-uniform random hypergraph with $n = n(t) = 2^t n(0) = \Theta(2^t)$ vertices and
\[ p = \frac{e(t)}{\binom{n(t)}{k}} = \Theta\big( 2^{(\log_2(k+1) - k)t} \big). \]
We expect $\Theta(n^{\alpha_i} p^3)$ motifs of type $i$ with $\alpha_i$ vertices. To see this, give each vertex in a motif with $\alpha_i$ vertices a label between 1 and $\alpha_i$, and let $e_1$, $e_2$, and $e_3$ be the three $k$-sets of labels corresponding to the three hyperedges in the motif. We select $\alpha_i$ vertices from the set of $n$ vertices in the $k$-uniform random hypergraph, labeling the $j$th choice by the label $j$. There are $\frac{n!}{(n - \alpha_i)!} \sim n^{\alpha_i}$ possible ways to make these choices. The sets of vertices $e_1$, $e_2$, and $e_3$ are hyperedges in the $k$-uniform random hypergraph with probability $p^3$. There is systematic double counting of occurrences of the motif, but this only changes the expectation by a multiple of some function of $k$, which is a constant. The motifs of type $i$ with fewer than $\alpha_i$ vertices will occur $o(n^{\alpha_i} p^3)$ times, so the total number of motifs of type $i$ that have any number of vertices is $\Theta(n^{\alpha_i} p^3)$.

Therefore, we expect
\[ \Theta(n^{\alpha_i} p^3) = \Theta\big( 2^{(\alpha_i - 3(k - \log_2(k+1)))t} \big) \]
many occurrences of motif $i$. If $\alpha_i < 3(k - \log_2(k+1))$, then the expected number of motifs of type $i$ will tend to 0 exponentially fast, and if $\alpha_i > 3(k - \log_2(k+1))$, then the expected number of motifs of type $i$ grows exponentially. In particular, it will be useful to note that if $\alpha_i \le 2k - 1$ and $k \ge 9$, then the expected number of motifs of type $i$ will tend to 0 exponentially fast, and if $\alpha_i = 3k - c$ with $c \in \{2, 3, 4, 5\}$, then the expected number of motifs of type $i$ grows exponentially fast.

As a consequence, we expect motifs 2, 6, 12, 16, and 26 to occur an exponential number of times each in a random hypergraph. We expect that other motifs will rarely occur, with the probability that we see any diminishing as $t$ increases when $k \ge 9$. As a consequence of Theorems 5 and 6, the growth rates of the motif types 2, 6, 11, 12, 16, and 26 are faster in ILTH than in a comparable random $k$-uniform hypergraph.

We finish the section with precise motif counts for ILTH with initial hypergraph a single hyperedge. We ran the ILTH model on a computer, starting with a single hyperedge of cardinality $k$, for $3 \le k \le 6$ and $1 \le t \le 10 - k$. See Tables 2 to 5 below for the motif counts of these ILTH hypergraphs.

t    2             6            11          26
1    0             0            3           1
2    45            126          75          45
3    3447          4770         1083        1141
4    161451        115146       12675       22365
5    5981355       2301930      133563      382981
6    195870195     41818266     1326675     6071085
7    5993456427    720709290    12718443    91888021

Table 2. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 3.

t    2              6              11           12            16          26
1    0              0              6            0             4           0
2    90             504            474          504           188         276
3    16660          75168          14010        42192         5116        34248
4    2651330        6088680        305682       1920888       107712      2341332
5    305991860      369517680      5764506      67434480      2026684     122766120
6    28267339810    19173430584    100158594    2066592024    34911788    1285323380

Table 3. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 4.

t    2             6              11          12            16           26
1    0             0              10          0             10           0
2    150           1110           1490        2100          1870         420
3    40210         356670         82030       540720        189610       234360
4    13613610      77687610       3114650     71894820      12725950     50062740
5    4067088850    12719703750    97894510    6831291600    680649610    7078307400

Table 4. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 5.

t    2            6            11          12           16           26
1    0            0            15          0            20           0
2    229          2070         3285        5040         7680         120
3    79096        994680       301515      2610180      1983740      576720
4    388621215    409931190    18710325    815537880    346117200    370671840

Table 5. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 6.

4. Hypergraph clustering coefficients
The small-world property in complex networks demands low average distance and high clustering coefficients, relative to random graphs with the same expected average degree; see [7] for a discussion. An analogous definition holds for small-world hypergraphs, comparing their properties to a random hypergraph $G(n, k, p)$ with the same order $n$ and $p$ chosen so that they have the same expected average degree. As we demonstrated in Section 2, ILTH hypergraphs have constant average distance. Hence, a natural next step in our investigation is to consider clustering coefficients of ILTH hypergraphs.

There are a variety of hypergraph clustering coefficients we may consider; see [18] for nine distinct coefficients. We focus on a clustering coefficient introduced in [17], along with a new one that is a variant of the one studied in [27]. We begin our discussion of these clustering coefficients with the graph case. For a graph $G$, the global clustering coefficient is
\[ C(G) = \frac{3 \times (\text{number of triangles in } G)}{\text{number of paths of length two in } G}. \]
Note that C ( G ) is a rational number in the interval [ , ] .There are several different ways to generalize the definition of clustering coefficient to hyper-graphs. We discuss three of these in the context of the ILTH model.We define a path of length two in a hypergraph to be a 5-tuple ( u, e , v, e , w ) where u, v, w are distinct vertices, e , e are distinct hyperedges, and u, v ∈ e , v, w ∈ e . Similarly, we definea hypertriangle to be a 6-tuple ( u, e , v, e , w, e ) where u, v, w are distinct vertices, e , e , e aredistinct hyperedges, and u, v ∈ e , v, w ∈ e , w, u ∈ e . We have the following generalization ofthe clustering coefficient to hypergraphs, appearing first in [17]:HC ( H ) = × (number of hypertriangles in H )number of paths of length two in H .
Note that $\mathrm{HC}_1(H) = C(H)$ in the case that $H$ is a graph. However, for general hypergraphs $H$, the values of $\mathrm{HC}_1(H)$ need no longer be in the interval $[0,1]$. For example, the complete $k$-uniform hypergraph on $n$ vertices has $\mathrm{HC}_1(K_n^{(k)}) = \binom{n-2}{k-2}$. The reason for this difference with the graph case is that a given path of length two $(u, e_1, v, e_2, w)$ can be extended to a hypertriangle in many different ways. The hyperedge $e_3$ can be any hyperedge so long as it includes $u$ and $w$. The clustering coefficient $\mathrm{HC}_1$ counts the average number of hypertriangles that are extensions of a path of length two.

We prove the following theorem on $\mathrm{HC}_1$ in Subsection 4.1.

Theorem 8.
For a nonnegative integer $t$, we have that
$$\mathrm{HC}_1(H_t) = \Theta\left(\left(\frac{(k-1)^3 + 3(k-1)}{k^2+1}\right)^t\right).$$
We can show that $H_t$ has a higher value of $\mathrm{HC}_1$ than the random $k$-uniform hypergraph with the same number of vertices and the same expected average degree. See the discussion at the end of Subsection 4.1.

There are other ways to express the clustering coefficient on graphs that lead to different generalisations to hypergraphs. One such equivalent definition is that $C$ is the probability that, given a path of length two, the end vertices are adjacent:
$$C(G) = \mathbb{P}\big(uv \text{ is an edge} : (u, e_1, w, e_2, v) \text{ a path of length two}\big).$$
We say two vertices $u, v$ in a hypergraph are adjacent, written $u \sim v$, if there is some hyperedge $e$ containing both. There is then a natural way to generalize this definition of $C$ to hypergraphs, which, to our knowledge, has not been proposed before:
$$\mathrm{HC}_2(H) = \mathbb{P}\big(u \sim v : (u, e_1, w, e_2, v) \text{ a path of length two in } H\big) = \frac{\text{number of paths } (u, e_1, w, e_2, v) \text{ where } u \sim v}{\text{number of paths of length two}}.$$
Note that since $\mathrm{HC}_2$ is a probability, this clustering coefficient is bounded between 0 and 1. Further, $\mathrm{HC}_2$ matches the clustering coefficient $C$ on graphs.

A different generalization of the clustering coefficient to hypergraphs, due to [27], also retains the property that the clustering coefficient is between 0 and 1, and is closely related to $\mathrm{HC}_2$. Let $\mathcal{I}$ be the set of pairs of intersecting edges in $H$. For $(e, f) \in \mathcal{I}$, define
$$A(e, f) = \left|\{u \in e - f : u \sim w \text{ for some } w \in f - e\}\right|.$$
For $(e_1, e_2) \in \mathcal{I}$, define the extra overlap
$$\mathrm{EO}(e_1, e_2) = \frac{A(e_1, e_2) + A(e_2, e_1)}{|e_1 - e_2| + |e_2 - e_1|}.$$
The extra overlap attempts to capture the number of connections between vertices $u \in e_1 - e_2$ and $w \in e_2 - e_1$. It is evident that $0 \le \mathrm{EO}(e_1, e_2) \le 1$. The following clustering coefficient from [27] is the average extra overlap over all intersecting pairs of edges:
$$\mathrm{HC}_3(H) = \frac{1}{|\mathcal{I}|} \sum_{(e_i, e_j) \in \mathcal{I}} \mathrm{EO}(e_i, e_j).$$
The goals of the authors in [27] were to define a clustering coefficient on hypergraphs that i) takes values in $[0,1]$, ii) matches the normal clustering coefficient when applied to graphs, and iii) reflects the extent of connectivity among neighbors of $v$ due to hyperedges other than the ones connecting $v$ with those neighbors. These three goals are satisfied by $\mathrm{HC}_3$, but they are also all satisfied by $\mathrm{HC}_2$, which we believe to be a more natural definition given that it can be simply expressed as a probability without recourse to the notion of extra overlap. For these reasons, we focus on the new clustering coefficient parameter $\mathrm{HC}_2$. We prove the following theorem on $\mathrm{HC}_2$ in Subsection 4.2.

Theorem 9.
For a nonnegative integer $t$, we have that
$$\mathrm{HC}_2(H_t) = \Theta\left(\left(\frac{k^2}{k^2+1}\right)^t\right).$$
We show that $H_t$ has a lower value of $\mathrm{HC}_2$ than the random $k$-uniform hypergraph with the same number of vertices and the same expected average degree, and so by this measure, it has less clustering. This is in contrast to the clustering coefficient $\mathrm{HC}_1$, and we include a discussion of this phenomenon at the end of the section. We introduce a modified version of ILTH where clones and their parents are in certain hyperedges. For the modified ILTH model, $\mathrm{HC}_2$ has higher values than in random hypergraphs.

The following lemma will prove useful in our study of hypergraph clustering coefficients.

Lemma 10.
Suppose that $v \in V(H_{t-1})$ and $e \in E(H_{t-1})$ with $v \notin e$. Let $v^* \in V(H_t)$ be a descendant of $v$ and $e^* \in E(H_t)$ be a descendant of $e$. We then have that $v^* \notin e^*$.

Proof. Take some $v \in V(H_{t-1})$ and $e \in E(H_{t-1})$ with $v \notin e$. The descendants of $e$ are $e$ and $e - x + x'$ for each $x \in e$. Since $v \notin e$, it is evident that $v$ and $v'$ are not contained in any of the descendants of $e$. □

Lemma 10 is more useful for our purpose in its contrapositive form.

Lemma 11.
Suppose that $v^* \in V(H_t)$ and $e^* \in E(H_t)$ with $v^* \in e^*$. If $v \in V(H_{t-1})$ and $e \in E(H_{t-1})$ are their respective predecessors, then $v \in e$.

4.1. The clustering coefficient $\mathrm{HC}_1$. This subsection is devoted to proving Theorem 8. To that end, we prove two combinatorial lemmas finding the asymptotic order of the number of paths of length two and the number of hypertriangles in $H_t$, respectively.

Lemma 12.
The number of paths of length two in $H_t$ is $\Theta((k^2+1)^t)$.

Proof. Let $P'(t) = \{(e_1, v, e_2) : v \in V(H_t),\ e_1, e_2 \in E(H_t),\ v \in e_1 \cap e_2\}$. Note that, while closely related, this is not the same as the set of paths of length two as we do not include endpoints. We include the degenerate case where $e_1 = e_2$. We find an exact value for $|P'(t)|$ in terms of $t$ and $|P'(0)|$, which will enable us to bound the number of paths of length two.

Fix some $(e_1, v, e_2) \in P'(t-1)$. We wish to count the number of descendants $(e_1^*, v^*, e_2^*)$ this has in $P'(t)$. If $v^* = v'$, then for $v^* \in e_1^* \cap e_2^*$ we must have $e_1^* = e_1 - v + v'$ and $e_2^* = e_2 - v + v'$, so there is one descendant $(e_1^*, v^*, e_2^*)$ in $P'(t)$ where $v^* = v'$. If $v^* = v$, then for $v \in e_1^* \cap e_2^*$ we must have $e_1^* \ne e_1 - v + v'$ and $e_2^* \ne e_2 - v + v'$. All of the $k$ other descendants of $e_1$ and the $k$ other descendants of $e_2$ contain $v$, so there are $k^2$ descendants $(e_1^*, v^*, e_2^*)$ in $P'(t)$ where $v^* = v$. In total, each $(e_1, v, e_2) \in P'(t-1)$ has $k^2 + 1$ descendants $(e_1^*, v^*, e_2^*)$ in $P'(t)$, giving $|P'(t)| \ge (k^2+1)|P'(t-1)|$.

Next, suppose we have some $(e_1^*, v^*, e_2^*) \in P'(t)$, so in particular, $v^* \in e_1^* \cap e_2^*$. Consider their respective predecessors $e_1$, $v$, and $e_2$ in $H_{t-1}$. Lemma 11 provides that $v \in e_1$ and $v \in e_2$, so $(e_1, v, e_2) \in P'(t-1)$. Hence, every triple in $P'(t)$ is a descendant of a triple in $P'(t-1)$, and in particular, $|P'(t)| = (k^2+1)|P'(t-1)|$. Iterating this process, we derive that $|P'(t)| = (k^2+1)^t |P'(0)|$.

Now, let $P(t)$ be the set of paths of length two in $H_t$. Recall that a path of length two is $(u, e_1, v, e_2, w)$ where $u, v, w \in V(H_t)$ are distinct, $e_1, e_2 \in E(H_t)$ are distinct, and $u, v \in e_1$, $v, w \in e_2$.

For $0 \le i \le k$, let $P_i(t)$ be the set of ordered pairs $(e_1, e_2)$ of hyperedges with $|e_1 \cap e_2| = i$. Note that $|P_k(t)| = e(t)$. We then have that $|P'(t)| = \sum_{i=0}^{k} i\,|P_i(t)|$ and $|P(t)| = \sum_{i=1}^{k-1} i\,(k-i)^2\,|P_i(t)|$. This gives that
$$|P'(t)| - k\,e(t) = \sum_{i=1}^{k-1} i\,|P_i(t)| \le |P(t)| \le (k-1)^2\,|P'(t)|,$$
and hence
$$(k^2+1)^t |P'(0)| - k(k+1)^t e(0) \le |P(t)| \le (k-1)^2 (k^2+1)^t |P'(0)|,$$
which completes the proof. □
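As a numerical sanity check on Lemma 12, the following minimal Python sketch builds $H_t$ from a single hyperedge of cardinality $k$ and counts $|P'(t)|$ and the paths of length two by brute force. The function names (`ilth_step`, `count_p_prime`, `count_paths`, `check_lemma_12`) and the choice of a single starting hyperedge are illustrative assumptions and do not come from the paper.

```python
def ilth_step(vertices, edges, t):
    """One ILTH step. The clone x' of x created at step t is labelled (x, t)."""
    clone = {x: (x, t) for x in vertices}
    new_vertices = list(vertices) + [clone[x] for x in vertices]
    new_edges = set(edges)  # every old hyperedge e is kept
    for e in edges:
        for x in e:
            new_edges.add((e - {x}) | {clone[x]})  # the hyperedge e - x + x'
    return new_vertices, new_edges


def count_p_prime(vertices, edges):
    """|P'(t)|: triples (e1, v, e2) with v in both edges (e1 = e2 allowed)."""
    degree = {v: 0 for v in vertices}
    for e in edges:
        for v in e:
            degree[v] += 1
    return sum(d * d for d in degree.values())


def count_paths(edges):
    """Number of 5-tuples (u, e1, v, e2, w) with u, v, w distinct and e1 != e2."""
    total = 0
    for e1 in edges:
        for e2 in edges:
            if e1 == e2:
                continue
            for v in e1 & e2:
                for u in e1 - {v}:
                    for w in e2 - {v}:
                        if u != w:
                            total += 1
    return total


def check_lemma_12(k=3, steps=3):
    """Return rows (t, |P'(t)|, |P(t)|, e(t)) for H_0 a single k-edge (assumed start)."""
    vertices = list(range(k))
    edges = {frozenset(vertices)}
    rows = [(0, count_p_prime(vertices, edges), count_paths(edges), len(edges))]
    for t in range(1, steps + 1):
        vertices, edges = ilth_step(vertices, edges, t)
        rows.append((t, count_p_prime(vertices, edges), count_paths(edges), len(edges)))
    return rows


if __name__ == "__main__":
    for row in check_lemma_12():
        print(row)
```

For this starting hypergraph $|P'(0)| = k$, so the second entry of each row should equal $k(k^2+1)^t$, and the third entry should lie between the two bounds in the final display of the proof.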
Lemma 13.
The number of hypertriangles in $H_t$ is $\Theta\big(((k-1)^3 + 3(k-1))^t\big)$.

Proof. Let
$$T'(t) = \{(u, e_1, v, e_2, w, e_3) : u, v, w \in V(H_t) \text{ distinct},\ e_1, e_2, e_3 \in E(H_t),\ u \in e_1 \cap e_3,\ v \in e_1 \cap e_2,\ w \in e_2 \cap e_3\}.$$
Note that, while closely related, this is not the same as the set of hypertriangles as we do not insist that the edges $e_1$, $e_2$ and $e_3$ are distinct. We find an exact value for $|T'(t)|$ in terms of $t$ and $|T'(0)|$, which will enable us to bound the number of hypertriangles.

Fix some $(u, e_1, v, e_2, w, e_3) \in T'(t-1)$. We wish to count the number of descendants $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ this has in $T'(t)$. If $v^* = v'$, then for $v^* \in e_1^* \cap e_2^*$ we must have $e_1^* = e_1 - v + v'$ and $e_2^* = e_2 - v + v'$. Since $u' \notin e_1 - v + v'$ and $w' \notin e_2 - v + v'$, this means that $u^* = u$ and $w^* = w$. Since $u^*$ and $w^*$ are in $e_3^*$, $e_3^*$ must be $e_3$ or $e_3 - x + x'$ for some $x \in e_3$ not equal to $u$ or $w$, and indeed each of these $k - 1$ choices for $e_3^*$ gives a $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$.

An analogous argument in the cases $u^* = u'$ and $w^* = w'$ shows that if one of $u^*, v^*, w^*$ is a clone then the other two are not, and there are $3(k-1)$ descendants $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$ of this form.

Otherwise, none of $u^*, v^*, w^*$ is a clone. We then have that $e_1^*$ must be $e_1$ or $e_1 - x + x'$ for some $x \in e_1 - u - v$, $e_2^*$ must be $e_2$ or $e_2 - y + y'$ for some $y \in e_2 - v - w$, and $e_3^*$ must be $e_3$ or $e_3 - z + z'$ for some $z \in e_3 - u - w$. Any combination of these gives a $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$, and so there are $(k-1)^3$ contributing to the count. In total, each $(u, e_1, v, e_2, w, e_3) \in T'(t-1)$ has $(k-1)^3 + 3(k-1)$ descendants $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$, giving $|T'(t)| \ge ((k-1)^3 + 3(k-1))\,|T'(t-1)|$.

In the other direction, suppose we have some $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$. Consider their respective predecessors $u, e_1, v, e_2, w$ and $e_3$ in $H_{t-1}$.
We know that $u, v, w$ must be distinct: if, say, $u = v$ then either $u^* = v^*$, contradicting that $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*) \in T'(t)$, or $\{u^*, v^*\} = \{v, v'\}$. This in turn contradicts that there is a hyperedge $e_1^*$ containing both. An analogous argument shows that $v \ne w$ and $w \ne u$. Lemma 11 provides that $u \in e_1 \cap e_3$, $v \in e_1 \cap e_2$ and $w \in e_2 \cap e_3$, so $(u, e_1, v, e_2, w, e_3) \in T'(t-1)$. Hence, every 6-tuple in $T'(t)$ is a descendant of a 6-tuple in $T'(t-1)$, and in particular, $|T'(t)| = ((k-1)^3 + 3(k-1))\,|T'(t-1)|$. Iterating this, we obtain that $|T'(t)| = ((k-1)^3 + 3(k-1))^t\,|T'(0)|$.

Now, let $T(t)$ be the set of hypertriangles in $H_t$. Note that $|T(t)|$ is the number of 6-tuples $(u, e_1, v, e_2, w, e_3)$ in $T'(t)$ where $e_1$, $e_2$ and $e_3$ are all distinct. Hence, we have that
$$|T(t)| \le |T'(t)| = ((k-1)^3 + 3(k-1))^t\,|T'(0)|.$$
For a lower bound, we count the number of 6-tuples where $e_1$, $e_2$ and $e_3$ are not distinct. If $e_1 = e_2 \ne e_3$, then $u$ and $w$ are distinct elements in $e_1 \cap e_3 = e_2 \cap e_3$ and $v \in e_1 - u - w$. Recalling that $|P_i(t)|$ is the number of pairs of edges intersecting in $i$ vertices, we find that there are $\sum_{i=2}^{k-1} i(i-1)(k-2)\,|P_i(t)|$ such 6-tuples. Similarly, there are $\sum_{i=2}^{k-1} i(i-1)(k-2)\,|P_i(t)|$ with $e_1 = e_3 \ne e_2$ and with $e_2 = e_3 \ne e_1$.

Finally, note that when $e_1 = e_2 = e_3$ we just have $u, v, w$ distinct vertices in $e_1$, and so there are $k(k-1)(k-2)\,e(t)$ 6-tuples in $T'(t)$ with $e_1 = e_2 = e_3$. Putting these together gives
$$|T'(t)| = |T(t)| + 3\sum_{i=2}^{k-1} i(i-1)(k-2)\,|P_i(t)| + k(k-1)(k-2)\,e(t).$$
To bound $\sum_{i=2}^{k-1} i(i-1)\,|P_i(t)|$, we use that $\sum_{i=0}^{k} i\,|P_i(t)| = |P'(t)| = (k^2+1)^t\,|P'(0)|$ as calculated in the proof of Lemma 12. In particular, we have that
$$\sum_{i=2}^{k-1} i(i-1)\,|P_i(t)| \le (k-2)\left(\sum_{i=0}^{k} i\,|P_i(t)|\right) \le (k-2)(k^2+1)^t\,|P'(0)|.$$
We next have that
$$|T(t)| = |T'(t)| - 3(k-2)\sum_{i=2}^{k-1} i(i-1)\,|P_i(t)| - k(k-1)(k-2)\,e(t) \ge ((k-1)^3 + 3(k-1))^t\,|T'(0)| - 3(k-2)^2 (k^2+1)^t\,|P'(0)| - k(k-1)(k-2)(k+1)^t\,e(0),$$
which completes the proof. □

As an immediate consequence of Lemmas 12 and 13, we obtain Theorem 8 on the value of the $\mathrm{HC}_1$ clustering coefficient on ILTH hypergraphs. To contextualize the result of Theorem 8, we compare $\mathrm{HC}_1(H_t)$ to $\mathrm{HC}_1$ for other $k$-uniform hypergraphs. For the complete $k$-uniform hypergraph $K_n^{(k)}$ it is straightforward to derive by counting choices of $u, v, w$ and the edges containing them that
$$\mathrm{HC}_1(K_n^{(k)}) = \frac{\binom{n}{3}\binom{n-2}{k-2}^{3}}{\binom{n}{3}\binom{n-2}{k-2}^{2}} = \binom{n-2}{k-2}.$$
When $n = n(t) = 2^t n(0)$, this gives $\mathrm{HC}_1(K_n^{(k)}) = \Theta(2^{(k-2)t})$, which is larger than $\mathrm{HC}_1(H_t)$, as expected.

We consider the expected value of $\mathrm{HC}_1$ in the random hypergraph $G(n,k,p)$.
Here, given a path $(u, e_1, v, e_2, w)$ of length two, the expected number of hypertriangles of the form $(u, e_1, v, e_2, w, e_3)$ is $\binom{n-2}{k-2}p$. This gives
$$\mathbb{E}(\mathrm{HC}_1(G(n,k,p))) = \binom{n-2}{k-2}\,p.$$
Let $n = n(t) = 2^t n(0)$ and $p = (k+1)^t e(0) / \binom{n}{k}$. We then have that
$$\mathbb{E}(\mathrm{HC}_1(G(n,k,p))) = \frac{\binom{n-2}{k-2}(k+1)^t e(0)}{\binom{n}{k}} = \frac{k(k-1)(k+1)^t e(0)}{2^t n(0)\,(2^t n(0) - 1)} = \Theta\left(\left(\frac{k+1}{4}\right)^t\right).$$
As $\frac{k+1}{4} < \frac{(k-1)^3 + 3(k-1)}{k^2+1}$, the clustering coefficient $\mathrm{HC}_1$ for $H_t$ grows faster than that for the random hypergraph of the same expected average degree.

4.2. The clustering coefficient $\mathrm{HC}_2$. In this subsection, we prove Theorem 9. We first introduce a useful set of 5-tuples:
$$A(t) = \{(u, e_1, v, e_2, w) : u, v, w \in V(H_t) \text{ distinct},\ e_1, e_2 \in E(H_t),\ \text{and for some } e_3 \in E(H_t),\ u \in e_1 \cap e_3,\ v \in e_1 \cap e_2,\ w \in e_2 \cap e_3\}.$$
One view of a 5-tuple in $A(t)$ is as a (possibly degenerate) path of length two that can be completed to a (possibly degenerate) hypertriangle. We have the following lemma counting the elements of $A(t)$, which will greatly assist in estimating $\mathrm{HC}_2$ in ILTH hypergraphs.

Lemma 14.
For all nonnegative integers $t$, $|A(t)| = (k^2)^t\,|A(0)|$.

Proof. For a fixed 5-tuple $(u, e_1, v, e_2, w) \in A(t-1)$, we count the number of descendants $(u^*, e_1^*, v^*, e_2^*, w^*)$ this has in $A(t)$. If $v^* = v'$, then for $v^* \in e_1^* \cap e_2^*$ we must have $e_1^* = e_1 - v + v'$ and $e_2^* = e_2 - v + v'$. Since $u' \notin e_1 - v + v'$ and $w' \notin e_2 - v + v'$, this means that $u^* = u$ and $w^* = w$, and we know there is a hyperedge $e_3$ containing both. Thus, there is one descendant $(u^*, e_1^*, v^*, e_2^*, w^*)$ in $A(t)$ with $v^* = v'$.

Otherwise, suppose $v^* = v$. We cannot have both $u^* = u'$ and $w^* = w'$ as there does not exist any hyperedge in $E(H_t)$ containing both $u'$ and $w'$. We can have $u^* = u'$ and $w^* = w$, as the hyperedge $e_3 - u + u' \in E(H_t)$ contains both. In this case, $e_1^*$ must be $e_1 - u + u'$ and $e_2^*$ must be $e_2$ or $e_2 - y + y'$ for some $y \in e_2 - v - w$, giving $k - 1$ descendants in $A(t)$. Similarly, we can have $u^* = u$ and $w^* = w'$, and there are a further $k - 1$ descendants in $A(t)$ of this form.

Finally, we can have $u^* = u$ and $w^* = w$ as we know the hyperedge $e_3$ contains both. In this case $e_1^*$ must be $e_1$ or $e_1 - x + x'$ for some $x \in e_1 - v - u$ and $e_2^*$ must be $e_2$ or $e_2 - y + y'$ for some $y \in e_2 - v - w$, giving $(k-1)^2$ descendants in $A(t)$ of this form. In total, each $(u, e_1, v, e_2, w) \in A(t-1)$ has $1 + 2(k-1) + (k-1)^2 = k^2$ descendants $(u^*, e_1^*, v^*, e_2^*, w^*)$ in $A(t)$, giving $|A(t)| \ge k^2\,|A(t-1)|$.

In the other direction, suppose we have some $(u^*, e_1^*, v^*, e_2^*, w^*)$ in $A(t)$. Let $e_3^*$ be a hyperedge in $H_t$ containing both $u^*$ and $w^*$. Consider their respective predecessors $u, e_1, v, e_2, w$ and $e_3$ in $H_{t-1}$. We know that $u, v, w$ must be distinct: if, say, $u = v$ then either $u^* = v^*$, contradicting $(u^*, e_1^*, v^*, e_2^*, w^*) \in A(t)$, or $\{u^*, v^*\} = \{v, v'\}$, contradicting that there is a hyperedge $e_1^*$ containing both. An analogous argument shows that $v \ne w$ and $w \ne u$. Applying Lemma 11 shows that $u \in e_1 \cap e_3$, $v \in e_1 \cap e_2$ and $w \in e_2 \cap e_3$, so $(u, e_1, v, e_2, w) \in A(t-1)$. Hence, every 5-tuple in $A(t)$ is a descendant of a 5-tuple in $A(t-1)$, and in particular, $|A(t)| = k^2\,|A(t-1)|$. Iterating this, we have that $|A(t)| = (k^2)^t\,|A(0)|$. □
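The growth factors in Lemmas 13 and 14 can be checked numerically on small cases. The sketch below is a brute-force illustration, not the authors' code: it assumes $H_0$ is a single hyperedge of cardinality $k$, and the helper names (`ilth_step`, `codegrees`, `count_t_prime_and_a`, `growth_check`) are hypothetical. It uses the observation that, for a fixed ordered triple $(u, v, w)$, the number of 6-tuples in $T'(t)$ is the product of the three pairwise codegrees, and the number of 5-tuples in $A(t)$ is the product of the codegrees of $uv$ and $vw$ whenever $u \sim w$.

```python
from itertools import combinations, permutations


def ilth_step(vertices, edges, t):
    """One ILTH step. The clone x' of x created at step t is labelled (x, t)."""
    clone = {x: (x, t) for x in vertices}
    new_vertices = list(vertices) + [clone[x] for x in vertices]
    new_edges = set(edges)  # every old hyperedge e is kept
    for e in edges:
        for x in e:
            new_edges.add((e - {x}) | {clone[x]})  # the hyperedge e - x + x'
    return new_vertices, new_edges


def codegrees(edges):
    """Map each unordered vertex pair to the number of hyperedges containing it."""
    counts = {}
    for e in edges:
        for a, b in combinations(e, 2):
            key = frozenset((a, b))
            counts[key] = counts.get(key, 0) + 1
    return counts


def count_t_prime_and_a(vertices, edges):
    """Return (|T'(t)|, |A(t)|) by summing over ordered triples (u, v, w)."""
    counts = codegrees(edges)

    def cod(a, b):
        return counts.get(frozenset((a, b)), 0)

    t_prime = 0
    a_count = 0
    for u, v, w in permutations(vertices, 3):
        uv, vw, uw = cod(u, v), cod(v, w), cod(u, w)
        t_prime += uv * vw * uw   # choices of e1, e2, e3 (not necessarily distinct)
        if uw > 0:                # u ~ w, so some e3 exists
            a_count += uv * vw    # choices of e1, e2
    return t_prime, a_count


def growth_check(k=3, steps=3):
    """Return rows (t, |T'(t)|, |A(t)|) for H_0 a single k-edge (assumed start)."""
    vertices = list(range(k))
    edges = {frozenset(vertices)}
    rows = [(0,) + count_t_prime_and_a(vertices, edges)]
    for t in range(1, steps + 1):
        vertices, edges = ilth_step(vertices, edges, t)
        rows.append((t,) + count_t_prime_and_a(vertices, edges))
    return rows


if __name__ == "__main__":
    for row in growth_check():
        print(row)
```

With a single starting hyperedge, $|T'(0)| = |A(0)| = k(k-1)(k-2)$, so the rows should grow by the factors $(k-1)^3 + 3(k-1)$ and $k^2$ per step, respectively.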
Proof of Theorem 9.
Recall that
$$\mathrm{HC}_2(H) = \frac{\text{number of paths } (u, e_1, w, e_2, v) \text{ where } u \text{ and } v \text{ are in a hyperedge}}{\text{number of paths of length two}}.$$
Let $\Lambda(t)$ be the set of paths $(u, e_1, v, e_2, w)$ in $H_t$ where $u \sim w$. We then have that $\Lambda(t) \subseteq A(t)$. Also, a 5-tuple $(u, e_1, v, e_2, w)$ is in $A(t)$ but not $\Lambda(t)$ if and only if $e_1 = e_2$, and there are $k(k-1)(k-2)\,e(t)$ such 5-tuples. Thus, we have that
$$|\Lambda(t)| = |A(t)| - k(k-1)(k-2)\,e(t) = k^{2t}\,|A(0)| - k(k-1)(k-2)(k+1)^t\,e(0) = \Theta(k^{2t}).$$
Combining this with Lemma 12, we derive that $\mathrm{HC}_2(H_t) = \Theta\left(\left(\frac{k^2}{k^2+1}\right)^t\right)$, as required. □

We contextualize these results by comparing them to the random $k$-uniform hypergraph $G(n,k,p)$ with the same expected average degree. We derive a lemma computing the expected value of $\mathrm{HC}_2$ on random hypergraphs.
Lemma 15.
For a given $k$ and $p$, we have that
$$\mathbb{E}(\mathrm{HC}_2(G(n,k,p))) = 1 - (1-p)^{\binom{n-2}{k-2}}.$$
Proof. Suppose that we are given a path $(u, e_1, v, e_2, w)$ and we wish to know the probability that the two vertices $u, w$ lie in some hyperedge. There are $\binom{n-2}{k-2}$ $k$-sets containing both $u$ and $w$, and the probability that none of them is a hyperedge of $G(n,k,p)$ is $(1-p)^{\binom{n-2}{k-2}}$. Thus, the probability that $u \sim w$ is $1 - (1-p)^{\binom{n-2}{k-2}}$. □

We compare $H_t$ to a random hypergraph with the same number of vertices and the same expected average degree. Set $n = 2^t n(0)$ and choose $p$ such that $\binom{n}{k} p = (k+1)^t e(0)$. We then have that
$$\mathbb{E}(\mathrm{HC}_2(G(n,k,p))) \ge 1 - (1-p)^{\binom{n-2}{k-2}} \ge 1 - \exp\left(-p\binom{n-2}{k-2}\right) \ge 1 - \exp\left(-c\left(\frac{k+1}{4}\right)^t\right),$$
where $c$ depends only on $k$, $n(0)$, and $e(0)$. Hence, we conclude that $\mathbb{E}(\mathrm{HC}_2(G(n,k,p)))$ is at least $1 - \exp\left(-c\left(\frac{k+1}{4}\right)^t\right)$. For $k \ge 4$, this quantity tends to 1 as $t$ tends to infinity, and it does so doubly exponentially fast. On the other hand, we have that $\mathrm{HC}_2(H_t) = O\left(\left(\frac{k^2}{k^2+1}\right)^t\right)$, which tends to 0 exponentially fast as $t$ tends to infinity. Thus, we find that by this measure the clustering for $H_t$ is extremely low compared to the random hypergraph with the same expected average degree. If $k = 3$, then $\mathbb{E}(\mathrm{HC}_2(G(n,k,p)))$ is at least the constant $1 - e^{-c}$, which is larger than $\mathrm{HC}_2(H_t) = O\left(\left(\frac{9}{10}\right)^t\right)$.

Measured by the clustering coefficient $\mathrm{HC}_1$, the hypergraph $H_t$ has higher clustering than comparable random hypergraphs, but this fails for $\mathrm{HC}_2$. The reason for the discrepancy is that the two clustering coefficients are counting different structures. Given a pair of intersecting edges $e_1, e_2$, the value of $\mathrm{HC}_2$ counts how many pairs of vertices $u \in e_1 - e_2$, $w \in e_2 - e_1$ there are that are contained in some hyperedge $e_3$. As this is low for $H_t$ compared to random hypergraphs, fewer of those pairs are contained in any hyperedge than we might expect. The value of $\mathrm{HC}_1$ roughly counts how many edges $e_3$ intersect both $e_1$ and $e_2$ to make a hypertriangle. As this is large for $H_t$ when compared to random hypergraphs, there are more of these edges than we might expect. Hence, relative to the random hypergraph, fewer pairs of vertices $u \in e_1 - e_2$, $w \in e_2 - e_1$ are contained in a hyperedge, but those that are contained in a hyperedge must be contained in many hyperedges.

4.3. A variant of ILTH with large $\mathrm{HC}_2$ values. To remedy the situation with ILTH having lower $\mathrm{HC}_2$ values than random hypergraphs, we consider a variant of the model where clones and their parents are in certain hyperedges. Such a variant is a natural one, as we may expect newly formed hyperedges to include both parent and child vertices.

Let $H^{(2)}_0$ be a fixed $k$-uniform hypergraph. For $t \ge 0$, we iteratively construct $H^{(2)}_{t+1}$ from $H^{(2)}_t$ as follows. For each $v \in V(H^{(2)}_t)$, add $k$ vertices $v$ and $v_1, v_2, \ldots, v_{k-1}$ to $H^{(2)}_{t+1}$. We call these $v_i$ the clones of $v$. For each $e \in E(H^{(2)}_t)$, add to $H^{(2)}_{t+1}$ the hyperedge $e$ and each of the edges $e - x + x_i$, where $x$ is a vertex in $e$ and $1 \le i \le k-1$. In addition, for each $v \in V(H^{(2)}_t)$ add to $H^{(2)}_{t+1}$ the hyperedge $\{v, v_1, v_2, \ldots, v_{k-1}\}$. We refer to the model as ILTH$_2$, and hypergraphs generated by the model are ILTH$_2$ hypergraphs. See Figure 3.

Figure 3. The ILTH$_2$ model applied to cloning $x$ in the hyperedge $xyz$.

The ILTH$_2$ model is motivated by the desire to have clones and parents adjacent, as in the original ILT model. While the models are distinct, ILTH$_2$ hypergraphs share properties with the ILTH hypergraphs such as densification and low distances. One key difference between ILTH and ILTH$_2$ is the clustering coefficient $\mathrm{HC}_2$. We have the following theorem, whose proof is analogous to the one of Theorem 9 and so is omitted.
Theorem 16.
For nonnegative integers $t$, we have that
$$\mathrm{HC}_2(H^{(2)}_t) = \Theta\left(\left(1 - \frac{(k-1)^2}{(k^2-2k+2)^2 + k - 1}\right)^t\right).$$
We compare $H^{(2)}_t$ to the random hypergraph with the same number of vertices and the same expected average degree. Set $n = k^t n(0)$ and choose $p$ such that the expected number of edges $\binom{n}{k}p$ is $e(t)$. In particular, we have that
$$p = \Theta\left(\left(\frac{k^2-k+1}{k^k}\right)^t\right).$$
Applying Lemma 15, we have that
$$\mathbb{E}(\mathrm{HC}_2(G(n,k,p))) = \Theta\left(\left(\frac{k^2-k+1}{k^2}\right)^t\right).$$
For all $k \ge 2$, we find that
$$\frac{k^2-k+1}{k^2} = 1 - \frac{k-1}{k^2} < 1 - \frac{(k-1)^2}{(k^2-2k+2)^2 + k - 1},$$
so the clustering coefficient $\mathrm{HC}_2$ is larger for $H^{(2)}_t$ than in random hypergraphs.

5. Further directions
We introduced the new ILTH model for complex hypergraphs. We found that ILTH hypergraphs densify over time and have low average distances. We considered motifs and found that for those occurring in the ILTH model, their counts grow faster than in random hypergraphs with the same expected average degree. The 2-sections of ILTH hypergraphs were shown to contain isomorphic copies of all graphs admitting a homomorphism to the 2-section of $H_0$ in Theorem 3. We finished with an analysis of clustering coefficients, and it was shown that $\mathrm{HC}_1$ was larger in ILTH hypergraphs than in random hypergraphs. A similar result was proven for $\mathrm{HC}_2$ applied to a variant of ILTH, where parents are adjacent to their clones.
Several questions remain surrounding ILTH hypergraphs. We may consider variants of the model, and study properties of hypergraphs generated by those variants. For example, we may allow hyperedges of non-uniform orders, or randomize the model by adding random hyperedges to sets of clones. An open problem is to determine the age of ILTH hypergraphs; that is, what are the induced subhypergraphs of ILTH hypergraphs?

Another direction is to consider other notions of clustering in ILTH hypergraphs. Several hypergraph clustering coefficients were investigated in [17], for example, and it would be interesting to consider their values in the ILTH model.
References
[1] S.G. Aksoy, C. Joslyn, C.O. Marrero, B. Praggastis, E. Purvine, Hypernetwork science via high-order hypergraph walks, EPJ Data Science 16, 2020.
[2] A.R. Benson, R. Abebe, M.T. Schaub, A. Jadbabaie, J. Kleinberg, Simplicial closure and higher-order link prediction, Proceedings of the National Academy of Sciences (2018) E11221–E11230.
[3] A.R. Benson, D.F. Gleich, D.J. Higham, Higher-order network analysis takes off, fueled by old ideas and new data, SIAM News, https://cutt.ly/gkwhM9w, last accessed January 29, 2021.
[4] A.R. Benson, D.F. Gleich, J. Leskovec, Higher-order organization of complex networks, Science (2016) 163–166.
[5] C. Berge, Graphs and Hypergraphs, Elsevier, New York, 1973.
[6] C. Berge, Hypergraphs: The Theory of Finite Sets, North-Holland, Amsterdam, 1989.
[7] A. Bonato, A Course on the Web Graph, American Mathematical Society, Providence, Rhode Island, 2008.
[8] A. Bonato, H. Chuangpishit, S. English, B. Kay, E. Meger, The iterated local model for social networks, Discrete Applied Mathematics (2020) 555–571.
[9] A. Bonato, D.W. Cranston, M.A. Huggan, T. Marbach, R. Mutharasan, The Iterated Local Directed Transitivity model for social networks, In: Proceedings of WAW'20, 2020.
[10] A. Bonato, D.F. Gleich, M. Kim, D. Mitsche, P. Prałat, A. Tian, S.J. Young, Dimensionality matching of social networks using motifs and eigenvalues, PLOS ONE (9):e106052, 2014.
[11] A. Bonato, N. Hadi, P. Horn, P. Prałat, C. Wang, Models of on-line social networks, Internet Mathematics (2011) 285–313.
[12] A. Bonato, N. Hadi, P. Prałat, C. Wang, Dynamic models of on-line social networks, In: Proceedings of WAW'09, 2009.
[13] A. Bonato, A. Tian, Complex networks and social networks, invited book chapter in: Social Networks, editor E. Kranakis, Springer, Mathematics in Industry series, 2011.
[14] F.R.K. Chung, L. Lu, Complex Graphs and Networks, American Mathematical Society, Providence, Rhode Island, 2006.
[15] M.T. Do, S. Yoon, B. Hooi, K. Shin, Structural patterns and generative models of real-world hypergraphs, In: Proceedings of Knowledge Discovery in Databases (KDD), 2020.
[16] D. Easley, J. Kleinberg, Networks, Crowds, and Markets: Reasoning about a Highly Connected World, Cambridge University Press, 2010.
[17] E. Estrada, J.A. Rodríguez-Velázquez, Subgraph centrality and clustering in complex hyper-networks, Physica A: Statistical Mechanics and its Applications (2006) 581–594.
[18] S.R. Gallagher, D.S. Goldberg, Clustering Coefficients in Protein Interaction Hypernetworks, BCB'13: Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013).
[19] F. Heider, The Psychology of Interpersonal Relations, John Wiley & Sons, 1958.
[20] P. Hell, J. Nešetřil, Graphs and Homomorphisms, Oxford University Press, New York, 2004.
[21] G. Lee, J. Ko, K. Shin, Hypergraph motifs: concepts, algorithms, and discoveries, In: Proceedings of the VLDB Endowment, 2020.
[22] J. Leskovec, J. Kleinberg, C. Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005.
[23] V. Memišević, T. Milenković, N. Pržulj, An integrative approach to modeling biological networks, Journal of Integrative Bioinformatics :120, 2010.
[24] J.P. Scott, Social Network Analysis: A Handbook, Sage Publications Ltd, London, 2000.
[25] L. Small, O. Mason, Information diffusion on the iterated local transitivity model of online social networks, Discrete Applied Mathematics (2013) 1338–1344.
[26] D.B. West, Introduction to Graph Theory, 2nd edition, Prentice Hall, 2001.
[27] W. Zhou, L. Nakhleh, Properties of metabolic graphs: biological organization in representation artifacts, BMC Bioinformatics (2011).
(A1, A2, A3, A4, A5)
Ryerson University, Toronto, Canada
Email address , A1: (A1) [email protected]
Email address , A2: (A2) [email protected]
Email address , A3: (A3) [email protected]
Email address , A4: (A4) [email protected]
Email address , A5:, A5: