THE ITERATED LOCAL TRANSITIVITY MODEL FOR HYPERGRAPHS

NATALIE C. BEHAGUE, ANTHONY BONATO, MELISSA A. HUGGAN, REHAN MALIK, AND TRENT G. MARBACH

Abstract.

Complex networks are pervasive in the real world, capturing dyadic interactions between pairs of vertices, and a large corpus has emerged on their mining and modeling. However, many phenomena are comprised of polyadic interactions between more than two vertices. Such complex hypergraphs range from emails among groups of individuals, scholarly collaboration, or joint interactions of proteins in living cells. Complex hypergraphs and their models form an emergent topic, requiring new models and techniques.

A key generative principle within social and other complex networks is transitivity, where friends of friends are more likely friends. The previously proposed Iterated Local Transitivity (ILT) model incorporated transitivity as an evolutionary mechanism. The ILT model provably satisfies many observed properties of social networks, such as densification, low average distances, and high clustering coefficients.

We propose a new, generative model for complex hypergraphs based on transitivity, called the Iterated Local Transitivity Hypergraph (or ILTH) model. In ILTH, we iteratively apply the principle of transitivity to form new hypergraphs. The resulting model generates hypergraphs simulating properties observed in real-world complex hypergraphs, such as densification and low average distances. We consider properties unique to hypergraphs not captured by their 2-section. We show that certain motifs, which are specified subhypergraphs of small order, have faster growth rates in ILTH hypergraphs than in random hypergraphs with the same order and expected average degree. We show that the graphs admitting a homomorphism into the 2-section of the initial hypergraph appear as induced subgraphs in the 2-section of ILTH hypergraphs. We consider new and existing hypergraph clustering coefficients, and show that these coefficients have larger values in ILTH hypergraphs than in comparable random hypergraphs.

1. Introduction

Complex networks are an effective paradigm for pairwise interactions between objects in real-world systems. Such networks capture dyadic interactions in many phenomena, ranging from friendship ties in Facebook, to Bitcoin transactions, to interactions between proteins in living cells. Complex networks evolve via a number of mechanisms such as preferential attachment or copying that predict how links between vertices are formed over time.

Structural balance theory cites mechanisms to complete triads (that is, subgraphs consisting of three vertices) in social and other complex networks [16, 19]. A central mechanism in balance theory is transitivity: if $x$ is a friend of $y$, and $y$ is a friend of $z$, then $x$ is a friend of $z$; see, for example, [24]. The Iterated Local Transitivity (ILT) model, introduced in [11, 12] and further studied in [8, 9, 25], simulates structural properties in complex networks emerging from transitivity. Transitivity gives rise to the notion of cloning, where an introduced vertex $x$ is adjacent to all of the neighbors of some pre-existing vertex $y$. Note that in the ILT model, the vertices have local influence within their neighbor sets. Although graphs generated by the model evolve over time, there is a memory of the initial graph hidden in the structure. The ILT model simulates many properties of social networks. For example, as shown in [11], graphs generated by the model densify over time and exhibit bad spectral expansion. In addition, the ILT model generates graphs with the small-world property, which requires graphs to have low diameter and high clustering coefficient compared to random graphs with the same number of vertices and expected average degree.

Key words and phrases. hypergraphs, transitivity, clustering coefficient, 2-section, motifs.

The authors are funded by NSERC. The third author was funded by an NSERC Postdoctoral Fellowship.

Dyadic relationships do not always fully capture the dynamics of interactions between larger groups of vertices. For example, interactions among groups of vertices occur in scholarly collaborations, tags attached to the same web post, or metabolic interactions between more than two reactants. In these examples, a polyadic view of interactions is more accurate, giving rise to hypergraphs. A hypergraph is a discrete structure with vertices and hyperedges, which are sets of vertices. Graphs are special cases of hypergraphs, where each hyperedge has cardinality two. While hypergraph theory is less developed than graph theory, it is an emerging topic in the study of complex, real-world systems; see, for example, [2, 4, 15, 17, 18, 21, 27]. For a recent article discussing the important role of hypergraphs and other higher-order methods for studying complex networks, see [3].

In the present paper, we consider a deterministic model for complex hypergraph networks based on transitivity. The model is analogous to the ILT model, although it has its own unusual features. While every hypergraph can be reduced to its 2-section graph, replacing each hyperedge by a clique, not all hypergraph properties are captured by the 2-section.
As we demonstrate, the ILT hypergraph model we introduce has properties not evident in its 2-section. Further, the model simulates several properties, such as clustering and motif evolution, more robustly when compared to random hypergraphs with analogous characteristics. For simplicity, we consider throughout $k$-uniform hypergraphs, where each hyperedge has cardinality $k$ for a fixed positive integer $k \geq 2$.

The Iterated Local Transitivity Hypergraph (ILTH) model is defined formally as follows. The model is deterministic and generates $k$-uniform hypergraphs over discrete time-steps. The sole parameter of the model is the initial $k$-uniform hypergraph $H = H_0$. For a nonnegative integer $t$, the hypergraph $H_t$ represents the hypergraph at time-step $t$. To form $H_{t+1}$, for each $x \in V(H_t)$, add a new vertex $x'$ called the clone of $x$. We refer to $x$ as the parent of $x'$, and $x'$ as the child of $x$. For every hyperedge $e$ of $H_t$ containing $x$, we add the hyperedge $e'$ to $H_{t+1}$ formed by replacing $x$ with $x'$. Observe that $e' = (e \setminus \{x\}) \cup \{x'\}$; we simply write $e' = e - x + x'$. Note that all existing hyperedges in $H_t$ are also included in $H_{t+1}$. See Figure 1. We refer to $H_t$ as an ILTH hypergraph, and we sometimes write $H_t = \mathrm{ILTH}_t(H)$ to emphasize the initial hypergraph $H$. Note that $\mathrm{ILTH}_t(H)$ is $k$-uniform for all $t \geq 0$. We sometimes refer to the formation of the hypergraphs $H_t$ as the ILTH process.

The clones form an independent set in $H_{t+1}$, resulting in a doubling of the order of $H_t$. Unlike in the ILT model, a clone and its parent are not in a hyperedge. For a vertex $x$ in $H_t$, we will sometimes use the notation $x^*$ to mean any descendant of $x$; that is, $x^*$ is either $x$ or $x'$ in $H_{t+1}$. Similarly, if $e$ is a hyperedge in $H_t$, then $e^*$ represents one of the descendant hyperedges $e$ or $e - x + x'$ in $H_{t+1}$.

As we will demonstrate, the ILTH model simulates many properties observed in complex hypergraphs, including the small-world property and motif counts. In Section 2, we derive a densification power law for ILTH hypergraphs, and show that distance and spectral properties follow by properties of the 2-section. We then consider subhypergraphs and motifs in Section 3. Motifs are certain hypergraphs with a small number of vertices and hyperedges.
In [21], it was shown that several real-world, complex hypergraphs have motif counts dramatically higher than comparable random hypergraphs. We show that for certain motifs arising in $k$-uniform hypergraphs from the list in [21] of 26 motifs formed from three hyperedges, ILTH has a provably higher count than in a random hypergraph with the same average degree. In Theorem 3, we prove that the 2-section contains isomorphic copies of all graphs admitting a homomorphism to the 2-section of $H$, and contains only such graphs; as a consequence, certain motifs will be excluded in the ILTH process unless they appear in $H$.

Figure 1. The ILTH model with $H_0$ a hyperedge with $k = 3$ (vertices $x, y, z$ and their clones $x', y', z'$).

In Section 4, we provide a rigorous analysis of various clustering coefficients for ILTH hypergraphs. Our study of clustering coefficients further validates the small-world property of ILTH hypergraphs, and leads to interesting combinatorial analysis. We consider two clustering coefficients $\mathrm{HC}_1$ and $\mathrm{HC}_2$ and their asymptotic order in ILTH. The clustering coefficient $\mathrm{HC}_1$ was first studied in [17]. We introduce the new parameter $\mathrm{HC}_2$ that is a variant of one that first appeared in [27], although we argue it is more natural and amenable to analysis. In the case of $\mathrm{HC}_1$, we show that these clustering coefficients provide higher clustering than is expected in random hypergraphs with the same average degrees. We show an analogous result for $\mathrm{HC}_2$ in a variation of the ILTH model, where clones and parents are adjacent. We finish with a summary of our results along with open problems on the ILTH model.

Throughout the paper, we consider finite, simple, undirected graphs and hypergraphs. For a general reference on graph theory, see [26]. For a reference on hypergraphs, see [5, 6]. For background on social and complex networks, see [7, 13, 14]. We define terms and notation for hypergraphs when they first appear throughout the article.

2. Densification, eigenvalues, and distances
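As a computational companion to this section, the ILTH process defined in Section 1 can be simulated directly. The following is a minimal Python sketch and not code from the paper: the function names `ilth_step` and `ilth`, and the encoding of the clone of $x$ created at time-step $s$ as the pair `(x, s)`, are illustrative assumptions.

```python
def ilth_step(vertices, hyperedges, step):
    """Apply one ILTH time-step to a hypergraph.

    vertices   -- set of hashable vertex labels
    hyperedges -- set of frozensets of vertices
    step       -- index of the time-step, used only to give clones fresh labels
    """
    # Assumed labelling: the clone of x created at this step is the pair (x, step).
    clone = {x: (x, step) for x in vertices}
    new_vertices = set(vertices) | set(clone.values())
    new_edges = set(hyperedges)  # all existing hyperedges are kept
    for e in hyperedges:
        for x in e:
            # The hyperedge e - x + x': replace x by its clone.
            new_edges.add((e - {x}) | {clone[x]})
    return new_vertices, new_edges


def ilth(vertices, hyperedges, t):
    """Return (V, E) for the hypergraph after t ILTH time-steps."""
    V = set(vertices)
    E = {frozenset(e) for e in hyperedges}
    for step in range(1, t + 1):
        V, E = ilth_step(V, E, step)
    return V, E


if __name__ == "__main__":
    k = 3
    for t in range(5):
        V, E = ilth(range(k), [range(k)], t)
        # Columns: t, order n(t), size e(t), average degree k*e(t)/n(t).
        print(t, len(V), len(E), k * len(E) / len(V))
```

Starting from a single hyperedge, the printed order and size can be compared against the counts $n(t)$ and $e(t)$ studied in this section.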

Many examples of complex networks densify in the sense that the ratio of their number of edges to vertices tends to infinity over time; see [22]. In this section, we show that the ILTH model always generates hypergraphs that densify, and we give a precise statement below of its densification power law.

Let $n(t)$ be the number of vertices in $H_t$ and let $e(t)$ be the number of hyperedges in $H_t$. We establish elementary though important recursive formulas for these parameters.

Theorem 1. For a nonnegative integer $t$, we have the following.
(1) $n(t) = 2^t n(0)$.
(2) $e(t) = (k+1)^t e(0)$.
In particular, we have that $e(t) = \Theta\bigl(n(t)^{\log_2(k+1)}\bigr)$.

Proof. For item (1), for each vertex $v$ in $H_t$, there are two vertices $v$ and $v'$ in $H_{t+1}$. Hence, $n(t+1) = 2n(t)$. For item (2), notice that for each hyperedge $e$ in $H_t$, we add to $H_{t+1}$ the hyperedge $e$ and each of the $k$ hyperedges $e - x + x'$, where $x$ is a vertex in $e$. We then have that $e(t+1) = (k+1)e(t)$ for all $t$. The result follows. □

As a consequence, the average vertex degree of $\mathrm{ILTH}_t(H)$ is given by
$$\frac{k\,e(t)}{n(t)} = \left(\frac{k+1}{2}\right)^{t} \frac{k\,e(0)}{n(0)},$$
which increases exponentially with $t$. Hence, we have a densification power law for ILTH hypergraphs.

We next turn to the 2-section of ILTH hypergraphs. For this, we consider a variant on the ILT model for graphs, which we call ILT$'$. Given a graph $G = G_0$, iteratively construct $\mathrm{ILT}'_t(G)$, where $t \geq 0$, as follows. Set $\mathrm{ILT}'_0(G) = G$. For each $v \in V(\mathrm{ILT}'_t(G))$, the vertices $v$ and $v'$ are included in $\mathrm{ILT}'_{t+1}(G)$. For each $uv \in E(\mathrm{ILT}'_t(G))$, the edges $uv$, $uv'$, and $u'v$ are included in $\mathrm{ILT}'_{t+1}(G)$. We have the following lemma, whose proof is immediate.

Lemma 2. For a nonnegative integer $t$, we have that $\mathrm{ILT}'_t(G)$ is the 2-section of $\mathrm{ILTH}_t(H)$, where $G$ is the 2-section of $H$.

We use the notation $n_t$ and $e_t$ for the order and size of $\mathrm{ILT}'_t(G)$. Observe that $n_t = 2^t n_0$ and $e_t = 3^t e_0$. An implication of Lemma 2 is that any hypergraph property that depends solely on the 2-section behaves the same way for the hypergraph model ILTH as it does for the graph model ILT$'$. Such properties are not truly exploiting the hypergraph structures evident in ILTH. We briefly discuss some of these properties, including the adjacency matrix, the diameter, and the average distance.

The adjacency matrix $A(H)$ for a hypergraph $H$ has rows and columns indexed by the vertices of $H$, with entry 1 if $u \neq v$ and there is some hyperedge of $H$ containing both $u$ and $v$, and 0 otherwise. It is evident that this is the same as the adjacency matrix of the 2-section of $H$. In particular, to analyse the adjacency matrix of $\mathrm{ILTH}_t(H)$ we need only consider the adjacency matrix of $\mathrm{ILT}'_t(G)$, where $G$ is the 2-section of $H$.

If $\mathrm{ILT}'_t(G)$ has $n \times n$ adjacency matrix $A$, then $\mathrm{ILT}'_{t+1}(G)$ has $2n \times 2n$ adjacency matrix
$$\begin{pmatrix} A & A \\ A & 0 \end{pmatrix},$$
where $0$ is the $n \times n$ all-zeros matrix. It is straightforward to verify that if $A$ has eigenvalue $\rho$ with associated eigenvector $\mathbf{v}$, then $\begin{pmatrix} A & A \\ A & 0 \end{pmatrix}$ has eigenvalues $\frac{1 \pm \sqrt{5}}{2}\rho$ with associated eigenvectors
$$\begin{pmatrix} \frac{1 \pm \sqrt{5}}{2}\,\mathbf{v} \\ \mathbf{v} \end{pmatrix}.$$
In particular, given the eigenvalues for the graph $G$, one can calculate the eigenvalues for $\mathrm{ILT}'_t(G)$.

We next consider distance in ILTH hypergraphs. A walk of length $\ell$ connecting two vertices $u$ and $v$ in a hypergraph is a sequence of hyperedges $e_1, e_2, \ldots, e_\ell$ such that $u \in e_1$, $v \in e_\ell$, and $e_i \cap e_{i+1} \neq \emptyset$ for all $1 \leq i < \ell$. We say that the distance between two vertices $u, v$, written $d(u, v)$, is the minimum length of a walk connecting $u$ and $v$. This is the same as the distance between two vertices $u$ and $v$ in the 2-section of the hypergraph.
In particular, to analyze distances within $\mathrm{ILTH}_t(H)$ we could consider only distances in $\mathrm{ILT}'_t(G)$, where $G$ is the 2-section of $H$, but it is equally convenient to analyse ILTH directly.

Consider vertices $u, v$ in $H_t$ with $u \neq v$. Let $d = d(u, v)$ and let $e_1, e_2, \ldots, e_d$ be a minimum length walk connecting them. We then have that in $H_{t+1}$,
(1) $d(u, v) = d$, using the walk $e_1, e_2, \ldots, e_d$;
(2) $d(u, v') = d$, using the walk $e_1, e_2, \ldots, e_d - v + v'$;
(3) $d(u', v) = d$, using the walk $e_1 - u + u', e_2, \ldots, e_d$;
(4) $d(u', v') = d$ if $d \geq 2$, using the walk $e_1 - u + u', e_2, \ldots, e_d - v + v'$; and
(5) $d(u', v') = 2$ if $d = 1$, using the walk $e_1 - u + u', e_1 - v + v'$, so long as $k \geq 3$.
None of these distances can be less than $d$, else the predecessors of these edges would form a walk from $u$ to $v$ in $H_t$ of length less than $d$.

The diameter of a hypergraph is the maximum distance between any pair of vertices. We find immediately that the diameter of $H_{t+1}$ is the maximum of 2 and the diameter of $H_t$, and, iterating this, is the maximum of 2 and the diameter of $H$. In either case, the diameter is a constant, independent of $t$.

To end this section, we determine the average distance between any pair of vertices in $H_t$. Let $W(t)$ be the sum of the distances in $H_t$, or Wiener index, written
$$W(t) = \sum_{u, v \in V(H_t)} d(u, v).$$
Assuming that $H$ has no isolated vertices, and so $H_t$ has no isolated vertices for all $t \geq 1$, by our calculations pertaining to distances above (with $k \geq 3$), we obtain that:
$$W(t+1) = \sum_{u, v \in V(H_{t+1})} d(u, v)$$
$$= \sum_{u \neq v \in V(H_t)} \bigl( d(u, v) + d(u', v) + d(u, v') + d(u', v') \bigr) + \sum_{u \in V(H_t)} \bigl( d(u, u') + d(u', u) \bigr)$$
$$= 4 \Bigl( \sum_{u, v \in V(H_t)} d(u, v) \Bigr) + \bigl|\{ u \neq v \in V(H_t) : d(u, v) = 1 \}\bigr| + 4n(t)$$
$$= 4W(t) + 2e_t + 4n(t),$$
where $e_t = 3^t e_0$ is the number of edges of the 2-section of $H_t$, so that there are $2e_t$ ordered pairs of vertices at distance 1. Solving this recurrence gives that
$$W(t) = 4^t \bigl( W(0) + 2e_0 + 2n(0) \bigr) - 2e_t - 2n(t) = 4^t \bigl( W(0) + 2e_0 + 2n(0) \bigr) - 2 \cdot 3^t e_0 - 2^{t+1} n(0).$$
Thus, the average distance is given by
$$\frac{W(t)}{n(t)(n(t) - 1)} = \frac{4^t \bigl( W(0) + 2e_0 + 2n(0) \bigr) - 2 \cdot 3^t e_0 - 2^{t+1} n(0)}{4^t n(0)^2 - 2^t n(0)},$$
which tends to $\frac{W(0) + 2e_0 + 2n(0)}{n(0)^2}$ as $t$ tends to infinity. We therefore have that ILTH hypergraphs exhibit a constant average distance, as is found in many real-world hypergraphs; see [15].

3. Subhypergraphs and motifs

We next consider subhypergraphs of the ILTH model, and our first approach is to consider the induced subgraphs of the 2-section. In Theorem 3, it is shown that a graph appears in the 2-section of an ILTH hypergraph exactly when it admits a homomorphism to the 2-section of $H$. The theorem guarantees the absence of many kinds of induced subhypergraphs; for example, no hypergraph clique of larger order than the cliques in $H$ appears in an ILTH hypergraph. We then turn to counting certain small order subhypergraphs, or motifs. Motifs are important in complex networks, as they are one measure of similarity for graphs. For example, the counts of 3- and 4-vertex subgraphs give a similarity measure for distinct graphs; see [10, 23] for implementations of this approach using machine learning. Hypergraph motifs were studied by several authors; see, for example, [1, 4, 21]. In [21], motif counts were analyzed across various real-world complex hypergraphs and compared to random hypergraphs. We show in this section that in ILTH hypergraphs, the growth rate for certain motifs is higher than in comparable random hypergraphs.

3.1. Induced subgraphs of the 2-section.

For all $t \geq 0$, $H_t$ is an induced subhypergraph of $H_{t+1}$. There exists a homomorphism $f_t$ from $H_{t+1}$ to $H_t$ by mapping each clone to its parent, and fixing all other vertices. Note that $F_t = f_0 \circ f_1 \circ \cdots \circ f_{t-1}$ is a homomorphism from $H_t$ to $H_0$. As a result, the clique and chromatic numbers of $H_t$ are bounded above by those of $H_0$. This observation puts limitations on the kinds of subgraphs that $H_t$ contains. For additional background on graph homomorphisms, the reader is directed to [20].

The age of a hypergraph is its set of isomorphism types of induced subhypergraphs. As each $F_t$ is a homomorphism, we have that no $H_t$ contains $k$-uniform cliques larger than those in $H$. In particular, the set of ages of an ILTH hypergraph does not contain all hypergraphs. This contrasts with the ILT model, where all graphs occur in the set of ages of ILT-graphs; see [8].

Characterizing the ages of ILTH hypergraphs remains an open problem. The next result solves the analogous problem for the ages of 2-sections of ILTH hypergraphs. For a fixed graph $G$ and family of graphs $\mathcal{G}$, we say that $\mathcal{G}$ is $G$-hom-universal if the set of ages of $\mathcal{G}$ consists of all finite graphs admitting a homomorphism to $G$.

Theorem 3. A graph $G$ admits a homomorphism to $G(H)$ if and only if $G$ is an induced subgraph of $G(H_t)$, for some integer $t \geq 0$, where $G(H)$ is the 2-section of $H$. In particular, the set of ages of 2-sections of hypergraphs in $\mathrm{ILTH}(H)$ is $G(H)$-hom-universal.

Proof. The reverse direction follows since for an induced subgraph $G$ of $G(H_t)$, the inclusion map is a homomorphism from $G$ to $G(H_t)$. Composing with $F_t$ gives a homomorphism from $G$ to $G(H)$.

For the forward direction, suppose that $G$ admits a homomorphism $f$ to $G(H)$. Let $u, v \in V(G)$ be two distinct vertices such that $f(u) = f(v)$. Define the homomorphism $f'$ to $G(H_1)$ as $f'(x) = f(x)$ if $x \neq v$, and $f'(v)$ is the clone of the vertex $f(u)$. We then note that the number of vertices in the image of $f'$ is one larger than the number of vertices in the image of $f$. We may repeat this procedure until we find an injective homomorphism $f_i$ from $G$ to $G(H_i)$, for some $i \geq 0$.

Suppose there are vertices $u, v$ in $G$ which are not neighbors but such that $f_i(u) f_i(v)$ is an edge in $G(H_i)$. We can define a new injective homomorphism $f_{i+1}$ to $G(H_{i+1})$ by $f_{i+1}(x) = f_i(x)$ if $x \notin \{u, v\}$, $f_{i+1}(u)$ is the clone of $f_i(u)$, and $f_{i+1}(v)$ is the clone of $f_i(v)$. We then have that the subgraphs induced by $f_i(G)$ and $f_{i+1}(G)$ differ only in the edge $f_i(u) f_i(v)$, as the corresponding edge does not exist in $f_{i+1}(G)$. We can repeat this procedure to construct an injective homomorphism $f_j$ from $G$ to $G(H_j)$ for some $j$, with the property that for all $u, v \in V(G)$, if $f_j(u) f_j(v)$ is an edge in $G(H_j)$, then $uv$ is an edge in $G$. Hence, the subgraph induced by the vertices in $f_j(G)$ in $G(H_j)$ is isomorphic to $G$. □

3.2. Motifs.

We now turn to counting motifs, which are certain types of subhypergraphs. In [21], 26 distinct motifs were studied for three interacting hyperedges $e_1$, $e_2$, and $e_3$. Motif counts may be viewed as a similarity measure for hypergraphs, such as when we are comparing real-world hypergraphs and synthetic ones derived from models.


The different types of motifs emerge by considering which of the following seven regions are nonempty:
$$e_1 \setminus (e_2 \cup e_3),\quad e_2 \setminus (e_3 \cup e_1),\quad e_3 \setminus (e_1 \cup e_2),\quad e_1 \cap e_2 \setminus e_3,\quad e_2 \cap e_3 \setminus e_1,\quad e_3 \cap e_1 \setminus e_2,\quad e_1 \cap e_2 \cap e_3.$$
We may compactly reference motifs by a binary sequence $i_1 i_2 i_3 i_4 i_5 i_6 i_7$, so that for all $j$, $i_j = 1$ exactly when the $j$th region in this list is nonempty; we refer to the resulting patterns as motif types. See Figure 2 for an example. We may generalize this notation to a tuple of nonnegative integers, quantifying the number of elements in each region. The cardinality vector of a motif composed of the three hyperedges $e_1, e_2, e_3$ is defined as the 7-tuple:
$$(a, b, c, d, e, f, g) = \bigl( |e_1 \setminus (e_2 \cup e_3)|,\ |e_2 \setminus (e_3 \cup e_1)|,\ |e_3 \setminus (e_1 \cup e_2)|,\ |e_1 \cap e_2 \setminus e_3|,\ |e_2 \cap e_3 \setminus e_1|,\ |e_3 \cap e_1 \setminus e_2|,\ |e_1 \cap e_2 \cap e_3| \bigr).$$
Note that a motif contains $a + b + c + d + e + f + g$ vertices. Further, we have that $|e_1| = a + d + f + g = k$, $|e_2| = b + d + e + g = k$, and $|e_3| = c + e + f + g = k$.

Figure 2. The motif type 11 or 1011101. On the left, we represent this motif via a Venn diagram, where the vertex in a region implies it is nonempty. On the right, we have an example of a 3-uniform hypergraph realizing this motif.

In general hypergraphs, there are 26 non-isomorphic motif types; however, we note that only 11 motif types occur in $k$-uniform hypergraphs. With numbering taken from [21], these motif types are:
(1) Motif type 2: 1110001,
(2) Motif type 6: 1110101,
(3) Motif type 11: 1011101,
(4) Motif type 12: 1111101,
(5) Motif type 13: 0001111,
(6) Motif type 14: 1001111,
(7) Motif type 15: 1011111,
(8) Motif type 16: 1111111,
(9) Motif type 24: 1001110,
(10) Motif type 25: 1011110,
(11) Motif type 26: 1111110.
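The region-based encoding above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `cardinality_vector`, `signature`, and `has_motif_type` are hypothetical, and `has_motif_type` assumes that a motif type is determined up to relabeling the three hyperedges, which is an assumption about the convention in [21].

```python
from itertools import permutations


def cardinality_vector(e1, e2, e3):
    """Return (a, b, c, d, e, f, g) for three hyperedges, in the region order used above."""
    e1, e2, e3 = set(e1), set(e2), set(e3)
    return (
        len(e1 - (e2 | e3)),    # a: only in e1
        len(e2 - (e3 | e1)),    # b: only in e2
        len(e3 - (e1 | e2)),    # c: only in e3
        len((e1 & e2) - e3),    # d: in e1 and e2, not e3
        len((e2 & e3) - e1),    # e: in e2 and e3, not e1
        len((e3 & e1) - e2),    # f: in e3 and e1, not e2
        len(e1 & e2 & e3),      # g: in all three
    )


def signature(e1, e2, e3):
    """Binary sequence i_1...i_7 recording which of the seven regions are nonempty."""
    return "".join("1" if size else "0" for size in cardinality_vector(e1, e2, e3))


def has_motif_type(edges, pattern):
    """True if some ordering of the three hyperedges yields the given binary pattern.

    Assumption: motif types are considered up to relabeling of the hyperedges.
    """
    return any(signature(*order) == pattern for order in permutations(edges))


if __name__ == "__main__":
    # A single 3-uniform hyperedge {x, y, z} and two hyperedges obtained by cloning x and y.
    e = {"x", "y", "z"}
    e_x = {"x'", "y", "z"}
    e_y = {"x", "y'", "z"}
    print(cardinality_vector(e_x, e, e_y))  # (1, 0, 1, 1, 1, 0, 1)
    print(signature(e_x, e, e_y))           # 1011101, that is, motif type 11
```

In this small example the original hyperedge plays the role of $e_2$, which has no private vertices.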

We keep the numbering from [21] for brevity; for example, we refer to motif 11 rather than 1011101. We focus on these motif types since they always occur in the ILTH model and have higher counts when compared to random hypergraphs, as we describe below. Interestingly, motifs 11 and 12 are more prevalent in the co-authorship hypergraphs compared to random hypergraphs, as shown in [21]. The same conclusion holds for motif 16 for tag hypergraphs. These observations lend credence to the view that ILTH hypergraphs simulate properties observed in real-world, complex hypergraphs.

Let $\alpha_i$ be the maximum number of vertices that can occur in a motif of type $i$ in a $k$-uniform hypergraph. Each value of $\alpha_i$ can be calculated explicitly, and each calculation is straightforward. For example, we may calculate $\alpha_{14}$ as follows. Suppose that the motif in question has cardinality vector $(a, 0, 0, d, e, f, g)$. Without loss of generality we have that
$$k = a + d + f + g = d + e + g = e + f + g,$$
as each hyperedge contains $k$ vertices. It therefore immediately follows that $d = f$. The total number of vertices is $a + d + e + f + g = k + (k - d - g)$, which is maximized when $d = f = g = 1$, as $d, g, f > 0$. Hence, $\alpha_{14} = 2k - 2$.

$i$          2      6      11     12     13              14     15     16     24     25     26
$\alpha_i$   3k-2   3k-3   2k-1   3k-4   ⌊(3k-1)/2⌋      2k-2   2k-2   3k-5   2k-1   2k-1   3k-3

Table 1. The maximum number of vertices $\alpha_i$ in a motif of type $i$ possible in a $k$-uniform hypergraph.

Lemma 4. If $H_t$ contains $x$ motifs of type $i$ with cardinality vector $(a, b, c, d, e, f, g)$, then $H_{t+1}$ contains at least
$$x \bigl( g + (c+1)d + (b+1)f + (a+1)e + (a+1)(b+1)(c+1) \bigr)$$
motifs of type $i$ with cardinality vector $(a, b, c, d, e, f, g)$.

Proof. For a motif in $H_t$ of type $i$ with cardinality vector $(a, b, c, d, e, f, g)$ formed by the hyperedges $e_1, e_2, e_3$, we choose a set $S$ of up to three vertices contained in the motif to clone such that each hyperedge of the motif contains at most one cloned vertex. Consider the motif in $H_{t+1}$ formed by the hyperedges $e'_1, e'_2, e'_3$, where $e'_i$ is the hyperedge obtained from $e_i$ by replacing each vertex that is also in $S$ with its clone and leaving other vertices unchanged. This motif is of type $i$ and has cardinality vector $(a, b, c, d, e, f, g)$. Each motif developed in this way is unique. We must therefore find how many ways there are of choosing $S$, which is
$$g + (c+1)d + (b+1)f + (a+1)e + (a+1)(b+1)(c+1),$$
and the proof follows. □

We have the following theorem.

Theorem 5. If the initial hypergraph contains at least one hyperedge, then the number of motifs of type 11 in the ILTH model is $\Omega(k^{2t})$.

Proof. A motif of type 11 has cardinality vector $(a, 0, c, d, e, 0, g)$, where $a + d + g = d + e + g = c + e + g = k$, which yields $a = e$ and $c = d$. For each motif of type 11 in $H_t$, there will be
$$g + (c+1)d + (a+1)e + (a+1)(c+1) = g + a^2 + c^2 + 2c + 2a + ac + 1 = (k - g)^2 - ac + 2k - g + 1$$
motifs of type 11 in $H_{t+1}$, which is maximized when $g = 1$ and one of $a$, $c$ equals 1 (as $a, c, g > 0$), giving $k^2 - k + 3$.

Let $e$ be a hyperedge in $H_0$. For some $u \in e$, there is a hyperedge $e_1 = e \cup \{u'\} \setminus \{u\}$ in $H_1$. For some $v \in e \setminus \{u\}$, there is a hyperedge $e_2$ in $H_{k-2}$ with $e \cap e_1 \cap e_2 = \{v\}$ and $e \cap e_2 = \{u, v\}$. These three hyperedges form a motif of type 11 in $H_{k-2}$ with cardinality vector $(1, 0, k-2, k-2, 1, 0, 1)$. As such, by Lemma 4 there are at least $(k^2 - k + 3)^{t-k+2} = \Omega(k^{2t})$ motifs of type 11 in $H_t$ with cardinality vector $(1, 0, k-2, k-2, 1, 0, 1)$. □

We can also perform a similar analysis of the other motif types that grow rapidly.

Theorem 6. If the initial hypergraph contains at least one hyperedge, then the number of each motif of types 2, 6, 12, 16, and 26 in the ILTH model is $\Omega(k^t)$.

Proof. It is straightforward to verify that $H_k$ contains a motif of type $i$ containing $\alpha_i$ vertices, for $i \in \{2, 6, 12, 16, 26\}$. Suppose that the motif in question has cardinality vector $(a, b, c, d, e, f, g)$, and so $\alpha_i = a + b + c + d + e + f + g$. As $\alpha_i = \Omega(k)$ for these values of $i$, by Lemma 4, there are at least $\alpha_i$ times more of this motif type and cardinality vector in each iteration of the ILTH process. Hence, there are at least $(\alpha_i)^{t-k} = \Omega(k^t)$ of this motif type in $H_t$, and the result follows. □

Our analysis so far does not apply to motif types 13, 14, 15, 24, and 25, as each of these motif types will not be generated in the ILTH process on one hyperedge. However, if one of these motif types occurs within the starting hypergraph, then we will have exponential growth of these, as shown in the following theorem.

Theorem 7. If $H$ contains a motif of type $i \in \{13, 14, 15, 24, 25\}$ that contains $m$ vertices, then motif $i$ occurs at least $(m+1)^t$ times in $H_t$.

Proof. The proof follows by Lemma 4. □
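The counting argument behind Lemma 4 can be checked by brute force on small cardinality vectors. The Python sketch below is illustrative only and is not the authors' code; the names `lemma4_multiplier`, `build_motif`, and `count_clone_sets` are hypothetical. It builds three abstract hyperedges with a given cardinality vector, enumerates every set $S$ of at most three vertices meeting each hyperedge at most once, and compares the count with the closed form in Lemma 4.

```python
from itertools import combinations


def lemma4_multiplier(a, b, c, d, e, f, g):
    """Number of admissible clone sets S according to the closed form in Lemma 4."""
    return g + (c + 1) * d + (b + 1) * f + (a + 1) * e + (a + 1) * (b + 1) * (c + 1)


def build_motif(vec):
    """Build three hyperedges (sets of labelled vertices) with cardinality vector vec."""
    # For each region, the indices of the hyperedges that contain it.
    membership = {
        "a": (0,), "b": (1,), "c": (2,),
        "d": (0, 1), "e": (1, 2), "f": (0, 2),
        "g": (0, 1, 2),
    }
    edges = [set(), set(), set()]
    for region, size in zip("abcdefg", vec):
        for index in range(size):
            for j in membership[region]:
                edges[j].add((region, index))
    return edges


def count_clone_sets(vec):
    """Count sets S of at most three vertices with at most one vertex in each hyperedge."""
    edges = build_motif(vec)
    vertices = sorted(set().union(*edges))
    count = 0
    for size in range(4):
        for S in combinations(vertices, size):
            if all(len(edge.intersection(S)) <= 1 for edge in edges):
                count += 1
    return count


if __name__ == "__main__":
    for k in range(3, 8):
        vec = (1, 0, k - 2, k - 2, 1, 0, 1)  # the type 11 vector used for Theorem 5
        print(k, lemma4_multiplier(*vec), count_clone_sets(vec), k * k - k + 3)
```

For the type 11 vector $(1, 0, k-2, k-2, 1, 0, 1)$, the three printed values in each row should agree, matching the factor $k^2 - k + 3$ in the proof of Theorem 5.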

We contrast the motif counts for ILTH with comparable random hypergraphs. Let G ( n, k, p ) be the random hypergraph where each possible k -set is included as a hyperedge with probability p . If we ﬁx two vertices u and w , then the expected number of hyperedges e containing both u and w is ( n − k − ) p .We consider the k -uniform hypergraph with n = n ( t ) = t n ( ) = Θ ( t ) vertices and p = e ( t )( n ( t ) k ) = Θ ( ( log ( k + )− k ) t ) . We expect Θ ( n α i p ) motifs of type i with α i vertices. To see this, give each vertex in a motifwith α i vertices a label between 1 and α i , and deﬁne three k -sets with these labels e , e , and e from the three hyperedges in the motif with these labels. We select α i vertices from theset of n vertices in the k -uniform random hypergraph, labeling the i th choice by the label i .There are n ! ( n − α i ) ! ∼ n α i possible ways to make these choices. The sets of vertices e , e , and e are hyperedges in the k -uniform random hypergraph with probability p . There is systematicdouble counting of occurrences of the motif but this only changes the expectation by a multipleof some function of k , which is a constant. The motifs of type i with fewer than α i vertices will occur o ( n α i p ) times, so the total number of motifs of type i that have any number of verticesis Θ ( n α i p ) .Therefore, we expect Θ ( n α i p ) = Θ ( ( α i − ( k − log ( k + )) ) t ) many occurrences of motif i . If α i < ( k − log ( k + )) , then the expected number of motifs oftype i will tend to 0 exponentially fast, and if α i > ( k − log k + ) , then the expected numberof motifs of type i grows exponentially. In particular, it will be useful to note that if α i ≤ k − k ≥

9, then the expected number of motifs of type $i$ will tend to 0 exponentially fast, and if $\alpha_i = k - c$ with c ∈ { , , , }, then the expected number of motifs of type $i$ grows exponentially fast.

As a consequence, we expect motifs 2, 6, 12, 16, and 26 to occur an exponential number of times each in a random hypergraph. We expect that other motifs will rarely occur, with the probability that we see any diminishing when k ≥ t increases. As a consequence of Theorems 5 and 6, the growth rates of the motif types 2, 6, 11, 12, 16, and 26 are faster in ILTH than in a comparable random $k$-uniform hypergraph.

We finish the section with precise motif counts for ILTH with initial hypergraph a single hyperedge. We ran the ILTH model on a computer, starting with a single hyperedge of cardinality $k$, for $3 \le k \le 6$ and $t \le 10 - k$. See Tables 2 to 5 below for the motif counts of these ILTH hypergraphs. In each table the columns are indexed by motif type, and the row $t = 1$ has only two nonzero entries, which are listed below the table.

| t | 2 | 6 | 11 | 26 |
|---|---|---|---|---|
| 2 | 45 | 126 | 75 | 45 |
| 3 | 3447 | 4770 | 1083 | 1141 |
| 4 | 161451 | 115146 | 12675 | 22365 |
| 5 | 5981355 | 2301930 | 133563 | 382981 |
| 6 | 195870195 | 41818266 | 1326675 | 6071085 |
| 7 | 5993456427 | 720709290 | 12718443 | 91888021 |

Table 2. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 3. (At $t = 1$ the nonzero counts are 3 and 1.)

| t | 2 | 6 | 11 | 12 | 16 | 26 |
|---|---|---|---|---|---|---|
| 2 | 90 | 504 | 474 | 504 | 188 | 276 |
| 3 | 16660 | 75168 | 14010 | 42192 | 5116 | 34248 |
| 4 | 2651330 | 6088680 | 305682 | 1920888 | 107712 | 2341332 |
| 5 | 305991860 | 369517680 | 5764506 | 67434480 | 2026684 | 122766120 |
| 6 | 28267339810 | 19173430584 | 100158594 | 2066592024 | 34911788 | 1285323380 |

Table 3. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 4. (At $t = 1$ the nonzero counts are 6 and 4.)

| t | 2 | 6 | 11 | 12 | 16 | 26 |
|---|---|---|---|---|---|---|
| 2 | 150 | 1110 | 1490 | 2100 | 1870 | 420 |
| 3 | 40210 | 356670 | 82030 | 540720 | 189610 | 234360 |
| 4 | 13613610 | 77687610 | 3114650 | 71894820 | 12725950 | 50062740 |
| 5 | 4067088850 | 12719703750 | 97894510 | 6831291600 | 680649610 | 7078307400 |

Table 4. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 5. (At $t = 1$ the nonzero counts are 10 and 10.)

| t | 2 | 6 | 11 | 12 | 16 | 26 |
|---|---|---|---|---|---|---|
| 2 | 229 | 2070 | 3285 | 5040 | 7680 | 120 |
| 3 | 79096 | 994680 | 301515 | 2610180 | 1983740 | 576720 |
| 4 | 388621215 | 409931190 | 18710325 | 815537880 | 346117200 | 370671840 |

Table 5. The number of motifs generated by the ILTH model starting with a hyperedge of cardinality 6. (At $t = 1$ the nonzero counts are 15 and 20.)
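The computer runs described above can be reproduced at small scale. The following is a minimal Python sketch of one ILTH time-step under our own assumed conventions: vertices are the integers $0, \dots, n-1$, the clone of $v$ is labelled $v + n$, and hyperedges are stored as frozensets. The names `ilth_step` and `ilth` are illustrative and are not taken from the authors' code. The sketch does not classify motifs; it only generates the hypergraph and reports the vertex, hyperedge, and hyperedge-triple counts.

```python
from itertools import combinations
from math import comb


def ilth_step(n, edges):
    """Apply one ILTH time-step to a hypergraph on vertices 0..n-1.

    Assumed labelling (ours): the clone of vertex v is v + n.
    Each hyperedge e is kept, and e - x + x' is added for every x in e.
    """
    new_edges = set(edges)
    for e in edges:
        for x in e:
            new_edges.add((e - {x}) | {x + n})
    return 2 * n, new_edges


def ilth(k, t):
    """Run t time-steps starting from a single hyperedge of cardinality k."""
    n, edges = k, {frozenset(range(k))}
    for _ in range(t):
        n, edges = ilth_step(n, edges)
    return n, edges


if __name__ == "__main__":
    for k in range(3, 7):
        n, edges = ilth(k, 1)
        triples = sum(1 for _ in combinations(edges, 3))
        # After one step there are k + 1 hyperedges, hence C(k + 1, 3) triples.
        print(k, n, len(edges), triples, comb(k + 1, 3))
```

Each time-step doubles the number of vertices and multiplies the number of hyperedges by $k + 1$. For $k = 3, \dots, 6$ the number of hyperedge triples at $t = 1$ is $\binom{k+1}{3} = 4, 10, 20, 35$, which is consistent with the totals of the $t = 1$ rows of Tables 2 to 5.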

4. Hypergraph clustering coefficients

The small-world property in complex networks demands low average distance and high clustering coefficients, relative to random graphs with the same expected average degree; see [7] for a discussion. An analogous definition holds for small-world hypergraphs, comparing their properties to a random hypergraph $G(n,k,p)$ with the same order $n$ and $p$ chosen so that they have the same expected average degree. As we demonstrated in Section 2, ILTH hypergraphs have constant average distance. Hence, a natural next step in our investigation is to consider clustering coefficients of ILTH hypergraphs.

There are a variety of hypergraph clustering coefficients we may consider; see [18] for nine distinct coefficients. We focus on a clustering coefficient introduced in [17], along with a new one that is a variant of the one studied in [27]. We discuss these clustering coefficients by first considering graphs. For a graph $G$, the global clustering coefficient is
$$C(G) = \frac{3 \times (\text{number of triangles in } G)}{\text{number of paths of length two in } G}.$$

Note that $C(G)$ is a rational number in the interval $[0,1]$.

There are several different ways to generalize the definition of clustering coefficient to hypergraphs. We discuss three of these in the context of the ILTH model.

We define a path of length two in a hypergraph to be a 5-tuple $(u, e_1, v, e_2, w)$ where $u, v, w$ are distinct vertices, $e_1, e_2$ are distinct hyperedges, and $u, v \in e_1$, $v, w \in e_2$. Similarly, we define a hypertriangle to be a 6-tuple $(u, e_1, v, e_2, w, e_3)$ where $u, v, w$ are distinct vertices, $e_1, e_2, e_3$ are distinct hyperedges, and $u, v \in e_1$, $v, w \in e_2$, $w, u \in e_3$. We have the following generalization of the clustering coefficient to hypergraphs, appearing first in [17]; up to the fixed normalizing factor used there,
$$\mathrm{HC}_1(H) \propto \frac{\text{number of hypertriangles in } H}{\text{number of paths of length two in } H}.$$
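The two tuple-based definitions above translate directly into a brute-force counter. The following Python sketch is ours and is only meant for small examples: hyperedges are assumed to be given as frozensets, the function names are illustrative, and no normalizing constant is applied, so it returns the raw numbers of 5-tuples and 6-tuples.

```python
from itertools import permutations


def count_paths(edges):
    """Count 5-tuples (u, e1, v, e2, w) with u, v, w distinct, e1 != e2,
    u, v in e1 and v, w in e2."""
    total = 0
    for e1, e2 in permutations(edges, 2):
        for v in e1 & e2:
            for u in e1 - {v}:
                for w in e2 - {v, u}:
                    total += 1
    return total


def count_hypertriangles(edges):
    """Count 6-tuples (u, e1, v, e2, w, e3) with u, v, w distinct,
    e1, e2, e3 distinct, u, v in e1, v, w in e2 and w, u in e3."""
    total = 0
    for e1, e2, e3 in permutations(edges, 3):
        for v in e1 & e2:
            for w in (e2 & e3) - {v}:
                for u in (e3 & e1) - {v, w}:
                    total += 1
    return total


if __name__ == "__main__":
    triangle = [frozenset(p) for p in ({0, 1}, {1, 2}, {0, 2})]
    print(count_paths(triangle), count_hypertriangles(triangle))
```

For the ordinary triangle graph both counts are 6, since every path and every triangle is counted once per orientation and starting point. The cost grows with the cube of the number of hyperedges, so this is a check for toy inputs rather than a practical algorithm.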

Note that $\mathrm{HC}_1(H) = C(H)$ in the case that $H$ is a graph. However, for general hypergraphs $H$, the values of $\mathrm{HC}_1(H)$ need no longer be in the interval $[0,1]$. For example, the complete $k$-uniform hypergraph on $n$ vertices has $\mathrm{HC}_1(K_n^{(k)}) = \binom{n-2}{k-2}$. The reason for this difference from the graph case is that a given path of length two $(u, e_1, v, e_2, w)$ can be extended to a hypertriangle in many different ways: the hyperedge $e_3$ can be any hyperedge so long as it includes $u$ and $w$. The clustering coefficient $\mathrm{HC}_1$ counts the average number of hypertriangles that are extensions of a path of length two.

We prove the following theorem on $\mathrm{HC}_1$ in Subsection 4.1.

Theorem 8.

For a nonnegative integer $t$, we have that
$$\mathrm{HC}_1(H_t) = \Theta\left(\left(\frac{(k-1)^3 + 3(k-1)}{k^2+1}\right)^t\right).$$

We can show that $H_t$ has a higher value of $\mathrm{HC}_1$ than the random $k$-uniform hypergraph with the same number of vertices and the same expected average degree. See the discussion at the end of Subsection 4.1.

There are other ways to express the clustering coefficient on graphs that lead to different generalisations to hypergraphs. One such equivalent definition is that $C$ is the probability that, given a path of length two, the end vertices are adjacent:
$$C(G) = \mathbb{P}\big(uv \text{ is an edge} : (u, e_1, w, e_2, v) \text{ a path of length two}\big).$$
We say two vertices $u, v$ in a hypergraph are adjacent, written $u \sim v$, if there is some hyperedge $e$ containing both. There is then a natural way to generalize this definition of $C$ to hypergraphs, which we think we are, surprisingly, the first to propose:
$$\mathrm{HC}_2(H) = \mathbb{P}\big(u \sim v : (u, e_1, w, e_2, v) \text{ a path of length two in } H\big) = \frac{\text{number of paths } (u, e_1, w, e_2, v) \text{ where } u \sim v}{\text{number of paths of length two}}.$$
Note that since $\mathrm{HC}_2$ is a probability, this clustering coefficient is bounded between 0 and 1. Further, $\mathrm{HC}_2$ matches the clustering coefficient $C$ on graphs.

A different generalization of the clustering coefficient to hypergraphs, due to [27], also retains the property that the clustering coefficient is between 0 and 1, and is closely related to $\mathrm{HC}_2$. Let $\mathcal{I}$ be the set of pairs of intersecting edges in $H$. For $(e, f) \in \mathcal{I}$, define
$$A(e, f) = \big|\{u \in e - f : u \sim w \text{ for some } w \in f - e\}\big|.$$
For $(e_1, e_2) \in \mathcal{I}$, define the extra overlap
$$EO(e_1, e_2) = \frac{A(e_1, e_2) + A(e_2, e_1)}{|e_1 - e_2| + |e_2 - e_1|}.$$
The extra overlap attempts to capture the number of connections between vertices $u \in e_1 - e_2$ and $w \in e_2 - e_1$. It is evident that $0 \le EO(e_1, e_2) \le 1$. The following clustering coefficient from [27] is the average extra overlap over all intersecting pairs of edges:
$$\mathrm{HC}_3(H) = \frac{1}{|\mathcal{I}|} \sum_{(e_i, e_j) \in \mathcal{I}} EO(e_i, e_j).$$

The goals of the authors in [27] were to define a clustering coefficient on hypergraphs that i) takes values in $[0,1]$, ii) matches the normal clustering coefficient when applied to graphs, and iii) reflects the extent of connectivity among neighbors of $v$ due to hyperedges other than the ones connecting $v$ with those neighbors. These three goals are satisfied by $\mathrm{HC}_3$, but they are also all satisfied by $\mathrm{HC}_2$, which we believe to be a more natural definition given that it can be simply expressed as a probability without recourse to the notion of extra overlap. For these reasons, we focus on the new clustering coefficient parameter $\mathrm{HC}_2$. We prove the following theorem on $\mathrm{HC}_2$ in Subsection 4.2.

Theorem 9.

For a nonnegative integer $t$, we have that
$$\mathrm{HC}_2(H_t) = \Theta\left(\left(\frac{k^2}{k^2+1}\right)^t\right).$$

We show that $H_t$ has a lower value of $\mathrm{HC}_2$ than the random $k$-uniform hypergraph with the same number of vertices and the same expected average degree, and so by this measure, it has less clustering. This is in contrast to the clustering coefficient $\mathrm{HC}_1$, and we include a discussion of this phenomenon at the end of the section. We introduce a modified version of ILTH where clones and their parents are in certain hyperedges. For the modified ILTH model, $\mathrm{HC}_2$ has higher values than in random hypergraphs.

The following lemma will prove useful in our study of hypergraph clustering coefficients.

Lemma 10.

Suppose that $v \in V(H_{t-1})$ and $e \in E(H_{t-1})$ with $v \notin e$. Let $v^* \in V(H_t)$ be a descendant of $v$ and $e^* \in E(H_t)$ be a descendant of $e$. We then have that $v^* \notin e^*$.

Proof. Take some $v \in V(H_{t-1})$ and $e \in E(H_{t-1})$ with $v \notin e$. The descendants of $e$ are $e$ and $e - x + x'$ for each $x \in e$. Since $v \notin e$, it is evident that $v$ and $v'$ are not contained in any of the descendants of $e$. □

Lemma 10 is more useful for our purpose in its contrapositive form.

Lemma 11.

Suppose that $v^* \in V(H_t)$ and $e^* \in E(H_t)$ with $v^* \in e^*$. If $v \in V(H_{t-1})$ and $e \in E(H_{t-1})$ are their respective predecessors, then $v \in e$.

4.1. The clustering coefficient $\mathrm{HC}_1$. This subsection is devoted to proving Theorem 8. To that end, we prove two combinatorial lemmas finding the asymptotic order of the number of paths of length two and the number of hypertriangles in $H_t$, respectively.

Lemma 12.

The number of paths of length two in $H_t$ is $\Theta\big((k^2+1)^t\big)$.

Proof. Let $P'(t) = \{(e_1, v, e_2) : v \in V(H_t),\ e_1, e_2 \in E(H_t),\ v \in e_1 \cap e_2\}$. Note that, while closely related, this is not the same as the set of paths of length two as we do not include endpoints. We include the degenerate case where $e_1 = e_2$. We find an exact value for $|P'(t)|$ in terms of $t$ and $|P'(0)|$, which will enable us to bound the number of paths of length two.

Fix some $(e_1, v, e_2) \in P'(t-1)$. We wish to count the number of descendants $(e_1^*, v^*, e_2^*)$ this has in $P'(t)$. If $v^* = v'$, then for $v^* \in e_1^* \cap e_2^*$ we must have $e_1^* = e_1 - v + v'$ and $e_2^* = e_2 - v + v'$, so there is one descendant $(e_1^*, v^*, e_2^*)$ in $P'(t)$ where $v^* = v'$. If $v^* = v$, then for $v \in e_1^* \cap e_2^*$ we must have $e_1^* \ne e_1 - v + v'$ and $e_2^* \ne e_2 - v + v'$. All of the $k$ other descendants of $e_1$ and the $k$ other descendants of $e_2$ contain $v$, so there are $k^2$ descendants $(e_1^*, v^*, e_2^*)$ in $P'(t)$ where $v^* = v$. In total, each $(e_1, v, e_2) \in P'(t-1)$ has $k^2 + 1$ descendants in $P'(t)$, giving
$$|P'(t)| \ge (k^2+1)\,|P'(t-1)|.$$

Next, suppose we have some $(e_1^*, v^*, e_2^*) \in P'(t)$, so in particular, $v^* \in e_1^* \cap e_2^*$. Consider their respective predecessors $e_1$, $v$, and $e_2$ in $H_{t-1}$. Lemma 11 provides that $v \in e_1$ and $v \in e_2$, so $(e_1, v, e_2) \in P'(t-1)$. Hence, every triple in $P'(t)$ is a descendant of a triple in $P'(t-1)$, and in particular, $|P'(t)| = (k^2+1)\,|P'(t-1)|$. Iterating this process, we derive that
$$|P'(t)| = (k^2+1)^t\,|P'(0)|.$$

Now, let $P(t)$ be the set of paths of length two in $H_t$. Recall that a path of length two is $(u, e_1, v, e_2, w)$ where $u, v, w \in V(H_t)$ are distinct, $e_1, e_2 \in E(H_t)$ are distinct, and $u, v \in e_1$, $v, w \in e_2$.

For $0 \le i \le k$, let $P_i(t)$ be the set of ordered pairs $(e_1, e_2)$ of hyperedges with $|e_1 \cap e_2| = i$. Note that $|P_k(t)| = e(t)$. We then have that $|P'(t)| = \sum_{i=0}^{k} i\,|P_i(t)|$ and $|P(t)| = \sum_{i=1}^{k-1} i\,(k-i)^2\,|P_i(t)|$. This gives that
$$|P'(t)| - k\,e(t) = \sum_{i=1}^{k-1} i\,|P_i(t)| \le |P(t)| \le (k-1)^2\,|P'(t)|,$$
and so
$$(k^2+1)^t\,|P'(0)| - k\,(k+1)^t\,e(0) \le |P(t)| \le (k-1)^2\,(k^2+1)^t\,|P'(0)|,$$
which completes the proof. □

We next have the following lemma.
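Before moving on, the exact recurrence $|P'(t)| = (k^2+1)^t |P'(0)|$ from the proof of Lemma 12 can be checked numerically on small cases. The Python sketch below is ours: it assumes vertices $0, \dots, n-1$ with the clone of $v$ labelled $v + n$, starts from a single hyperedge (so $|P'(0)| = k$), and uses the observation that $|P'(t)|$ equals the sum of squared vertex degrees, since a triple $(e_1, v, e_2)$ is an ordered choice of two hyperedges through $v$. The function names are illustrative.

```python
from collections import Counter


def ilth_step(n, edges):
    """One ILTH time-step; assumed labelling: the clone of v is v + n."""
    new_edges = set(edges)
    for e in edges:
        for x in e:
            new_edges.add((e - {x}) | {x + n})
    return 2 * n, new_edges


def p_prime_size(edges):
    """|P'|: number of triples (e1, v, e2) with v in e1 and e2 (e1 = e2 allowed).

    For each vertex v this is deg(v)^2, so we sum the squared degrees.
    """
    degree = Counter(v for e in edges for v in e)
    return sum(d * d for d in degree.values())


def p_prime_sequence(k, steps):
    """Return [|P'(0)|, ..., |P'(steps)|] starting from one hyperedge of size k."""
    n, edges = k, {frozenset(range(k))}
    sizes = [p_prime_size(edges)]
    for _ in range(steps):
        n, edges = ilth_step(n, edges)
        sizes.append(p_prime_size(edges))
    return sizes


if __name__ == "__main__":
    for k in (3, 4):
        sizes = p_prime_sequence(k, 3)
        print(k, sizes, [k * (k * k + 1) ** t for t in range(4)])
```

For each $k$ the two printed lists should coincide, which is the statement that every triple has exactly $k^2 + 1$ descendants.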

Lemma 13.

The number of hypertriangles in $H_t$ is $\Theta\big(((k-1)^3 + 3(k-1))^t\big)$.

Proof. Let
$$T'(t) = \{(u, e_1, v, e_2, w, e_3) : u, v, w \in V(H_t) \text{ distinct},\ e_1, e_2, e_3 \in E(H_t),\ u \in e_1 \cap e_3,\ v \in e_1 \cap e_2,\ w \in e_2 \cap e_3\}.$$
Note that, while closely related, this is not the same as the set of hypertriangles as we do not insist that the edges $e_1$, $e_2$ and $e_3$ are distinct. We find an exact value for $|T'(t)|$ in terms of $t$ and $|T'(0)|$, which will enable us to bound the number of hypertriangles.

Fix some $(u, e_1, v, e_2, w, e_3) \in T'(t-1)$. We wish to count the number of descendants $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ this has in $T'(t)$. If $v^* = v'$, then for $v^* \in e_1^* \cap e_2^*$ we must have $e_1^* = e_1 - v + v'$ and $e_2^* = e_2 - v + v'$. Since $u' \notin e_1 - v + v'$ and $w' \notin e_2 - v + v'$, this means that $u^* = u$ and $w^* = w$. Since $u^*$ and $w^*$ are in $e_3^*$, $e_3^*$ must be $e_3$ or $e_3 - x + x'$ for some $x \in e_3$ not equal to $u$ or $w$, and indeed each of these $k - 1$ choices of $e_3^*$ gives a $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$.

An analogous argument in the cases $u^* = u'$ and $w^* = w'$ shows that if one of $u^*, v^*, w^*$ is a clone then the other two are not, and there are $3(k-1)$ descendants $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$ of this form.

Otherwise, none of $u^*, v^*, w^*$ is a clone. We then have that $e_1^*$ must be $e_1$ or $e_1 - x + x'$ for some $x \in e_1 - u - v$, $e_2^*$ must be $e_2$ or $e_2 - y + y'$ for some $y \in e_2 - v - w$, and $e_3^*$ must be $e_3$ or $e_3 - z + z'$ for some $z \in e_3 - u - w$. Any combination of these gives a $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$, and so there are $(k-1)^3$ contributing to the count. In total, each $(u, e_1, v, e_2, w, e_3) \in T'(t-1)$ has $(k-1)^3 + 3(k-1)$ descendants in $T'(t)$, giving
$$|T'(t)| \ge \big((k-1)^3 + 3(k-1)\big)\,|T'(t-1)|.$$

In the other direction, suppose we have some $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*)$ in $T'(t)$. Consider their respective predecessors $u, e_1, v, e_2, w$ and $e_3$ in $H_{t-1}$. We know that $u, v, w$ must be distinct: if, say, $u = v$ then either $u^* = v^*$, contradicting that $(u^*, e_1^*, v^*, e_2^*, w^*, e_3^*) \in T'(t)$, or $\{u^*, v^*\} = \{v, v'\}$. This in turn contradicts that there is a hyperedge $e_1^*$ containing both. An analogous argument shows that $v \ne w$ and $w \ne u$. Lemma 11 provides that $u \in e_1 \cap e_3$, $v \in e_1 \cap e_2$ and $w \in e_2 \cap e_3$, so $(u, e_1, v, e_2, w, e_3) \in T'(t-1)$. Hence, every 6-tuple in $T'(t)$ is a descendant of a 6-tuple in $T'(t-1)$, and in particular, $|T'(t)| = \big((k-1)^3 + 3(k-1)\big)\,|T'(t-1)|$. Iterating this, we obtain that
$$|T'(t)| = \big((k-1)^3 + 3(k-1)\big)^t\,|T'(0)|.$$

Now, let $T(t)$ be the set of hypertriangles in $H_t$. Note that $|T(t)|$ is the number of 6-tuples $(u, e_1, v, e_2, w, e_3)$ in $T'(t)$ where $e_1$, $e_2$ and $e_3$ are all distinct. Hence, we have that
$$|T(t)| \le |T'(t)| = \big((k-1)^3 + 3(k-1)\big)^t\,|T'(0)|.$$

For a lower bound, we count the number of 6-tuples in $T'(t)$ where $e_1$, $e_2$ and $e_3$ are not distinct. If $e_1 = e_2 \ne e_3$, then $u$ and $w$ are distinct elements in $e_1 \cap e_3 = e_2 \cap e_3$ and $v \in e_1 - u - w$. Recalling that $|P_i(t)|$ is the number of pairs of edges intersecting in $i$ vertices, we find that there are $\sum_{i=2}^{k-1} i(i-1)(k-2)\,|P_i(t)|$ such 6-tuples. Similarly, there are $\sum_{i=2}^{k-1} i(i-1)(k-2)\,|P_i(t)|$ with $e_1 = e_3 \ne e_2$ and with $e_2 = e_3 \ne e_1$.

Finally, note that when $e_1 = e_2 = e_3$ we just have $u, v, w$ distinct vertices in $e_1$, and so there are $k(k-1)(k-2)\,e(t)$ 6-tuples in $T'(t)$ with $e_1 = e_2 = e_3$. Putting these together gives
$$|T'(t)| = |T(t)| + 3 \sum_{i=2}^{k-1} i(i-1)(k-2)\,|P_i(t)| + k(k-1)(k-2)\,e(t).$$

To bound $\sum_{i=2}^{k-1} i(i-1)\,|P_i(t)|$, we use that $\sum_{i=0}^{k} i\,|P_i(t)| = |P'(t)| = (k^2+1)^t\,|P'(0)|$, as calculated in the proof of Lemma 12. In particular, we have that
$$\sum_{i=2}^{k-1} i(i-1)\,|P_i(t)| \le (k-2) \sum_{i=0}^{k} i\,|P_i(t)| \le (k-2)\,(k^2+1)^t\,|P'(0)|.$$
We next have that
$$|T(t)| = |T'(t)| - 3(k-2) \sum_{i=2}^{k-1} i(i-1)\,|P_i(t)| - k(k-1)(k-2)\,e(t) \ge \big((k-1)^3 + 3(k-1)\big)^t\,|T'(0)| - 3(k-2)^2\,(k^2+1)^t\,|P'(0)| - k(k-1)(k-2)\,(k+1)^t\,e(0),$$
which completes the proof. □

As an immediate consequence of Lemmas 12 and 13, we obtain Theorem 8 on the value of the $\mathrm{HC}_1$ clustering coefficient on ILTH hypergraphs.

To contextualize the result of Theorem 8, we compare $\mathrm{HC}_1(H_t)$ to $\mathrm{HC}_1$ for other $k$-uniform hypergraphs. For the complete $k$-uniform hypergraph $K_n^{(k)}$ it is straightforward to derive, by counting choices of $u, v, w$ and the edges containing them, that
$$\mathrm{HC}_1(K_n^{(k)}) = \frac{\binom{n}{3}\binom{n-2}{k-2}^3}{\binom{n}{3}\binom{n-2}{k-2}^2} = \binom{n-2}{k-2}.$$
When $n = n(t) = 2^t n(0)$, this gives $\mathrm{HC}_1(K_n^{(k)}) = \Theta\big(2^{(k-2)t}\big)$, which is larger than $\mathrm{HC}_1(H_t)$, as expected.

We consider the expected value of $\mathrm{HC}_1$ in the random hypergraph $G(n,k,p)$. Here, given a path $(u, e_1, v, e_2, w)$ of length two, the expected number of hypertriangles of the form $(u, e_1, v, e_2, w, e_3)$ is $\binom{n-2}{k-2}\,p$. This gives
$$\mathbb{E}\big(\mathrm{HC}_1(G(n,k,p))\big) = \binom{n-2}{k-2}\,p.$$
Let $n = n(t) = 2^t n(0)$ and $p = (k+1)^t e(0) / \binom{n}{k}$. We then have that
$$\mathbb{E}\big(\mathrm{HC}_1(G(n,k,p))\big) = \frac{\binom{n-2}{k-2}\,(k+1)^t\,e(0)}{\binom{n}{k}} = \frac{k(k-1)\,(k+1)^t\,e(0)}{2^t n(0)\,\big(2^t n(0) - 1\big)} = \Theta\left(\left(\frac{k+1}{4}\right)^t\right).$$
As $\frac{k+1}{4} < \frac{(k-1)^3 + 3(k-1)}{k^2+1}$, the clustering coefficient $\mathrm{HC}_1$ for $H_t$ grows faster than that for the random hypergraph of the same expected average degree.

4.2. The clustering coefficient $\mathrm{HC}_2$. In this subsection, we prove Theorem 9. We first introduce a useful set of 5-tuples:
$$A(t) = \{(u, e_1, v, e_2, w) : u, v, w \in V(H_t) \text{ distinct},\ e_1, e_2 \in E(H_t), \text{ and for some } e_3 \in E(H_t),\ u \in e_1 \cap e_3,\ v \in e_1 \cap e_2,\ w \in e_2 \cap e_3\}.$$
One view of a 5-tuple in $A(t)$ is as a (possibly degenerate) path of length two that can be completed to a (possibly degenerate) hypertriangle. We have the following lemma counting the elements of $A(t)$, which will greatly assist in estimating $\mathrm{HC}_2$ in ILTH hypergraphs.

Lemma 14.

For all nonnegative integers $t$, $|A(t)| = (k^2)^t\,|A(0)|$.

Proof. For a fixed 5-tuple $(u, e_1, v, e_2, w) \in A(t-1)$, we count the number of descendants $(u^*, e_1^*, v^*, e_2^*, w^*)$ this has in $A(t)$. If $v^* = v'$, then for $v^* \in e_1^* \cap e_2^*$ we must have $e_1^* = e_1 - v + v'$ and $e_2^* = e_2 - v + v'$. Since $u' \notin e_1 - v + v'$ and $w' \notin e_2 - v + v'$, this means that $u^* = u$ and $w^* = w$, and we know there is a hyperedge $e_3$ containing both. Thus, there is one descendant $(u^*, e_1^*, v^*, e_2^*, w^*)$ in $A(t)$ with $v^* = v'$.

Otherwise, suppose $v^* = v$. We cannot have both $u^* = u'$ and $w^* = w'$, as there does not exist any hyperedge in $E(H_t)$ containing both $u'$ and $w'$. We can have $u^* = u'$ and $w^* = w$, as the hyperedge $e_3 - u + u' \in E(H_t)$ contains both. In this case, $e_1^*$ must be $e_1 - u + u'$ and $e_2^*$ must be $e_2$ or $e_2 - y + y'$ for some $y \in e_2 - v - w$, giving $k - 1$ descendants in $A(t)$. Similarly, we can have $u^* = u$ and $w^* = w'$, and there are a further $k - 1$ descendants in $A(t)$ of this form.

Finally, we can have $u^* = u$ and $w^* = w$, as we know the hyperedge $e_3$ contains both. In this case $e_1^*$ must be $e_1$ or $e_1 - x + x'$ for some $x \in e_1 - v - u$ and $e_2^*$ must be $e_2$ or $e_2 - y + y'$ for some $y \in e_2 - v - w$, giving $(k-1)^2$ descendants in $A(t)$ of this form. In total, each $(u, e_1, v, e_2, w) \in A(t-1)$ has $1 + 2(k-1) + (k-1)^2 = k^2$ descendants $(u^*, e_1^*, v^*, e_2^*, w^*)$ in $A(t)$, giving $|A(t)| \ge k^2\,|A(t-1)|$.

In the other direction, suppose we have some $(u^*, e_1^*, v^*, e_2^*, w^*)$ in $A(t)$. Let $e_3^*$ be a hyperedge in $H_t$ containing both $u^*$ and $w^*$. Consider their respective predecessors $u, e_1, v, e_2, w$ and $e_3$ in $H_{t-1}$. We know that $u, v, w$ must be distinct: if, say, $u = v$ then either $u^* = v^*$, contradicting $(u^*, e_1^*, v^*, e_2^*, w^*) \in A(t)$, or $\{u^*, v^*\} = \{v, v'\}$, contradicting that there is a hyperedge $e_1^*$ containing both. An analogous argument shows that $v \ne w$ and $w \ne u$. Applying Lemma 11 (the contrapositive of Lemma 10) shows that $u \in e_1 \cap e_3$, $v \in e_1 \cap e_2$ and $w \in e_2 \cap e_3$, so $(u, e_1, v, e_2, w) \in A(t-1)$. Hence, every 5-tuple in $A(t)$ is a descendant of a 5-tuple in $A(t-1)$, and in particular, $|A(t)| = k^2\,|A(t-1)|$. Iterating this, we have that $|A(t)| = (k^2)^t\,|A(0)|$. □

We can now use Lemma 14 to prove Theorem 9.
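Before doing so, the count in Lemma 14 can be sanity-checked by brute force on small cases. The Python sketch below is ours: it assumes vertices $0, \dots, n-1$ with the clone of $v$ labelled $v + n$, starts from a single hyperedge of cardinality $k$ (so $|A(0)| = k(k-1)(k-2)$), and enumerates $A(t)$ directly from its definition. The function names are illustrative.

```python
from itertools import product


def ilth_step(n, edges):
    """One ILTH time-step; assumed labelling: the clone of v is v + n."""
    new_edges = set(edges)
    for e in edges:
        for x in e:
            new_edges.add((e - {x}) | {x + n})
    return 2 * n, new_edges


def a_size(edges):
    """|A|: 5-tuples (u, e1, v, e2, w) with u, v, w distinct, u, v in e1,
    v, w in e2 (e1 = e2 allowed), and u, w together in some hyperedge."""
    edges = list(edges)
    adjacent = {(u, w) for e in edges for u in e for w in e if u != w}
    total = 0
    for e1, e2 in product(edges, repeat=2):
        for v in e1 & e2:
            for u in e1 - {v}:
                for w in e2 - {v, u}:
                    if (u, w) in adjacent:
                        total += 1
    return total


def a_sequence(k, steps):
    """Return [|A(0)|, ..., |A(steps)|] starting from one hyperedge of size k."""
    n, edges = k, {frozenset(range(k))}
    sizes = [a_size(edges)]
    for _ in range(steps):
        n, edges = ilth_step(n, edges)
        sizes.append(a_size(edges))
    return sizes


if __name__ == "__main__":
    for k in (3, 4):
        print(k, a_sequence(k, 2))
```

Successive terms of each printed sequence should differ by a factor of exactly $k^2$, matching the $1 + 2(k-1) + (k-1)^2$ descendants counted in the proof.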

Proof of Theorem 9.

Recall that
$$\mathrm{HC}_2(H) = \frac{\text{number of paths } (u, e_1, w, e_2, v) \text{ where } u \text{ and } v \text{ are in a hyperedge}}{\text{number of paths of length two}}.$$
Let $\Lambda(t)$ be the set of paths $(u, e_1, v, e_2, w)$ in $H_t$ where $u \sim w$. We then have that $\Lambda(t) \subseteq A(t)$. Also, a 5-tuple $(u, e_1, v, e_2, w)$ is in $A(t)$ but not $\Lambda(t)$ if and only if $e_1 = e_2$, and there are $k(k-1)(k-2)\,e(t)$ such 5-tuples. Thus, we have that
$$|\Lambda(t)| = |A(t)| - k(k-1)(k-2)\,e(t) = k^{2t}\,|A(0)| - k(k-1)(k-2)\,(k+1)^t\,e(0) = \Theta\big(k^{2t}\big).$$
Combining this with Lemma 12, we derive that $\mathrm{HC}_2(H_t) = \Theta\left(\left(\frac{k^2}{k^2+1}\right)^t\right)$, as required. □

We contextualize these results by comparing them to the random $k$-uniform hypergraph $G(n,k,p)$ with the same expected average degree. We derive a lemma computing the expected value of $\mathrm{HC}_2$ on random hypergraphs.

Lemma 15.

For a given $k$ and $p$, we have that
$$\mathbb{E}\big(\mathrm{HC}_2(G(n,k,p))\big) = 1 - (1-p)^{\binom{n-2}{k-2}}.$$

Proof.

Suppose that we are given a path $(u, e_1, v, e_2, w)$ and we wish to know the probability that the two vertices $u, w$ lie in some hyperedge. There are $\binom{n-2}{k-2}$ $k$-sets containing both $u$ and $w$, and the probability that none of them is a hyperedge of $G(n,k,p)$ is $(1-p)^{\binom{n-2}{k-2}}$. Thus, the probability that $u \sim w$ is $1 - (1-p)^{\binom{n-2}{k-2}}$. □

We compare $H_t$ to a random hypergraph with the same number of vertices and the same expected average degree. Set $n = 2^t n(0)$ and choose $p$ such that $\binom{n}{k}\,p = (k+1)^t e(0)$. We then have that
$$\mathbb{E}\big(\mathrm{HC}_2(G(n,k,p))\big) \ge 1 - (1-p)^{\binom{n-2}{k-2}} \ge 1 - \exp\left(-p \binom{n-2}{k-2}\right) \ge 1 - \exp\left(-c \left(\frac{k+1}{4}\right)^t\right),$$
where $c$ depends only on $k$, $n(0)$, and $e(0)$. Hence, we conclude that $\mathbb{E}\big(\mathrm{HC}_2(G(n,k,p))\big)$ is at least $1 - \exp\big(-c\,(\tfrac{k+1}{4})^t\big)$. For $k \ge 4$, this quantity tends to 1 as $t$ tends to infinity, and it does so doubly exponentially fast. On the other hand, we have that $\mathrm{HC}_2(H_t) = O\big((\tfrac{k^2}{k^2+1})^t\big)$, which tends to 0 exponentially fast as $t$ tends to infinity. Thus, we find that by this measure the clustering for $H_t$ is extremely low compared to the random hypergraph with the same expected average degree. If $k = 3$, then $\mathbb{E}\big(\mathrm{HC}_2(G(n,k,p))\big)$ is at least the constant $1 - e^{-c}$, which is larger than $\mathrm{HC}_2(H_t) = O\big((\tfrac{9}{10})^t\big)$.

Measured by the clustering coefficient $\mathrm{HC}_1$, the hypergraph $H_t$ has higher clustering than comparable random hypergraphs, but this fails for $\mathrm{HC}_2$. The reason for the discrepancy is that the two clustering coefficients are counting different structures. Given a pair of intersecting edges $e_1, e_2$, the value of $\mathrm{HC}_2$ counts how many pairs of vertices $u \in e_1 - e_2$, $w \in e_2 - e_1$ there are that are contained in some hyperedge $e_3$. As this is low for $H_t$ compared to random hypergraphs, fewer of those pairs are contained in any hyperedge than we might expect. The value of $\mathrm{HC}_1$ roughly counts how many edges $e_3$ intersect both $e_1$ and $e_2$ to make a hypertriangle. As this is large for $H_t$ when compared to random hypergraphs, there are more of these edges than we might expect. Hence, relative to the random hypergraph, fewer pairs of vertices $u \in e_1 - e_2$, $w \in e_2 - e_1$ are contained in a hyperedge, but those that are contained in a hyperedge must be contained in many hyperedges.

4.3. A variant of ILTH with large $\mathrm{HC}_2$ values. To remedy the situation with ILTH having lower $\mathrm{HC}_2$ values than random hypergraphs, we consider a variant of the model where clones and their parents are in certain hyperedges. Such a variant is a natural one, as we may expect newly formed hyperedges to include both parent and child vertices.

Let $H_0^{(2)}$ be a fixed $k$-uniform hypergraph. We iteratively construct $H_{t+1}^{(2)}$, where $t \ge 0$, from $H_t^{(2)}$. For each $v \in V(H_t^{(2)})$, add $k$ vertices $v$ and $v_1, v_2, \dots, v_{k-1}$ to $H_{t+1}^{(2)}$. We call these $v_i$ the clones of $v$. For each $e \in E(H_t^{(2)})$, add to $H_{t+1}^{(2)}$ the hyperedge $e$ and each of the edges $e - x + x_i$, where $x$ is a vertex in $e$ and $1 \le i \le k - 1$. In addition, for each $v \in V(H_t^{(2)})$, add to $H_{t+1}^{(2)}$ the hyperedge $\{v, v_1, v_2, \dots, v_{k-1}\}$. We refer to the model as ILTH$_2$, and hypergraphs generated by the model are ILTH$_2$ hypergraphs. See Figure 3.

Figure 3. The ILTH$_2$ model applied to cloning $x$ in the hyperedge $xyz$.

The ILTH$_2$ model is motivated by the desire to have clones and parent adjacent, as in the original ILT model. While the models are distinct, ILTH$_2$ hypergraphs share properties with the ILTH hypergraphs such as densification and low distances. One key difference between ILTH and ILTH$_2$ is the clustering coefficient $\mathrm{HC}_2$. We have the following theorem, whose proof is analogous to the one of Theorem 9 and so is omitted.

Theorem 16.

For nonnegative integers $t$, we have that
$$\mathrm{HC}_2\big(H_t^{(2)}\big) = \Theta\left(\left(1 - \frac{(k-1)^2}{(k^2 - 2k + 2)^2 + k - 1}\right)^t\right).$$

We compare $H_t^{(2)}$ to the random hypergraph with the same number of vertices and the same expected average degree. Set $n = k^t n(0)$ and choose $p$ such that the expected number of edges $\binom{n}{k}\,p$ is $e(t)$. In particular, we have that
$$p = \Theta\left(\left(\frac{k^2 - k + 1}{k^k}\right)^t\right).$$
Applying Lemma 15, we have that
$$\mathbb{E}\big(\mathrm{HC}_2(G(n,k,p))\big) = \Theta\left(\left(\frac{k^2 - k + 1}{k^2}\right)^t\right).$$
For all $k \ge 2$, we find that
$$\frac{k^2 - k + 1}{k^2} = 1 - \frac{k-1}{k^2} < 1 - \frac{(k-1)^2}{(k^2 - 2k + 2)^2 + k - 1},$$
so the clustering coefficient $\mathrm{HC}_2$ is larger for $H_t^{(2)}$ than in random hypergraphs.

5. Further directions

We introduced the new ILTH model for complex hypergraphs. We found that ILTH hypergraphs densify over time and have low average distances. We considered motifs and found that for those occurring in the ILTH model, their counts grow faster than in random hypergraphs with the same expected average degree. The 2-sections of ILTH hypergraphs were shown in Theorem 3 to contain isomorphic copies of all graphs admitting a homomorphism to the 2-section of $H_0$. We finished with an analysis of clustering coefficients, and it was shown that $\mathrm{HC}_1$ was larger in ILTH hypergraphs than in random hypergraphs. A similar result was proven for $\mathrm{HC}_2$ applied to a variant of ILTH, where parents are adjacent to their clones.
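The growth-rate comparisons behind these clustering statements reduce to comparing the bases of exponentials in $t$, and can be tabulated exactly. The Python sketch below is ours and rests on our reading of the formulas: base $\frac{(k-1)^3 + 3(k-1)}{k^2+1}$ for $\mathrm{HC}_1(H_t)$ (Theorem 8), base $\frac{k+1}{4}$ for the expected $\mathrm{HC}_1$ of the comparable random hypergraph (Subsection 4.1), and base $\frac{k^2}{k^2+1}$ for $\mathrm{HC}_2(H_t)$ (Theorem 9). The function names are illustrative.

```python
from fractions import Fraction


def hc1_ilth_base(k):
    """Base of the exponential growth of HC_1(H_t), as read from Theorem 8."""
    return Fraction((k - 1) ** 3 + 3 * (k - 1), k * k + 1)


def hc1_random_base(k):
    """Base for the expected HC_1 of the comparable random hypergraph."""
    return Fraction(k + 1, 4)


def hc2_ilth_base(k):
    """Base of the exponential decay of HC_2(H_t), as read from Theorem 9."""
    return Fraction(k * k, k * k + 1)


if __name__ == "__main__":
    for k in range(3, 8):
        print(k, hc1_ilth_base(k), hc1_random_base(k), hc2_ilth_base(k))
```

Under these assumptions the ILTH base for $\mathrm{HC}_1$ exceeds the random one for every $k \ge 3$, while the $\mathrm{HC}_2$ base is always below 1, which is why $\mathrm{HC}_2(H_t)$ tends to 0.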

Several questions remain surrounding ILTH hypergraphs. We may consider variants of the model, and study properties of hypergraphs generated by the model. For example, we may allow hyperedges of non-uniform orders, or randomize the model by adding random hyperedges to sets of clones. An open problem is to determine the age of ILTH hypergraphs; that is, what are the induced subhypergraphs of ILTH hypergraphs?

Another direction is to consider other notions of clustering in ILTH hypergraphs. Several hypergraph clustering coefficients were investigated in [17], for example, and it would be interesting to consider their values in the ILTH model.

References

[1] S.G. Aksoy, C. Joslyn, C.O. Marrero, B. Praggastis, E. Purvine, Hypernetwork science via high-order hypergraph walks, EPJ Data Science 16, 2020.
[2] A.R. Benson, R. Abebe, M.T. Schaub, A. Jadbabaie, J. Kleinberg, Simplicial closure and higher-order link prediction, Proceedings of the National Academy of Sciences (2018) E11221–E11230.
[3] A.R. Benson, D.F. Gleich, D.J. Higham, Higher-order network analysis takes off, fueled by old ideas and new data, SIAM News, https://cutt.ly/gkwhM9w, last accessed January 29, 2021.
[4] A.R. Benson, D.F. Gleich, J. Leskovec, Higher-order organization of complex networks, Science (2016) 163–166.
[5] C. Berge, Graphs and Hypergraphs, Elsevier, New York, 1973.
[6] C. Berge, Hypergraphs: The Theory of Finite Sets, North-Holland, Amsterdam, 1989.
[7] A. Bonato, A Course on the Web Graph, American Mathematical Society, Providence, Rhode Island, 2008.
[8] A. Bonato, H. Chuangpishit, S. English, B. Kay, E. Meger, The iterated local model for social networks, Discrete Applied Mathematics (2020) 555–571.
[9] A. Bonato, D.W. Cranston, M.A. Huggan, T. Marbach, R. Mutharasan, The Iterated Local Directed Transitivity model for social networks, In: Proceedings of WAW'20, 2020.
[10] A. Bonato, D.F. Gleich, M. Kim, D. Mitsche, P. Prałat, A. Tian, S.J. Young, Dimensionality matching of social networks using motifs and eigenvalues, PLOS ONE (9):e106052, 2014.
[11] A. Bonato, N. Hadi, P. Horn, P. Prałat, C. Wang, Models of on-line social networks, Internet Mathematics (2011) 285–313.
[12] A. Bonato, N. Hadi, P. Prałat, C. Wang, Dynamic models of on-line social networks, In: Proceedings of WAW'09, 2009.
[13] A. Bonato, A. Tian, Complex networks and social networks, invited book chapter in: Social Networks, editor E. Kranakis, Springer, Mathematics in Industry series, 2011.
[14] F.R.K. Chung, L. Lu, Complex Graphs and Networks, American Mathematical Society, Providence, Rhode Island, 2006.
[15] M.T. Do, S. Yoon, B. Hooi, K. Shin, Structural patterns and generative models of real-world hypergraphs, In: Proceedings of Knowledge Discovery in Databases (KDD), 2020.
[16] D. Easley, J. Kleinberg, Networks, Crowds, and Markets: Reasoning about a Highly Connected World, Cambridge University Press, 2010.
[17] E. Estrada, J.A. Rodríguez-Velázquez, Subgraph centrality and clustering in complex hyper-networks, Physica A: Statistical Mechanics and its Applications (2006) 581–594.
[18] S.R. Gallagher, D.S. Goldberg, Clustering Coefficients in Protein Interaction Hypernetworks, BCB'13: Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013).
[19] F. Heider, The Psychology of Interpersonal Relations, John Wiley & Sons, 1958.
[20] P. Hell, J. Nešetřil, Graphs and Homomorphisms, Oxford University Press, New York, 2004.
[21] G. Lee, J. Ko, K. Shin, Hypergraph motifs: concepts, algorithms, and discoveries, In: Proceedings of the VLDB Endowment, 2020.
[22] J. Leskovec, J. Kleinberg, C. Faloutsos, Graphs over time: densification laws, shrinking diameters and possible explanations, In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005.
[23] V. Memišević, T. Milenković, N. Pržulj, An integrative approach to modeling biological networks, Journal of Integrative Bioinformatics :120, 2010.
[24] J.P. Scott, Social Network Analysis: A Handbook, Sage Publications Ltd, London, 2000.
[25] L. Small, O. Mason, Information diffusion on the iterated local transitivity model of online social networks, Discrete Applied Mathematics (2013) 1338–1344.
[26] D.B. West, Introduction to Graph Theory, 2nd edition, Prentice Hall, 2001.
[27] W. Zhou, L. Nakhleh, Properties of metabolic graphs: biological organization in representation artifacts, BMC Bioinformatics (2011).

(A1, A2, A3, A4, A5)

Ryerson University, Toronto, Canada
