Distribution-Free Models of Social Networks
Tim Roughgarden† and C. Seshadhri‡

August 3, 2020
Abstract
The structure of large-scale social networks has predominantly been articulated using generative models, a form of average-case analysis. This chapter surveys recent proposals of more robust models of such networks. These models posit deterministic and empirically supported combinatorial structure rather than a specific probability distribution. We discuss the formal definitions of these models and how they relate to empirical observations in social networks, as well as the known structural and algorithmic results for the corresponding graph classes.
1 Introduction

Technological developments in the 21st century have given rise to large-scale social networks, such as the graphs defined by Facebook friendship relationships or followers on Twitter. Such networks arguably provide the most important new application domain for graph analysis in well over a decade.
There is wide consensus that social networks have predictable structure and features, and accordingly are not well modeled by arbitrary graphs. From a structural viewpoint, the most well studied and empirically validated properties of social networks are:

1. A heavy-tailed degree distribution, such as a power-law distribution.
2. Triadic closure, meaning that pairs of vertices with a common neighbor tend to be directly connected—that friends of friends tend to be friends in their own right.
3. The presence of "community-like structures," meaning subgraphs that are much more richly connected internally than externally.
4. The small-world property, meaning that it's possible to travel from any vertex to any other vertex using remarkably few hops.

∗Chapter 28 of the book Beyond the Worst-Case Analysis of Algorithms (Roughgarden, 2020).
†Department of Computer Science, Columbia University. Supported in part by NSF award CCF-1813188 and ARO award W911NF1910294. Email: [email protected].
‡Department of Computer Science, University of California at Santa Cruz. Supported in part by NSF TRIPODS grant CCF-1740850, NSF grants CCF-1813165 and CCF-1909790, and ARO award W911NF1910294. Email: [email protected].

These properties are not exhibited by Erdős–Rényi random graphs G(n, p); a new model is needed to capture them. From an algorithmic standpoint, empirical results indicate that optimization problems are often easier to solve in social networks than in worst-case graphs. For example, lightweight heuristics are unreasonably effective in practice for finding the maximum clique or recovering dense subgraphs of a large social network.

The literature on models that capture the special structure of social networks is almost entirely driven by the quest for generative (i.e., probabilistic) models that replicate some or all of the four properties listed above. Dozens of generative models have been proposed, and there is little consensus about which is the "right" one. The plethora of models poses a challenge to meaningful theoretical work on social networks—which of the models, if any, is to be believed? How can we be sure that a given algorithmic or structural result is not an artifact of the model chosen?

This chapter surveys recent research on more robust models of large-scale social networks, which assume deterministic combinatorial properties rather than a specific generative model.
Structural and algorithmic results that rely only on these deterministic properties automatically carry over to any generative model that produces graphs possessing these properties (with high probability). Such results effectively apply "in the worst case over all plausible generative models." This hybrid of worst-case (over input distributions) and average-case (with respect to the distribution) analysis resembles several of the semi-random models discussed elsewhere in the book, such as in the preceding chapters on pseudorandom data (Chapter 26) and prior-independent auctions (Chapter 27).

Sections 2 and 3 of this chapter cover two models of social networks that are motivated by triadic closure, the second of the four signatures of social networks listed in Section 1. Sections 4 and 5 discuss two models motivated by heavy-tailed degree distributions.

2 c-Closed Graphs

Triadic closure is the property that, when two members of a social network have a friend in common, they are likely to be friends themselves. In graph-theoretic terminology, two-hop paths tend to induce triangles.

Triadic closure has been studied for decades in the social sciences and there is compelling intuition for why social networks should exhibit strong triadic closure properties. Two people with a common friend are much more likely to meet than two arbitrary people, and are likely to share common interests. They might also feel pressure to be friends to avoid imposing stress on their relationships with their common friend.

The data support this intuition. Numerous large-scale studies on online social networks provide overwhelming empirical evidence for triadic closure. The plot in Figure 1, derived from the network of email communications at the disgraced energy company Enron, is representative. Other social networks exhibit similar triadic closure properties.
The most extreme version of triadic closure would assert that whenever two vertices have a common neighbor, they are themselves neighbors: whenever (u, v) and (v, w) are in the edge set E, so is (u, w). The class of graphs satisfying this property is not very interesting—it is precisely the (vertex-)disjoint unions of cliques—but it forms a natural base case for more interesting parameterized definitions.

Figure 1: (a) Triadic closure in the Enron email network; (b) triadic closure in a random graph. In the Enron email graph, vertices correspond to Enron employees, and there is an edge connecting two employees if one sent at least one email to the other. In (a), vertex pairs of this graph are grouped according to the number of common neighbors (indicated on the x-axis). The y-axis shows the fraction of such pairs that are themselves connected by an edge. The edge density—the fraction of arbitrary vertex pairs that are directly connected—is roughly 10^{-4}. In (b), a cartoon of the analogous plot for an Erdős–Rényi graph with edge density p = 10^{-4} is shown. Erdős–Rényi graphs are not a good model for networks like the Enron network—their closure rate is too small, and the closure rate fails to increase as the number of common neighbors increases.

Our first definition of a class of graphs with strong triadic closure properties is that of c-closed graphs.

Definition 2.1 (Fox et al. (2020)). For a positive integer c, a graph G = (V, E) is c-closed if, whenever u, v ∈ V have at least c common neighbors, (u, v) ∈ E.

For a fixed number of vertices, the parameter c interpolates between unions of cliques (when c = 1) and all graphs (when c = |V| − 1). Even the case of c = 2—graphs that contain neither a K_{2,2} nor a diamond (i.e., K_4 minus an edge) as an induced subgraph—is already non-trivial.
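As a concrete illustration (a sketch of my own, not from the chapter), the c-closure of a small graph—the smallest c for which Definition 2.1 holds—can be computed directly from the definition, assuming the graph is given as an adjacency dict mapping each vertex to the set of its neighbors:

```python
from itertools import combinations

def closure_number(G):
    """Smallest c such that G is c-closed: one more than the maximum number
    of common neighbors over all non-adjacent vertex pairs."""
    worst = 0
    for u, v in combinations(G, 2):
        if v not in G[u]:  # only non-adjacent pairs can violate c-closure
            worst = max(worst, len(G[u] & G[v]))
    return worst + 1
```

On a disjoint union of cliques this returns 1, matching the base case above; Table 1 below reports the analogous quantity for several real networks.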
The c-closed condition is a coarse proxy for the empirical closure rates observed in social networks (like in Figure 1), asserting that the closure rate jumps to 100% for vertices with c or more common neighbors.

Next is a less stringent version of the definition, which is sufficient for the main algorithmic result of this section.

Definition 2.2 (Fox et al. (2020)). For a positive integer c, a vertex v of a graph G = (V, E) is c-good if whenever v has at least c common neighbors with another vertex u, (u, v) ∈ E. The graph G is weakly c-closed if every induced subgraph has at least one c-good vertex.

A c-closed graph is also weakly c-closed, as each of its vertices is c-good in each of its induced subgraphs. The converse is false; for example, a three-vertex path is not 1-closed, but it is weakly 1-closed (its middle vertex is 1-good, as it has no common neighbors with any other vertex). Equivalent to Definition 2.2 is the condition that G has an elimination ordering of c-good vertices, meaning the vertices can be ordered v_1, v_2, ..., v_n such that, for every i = 1, 2, ..., n, the vertex v_i is c-good in the subgraph induced by v_i, v_{i+1}, ..., v_n (Exercise 1). (Recall that a clique of a graph G = (V, E) is a subset S ⊆ V of vertices that are fully connected, meaning that (u, v) ∈ E for every pair u, v of distinct vertices of S.)

Are real-world social networks c-closed or weakly c-closed for reasonable values of c? The next table summarizes some representative numbers.

                    n       m        c     weak c
  email-Enron       36692   183831   161   34
  p2p-Gnutella04    10876   39994    24    8
  wiki-Vote         7115    103689   420   42
  ca-GrQc           5242    14496    41    9

Table 1: The c-closure and weak c-closure of four well-studied social networks from the SNAP (Stanford Large Network Dataset) collection of benchmarks (http://snap.stanford.edu/).
"email-Enron" is the network described in Figure 1; "p2p-Gnutella04" is the topology of a Gnutella peer-to-peer network circa 2002; "wiki-Vote" is the network of who votes on whom in promotion cases on Wikipedia; and "ca-GrQc" is the collaboration network of authors of papers uploaded to the General Relativity and Quantum Cosmology section of arXiv. For each network G, n indicates the number of vertices, m the number of edges, c the smallest value γ such that G is γ-closed, and "weak c" the smallest value γ such that G is weakly γ-closed.

These social networks are c-closed for much smaller values of c than the trivial bound of n − 1, and weakly c-closed for quite modest values of c.

Once a class of graphs has been defined, such as c-closed graphs, a natural agenda is to investigate fundamental optimization problems with graphs restricted to the class. We single out the problem of finding the maximum-size clique of a graph, primarily because it is one of the most central problems in social network analysis. In a social network, cliques can be interpreted as the most extreme form of a community.

The problem of computing the maximum clique of a graph reduces to the problem of enumerating the graph's maximal cliques—the maximum clique is also maximal, so it appears as the largest of the cliques in the enumeration. (A maximal clique is a clique that is not a strict subset of another clique.)

How does the c-closed condition help with the efficient computation of a maximum clique? We next observe that the problem of reporting all maximal cliques is polynomial-time solvable in c-closed graphs when c is a fixed constant. The algorithm is based on backtracking. For convenience, we give a procedure that, for any vertex v, identifies all maximal cliques that contain v. (The full procedure loops over all vertices.)

1. Maintain a history H, initially empty.
2. Let N denote the vertex set comprising v and all vertices w that are adjacent to both v and all vertices in H.
3. If N is a clique, report the clique H ∪ N and return.
4. Otherwise, recurse on each vertex w ∈ N \ {v}, with history H := H ∪ {v}.

Figure 2: The Moon-Moser graph with n = 12 vertices.

This subroutine reports all maximal cliques that contain v, whether the graph is c-closed or not (Exercise 2). In a c-closed graph, the maximum depth of the recursion is c—once |H| = c − 1, every pair of vertices in N \ {v} has at least c common neighbors (namely, the vertices of H ∪ {v}), and hence N must be a clique. The running time of the backtracking algorithm is therefore n^{c+O(1)} in c-closed graphs.

This simplistic backtracking algorithm is extremely slow except for very small values of c. Can we do better? There is a simple but clever algorithm that, for an arbitrary graph, enumerates all of the maximal cliques while using only polynomial time per clique.
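In code, the backtracking procedure reads as follows (a sketch of my own, assuming the graph is an adjacency dict mapping each vertex to the set of its neighbors; the wrapper loops over all vertices and deduplicates):

```python
from itertools import combinations

def cliques_containing(G, v, H=frozenset()):
    """Yield the maximal cliques of G that contain v (and extend the history H)."""
    # N: v plus every vertex adjacent to v and to all vertices of the history H.
    N = {v} | {w for w in G[v] if H <= G[w]}
    if all(y in G[x] for x, y in combinations(N, 2)):  # N is a clique
        yield frozenset(H | N)
        return
    for w in N - {v}:  # otherwise, recurse with history H := H + {v}
        yield from cliques_containing(G, w, H | {v})

def all_maximal_cliques(G):
    """The full procedure: loop over all vertices and deduplicate."""
    return {C for v in G for C in cliques_containing(G, v)}
```

Exercise 2 asks why this reports every maximal clique containing v; note that the same clique may be generated along several recursion paths, which is why the wrapper collects results into a set.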
Theorem 2.1 (Tsukiyama et al. (1977)). There is an algorithm that, given any input graph with n vertices and m edges, outputs all of the maximal cliques of the graph in O(mn) time per maximal clique.

Theorem 2.1 reduces the problem of enumerating all maximal cliques in polynomial time to the combinatorial task of proving a polynomial upper bound on the number of maximal cliques. Computing a maximum clique of an arbitrary graph is an NP-hard problem, so presumably there exist graphs with an exponential number of maximal cliques. The Moon-Moser graphs are a simple and famous example. For n a multiple of 3, the Moon-Moser graph with n vertices is the perfectly balanced complete (n/3)-partite graph, meaning the vertices are partitioned into n/3 groups of 3, and every vertex is connected to every other vertex except for the 2 vertices in the same group (Figure 2). Choosing one vertex from each group induces a maximal clique, for a total of 3^{n/3} maximal cliques, and these are all of the maximal cliques of the graph. More generally, a basic result in graph theory asserts that no n-vertex graph can have more than 3^{n/3} maximal cliques.

Theorem 2.2 (Moon and Moser (1965)). Every n-vertex graph has at most 3^{n/3} maximal cliques.

A Moon-Moser graph on n vertices is not c-closed even for c = n − 3, so there remains hope for a positive result for c-closed graphs with small c. The Moon-Moser graphs do show that the number of maximal cliques of a c-closed graph can be exponential in c (since a Moon-Moser graph on c vertices is trivially c-closed). Thus the best-case scenario for enumerating the maximal cliques of a c-closed graph is a fixed-parameter tractability result (with respect to the parameter c), stating that, for some function f and constant d (independent of c), the number of maximal cliques in an n-vertex c-closed graph is O(f(c) · n^d). The next theorem shows that this is indeed the case, even for weakly c-closed graphs.

Theorem 2.3 (Fox et al. (2020)). Every weakly c-closed graph with n vertices has at most 3^{(c−1)/3} · n^2 maximal cliques.

The following corollary is immediate from Theorems 2.1 and 2.3.
Corollary 2.3.1.
The maximum clique problem is polynomial-time solvable in weakly c-closed n-vertex graphs with c = O(log n).

The proof of Theorem 2.3 proceeds by induction on the number of vertices n. (One of the factors of n in the bound is from the n steps in this induction.) Let G be an n-vertex weakly c-closed graph. Assume that n ≥ 3; otherwise, the bound is trivial. By assumption, G has a c-good vertex v. By induction, G \ {v} has at most (n − 1)^2 · 3^{(c−1)/3} maximal cliques. (An induced subgraph of a weakly c-closed graph is again weakly c-closed.) Every maximal clique C of G \ {v} gives rise to a unique maximal clique in G (namely C or C ∪ {v}, depending on whether the latter is a clique). It remains to bound the number of uncounted maximal cliques of G, meaning the maximal cliques K of G for which K \ {v} is not maximal in G \ {v}.

An uncounted maximal clique K must include v, with K contained in v's neighborhood (i.e., in the subgraph induced by v and the vertices adjacent to it). Also, there must be a vertex u ∉ K such that (K \ {v}) ∪ {u} is a clique in G \ {v}; we say that u is a witness for K, as it certifies the non-maximality of K \ {v} in G \ {v}. Such a witness must be connected to every vertex of K \ {v}. It cannot be a neighbor of v, as otherwise K ∪ {u} would be a clique in G, contradicting K's maximality.

Choose an arbitrary witness for each uncounted clique of G and bucket these cliques according to their witness; recall that all witnesses are non-neighbors of v. For every uncounted clique K with witness u, all vertices of the clique K \ {v} are connected to both v and u. Moreover, because K is a maximal clique in G, K \ {v} is a maximal clique in the subgraph G_u induced by the common neighbors of u and v.

How big can such a subgraph G_u be? This is the step of the proof where the weakly c-closed condition is important: Because u is a non-neighbor of v and v is a c-good vertex, u and v have at most c − 1 common neighbors. Thus G_u has at most c − 1 vertices, and by Theorem 2.2, G_u has at most 3^{(c−1)/3} maximal cliques. Adding up over the at most n choices for u, the number of uncounted cliques is at most n · 3^{(c−1)/3}; this sum over possible witnesses is the source of the second factor of n in Theorem 2.3.
Combining this bound on the uncounted cliques with the inductive bound on the remaining maximal cliques of G yields the desired upper bound of

(n − 1)^2 · 3^{(c−1)/3} + n · 3^{(c−1)/3} ≤ n^2 · 3^{(c−1)/3}.

Figure 3: The proof of Theorem 2.3. N(v) denotes the neighbors of v. K denotes a maximal clique of G such that K \ {v} is not maximal in G \ {v}. There is a vertex u, not connected to v, that witnesses the non-maximality of K \ {v} in G \ {v}. Because v is a c-good vertex, u and v have at most c − 1 common neighbors.

3 Triangle-Dense Graphs

Our second graph class inspired by the strong triadic closure properties of social and information networks is the class of δ-triangle-dense graphs. These are graphs where a constant fraction of vertex pairs having at least one common neighbor are directly connected by an edge. Equivalently, a constant fraction of the wedges (i.e., two-hop paths) of the graph belong to a triangle.

Definition 3.1 (Gupta et al. (2016)). The triangle density of an undirected graph G is τ(G) := 3t(G)/w(G), where t(G) and w(G) denote the number of triangles and wedges of G, respectively. (We define τ(G) = 0 if w(G) = 0.) The class of δ-triangle-dense graphs consists of the graphs G with τ(G) ≥ δ.

(In the social networks literature, the triangle density is also called the transitivity or the global clustering coefficient.) Because every triangle of a graph contains 3 wedges, and no two triangles share a wedge, the triangle density of a graph is between 0 and 1—the fraction of wedges that belong to a triangle. Triangle density is another coarse proxy for the empirical closure rates observed in social networks (like in Figure 1(a)).

The 1-triangle-dense graphs are precisely the unions of disjoint cliques, while triangle-free graphs constitute the 0-triangle-dense graphs. The triangle density of an Erdős–Rényi graph with edge probability p is concentrated around p (cf., Figure 1(b)). For an Erdős–Rényi graph to have constant triangle density, one would need to set p = Ω(1). This would imply that the graph is dense, quite unlike social networks. For example, in the year 2011 the triangle density of the Facebook graph was computed to be 0.16, which is five orders of magnitude larger than in a random graph with the same number of vertices (roughly 1 billion at the time) and edges (roughly 100 billion).
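Definition 3.1 translates directly into code. The following sketch (my own, assuming an adjacency-dict graph) counts each wedge at its center vertex and checks whether it closes into a triangle, so the ratio of closed wedges to all wedges is exactly τ(G):

```python
from itertools import combinations

def triangle_density(G):
    """tau(G) = 3t(G)/w(G): the fraction of wedges that belong to a triangle."""
    wedges = closed = 0
    for v in G:
        for u, w in combinations(G[v], 2):  # a wedge centered at v
            wedges += 1
            if u in G[w]:                   # the wedge closes into a triangle
                closed += 1                 # each triangle closes at 3 centers
    return closed / wedges if wedges else 0.0
```

Here `closed` equals 3t(G) because every triangle is counted once at each of its three centers; on a clique this returns 1.0, and on a triangle-free graph, 0.0.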
What do δ-triangle-dense graphs look like? Can we make any structural assertions about them, akin to separator theorems for planar graphs (allowing them to be viewed as "approximate grids") or the regularity lemma for dense graphs (allowing them to be viewed as approximate unions of random bipartite graphs)?

Figure 4: Two examples of δ-triangle-dense graphs with δ close to 1: (a) an ideal triangle-dense graph; (b) the lollipop graph.

Given that 1-triangle-dense graphs are unions of cliques, a first guess might be that δ-triangle-dense graphs look like the approximate union of approximate cliques (as in Figure 4(a)). Such graphs certainly have high triangle density; could there be an "inverse theorem," stating that these are in some sense the only graphs with this property?

In its simplest form, the answer to this question is "no," as δ-triangle-dense graphs become quite diverse once δ is bounded below 1. For example, adding a clique on n/2 vertices to an arbitrary bounded-degree n-vertex graph produces a δ-triangle-dense graph with δ = 1 − o(1) as n → ∞ (see Figure 4(b)).

Nonetheless, an inverse theorem does hold if we redefine what it means to approximate a graph by a collection of approximate cliques. Instead of trying to capture most of the vertices or edges (which is impossible, as the previous example shows), we consider the goal of capturing a constant fraction of the triangles of a graph by a collection of dense subgraphs. To state an inverse theorem for triangle-dense graphs, we require a preliminary definition.

Definition 3.2 (Tightly Knit Family). Let ρ > 0. A collection V_1, V_2, ..., V_k of disjoint sets of vertices of a graph G = (V, E) forms a ρ-tightly-knit family if:

1. For each i = 1, 2, ..., k, the subgraph induced by V_i has at least ρ · (|V_i| choose 2) edges and ρ · (|V_i| choose 3) triangles. (That is, a ρ-fraction of the maximum possible numbers of edges and triangles.)
2. For each i = 1, 2, ..., k, the subgraph induced by V_i has radius at most 2.

In Definition 3.2, the vertex sets V_1, V_2, ..., V_k are disjoint but need not cover all of V; in particular, the empty collection is technically a tightly knit family.

The following inverse theorem states that every triangle-dense graph contains a tightly-knit family that captures most of the "meaningful social structure"—a constant fraction of the graph's triangles.

Theorem 3.1 (Gupta et al. (2016)). There is a function f(δ) = δ^{O(1)} such that for every δ-triangle-dense graph G, there exists an f(δ)-tightly-knit family that contains an f(δ) fraction of the triangles of G.

Ideally, one would like ρ-tightly-knit families with constant ρ. The complete tripartite graph shows that Theorem 3.1 does not hold if the "radius-2" condition in Definition 3.2 is strengthened to "radius-1" (Exercise 4).

The proof of Theorem 3.1 is constructive, and interleaves two subroutines. To state the first, define the
Jaccard similarity of an edge (u, v) of a graph G as the fraction of neighbors of u and v that are neighbors of both:

|N(u) ∩ N(v)| / (|N(u) ∪ N(v)| − 2),

where N(·) denotes the neighbors of a vertex and the "−2" is to avoid counting u and v themselves. The first subroutine, called the cleaner, is given a parameter ǫ as input and repeatedly deletes edges with Jaccard similarity less than ǫ until none remain. Removing edges from the graph is worrisome because it removes triangles, and Theorem 3.1 promises that the final tightly knit family captures a constant fraction of the original graph's triangles. But removing an edge with low Jaccard similarity destroys many more wedges than triangles, and the number of triangles in the graph is at least a constant fraction of the number of wedges (because it is δ-triangle-dense). A charging argument along these lines shows that, provided ǫ is at most δ/4, the cleaner cannot destroy more than a constant fraction of the graph's triangles.

The second subroutine, called the extractor, is responsible for extracting one of the clusters of the tightly-knit family from a graph in which all edges have Jaccard similarity at least ǫ. (Isolated vertices can be discarded from further consideration.) How is this Jaccard similarity condition helpful? One easy observation is that, post-cleaning, the graph is "approximately locally regular," meaning that the endpoints of any edge have degrees within a 1/ǫ factor of each other. Starting from this fact, easy algebra shows that every one-hop neighborhood of the graph (i.e., the subgraph induced by a vertex and its neighbors) has constant (depending on ǫ) density in both edges and triangles, as required by Theorem 3.1. The bad news is that extracting a one-hop neighborhood can destroy almost all of a graph's triangles (Exercise 4). The good news is that supplementing a one-hop neighborhood with a judiciously chosen subset of the corresponding two-hop neighborhood (i.e., neighbors of neighbors) fixes the problem. Precisely, the extractor subroutine is given a graph G in which every edge has Jaccard similarity at least ǫ and proceeds as follows:

1. Let v be a vertex of G with the maximum degree. Let d_max denote v's degree and N(v) its neighbors.
2. Calculate a score θ_w for every vertex w outside {v} ∪ N(v) equal to the number of triangles that include w and two vertices of N(v). In other words, θ_w is the number of triangles that would be saved by supplementing the one-hop neighborhood {v} ∪ N(v) by w. (On the flip side, this would also destroy the triangles that contain w and two vertices outside N(v).)
3. Return the union of {v}, N(v), and up to d_max vertices outside {v} ∪ N(v) with the largest non-zero θ-scores.

It is clear that the extractor outputs a set S of vertices that induces a subgraph with radius at most 2.
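The three steps of the extractor can be sketched as follows (my own rendering, assuming an adjacency-dict graph whose edges all have Jaccard similarity at least ǫ; tie-breaking among maximum-degree vertices is arbitrary):

```python
from itertools import combinations

def extract(G):
    """Return {v} + N(v) + up to d_max outside vertices with largest theta."""
    v = max(G, key=lambda u: len(G[u]))  # a maximum-degree vertex
    Nv, d_max = G[v], len(G[v])
    theta = {}                           # triangles saved by adding w
    for w in G:
        if w != v and w not in Nv:
            # triangles containing w and two (adjacent) vertices of N(v)
            theta[w] = sum(1 for x, y in combinations(G[w] & Nv, 2) if y in G[x])
    extras = sorted((w for w in theta if theta[w] > 0),
                    key=lambda w: theta[w], reverse=True)[:d_max]
    return {v} | set(Nv) | set(extras)
```

The returned set induces a radius-2 subgraph centered at v, as noted above.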
As with one-hop neighborhoods, easy algebra shows that, because every edge has Jaccard similarity at least ǫ, this subgraph is dense in both edges and triangles. The important non-obvious fact, whose proof is omitted here, is that the number of triangles saved by the extractor (i.e., triangles with all three vertices in its output) is at least a constant fraction of the number of triangles it destroys (i.e., triangles with one or two vertices in its output). It follows that alternating between cleaning and extracting (until no edges remain) will produce a tightly-knit family meeting the promises of Theorem 3.1.

4 Power-Law Bounded Graphs

Arguably the most famous property of social and information networks, even more so than triadic closure, is a power-law degree distribution, also referred to as a heavy-tailed or scale-free degree distribution.
4.1 Power-Law Degree Distributions

Consider a simple graph G = (V, E) with n vertices. For each positive integer d, let n(d) denote the number of vertices of G with degree d. The sequence {n(d)} is called the degree distribution of G. Informally, a degree distribution is said to be a power-law with exponent γ > 0 if n(d) scales as n/d^γ.

There is some controversy about how to best fit power-law distributions to data, and whether such distributions are the "right" fit for the degree distributions in real-world social networks (as opposed to, say, lognormal distributions). Nevertheless, several of the consequences of a power-law degree distribution assumption are uncontroversial for social networks, and so a power-law distribution is a reasonable starting point for mathematical analysis.

This section studies the algorithmic benefits of assuming that a graph has an (approximately) power-law degree distribution, in the form of fast algorithms for fundamental graph problems. To develop our intuition about such graphs, let's do some rough calculations under the assumption that n(d) = cn/d^γ (for some constant c) for every d up to the maximum degree d_max; think of d_max as n^β for some constant β ∈ (0, 1]. Because the n(d)'s sum to n,

Σ_{d ≤ d_max} n(d) = n  ⟹  cn Σ_{d ≤ d_max} d^{−γ} = n.    (1)

When γ ≤ 1, Σ_{d < ∞} d^{−γ} is a divergent series. In this case, we cannot satisfy the right-hand side of (1) with a constant c. For this reason, results on power-law degree distributions typically assume that γ > 1. The sum of the vertex degrees is

Σ_{d ≤ d_max} d · n(d) = cn Σ_{d ≤ d_max} d^{−γ+1}.    (2)

Thus, up to constant factors, Σ_{d ≤ d_max} d^{−γ+1} is the average degree. For γ > 2, Σ_{d < ∞} d^{−γ+1} is a convergent series, and the graph has constant average degree. For this reason, much of the early literature on graphs with power-law degree distributions focused on the regime where γ > 2. When γ = 2, the average degree scales with log n, and for γ ∈ (1, 2) it scales with (d_max)^{2−γ}, which is polynomial in n.

One of the primary implications of a power-law degree distribution is upper bounds on the number of high-degree vertices. Specifically, under our assumption that n(d) = cn/d^γ, the number of vertices of degree at least k can be bounded by

Σ_{d=k}^{d_max} n(d) ≤ cn Σ_{d=k}^{∞} d^{−γ} ≤ cn ∫_k^∞ x^{−γ} dx = cn k^{−γ+1}/(γ − 1) = Θ(n k^{−γ+1}).    (3)

4.2 PLB Graphs

The key definition in this section is a more plausible and robust version of the assumption that n(d) = cn/d^γ, for which the conclusions of calculations like those in Section 4.1 remain valid. The definition allows individual values of n(d) to deviate from a true power law, while requiring (essentially) that the average value of n(d) in sufficiently large intervals of d does follow a power law.

Definition 4.1 (Berry et al. (2015); Brach et al. (2016)). A graph G with degree distribution {n(d)} is a power-law bounded (PLB) graph with exponent γ > 1 if there is a constant c > 0 such that

Σ_{d=2^r}^{2^{r+1}} n(d) ≤ cn Σ_{d=2^r}^{2^{r+1}} d^{−γ}

for all r ≥ 0.

Many real-world social networks satisfy a mild generalization of this definition, in which n(d) is allowed to scale with n/(d + t)^γ for a "shift" t ≥ 0; see the Notes for details. For simplicity, we continue to assume in this section that t = 0.

Definition 4.1 has several of the same implications as a pure power law assumption, including the following lemma (cf. (2)).

Lemma 4.1. Suppose G is a PLB graph with exponent γ > 1. For every c > 0 and natural number k,

Σ_{d ≤ k} d^c · n(d) = O(n Σ_{d ≤ k} d^{c−γ}).

The proof of Lemma 4.1 is technical but not overly difficult; we do not discuss the details here. The first part of the next lemma provides control over the number of high-degree vertices and is the primary reason why many graph problems are more easily solved on PLB graphs than on general graphs. The second part of the lemma bounds the number of wedges of the graph when γ ≥ 3.

Lemma 4.2. Suppose G is a PLB graph with exponent γ > 1. Then:

(a) Σ_{d ≥ k} n(d) = O(n k^{−γ+1}).
(b) Let W denote the number of wedges (i.e., two-hop paths). If γ = 3, W = O(n log n). If γ > 3, W = O(n).

Part (a) extends the computation in (3) to PLB graphs, while part (b) follows from Lemma 4.1 (see Exercise 5).
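As a quick numeric sanity check of the tail estimate in (3) (my own check, not from the chapter): for a pure power law, the discrete tail sum Σ_{d ≥ k} d^{−γ} should be within a constant factor of the integral estimate k^{−γ+1}/(γ − 1).

```python
# Compare the discrete tail sum to the integral estimate from (3),
# for gamma = 2.5 and k = 10 (values chosen arbitrarily for illustration).
gamma, k = 2.5, 10
tail = sum(d ** -gamma for d in range(k, 10**5))  # truncated; the rest is tiny
estimate = k ** (1 - gamma) / (gamma - 1)
ratio = tail / estimate  # close to 1, as the calculation predicts
```

The same check run for other (γ, k) pairs with γ > 1 behaves similarly, with the agreement improving as k grows.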
Many graph problems appear to be easier in PLB graphs than in general graphs. To illustrate this point, we single out the problem of triangle counting, which is one of the most canonical problems in social network analysis. For this section, we assume that our algorithms can determine in constant time if there is an edge between a given pair of vertices; these lookups can be avoided with a careful implementation (Exercise 6), but such details distract from the main analysis.

As a warm up, consider the following trivial algorithm to count (three times) the number of triangles of a given graph G ("Algorithm 1"):

• For every vertex u of G:
  – For every pair v, w of u's neighbors, check if u, v, and w form a triangle.

Note that the running time of Algorithm 1 is proportional to the number of wedges in the graph G. The following running time bound for triangle counting in PLB graphs is an immediate corollary of Lemma 4.2(b), applied to Algorithm 1.

Corollary 4.0.1. Triangle counting in n-vertex PLB graphs with exponent 3 can be carried out in O(n log n) time. If the exponent is strictly greater than 3, it can be done in O(n) time.

Now consider an optimization of Algorithm 1 ("Algorithm 2"):

• Direct each edge of G from the lower-degree endpoint to the higher-degree endpoint (breaking ties lexicographically) to obtain a directed graph D.
• For every vertex u of D:
  – For every pair v, w of u's out-neighbors, check if u, v, and w form a triangle in G.

Each triangle is counted exactly once by Algorithm 2, in the iteration where the lowest-degree of its three vertices plays the role of u. Remarkably, this simple idea leads to massive time savings in practice.

A classical way to capture this running time improvement mathematically is to parameterize the input graph G by its degeneracy, which can be thought of as a refinement of the maximum degree. The degeneracy α(G) of a graph G can be computed by iteratively removing a minimum-degree vertex (updating the vertex degrees after each iteration) until no vertices remain; α(G) is then the largest degree of a vertex at the time of its removal. (For example, every tree has degeneracy equal to 1.) We have the following guarantee for Algorithm 2, parameterized by a graph's degeneracy:

Theorem 4.1 (Chiba and Nishizeki (1985)). For every graph with m edges and degeneracy α, the running time of Algorithm 2 is O(mα).

Every PLB graph with exponent γ > 2 has degeneracy α = O(n^{1/γ}); see Exercise 8. For PLB graphs with γ > 2, we can apply Lemma 4.1 with c = 1 to obtain m = O(n), and hence the running time of Algorithm 2 is O(mα) = O(n^{(γ+1)/γ}).

Our final result for PLB graphs improves this running time bound, for all γ ∈ (2, 3).

Theorem 4.2 (Brach et al. (2016)). In PLB graphs with exponent γ ∈ (2, 3), Algorithm 2 runs in O(n^{3/γ}) time.

Proof. Let G = (V, E) denote an n-vertex PLB graph with exponent γ ∈ (2, 3). Denote the degree of a vertex v in G by d_v and its out-degree in the directed graph D by d_v^+. The running time of Algorithm 2 is O(n + Σ_v C(d_v^+, 2)) = O(n + Σ_v (d_v^+)^2), so the analysis boils down to bounding the out-degrees in D.

One trivial upper bound is d_v^+ ≤ d_v for every v ∈ V. Because every edge is directed from its lower-degree endpoint to its higher-degree endpoint, we also have d_v^+ ≤ Σ_{d ≥ d_v} n(d). By Lemma 4.2(a), the second bound is O(n d_v^{−γ+1}). The second bound is better than the first roughly when d_v ≥ n d_v^{−γ+1}, or equivalently when d_v ≥ n^{1/γ}. (The running time bound actually holds for all γ ∈ (1, 3).)

Let V(d) denote the set of degree-d vertices of G. We split the sum over vertices according to how their degrees compare to n^{1/γ}, using the first bound for low-degree vertices and the second bound for high-degree vertices:

Σ_v (d_v^+)^2 = Σ_d Σ_{v ∈ V(d)} (d_v^+)^2
             ≤ Σ_{d ≤ n^{1/γ}} Σ_{v ∈ V(d)} d^2 + Σ_{d > n^{1/γ}} Σ_{v ∈ V(d)} O(n^2 d^{−2γ+2})
             = Σ_{d ≤ n^{1/γ}} d^2 · n(d) + O(n^2 · Σ_{d > n^{1/γ}} d^{−2γ+2} · n(d)).

Applying Lemma 4.1 (with c = 2) to the sum over low-degree vertices, and using the fact that with γ < 3, Σ_d d^{2−γ} is divergent, we derive

Σ_{d ≤ n^{1/γ}} d^2 · n(d) = O(n Σ_{d ≤ n^{1/γ}} d^{2−γ}) = O(n · (n^{1/γ})^{3−γ}) = O(n^{3/γ}).

The second sum is over the highest-degree vertices, and Lemma 4.1 does not apply. On the other hand, we can invoke Lemma 4.2(a) to obtain the desired bound:

n^2 Σ_{d > n^{1/γ}} d^{−2γ+2} · n(d) ≤ n^2 (n^{1/γ})^{−2γ+2} Σ_{d > n^{1/γ}} n(d)
                                    = O(n^2 (n^{1/γ})^{−2γ+2} · n (n^{1/γ})^{−γ+1})
                                    = O(n^{3/γ}).
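Algorithm 2 itself is only a few lines. The following sketch (my own, assuming an adjacency-dict graph, with the same lexicographic tie-breaking described above) counts each triangle exactly once, at its lowest-ranked vertex:

```python
from itertools import combinations

def count_triangles(G):
    """Algorithm 2: direct each edge toward its higher-degree endpoint, then
    test pairs of out-neighbors; each triangle is found exactly once."""
    def rank(v):
        return (len(G[v]), v)  # degree, ties broken lexicographically
    out = {v: [w for w in G[v] if rank(w) > rank(v)] for v in G}
    return sum(1 for u in G for v, w in combinations(out[u], 2) if w in G[v])
```

The `w in G[v]` membership test plays the role of the constant-time edge lookups assumed in this section.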
The same reasoning shows that Algorithm 2 runs in O(n log n) time in n-vertex PLB graphs with exponent γ = 3, and in O(n) time in PLB graphs with γ > 3 (Exercise 9).

Beyond triangle counting, which computational problems should we expect to be easier on PLB graphs than on general graphs? A good starting point is problems that are relatively easy on bounded-degree graphs. In many cases, fast algorithms for bounded-degree graphs remain fast for graphs with bounded degeneracy. In these cases, the degeneracy bound for PLB graphs (Exercise 8) can already lead to fast algorithms for such graphs. For example, this approach can be used to show that all of the cliques of a PLB graph with exponent γ > 2 can be enumerated in 2^{O(n^{1/γ})} time (Exercise 10).
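The degeneracy itself is easy to compute by the peeling process described above. The following is a simple quadratic-time Python sketch (a careful implementation with bucket queues runs in O(m + n) time; the function name is ours):

```python
def degeneracy(adj):
    """Iteratively remove a minimum-degree vertex, updating degrees;
    the degeneracy is the largest degree seen at removal time."""
    deg = {v: len(adj[v]) for v in adj}
    alive = set(adj)
    alpha = 0
    while alive:
        v = min(alive, key=lambda u: deg[u])  # a minimum-degree vertex
        alpha = max(alpha, deg[v])
        alive.remove(v)
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1  # v's removal lowers its neighbors' degrees
    return alpha
```

Consistent with the examples in the text, a path (or any tree) has degeneracy 1, a cycle has degeneracy 2, and a clique on k vertices has degeneracy k − 1.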
This section gives an impressionistic overview of another set of deterministic conditions meant to capture properties of "typical networks," proposed by Borassi et al. (2017) and hereafter called the BCT model. The precise model is technical, with a number of parameters; we give only a high-level description that ignores several complications.

To illustrate the main ideas, consider the problem of computing the diameter max_{u,v ∈ V} dist(u, v) of an undirected and unweighted n-vertex graph G = (V, E), where dist(u, v) denotes the shortest-path distance between u and v in G. Define the eccentricity of a vertex u by ecc(u) := max_{v ∈ V} dist(u, v), so that the diameter is the maximum eccentricity. The eccentricity of a single vertex can be computed in linear time using breadth-first search, which gives a quadratic-time algorithm for computing the diameter. Despite much effort, no subquadratic (1 + ε)-approximation algorithm for computing the graph diameter is known for general graphs. Yet there are many heuristics that perform well in real-world networks. Most of these heuristics compute the eccentricities of a carefully chosen subset of vertices. An extreme example is the TwoSweep algorithm:

1. Pick an arbitrary vertex s, and perform breadth-first search from s to compute a vertex t ∈ arg max_{v ∈ V} dist(s, v).

2. Use breadth-first search again to compute ecc(t) and return the result.

This heuristic always produces a lower bound on a graph's diameter, and in practice usually achieves a close approximation. What properties of "real-world" graphs might explain this empirical performance?

The BCT model is largely inspired by the metric properties of random graphs. To explain, for a vertex s and natural number k, let τ_s(k) denote the smallest length ℓ such that there are at least k vertices at distance (exactly) ℓ from s. Ignoring the specifics of the random graph model, the ℓ-step neighborhoods (i.e., vertices at distance exactly ℓ) of a vertex in a random graph resemble uniform random sets of size increasing with ℓ. We next use this property to derive a heuristic upper bound on dist(s, t). Define ℓ_s := τ_s(√n) and ℓ_t := τ_t(√n).
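Both the TwoSweep heuristic and the quantities τ_s(k) amount to a single breadth-first search each. The following is a minimal Python sketch, with the graph again given as a dictionary of neighbor sets (function names are ours):

```python
from collections import deque

def bfs_dist(adj, s):
    """Breadth-first search from s; returns {vertex: distance from s}."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def two_sweep(adj, s):
    """TwoSweep: BFS from s to find a farthest vertex t, then return ecc(t),
    which is always a lower bound on the diameter."""
    d_s = bfs_dist(adj, s)
    t = max(d_s, key=d_s.get)            # t in argmax_v dist(s, v)
    return max(bfs_dist(adj, t).values())  # ecc(t)

def tau(adj, s, k):
    """tau_s(k): smallest ell with at least k vertices at distance exactly ell
    from s (None if no level is that large)."""
    counts = {}
    for d in bfs_dist(adj, s).values():
        counts[d] = counts.get(d, 0) + 1
    levels = [ell for ell, c in counts.items() if c >= k]
    return min(levels) if levels else None
```

With tau in hand, the properties discussed next can be spot-checked on real network data by comparing dist(s, t) against tau(adj, s, k) + tau(adj, t, k) with k ≈ √n.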
Since the ℓ_s-step neighborhood of s and the ℓ_t-step neighborhood of t act like random sets of size √n, a birthday paradox argument implies that they intersect with non-trivial probability. If they do intersect, then ℓ_s + ℓ_t is an upper bound on dist(s, t). In any event, we can adopt this inequality as a deterministic graph property, which can be tested against real network data.

Property 5.1.
For all s, t ∈ V, dist(s, t) ≤ τ_s(√n) + τ_t(√n).

One would expect this distance upper bound to be tight for pairs of vertices that are far away from each other, and in a reasonably random graph this will be true for most of the vertex pairs. This leads us to the next property.

Property 5.2.
For all s ∈ V: for "most" t ∈ V, dist(s, t) > τ_s(√n) + τ_t(√n) − 2.

The third property posits a distribution on the τ_s(√n) values. Let T(k) denote the average value n^{−1} Σ_{s ∈ V} τ_s(k).

Property 5.3.
There is a constant c > 1 such that, for every integer k ≥ 1, the fraction of vertices s satisfying τ_s(√n) ≥ T(√n) + k is roughly c^{−k}.

A consequence of this property is that the largest value of τ_s(√n) is T(√n) + log_c n + Θ(1). As we discuss below, these properties imply that simple heuristics work well for computing the diameter of a graph. On the other hand, these properties do not hold exactly in real-world graphs. The actual BCT model has a nuanced version of these properties, parameterized by vertex degrees. In addition, the BCT model imposes an approximate power-law degree distribution, in the spirit of Definition 4.1. (The actual BCT model uses the upper bound τ_s(n^x) + τ_t(n^y) for x + y > 1, to ensure intersection with high enough probability. We omit the exact definition of this property in the BCT model, which is quite involved.)

Combining Properties 5.1 and 5.3, for every pair u, v of vertices,

dist(u, v) ≤ τ_u(√n) + τ_v(√n) ≤ τ_u(√n) + T(√n) + log_c n + O(1).

Fix u and imagine varying v to estimate ecc(u). For "most" vertices v, dist(u, v) ≥ τ_u(√n) + τ_v(√n) − 1. By Property 5.3, one of the vertices v satisfying this lower bound will also satisfy τ_v(√n) ≥ T(√n) + log_c n − Θ(1). Combining, we can bound the eccentricity by

ecc(u) = max_v dist(u, v) = τ_u(√n) + T(√n) + log_c n ± Θ(1).   (4)

The bound (4) is significant because it reduces maximizing ecc(u) over u ∈ V to maximizing τ_u(√n). Pick an arbitrary vertex s and consider a vertex u that maximizes dist(s, u). By an argument similar to the one above (and because most vertices are far away from s), we expect that dist(s, u) ≈ τ_s(√n) + τ_u(√n). Thus, a vertex u maximizing dist(s, u) is almost the same as a vertex maximizing τ_u(√n), which by (4) is almost the same as a vertex maximizing ecc(u). This gives an explanation of why the TwoSweep algorithm performs so well. Its first use of breadth-first search identifies a vertex u that (almost) maximizes ecc(u). The second pass of breadth-first search (from u) then computes a close approximation of the diameter.

The analysis in this section is heuristic, but it captures much of the spirit of algorithm analysis in the BCT model. These results for TwoSweep can be extended to other heuristics that lower bound the diameter by choosing a set of vertices through a random process. In general, the key insight is that most distances dist(u, v) in the BCT model can be closely approximated as a sum of quantities that depend only on either u or v.

Let's take a bird's-eye view of this chapter. The big challenge in the line of research described in this chapter is the formulation of graph classes and properties that both reflect real-world graphs and lead to a satisfying theory. It seems unlikely that any one class of graphs will simultaneously capture all the relevant properties of (say) social networks.
Accordingly, this chapter described several graph classes that target specific empirically observed graph properties, each with its own algorithmic lessons:

• Triadic closure aids the computation of dense subgraphs.

• Power-law degree distributions aid subgraph counting.

• ℓ-hop neighborhood structure influences the structure of shortest paths.

These lessons suggest that, when defining a graph class to capture "real-world" graphs, it may be important to keep a target algorithmic application in mind.

Different graph classes differ in how closely their definitions are tied to domain knowledge and empirically observed statistics. The c-closed and triangle-dense graph classes are in the spirit of classical families of graphs (e.g., planar or bounded-treewidth graphs), and they sacrifice precision in the service of generality, cleaner definitions, and arguably more elegant theory. The PLB and BCT frameworks take the opposite view: the graph properties are quite technical and involve many parameters, and in exchange they tightly capture the properties of "real-world" graphs. These additional details can add fidelity to theoretical explanations for the surprising effectiveness of simple heuristics.

A big advantage of combinatorially defined graph classes—a hallmark of graph-theoretic work in theoretical computer science—is the ability to empirically validate them on real data. The standard statistical viewpoint taken in network science has led to dozens of competing generative models, and it is nearly impossible to validate the details of such a model from network data. The deterministic graph classes defined in this chapter give a much more satisfying foundation for algorithmics on real-world graphs.

Complex algorithms for real-world problems can be useful, but practical algorithms for graph analysis are typically based on simple ideas like backtracking or greedy algorithms.
An ideal theory would reflect this reality, offering compelling explanations for why relatively simple algorithms have such surprising efficacy in practice.

We conclude this section with some open problems.

1. Theorem 2.3 gives, for constant c, a bound of O(n²) on the number of maximal cliques in a c-closed graph. Fox et al. (2020) also prove a sharper bound of O(n^{2−2^{1−c}}), which is asymptotically tight when c = 2. Is it tight for all values of c? Additionally, parameterizing by the number of edges (m) rather than vertices (n), is the number of maximal cliques in a c-closed graph with c = O(1) bounded by O(m)? Could there be a linear-time algorithm for maximal clique enumeration in c-closed graphs with constant c?

2. Theorem 3.1 guarantees that a tightly-knit family captures a fraction of the triangles of a δ-triangle-dense graph that is polynomial in δ. What is the best-possible constant in the exponent? Can the upper bound be improved, perhaps under additional assumptions (e.g., about the distribution of the clustering coefficients of the graph, rather than merely about their average)?

3. Ugander et al. (2013) observe that 4-vertex subgraph counts in real-world graphs exhibit predictable and peculiar behavior. By imposing conditions on 4-vertex subgraph counts (in addition to triangle density), can one prove decomposition theorems better than Theorem 3.1?

4. Is there a compelling algorithmic application for graphs that can be approximated by tightly-knit families?

5. Benson et al. (2016) and Tsourakakis et al. (2017) defined the triangle conductance of a graph, where cuts are measured in terms of the number of triangles cut (rather than the number of edges). Empirical evidence suggests that cuts with low triangle conductance give more meaningful communities (i.e., denser subgraphs) than cuts with low (edge) conductance. Is there a plausible theoretical explanation for this observation?

6.
A more open-ended goal is to use the theoretical insights described in this chapter to develop new and practical algorithms for fundamental graph problems.

Notes

The book by Easley and Kleinberg (2010) is a good introduction to social network analysis, including discussions of heavy-tailed degree distributions and triadic closure. A good if somewhat outdated review of generative models for social and information networks is Chakrabarti and Faloutsos (2006). The Enron email network was first studied by Klimt and Yang (2004).

The definitions of c-closed and weakly c-closed graphs (Definitions 2.1–2.2) are from Fox et al. (2020), as is the fixed-parameter tractability result for the maximum clique problem (Theorem 2.3). Eppstein et al. (2010) proved an analogous result with respect to a different parameter, the degeneracy of the input graph. The reduction from efficiently enumerating maximal cliques to bounding the number of maximal cliques (Theorem 2.1) is from Tsukiyama et al. (1977). Moon-Moser graphs and the Moon-Moser bound on the maximum number of maximal cliques of a graph are from Moon and Moser (1965).

The definition of triangle-dense graphs (Definition 3.1) and the inverse theorem for them (Theorem 3.1) are from Gupta et al. (2016). The computation of the triangle density of the Facebook graph is detailed by Ugander et al. (2011).

The definition of power law bounded graphs (Definition 4.1) first appeared in Berry et al. (2015) in the context of triangle counting, but it was formalized and applied to many different problems by Brach et al. (2016), including triangle counting (Theorem 4.2), clique enumeration (Exercise 10), and linear algebraic problems for matrices with a pattern of non-zeroes that induces a PLB graph. Brach et al. (2016) also performed a detailed empirical analysis, validating Definition 4.1 (with small shifts t) on real data.
The degeneracy-parameterized bound for counting triangles is essentially due to Chiba and Nishizeki (1985).

The BCT model (Section 5) and the fast algorithm for computing the diameter of a graph are due to Borassi et al. (2017).

Acknowledgments
The authors thank Michele Borassi, Shweta Jain, Piotr Sankowski, and Inbal Talgam-Cohen fortheir comments on earlier drafts of this chapter.
References
Benson, A., D. F. Gleich, and J. Leskovec (2016). Higher-order organization of complex networks. Science 353(6295), 163–166.

Berry, J. W., L. A. Fostvedt, D. J. Nordman, C. A. Phillips, C. Seshadhri, and A. G. Wilson (2015). Why do simple algorithms for triangle enumeration work in the real world? Internet Mathematics 11(6), 555–571.

Borassi, M., P. Crescenzi, and L. Trevisan (2017). An axiomatic and an average-case analysis of algorithms and heuristics for metric properties of graphs. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 920–939.

Brach, P., M. Cygan, J. Lacki, and P. Sankowski (2016). Algorithmic complexity of power law networks. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1306–1325.

Chakrabarti, D. and C. Faloutsos (2006). Graph mining: Laws, generators, and algorithms. ACM Computing Surveys 38(1).

Chiba, N. and T. Nishizeki (1985). Arboricity and subgraph listing algorithms. SIAM Journal on Computing 14(1), 210–223.

Easley, D. and J. Kleinberg (2010). Networks, Crowds, and Markets. Cambridge University Press.

Eppstein, D., M. Löffler, and D. Strash (2010). Listing all maximal cliques in sparse graphs in near-optimal time. In Proceedings of the 21st International Symposium on Algorithms and Computation (ISAAC), pp. 403–414.

Fox, J., T. Roughgarden, C. Seshadhri, F. Wei, and N. Wein (2020). Finding cliques in social networks: A new distribution-free model. SIAM Journal on Computing 49(2), 448–464.

Gupta, R., T. Roughgarden, and C. Seshadhri (2016). Decompositions of triangle-dense graphs. SIAM Journal on Computing 45(2), 197–215.

Klimt, B. and Y. Yang (2004). The Enron corpus: A new dataset for email classification research. In Proceedings of the 15th European Conference on Machine Learning (ECML), pp. 217–226.

Moon, J. and L. Moser (1965). On cliques in graphs. Israel Journal of Mathematics 3, 23–28.

Roughgarden, T. (Ed.) (2020). Beyond the Worst-Case Analysis of Algorithms. Cambridge University Press.

Tsourakakis, C. E., J. W. Pachocki, and M. Mitzenmacher (2017). Scalable motif-aware graph clustering. In Proceedings of the Web Conference (WWW), pp. 1451–1460.

Tsukiyama, S., M. Ide, H. Ariyoshi, and I. Shirakawa (1977). A new algorithm for generating all the maximal independent sets. SIAM Journal on Computing 6(3), 505–517.

Ugander, J., L. Backstrom, and J. Kleinberg (2013). Subgraph frequencies: Mapping the empirical and extremal geography of large graph collections. In Proceedings of the World Wide Web Conference (WWW), pp. 1307–1318.

Ugander, J., B. Karrer, L. Backstrom, and C. Marlow (2011). The anatomy of the Facebook social graph. arXiv:1111.4503.
Exercises
1. Prove that a graph is weakly c-closed in the sense of Definition 2.2 if and only if its vertices can be ordered v_1, v_2, . . . , v_n such that, for every i = 1, 2, . . . , n, the vertex v_i is c-good in the subgraph induced by v_i, v_{i+1}, . . . , v_n.

2. Prove that the backtracking algorithm in Section 2.3 enumerates all of the maximal cliques of a graph.

3. Prove that a graph has triangle density 1 if and only if it is a disjoint union of cliques.

4. Let G be the complete regular tripartite graph with 3n vertices—three vertex sets of size n each, with each vertex connected to every vertex of the other two groups and none of the vertices within the same group.

(a) What is the triangle density of the graph?

(b) What is the output of the cleaner (Section 3.4) when applied to this graph? What is then the output of the extractor?

(c) Prove that G admits no tightly-knit family that contains a constant fraction (as n → ∞) of the graph's triangles and uses only radius-1 clusters.

5. Prove Claim 4.2.

[Hint: To prove (a), break up the sum over degrees into sub-sums between powers of 2. Apply Definition 4.1 to each sub-sum.]

6. Implement Algorithm 2 from Section 4.3 in O(Σ_v (d⁺_v)² + n) time, where d⁺_v is the number of out-neighbors of v in the directed version D of G, assuming that the input G is represented using only adjacency lists.

[Hint: You may need to store the in- and out-neighbor lists of D.]

7. Prove that every graph with m edges has degeneracy at most √(2m). Exhibit a family of graphs showing that this bound is tight (up to lower-order terms).

8. Suppose G is a PLB graph with exponent γ > 2.

(a) Prove that the maximum degree of G is O(n^{1/(γ−1)}).

(b) Prove that the degeneracy of G is O(n^{1/γ}).

[Hint: For (b), use the main idea in the proof of Exercise 7 and Claim 4.2.]

9. Prove that Algorithm 2 in Section 4.3 runs in O(n log n) time and O(n) time in n-vertex PLB graphs with exponents γ = 3 and γ > 3, respectively.

10. Prove that all of the cliques of a graph with degeneracy α can be enumerated in O(nα2^α) time.