Tree decompositions and social graphs
Aaron B. Adcock∗, Blair D. Sullivan†, Michael W. Mahoney‡

Abstract
Recent work has established that large informatics graphs such as social and information networks have non-trivial tree-like structure when viewed at moderate size scales. Here, we present results from the first detailed empirical evaluation of the use of tree decomposition (TD) heuristics for structure identification and extraction in social graphs. Although TDs have historically been used in structural graph theory and scientific computing, we show that—even with existing TD heuristics developed for those very different areas—TD methods can identify interesting structure in a wide range of realistic informatics graphs. Our main contributions are the following: we show that TD methods can identify structures that correlate strongly with the core-periphery structure of realistic networks, even when using simple greedy heuristics; we show that the peripheral bags of these TDs correlate well with low-conductance communities (when they exist) found using local spectral computations; and we show that several types of large-scale “ground-truth” communities, defined by demographic metadata on the nodes of the network, are well-localized in the large-scale and/or peripheral structures of the TDs. Our other main contributions are the following: we provide detailed empirical results for TD heuristics on toy and synthetic networks to establish a baseline to understand better the behavior of the heuristics on more complex real-world networks; and we prove a theorem providing formal justification for the intuition that the only two impediments to low-distortion hyperbolic embedding are high tree-width and long geodesic cycles. Our results suggest future directions for improved TD heuristics that are more appropriate for realistic social graphs.
1 Introduction

Understanding the properties of realistic informatics graphs such as large social and information networks and developing algorithmic and statistical tools to analyze such graphs is of continuing interest, and recent work has focused on identifying and exploiting what may be termed tree-like structure in these real-world graphs. Since an undirected graph is a tree if any two vertices are connected by exactly one path, or equivalently if the graph is connected but has no cycles, real-world graphs are clearly not trees in any naïve sense of the word. For example, realistic social graphs have non-zero clustering coefficient, indicating an abundance of cycles of length three. There are, however, more sophisticated notions that can be used to characterize the manner in which a graph may be viewed as tree-like. These are of interest since, e.g., graphs that are trees have many nice algorithmic and statistical properties, and the hope is that graphs that are tree-like inherit some of these nice properties. In particular, δ-hyperbolicity is a notion from geometric group theory that quantifies a way in which a graph is tree-like in terms of its distance or metric properties. Alternatively, tree decompositions (TDs) are tools from structural graph theory that quantify a way in which a graph is tree-like in terms of its cut or partitioning properties.

∗Department of Electrical Engineering, Stanford University, Stanford, CA 94305. Email: [email protected]
†Department of Computer Science, North Carolina State University, Raleigh, NC 27695. Email: blair [email protected]
‡International Computer Science Institute and Department of Statistics, University of California at Berkeley, Berkeley, CA 94720. Email: [email protected]

Although TDs and δ-hyperbolicity capture very different ways in which a general graph can be tree-like, recent empirical work (described in more detail below) has shown interesting connections between them.
In particular, for realistic social and information networks, these two notions of tree-likeness capture very similar structural properties, at least when the graphs are viewed at large size scales; and this structure is closely related to what may be termed the nested core-periphery or k-core structure of these networks. Recent work has also shown that computing δ-hyperbolicity exactly is extremely expensive, that hyperbolicity is quite brittle and difficult to work with for realistic social graphs, and that common methods to approximate δ provide only a very rough guide to its extremal value and associated graph properties. Motivated by this, as well as by the large body of work in linear algebra and scientific computing on practical methods for computing TDs, in this paper we present results from the first detailed empirical evaluation of the use of TD heuristics for structure identification and extraction in social graphs.

A TD (defined more precisely below) is a specialized mapping of an arbitrary input graph G into a tree H, where the nodes of H (called bags) consist of overlapping subsets of vertices of G. Quantities such as the treewidth—determined by the size of the bag in H that contains the largest number of vertices from G—can be used to characterize how tree-like G is. A single bag that contains every vertex from G is a legitimate but trivial TD (since its width is as large as possible for a graph of the given size). Thus, one usually focuses on finding “better” TDs, where better typically means minimizing the width. The problems of finding the treewidth of G and of finding an optimal TD of G are both NP-hard, and thus most effort has focused on developing heuristics, e.g., constructing the TD iteratively by greedily choosing vertices of G that minimize quantities such as the degree or fill.
Since we are interested in applying TDs to realistic graphs, it is these heuristics (to be described in more detail below) that we will use in this paper.

Our goals are to describe the behavior of TD heuristics on real-world and synthetic social graphs and to use these TD tools to identify and extract useful structure from these graphs. In particular (in Section 6), we show the following. We first show (in Section 6.1) that TD methods can identify large-scale structures in realistic networks that correlate strongly with the recently-described core-periphery structure of these networks, even when using simple greedy TD heuristics. We do this by relating the global “core-periphery” structure of these networks, as captured using the k-core decomposition, with what we call the “central-perimeter” structure, which is a measure of the centrality or eccentricity of each bag in the TD. We also describe how small-scale structures such as the internal bag structure of TDs of these networks reflect—depending on the density and other properties of these networks—the clustering coefficient and other related clustering properties of the original networks. We next show (in Section 6.2) that the peripheral bags of these TDs correlate well with low-conductance communities/clusters (when they exist) found using local spectral computations, in the sense that these low-conductance (i.e., good-conductance) communities/clusters occupy a small number of peripheral bags in the TDs. In particular, this shows that in graphs for which the so-called Network Community Profile (NCP) plot is upward-sloping (as, e.g., described in [1], indicating the presence of good small clusters and the absence of good large clusters), the small-scale “dips” in the NCP are localized in clusters that are on a peripheral branch in the TD.
We finally consider (in Section 6.3) how several types of large-scale “ground-truth” communities/clusters, as defined by demographic metadata on the nodes of the network (and that are not good-conductance clusters), are localized in the TDs. In particular, we look at two social network graphs consisting of friendship edges between students at a university; we use metadata associated with graduation year and residence hall information; and we show that clusters defined by these metadata are well-localized in the large-scale central and/or small-scale peripheral structures of the TDs.

A significant challenge in applying existing TD heuristics—which have been developed for very different applications in scientific computing and numerical linear algebra—is that it can be difficult to determine whether one is observing a “real” property of the networks or an artifact of the particular TD heuristic that was used to examine the network. Thus, to establish a baseline and to determine their behavior in idealized settings, we have first applied several existing TD heuristics to a range of toy and synthetic data. (See Section 4 and Section 5, respectively.) The toy data consist of a binary tree, a lattice, a cycle, a clique, and a dense random graph, i.e., graphs for which optimal TDs are known. The synthetic data consist of Erdős-Rényi and power law random graph models, which help us understand the effect of noise/randomness on the TDs. (Other random graph models exhibit similar properties, when their parameters are set to correspondingly sparse values.)
For these graphs, we place a particular emphasis on the properties of the TDs as the density parameters (i.e., the connection probability for the Erdős-Rényi graphs and the power law parameter for the power law graphs) are varied from very sparse to extremely sparse, and we are interested in how this relates to the large-scale core-periphery structure.

Our detailed empirical results for TD heuristics on toy and synthetic networks are important for understanding the behavior of these heuristics on more complex real-world networks; but our results on synthetic and real-world networks also suggest future directions for the development of TD heuristics that are more appropriate for social graph data. Existing TD heuristics focus on producing minimum-width TDs, which are of interest in more traditional graph theory and linear algebra applications, but they are not well-optimized for finding structures of interest in social graphs. Although it is beyond the scope of this paper, the development of TD heuristics that are more appropriate for social graph applications (e.g., understanding how the bag structure of those TDs relates to the output of recently-developed local spectral methods that find good small clusters in large networks) is an important question raised by our results.

The remainder of this paper is organized as follows. In Section 2, we present definitions from graph theory, a detailed discussion of tree decompositions and the algorithms for their construction, and a brief discussion of other prior related work. Section 3 details the datasets we make use of throughout the paper. The subsequent four sections provide our main empirical results. In particular, in Section 4, we consider several TD heuristics applied to toy graphs; and in Section 5, we consider TD heuristics applied to synthetic random graphs.
Then, in Section 6, we describe the results of applying TDs to a carefully-chosen suite of real-world social graphs. In Section 7, we prove a theoretical result connecting treewidth and treelength with the (very different) notion of δ-hyperbolicity, under an assumption on the length of the longest geodesic cycle in the graph. Finally, in Section 8, we provide a brief discussion of our results and a conclusion.

In this section, we will review relevant graph theory, TD ideas, and computational methods, as well as relevant related work.
Let G = (V, E) be a graph with vertex set V and edge set E ⊆ V × V. We often refer to graphs as networks and vertices as nodes, and we will model social and information networks by undirected graphs. We note that TDs are themselves graphs (constructed from other input graphs). The degree of a vertex v, denoted d(v), is defined as the number of vertices that are adjacent to v (or the sum of the weights of adjacent edges, if the graph is weighted). The average degree is denoted d̄. A graph is called connected if there exists a path between any two vertices. A graph is called a tree if it is connected and has no cycles. A vertex in a tree is called a leaf if it has degree 1. A graph H = (S, F) is a subgraph of G if S ⊆ V, F ⊆ E. An induced subgraph of G on a set of vertices S ⊆ V is the graph G[S] := (S, (S × S) ∩ E). Unless otherwise specified, our analyses will always consider the giant component, i.e., the largest connected subgraph of G.

The diameter of a graph is the maximum distance between any two vertices, and the eccentricity of a vertex is the maximum distance between that vertex and all other vertices in the graph. Note that the maximum eccentricity of a graph is equal to the diameter. The clustering coefficient of a vertex is the ratio of the number of edges present among its neighbors to the maximum possible number of such edges; when we refer to the clustering coefficient of a network, we use the average of the clustering coefficients of all its vertices. A cut is a partitioning of a network's vertex set into two pieces. The volume of a cut is the sum over vertices in the smaller piece of the number of incident edges, and the surface area of a cut is the number of edges with one end-point in each piece.
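To make these definitions concrete, here is a minimal sketch (our illustration, not code from the paper) computing the clustering coefficient of a vertex and the volume and surface area of a cut; `adj` is an assumed adjacency structure mapping each vertex to its set of neighbors, and `S` is taken to be the smaller side of the cut.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """Fraction of pairs of neighbors of v that are themselves adjacent."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
    return links / (k * (k - 1) / 2)

def cut_quantities(adj, S):
    """Volume and surface area of the cut (S, V \\ S), per the text's conventions."""
    volume = sum(len(adj[v]) for v in S)                    # degrees summed over the smaller piece
    surface = sum(1 for v in S for w in adj[v] if w not in S)  # edges crossing the cut
    return volume, surface
```

For a triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to 2, vertex 2 has clustering coefficient 1/3, and the cut separating {3} has volume 1 and surface area 1.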
The conductance of a cut—one of the most important measures for assessing the quality of a cut—is then the surface area divided by the volume (here, we follow the conventions used in previous work [1, 2]).

Finally, we will refer to the “core-periphery” structure of a network. Following prior work [1, 3, 2], we use the k-core decomposition to identify these core nodes. The k-core of a network G is the maximal induced subgraph H ⊆ G such that every node in H has degree at least k. The k-core has the advantage of being easily computable in O(V + E) time [4, 5, 6].

TDs are combinatorial objects that describe specialized mappings of cuts in a network to nodes of a tree. Although originally introduced in the context of structural graph theory (the proof of the Graph Minors Theorem [7]), TDs have gained attention in the broader community due to their use in efficient algorithms for certain NP-hard problems. In particular, there are polynomial-time algorithms for solving many such problems on all graphs that have TDs whose width (defined below) is bounded from above by a constant [8, 9]. These algorithms have been applied to problems in constraint satisfaction, computational biology, linear algebra, probabilistic networks, and machine learning [10, 11, 12, 13, 14, 15, 16, 17, 18].
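As an illustration of the easy computability of the k-core structure mentioned above, the following sketch computes the core number of every vertex by iterative peeling, in the spirit of the bucket-based linear-time approach of [4]; the adjacency-set representation `adj` is our assumption, not the paper's.

```python
def core_numbers(adj):
    """Core number of each vertex via iterative minimum-degree peeling."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    maxd = max(deg.values(), default=0)
    buckets = [set() for _ in range(maxd + 1)]   # vertices bucketed by current degree
    for v, d in deg.items():
        buckets[d].add(v)
    core, removed, k = {}, set(), 0
    for _ in range(len(adj)):
        d = 0                                    # lowest non-empty bucket
        while not buckets[d]:
            d += 1
        k = max(k, d)                            # core numbers are non-decreasing
        v = buckets[d].pop()
        core[v] = k
        removed.add(v)
        for w in adj[v]:                         # peel v: demote its live neighbors
            if w not in removed:
                buckets[deg[w]].discard(w)
                deg[w] -= 1
                buckets[deg[w]].add(w)
    return core
```

On a triangle with one pendant vertex, the pendant has core number 1 and the triangle vertices have core number 2.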
Definition 1. A tree decomposition (TD) of a graph G = (V, E) is a pair (X = {X_i : i ∈ I}, T = (I, F)), with each X_i ⊆ V, and T a tree with the following properties:
1. ∪_{i∈I} X_i = V,
2. for all (v, w) ∈ E, there exists i ∈ I with v, w ∈ X_i, and
3. for all v ∈ V, {i ∈ I : v ∈ X_i} forms a connected subtree of T.
The X_i are called the bags of the tree decomposition.

The third condition of the definition is a continuity requirement that allows the TD to be used in dynamic programming algorithms for many NP-hard problems. It is equivalent to requiring that for all i, j, k ∈ I, if j is on the path from i to k in T, then X_i ∩ X_k ⊆ X_j. The quality of a tree decomposition is often measured in terms of its largest bag size. Alternatively, the bags and edges of the TD form separators (cuts) in the graph: the set of vertices contained in any bag, or in the intersection of two adjacent bags, forms a separator in G. This structural property is important, as it allows TDs to be thought of as a method of organizing cuts in a network. This is also related to how the treewidth of a network is used to measure how tree-like a network is. Intuitively, a tree has a treewidth of 1 because the graph can be separated by the removal of a single edge (or vertex) in the network, whereas a cycle requires two edges to be cut and thus has a treewidth of 2. TDs with large widths require larger numbers of vertices to separate the network into two disconnected pieces. A related aspect of the definition of a TD is the overlapping nature of the bags of a TD: vertices in the graph will appear in many bags in the TD. This is particularly true of high-degree or high-k-core nodes [3].

Definition 2. Let T = ({X_i}, T = (I, F)) be a tree decomposition of a graph G. The width of T is defined to be max_{i∈I} |X_i| − 1, and the treewidth of G, denoted tw(G), is the minimum width over all valid tree decompositions of G.
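The three conditions of Definition 1 can be checked directly. The following sketch (our illustration, with hypothetical argument names) validates a candidate decomposition given the graph's vertices and edges, the bags keyed by tree-node id, and the edges of the tree T:

```python
def is_valid_td(V, E, bags, tree_edges):
    """Check the three conditions of Definition 1."""
    # Condition 1: every vertex appears in some bag.
    if set().union(*bags.values()) != set(V):
        return False
    # Condition 2: every graph edge is contained in some bag.
    if any(not any({v, w} <= X for X in bags.values()) for v, w in E):
        return False
    # Condition 3: the bags containing each vertex induce a connected subtree of T.
    for v in V:
        nodes = {i for i, X in bags.items() if v in X}
        seen, stack = set(), [next(iter(nodes))]
        while stack:                      # BFS restricted to tree edges inside `nodes`
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            stack += [j for a, b in tree_edges for j in (a, b)
                      if {a, b} <= nodes and i in (a, b) and j != i]
        if seen != nodes:
            return False
    return True
```

For the 4-cycle, the two bags {0, 1, 2} and {0, 2, 3} joined by a single tree edge form a valid decomposition with maximum bag size 3.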
A tree decomposition whose width is equal to the treewidth is often referred to as optimal. By this definition, trees have the minimum possible treewidth of 1 (their bags contain the edges of the original tree and thus have size 2); but, in contrast to δ-hyperbolicity (see Section 2.4), an n-vertex clique is the least tree-like graph, attaining the maximum treewidth of n − 1. (If W is any complete subgraph of G, then every TD of G has some bag that contains all the vertices of W [19].)

Two other canonical examples (to which we will return in detail below) are the cycle and the grid, which have vastly differing treewidths. All cycles (regardless of the number of vertices) have treewidth 2 (see Figure 5 below). The n × n planar grid, on the other hand, has treewidth n, and thus it is not tree-like by this measure. Grids are particularly noteworthy in the discussion of TDs due to a result (described in more detail below) showing that they are essentially the only obstruction to having bounded treewidth.

Finding a TD for a given graph whose width is minimal (equal to the treewidth) is an NP-hard problem [21, 12]. Most methods (including those discussed here) for constructing TDs were designed to minimize width, as most prior work focused on using these structures to reduce computational cost for an algorithm/application. Also, although treewidth is a graph invariant, TDs of a network are not unique, even under the condition of having minimum width.
See, e.g., Figure 5 below, which shows several distinct minimum-width TDs of a cycle.

Finally, although it is not standard, we will abuse the term width to apply it directly to a bag of a tree decomposition (in which case it takes the value of the cardinality of the set minus one), so that we can talk about the maximum width (which is equivalent to the usual definition of width) and the median width of a decomposition (which is the median of the widths of the bags). We will use the term center to refer to the bag (or bags) associated with the node(s) of minimum eccentricity in the tree underlying a TD, and the term perimeter for bags associated with nodes of relatively high eccentricity. We do this to help provide a framework for discussing the connection in many social and information networks between the core (resp. periphery) of the network and the central (resp. perimeter) bags of its TD computed with certain heuristics. Note that, by definition, a tree will have at most two bags at its center.
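This terminology is easy to compute with. Assuming bags keyed by tree-node id and an edge list for the underlying tree T (our representation, not the paper's), the following sketch finds per-bag widths and the center bags via eccentricities:

```python
from collections import deque

def bag_widths(bags):
    """Per-bag width: cardinality of the bag minus one, as in the text."""
    return {i: len(X) - 1 for i, X in bags.items()}

def eccentricities(nodes, tree_edges):
    """Eccentricity of every node of the tree, by BFS from each node."""
    adj = {i: set() for i in nodes}
    for a, b in tree_edges:
        adj[a].add(b)
        adj[b].add(a)
    ecc = {}
    for s in nodes:
        dist, q = {s: 0}, deque([s])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        ecc[s] = max(dist.values())
    return ecc

def center_bags(nodes, tree_edges):
    """Nodes of minimum eccentricity (the center of the TD)."""
    ecc = eccentricities(nodes, tree_edges)
    m = min(ecc.values())
    return {i for i, e in ecc.items() if e == m}
```

A path-shaped tree on three nodes has a single center node, while a path on four nodes has two, consistent with the remark that a tree has at most two bags at its center.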
In particular, in the so-called Grid Minor Theorem, Robertson and Seymour showed that every graph of treewidth at least k contains an f(k) × f(k) grid as a graph minor, for some integer-valued function f. The original estimate of the function f gave an exponential relationship between the treewidth and the grid size, and although several results greatly improved the relationship, the question of whether or not it held for some polynomial function f remained open for over 25 years. Recently, Chekuri and Chuzhoy proved that there is a universal constant δ > 0 such that all graphs of treewidth at least k have a grid minor of size Ω(k^δ) × Ω(k^δ) [20], resolving this conjecture.

In the context of understanding the intermediate-scale structure of real networks and improving inference (e.g., link prediction, overlapping community detection, etc.), there are likely more appropriate objective functions than width, although their general identification and development is left as future work.

Here, we give a brief overview of existing algorithms for constructing TDs; more comprehensive surveys can be found in [12, 22]. The algorithms for finding low-width TDs are generally divided into two classes: “theoretical” and “computational.” The former category includes, for example, the linear-time algorithm of Bodlaender [23], which checks if a TD of width at most k exists (for a fixed constant k), and the approximation algorithms of Amir [24]. These are generally considered impractical: implementations have been limited to very small widths (e.g., k = 4). The approximation algorithms of Amir have been tested on graphs with up to several hundred vertices, but they require hours of running time even at this size scale. There has also been work on exact algorithms, the most computational of which is perhaps the QuickTree algorithm of Shoikhet and Geiger, which was tested on graphs with up to about 100 vertices and treewidth 11 [26]. Thus, in practice, most computational work requires the use of heuristic approaches (i.e., those which offer no worst-case guarantee on their maximum deviation from optimality).
Since we are interested in applying TDs to real network data, we will focus on these “practical” algorithms in the remainder of this paper. We used INDDGO [27, 28], an open-source software suite for computing TDs and numerous graph and TD parameters.

A common method for constructing TDs is based on algorithms for decomposing chordal graphs.

Definition 3.
A graph G is chordal if it has no induced cycles of length greater than three (equivalently, every cycle in G of length at least four has a chord).

Chordal graphs are characterized by the existence of an ordering π = (v_1, . . . , v_n) of their vertices such that for each v_i, the set of its neighbors v_j with j > i forms a clique. This is a perfect elimination ordering, and it gives a straightforward construction for a TD (also called the clique tree) of a chordal graph, with bags consisting of the sets of higher-indexed neighbors of each vertex.

For a general graph G, one common approach for finding TDs is to first find a chordal graph H containing G, then use the associated TD (since, as mentioned earlier, TDs remain valid for all subgraphs on the same vertex set). The typical approach is via triangulation, a process that uses a permutation of the vertex set (called the elimination ordering) to guide the addition of edges, which are referred to as fill edges. Chordal completions are not unique. For example, the complete graph formed on the vertices of G is chordal and contains G (although it is a trivial or “worst” triangulation, in the sense that it has the most fill edges and the largest possible clique subgraph among all triangulations).

We will use the notation G+π to denote the triangulation of G using ordering π. An outline of the process is given in Algorithm 1. The process for finding a TD T_π using an elimination order π and Gavril's algorithm [29] for decomposing chordal graphs is given in Algorithm 2. We may refer to the width of an ordering, by which we mean the treewidth of the chordal graph G+π. The literature includes several slight variants on Gavril's construction routine (such as Algorithm 2 in [22]), but the overall process and the width of the TD produced are the same for each. Perhaps surprisingly, there always exists some elimination ordering which produces an optimal TD (one of minimum width), and this may co-occur with high fill.
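The characterization above (each vertex's later neighbors form a clique) is easy to verify directly. A minimal sketch, assuming an adjacency-set representation of our choosing:

```python
def is_perfect_elimination_ordering(adj, order):
    """Check that each vertex's neighbors occurring later in `order` form a clique."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [w for w in adj[v] if pos[w] > pos[v]]
        for i in range(len(later)):
            for j in range(i + 1, len(later)):
                if later[j] not in adj[later[i]]:
                    return False    # two later neighbors are non-adjacent
    return True
```

For K4 minus one edge (a chordal graph), eliminating the two non-adjacent vertices first gives a perfect elimination ordering; the 4-cycle, which is not chordal, admits none.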
The following theorem (see [22]) presents the connections between treewidth, triangulations, and elimination orderings.

Theorem 1. [22] Let G = (V, E) be a graph, and let k ≤ n be a non-negative integer. Then the following are equivalent:
1. G has treewidth at most k.
2. G has a triangulation H such that any complete subgraph of H (clique) has at most k + 1 vertices.
3. There is an elimination ordering π such that G+π does not contain any clique on k + 2 vertices as a subgraph.
4. There is an elimination ordering π such that no vertex v ∈ V has more than k neighbors in G+π which occur later in π.

Algorithm 1 Triangulate a graph G into a chordal graph G+π
Input: Graph G = (V, E), and π = (v_1, . . . , v_n), a permutation of V
Output: Chordal graph G+π ⊇ G, for which π is a perfect elimination ordering
  Initialize G+π = (V', E') with V' = V and E' = E
  for i = 1 to n do
    Let N_i = { v_j | j > i and (v_i, v_j) ∈ E' }
    for {x, y} ⊆ N_i do
      if x ≠ y and (x, y) ∉ E' then
        E' = E' ∪ {(x, y)}
      end if
    end for
  end for
  return G+π

Algorithm 2 Construct a TD T_π of a graph G using elimination ordering π and Gavril's algorithm
Input: Graph G = (V, E), π a permutation of V
Output: TD T_π = (X, (I, F)) with (I, F) a tree, and bags X = {X_i}, X_i ⊆ V
  Initialize T = (X, (I, F)) with X = I = F = ∅, n = |V|
  Create an empty n-long array t[]
  Use Algorithm 1 to create a triangulation G+π using π
  Let k = 1, I = {1}, X_1 = {π_n}, t[π_n] = 1
  for i = n − 1 down to 1 do
    Find B_i = {neighbors of π_i in G+π} ∩ {π_{i+1}, . . . , π_n}
    Let π_m be the vertex of B_i with smallest index m
    if B_i = X_{t[π_m]} then
      X_{t[π_m]} = X_{t[π_m]} ∪ {π_i}; t[π_i] = t[π_m]
    else
      k = k + 1
      I = I ∪ {k}; X_k = B_i ∪ {π_i}
      F = F ∪ {(k, t[π_m])}; t[π_i] = k
    end if
  end for
  return T_π = (X, (I, F))

Thus, if one can produce a “good” elimination ordering (i.e., one with a small maximum clique), it is easy to construct a low-width TD, and such an ordering always exists if the treewidth is bounded. The intuition behind fill-reducing orderings to minimize width follows from the idea that in order to produce a large clique that wasn't already in the network, one “should” have to add many fill edges.
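As a concrete illustration, here is a compact Python rendering of Algorithms 1 and 2 (a sketch under our own conventions: adjacency sets, one bag per vertex, and Algorithm 2's bag-merging step omitted, since merging only removes redundant bags and does not change the width):

```python
def triangulate(adj, order):
    """Algorithm 1 (sketch): add fill edges so that `order` becomes a
    perfect elimination ordering of the result."""
    fill = {v: set(nbrs) for v, nbrs in adj.items()}
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [w for w in fill[v] if pos[w] > pos[v]]
        for i in range(len(later)):
            for j in range(i + 1, len(later)):
                x, y = later[i], later[j]
                fill[x].add(y)          # fill edge between later neighbors
                fill[y].add(x)
    return fill

def gavril_td(adj, order):
    """Algorithm 2 (sketch): bag of v = v plus its later neighbors in the
    triangulation; each bag attaches to the bag of its earliest later neighbor."""
    g = triangulate(adj, order)
    pos = {v: i for i, v in enumerate(order)}
    bags, tree_edges = {}, []
    for v in reversed(order):
        B = {w for w in g[v] if pos[w] > pos[v]}
        bags[v] = B | {v}
        if B:
            m = min(B, key=lambda w: pos[w])   # earliest-eliminated later neighbor
            tree_edges.append((v, m))
    return bags, tree_edges
```

Running this on a 6-cycle with the identity ordering gives maximum bag size 3, i.e., width 2, consistent with the fact that every cycle has treewidth 2.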
Here, we describe the landscape of heuristics for creating elimination orderings, focusing on those used in our empirical evaluations. A more detailed analysis of heuristics, as well as theoretical connections between chordal graphs and TDs, is available in [22]. The space of all possible elimination orderings is O(n!) for a graph on n vertices, making it impractical to search using brute-force techniques. One possibility for exploring the space is to apply a stochastic local search approach like simulated annealing, but since this is relatively slow, it is not common in practice.

The first class of specialized methods are known as triangulation recognition heuristics, which include lexicographic breadth-first search (lex-m) and maximum cardinality search (mcs) [22, 30, 31, 32, 33, 34, 35]. These methods are guaranteed to provide a perfect elimination ordering for chordal graphs, so many believed they would produce low-fill and/or low-width orderings for more general graphs. In [22], the authors report good results with respect to width when using these methods on graphs which are already chordal or have regular structures, but poor results compared to the greedy heuristics when even small amounts of randomness are added to the network. Further empirical evaluation in [27] supports these claims. Additionally, these heuristics are too computationally expensive to run on very large graphs.

A large set of additional heuristics uses the idea of splitting the graph (using a small separator), recursively decomposing the resulting pieces, and then “gluing” the solutions into a single TD [24, 36, 37, 38, 11, 39]. To quote Bodlaender and Koster [22], “they are significantly more complex, significantly slower, and often give bounds that are higher than those of simple algorithms.” We do, however, use a related approach that finds a set of nested graph partitions, but instead of decomposing the resulting pieces, it places the separators into an elimination ordering.
This approach is called nested dissection [40, 41], and it is quite popular for computing fill-reducing orderings for sparse matrices in numerical linear algebra. The algorithm recursively finds a small vertex separator (bisector) in a graph, and it ensures that in the resulting elimination ordering, the vertices in the two components formed by the bisection all appear before the vertices in the separator. We use the “node nested dissection” algorithm implemented in METIS [42] (called through INDDGO), and we refer to this heuristic as metnnd. In
METIS, the recursion is stopped when the components are smaller than a certain size, and some version of minimum degree ordering is then applied to the remaining pieces. The software “grows” each bisection using a greedy node-based strategy. Since the algorithm is searching for bisections, there is a tunable “balance” condition (determining how close to 50/50 the split needs to be), although for all computations reported in this paper, we left the parameter at its default value.

Perhaps the most popular class of elimination ordering routines are greedy heuristics, so named because they make greedy decisions to pick the subsequent node in the elimination ordering. There are innumerable variations, but the most common use two basic concepts: choosing a vertex to minimize fill (how many new edges will be added to the graph if a vertex is chosen to be next in the ordering); or choosing a vertex of minimum degree (low-degree vertices have small neighborhoods, which also limits the potential fill). When applied in their most rudimentary forms, these are the mindeg [43] and minfill orderings. Both of these indirectly limit the size of cliques produced in the final triangulation (although they were originally designed to minimize the number of fill edges added during the triangulation, a quantity which is not always correlated with width). For additional heuristics combining these strategies and incorporating additional local information, see [22].

Even though keeping updated vertex degrees for mindeg during triangulation (greedy orderings make their decisions based on a partially triangulated graph at each step) is significantly less computationally intensive than computing current vertex fills for minfill, there have been efforts to reduce the complexity further; a notable example is the approximate minimum degree (amd) heuristic [44, 45].
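The rudimentary mindeg strategy just described can be sketched with a lazy-deletion heap (our simplification; the quotient-graph machinery of amd is not reproduced here):

```python
import heapq

def min_degree_ordering(adj):
    """Greedy mindeg sketch: repeatedly eliminate a minimum-degree vertex of the
    partially triangulated graph, connecting its neighbors into a clique."""
    g = {v: set(nbrs) for v, nbrs in adj.items()}
    heap = [(len(nbrs), v) for v, nbrs in g.items()]
    heapq.heapify(heap)
    order, eliminated = [], set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in eliminated or d != len(g[v]):
            continue                      # stale heap entry; skip it
        order.append(v)
        eliminated.add(v)
        nbrs = g.pop(v)
        for w in nbrs:
            g[w].discard(v)
        for x in nbrs:                    # simulate elimination: clique the neighborhood
            for y in nbrs:
                if x != y and y not in g[x]:
                    g[x].add(y)
                    g[y].add(x)
        for w in nbrs:                    # re-file neighbors with updated degrees
            heapq.heappush(heap, (len(g[w]), w))
    return order
```

On a star graph, the heuristic eliminates a degree-1 leaf before the hub, as expected.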
The amd heuristic computes an upper bound on the degree of each node in each pass, using techniques based on the quotient graph for matrix factorization, and it has been shown to be significantly faster while producing orderings of similar quality (in terms of fill and width minimization). We use amd interchangeably with the traditional mindeg, especially on larger networks.

2.4 Additional Related Work
For completeness, we provide here a brief overview of the large body of additional related work. As already mentioned, TDs played an important role in the proof of the graph minor theorem [7], but they have also become popular in theoretical computer science, as many NP-hard optimization problems have a polynomial-time algorithm for graphs with bounded treewidth [8]. In addition, bounding the treewidth of the underlying graph of probabilistic graphical models allows for fast inference computations [46]. Additional overviews of TDs and their uses in discrete optimization are available [47, 48, 49, 50]; and one can also learn more about the uses of these methods in numerical linear algebra and sparse matrix computations [51], as well as connections with triangulation methods: triangulation of minimum treewidth [52], empirical work on treewidth computations [53], the minimum degree heuristic and connections with triangulation [54], and a survey of triangulation methods [55]. Finally, the treewidth of random graphs for various parameter settings has been studied [56, 57].

A different notion of tree-likeness is provided by δ-hyperbolicity. Early, more mathematical work did not consider graphs and networks [60, 61], but more recent, more applied work has [62, 63, 64]. Computing δ exactly is very expensive [58], and sampling-based methods to approximate it provide only a very rough guide to its value and properties [3].
For many references on δ-hyperbolicity in network analysis, see [64] (and the more recent paper [65]) and references therein. There has been work on trying to relate hyperbolicity and TD-based ideas, often going beyond treewidth to consider other metrics such as treelength or chordality or the expansion properties of the graph [66, 67, 68, 69, 70, 71, 72, 73, 74].

Recent work in network analysis and community structure analysis has pointed to some sort of “core-periphery” structure in many real networks [1, 75, 76, 3, 2]; and recently this has been related to the k-core decomposition—see, e.g., [3] and references therein. The k-core decomposition is of interest more generally, and additional references for k-core decompositions, including their use in visualization and in larger-scale applications, include [77, 78, 79, 80, 81, 82, 83]. Questions of well-connected or expander-like cores are of particular interest in applications having to do with diffusion processes, influential spreaders, and related questions of social contagion [84, 85].

There are a few other papers that have used TDs to investigate the structural properties of social and information networks: e.g., to look at the tree-likeness of internet latency and bandwidth [86]; to compare hyperbolicity and treewidth on internet networks [87]; and to examine the relationship between hyperbolicity, treewidth, and the core-periphery structure in a much wider range of social and information networks [3]. In particular, [87] concludes that the hyperbolicity is small in the networks they examined but the treewidth is relatively large, presumably due to a highly connected core; and [3] concludes that many real social and information networks do have a tree-like structure, with respect to both metric-based hyperbolicity and (in spite of the large treewidth) the cut-based TDs, that corresponds to the core-periphery structure of the network. Finally, very recently we became aware of [88] and [89].
We have examined the empirical performance of existing TD heuristics on a broad set of real-world social and information networks, as well as a large corpus of synthetic graphs. The real-world networks have been chosen to be representative of a broad range of networks, as analyzed in prior work [1, 3, 2], and the synthetic graphs have been chosen to illustrate the behavior of TD methods (our prior work focused on the use of δ-hyperbolicity [58, 3, 59]; it can be a useful tool for describing and analyzing real networks, even though it is expensive to compute, but aside from our theoretical result in Section 7 relating it to treewidth and treelength, it is not our focus in this paper)
in controlled settings. See Table 1 for a summary of the networks we have considered. The real-world graphs are connected, but we are interested in parameter values for the synthetic graphs which might cause the instances to be disconnected. In these cases, we work with the giant component, and the statistics in Table 1 are for this connected subgraph.
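As a concrete illustration, the giant-component statistics of the kind reported in Table 1 can be computed with standard tools. The sketch below uses networkx; the helper name and the ER(8)-style example instance are ours, not from the paper, and the diameter is omitted since it is by far the most expensive of these statistics.

```python
import networkx as nx

def giant_component_stats(G):
    """Restrict G to its giant component and compute Table-1-style statistics:
    n_c (nodes), k_l / k_m (lowest / maximum k-core), average degree 2E/N,
    and the average clustering coefficient.  Diameter is omitted for speed."""
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    core = nx.core_number(giant)          # k-core index of every node
    n_c = giant.number_of_nodes()
    return {
        "n_c": n_c,
        "k_l": min(core.values()),        # lowest k-core
        "k_m": max(core.values()),        # maximum k-core
        "avg_deg": 2.0 * giant.number_of_edges() / n_c,
        "avg_clust": nx.average_clustering(giant),
    }

# An ER(8)-style instance: G(n, p) with n = 5000 and p = d/n for d = 8.
G = nx.gnp_random_graph(5000, 8 / 5000, seed=0)
stats = giant_component_stats(G)
```

For a sparse ER instance like this one, the giant component contains nearly all nodes and the average degree is close to the target d, while the clustering coefficient stays near p.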
Table 1: Statistics of the analyzed networks: number of nodes in the giant component n_c; lowest k-core k_l; maximum k-core k_m; average degree ¯d = 2E/N; average clustering coefficient ¯C; and diameter D. The rows cover the ER random graphs ER(1.6) through ER(32); the power-law graphs PL(2.50), PL(2.75), and PL(3.00); the SNAP graphs CA-GrQc, CA-AstroPh, as20000102, Gnutella09, Email-Enron, road-TX, and web-Stanford; the Facebook networks FB-Caltech, FB-Haverford, FB-Lehigh, FB-Rice, and FB-Stanford; and PowerGrid, Polblogs, and PlanarGrid. (FB-Caltech, for example, has n_c = 762, k_l = 1, k_m = 35, ¯d = 43.7, ¯C = .426, and D = 6.)

Erdős-Rényi (ER) graphs.
Although ER graphs are often criticized for their inability to model pertinent properties of realistic networks, extremely sparse ER graphs have several structural inhomogeneities that are important for understanding tree-like structure in realistic networks [3]. (The same is true for many other less unrealistic random graph models, assuming their parameters are set to analogously sparse values, which they often are not.) In particular, in the extremely sparse regime of 1/n < p < log(n)/n, ER graphs are (w.h.p.) not even fully connected; ER graphs in this regime have an upward-trending NCP (network community profile) [1]; with respect to their k-core structure, a shallow (but non-trivial) core-periphery structure emerges [3, 90]; and with respect to their metric properties (as measured with δ-hyperbolicity), graphs in this regime have non-trivial tree-like properties [3]. Following previous work [3], we set the target number of vertices to n = 5000, and we choose p = d/n for various values of d, from d = 1.6 to d = 32. We denote these networks using ER(d). Table 1 shows that, with increasing d, i.e., increasing p, the size of the giant component increases to 5000, the number of edges increases dramatically, the clustering coefficient remains close to zero, the average degree ¯d increases, and the diameter decreases dramatically.

Power Law (PL) graphs.
We also considered the Chung-Lu model [91], an ER-like random graph model parameterized to have a power law degree distribution (in expectation) with power law (or heterogeneity) parameter γ, which we vary between 2 and 3. We denote these networks using PL(γ). We consider values of the degree heterogeneity parameter γ ∈ {2.50, 2.75, 3.00}. Table 1 shows that, as a function of decreasing γ, the size of the giant component increases, the average degree ¯d increases, and the diameter decreases. Although not shown in Table 1, as γ decreases, PL graphs also form a rather prominent, and moderately-deep, k-core structure [3]. These are all trends that parallel the behavior of ER as d increases.

SNAP graphs.
We selected various social/information networks that were used in the large-scale empirical analysis that first established the upward-sloping NCP and associated nested core-periphery structure for a broad range of realistic social and information graphs [1]. These are available at the SNAP website [92]. In particular, the networks we considered are CA-GrQc and CA-AstroPh (two collaboration networks); as20000102 (an autonomous system snapshot); Gnutella09 (a peer-to-peer network from Gnutella); Email-Enron (an email network from the Enron database); as well as the Stanford Web network web-Stanford and the Texas road network road-TX. These networks are very sparse, e.g., fewer than ca. 10 edges per node; and they exhibit substantial degree heterogeneity, moderately high clustering coefficients (except for Gnutella09, web-Stanford, and road-TX), and moderately small diameters. In addition, although not presented in Table 1 (and with the exception of road-TX), these graphs have a much stronger core-periphery structure, as measured by the k-core decomposition, than typical synthetic networks [3, 2, 1].

Facebook Networks.
We selected several representative Facebook graphs out of ca. 100 Facebook graphs from various American universities collected in ca. 2005 [93]. These data sets range in size from around 700 vertices (FB-Caltech) to approximately 30,000 vertices (FB-Texas84). In particular, we examine FB-Caltech, FB-Rice, FB-Haverford, FB-Lehigh, and FB-Stanford in this paper. These networks all arise via similar generative procedures, and thus there are strong similarities between them. There are a few distinctive networks, however, that are worth mentioning. In particular, several universities (FB-Caltech, FB-Rice, FB-UCSC) have a particularly strong resident housing system, and it is known that this manifests itself in structural properties of the graphs [93]. Below, we will use the meta-information associated with this housing system to provide "ground-truth" clusters/communities for comparison and evaluation. One important characteristic to observe from Table 1 is that these Facebook networks, while sparse, are much denser than any of the SNAP graphs we consider or that were considered previously [1].

Miscellaneous Networks.
We also selected a variety of real-world networks that, based on prior work [1, 3, 2], are known to have very different properties than the SNAP social graphs or the Facebook social graphs. In particular, we consider Polblogs, a political bloggers network [94] (a graph constructed from political blogs which are linked); the Western US power grid PowerGrid [95]; and a two-dimensional 50 × 50 planar grid PlanarGrid.

(For sparse ER graphs, this lack of regularity arises since there are not enough edges for concentration of measure to occur, i.e., for empirical quantities such as the empirical degrees to be very close to their expected values. For PL graphs, an analogous lack of measure concentration occurs due to the exogenously-specified heterogeneity parameter γ; see [2] for how this affects the NCP of these networks. Among the differences caused by the much higher density of the Facebook networks is that these networks have a much deeper k-core structure than the other real networks, and they tend to lack deep cuts, e.g., they lack even good very-imbalanced partitions such as those responsible for the upward-sloping NCP [1, 2].)

Tree decompositions of toy networks
In this section, we will describe the results of using a variety of TD heuristics on a set of very simple "toy" networks, on which the optimal-width TDs are known. The five toy networks we consider are a binary tree (SmallBinary), a small section of the two-dimensional planar grid (SmallPlanar), a cycle (SmallCycle), a clique (SmallClique), and an Erdős-Rényi graph with an edge probability of p = 0. (SmallER). Each of these networks has 100 nodes, except for SmallCycle, which has only 10 nodes (the principal change with a larger cycle is that the eccentricity of the decomposition becomes much larger, which simply makes it more difficult to visualize), and SmallBinary, which has 128 nodes to maintain symmetry. In Figure 1, we provide visualizations of each of the five networks.

These very simple network topologies illustrate in a controlled way the behavior of different TD heuristics in a range of settings. For example, while SmallBinary is a tree, the other graphs are not; the two-dimensional grid is quite different from a tree, as is SmallCycle (although, from the treewidth perspective, it is fairly close to a tree), and both have high-quality well-balanced partitions; and both SmallClique and SmallER are expanders (not constant-degree expanders, but expanders in the sense that they don't have any good partitions) and are thus very non-tree-like (from the TD perspective), but each has important differences with respect to its TD. We will focus on which types of structures different heuristics tend to capture, as well as how different heuristics deal with nodes (not bags) which are associated with the core or periphery of the original network. Importantly, these toy networks have basic constructions, and they (mostly) have known optimal-width TDs; e.g., SmallPlanar and SmallCycle have several known equivalent minimum-width TDs, and the SmallER network serves to illustrate some of the effects of randomness on a TD. The insights we obtain here can be used to interpret the output of TD heuristics in much more complex synthetic and real networks.
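A version of this baseline experiment is easy to reproduce with the two greedy heuristics available in networkx's approximation module (minimum-degree and minimum-fill-in, analogues of mindeg and minfill; the metnnd, lexm, and mcs implementations used in the paper are not part of networkx). This is an illustrative sketch of ours: the SmallER edge probability is a guess (it is truncated in our source), and networkx reports the standard treewidth convention, maximum bag size minus one, which may differ from the width convention used in the tables below.

```python
import networkx as nx
from networkx.algorithms.approximation import (
    treewidth_min_degree,
    treewidth_min_fill_in,
)

# Toy networks roughly analogous to the paper's suite.  This binary tree has
# 127 nodes rather than 128; the exact size does not matter qualitatively.
toys = {
    "SmallBinary": nx.balanced_tree(2, 6),     # complete binary tree
    "SmallPlanar": nx.grid_2d_graph(10, 10),   # 10 x 10 planar grid
    "SmallCycle": nx.cycle_graph(10),
    "SmallClique": nx.complete_graph(100),
    "SmallER": nx.gnp_random_graph(100, 0.5, seed=1),
}

# Each heuristic returns (width, decomposition); the decomposition is a tree
# whose nodes are frozensets of graph nodes (the bags).
widths = {
    name: (treewidth_min_degree(G)[0], treewidth_min_fill_in(G)[0])
    for name, G in toys.items()
}
```

On the tree, the cycle, and the clique these greedy heuristics recover the exact treewidth (1, 2, and 99, respectively, in the bag-size-minus-one convention); on the grid they may overshoot the true treewidth of 10, mirroring the behavior reported in Table 2.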
In Figures 2, 3, and 4, we show visualizations of TDs produced by various heuristics (the greedy mindeg in Figure 2; metnnd, nested node dissection via METIS, in Figure 3; and lexm in Figure 4) for each of these five toy networks. In these visualizations, the size of the bag corresponds with the bag's width, and the coloring is based on the fraction of edges present in the induced subgraph of the bag. In particular, if the nodes in the bag form a clique in the original network, then the fraction of edges present is 1. There is little to distinguish among the decompositions of SmallClique; but for all of the other networks, including SmallBinary, there are differences in the decompositions produced by the different heuristics. Consider SmallPlanar, SmallCycle, and SmallER. For both SmallPlanar and SmallER, mindeg and metnnd return TDs with several prominent branches, while lexm returns a path for the TD. For mindeg, this is due to the tendency of the algorithm to pick low-degree nodes on the "outside" of the network and then work its way around the outside of the network. For metnnd, this is due to the tendency of the algorithm to cut the network repeatedly into smaller pieces and then recursively "eat away" at these smaller pieces to form the TD. On the other hand, the lexm heuristic works to produce a minimal triangulation using lexicographic labelings along paths. This often results in a path-like TD, as the algorithm uses a breadth-first search through the network. For SmallCycle, metnnd returns a "branchy" TD, while both mindeg and lexm return path-like TDs. (These and other visualizations were created with the GraphViz command neato [96], with the help of [97, 98].)

Figure 1: A set of small networks: (a) SmallBinary; (b) the 10 × 10 SmallPlanar grid; (c) SmallCycle; (d) the 100-node SmallClique; (e) SmallER. Edges are colored by their length in the planar embedding.

Figure 2: Greedy mindeg TDs of the toy networks: (a) SmallBinary, (b) SmallPlanar, (c) SmallCycle, (d) SmallClique, (e) SmallER. Bags are colored by the fraction of possible edges present in the bag, with red being denser and blue being less dense.

Figure 3: metnnd (nested node dissection via METIS) TDs of the toy networks, panels (a)-(e) as in Figure 2. Bags are colored by the fraction of possible edges present in the bag, with red being denser and blue being less dense.

Figure 4: lexm
TDs of toy networks, panels (a)-(e) as in Figure 2. Bags are colored by the fraction of possible edges present in the bag, with red being denser and blue being less dense.

More quantitatively, in Tables 2 and 3, we provide basic statistics for TD heuristics applied to each of these networks. Table 2 shows a summary of our results for the (maximum) width of TDs produced by various heuristics (for SmallER, the width given is averaged over five different instantiations of the network). We ignore the issue of tie-breaker choices (e.g., in mindeg, choosing among non-unique minimum degree nodes). On the whole, the heuristics do a good job of finding optimal-width TDs on SmallBinary, SmallClique, and SmallCycle (with the exception of metnnd). The greedy heuristics have trouble finding the optimal-width TD on SmallPlanar, while lexm and mcs both find an optimal decomposition on the grid. On SmallER, we observe that the greedy heuristics and metnnd outperform lexm and mcs; this is in agreement with previously-reported results [22].

Table 3 shows a summary of our results for the median width of TDs produced by various heuristics (as defined in Section 2.2). The median width is potentially more useful for revealing structure in realistic network data since, e.g., it can be used to see whether a TD is dominated by larger bags or by smaller bags. If a network is dominated by bags of small size (such as SmallBinary, SmallCycle, and SmallPlanar), depending on the internal structure of the bags, this can indicate several things. For example, the small bags could consist of tight clusters or cliques, indicating that the network has many tightly connected but small groups of nodes. Alternatively, if a small bag's structure is mostly disconnected, this may indicate the bag is related to small cycles (an example is given below). For SmallCycle and SmallPlanar, the small bags are cyclical, while for SmallBinary the small bags all consist of 2-cliques. SmallClique and SmallER have large median widths (though this is trivial in the case of the clique). The 100-clique is both trivial and too large of a clique to be realistic, but SmallER has interesting bags. The results in Table 3 show the small median widths of SmallPlanar, SmallCycle, and SmallBinary and the large median widths of SmallER and SmallClique. Table 2 also demonstrates that, while there are differences in the widths of the TDs produced by the heuristics, these differences are reasonably small.

Network       n     W_mindeg  W_minfill  W_nnd  W_mcs  W_lexm
SmallBinary   128   1         1          3      1      1
SmallPlanar   100   13        13         14     10     10
SmallCycle    10    3         3          3      3      3
SmallClique   100   100       100        100    100    100
SmallER       100   86        85         86     91     89
Table 2: TD heuristic maximum widths. The widths of SmallClique and SmallER are relatively large (they grow linearly with the network size), the width of the SmallPlanar network is of an intermediate size (it grows with the square root of the network size), and the widths of SmallBinary and SmallCycle are small (they stay constant with the size of the networks). The greedy heuristics find smaller-width decompositions on SmallER, while lexm and mcs perform better on SmallPlanar.

(Using medians rather than eccentricity can result in different central bags. However, in most of the networks that we studied, the results were very similar. In particular, the biggest changes occurred in the FB networks, where the median shifted towards the heavier end of the path-like TD. However, these bags were still a part of the thick trunk of the network, and thus the results were very similar. In other networks, the median bag was very close to the central eccentric bag, and the main difference is that the median bag tended to have more whisker branches, i.e., branches consisting of one or two bags of small width. This does not substantially change any of our analysis. We will see below that most real networks have small median width, with the smallest bags dominated by cliques, intermediate bags dominated by cycles, and with large, connected, central bags which resemble the bags of SmallER.)

Network       n     ~W_mindeg  ~W_minfill  ~W_nnd  ~W_mcs  ~W_lexm
SmallBinary   128   1          1           1       1       1
SmallPlanar   100   5          5           5       10      8
SmallCycle    10    3          3           3       3       3
SmallClique   100   100        100         100     100     100
SmallER       100   52         51          49      85      80
Table 3: TD heuristic median widths. This quantity is much smaller than the corresponding maximum widths in several of the networks (although it remains large for SmallER), indicating that these networks are dominated by bags which are much smaller than the largest bag in the network.

(a) A tree (left) and an optimal TD (right). (b) A clique (left) and an optimal TD (right). (c) An optimal TD of a cycle which is similar to the decomposition found by mindeg; the center node is placed in every bag of the decomposition. (d) An optimal TD of the cycle which is similar to the decomposition found by lexm; note the cycle is flattened and the bags are formed across the decomposition. (e) An optimal TD of the cycle which is similar to the decomposition found by metnnd; the cycle is "pinched" in several places, forming central bags, and the remaining pieces of the cycle can then be decomposed recursively (pinched again) or using the methods in (b) and (c).

Figure 5: Example TDs. The tree and the clique have a standard optimal TD. The cycle has many possible minimum-width TDs, though all place disconnected nodes in the bags.
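The path-style construction of Figure 5c can be written out explicitly, together with a check of the three defining conditions of a tree decomposition. This is an illustrative sketch of ours (the helper names are not from the paper), assuming Python with networkx:

```python
import networkx as nx

def cycle_path_td(n):
    """Figure-5c-style TD of the n-cycle 0, 1, ..., n-1: keep node 0 in every
    bag, and sweep around the cycle one edge at a time."""
    bags = [frozenset({0, i, i + 1}) for i in range(1, n - 1)]
    td = nx.path_graph(len(bags))        # the decomposition tree is a path
    return td, bags

def is_valid_td(G, td, bags):
    """Check the TD conditions: (1) every vertex is in some bag, (2) every
    edge is contained in some bag, and (3) for each vertex, the bags that
    contain it induce a connected subtree of td."""
    covers_nodes = set().union(*bags) == set(G)
    covers_edges = all(any({u, v} <= b for b in bags) for u, v in G.edges())
    coherent = all(
        nx.is_connected(td.subgraph([i for i, b in enumerate(bags) if v in b]))
        for v in G
    )
    return covers_nodes and covers_edges and coherent

G = nx.cycle_graph(10)
td, bags = cycle_path_td(10)
```

Every bag has three nodes, so this is a minimum-width TD of the cycle, and each bag contains a pair of nodes (the two endpoints of the "next edge" and the fixed center node) that need not be adjacent in the original graph, exactly the disconnected-nodes-in-bags phenomenon described above.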
An important aspect of TD heuristics is the difference between their behavior on (denser) clique-like graphs and (sparser) cycle-like graphs. In Figure 5, we illustrate this. First, for reference, in Figures 5a and 5b we give canonical minimum-width TDs for a tree and a clique. To understand the difference between cycles and cliques, recall that there are many ways of producing a TD of a cycle (three of these are illustrated in Figures 5c, 5d, and 5e). One simple way is to produce a tree which is a path. This can be done by taking a node v and placing it and its two neighbors in a bag at one end of the path. Then, keeping v in every bag, progress around the cycle sequentially, forming the next bag by including v and the two nodes of the next edge (see Figure 5c). Another method produces a path by "flattening" the cycle, placing each edge in a bag with the node from the other side (see Figure 5d). The metnnd heuristic "pinches" the cycle at a few points, and then produces branches from each of those points (see Figure 5e). There are many differences between these TD heuristics, but an important point is that the nodes in the cycle must be placed in bags with nodes that are not their neighbors in the original graph. Different TD heuristics differ greatly in how they make this decision, and its effect can be seen in the TDs they construct.

Another important consideration is the interior structure of each bag that is produced by a TD heuristic. Recall that in SmallClique, the only valid TD (which does not contain unnecessary bags) is a single bag containing the entire network. Relatedly, if the network is a k-tree, formed by overlapping cliques (rather than overlapping edges, as in a normal or 2-tree), then the TD will have bags which consist of the individual cliques.
Thus, with cliques, it is the local structure (local in the original graph, in the sense that it is driven by the neighbors of a given node in the original graph, in contrast with what is going on in, e.g., SmallCycle) that drives the bag formation. With cycles, on the other hand, this local structure is partially "lost" in the bags of the TD. This is of interest since, as already mentioned, the interior structure of bags of different widths is important for understanding what is creating the properties of the TD.

As an example, we observe that, for all of the heuristics, the larger bags on SmallPlanar have many disconnected nodes and only a few edges. This is a signature of "cyclical" behavior; and, indeed, from the TD perspective, the grid "looks like" a set of small, regular, overlapping cycles. The structure of the TD is formed by the heuristic's method of moving across the grid and closing cycles. This suggests a simple metric to measure whether the interior of a bag is driven by cycles or by small, tightly connected clusters: measure the fraction of edges present in the bag, i.e., the edge density of the bag. (We will do this below, and this is why we color many of the visualizations by the density of the bag.)
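This edge-density diagnostic is straightforward to compute from any TD. The sketch below is ours, using the networkx minimum-degree heuristic as a stand-in for the heuristics used in the paper:

```python
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def bag_densities(G, td):
    """For each bag of a TD (bags are frozensets of nodes of G), return the
    fraction of possible edges present in the induced subgraph of the bag.
    Bags with fewer than two nodes are skipped (density is undefined)."""
    out = {}
    for bag in td.nodes():
        k = len(bag)
        if k < 2:
            continue
        m = G.subgraph(bag).number_of_edges()
        out[bag] = m / (k * (k - 1) / 2)   # edges present / edges possible
    return out

# A cycle yields sparse bags (each triangle-sized bag misses an edge), while
# a clique yields fully dense bags.
_, td_cycle = treewidth_min_degree(nx.cycle_graph(10))
_, td_clique = treewidth_min_degree(nx.complete_graph(10))
cycle_d = bag_densities(nx.cycle_graph(10), td_cycle)
clique_d = bag_densities(nx.complete_graph(10), td_clique)
```

Low-density bags signal "cyclical" interiors, high-density bags signal clique-like interiors; this is the quantity used to color the bags in the TD visualizations.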
SmallPlanar (for which there exist good well-balanced partitions) and SmallER (for which there do not) also illustrate differences between the TD heuristics. For both graphs, the greedy heuristics and metnnd have significantly smaller median widths than maximum widths. This is indicative of heterogeneity in the network: there are nodes which are so entangled with other nodes that they must appear together in a large bag, but there are also nodes which are connected to only a small number of other nodes and only need to appear in a few very small bags. This can partially be explained by the tendency of the greedy heuristics to work from the "boundary" of the graph (e.g., boundary nodes have smaller degrees) and to pick points to "eat into" the graph.

This is illustrated in Figure 6 for SmallPlanar. Using mindeg as an example, recall that this heuristic works by successively picking a minimum degree node in the network; thus, when applied to SmallPlanar, it will pick each of the corner vertices of the grid first. This then forms small bags at each corner and, depending on whether it is picking non-unique nodes at random or in an ordered fashion, it will then proceed to work in from the periphery of the network. Indeed, in Figures 2 and 3, we see that the TDs for these heuristics have four major arms with small leaves containing nodes from the border of the grid. Figure 6 provides a visualization of where the nodes from these bags (one of the peripheral bags and one of the central bags in the TD) are in SmallPlanar. (See, in particular, Figures 6a and 6c for mindeg.)

The lexm and mcs heuristics, in contrast, find the minimum possible width for the grid, but the TDs that are produced (as illustrated in Figure 4 for lexm) are long, path-like trunks. This is due to the way that lexm picks a starting node and then works across the graph in the style of a breadth-first search. With SmallPlanar, it starts at one corner and then moves across the network to form a minimal triangulation. Although the (maximum) width is minimal, the median widths of these decompositions are relatively large, as most of the bags are roughly the same size (see Figures 6b and 6d for the results of lexm).

(a) mindeg bag from upper left arm of Figure 2b. (b) lexm bag from lower right of Figure 4b. (c) mindeg central bag in Figure 2b. (d) lexm central bag in Figure 4b.

Figure 6: Representative bags from an arm in mindeg and an arm in lexm, as well as the associated central bags. In mindeg, each arm progresses from a different corner of SmallPlanar. However, when these bag lines converge, the central bags end up containing pieces of each line, as in Figure 6c. In lexm, the line proceeds diagonally across the grid from the lower left corner to the upper right in a regular manner, as in Figure 6d. This results in smaller central bags and produces a path decomposition. See the main text for more details.

With SmallER (which is harder to visualize since it does not embed well in two dimensions), the mindeg and metnnd algorithms also eat in from the "boundary" of the network, where here "boundary" means nodes with slightly smaller degrees or slightly better cuts (slightly smaller due to random fluctuations). As with the very different SmallPlanar, this produces several arms and then a few central bags. In SmallER, the greedy heuristics produce a better TD in terms of width than the lexm and mcs heuristics, both search-based heuristics. These similarities and (substantial) differences between TD heuristics on SmallER (compared with SmallPlanar) are apparent in Figures 2, 3, and 4.
Overall, the greedy heuristics, e.g., mindeg or metnnd, seem to produce a better representation of the large-scale structure of SmallPlanar and SmallER than the lexm and mcs heuristics, in two ways. On SmallER, the greedy heuristics find decompositions with both smaller maximum and smaller median widths. (Since most real networks have a randomized aspect to their generation, this suggests that greedy heuristics may be more useful there.) On SmallPlanar, the median width is smaller, and the greedy heuristics do a better job of "capturing" all four corners of the grid. In other words, the resulting tree decomposition has four branches, each of which is tied to a specific corner of the network, while the lexm and mcs TDs capture only two of the corner structures. (Although the maximum width is smaller with lexm and mcs, the ability to capture what is an obvious visual feature of a simple network is of potential interest.) In the rest of the paper, we will be considering significantly larger and more complicated networks than these toy examples. With these larger networks, the metnnd and amd heuristics, as implemented using INDDGO [27], are the most scalable, compared with the basic greedy algorithms (mindeg or minfill). The amd heuristic is closely related to the mindeg heuristic (recall that amd picks minimum nodes based on an easy-to-compute approximation of node degree), and it gives similar results to mindeg. The most consistent difference between the two heuristics seems to be the number of central/overlapping bags produced. Thus, we will often show results only for the amd heuristic as a matter of visual convenience.

Tree decompositions of synthetic networks
In this section, we will describe the results of using a variety of TD heuristics on a set of synthetic networks. We focus our attention on two simple classes of random graphs: the popular Erdős-Rényi (ER) random graphs (in Section 5.1); and a power law (PL) extension of the basic ER model (in Section 5.2). (We emphasize, though, that similar qualitative results also hold for many other random graph models in their extremely sparse regimes.) This will allow us to begin to understand how TDs behave in random graph models with a very simple random structure. Importantly, we will focus on extremely sparse graphs. For the ER model, this means values of the connection probability p that lead to the graph not even being fully connected (in which case we will consider the giant component), while for the PL model this means values of the degree heterogeneity parameter γ that are typically used to describe many realistic networks and that lead to analogously sparse graphs.

ER graphs are often presented as "strawmen," since they obviously do not provide a realistic model for many aspects of real-world networks (e.g., the heavy-tailed degree distributions and the non-zero clustering coefficients present in many real networks). Indeed, the "vanilla ER" graphs that are often considered (e.g., ER graphs with densities sufficiently large that the graph is fully connected) are not tree-like, either by the metric notion of δ-hyperbolicity or by the cut-based notion of TDs. Recent work has shown, however, that with respect to their large-scale structure, extremely sparse ER networks do capture several subtle but ubiquitous properties of interest in realistic networks: first, the small-scale versus large-scale isoperimetric structure of the NCP [1, 2]; second, a size-resolved version of δ-hyperbolicity that is consistent with large-scale metric tree-likeness [3]; and third, a non-trivial core-periphery structure with respect to k-core decompositions [3].
(In particular, in the sparsest regime of the ER networks that we consider, ER(1.6), a very shallow core-periphery structure appears, whereas none exists at the higher densities.) Importantly, for all three of these properties, similar results were seen with other random graph models, such as PL random graphs in the commonly-used regime of the degree heterogeneity parameter. Prior work has also provided evidence that these extremely sparse random graphs have non-trivial tree-like structure (at least relative to much denser ER graphs) when viewed with respect to the cut-like notion of tree-likeness [3].

Here, we provide a much more detailed analysis of this phenomenon for TD heuristics applied to ER and PL graphs. We will be particularly interested in similarities between extremely sparse ER graphs and PL graphs with respect to the core-periphery structure (e.g., from k-core and related decompositions) of a network. Among other things, we show that this core-periphery structure is captured by the amd TD. Of particular interest is how the core-periphery structure relates to central (low eccentricity) or perimeter (high eccentricity) bags in the TD.
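The k-core contrast between the sparsest and densest ER instances is easy to check directly. The following sketch is ours (the helper name and parameter choices are not from the paper), using networkx to compute the core decomposition of ER(1.6)- and ER(32)-style graphs:

```python
import networkx as nx

def max_core(G):
    """Depth of the k-core decomposition of the giant component: the largest
    k for which the k-core is non-empty."""
    giant = G.subgraph(max(nx.connected_components(G), key=len))
    return max(nx.core_number(giant).values())

n = 5000
sparse = nx.gnp_random_graph(n, 1.6 / n, seed=0)   # ER(1.6)-style instance
dense = nx.gnp_random_graph(n, 32.0 / n, seed=0)   # ER(32)-style instance
```

For the sparse instance the decomposition typically bottoms out at a 2-core (w.h.p. there is no 3-core at this density), while the dense instance supports much deeper cores; this is the shallow-versus-deep core-periphery contrast that the amd TDs turn out to reflect.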
Here, we give a summary of the empirical results of an analysis of TDs on ER random graphs, with an emphasis on the behavior as the connectivity parameter p is varied. In the very sparse to extremely sparse regime, ER networks undergo non-trivial global structural changes as p is varied [99, 100]. In particular, for our subsequent results, there are three regimes of p that are of interest: if p < 1/n, then the largest connected component is O(log n) in size, and the small components are likely trees; if 1/n < p < log n/n, then the graph has a giant component (i.e., a constant fraction of the network is connected), and the remaining small components of size O(log n) are likely trees; and if p > log n/n, then almost surely the network is fully connected, the degrees are very near their expected values, and there are no good-conductance clusters (of any size). We are interested in the last two regimes, and we consider synthetic graphs (ER(1.6) through
ER(32), i.e., values of p between 1.6/n and 32/n, with n = 5000 nodes) that go from extremely sparse to somewhat denser. Table 1 provides basic statistics for these graphs.

Table 4: Basic amd TD statistics for the ER networks ER(1.6) through ER(32) (Table 4a) and the PL networks PL(2.5), PL(2.75), and PL(3.0) (Table 4b). N_amd gives the number of bags in the TD, E_amd gives the maximum eccentricity (diameter) of the TD, W and ~W are the maximum and median widths of the TD, and ~D is the median bag density.

We start with Figure 7 and Table 4a, which show the basic features of the TDs of ER networks. Figure 7 presents a visualization of part of the output of a TD with the amd heuristic, colored by density of bag subgraph, for the sparsest (ER(1.6)) and densest (
ER(32)) networks in our ER suite. Results are similar to those of metnnd. Observe that there is much greater heterogeneity in the density of bags for ER(1.6) than for ER(32). For the former, there are many small bags which are cliques; while for the latter, there are fewer small bags, and the bags are much sparser in general. This suggests (and we have verified by inspection) that the sparser ER(1.6) has greater structural heterogeneity than the denser ER(32).

A more detailed understanding of this can be obtained from the summary statistics in Table 4a. Several observations are worth making. First, the number of bags in the TD tends to decrease as the density p increases (with the exception of the sparsest regime, where the giant component is smaller). This is because most of the network is placed into one bag, and only a few bags are needed to take care of the remaining nodes. Second, the TD itself, viewed as a graph, has smaller diameter as the density p increases. Third, the maximum and median widths increase with the average degree of the network. Indeed, the width increases quickly with the average degree, with the largest bag (at the "center" of the TD) containing 77% of the nodes of the network in ER(32). Finally, the median density of the bags decreases dramatically as the density of the original graph increases. This initially-counterintuitive phenomenon is easily explained: for extremely sparse ER, the TDs have many small bags, which need only small numbers of edges to have a reasonably high edge density. With the denser graphs, many nodes have to be placed in each bag, and this requires quadratically more edges per bag to achieve a similar density.

We would next like to look in more detail at the structure of the TDs generated on these different ER networks (e.g., what changes as we move from the central, large bags of the TD to the smaller, peripheral bags of the TD), as well as the internal structure of each bag. Recall, first, that in a very sparse ER graph with expected degree greater than 2 log 2, but still sufficiently sparse, there are three different parts of the random network (two parts which may be viewed as core-like, one part which may be viewed as periphery-like) [90].
The core-like part of these graphs is biconnected, and it has an expander-like inner core (i.e., a set of nodes of "higher" degree), surrounded by an outer core which has long chains of nodes (forming sparse cycles). The third, peripheral, part of the network consists of tree "whiskers" that hang off the biconnected core. A similar structure has been observed empirically when looking at low-conductance clusters/communities in a wide range of large social and information networks [1] and also when looking at the Gromov hyperbolicity and k-core properties of these real-world networks [3]. This contrasts sharply with the denser ER graphs, which are much more regular in terms of their degree variability, core structure, etc. Our results (here on extremely sparse ER graphs and below on PL graphs and many real-world graphs) demonstrate that TD heuristics can reflect this core-periphery structure.

(a) ER(1.6): the largest bag in this figure contains 80 nodes. (b) ER(32): the largest bag in this figure contains 3864 nodes.

Figure 7: Visualization of ER(1.6) and ER(32) amd tree decompositions, colored by the density of the bag subgraph. For visualization purposes, the two networks are not drawn to the same scale. The bags in ER(32) have widths that are approximately 50 times larger than those of ER(1.6). The blowups show the upper-left corner of the visualization in greater detail. The blowups show the color of some of the smaller bags that are in the peripheral part of each TD. In ER(1.6), many of the very small bags are red (meaning they contain a clique, the vast majority of which are simply a single edge). Slightly larger bags are light blue or yellow (indicating an edge density of ca. 0.25 (light blue) to ca. 0.75 (yellow)). In ER(32), all of the bags of the TD are dark blue, indicating that these bags are all very sparse, regardless of whether they are peripheral or central to the TD. The statistics in Table 4a confirm this.

Next, Figure 8 presents visualizations of three typical amd bags for
ER(1.6) and
ER(32), respectively. In each case, the three bags are the most central (lowest eccentricity) bag in the TD (which we call the central bag), a typical bag that is a leaf in the TD (a periphery bag), and a typical bag that is in between these two in the TD (an intermediate bag). The color-coding is by k-core number, with high core nodes being red and low core nodes being blue. Note that the central bag for
ER(32) is well-connected; and that the intermediate and peripheral bags for
ER(1.6) are small, the latter consisting of only a single edge, while for
ER(32) both the intermediate and the peripheral bag have non-trivial internal structure.

(a) ER(1.6) central bag subgraph. (b) ER(1.6) intermediate bag subgraph. (c) ER(1.6) peripheral bag subgraph. (d) ER(32) central bag subgraph. (e) ER(32) intermediate bag subgraph. (f) ER(32) peripheral bag subgraph.

Figure 8: Bag subgraphs of an amd
TD of
ER(1.6) and
ER(32) graphs, colored by the k-core number of the node (red is high k, blue is low k). The central bag is the largest bag in the TD and one of the bags of minimum eccentricity; the peripheral bag is a leaf in the TD graph, and it achieves the minimum width in the TD; the intermediate bag is in between these two extremes.

The increased density of ER(32) over
ER(1.6) is the obvious cause of these differences, but it is worth considering what structures, produced by the increased density, affect the formation of the amd TD. Recall from the toy graphs that heuristic TDs of cycles produced bags which had disconnected nodes. There were several different ways of producing the decomposition, but any TD of a small width on a cycle includes disconnected nodes in most bags. The more complex SmallPlanar has many small overlapping cycles. In that case, the heuristics have to put many nonadjacent nodes into a bag. Essentially, cycles force distant nodes into the same bag, and many overlapping cycles will force many distant nodes into the same bag.

This intuition suggests (and we have confirmed by inspection) that a bag with many disconnected nodes, as in the central bag of
ER(1.6) shown in Figure 8a, is due to a large number of overlapping cyclical structures. The intermediate bags of ER(1.6) contain nodes from the long, overlapping cycles of the outer core (and as these cycles do not overlap as much in the periphery, these bags have fewer nodes), while the peripheral bags each contain a single edge, capturing the small trees on the periphery of the network (see also Figure 7). The coloring of the nodes indicates the core-periphery structure of the subgraph induced by the bags. In ER(1.6) there is only a 1-core (blue) and a 2-core (red); thus the red nodes in the central bags are all in the 2-core, while the peripheral trees are in the 1-core, which agrees with [90].

On the other hand, in
ER(32), whose core-periphery structure spans from a 7-core (blue) to a 23-core (red) (although almost all of the nodes (94%) are in the 23-core), the central bag contains a relatively tightly-connected mass of 77% of the nodes in the network. This begins to look more like SmallER, which is a very dense ER network. The intermediate bags contain sparser structures (with some of the disconnected nodes and edges that are indicative of cyclical structures); and, although the peripheral bags still contain the smallest structures, in ER(32) they no longer contain only a single edge. This indicates that even the sparsest regions contain cycles and other complicated structures (but very few triangles, which agrees with the small clustering coefficient of these networks).

5.1.3 Large-scale organization
To provide a more quantitative evaluation of these ideas and to characterize better the large-scale organization of these synthetic networks, consider Figures 9, 10, and 11. These figures plot bag cardinality histograms, average bag density versus bag cardinality (bag cardinality is width + 1), and average k-core versus bag eccentricity for two ER networks (as well as a suite of PL and real-world networks). We will refer to other subfigures below, but for now consider only Figures 9a and 9e, Figures 10a and 10e, and Figures 11a and 11e for results on ER(1.6) and
ER(32), respectively.

(a) ER(1.6). (b) PL(2.5). (c) as20000102. (d) CA-GrQc. (e) ER(32). (f) PL(3.0). (g) FB-Lehigh. (h) PowerGrid.

Figure 9: Bag cardinality histograms with cumulative fraction of bags for a representative set of networks. For all of the networks, there are many more small-cardinality bags than large-cardinality bags. This is consistent with a TD structure which has a few central bags which quickly taper and branch off into many small peripheral bags. As we will see in Figures 10 and 11, in networks with a strong core-periphery structure (e.g., not
PowerGrid), these peripheral bags tend to have a low average k-core and high relative density. FB-Lehigh has the most large bags, due to the tendency of the FB networks to form long, path-like trunks in their TDs. PowerGrid has the smallest tapering effect; although the largest bags are still at the center of the decomposition, there is only a small change in size from the largest bags to the smallest, presumably since this network has the weakest core-periphery structure.

We saw in Figure 8a the central bag for
ER(1.6), and we interpreted it in terms of the output of the amd TD as due to overlapping cycles; the histograms in Figure 9a show that for ER(1.6) (and the networks in the rest of Figure 9) there are only a very few such central bags. On the other hand, Figure 9 also shows that there are many very small bags in ER(1.6). Two features we noticed about these two types of bags are that there is a change in edge density between the large and small bags and that there is a change in the k-cores represented between the large and small bags. Figure 10 and Figure 11, where the average edge density of a bag is plotted against the bag cardinality, show two ways of measuring this. These figures show that small peripheral bags are dense (relative to their small size—in the extreme case, this could be a single edge), and they contain low k-core nodes, as indicated, e.g., by the downward slope of the plots in Figure 11a.

Many of the results for ER(32) are very different from those for ER(1.6). The histograms in Figure 9a show that this network has a much larger proportion of high-width bags than
ER(1.6). The largely homogeneous core-periphery structure of dense ER networks should also be clear, since the nodes, regardless of bag size, are mostly in the deepest core (k = 23).

(a) ER(1.6). (b) PL(2.5). (c) as20000102. (d) CA-GrQc. (e) ER(32). (f) PL(3.0). (g) FB-Lehigh. (h) PowerGrid.

Figure 10: Average bag density versus bag cardinality plots for a representative set of networks. In the PL networks the small bags are dense—in the extreme case consisting of a single edge—like the ER(1.6) network; but the largest bags are larger and mostly connected, similar to the intermediate and central bags of ER(32). The real-world networks all show denser bags even at large size scales as compared to the synthetic networks, and this is due to the increased clustering present in these networks.

These trends can be seen by comparing the density of the smallest bags in
ER(1.6) and
ER(32) in Figures 10a and 10e. The flat plot of the average k-core in Figure 11e, which holds steady close to the value of the maximum k-core, indicates the lack of a core-periphery structure in the network.

Putting all of these results together, we can conclude that when it exists (e.g., in extremely sparse ER graphs), the core-periphery structure of ER networks is captured by the amd TD; and when the core-periphery structure does not exist (e.g., for ER graphs at other, even moderately sparse, values of p), the large width of the TD indicates that most of the network is in the largest bag, which is analogous to most of the nodes being in the core of the network.

Here, we give a summary of results of an analysis of TDs on PL random graphs, with an emphasis on the behavior as the degree heterogeneity parameter γ is varied. Recall that Table 1 provides basic statistics for the PL graphs. PL graphs are a class of ER-like random graphs, except that degree heterogeneity is exogenously specified. Previous work has shown that PL graphs have important similarities with extremely sparse ER graphs, when one is interested in small-scale versus large-scale tree-like structure [1, 3, 2]. In particular, the increased degree heterogeneity produces a large-scale core-periphery structure in the PL networks, similar to the extremely sparse ER networks, but these PL networks also have some of the characteristics of denser ER networks (e.g., the core is more strongly connected and the diameter of the network is smaller).

We start with Table 4b, which shows the basic features of the TDs of PL networks. PL(3.0) has the least amount of degree heterogeneity and has similar characteristics to
ER(1.6), while the lower degree exponents (
PL(2.75), PL(2.5)) have characteristics similar to both the dense and sparse ER networks. Most notably, the maximum width increases (as it would if the density increased), while the median width and median bag density stay the same (low and high, respectively), as in ER(1.6).

(a) ER(1.6). (b) PL(2.5). (c) as20000102. (d) CA-GrQc. (e) ER(32). (f) PL(3.0). (g) FB-Lehigh. (h) PowerGrid.

Figure 11: Average k-core versus bag eccentricity for a representative set of networks. The correlation between the core-periphery structure and the central-perimeter bags can be seen in a downward slope in these plots. Networks with no prominent core-periphery structure (ER(32) and PowerGrid, for two very different reasons) have a flat plot here; while networks with moderate core-periphery structure (the PL graphs and ER(1.6)) have a downward sloping line, but relatively shallow (i.e., not deep) cores. as20000102 and CA-GrQc both have prominent, deep core-periphery structures that reveal themselves in this plot. The dips that show up at small eccentricities in several of the synthetic networks and CA-GrQc are due to the many small "whiskers" (in the sense of [1]) that hang off of the core bag. FB-Lehigh also has a deep core-periphery structure (in the sense of k-core decompositions); but because of the long path-like nature of the TD and since most of the nodes are in the deepest cores, the plot is flat with larger downward dips as the bag eccentricity increases.

In the previous section, we saw that the low median width and high density were related to the presence of a core-periphery structure in the network. As we will see, this is also true of the PL networks, and the amd
TDs are again able to capture this structure. Among other things, we find that for the PL networks, for a given average degree, the presence of very high degree nodes that tend to link to each other means that the density of the high-width bags, i.e., the core of the network, is greater, making it more like the cores of the denser ER networks. On the other hand, the peripheries of these PL networks are still very sparse, and TD bags including them look more like the peripheries of the sparse ER networks. The periphery results are reflected in Table 4b, where we see that the median width is low and the median density is high (many bags with only a single edge within). The visualizations of the central, intermediate, and peripheral bags from TDs of PL networks in Figure 12 reflect this. In particular, the central bag for PL(2.5) looks somewhat like the intermediate
ER(32) bags, while the central bag for
PL(3.0) is much less well-connected; and the peripheral bags for both PL graphs look like the ER(1.6) bags. The TD reflects the core-periphery structure via the central-peripheral bags, as reflected in the downward slope of Figure 11b.

As the power law exponent γ is increased, recall that the amount of degree heterogeneity in the resulting network is reduced, i.e., the number of high degree nodes specified by the power law degree distribution is decreased.

(a) PL(2.5) central bag subgraph. (b) PL(2.5) intermediate bag subgraph. (c) PL(2.5) peripheral bag subgraph. (d) PL(3.0) central bag subgraph. (e) PL(3.0) intermediate bag subgraph. (f) PL(3.0) peripheral bag subgraph.

Figure 12: PL(2.5) and PL(3.0) bag subgraphs, colored by k-core number of the node.

As a necessary consequence of maintaining this distribution as the nodes are connected, the high degree nodes are likely to be connected to other high degree nodes. This causes a core-periphery structure to emerge (see [3] for empirical measurements of the relationship between γ and the k-core structure).

For example, the core-periphery structure of PL(3.0) is shallower than that of
PL(2.5), as seen in Figure 11. Similarly, the width of the
PL(2.5) amd
TD is larger than that of the
PL(3.0) amd
TD. In all cases these widths are less than the corresponding
ER(2) network, whereas one might expect these networks to have larger widths because of the increased core-periphery structure. This occurs because there are several factors to consider as γ is decreased. The core does become denser as more edges are added to the core, causing these nodes to become more difficult to separate; but most of those extra edges come from the outer regions of the expander-like core, thus shrinking the size of the core and increasing the size of the periphery. In other words, when only a few medium degree nodes are added, then there are still cyclical structures in the core, as is observed in ER(1.6), except smaller; but as higher degree nodes are added, the core becomes denser and begins to become larger, as this forces larger and larger pieces of the core to be placed in the same bag, as is observed in
ER(32).

The TD results on ER and PL graphs demonstrate that TDs (in particular, with the amd heuristic) can capture the core-periphery structure of two common random network models. In both cases, to the extent that there was a core-periphery structure (which itself depended on the sparsity parameter p or the degree heterogeneity parameter γ), the central and peripheral bags in the TD from amd were correlated with this structure. The peripheral bags were smaller and much sparser than the central bags and contained nodes from the shallow (low) k-cores of the network. With the exception of the tree-like periphery of the extremely sparse ER and the PL networks, the structures observed in TDs of the random network models were largely driven by loosely connected core structures (e.g., overlapping loops in sparser regions and expander-like cores in the denser regions). This is consistent both with the results on the toy networks and previous work involving the structure of ER networks [99, 90]. (The one exception to this is the denser ER(32), where the central bag contained 77% of the network; in this case, the network does not exhibit the core-periphery structure of the other networks looked at in this section.) We will see how we obtain similar results when applying the amd heuristic to real-world networks.
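The quantity summarized above and plotted in Figure 11, the average k-core number of a bag's nodes as a function of the bag's eccentricity in the TD, can be sketched as follows. This is a minimal illustration of ours, again using networkx's min-degree heuristic as a stand-in for amd and a small ER graph as a stand-in network.

```python
# Sketch: average k-core number of a bag's nodes versus the bag's
# eccentricity in the tree decomposition (the Figure 11 quantity).
from collections import defaultdict

import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def avg_kcore_by_eccentricity(G):
    G = G.subgraph(max(nx.connected_components(G), key=len)).copy()
    core = nx.core_number(G)         # k-core number of every node
    _, td = treewidth_min_degree(G)  # TD: a tree whose nodes are bags
    ecc = nx.eccentricity(td)        # eccentricity of each bag within the TD
    sums, counts = defaultdict(float), defaultdict(int)
    for bag in td:
        sums[ecc[bag]] += sum(core[v] for v in bag) / len(bag)
        counts[ecc[bag]] += 1
    return {e: sums[e] / counts[e] for e in sorted(sums)}

# For a sparse ER graph (only a 1-core and a 2-core), central low-eccentricity
# bags should contain deeper-core nodes on average than peripheral bags.
curve = avg_kcore_by_eccentricity(nx.gnp_random_graph(400, 1.6 / 399, seed=1))
print(curve)
```

A downward trend in this eccentricity-to-average-core mapping is what the text reads as core-periphery structure; a flat curve corresponds to the dense ER and PowerGrid cases.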
In this section, we will describe the results of using a variety of TD heuristics on a set of real-world networks. Our goal is to use the insights from the previous sections to evaluate the performance of existing TD heuristics on real social and information networks and to understand how those TDs can be used to obtain an improved understanding of the properties of these realistic networks.

Our main results in this section are three-fold. First, in Section 6.1, we summarize results of a detailed empirical evaluation of the amd
TD heuristic applied to our suite of realistic networks. The main focus is to illustrate how these TDs capture previously-identified core-periphery structure, and also to illustrate how the internal structure of TD bags can be understood in terms of large-scale cycles and small-scale clustering in the original graph. Second, in Section 6.2, we evaluate the ability of amd to identify small-scale good-conductance communities such as those previously identified by the NCP with local spectral methods [1, 2]. We show connections between bags that are more peripheral in the TD and small good-conductance communities responsible for "dips" in the NCP. Third, in Section 6.3, we illustrate that TD heuristics can be used to identify certain other types of large-scale non-conductance-based "ground truth" communities. In particular, we will show connections between bags that are more central in the TD and large-scale community-like (by a "ground truth" metric but not by conductance quality) clusters.
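When experimenting with TD heuristics as described here, it is useful to verify that the output actually satisfies the defining conditions of a tree decomposition. The sketch below is ours, not the paper's; it uses networkx's min-fill-in heuristic (another standard heuristic) on a small stand-in graph.

```python
# Sketch: sanity-check the three defining conditions of a tree decomposition:
# every vertex is covered, every edge lies inside some bag, and the bags
# containing any fixed vertex induce a connected subtree of the TD.
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_fill_in

def is_valid_td(G, td):
    bags = list(td)
    # 1. vertex coverage
    if set().union(*bags) != set(G):
        return False
    # 2. edge coverage: both endpoints of each edge share a bag
    if not all(any(u in b and v in b for b in bags) for u, v in G.edges()):
        return False
    # 3. running intersection: bags containing v form a connected subtree
    for v in G:
        if not nx.is_connected(td.subgraph([b for b in bags if v in b])):
            return False
    return True

G = nx.karate_club_graph()
width, td = treewidth_min_fill_in(G)  # heuristic width and decomposition
print(width, is_valid_td(G, td))
```

Both networkx heuristics construct decompositions from elimination orderings, so they always satisfy these conditions; the check is useful when modifying or post-processing a decomposition.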
Here, we will describe the results of an empirical evaluation of the amd
TD heuristic applied to our suite of real-world networks (those in Table 1). We will begin in Table 5 with a brief survey of all of our real networks, and we will then focus on four representative networks: as20000102, CA-GrQc, FB-Lehigh, and
PowerGrid. The first three all exhibit some form of previously-recognized core-periphery structure [3], while PowerGrid is known to lack a strong core-periphery structure (basically since it is heavily tied to the underlying locally-Euclidean geometry of the Earth [3]). Note, though, that the Facebook networks are very core-heavy, in the sense that they have many nodes in deep cores, essentially because of their significantly higher average degree (see, e.g., [3] and Table 1). (Thus, informed by previous results on k-core decompositions and related tree-like techniques [3, 75, 1, 87], we expect to see evidence of the core-periphery structure in the TDs associated with as20000102, CA-GrQc, and, to a lesser extent,
FB-Lehigh , but a lack of substantial core-periphery structure in the TD of
PowerGrid .) In Table 5, we present the number of bags in the amd
TD (N_amd), the maximum eccentricity (diameter) of the TD (E_amd), the maximum and median width of the TD (W and ~W, respectively), and the median bag density (~D). These measurements provide us with an idea of how large the most connected part of the network is (maximum width); how numerous small bags are (median width), which is indicative of areas of the network that have small separators, in our case small peripheral regions of the network; and whether the small separators are more clique-like or consist of mostly disjoint nodes (median density), with disjoint nodes being indicative of cycles and clique-like structures being indicative of more meaningful communities. A large maximum width combined with a low median width is evidence for a deep core and a shallow periphery, and a high median density is evidence for a periphery based on more community-like separators, rather than more disparate separators. These observations assume that high density bags are mostly small-width bags. This assumption is plausible, given that as the width w of a bag increases, the number of edges required to maintain a constant density increases like w^2; and in many cases we have confirmed this assumption indirectly or by direct observation. For example, see our discussion of bag density and bag width below, as well as Figure 10 below for empirical evidence that high density bags are generally the smallest width bags.

(We should note that we ran these computations with many different TD heuristics. In most of this section, however, we only show results from the (most scalable) amd heuristic. This is simply for brevity. There were some differences from heuristic to heuristic, but we feel this one is representative of the type of behavior found.)
Network        N_amd   E_amd   W      ~W     ~D
CA-GrQc
CA-AstroPh
as20000102
Gnutella
Email-Enron
FB-Caltech     395     30      357    18     0.53
FB-Haverford   516     56      891    37.5   0.38
FB-Lehigh
FB-Rice
FB-Stanford
PowerGrid
Polblogs       899     49      294    6      0.57
road-TX                170     197    3      0.5
web-Stanford           500     1419   5      0.83
Table 5: Statistics for TDs of real networks. Notation is the same as in Table 4.

Networks based on an underlying Euclidean geometry (e.g., road-TX, PowerGrid) have low maximum widths and low median widths, which indicates that they do not have a strong core-periphery structure. While these networks have many small width bags, which is indicative of tree-like sections of the network, the internal subgraphs have a low median density (e.g., as compared to certain ER networks). More social networks, such as Polblogs and the Facebook networks, all have higher average degrees and, consequently, higher widths, with lower median widths. This is one indicator of a core-periphery structure. As the median widths are higher in these networks, as compared to the other real networks (although lower than the ER networks), the peripheral structure in these social graphs tends to be denser than in the other networks. Also, the median density, while low compared to the other real networks, is very high compared to the median density of the densest ER networks in Table 4a. Thus, although the periphery is more difficult to separate into small good-conductance community-like clusters than in some of the other real networks, e.g., CA-GrQc or CA-AstroPh, it is still formed from more community-like pieces than in similar ER (or PL) networks.

Observe also that the two web networks,
Gnutella and web-Stanford, have high widths, like the Facebook networks, but lower median widths; and that they have higher median densities (especially when compared to ER(4), ER(8), and ER(16)). This indicates that these networks have a sparser, more tree-like periphery than other social networks, which is also consistent with previous results [1]. Importantly, and also consistent with previous results [1], the sparsest, most tree-like peripheries belong to the collaboration, email, and autonomous systems networks (CA-GrQc, CA-AstroPh, Email-Enron, as20000102). These networks all have low median widths, high median densities, and high maximum widths, indicating that they exhibit the cleanest core-periphery structure, also consistent with the upward-sloping NCPs [1, 2].
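The interpretation of TD bags as small separators, which underlies the width and density statistics just discussed, can be checked directly: in any valid tree decomposition, deleting the vertices of a bag leaves each remaining connected component of the graph confined to a single subtree of the TD. The following is a minimal sketch of ours, using networkx's min-degree heuristic as a stand-in for amd and Zachary's karate club as a small stand-in network.

```python
# Sketch: verify the separator property of TD bags. After deleting a bag's
# vertices, no connected component of the remainder may straddle two
# different subtrees of the TD minus that bag.
import networkx as nx
from networkx.algorithms.approximation import treewidth_min_degree

def bag_separates(G, td, bag):
    rest = G.subgraph(set(G) - set(bag))
    td_rest = td.subgraph(set(td) - {bag})
    # map each surviving vertex to the TD subtree whose bags contain it
    subtree_of = {}
    for i, comp in enumerate(nx.connected_components(td_rest)):
        for b in comp:
            for v in b:
                if v not in bag:
                    subtree_of[v] = i
    # every leftover component must lie within exactly one subtree
    return all(
        len({subtree_of[v] for v in comp}) == 1
        for comp in nx.connected_components(rest)
    )

G = nx.karate_club_graph()
_, td = treewidth_min_degree(G)
print(all(bag_separates(G, td, bag) for bag in td))
```

This is why a low median width reads as "many small separators": most bags of the TD cut off some piece of the network using only a handful of vertices.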
We will now look at several representative networks in greater detail. Let us start by discussing Figures 9, 10, and 11 from Section 5. Figure 9 clearly shows that most of the bags in the TDs are small-width bags. In fact, proportionally,
FB-Lehigh has the largest fraction of large bags, and yet 80% of the bags are below width 200 (in a graph with 5073 nodes, where the TD has a maximum width bag of 2983). Since the bags and edges of TDs form separators in the network, this indicates that there are many relatively small separators. These are the largest in Facebook networks, where the separators tend to have around 100 nodes, while in most other networks many of the separators have around 10 nodes. This is consistent with typical views of core-periphery structure, with a few more highly-connected nodes in the core and many less well-connected nodes in the periphery. That is, in order to separate off most pieces of the periphery (where "piece" is defined by the end of branches in the TD), only 10 or fewer nodes are needed for most of the social/information networks, while ca. 100 nodes are needed for the Facebook networks.

In Figure 10, we see the average edge density for bags of a given cardinality plotted against the bag cardinality, showing that small-width bags have high densities. An important distinguishing feature of the three representative real networks (that are not tied to an underlying Euclidean geometry) is that the curve has a heavier tail than in the synthetic networks. This indicates that separators, up to much larger size scales, are less disparate (e.g., are denser or clumpier) than in the synthetic networks. In PowerGrid, on the other hand, the underlying Euclidean geometry leads the density to fall off more quickly. It falls off similarly to the sparse ER network, except that the tail of the curve is shorter. In this case, only the smallest bags have tight separators.

In Figure 11, we consider the relationship between the core structure and low eccentricity (central) bags, and we compare that with the relationship between the periphery structure and the high eccentricity (perimeter) bags in the TD. Figures 11c and 11d show that for as20000102 and CA-GrQc there is a clear downward trend as the bag eccentricity is increased. This indicates that low eccentricity bags contain more high k-core nodes on average and that the high eccentricity bags contain more low k-core nodes on average. Figure 11g shows that FB-Lehigh, due to its greater density, has a mostly flat profile until the most extreme reaches of eccentricity are met, at which point some of the bags begin to contain nodes of a lower k-core. Thus, the core-periphery structure is present in FB-Lehigh, but it is moderated by a very large core which produces long path-like sets of nodes that in turn lead to large core bags and hence a much larger eccentricity. (This is typical of the results for most of the Facebook networks, which is consistent with their flat NCP [2].) Finally, PowerGrid, which is not expected to exhibit a correlation between k-core structure and bag eccentricity, has a flat profile.

To illustrate these findings, we present visualizations in Figures 13 and 14. Shown are a central or very deep core bag, a perimeter or very peripheral bag, and an intermediate bag, for each of our four networks. These figures show the community-like nature of typical bags for the three information networks, i.e., as20000102, CA-GrQc, and FB-Lehigh, as well as the more disparate separators of the
PowerGrid. The coloring of the visualizations in these figures is by k-core: the red nodes are in deep (high) k-cores while the blue nodes are in shallow (low) k-cores.

(a) as20000102 central bag. (b) as20000102 intermediate bag. (c) as20000102 perimeter bag. (d) CA-GrQc central bag. (e) CA-GrQc intermediate bag. (f) CA-GrQc perimeter bag.

Figure 13: as20000102 and CA-GrQc amd bag subgraphs, colored by k-core number, with red indicating deep/high k-cores and blue indicating shallow/low k-cores.

(a) FB-Lehigh central bag. (b) FB-Lehigh intermediate bag. (c) FB-Lehigh perimeter bag. (d) PowerGrid central bag. (e) PowerGrid intermediate bag. (f) PowerGrid perimeter bag.

Figure 14: FB-Lehigh and PowerGrid amd bag subgraphs, colored by k-core number, with red indicating deep/high k-cores and blue indicating shallow/low k-cores.

One final observation we would like to make is to address the "dips" in the average k-core curves shown in Figure 11 (e.g., the dip in Figure 11d at a bag eccentricity of 21 or in Figure 11g throughout). These dips are due to what we will call "twigs," where a twig is a small (low width and short) branch off of a much larger (high width and long) trunk-like structure of the TD. For example, in FB-Lehigh and in the other Facebook networks, the high average degree results not only in high widths, but in larger collections of bags of high width. These are arranged in a long path (a "trunk") with many branches at either end. Along this main trunk, there are occasional twigs which contain peripheral nodes. Since the trunk is long, the average k-core at the point where the twig is attached is slightly lower, resulting in the dip in the curve. In Figure 15, we provide a visualization of the twigs responsible for three of these dips.

These empirical observations suggest that many realistic social/information networks have a non-trivial core-periphery structure; and that in many cases this is caused by many small overlapping cluster-like or moderately clique-like structures. That is, there is local non-tree-like (combinatorial) structure that "fits together" into a global core-periphery structure that is tree-like (in a metric and/or cut sense) when viewed from large size scales. This is in sharp contrast with many models and intuitions. Most obviously, this is in contrast with the random networks (in particular, the not extremely sparse ER networks and, to a lesser extent, the PL networks, but many other more popular random generative models) which have a locally tree-like, but globally loopy structure. Less obviously, this is also in sharp contrast with networks such as
PowerGrid, PlanarGrid, and road-TX that are strongly tied to an underlying Euclidean geometry. Said another way, many realistic social/information networks have a more tightly-connected core-like structure than is present in typical random networks, and they have peripheral and intermediate regions that are "clumpier" than these random networks. While these claims are perhaps intuitive, our empirical observations demonstrate that they can be meaningfully identified with TDs and interpreted as leading to large-scale cut-based tree-like structure.

Interestingly, aside from the local clumpiness, the real-world social/information networks do have a core-periphery structure that is reminiscent of that which is also seen in extremely sparse ER graphs and PL graphs with greater degree heterogeneity. (This too is consistent with prior results suggesting that extreme sparsity coupled with randomness/noise is responsible for the dips in the NCP [1, 2].) It is also worth emphasizing that in most of the intermediate bags of the real social/information networks, there are still a small number of disconnected nodes. This indicates that there are still a small number of alternate paths, which are disparate from the clusters, to the rest of the nodes in the network. The most prominent exceptions to these general observations are networks that either do not have a strong core-periphery structure, e.g.,
PowerGrid, which is tied to a two-dimensional underlying Euclidean geometry, or networks that have a relatively low clustering coefficient, e.g.,
Gnutella09. In both of these cases (but for different reasons), the internal subgraphs of the intermediate and peripheral bags have a larger number of disconnected nodes than in the other realistic networks.

Figure 15: Twigs on FB-Lehigh and CA-GrQc: (a) CA-GrQc twigs on central bag; (b) FB-Lehigh small twig on central trunk; (c) FB-Lehigh large branching twig on central trunk. Bags are colored by density, with blue indicating low density and red indicating high density. A small twig and a larger, branching twig on the FB-Lehigh trunk are shown. These twigs, combined with the long, path-like trunk, cause dips in the k-core eccentricity plot. In CA-GrQc, the concentration of the twigs on one or two central bags causes only a single, large dip, as compared to the multiple dips in FB-Lehigh.

The synthetic networks in Figure 11 (PL(2.5), PL(3.0), and ER(1.6)) also have twigs similar to
CA-GrQc.

Here, we will consider how the peripheral part of the tree-like core-periphery structure identified by TDs relates to the low-conductance clusters/communities previously identified by the NCP method [1, 2]. To do so, observe that one way to determine whether a TD "captures" clustering/community structure is to see whether those clusters/communities are well-localized in the TD. By "well-localized," we mean here that the cluster/community is contained in a relatively small number of (contiguous) bags. We followed previous personalized PageRank (PPR) local spectral procedures [101] to generate a set of candidate clusters [1, 2]. Then, given a set of candidate clusters, we looked at how many bags in the TD contain at least one node from each cluster, i.e., we measured how well-localized the community is in the TD. As a crude threshold, we consider a cluster/community to be localized if it is contained in fewer bags than there are nodes in the community. We apply this method using the amd heuristic.

To understand this threshold, consider the following examples: if a community of size n is a tree, e.g., the whiskers in ER(1.6), then it will be contained in n bags in the (ideal) TD; if the community is a "clique whisker," i.e., a clique connected to the rest of the network by only one edge, it will be contained in just one or two bags; and if the community contains deep core nodes which are connected to many nodes outside of the community, the community will be spread across many bags in the network. Other measures of TD locality showed similar results.

In the localization plots below, the red curve shows the number of bags containing at least one node from the community in the corresponding NCP plot, and the green dashed line represents the locality threshold. When the number of bags for a given community is localized by our definition, the red plot will be below the green threshold.
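The bag-localization measure just described is straightforward to compute once a TD is in hand. The following is a minimal sketch (the list-of-sets bag representation and the function names are our own illustration, not the implementation used in the paper):

```python
def bag_localization(bags, community):
    """Number of bags of a tree decomposition that contain at least one
    node of the given community."""
    community = set(community)
    return sum(1 for bag in bags if community & set(bag))

def is_localized(bags, community):
    """Crude threshold from the text: a community is 'localized' if it
    touches fewer bags than it has nodes."""
    return bag_localization(bags, community) < len(set(community))

# Toy example: a TD of the path a-b-c-d-e whose bags are the edges.
bags = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]
print(bag_localization(bags, {"a", "b"}))   # 2 bags touched
print(is_localized(bags, {"a", "b", "c"}))  # touches 3 bags, size 3 -> False
```

For a tree-like community, the count approaches the community size, matching the whisker example above.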
Figure 16: ER(1.6) and ER(32) NCP plots (panels (a)–(b)) and tree localization plots (panels (c)–(d)). The localization threshold is plotted in green.

Figure 17: CA-GrQc and FB-Lehigh NCP plots (panels (a)–(b)) and tree localization plots (panels (c)–(d)). The localization threshold is plotted in green.

Figure 18: as20000102 and Gnutella09 NCP plots (panels (a)–(b)) and tree localization plots (panels (c)–(d)). The localization threshold is plotted in green.

Figure 19: Email-Enron and Polblogs NCP plots (panels (a)–(b)) and tree localization plots (panels (c)–(d)). The localization threshold is plotted in green.

Figure 20: Planar and PowerGrid NCP plots (panels (a)–(b)) and tree localization plots (panels (c)–(d)). The localization threshold is plotted in green.

As a reference, consider the extremely sparse and the somewhat denser ER networks, shown in Figure 16. Since it is so sparse, ER(1.6) does have some very small good-conductance clusters. As shown in the figure, however, these small "communities" are contained in roughly as many bags as there are nodes in the community. This is expected, as these communities are largely peripheral tree-like whiskers in the network (Section 5). For larger communities, which include core nodes, the localization is slightly above the line defining our threshold. On the other hand, for the denser
ER(32), there are no good-conductance clusters at any size, and bag localization is above the line defining the localization threshold, indicating that localization is poor at all size scales.

For the small and intermediate-sized clusters in many of the real networks (including many of those from [1]), the smaller good-conductance clusters found using the PPR method are reasonably well-localized within the TD, while the larger poorer-conductance clusters are not. Consider, e.g., CA-GrQc in Figure 17 as an example. On the other hand, both large and small clusters found with the PPR method applied to the denser graphs from the Facebook100 set (i.e., those that do not have even small-cardinality good-conductance clusters [2]) are not well-localized in the TD. Consider, e.g., FB-Lehigh in Figure 17 as an example. Figure 18 shows as20000102 and Gnutella09, which also have NCP plots that do not yield small good-conductance clusters, and for which the outputs of the PPR method are not particularly well-localized in the TD. Figure 19 shows that Email-Enron does have some of its small good-conductance clusters well-localized, and it also shows that the output of the PPR algorithm applied to Polblogs leads to medium-to-large clusters with poor conductance values that are poorly-localized in the TD.

Finally, although networks with an underlying Euclidean geometry are of less interest for social/information network applications, for completeness it is worth considering how these TD methods apply to them. Figure 20 presents results for
Planar and
PowerGrid. Both of these networks have downward-sloping NCP plots, different from those of the other social and information networks, reflecting the Euclidean geometry underlying these networks. In both cases, fairly uninteresting results are obtained, suggesting that the localization metric we propose is more interesting for realistic social graphs with non-trivial tree-like core-periphery structure.

Although our results demonstrate that good-conductance clusters/communities in several realistic social graphs are well-localized in TDs found with existing heuristics, it is not obvious how to address the reverse question of finding good-conductance communities from a TD. One could attempt to look at all, or some large number of, combinations of bags in the TD. Since one is usually interested in well-connected communities/clusters, the running intersection property of TDs could be used to restrict attention to connected subsets of a TD. There are, however, two obvious issues. First, there does not exist an obvious analogue of the "sweep cut" used in the spectral partitioning method for finding the best community from a TD. Second, as a related practical matter, the presence of high-degree (or deep-core) nodes in the intermediate and central bags of a TD causes bags to be poor-conductance communities. These nodes have many connections and increase the "surface area" of most cuts, even if there are only a small number of them in a cluster. We observed that, in the clusters we found using the PPR method, each cluster is typically well-represented by a set of small bags plus a couple of nodes in the larger bags. If we then attempt to form clusters by combining bags, we get all of the nodes in the larger bags, including deep-core nodes. Additional methods of filtering nodes from the larger bags, such as ordering by node degree or k-core combined with a sweep cut, may improve these results.
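The "surface area" issue described above can be made concrete with a small computation. Below is a minimal sketch (our own toy graph and function names, not code from the paper) that forms a candidate cluster as the union of a set of bags and computes its conductance; absorbing a bag containing a high-degree node worsens the conductance even though only one node is added:

```python
def conductance(adj, cluster):
    """phi(S) = cut(S, V \\ S) / min(vol(S), vol(V \\ S)), with volumes
    measured as sums of degrees."""
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_s = sum(len(adj[u]) for u in cluster)
    vol_rest = sum(len(adj[u]) for u in adj) - vol_s
    denom = min(vol_s, vol_rest)
    return cut / denom if denom else float("inf")

def bags_to_cluster(bags, indices):
    """Candidate cluster: the union of the chosen (ideally contiguous) bags."""
    return set().union(*(bags[i] for i in indices))

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
bags = [{0, 1, 2}, {2, 3}, {3, 4, 5}]
print(conductance(adj, bags_to_cluster(bags, [0])))     # 1/7: one cut edge
print(conductance(adj, bags_to_cluster(bags, [0, 1])))  # 0.5: node 3 raises the cut
```

Here the first triangle alone has conductance 1/7, while adding the intermediate bag pulls in the higher-degree vertex 3 and raises the conductance to 0.5, illustrating why naively merging bags that contain deep-core nodes tends to produce poor-conductance communities.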
Here, we will consider other ways in which the output of TDs can be useful in identifying clusters/communities of interest to the domain analyst. In particular, we describe two examples from the demographic data associated with the Facebook100 dataset [93].

Consider, first, Figure 21a, where we show the amd TD of the FB-Haverford network, where each bag is color-coded by the average graduation year of its constituent nodes. There is a large linear or trunk-like structure that dominates the large-scale structure of the TD. We observe that there is strong overlap between the nodes that comprise successive bags in that trunk, and we note that this trunk-like structure is typical of most of the Facebook100 networks (but is not seen in most of the other social graphs we have considered). Also, each end of the long trunk correlates strongly with graduation year, and there is a gradual change in the average graduation year of each bag as we move across the trunk. Thus, to the extent that one accepts graduation year as some sort of easily-quantifiable "ground truth" community, the large bags in the TD of this network seem to be capturing a legitimate ground-truth structure in the network. This fits well with prior results reporting that, in most of the Facebook networks, graduation year is the best predictor of the existence of edges between two nodes [93].

Consider, next, Figure 21b. It is known that for a small number of the
Facebook100 networks (e.g., FB-Caltech, FB-Rice, and FB-UCSC), residence hall rather than class year is the best edge predictor [93]. Thus, we considered the amd TD of (the students-only subset of) FB-Caltech. In this case, a single simple trunk-like structure is not dominant, but there are several relatively large peripheral branches, and many of the peripheral branches are dominated by a particular residence hall. In Figure 21b, the bags are colored by the fraction of students in residence hall 170 (chosen arbitrarily). These examples are of particular interest since good-conductance clusters do not exist in
Facebook100 networks [2].

By looking at bags where the concentration of a particular community's nodes is higher than the incidence of that community throughout the TD, we can form a very simple classification rule.

(a) amd TD of
FB-Haverford, colored by graduation year (red = freshman, blue = alumni). The long, path-like trunk of this (and most other) Facebook networks is driven by the propensity of students to be friends with students of a similar graduation year. (b) amd TD of (the students-only subset of) FB-Caltech, colored by the fraction of students in residence hall 170 (blue = no nodes belong to the residence, ..., red = all nodes belong to the residence).
Figure 21: amd
TD of
FB-Haverford and
FB-Caltech. FB-Haverford is presented, rather than FB-Lehigh (which has similar large-scale TD structure), because its smaller eccentricity (56 rather than 150) makes it easier to visualize. For FB-Caltech, this is a graphical representation of the data presented in Table 7 for the amd
TD and residence hall 170.

In particular, given residence hall X, we collected all bags whose fraction of nodes listed as belonging to residence X was higher than the fraction of nodes belonging to that hall in the network (the incidence in the network is given in column F of Tables 6 and 7). We then kept the largest contiguous set of bags and used membership in this set as the classifier. Although this is an overly simple classifier, the goal in this section is simply to provide a baseline for how the residence communities are located in the TD.

We performed this procedure on the students-only restriction of FB-Caltech. Tables 6 and 7 provide a summary of the classification results using this method on the
FB-Caltech network, with the (anonymized) listed residence hall for each student as the community. Table 6 shows the fraction of the "ground truth" community captured by the largest contiguous set of bags described above. This is analogous to the recall of classifying the community using this branch of the TD. Table 7 shows the fraction of the nodes in the union of all bags in this largest contiguous set which belong to the community. This is analogous to the precision of classifying the community using this branch. Since
FB-Caltech is very small, we can use a much larger variety of TDs as classifiers than is possible for larger networks, and we present results for all of these TD classifiers. Although the communities do seem to be well-captured by the TDs, there are also many other nodes in the same bags as these communities (see Table 7). Although the only bags selected were bags where the residence hall in question was over-represented, combining these bags actually resulted in a lower concentration of residents than was present in the network for some residence halls (see Table 7, where the values are lower than F for a given residence hall). This occurs because the non-resident nodes in each of these bags are different, while the resident nodes are largely the same for each bag in the branch.

In terms of heuristic performance, mindeg, minfill, and amd seem to have similar

Hall   F     mindeg  minfill  lexm  mcs   amd   metnnd
None   .134  .270    .257     .270  .324  .284  .297
165    .066  .472    .528     .556  .528  .472  .861
166    .090  .736    .736     .925  .811  .642  .792
167    .134  .642    .566     .453  .585  .491  .566
168    .116  .746    .762     .952  .889  .825  .143
169    .136  .726    .712     .904  .877  .658  .740
170    .090  .725    .725     .739  .783  .855  .362
171    .136  .714    .673     .776  .857  .592  .429
172    .098  .630    .630     .534  .699  .548  .863
Table 6: Fraction of each FB-Caltech residence hall captured in the largest contiguous set of "frequent" bags. A frequent bag is a bag where the fraction of students who belong to the given residence hall is greater than the fraction of students who belong to that residence in the entire network. Column F gives the fraction of students who identified as being in the associated hall (i.e., the threshold for being a frequent bag for that residence hall). This procedure was also performed for the nodes which did not have a residence hall listed, for comparison.

Hall   F     mindeg  minfill  lexm  mcs   amd   metnnd
None   .134  .065    .064     .068  .071  .071  .100
165    .066  .055    .057     .052  .052  .059  .086
166    .090  .104    .110     .111  .111  .092  .129
167    .134  .102    .097     .092  .087  .088  .092
168    .116  .137    .140     .150  .147  .157  .043
169    .136  .150    .151     .147  .163  .138  .187
170    .090  .134    .139     .145  .140  .171  .103
171    .136  .105    .099     .100  .104  .084  .074
172    .098  .131    .129     .100  .137  .115  .199
Table 7: Fraction of the nodes contained in the largest contiguous set of frequent bags for a given residence hall which actually belong to that residence hall. A frequent bag is a bag where the fraction of students who belong to the given residence hall is greater than the fraction of students who belong to that residence in the entire network. Column F gives the fraction of students who identified as being in the associated hall (i.e., the threshold for being a frequent bag for that residence hall). This procedure was also performed for the nodes which did not have a residence hall listed, for comparison.

performance given a residence, although there seems to be a larger gap between amd and the other heuristics. This is not surprising, as these are all greedy heuristics which work by reducing fill (or minimum degree, which is a proxy for fill) at each step. lexm and mcs also seem to behave similarly, and they have the best performance in terms of recall (Table 6). metnnd has a different profile from the other heuristics and seems to do the best in terms of precision (Table 7). These results are comparable to what can be obtained with other simple classification rules, and they suggest that TDs could be useful in these types of machine learning applications.

Overall, these results demonstrate that, for these realistic social/information networks, several types of plausible "ground truth" communities are well-correlated with the large-scale structure identified by existing TD heuristics. This is striking since these heuristics make local greedy decisions about how to form the TDs, and it suggests that improved results could be obtained in this application by considering TD heuristics designed for graphs with this type of structure.

More details on tree decomposition methods
In this section, we consider the question of whether TDs and their treewidths can be related to other parameters for tree-like structure, specifically the Gromov δ-hyperbolicity. It might appear that there is no relation between TDs and δ (since, e.g., treewidth and δ take on opposite extremal values on cliques and cycles), but there are in fact structural characterizations for when they align. We will present here our new theoretical results relating TDs and δ-hyperbolicity. Although this result is a relatively straightforward extension of previous work [102], and although most of the rest of the paper can be understood without it, we include it here for completeness: first, since motivating prior work in [3] demonstrates an empirical connection between the cut-based tree-like notion from TDs and the metric-based tree-like notion from δ-hyperbolicity; and second, since our results in Section 6 demonstrate the inadequacy of a naïve optimization of treewidth and the importance of large cycles for realistic social graphs.

We start with the following definition, which provides another quality measure of a TD; this was first introduced by Dourisboure and Gavoille [68]. See also [103].
Definition 4.
Let 𝒯 = ({X_i}_{i∈I}, T = (I, F)) be a tree decomposition of a graph G. The length of 𝒯 is defined to be max_{i∈I, x,y∈X_i} d_G(x, y), where d_G(x, y) is the shortest-path distance in G. Analogously to treewidth, the treelength of G, denoted tl(G), is the minimum length achieved by any tree decomposition of G.

It is straightforward to see that the treelength is at most the diameter of G. As with treewidth, finding a tree decomposition achieving minimum length (and in fact the treelength itself) is NP-hard [69]. Given this, one might ask whether treelength and treewidth can be simultaneously approximated. For general graphs, Dourisboure and Gavoille proved a negative result.

Theorem 2. [68] Any algorithm computing a tree decomposition approximating the treewidth (or the treelength) of an n-vertex graph by a factor α or less does not give an α-approximation of the treelength (resp. the treewidth) unless α = Ω(n^{1/5}).

The specific examples used in [68] to prove this negative result are modifications of the two-dimensional mesh (i.e., a lattice), which, due to long induced cycles, is not δ-hyperbolic for small values of δ. This suggests that the situation might be very different for "real-world" graphs, which have small diameter and non-trivial embedding properties into low-dimensional hyperbolic spaces. (This is an open area of research more generally.) Chepoi et al. [104] showed that if tl(G) ≤ λ, then G is λ-hyperbolic, and that a δ-hyperbolic graph G on n vertices satisfies tl(G) ≤
17 + 12δ + 8δ log n. Unfortunately, for many real networks of interest this is not an improvement on the trivial diameter bound, as their diameter alone is less than O(log n). We conjecture that, under minimal additional conditions, a δ-hyperbolic graph with diameter D has treelength at most a function of log D, a vast improvement on both known bounds.

We turn to the question of using additional structural properties to characterize the interplay between δ, tw(G), and tl(G). The following theorem is our main result; it follows from the work of Müller on atomic TDs [102], and its proof is in Section 7.2. As we mentioned in Section 2, this is not the main focus of our paper, but there has been recent theoretical and empirical interest in this and related questions; see, e.g., [66, 67, 68, 69, 70, 71, 72, 73, 74].

Theorem 3. [105] Say a subgraph H of G is geodesic if d_H(u, v) = d_G(u, v) for all u, v ∈ V(H). Let ν(G) be the length of a longest geodesic cycle in G. Then δ(G) ≤ tl(G) ≤ (tw(G) + 1) · ν(G). Further, this result is tight: there is a graph class 𝒢 of unbounded treewidth, containing arbitrarily long geodesic cycles, such that δ(G) = Θ(tw(G) · ν(G)) for every graph G ∈ 𝒢.

In other words, if we can eliminate long distance-preserving cycles and obstructions to low treewidth (large grid minors), then G will embed well in low-dimensional hyperbolic space.

Before we can give the proof of Theorem 3, we need a few additional definitions. First, given a rooted tree T and a node s ∈ T, define T_s to be the subtree of T with root s: T_s := T[{t ∈ T | s is an ancestor of t}]. For a graph G = (V, E) with tree decomposition ({X_i}, T), where T is rooted arbitrarily, for s ∈ T define G_s := G[⋃_{t∈T_s} X_t] to be the graph induced by those bags that are equal to or below X_s in the decomposition.
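As a concrete illustration of Definition 4, the length of a given tree decomposition can be computed directly from unweighted shortest-path distances in G. A minimal sketch (our own representation and toy example, assuming the graph is given as an adjacency dict):

```python
from collections import deque

def all_pairs_dist(adj):
    """Unweighted shortest-path distances via a BFS from every vertex."""
    dist = {}
    for s in adj:
        d = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    q.append(v)
        dist[s] = d
    return dist

def td_length(adj, bags):
    """Length of a tree decomposition: the largest G-distance between
    two vertices that share a bag (Definition 4)."""
    dist = all_pairs_dist(adj)
    return max(dist[x][y] for bag in bags for x in bag for y in bag)

# The 4-cycle 0-1-2-3-0 with the width-2 decomposition {0,1,2}, {0,2,3}:
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(td_length(adj, [{0, 1, 2}, {0, 2, 3}]))  # 2: vertices 0 and 2 share a bag
```

On the 4-cycle this decomposition has length 2, equal to the diameter, consistent with the remark above that treelength is at most the diameter of G.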
We will write N(S) for the neighbors of a set S; more precisely, N(S) = {u ∈ V | (u, s) ∈ E for some s ∈ S} \ S. Finally, for notational convenience, for x ∈ V and e ∈ E, we will write G − x for the graph (V \ {x}, E) and G − e for the graph (V, E \ {e}).

We now define a special type of tree decomposition (so-called atomic tree decompositions) and give a crucial property of all vertices that co-occur in one of its bags.

Definition 5. [atomic tree decomposition, as in [106]] Let G be a graph on n vertices. The fatness of a tree decomposition of G is the n-tuple (a_1, …, a_n), where a_h denotes the number of bags that have exactly n − h vertices. A tree decomposition of lexicographically minimal fatness is called an atomic tree decomposition.

Proposition 1. [Lemma 3.9 in Müller [102]]
Let ({X_i}, T) be an atomic tree decomposition of a connected graph G = (V, E). Then for any two distinct vertices x, y that occur together in some bag X_t, either (x, y) ∈ E or there exists a neighbor s of t in T such that {x, y} ⊆ V_s ∩ V_t.

We also need the following proposition, which follows from Lemmas 3.7 and 3.8 in Müller [102].
Proposition 2.
Let ({X_i}, T) be an atomic tree decomposition of a connected graph G, let e = (s, t) ∈ E(T) be any edge, let T_t be the connected component of T − e rooted at t, and set X = X_s ∩ X_t. Then there exists a connected component C_t of G_t \ X such that N(C_t) = X and X_t ⊆ C_t ∪ X.

Finally, we are ready to give a bound on treelength in terms of a graph's treewidth and its longest geodesic cycle. Our proof relies heavily on tools from [102].
Theorem 4. [105] For any graph G = (V, E) it holds that tl(G) ≤ ν(G) · (tw(G) + 1), where ν(G) is the length of the longest geodesic cycle in G.

Proof. We will prove a stronger statement, namely that any atomic tree decomposition of a two-connected graph has treelength at most ν(G) · (tw(G) + 1). Let us first show how this proves the theorem for graphs that are not two-connected.

Assume G is not two-connected and x ∈ V is a cut vertex (G − x has at least two connected components). Let H_1, …, H_ℓ be the connected components of G − x. If we prove that the graphs G[H_i ∪ {x}], 1 ≤ i ≤ ℓ, have tree decompositions T_i with treelength bounded as in the statement of the theorem, then we can easily construct a tree decomposition for G with the same property: we simply introduce a single new bag V_x = {x} and connect it to an arbitrary bag containing x in each of the individual tree decompositions T_i (since these graphs all contain the vertex x, such a bag must exist). Note that the treelength of this decomposition is simply max_{1≤i≤ℓ} tl(T_i), since the bag V_x we added contains only the vertex x and thus cannot increase the treelength. Since we will show the statement for two-connected graphs in the following, we recursively decompose the graph G over cut vertices until the remaining connected components are all two-connected, and then construct a tree decomposition of G as described above.

We may now assume G is two-connected. Given an atomic tree decomposition ({X_i}, T) of G, we show that for every two vertices x, y that occur in a common bag X := X_t, x and y are connected by a path whose length depends only on |X| and ν(G). To this end, let C_X be the collection of geodesic cycles in G that have at least one vertex in X. We first show that if G[C_X] is connected and X ⊆ V(C_X), then every pair of vertices in X is connected by a path of length at most |X| · ν(G[C_X]).

Consider x, y ∈ X.
Start a breadth-first search (bfs) from x that stops as soon as it reaches y. Let L_0, L_1, …, L_p be the layers of the bfs-tree, where L_0 = {x} is the starting layer. We claim that for every L_i with L_i ∩ X ≠ ∅, there is a j with i < j ≤ i + ν(G[C_X]) such that L_j ∩ X ≠ ∅. Consider such an L_i, and denote by X_l ⊆ X those vertices of X that are contained in ⋃_{k=1}^{i} L_k. Denote by X_r = X \ X_l those vertices of X that have not been visited by step i. If there exists a geodesic cycle C in C_X with vertices in both X_l and X_r, we are done: the bfs will have seen all of C in at most ν(G[C_X]) steps (and thus found C ∩ X_r). Otherwise, since C_X is connected, there exist two geodesic cycles C_l, C_r ∈ C_X with C_l ∩ X_l ≠ ∅, C_r ∩ X_r ≠ ∅, and C_r ∩ C_l ≠ ∅. Since the bfs will visit all vertices of C_r ∪ C_l in at most (|C_r| + |C_l|)/2 ≤ ν(G[C_X]) steps, the claim follows. Therefore the number of layers p ≤ ν(G[C_X]) · |X_t|, and thus the distance between x and y is bounded by ν(G[C_X]) · |X_t|, as claimed.

Therefore, if we show that for every bag X, the set C_X of geodesic cycles touching X induces a connected graph G[V(C_X)], we are done: then every vertex pair x, y ∈ X is indeed connected by a path of length at most ν(G[V(C_X)]) · |X|, which (by the definition of treewidth and the fact that C_X is a family of geodesic cycles) is bounded by ν(G) · (tw(G) + 1).

We first prove that for any choice of X := X_t and any pair of vertices x, y ∈ X, x and y lie on some cycle of G. By Proposition 1, the vertices x, y are either connected by an edge (in which case we are done: G is two-connected, so every edge lies on some cycle) or there exists some node s ∈ N_T(t) such that {x, y} ⊆ V_s ∩ V_t. In the latter case, we invoke Proposition 2: for i ∈ {s, t} we can find connected components H_i of G_i \ X such that N(H_i) = X and V_i ⊆ H_i ∪ X.
Therefore, there exist two x-y-paths, one inside H_s and another inside H_t; hence x and y lie on a cycle. Since the set of geodesic cycles forms a basis for the cycle space of a graph (see Theorem 3.1 of [107]), it follows that for every t ∈ T, G[V(C_{X_t})] is connected. The distance between any two vertices in X_t is thus bounded by ν(C_{X_t}) · |X_t|, implying that tl(G) is at most ν(G) · (tw(G) + 1), as claimed.

Finally, we put all the pieces together and show why these bounds are tight.

Proof of Theorem 3.
This follows directly from Theorem 4, Chepoi's result that the hyperbolicity is at most the treelength [104], and the observation that for any non-negative integers n and k, the k-subdivision of the n × n planar grid has treelength n(k + 1), treewidth n, a longest geodesic cycle of length 4(k + 1), and hyperbolicity (n − 1)(k + 1) − 1.

Discussion and Conclusion
Clearly, there is a need to develop TD heuristics that are better suited to the properties of realistic informatics graphs. This might involve making more sophisticated choices than greedily minimizing degree or fill, but it might also involve optimizing other parameters such as treelength (which has connections with δ-hyperbolicity) or minimizing the width of bags that are not central (i.e., not associated with the deep core). In addition, it would be interesting to use TDs to help combine small local clusters found with other methods, e.g., local spectral methods, into larger overlapping clusters, in order to better understand what might be termed the "local to global" properties of realistic informatics graphs. Since these graphs are not well-described by simple low-dimensional structures or simple constant-degree expander-like structures, this coupling is particularly counterintuitive, but it is very important for applications such as the diffusion of information. Finally, given the connections between TDs and graphical models, it would be interesting to understand better the implications of our results for improved graphical modeling and/or for improved inference on realistic network data. We expect that this will be a particularly challenging but promising direction for future work on social (as well as non-social) graphs.

Acknowledgments.
We would like to thank Felix Reidl for considerable help in simplifying the proof of Theorem 3. We would also like to thank Mason Porter for helpful discussions and for providing several of the networks that we considered, as well as Dima Krioukov and his collaborators for providing us access to their code for generating networks based on their hyperbolic model. In addition, we would like to acknowledge financial support from the Air Force Office of Scientific Research, the Army Research Office, the Defense Advanced Research Projects Agency, the National Consortium for Data Science, and the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of any of the above funding agencies.
References [1] J. Leskovec, K.J. Lang, A. Dasgupta, and M.W. Mahoney. Community structure in large networks:Natural cluster sizes and the absence of large well-defined clusters.
Internet Mathematics , 6(1):29–123, 2009. Also available at: arXiv:0810.1355.[2] L. G. S. Jeub, P. Balachandran, M. A. Porter, P. J. Mucha, and M. W. Mahoney. Think locally, actlocally: Detection of small, medium-sized, and large communities in large networks.
Physical ReviewE , 91:012821, 2015.[3] A. B. Adcock, B. D. Sullivan, and M. W. Mahoney. Tree-like structure in large social and informationnetworks. In
Proc. of the 2013 IEEE ICDM , pages 1–10, 2013.[4] V. Batagelj and M. Zaversnik. Generalized cores. Technical report. Preprint: arXiv:cs.DS/0202039(2002).[5] V. Batagelj and M. Zaversnik. An O ( m ) algorithm for cores decomposition of networks. Technicalreport. Preprint: arXiv:cs.DS/0310049 (2003).[6] V. Batagelj and M. Zaversnik. Fast algorithms for determining (generalized) core groups in socialnetworks. Advances in Data Analysis and Classification , 5(2):129–145, 2011.[7] N. Robertson and P. D. Seymour. Graph minors. II. Algorithmic aspects of tree-width.
Journal ofAlgorithms , 7(3):309–322, 1986.[8] S. Arnborg and A. Proskurowski. Linear time algorithms for NP-hard problems restricted to partialk-trees.
Discrete Applied Mathematics , 23(1):11–24, 1989.
[9] M. W. Bern, E. L. Lawler, and A. L. Wong. Linear-time computation of optimal subgraphs of decomposable graphs. Journal of Algorithms, 8(2):216–235, 1987.
[10] A. M. C. A. Koster, S. P. M. van Hoesel, and A. W. J. Kolen. Solving partial constraint satisfaction problems with tree decomposition. Networks, pages 170–180, 2002.
[11] J. Lagergren. Efficient parallel algorithms for graphs of bounded tree-width. Journal of Algorithms, 20(1):20–44, 1996.
[12] I. V. Hicks, A. M. C. A. Koster, and E. Kolotoğlu. Branch and tree decomposition techniques for discrete optimization. TutORials in Operations Research: INFORMS–New Orleans, 2005.
[13] J. Zhao, R. L. Malmberg, and L. Cai. Rapid ab initio RNA folding including pseudoknots via graph tree decomposition. In Proceedings of the 6th International Workshop on Algorithms in Bioinformatics, pages 262–273, 2006.
[14] J. Zhao, D. Che, and L. Cai. Comparative pathway annotation with protein-DNA interaction and operon information via graph tree decomposition. In Pacific Symposium on Biocomputing, pages 496–507, 2007.
[15] C. Liu, Y. Song, B. Yan, Y. Xu, and L. Cai. Fast de novo peptide sequencing and spectral alignment via tree decomposition. In Pacific Symposium on Biocomputing, pages 255–266, 2006.
[16] S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems (with discussion). Journal of the Royal Statistical Society, Series B, 50:157–224, 1988.
[17] D. Karger and N. Srebro. Learning Markov networks: maximum bounded tree-width graphs. In Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms, pages 392–401, 2001.
[18] H. Chen. Quantified constraint satisfaction and bounded treewidth. In Proceedings of the 16th European Conference on Artificial Intelligence, pages 161–165, 2004.
[19] H. L. Bodlaender and R. H. Möhring. The pathwidth and treewidth of cographs. SIAM Journal on Discrete Mathematics, 6(2):181–188, 1993.
[20] C. Chekuri and J. Chuzhoy. Polynomial bounds for the grid-minor theorem. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pages 60–69, 2014.
[21] P. D. Seymour and R. Thomas. Call routing and the ratcatcher. Combinatorica, 14(2):217–241, 1994.
[22] H. L. Bodlaender and A. M. C. A. Koster. Treewidth computations I. Upper bounds. Information and Computation, 208(3):259–275, 2010.
[23] H. L. Bodlaender. A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM Journal on Computing, 25(6):1305–1317, 1996.
[24] E. Amir. Approximation algorithms for treewidth. Algorithmica, 56(4):448–479, 2010.
[25] H. Röhrig. Tree decomposition: a feasibility study. Master's thesis, Universität des Saarlandes, Saarbrücken, Germany, 1998.
[26] K. Shoikhet and D. Geiger. A practical algorithm for finding optimal triangulations. In Proceedings of AAAI/IAAI, pages 185–190, 1997.
[27] C. Groër, B. D. Sullivan, and D. Weerapurage. INDDGO: Integrated network decomposition & dynamic programming for graph optimization. Technical Report ORNL/TM-2012/176, Oak Ridge National Laboratory, 2012.
[28] B. D. Sullivan et al. Integrated Network Decompositions and Dynamic programming for Graph Optimization (INDDGO), 2012, 2013. http://github.com/bdsullivan/inddgo.
[29] F. Gavril. The intersection graphs of subtrees in trees are exactly the chordal graphs. Journal of Combinatorial Theory, Series B, 16(1):47–56, 1974.
[30] D. J. Rose and R. E. Tarjan. Algorithmic aspects of vertex elimination. In Proceedings of the 7th Annual ACM Symposium on Theory of Computing, pages 245–254, 1975.
[31] A. Berry, J. R. S. Blair, and P. Heggernes. Maximum cardinality search for computing minimal triangulations. In Proceedings of the 28th International Workshop on Graph-Theoretic Concepts in Computer Science, pages 1–12, 2002.
[32] A. Berry, J. R. S. Blair, P. Heggernes, and B. W. Peyton. Maximum cardinality search for computing minimal triangulations of graphs. Algorithmica, 39(4):287–298, 2004.
[33] D. Rose, R. Tarjan, and G. Lueker. Algorithmic aspects of vertex elimination on graphs. SIAM Journal on Computing, 5:266–283, 1976.
[34] R. E. Tarjan and M. Yannakakis. Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM Journal on Computing, 13:566–579, 1984.
[35] R. E. Tarjan and M. Yannakakis. Addendum: Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM Journal on Computing, 14(1):254–255, 1985.
[36] A. Becker and D. Geiger. A sufficiently fast algorithm for finding close to optimal clique trees. Artificial Intelligence, 125(1–2):3–17, 2001.
[37] H. L. Bodlaender, J. R. Gilbert, H. Hafsteinsson, and T. Kloks. Approximating treewidth, pathwidth, and minimum elimination tree height. Journal of Algorithms, 18:238–255, 1995.
[38] V. Bouchitté, D. Kratsch, H. Müller, and I. Todinca. On treewidth approximations. Discrete Applied Mathematics, 136(2–3):183–196, 2004.
[39] B. A. Reed. Finding approximate separators and computing tree width quickly. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, pages 221–228, 1992.
[40] J. A. George. Nested dissection of a regular finite element mesh. SIAM Journal on Numerical Analysis, 10:345–363, 1973.
[41] J. R. Gilbert and R. E. Tarjan. The analysis of a nested dissection algorithm. Numerische Mathematik, 50(4):377–404, 1986.
[42] G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20:359–392, 1998.
[43] H. M. Markowitz. The elimination form of the inverse and its application to linear programming. Management Science, 3(3):255–269, 1957.
[44] P. R. Amestoy, T. A. Davis, and I. S. Duff. Algorithm 837: AMD, an approximate minimum degree ordering algorithm. ACM Transactions on Mathematical Software (TOMS), 30(3):381–388, 2004.
[45] P. R. Amestoy, T. A. Davis, and I. S. Duff. An approximate minimum degree ordering algorithm. SIAM Journal on Matrix Analysis and Applications, 17(4):886–905, 1996.
[46] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.
[47] H. L. Bodlaender. A tourist guide through treewidth. Acta Cybernetica, 11:1–23, 1993.
[48] H. L. Bodlaender. Discovering treewidth. In Proceedings of the 31st International Conference on Theory and Practice of Computer Science, pages 1–16, 2005.
[49] H. L. Bodlaender. Treewidth: Characterizations, applications, and computations. In Proceedings of the 32nd International Workshop on Graph-Theoretic Concepts in Computer Science, pages 1–14, 2006.
[50] H. L. Bodlaender and A. M. C. A. Koster. Combinatorial optimization on graphs of bounded treewidth. The Computer Journal, 51(3):255–269, 2007.
[51] J. R. S. Blair and B. Peyton. An introduction to chordal graphs and clique trees. In A. George, J. R. Gilbert, and J. W. H. Liu, editors, Graph Theory and Sparse Matrix Computation, The IMA Volumes in Mathematics and its Applications, Volume 56, pages 1–29. Springer-Verlag, 1993.
[52] E. Amir. Efficient approximation for triangulation of minimum treewidth. In Proceedings of the 17th Annual Conference on Uncertainty in Artificial Intelligence, pages 7–15, 2001.
[53] A. M. C. A. Koster, H. L. Bodlaender, and S. P. M. van Hoesel. Treewidth: Computational experiments. Electronic Notes in Discrete Mathematics, 8:54–57, 2001.
[54] A. Berry, P. Heggernes, and G. Simonet. The minimum degree heuristic and the minimal triangulation process. In H. L. Bodlaender, editor, Graph-Theoretic Concepts in Computer Science, Lecture Notes in Computer Science, pages 58–70. Springer, 2003.
[55] P. Heggernes. Minimal triangulations of graphs: A survey. Discrete Mathematics, 306(3):297–317, 2006.
[56] C. Wang, T. Liu, P. Cui, and K. Xu. A note on treewidth in random graphs. In Proceedings of the 5th International Conference on Combinatorial Optimization and Applications, pages 491–499, 2011.
[57] Y. Gao. Treewidth of Erdős-Rényi random graphs, random intersection graphs, and scale-free random graphs. Discrete Applied Mathematics, 160(4–5):566–578, 2012.
[58] A. B. Adcock, B. D. Sullivan, O. R. Hernandez, and M. W. Mahoney. Evaluating OpenMP tasking at scale for the computation of graph hyperbolicity. In Proc. of the 9th IWOMP, pages 71–83, 2013.
[59] A. B. Adcock. Characterizing, identifying, and using tree-like structure in social and information networks. PhD thesis, Stanford University, 2014.
[60] M. Gromov. Hyperbolic groups. In S. M. Gersten, editor, Essays in Group Theory, Math. Sci. Res. Inst. Publ., 8, pages 75–263. Springer, 1987.
[61] J. M. Alonso, T. Brady, D. Cooper, V. Ferlini, M. Lustig, M. Mihalik, H. Shapiro, and H. Short. Notes on word hyperbolic groups. In E. Ghys, A. Haefliger, and A. Verjovski, editors, Group Theory from a Geometrical Viewpoint, ICTP Trieste Italy, pages 3–63. World Scientific, 1991.
[62] E. A. Jonckheere, P. Lohsoonthorn, and F. Bonahon. Scaled Gromov hyperbolic graphs. Journal of Graph Theory, 57(2):157–180, 2008.
[63] E. A. Jonckheere, P. Lohsoonthorn, and F. Ariaei. Scaled Gromov four-point condition for network graph curvature computation. Internet Mathematics, 7(3):137–177, 2011.
[64] W. Chen, W. Fang, G. Hu, and M. W. Mahoney. On the hyperbolicity of small-world and tree-like random graphs. Internet Mathematics, 9(4):434–491, 2013. Also available at: arXiv:1201.1717.
[65] K. Verbeek and S. Suri. Metric embedding, hyperbolic space, and social networks. In Proceedings of the 30th Annual Symposium on Computational Geometry, pages 501–510, 2014.
[66] G. Brinkmann, J. H. Koolen, and V. Moulton. On the hyperbolicity of chordal graphs. Annals of Combinatorics, 5(1):61–69, 2001.
[67] Y. Wu and C. Zhang. Hyperbolicity and chordality of a graph. The Electronic Journal of Combinatorics, 18(1):P43, 2011.
[68] Y. Dourisboure and C. Gavoille. Tree-decompositions with bags of small diameter. Discrete Mathematics, 307(16):2008–2029, 2007.
[69] D. Lokshtanov. On the complexity of computing treelength. In Proceedings of the 32nd International Conference on Mathematical Foundations of Computer Science, pages 276–287, 2007.
[70] M. Grohe and D. Marx. On tree width, bramble size, and expansion. Journal of Combinatorial Theory, Series B, 99(1):218–228, 2009.
[71] A. Kosowski, B. Li, N. Nisse, and K. Suchan. k-chordal graphs: From cops and robber to compact routing via treewidth. In Proceedings of the 39th International Colloquium on Automata, Languages, and Programming, pages 610–622, 2012.
[72] F. F. Dragan. Tree-like structures in graphs: A metric point of view. In Proceedings of the 39th International Workshop on Graph-Theoretic Concepts in Computer Science, pages 1–4, 2013.
[73] M. Abu-Ata and F. F. Dragan. Metric tree-like structures in real-life networks: an empirical study. Networks, 67(1):49–68, 2016.
[74] M. M. Abu-Ata. Tree-Like Structure in Graphs and Embeddability to Trees. PhD thesis, Kent State University, 2014.
[75] Y. Shavitt and T. Tankel. Hyperbolic embedding of Internet graph for distance estimation and overlay construction. IEEE/ACM Transactions on Networking, 16(1):25–36, 2008.
[76] M. P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha. Core-periphery structure in networks. SIAM Journal on Applied Mathematics, 74(1):167–190, 2014.
[77] S. B. Seidman. Network structure and minimum degree. Social Networks, 5(3):269–287, 1983.
[78] J. Ignacio Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. Large scale networks fingerprinting and visualization using the k-core decomposition. In Annual Advances in Neural Information Processing Systems 18: Proceedings of the 2005 Conference, pages 41–50, 2006.
[79] J. Ignacio Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani. K-core decomposition of internet graphs: hierarchies, self-similarity and measurement biases. Networks and Heterogeneous Media, 3(2):371–393, 2008.
[80] J. Healy, J. Janssen, E. Milios, and W. Aiello. Characterization of graphs using degree cores. In WAW '08: Proceedings of the 6th Workshop on Algorithms and Models for the Web-Graph, pages 137–148, 2008.
[81] V. Batagelj and A. Mrvar. Pajek—analysis and visualization of large networks. In Proceedings of Graph Drawing, pages 477–478, 2001.
[82] J. Cheng, Y. Ke, S. Chu, and M. T. Ozsu. Efficient core decomposition in massive networks. In Proceedings of the 27th IEEE International Conference on Data Engineering, pages 51–62, 2011.
[83] P. Colomer-de Simon, A. Serrano, M. G. Beiro, J. Ignacio Alvarez-Hamelin, and M. Boguna. Deciphering the global organization of clustering in real complex networks. Scientific Reports, 3:2517, 2013.
[84] M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley, and H. A. Makse. Identification of influential spreaders in complex networks. Nature Physics, 6(11):888–893, 2010.
[85] J. Ugander, L. Backstrom, C. Marlow, and J. Kleinberg. Structural diversity in social contagion. Proceedings of the National Academy of Sciences, 109(16):5962–5966, 2012.
[86] V. Ramasubramanian, D. Malkhi, F. Kuhn, M. Balakrishnan, A. Gupta, and A. Akella. On the treeness of internet latency and bandwidth. In Proceedings of the 2009 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pages 61–72, 2009.
[87] F. de Montgolfier, M. Soto, and L. Viennot. Treewidth and hyperbolicity of the internet. In Proceedings of the 10th IEEE International Symposium on Network Computing and Applications (NCA), pages 25–32, 2011.
[88] T. Maehara, T. Akiba, Y. Iwata, and K. Kawarabayashi. Computing personalized PageRank quickly by exploiting graph structures. Proceedings of the VLDB Endowment, 7:1023–1034, 2014.
[89] B. Courcelle and M. Mosbah. Monadic second-order evaluations on tree-decomposable graphs. Theoretical Computer Science, 109(1–2):49–82, 1993.
[90] A. G. Percus, G. Istrate, B. Goncalves, R. Z. Sumi, and S. Boettcher. The peculiar phase structure of random graph bisection. Journal of Mathematical Physics, 49(12):125219, 2008.
[91] F. R. K. Chung and L. Lu. Complex Graphs and Networks, volume 107 of CBMS Regional Conference Series in Mathematics. American Mathematical Society, 2006.
[92] Supporting website. http://snap.stanford.edu/data/index.html.
[93] A. L. Traud, P. J. Mucha, and M. A. Porter. Social structure of Facebook networks. Physica A, 391:4165–4180, 2012.
[94] L. A. Adamic and N. Glance. The political blogosphere and the 2004 U.S. election: divided they blog. In LinkKDD '05: Proceedings of the 3rd International Workshop on Link Discovery, pages 36–43, 2005.
[95] D. J. Watts and S. H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440–442, 1998.
[96] E. R. Gansner and S. C. North. An open graph visualization system and its applications to software engineering. Software: Practice and Experience, 30(11):1203–1233, 2000.
[97] T. A. Davis and Y. Hu. The University of Florida Sparse Matrix Collection. ACM Transactions on Mathematical Software (TOMS), 38(1):1:1–1:25, 2011.
[98] T. Malisiewicz. Open source code: Graphviz matlab magic. https://github.com/quantombone/graphviz_matlab_magic, May 2010.
[99] P. Erdős and A. Rényi. On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5:17–61, 1960.
[100] B. Bollobás. Random Graphs. Academic Press, London, 1985.
[101] R. Andersen, F. R. K. Chung, and K. Lang. Local graph partitioning using PageRank vectors. In FOCS '06: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, pages 475–486, 2006.
[102] R. Diestel and M. Müller. Connected tree-width. Technical report. Preprint: arXiv:1211.7353 (2012).
[103] F. F. Dragan and I. Lomonosov. On compact and efficient routing in certain graph classes. Discrete Applied Mathematics, 155(11):1458–1470, 2007.
[104] V. Chepoi, F. Dragan, B. Estellon, M. Habib, and Y. Vaxès. Diameters, centers, and approximating trees of δ-hyperbolic geodesic spaces and graphs. In Proceedings of the 24th Annual Symposium on Computational Geometry, pages 59–68, 2008.
[105] F. Reidl and B. Sullivan. Personal communication, 2014.
[106] P. Bellenbaum and R. Diestel. Two short proofs concerning tree-decompositions. Combinatorics, Probability, and Computing, 11:541–547, 2002.
[107] A. Georgakopoulos and P. Sprussel. Geodesic topological cycles in locally finite graphs. The Electronic Journal of Combinatorics, 16(1):R144, 2009.