Spectral density of random graphs: convergence properties and application in model fitting
Suzana de Siqueira Santos*
Departamento de Ciência da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, Brazil
*Corresponding author: [email protected]
André Fujita
Departamento de Ciência da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo, Brazil
and Catherine Matias
Sorbonne Université, Université de Paris, Centre National de la Recherche Scientifique, Laboratoire de Probabilités, Statistique et Modélisation, Paris, France
[23 February 2021]
Random graph models are used to describe the complex structure of real-world networks in diverse fields of knowledge. Studying their behavior and fitting them to data remain critical challenges that, in general, require model-specific techniques. An important line of research is to develop generic methods able to fit and select the best model among a collection. Approaches based on spectral density (i.e., the distribution of the eigenvalues of the graph adjacency matrix) are appealing for that purpose: they apply to different random graph models, and they can benefit from the theoretical background of random matrix theory. This work investigates the convergence properties of model fitting procedures based on the graph spectral density and the corresponding cumulative distribution function. We also review results on the convergence of the spectral density for the most widely used random graph models. Moreover, we explore through simulations the limits of these graph spectral density convergence results, particularly in the case of the block model, where only partial results have been established.
Keywords: random graphs, spectral density, model fitting, model selection, convergence.
2000 Math Subject Classification: 34K30, 35K57, 35Q80, 92D25
1. Introduction
The study of real-world networks is fundamental in several areas of knowledge [14, 26, 30, 37]. Analyzing their structure is sometimes challenging, as it may vary across time and across instances from the same group or phenotype. For example, a functional brain network may change over time and differ between individuals with the same phenotype. Therefore, we can think of a network as a random graph, that is, a realization of a random process.
Several models have been proposed to describe random processes that generate graphs. For instance, in the Erdős-Rényi (ER) random graph [17], each pair of nodes connects independently at random with probability $p$. We obtain a generalization by letting nodes $i$ and $j$ connect independently at random with non-homogeneous probability $p_{ij}$. Later, [21, 25, 34] proposed an intermediate setup, block models, to generate graphs with non-homogeneous connection probabilities taking a finite number of values. In what follows, we distinguish between the (deterministic) block model (BM), where pairs of nodes connect independently of each other, and the stochastic block model (SBM), where these connections are only conditionally independent (given the nodes' groups). Models can also characterize different properties of networks. For example, networks with spatial associations between nodes [geometric random graph, GRG, 31], a regular structure ($d$-regular graph, DR), small-world structure [Watts-Strogatz, WS, 39], a power-law distribution of the vertex degrees [Barabási-Albert, BA, 5], or with occurrence probability depending on pre-defined summary statistics and covariates [exponential random graph model, ERGM, 10].
Given an empirical network, a significant task is to fit a collection of models through parameter estimation and select the best model in the collection.
There are well-established procedures for estimating the parameters of some models [1, 10, 34], such as the ER, SBM, and ERGM. However, those estimators lack generality; they are specific to each model. Every time a new complex network model is proposed, it becomes necessary to develop a specific parameter estimator. Besides, the next question is: among the proposed set of models, which one best fits the observed network? The usual statistical answer relies on model selection criteria, which exist in some setups (e.g., ER, SBM) but not all, and are again specific to the model.
In this context, Takahashi et al. proposed fitting and selecting random graph models based on the Kullback-Leibler divergence between graph spectral densities [35]. The rationale behind this idea is that there is a close relationship between the graph structure and the graph spectrum [38]. The empirical spectral density characterizes some random graph/matrix ensembles, as shown analytically by Wigner et al. (semicircle law) for ER and DR graphs [41] and empirically by Takahashi et al. [35] for GRG, BA, and WS graphs. The semicircle law states that the spectral density of a large class of random symmetric matrices converges weakly almost surely to a specific distribution called the semicircle distribution [2, 22, 41]. We can apply this result to the ER and $d$-regular random graphs under certain conditions [36]. However, despite the advances in the study of the convergence of graph spectral densities, we know little about the theoretical properties of the procedures proposed by Takahashi et al.
In this work, we prove that if we replace the Kullback-Leibler divergence between spectral densities by the $\ell_1$-distance, the parameter estimated by the procedure proposed by Takahashi et al. [35] is consistent under the general assumptions:
1. the empirical spectral density converges weakly to a limiting distribution,
2. the map between the parameter space and the limiting distribution is injective and continuous, and
3. the parameter space is compact (this technical condition could be relaxed).
Furthermore, we show that we can extend the results when considering the eigenvalues' cumulative distribution to fit the models.
In Section 2, we present the notation and main definitions used in this paper. Section 3 summarizes the literature on convergence results for the random graph's spectral density under different models. We describe a model-fitting procedure (that slightly differs from [35]) in Section 4. We establish its convergence properties in Section 5; see Theorems 5.1 and 5.2. Then, in Section 6, we evaluate the method's performance by simulation experiments. Finally, we discuss the results in Section 7 and present our conclusions in Section 8.
2. Notation and definitions
Let $G = (V, E)$ be an undirected graph in which $V = \{1, 2, \ldots, n\}$ is a set of vertices, and $E$ is a set of edges connecting the elements of $V$. The spectrum of $G$ is the set of eigenvalues of its adjacency matrix $A = (A_{ij})_{1 \leq i,j \leq n}$. Since $G$ is undirected, $A$ is symmetric, and then its eigenvalues are real values $\lambda_1^G \geq \lambda_2^G \geq \ldots \geq \lambda_n^G$.
Let $\delta_x$ be the Dirac measure at point $x$. Then, we define the empirical spectral distribution (ESD) as
$$\mu_G = \frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i^G/\sqrt{n}}. \tag{2.1}$$
The corresponding empirical eigenvalues' cumulative distribution function (CDF) is
$$F_G(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left\{ \frac{\lambda_i^G}{\sqrt{n}} \leq x \right\}, \tag{2.2}$$
where $\mathbf{1}\{A\}$ is the indicator function of set $A$. Let $f : \mathbb{R} \mapsto \mathbb{R}$ be a continuous and bounded function. If we integrate $f$ with respect to $\mu_G$ we obtain
$$\mu_G(f) = \int_{\mathbb{R}} f(\lambda)\, \mu_G(d\lambda) = \frac{1}{n} \sum_{i=1}^{n} f(\lambda_i^G/\sqrt{n}). \tag{2.3}$$
Let $(G_n)_{n \geq 1}$ be a sequence of graphs on the (sequence of) sets of vertices $V$ (in other words, each $G_n$ is a graph over $n$ nodes). We say that $(\mu_{G_n})_{n \geq 1}$ converges weakly to $\mu$ (and denote $\mu_{G_n} \Rightarrow \mu$) if for any real bounded and continuous function $f : \mathbb{R} \mapsto \mathbb{R}$, we have $\mu(f) = \lim_{n \to \infty} \mu_{G_n}(f)$.
In what follows, we will consider a (kernel) function $\phi$ and introduce for any $x \in \mathbb{R}$ the function $\phi_x(\cdot) = \phi(x - \cdot)$. In this case, the quantity in Eq (2.3) becomes a convolution function:
$$x \mapsto \mu_{G_n}(\phi_x) = \int_{\mathbb{R}} \phi(x - \lambda)\, \mu_{G_n}(d\lambda) = \frac{1}{n} \sum_{i=1}^{n} \phi(x - \lambda_i^{G_n}/\sqrt{n}). \tag{2.4}$$
A random graph is a random variable (r.v.) taking its values in the set of graphs. A random graph model is a collection of graphs that are either finite or countable, together with a probability distribution $P$ on this collection. Let $(G_n)_{n \geq 1}$ be a sequence of random graphs on the set of vertices $V = \{1, 2, \ldots, n\}$. Then the eigenvalues of $G_n$ are random variables, and the ESD $\mu_{G_n}$ is a random probability measure.
Now, there are many ways in which the random measure $\mu_{G_n}$ may converge weakly. We say that $\mu_{G_n}$ converges weakly in expectation to a probability measure $\mu$ if for any real bounded and continuous function $f : \mathbb{R} \mapsto \mathbb{R}$, we have
$$\mathbb{E}\mu_{G_n}(f) \underset{n \to \infty}{\longrightarrow} \mathbb{E}\mu(f).$$
Similarly, we say that $\mu_{G_n}$ converges weakly in $P$-probability to a probability measure $\mu$ if for any real bounded and continuous function $f : \mathbb{R} \mapsto \mathbb{R}$ and any $\varepsilon > 0$, we have that
$$P\left( |\mu_{G_n}(f) - \mu(f)| > \varepsilon \right) \to 0, \quad \text{as } n \to \infty.$$
Finally, if
$$P\left( \lim_{n \to \infty} \mu_{G_n}(f) = \mu(f) \right) = 1$$
for any bounded and continuous $f : \mathbb{R} \mapsto \mathbb{R}$, we say that $\mu_{G_n}$ converges weakly $P$-almost surely to $\mu$.
Let $f$ and $g$ be functions of $n$. Then, as $n \to \infty$, we say that $f = O(g)$ if $|f|/|g|$ is bounded from above; $f = o(g)$ if $f/g \to 0$; and $f = \omega(g)$ if $|f|/|g| \to \infty$.
Finally, we introduce the Stieltjes transform [4], which is an important tool for studying the limiting spectral distribution of random matrices. Given a probability distribution $\mu$, its Stieltjes transform $s_\mu$ is a function on the upper-half complex plane defined for $z \in \mathbb{C}^+$ as
$$s_\mu(z) = \int_{\mathbb{R}} \frac{d\mu(x)}{x - z}.$$
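The definitions in Eqs (2.1)-(2.4) translate directly into a few lines of code. The following is a minimal sketch, assuming NumPy is available; the function names (`scaled_eigenvalues`, `empirical_cdf`, `esd_integral`, `smoothed_esd`) and the three-vertex path graph are illustrative choices, not part of the original work.

```python
import numpy as np

def scaled_eigenvalues(adj):
    """Eigenvalues of a symmetric adjacency matrix divided by sqrt(n), largest first."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    return np.linalg.eigvalsh(adj)[::-1] / np.sqrt(n)

def empirical_cdf(eigs, x):
    """F_G(x) of Eq (2.2): fraction of scaled eigenvalues that are <= x."""
    return float(np.mean(eigs <= x))

def esd_integral(eigs, f):
    """mu_G(f) of Eq (2.3): average of f over the scaled eigenvalues."""
    return float(np.mean(f(eigs)))

def smoothed_esd(eigs, x, phi):
    """mu_G(phi_x) of Eq (2.4): average of phi(x - lambda_i / sqrt(n))."""
    return float(np.mean(phi(x - eigs)))

def gaussian(u):
    """Standard Gaussian density, used here as an example kernel phi."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

# Hypothetical toy graph: the path 0 - 1 - 2 (n = 3 vertices, 2 edges).
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
eigs = scaled_eigenvalues(path)
# Integrating f(t) = t^2 against the ESD gives trace(A^2) / n^2 = 2|E| / n^2.
second_moment = esd_integral(eigs, lambda t: t**2)
```

For the path graph the unscaled eigenvalues are $\sqrt{2}$, $0$, $-\sqrt{2}$, so the second moment of the ESD equals $2|E|/n^2 = 4/9$, which gives a quick consistency check of the $\sqrt{n}$ scaling.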
3. Random graph models’ spectral density
In this section, we summarize results on the spectral density of random graphs. First, we describe Wigner's law (also called semicircle law) for random symmetric matrices. We show examples of random graph models for which the law is valid. Then we summarize analytical and empirical results on the spectral density under different models when Wigner's law does not hold.
3.1 Wigner's law
Many results on the limiting ESD of random graphs rely on Wigner's law for random symmetric matrices. Wigner [41] proved that for symmetric matrices whose entries are real-valued independent and identically distributed (i.i.d.) random variables, with mean zero and unit variance, the ESD converges weakly in expectation to the semicircle law $\mu_{sc}$ defined as
$$\mu_{sc}(dx) = \frac{1}{2\pi} \sqrt{4 - x^2}\, \mathbf{1}\{|x| \leq 2\}\, dx. \tag{3.1}$$
Later, Grenander [22] proved that this convergence holds weakly in probability. Finally, Arnold [2] proved that (under the same assumptions) the ESD converges weakly almost surely to the semicircle distribution.
3.2 Erdős-Rényi random graph model (ER)
One of the simplest examples of random graph models in terms of construction is the model proposed by [17]. Given a probability $p$ and the number of vertices $n$, each pair of vertices is connected with probability $p$, independently of the other pairs.
Let $(G_{p,n})_{n \geq 1}$ be a sequence of ER random graphs, and let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ denote the eigenvalues of the adjacency matrix $A$ (here, we do not stress that $A = A_n$ depends on $n$). Based on Wigner's semicircle law, it can be proved that if $p = \omega(1/n)$, then the ESD of the scaled matrix $A/\sqrt{np(1-p)}$, given by
$$\tilde{\mu}_{G_{p,n}} = \frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i/\sqrt{np(1-p)}}, \tag{3.2}$$
converges weakly almost surely (and also in probability and expectation) to the semicircle distribution [36]. Note that if $np = O(1)$, the convergence no longer holds.
Another formulation says that the ESD of the ER graph $G_{p,n}$, denoted by $\mu_{G_{p,n}}$ and defined as in Equation (2.1), converges to
$$\mu_{p,sc}(dx) = \frac{1}{2\pi p(1-p)} \sqrt{4p(1-p) - x^2}\, \mathbf{1}\left\{|x| \leq 2\sqrt{p(1-p)}\right\} dx. \tag{3.3}$$
3.3 (Deterministic) Block model (BM)
The (deterministic) block model was introduced in the 70's by [9, 40]; see also [19]. As already mentioned, a natural generalization of the ER model is to consider that any pair of nodes $i$, $j$ connect independently but with its own probability $p_{ij}$. From a statistical point of view, such a model has too many parameters. We can interpret the block model as a particular case where we restrict these $p_{ij}$ to a finite number of values. More precisely, each node belongs to one of $M$ groups, which determines its probability of connection. That is, pairs of nodes connect with probabilities $p_{m,l}$, which depend on the respective groups $m$, $l$ of these nodes.
In what follows, we consider, more specifically, the model studied in [3]. Let $G$ be a random graph with $n$ nodes $V = \{1, 2, \ldots, n\}$ belonging to $M$ groups $\{\Omega_1, \Omega_2, \ldots, \Omega_M\}$ with equal size $K = n/M$. Assume that the conditional probabilities of connections are as follows:
$$P(A_{ij} = 1 \mid i \in \Omega_m, j \in \Omega_l) = \begin{cases} p_0 & \text{if } m \neq l, \\ p_m & \text{if } m = l. \end{cases}$$
Let $A$ denote the adjacency matrix of $G$ and $\bar{A}$ denote its expectation matrix. Let $i \in \Omega_m$ and $j \in \Omega_l$ be two vertices of $G$; then the $n \times n$ matrix $\bar{A}$ is such that $\bar{A}_{ij} = p_m$ if $m = l$, and $\bar{A}_{ij} = p_0$ otherwise. Denote $p_\star = \max_{m \leq M} p_m$ and $\gamma(n) = 1/\sqrt{n p_\star (1 - p_\star)}$. The centered and normalized adjacency matrix of $G$ is defined as
$$\tilde{A} = (A - \bar{A})\, \gamma(n). \tag{3.4}$$
An expression for the Stieltjes transform of the spectral density of $\tilde{A}$ was obtained by [3], as described in Proposition 3.1.
PROPOSITION 3.1 (Corollary from [3]). If $\lim_{n \to +\infty} n p_m = +\infty$ and $p_m/p_0 \leq c$ for some constant $c > 0$ and $m = 1, 2, \ldots, M$, then, almost surely, the ESD associated with $\tilde{A}$ converges weakly to a distribution function whose Stieltjes transform is
$$s(z) = \sum_{m=1}^{M} c_m(z),$$
where the functions $(c_m)_{1 \leq m \leq M}$ are the unique solutions to the system of equations
$$c_m(z) = \frac{-1/M}{z + \zeta_m c_m(z) + \zeta_0 \sum_{l \neq m} c_l(z)}, \quad 1 \leq m \leq M,$$
with
$$\zeta_l = \lim_{n \to +\infty} \frac{p_l (1 - p_l)}{p_\star (1 - p_\star)}, \quad 0 \leq l \leq M,$$
that satisfy the conditions
$$\Im(c_m(z))\, \Im(z) > 0, \quad \text{for } \Im(z) > 0, \quad 1 \leq m \leq M,$$
where $\Im(z)$ denotes the imaginary part of $z$.
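The system in Proposition 3.1 can be solved numerically at a point $z$ in the upper half-plane, and the density recovered through the usual inversion $\frac{1}{\pi}\Im\, s(x + i\eta)$ for small $\eta > 0$. The sketch below uses a plain fixed-point iteration, which is assumed (not proved here) to converge from a starting point in the upper half-plane; the function names, the iteration count, and the value of $\eta$ are illustrative assumptions, and this is not necessarily the numerical scheme used by the authors.

```python
import cmath

def solve_block_stieltjes(z, zeta_within, zeta_between, n_iter=2000):
    """Fixed-point iteration for the system c_m(z) of Proposition 3.1.

    zeta_within  : list [zeta_1, ..., zeta_M]
    zeta_between : zeta_0
    Returns s(z) = sum_m c_m(z). Convergence of this simple scheme is assumed.
    """
    M = len(zeta_within)
    c = [1j / M] * M  # start in the upper half-plane
    for _ in range(n_iter):
        total = sum(c)
        c = [(-1.0 / M) / (z + zeta_within[m] * c[m] + zeta_between * (total - c[m]))
             for m in range(M)]
    return sum(c)

def density_from_stieltjes(x, zeta_within, zeta_between, eta=0.05):
    """Approximate limiting density at x via Im s(x + i*eta) / pi."""
    s = solve_block_stieltjes(complex(x, eta), zeta_within, zeta_between)
    return s.imag / cmath.pi

# Sanity case: when all zeta values equal 1, the system reduces to
# s(z) = -1 / (z + s(z)), the equation of the semicircle law.
z = 0.3 + 0.1j
s_numeric = solve_block_stieltjes(z, [1.0, 1.0, 1.0], 1.0)
```

When all $\zeta_l$ equal 1, summing the $M$ equations gives $s(z)^2 + z\,s(z) + 1 = 0$, whose root with positive imaginary part is the Stieltjes transform of $\mu_{sc}$; this provides a check of the solver before applying it to heterogeneous blocks.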
The recent paper [42] proposes an explicit moment formula for the limiting distribution. Notice that the results from [3] were obtained only for the centered matrix $\tilde{A}$. For the Erdős-Rényi model, Tran et al. [36] proved that the spectral densities of the centered and non-centered adjacency matrices are approximately identical. To investigate whether the BM graphs have a similar property, we ran simulation experiments comparing the ESD of $\tilde{A}$ and $A/\gamma(n)$ under different scenarios. We show the results in Section 6.1.
3.4 Stochastic block model (SBM)
Similarly to the previous model, the stochastic block model (SBM) produces graphs with groups of vertices connected with a particular edge density [20, 25, 34]. The main difference between SBM and the deterministic block model lies in the random assignment of each node to one of the groups with some pre-specified group probability. This results in groups with random sizes, whereas the group sizes are fixed in BM. Moreover, the entries of an adjacency matrix in BM are independent random variables, while for SBM, they are only conditionally independent. Handling the non-independence of the entries of the adjacency matrix in SBM is more challenging. Note that many authors use the name SBM for what is only a BM. More precisely, given a set of vertices $V = \{1, \ldots, n\}$, each vertex first picks a group at random among $M$ possibilities, independently of the others and with probabilities $(\pi_1, \pi_2, \ldots, \pi_M)$. Then, conditional on these latent (unobserved) groups, two vertices from groups $m$, $l$ connect independently of the others with probability $p_{ml}$. To our knowledge, no theoretical results about the convergence of the ESD have been obtained for the SBM yet.
3.5 Configuration random graph model
The configuration model is a particular instance of the inhomogeneous ER model and has at least two different usages in the literature. Let $d = (d_1, \ldots, d_n)$ be a degree sequence of a graph.
The fixed-degree model $FD(d)$ is the collection of all graphs with this degree sequence and uniform probability on this collection. We obtain this model by simulation: start with a real graph with degree sequence $d$ and use a rewiring algorithm to shuffle the edges.
The random-degree model $RD(d)$ considers all graphs over $n$ nodes such that the entries $A_{ij}$ are independent with (nonidentical) Bernoulli distribution $\mathcal{B}(p_{ij})$, where $p_{ij} = d_i d_j / C$ and $C > 0$ is a normalizing constant chosen so that $0 \leq p_{ij} \leq 1$ (for instance, $C = \max_{i \neq j} d_i d_j$). It is a random graph model where graphs have degrees only approximately given by $d$. Indeed, if $D_i$ denotes the random degree of node $i$, then
$$\mathbb{E}(D_i) = \sum_{j \neq i} \mathbb{E}(A_{ij}) = \sum_{j \neq i} p_{ij} = \frac{d_i}{C} \sum_{j \neq i} d_j = \frac{d_i (2|E| - d_i)}{C}.$$
Whenever $d_i$ is not too large and $C \simeq 2|E|$ we get $\mathbb{E}(D_i) \simeq d_i$.
Let $A'$ denote the normalized adjacency matrix:
$$A' = \frac{A}{\sqrt{\frac{1}{n} \sum_{i=1}^{n} d_i}}.$$
Preciado and Rahimian [32] proved that for the random-degree model proposed by Chung and Lu [11], under specific conditions regarding the expected degree sequence (sparsity, finite moments, and controlled growth of degrees), the ESD of the normalized adjacency matrix $A'$ converges weakly almost surely to a deterministic density function. Although the analytical expression of this function is unknown, the authors show its moments explicitly.
3.6 d-regular random graph (DR)
A $d$-regular random graph is a random graph over $n$ nodes with all degrees equal to $d$. In this case, the parameter $d$ is directly determined by the observed graph. Although there is no need to estimate $d$, studying this random graph's spectral density can help obtain a measure of goodness of fit. This can be useful for graphs that "are almost" regular, in which degrees vary due to measurement errors.
If $d \geq$ … $n$, and $(G_{d,n})_{n \geq 1}$ is a sequence of $d$-regular random graphs, then the ESD defined as
$$\tilde{\mu}_{G_{d,n}} = \frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i} \tag{3.5}$$
converges weakly almost surely to
$$\mu_d(dx) = \frac{d \sqrt{4(d-1) - x^2}}{2\pi (d^2 - x^2)}\, \mathbf{1}\left\{|x| \leq 2\sqrt{d-1}\right\} dx. \tag{3.6}$$
Note that $\tilde{\mu}_{G_{d,n}}$ is the distribution of the eigenvalues of $A$, without scaling by $\sqrt{n}$. The result dates back to the work [28]. It is also valid when considering that $d$ grows slowly with $n$ [16, 36]. The distribution $\mu_d$ is known as a Kesten-McKay distribution. The result may also be rephrased in the following way [6]: the ESD of the matrix $A/\sqrt{d-1}$ converges to
$$\tilde{\mu}_d(dx) = \left(1 + \frac{1}{d-1} - \frac{x^2}{d}\right)^{-1} \frac{\sqrt{4 - x^2}}{2\pi}\, \mathbf{1}\{|x| \leq 2\}\, dx. \tag{3.7}$$
Dumitriu and Pal [16] showed that the ESD of $A/\sqrt{d-1}$ converges to $\mu_{sc}$ for $d$ tending to infinity slowly, namely $d = n^{\varepsilon_n}$ with $\varepsilon_n = o(1)$.
Let $J$ be an $n \times n$ matrix containing only ones, and set
$$M = \left[\frac{d}{n}\left(1 - \frac{d}{n}\right)\right]^{-1/2} \left(A - \frac{d}{n} J\right). \tag{3.8}$$
Tran et al. [Theorem 1.5 in 36] showed that the ESD of $M/\sqrt{n}$ converges weakly to the semicircle distribution $\mu_{sc}$ when $d$ tends to infinity with $n$.
3.7 Geometric random graph model
To construct a geometric random graph (GRG), we first randomly draw $n$ vertices on a unit cube (or torus) of dimension $D$. Then, we connect pairs of vertices whose distance is at most $r_n$ (with $r_n \to$ …) … $n r_n^D \to c \in (0, \infty)$, called the thermodynamic regime. They also proposed an approximation of the limit when the constant $c$ becomes large. Later, Hamidouche et al. [24] proved that in the connectivity regime, when $\left(\frac{n}{\log n}\right) r_n^D \to c \in (0, \infty)$ as $n \to \infty$, the ESD of a random geometric graph converges to the ESD of a deterministic geometric graph in which the vertices are drawn over a grid. Finally, Bordenave [8] also characterizes the spectral measure of the adjacency matrix, when normalized by $n$, in the dense regime, i.e., when $r_n^D$ scales as a constant (namely $r_n^D = O(\ldots)$ and $r_n^D = \omega(\ldots)$).
3.8 Preferential attachment models
The preferential attachment random graph models generate graphs by adding a node to the network together with a set of edges. The probability of connecting the new vertex to an existing vertex is proportional to the degree of that vertex. One of the best-known preferential attachment models is the Barabási-Albert (BA) model [5]. In this model, the probability of connecting a vertex to other existing nodes is proportional to their degrees raised to the power of the scaling exponent $p_s$.
There is no theoretical result about the convergence of the empirical spectral density for the Barabási-Albert model. However, [18] has described the shape of the distribution through simulations.
In the linear preferential attachment tree model, the probability of connecting a new vertex is proportional to the out-degree plus $1 + a$, where $a$ is a process parameter [7]. Under certain conditions, the ESD converges to a deterministic probability function as $n \to +\infty$ [7].
3.9 Watts-Strogatz random graph model
Watts-Strogatz (WS) random graphs [39] present small-world properties, such as short average path lengths (the majority of vertices can be reached from all other vertices by a small number of steps) and a higher clustering coefficient (the number of triangles in the graph) than ER random graphs. The process that generates WS graphs starts with a regular ring lattice of size $n$, with each vertex connected to the $K/2$ nearest vertices on each side of the ring. For each vertex $i$ and each edge connected to $i$, with probability $p_r$, the edge is replaced by a new one connecting $i$ to a vertex chosen at random. To the best of our knowledge, all results regarding the limiting ESD of WS graphs are empirical [18].
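The closed-form limits quoted in this section, namely the semicircle law (3.1), its rescaled version for ER graphs (3.3), and the Kesten-McKay density (3.6), can be sanity-checked numerically. The sketch below uses only the Python standard library and a midpoint rule; the function names and the values $p = 0.3$ and $d = 3$ are illustrative assumptions.

```python
import math

def semicircle_density(x):
    """Density of mu_sc in Eq (3.1)."""
    return math.sqrt(4.0 - x * x) / (2.0 * math.pi) if abs(x) <= 2.0 else 0.0

def er_limit_density(x, p):
    """Density of mu_{p,sc} in Eq (3.3)."""
    v = p * (1.0 - p)
    if x * x > 4.0 * v:
        return 0.0
    return math.sqrt(4.0 * v - x * x) / (2.0 * math.pi * v)

def kesten_mckay_density(x, d):
    """Density of mu_d in Eq (3.6)."""
    if x * x > 4.0 * (d - 1):
        return 0.0
    return d * math.sqrt(4.0 * (d - 1) - x * x) / (2.0 * math.pi * (d * d - x * x))

def integrate(f, a, b, steps=100000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

# Each density should have total mass 1.
mass_sc = integrate(semicircle_density, -2.0, 2.0)
mass_er = integrate(lambda x: er_limit_density(x, 0.3), -1.0, 1.0)
mass_km = integrate(lambda x: kesten_mckay_density(x, 3), -3.0, 3.0)
# Second moments: p(1-p) for Eq (3.3), and d for Eq (3.6).
var_er = integrate(lambda x: x * x * er_limit_density(x, 0.3), -1.0, 1.0)
m2_km = integrate(lambda x: x * x * kesten_mckay_density(x, 3), -3.0, 3.0)
```

The second moments are a useful cross-check of the normalizations: $\mu_{p,sc}$ has variance $p(1-p)$, while the unscaled spectrum of a $d$-regular graph has second moment $d$, since $\mathrm{trace}(A^2)/n = d$.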
4. Fitting a model
Consider a parameterized random graph model $\{P_\theta; \theta \in \Theta\}$, where $P_\theta$ is a probability distribution on a collection of graphs, described through the parameter $\theta \in \Theta$, $\Theta \subset \mathbb{R}^d$. Let $G$ be a random graph from some distribution $P$ (that might or might not belong to the parametric model $\{P_\theta; \theta \in \Theta\}$). We want to estimate a parameter $\theta$ such that $P_\theta$ is close to $P$. In many random graph models, the ESD of graphs generated under the distribution $P_\theta$ converges weakly in some sense (for $P_\theta$) to some distribution that we denote $\mu_\theta$ whenever this limit exists.
ESDs are discrete distributions. In general, the limits $\mu_\theta$ are absolutely continuous measures (that is, they have densities with respect to the Lebesgue measure on the spectrum set $\Lambda$). It is thus natural to consider a kernel estimator of the spectrum's density as defined in Eq (2.4). We denote by $\phi_{x,\sigma}(\cdot) = \sigma^{-1} \phi\left(\frac{x - \cdot}{\sigma}\right)$ a kernel function (a function such that $\int \phi(x)\, dx = 1$) with parameters $x \in \mathbb{R}$ and $\sigma > 0$. The kernel estimator of the spectral density with bandwidth $\sigma$ is defined as
$$\mu_G(\phi_{x,\sigma}) = \frac{1}{n\sigma} \sum_{i=1}^{n} \phi\left(\frac{x - \lambda_i^G/\sqrt{n}}{\sigma}\right).$$
Suppose the bandwidth $\sigma = \sigma_n$ goes to 0 as $n$ tends to infinity. In that case, this estimator should approximate the limiting spectral density $\mu_\theta$ of the graph. Note that we choose to focus on the ESD derived from the normalized adjacency matrix, with normalization $\sqrt{n}$, while the literature review from Section 3 contains results for varying normalization. Section 6 will investigate this point further.
Now, given a divergence measure $D$ on the set of probability measures over the spectrum of the graphs, we estimate $\theta$ by solving the minimization problem
$$\hat{\theta} = \operatorname*{argmin}_{\theta \in \Theta} D(\mu_G(\phi_{\cdot,\sigma}), \mu_\theta(\phi_{\cdot,\sigma})). \tag{4.1}$$
We either use an analytical expression for $\mu_\theta$ (when available) or rely on Monte Carlo estimates. In the latter case, we sample $M$ graphs $G_1, G_2, \cdots, G_M$ from the distribution $P_\theta$. Then, for each graph $G_m$, we numerically obtain $\mu_{G_m}(\phi_{x,\sigma})$, and we estimate $\mu_\theta(\phi_{x,\sigma})$ by
$$\hat{\mu}_\theta(\phi_{x,\sigma}) = \frac{1}{M} \sum_{m=1}^{M} \mu_{G_m}(\phi_{x,\sigma}) = \frac{1}{nM\sigma} \sum_{m=1}^{M} \sum_{i=1}^{n} \phi\left(\frac{x - \lambda_i^{G_m}/\sqrt{n}}{\sigma}\right).$$
The parameter estimation problem (4.1) is based on the assumption that the random graph $G$ was generated by a known family of random graphs $\{P_\theta; \theta \in \Theta\}$ (model). That is, we assume that $P \in \{P_\theta; \theta \in \Theta\}$. However, usually, the model $\{P_\theta; \theta \in \Theta\}$ is unknown. Then, given a list of random graph models, we choose the one with the best fit according to some objective criterion. If we consider random graph models with the same complexity (number of parameters), we choose the model with the smallest divergence ($D$) between the spectral densities, as described in Algorithm 4.1. Otherwise, we penalize the divergence $D$ by a function of the number of parameters.
A previous work by [35] uses AIC as a criterion to choose a model. Their approach for problem (4.1) is based on the Kullback-Leibler divergence $KL(\mu_G(\phi_x), \mu_\theta(\phi_x))$, where $\phi$ is the Gaussian kernel. In this case, we obtain
$$D(\mu_G(\phi_{x,\sigma}), \mu_\theta(\phi_{x,\sigma})) = \frac{1}{n\sigma} \sum_{i=1}^{n} \int \phi\left(\frac{x - \lambda_i^G/\sqrt{n}}{\sigma}\right) \log\left(\frac{\sum_{j=1}^{n} \phi\left(\frac{x - \lambda_j^G/\sqrt{n}}{\sigma}\right)}{n\sigma\, \mu_\theta(\phi_{x,\sigma})}\right) dx.$$
Algorithm 4.1 Procedure for fitting and selecting a random graph model.
Input: Graph $G$, a list of random graph models $\{P^i_\theta; \theta \in \Theta_i \subset \mathbb{R}^d\}$, a finite subset $\tilde{\Theta}_i \subset \Theta_i$, for $i = 1, 2, \ldots, N$.
Output: Return model and parameter with best fit.
  Min ← $+\infty$
  Argmin ← (…, …)
  Compute the kernel density estimator $\mu_G(\phi_{\cdot,\sigma})$.
  for each parameterized random graph model $\{P^i_\theta; \theta \in \Theta_i\}$, $i = 1, 2, \cdots, N$, do
    for each $\theta_j \in \tilde{\Theta}_i$ do
      if the limiting ESD from $P^i_{\theta_j}$, denoted by $\mu^i_{\theta_j}$, is known analytically then
        $D_{i,j}$ ← $D(\mu_G(\phi_{\cdot,\sigma}), \mu^i_{\theta_j}(\phi_{\cdot,\sigma}))$.
      else
        Sample $M$ graphs $G_1, G_2, \cdots, G_M$ from $P^i_{\theta_j}$.
        for each graph $G_m$ do
          Compute the kernel density estimator $\mu_{G_m}(\phi_{\cdot,\sigma})$.
        end for
        $\hat{\mu}^i_{\theta_j}(\phi_{\cdot,\sigma})$ ← $\frac{1}{M} \sum_{m=1}^{M} \mu_{G_m}(\phi_{\cdot,\sigma})$.
        $D_{i,j}$ ← $D(\mu_G(\phi_{\cdot,\sigma}), \hat{\mu}^i_{\theta_j}(\phi_{\cdot,\sigma}))$
      end if
      if $D_{i,j} <$ Min then
        Argmin ← $(i, j)$
        Min ← $D_{i,j}$
      end if
    end for
  end for
  return Argmin.
In Section 5, we will concentrate on $D$ being the $\ell_1$-distance. In this case, we have
$$D(\mu_G(\phi_{\cdot,\sigma}), \mu_\theta(\phi_{\cdot,\sigma})) = \|\mu_G(\phi_{\cdot,\sigma}) - \mu_\theta(\phi_{\cdot,\sigma})\|_1 = \int |\mu_G(\phi_{x,\sigma}) - \mu_\theta(\phi_{x,\sigma})|\, dx.$$
Suppose that the limiting ESD from the parameterized random graph model is known analytically and set $D$ as the $\ell_1$-distance. In that case, we have the convergence properties described in the next section.
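The analytical branch of the fitting procedure can be illustrated on the ER model, whose limiting ESD under the $\sqrt{n}$ normalization is given by Eq (3.3). The following is a minimal sketch, assuming NumPy; the helper names, the fixed bandwidth `sigma=0.05` (instead of a data-driven choice), the evaluation grid, and the candidate set are all illustrative assumptions. Since the density in Eq (3.3) depends on $p$ only through $p(1-p)$, the candidate values are restricted to $(0, 1/2]$, where the map $p \mapsto \mu_{p,sc}$ is injective.

```python
import numpy as np

def sample_er(n, p, rng):
    """Adjacency matrix of an ER graph: independent edges with probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def smoothed_esd(eigs, grid, sigma):
    """Kernel estimator mu_G(phi_{x,sigma}) evaluated at every x in grid."""
    return gaussian_kernel((grid[:, None] - eigs[None, :]) / sigma).mean(axis=1) / sigma

def er_limit_density(x, p):
    """Density of the limit in Eq (3.3)."""
    v = p * (1.0 - p)
    return np.sqrt(np.clip(4.0 * v - x**2, 0.0, None)) / (2.0 * np.pi * v)

def smoothed_limit(grid, p, sigma):
    """mu_theta(phi_{x,sigma}): limit density convolved with the kernel (Riemann sum)."""
    step = grid[1] - grid[0]
    weights = gaussian_kernel((grid[:, None] - grid[None, :]) / sigma) / sigma
    return weights @ er_limit_density(grid, p) * step

def fit_er(adj, candidates, grid, sigma):
    """Grid search for the candidate p minimizing the l1-distance of Eq (4.1)."""
    n = adj.shape[0]
    eigs = np.linalg.eigvalsh(adj) / np.sqrt(n)
    step = grid[1] - grid[0]
    estimate = smoothed_esd(eigs, grid, sigma)
    distances = [float(np.sum(np.abs(estimate - smoothed_limit(grid, p, sigma))) * step)
                 for p in candidates]
    return candidates[int(np.argmin(distances))], distances

rng = np.random.default_rng(0)
adj = sample_er(300, 0.3, rng)            # hypothetical observed graph
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]    # finite subset of the parameter space
grid = np.linspace(-1.5, 1.5, 601)        # bounded spectrum window
p_hat, distances = fit_er(adj, candidates, grid, sigma=0.05)
```

The largest eigenvalue of a non-centered ER adjacency matrix is of order $\sqrt{n}\,p$ after scaling and falls outside the grid; it carries mass $1/n$, so its effect on the $\ell_1$-distance is small in this sketch.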
5. Convergence properties
In what follows, we consider as divergence the $\ell_1$-distance and denote it $\|\cdot\|_1$. We also denote by $\mathcal{M}(\Lambda)$ the set of probability measures over the spectrum $\Lambda$.
THEOREM 5.1 Let $\{P_\theta; \theta \in \Theta\}$ denote a parameterized random graph model and assume that for any $\theta \in \Theta$, there exists $\mu_\theta$, the limiting ESD with respect to weak almost sure convergence. We assume that $\mu_\theta \in \mathcal{M}(\Lambda)$, where $\Lambda$ is a bounded set. Consider $(G_n)_{n \geq 1}$ a sequence of random graphs from distribution $P_{\theta^\star}$. Let $\phi$ be a kernel and $\sigma = \sigma_n$ a bandwidth that converges to 0. If the map $\theta \in \Theta \mapsto \mu_\theta(d\lambda) \in \mathcal{M}(\Lambda)$ is injective and continuous, and $\Theta$ is compact, then the minimizer
$$\hat{\theta}_n = \operatorname*{argmin}_{\theta \in \Theta} \|\mu_{G_n}(\phi_{\cdot,\sigma}) - \mu_\theta(\phi_{\cdot,\sigma})\|_1$$
converges in probability to $\theta^\star$ as $n \to \infty$.
Proof. The proof is based on [Theorem 3.2.8 from 13] about minimum contrast estimators. The first step is establishing the convergence of the contrast function $D_n(\theta, \theta^\star) := \|\mu_{G_n}(\phi_{\cdot,\sigma}) - \mu_\theta(\phi_{\cdot,\sigma})\|_1$. Assuming that $\mu_\theta$ exists for any $\theta \in \Theta$, we get that for any $x \in \mathbb{R}$,
$$\mu_{G_n}(\phi_{x,\sigma}) = \frac{1}{n} \sum_{i=1}^{n} \phi_{x,\sigma}(\lambda_i^{G_n}/\sqrt{n}) \underset{n \to \infty}{\longrightarrow} \mu_{\theta^\star}(\phi_{x,\sigma}),$$
$P_{\theta^\star}$-almost surely. Then, from the dominated convergence theorem, we have
$$D_n(\theta, \theta^\star) \underset{n \to \infty}{\longrightarrow} \|\mu_{\theta^\star}(\phi_{\cdot,\sigma}) - \mu_\theta(\phi_{\cdot,\sigma})\|_1 := D(\theta, \theta^\star),$$
$P_{\theta^\star}$-almost surely. Now, the second step is to establish that the limiting function $\theta \mapsto D(\theta, \theta^\star)$ has a strict minimum at $\theta^\star$. Notice that the limit $D(\theta, \theta^\star)$ is minimum if and only if $\mu_{\theta^\star}(\phi_{\cdot,\sigma}) = \mu_\theta(\phi_{\cdot,\sigma})$ almost everywhere (with respect to the Lebesgue measure), namely
$$\forall x \in \mathbb{R}, \quad \int \phi[(x - \lambda)/\sigma]\, \mu_{\theta^\star}(d\lambda) = \int \phi[(x - \lambda)/\sigma]\, \mu_\theta(d\lambda).$$
Classical properties of kernels imply that letting $\sigma \to 0$, we get $\mu_\theta(d\lambda) = \mu_{\theta^\star}(d\lambda)$. Since $\theta \mapsto \mu_\theta$ is injective, this implies $\theta = \theta^\star$. Thus, the limit $D(\theta, \theta^\star)$ is minimum if and only if $\theta = \theta^\star$.
Finally, the map $\theta \mapsto \mu_\theta(d\lambda)$ is continuous on the compact set $\Theta$ and thus uniformly continuous. This implies that $\forall \varepsilon > 0$, $\exists \eta > 0$ such that if $\|\theta - \theta'\| \leq \eta$ then $|\mu_\theta(d\lambda) - \mu_{\theta'}(d\lambda)| \leq \varepsilon$. Thus we get that
$$|D_n(\theta, \theta^\star) - D_n(\theta', \theta^\star)| \leq \|\mu_\theta(\phi_{\cdot,\sigma}) - \mu_{\theta'}(\phi_{\cdot,\sigma})\|_1 \leq \sigma^{-1} \int \int_\Lambda \phi[(x - \lambda)/\sigma] \times |\mu_\theta(d\lambda) - \mu_{\theta'}(d\lambda)|\, dx \leq \int \int \phi(y) \times |\mu_\theta(x - \sigma y) - \mu_{\theta'}(x - \sigma y)|\, dy\, dx \leq \varepsilon |\Lambda|$$
as soon as $\|\theta - \theta'\| \leq \eta$, where $|\Lambda|$ denotes the length of the interval $\Lambda$. For clarity, we assumed in the last line that $\mu_\theta$ is a measure with a density, but this has no consequence. This establishes a uniform convergence (with respect to $\theta$) of the contrast function $D_n$ as $n$ tends to infinity. Relying on [Theorem 3.2.8 from 13], we obtain the desired result. $\square$
We can establish a similar result when we consider cumulative distribution functions (CDF) of eigenvalues instead of spectral densities.
THEOREM 5.2 Let $\{P_\theta; \theta \in \Theta\}$ denote a parameterized random graph model and assume that for any $\theta \in \Theta$, there exists $\mu_\theta$, the limiting ESD with respect to weak almost sure convergence. We assume that $\mu_\theta \in \mathcal{M}(\Lambda)$, where $\Lambda$ is a bounded set. Consider $(G_n)_{n \geq 1}$ a sequence of random graphs from distribution $P_{\theta^\star}$. If the map $\theta \in \Theta \mapsto \mu_\theta(d\lambda) \in \mathcal{M}(\Lambda)$ is injective and continuous, and $\Theta$ is compact, then the minimizer
$$\hat{\theta}_n = \operatorname*{argmin}_{\theta} \|F_{G_n} - F_\theta\|_1$$
converges in probability to $\theta^\star$ as $n \to \infty$.
Proof. The proof follows the same lines as the previous one, by first noting that for any $x$, the function $\lambda \mapsto \mathbf{1}\{\lambda \leq x\}$ is continuous almost everywhere. Thus
$$F_{G_n}(x) \underset{n \to \infty}{\longrightarrow} F_{\theta^\star}(x),$$
$P_{\theta^\star}$-almost surely. For the second step, notice that $F_\theta$ entirely characterizes the distribution $\mu_\theta$. Finally, we similarly prove that
$$|D_n(\theta, \theta^\star) - D_n(\theta', \theta^\star)| \leq \|F_\theta - F_{\theta'}\|_1 \leq \int_\Lambda \int_\Lambda \mathbf{1}\{\lambda \leq x\}\, |\mu_\theta(d\lambda) - \mu_{\theta'}(d\lambda)|\, dx \leq \varepsilon |\Lambda|$$
as soon as $\|\theta - \theta'\| \leq \eta$. $\square$
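The CDF-based estimator of Theorem 5.2 avoids the choice of a kernel and bandwidth. The sketch below illustrates it for the ER model, using the closed-form CDF of a semicircle distribution with radius $2\sqrt{p(1-p)}$ (obtained by integrating Eq (3.3)). It assumes NumPy; the helper names, the grid, and the candidate set are illustrative assumptions, and the $\ell_1$-distance is approximated by a Riemann sum over a bounded window.

```python
import numpy as np

def sample_er(n, p, rng):
    """Adjacency matrix of an ER graph: independent edges with probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T

def er_limit_cdf(x, p):
    """CDF of the semicircle limit in Eq (3.3), with radius r = 2 sqrt(p(1-p))."""
    r = 2.0 * np.sqrt(p * (1.0 - p))
    t = np.clip(x, -r, r)
    return 0.5 + t * np.sqrt(r * r - t * t) / (np.pi * r * r) + np.arcsin(t / r) / np.pi

def empirical_cdf(eigs, grid):
    """F_G of Eq (2.2) evaluated at every point of grid."""
    return np.searchsorted(np.sort(eigs), grid, side="right") / len(eigs)

def fit_er_cdf(adj, candidates, grid):
    """Grid search minimizing the l1-distance between empirical and limiting CDFs."""
    n = adj.shape[0]
    eigs = np.linalg.eigvalsh(adj) / np.sqrt(n)
    step = grid[1] - grid[0]
    f_emp = empirical_cdf(eigs, grid)
    distances = [float(np.sum(np.abs(f_emp - er_limit_cdf(grid, p))) * step)
                 for p in candidates]
    return candidates[int(np.argmin(distances))], distances

rng = np.random.default_rng(0)
adj = sample_er(300, 0.3, rng)            # hypothetical observed graph
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]    # restricted to (0, 1/2] for injectivity
grid = np.linspace(-1.5, 1.5, 601)
p_hat_cdf, cdf_distances = fit_er_cdf(adj, candidates, grid)
```

As in the density-based version, the candidates are kept in $(0, 1/2]$ because the limit depends on $p$ only through $p(1-p)$, so the injectivity assumption of the theorem would fail on the whole interval $(0, 1)$.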
6. Simulations
In what follows, we describe experiments to study the block model spectral density, illustrate Theorems 5.1 and 5.2, and evaluate the performance of the model selection procedures. We used the Gaussian kernel to estimate the ESD and Silverman's criterion [33] to choose the bandwidth $\sigma$ in all experiments. To generate random graphs, we used the package igraph for R [12].
First, in Section 6.1, we analyze the ESD of BM random graphs. Indeed, the results in [3] are limited to a specific BM, and we empirically explore the validity of the convergence of the ESD in more general BM. Then, in Section 6.2, we analyze the parameter estimator's empirical behavior to confirm the results given by Theorems 5.1 and 5.2 in scenarios in which a limiting spectral density exists and has a known analytical form. In Section 6.3, we further explore scenarios in which the limiting spectral density may not exist. The distribution of the random graph model's eigenvalues is estimated empirically, as described in Algorithm 4.1. Finally, in Section 6.4, we show experiments for the model selection procedure.
6.1 Block Model spectral density
We generated BM graphs randomly and compared the ESD of the adjacency matrix $A/\gamma(n)$ with that of the centered matrix $\tilde{A}$ given in (3.4) and with the theoretical density derived by [3], recalled in Proposition 3.1. We considered five different scenarios, described as follows.

1. Scenario 1. Graph generated by the same model described by [3] with a small number of blocks ( M = K = p = . 2, and the probabilities within the groups ( p , p , p ) to ( . , . , . ) .
2. Scenario 2. Graph generated by the same model described by [3] with a larger number of groups ( M = K = p = . ( p , p , p , p , p , p , p , p , p , p ) to ( . , . , . , . , . , . , . , . , . , . ) .
3. Scenario 3. Generalization of the model described by [3] for different block sizes. We considered M = p = . 2, and the probabilities within the groups ( p , p , p ) to ( . , . , . ) .
4. Scenario 4. Generalization of the model described by [3] for p larger than the probabilities within the groups. We considered M = K = p = . 9, and the probabilities within the groups ( p , p , p ) to ( . , . , . ) .
5. Scenario 5. Generalization of the model described by [3] with different probabilities of connecting vertices from different groups. We considered M = K = p ) to p = . p = . 2, and p = . 05. The probabilities within the groups ( p , p , p ) are ( . , . , . ) .

We compared the ESD of both the centered matrix $\tilde{A}$ and the non-centered matrix $A/\gamma(n)$ to the theoretical density of [3] in Figure 1. Note that the normalizing term $\gamma(n)$ depends on the unknown parameter value $p^\star$. The density is computed numerically by inverting the Stieltjes transform.
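The inversion step mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the computation used for the block model: the Stieltjes transform of [3] is defined through the equations of Proposition 3.1, which are not reproduced here, so the standard semicircle law (whose transform has a closed form) is used as a stand-in. The convention $s(z) = \int \mu(d\lambda)/(\lambda - z)$ and the function names are assumptions of this sketch.

```python
import cmath
import math


def semicircle_stieltjes(z):
    """Stieltjes transform s(z) of the standard semicircle law on [-2, 2]:
    the root of s**2 + z*s + 1 = 0 with positive imaginary part."""
    root = cmath.sqrt(z * z - 4.0)
    s = (-z + root) / 2.0
    if s.imag < 0:
        s = (-z - root) / 2.0
    return s


def density_from_stieltjes(stieltjes, x, eta=1e-6):
    """Approximate the density at x by Im s(x + i*eta) / pi, eta > 0 small."""
    return stieltjes(complex(x, eta)).imag / math.pi


# Compare the inverted transform with the closed-form semicircle density.
for x in (-1.0, 0.0, 1.0, 3.0):
    exact = math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)
    print(x, density_from_stieltjes(semicircle_stieltjes, x), exact)
```

Any other transform that can be evaluated slightly above the real axis could be passed to `density_from_stieltjes` in the same way. The offset `eta` trades bias against numerical stability near the edges of the support.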
[Figure 1: five panels, (a) Scenario 1, (b) Scenario 2, (c) Scenario 3, (d) Scenario 4, (e) Scenario 5; x-axis: Eigenvalues, y-axis: Density.]

FIG. 1: Comparison between the empirical spectral distribution (ESD) of the block model for matrices $\tilde{A}$ (solid black line) and $A/\gamma(n)$ (red dashed line), and the theoretical distribution derived by [3] (green dotted line). We considered different scenarios: (a) a small number of blocks, (b) a larger number of blocks, (c) blocks of different sizes, (d) probability between groups larger than the probabilities within the groups, and (e) different probabilities of connecting vertices from different groups.

For all the scenarios considered here, the ESDs of $A/\gamma(n)$ and $\tilde{A}$ were very close, suggesting that they might converge to the same distribution. For scenarios 1, 2, 4, and 5, the empirical densities of $\tilde{A}$ and $A/\gamma(n)$ are close to the theoretical distribution derived by [3], suggesting that the results of Avrachenkov et al. could be further investigated for models with different probabilities of connecting vertices from different groups, and for large p. For scenario 3, the empirical densities are farther from the theoretical density. This might be due to numerical errors during ESD estimation, or might indicate that the results of Avrachenkov et al. do not apply to graphs with blocks of different sizes.

6.2 Illustration of our convergence theorems for ER and DR models
To illustrate Theorems 5.1 and 5.2, we considered scenarios in which we know the analytic form of the limiting ESD, such as the ER and $d$-regular random graph models. Note however that in the latter case, the convergence is known for the adjacency matrix with no scaling, or scaled with $\sqrt{d}$, while our theorems rely on a $\sqrt{n}$ normalization. All experiments described in this section rely on the optimize function from the R package stats to solve (4.1), with precision ε = − . It uses a combination of golden section search and successive parabolic interpolation.

For the ER model, we expect the ESD of $A/\sqrt{n}$ to converge to the semicircle law when $p = \omega(n^{-1})$. In that case, the assumptions of Theorems 5.1 and 5.2 hold when Θ = [ , . ] or Θ = [ . , ], and then the $\ell_1$-distance minimizer for problem (4.1) converges in probability to the true parameter. We generated 1 000 graphs of sizes n = 20, 50, 100, 500, 1 000, 10 000 with parameters p = . , . , . , . , . 9. For p < . p = . 5, and p > . 5, we considered the intervals Θ = [ , . ] , Θ = [ , ] , and Θ = [ . , ] , respectively. In Figure 2, we show the average estimated parameter with a 95% confidence interval for the ER model by using the spectral density (blue solid line) and the cumulative distribution (green dashed line). The red dotted line indicates the true parameter.

[Figure 2: five panels (a)-(e), one per value of p; x-axis: n (20, 50, 100, 500, 1000, 10000), y-axis: Adjusted parameter.]

FIG. 2: Average estimated parameter for the ER model. We generated 1 000 Erdős-Rényi random graphs of sizes n = 20, 50, 100, 500, 1 000, 10 000, with one value of the probability p of connecting two vertices per panel (a)-(e). In each plot, the x-axis and the y-axis correspond, respectively, to the graph size (n) and the estimated parameter. The points and error bars indicate the average estimated parameters and the corresponding 95% confidence intervals. The blue solid lines correspond to the results based on the $\ell_1$-distance between the ESD and the limiting theoretical spectral density of the ER model (semicircle law). The green dashed lines correspond to the results based on the eigenvalues' cumulative distribution. The red dotted lines indicate the true value.
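The ER fitting pipeline described above can be sketched as follows, under several assumptions that are not taken from the paper. Silverman's criterion is assumed to be the rule of thumb $0.9 \min(\mathrm{sd}, \mathrm{IQR}/1.34)\, n^{-1/5}$. The search uses plain golden section search only, whereas R's `optimize` also uses parabolic interpolation. The search interval $[0.01, 0.5]$ and the tolerance are arbitrary choices for the illustration. The "spectrum" is a synthetic semicircle sample rather than the eigenvalues of a simulated graph.

```python
import math
import random
import statistics


def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth (assumed form of Silverman's criterion)."""
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * len(values) ** (-0.2)


def gaussian_kde(values, grid, bandwidth):
    """Gaussian kernel density estimate evaluated on `grid`."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
            for x in grid]


def semicircle_density(x, variance):
    """Semicircle density with the given variance (p(1-p) for the ER model)."""
    inside = 4.0 * variance - x * x
    return math.sqrt(inside) / (2.0 * math.pi * variance) if inside > 0.0 else 0.0


def golden_section_min(f, lo, hi, tol=1e-4):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


# Synthetic stand-in for the spectrum of A / sqrt(n) (assumption).
rng = random.Random(1)
true_p = 0.2
radius = 2.0 * math.sqrt(true_p * (1.0 - true_p))
spectrum = [radius * math.sqrt(rng.random()) * math.cos(2.0 * math.pi * rng.random())
            for _ in range(3000)]

step = 0.02
grid = [-1.5 + step * i for i in range(151)]
kde = gaussian_kde(spectrum, grid, silverman_bandwidth(spectrum))


def l1_distance(p):
    """Riemann-sum l1 distance between the KDE and the semicircle density."""
    v = p * (1.0 - p)
    return step * sum(abs(k - semicircle_density(x, v)) for k, x in zip(kde, grid))


p_hat = golden_section_min(l1_distance, 0.01, 0.5)
print(p_hat)
```

Kernel smoothing slightly widens the estimated density, so a small upward bias in the fitted value is to be expected at moderate sample sizes.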
Similarly, we performed simulation experiments with $d$-regular random graphs in scenarios in which the ESD of $A$ converges weakly almost surely to the Kesten-McKay law. In what follows, we slightly modified the procedure from Algorithm 4.1 to rely on the spectrum of unnormalized adjacency matrices $A$ (rather than $A/\sqrt{n}$). We generated 1 000 graphs of sizes n = 20, 50, 100, 500, 1 000, 10 000 with parameters d = , , 10, $\sqrt{n}$, namely fixed or slowly growing values d. Finally, we set Θ = [ , n ]. Figure 3 shows the average estimated parameter with a 95% confidence interval for the $d$-regular model by using the spectral density (blue solid line) and the cumulative distribution (green dashed line).

[Figure 3: four panels (a)-(d), one per value of d; x-axis: n (20, 50, 100, 500, 1000, 10000), y-axis: Adjusted parameter.]

FIG. 3: Average estimated parameter for the DR model. We generated 1 000 $d$-regular random graphs of sizes n = 20, 50, 100, 500, 1 000, 10 000, with one value of the degree d per panel: fixed values in (a) and (b), d = 10 (c), and $d = \sqrt{n}$ (d). In each plot, the x-axis and the y-axis correspond, respectively, to the graph size (n) and the estimated parameter. The points and error bars indicate the average estimated parameters and the corresponding 95% confidence intervals. The blue solid lines correspond to the results based on the $\ell_1$-distance between the ESD and the limiting theoretical spectral density of the DR model (Kesten-McKay law). The green dashed lines correspond to the results based on the eigenvalues' cumulative distribution. The red dotted lines indicate the true value.

As expected from Theorems 5.1 and 5.2, we observe in Figures 2 and 3 that the estimated parameter approaches the true parameter as the number of vertices increases.
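For the DR experiments, the target density is the Kesten-McKay law. The sketch below evaluates it for the unnormalized adjacency matrix and checks two of its properties numerically. The function names and the midpoint integration rule are choices made for this illustration only.

```python
import math


def kesten_mckay_density(x, d):
    """Kesten-McKay density for a d-regular graph (unnormalized A), d >= 3."""
    inside = 4.0 * (d - 1) - x * x
    if inside <= 0.0:
        return 0.0
    return d * math.sqrt(inside) / (2.0 * math.pi * (d * d - x * x))


def integrate(f, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


for d in (3, 5, 10):
    edge = 2.0 * math.sqrt(d - 1)  # the support is [-edge, edge]
    mass = integrate(lambda x: kesten_mckay_density(x, d), -edge, edge)
    second = integrate(lambda x: x * x * kesten_mckay_density(x, d), -edge, edge)
    print(d, round(mass, 4), round(second, 4))
```

The total mass should be close to 1. The second moment should be close to $d$, because each vertex of a $d$-regular graph has exactly $d$ closed walks of length two. This dependence of the spread on $d$ is what a distance-based fit can exploit.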
6.3 Convergence of the procedure in other scenarios
We also considered scenarios in which the assumptions of Theorems 5.1 and 5.2 may not hold, and a limiting spectral density does not exist or is unknown, such as the GRG, WS, and BA random graphs. In that case, to evaluate the performance of the model fitting procedure based on the $\ell_1$-distance, we estimate the model's spectral density empirically, as described in Algorithm 4.1.

We generated 1 000 graphs of sizes n = 50, 100, 500, 1 000 for each model, with parameters r = . , . , . , . , . (GRG), p_r = . , . , . , . , . (WS), and p_s = . , . , . , . , . (BA). We searched the parameter over the interval [ , ] and inspected values using a step size of 0.01. For computing the eigenvalues, we used the $\sqrt{n}$ normalization of the adjacency matrix.

Figure 4 shows the average estimated parameter and a 95% confidence interval for the GRG model. Results obtained for the WS and BA models are shown in Figures 5 and 6, respectively. The blue solid lines correspond to the results based on the $\ell_1$-distance between the ESD of the observed graph and the average ESD of the model, estimated as described by Algorithm 4.1. Similarly, the green dashed lines correspond to the results based on the eigenvalues' empirical cumulative distribution. The red dotted lines indicate the true parameter.
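The grid search with a Monte Carlo estimate of the model ESD can be sketched as follows. This is only an outline in the spirit of Algorithm 4.1, not a transcription of it. Several simplifications are assumptions of the sketch: `toy_spectrum` is a hypothetical simulator returning synthetic values whose spread grows with the parameter, whereas in practice it would generate a graph and return the eigenvalues of $A/\sqrt{n}$. The density is estimated with a histogram instead of a Gaussian kernel. The grid uses a step of 0.05 rather than 0.01 to keep the example fast.

```python
import math
import random


def histogram_density(values, lo, hi, bins):
    """Histogram density estimate on [lo, hi) with equal-width bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        k = math.floor((v - lo) / width)
        if 0 <= k < bins:
            counts[k] += 1
    return [c / (len(values) * width) for c in counts]


def fit_by_grid_search(observed, simulate, thetas, n_rep, rng,
                       lo=-2.0, hi=2.0, bins=40):
    """Return the theta whose Monte Carlo average density is closest, in l1
    distance, to the density of the observed spectrum."""
    width = (hi - lo) / bins
    obs = histogram_density(observed, lo, hi, bins)
    best_theta, best_dist = None, math.inf
    for theta in thetas:
        avg = [0.0] * bins
        for _ in range(n_rep):  # average the density over simulated spectra
            dens = histogram_density(simulate(theta, rng), lo, hi, bins)
            avg = [a + d / n_rep for a, d in zip(avg, dens)]
        dist = width * sum(abs(o - a) for o, a in zip(obs, avg))
        if dist < best_dist:
            best_theta, best_dist = theta, dist
    return best_theta


def toy_spectrum(theta, rng, size=3000):
    """Hypothetical simulator: semicircle draws with radius 0.5 + theta."""
    radius = 0.5 + theta
    return [radius * math.sqrt(rng.random()) * math.cos(2.0 * math.pi * rng.random())
            for _ in range(size)]


rng = random.Random(2)
true_theta = 0.4
observed = toy_spectrum(true_theta, rng)
thetas = [round(0.05 * k, 2) for k in range(21)]  # grid over [0, 1]
theta_hat = fit_by_grid_search(observed, toy_spectrum, thetas, n_rep=5, rng=rng)
print(theta_hat)
```

The cost is dominated by the nested loop over grid points and Monte Carlo replicates. With a real simulator, each inner iteration would also require a full eigendecomposition.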
[Figure 4: five panels (a)-(e), one per value of r; x-axis: n (50, 100, 500, 1000), y-axis: Adjusted parameter.]

FIG. 4: Average estimated parameter for the GRG model. We generated 1 000 GRG random graphs of sizes n = 50, 100, 500, 1 000, with one value of the radius r per panel (a)-(e). In each plot, the x-axis and the y-axis correspond, respectively, to the graph size (n) and the estimated parameter. The points and error bars indicate the average estimated parameters and the corresponding 95% confidence intervals. The blue solid lines correspond to the results based on the $\ell_1$-distance between the ESD of the observed graph and the average ESD of the GRG model. The green dashed lines correspond to the results based on the eigenvalues' cumulative distribution. The red dotted lines indicate the true value.

[Figure 5: five panels (a)-(e), one per value of p_r; x-axis: n (50, 100, 500, 1000), y-axis: Adjusted parameter.]

FIG. 5: Average estimated parameter for the WS model. We generated 1 000 WS random graphs of sizes n = 50, 100, 500, 1 000, with one value of the probability of reconnecting edges p_r per panel (a)-(e). In each plot, the x-axis and the y-axis correspond, respectively, to the graph size (n) and the estimated parameter. The points and error bars indicate the average estimated parameters and the corresponding 95% confidence intervals. The blue solid lines correspond to the results based on the $\ell_1$-distance between the ESD of the observed graph and the average ESD of the WS model. The green dashed lines correspond to the results based on the eigenvalues' cumulative distribution. The red dotted lines indicate the true value.

[Figure 6: five panels (a)-(e), one per value of p_s; x-axis: n (50, 100, 500, 1000), y-axis: Adjusted parameter.]

FIG. 6: Average estimated parameter for the BA model. We generated 1 000 BA random graphs of sizes n = 50, 100, 500, 1 000, with one value of the scale exponent p_s per panel (a)-(e). In each plot, the x-axis and the y-axis correspond, respectively, to the graph size (n) and the estimated parameter. The points and error bars indicate the average estimated parameters and the corresponding 95% confidence intervals. The blue solid lines correspond to the results based on the $\ell_1$-distance between the ESD of the observed graph and the average ESD of the BA model. The green dashed lines correspond to the results based on the eigenvalues' cumulative distribution. The red dotted lines indicate the true value.

For all models, in most scenarios, we observe that the estimated parameter approaches the true value as the graph size increases. However, in some cases, results for n = 500 and n = p s = . , .

6.4 Model selection
To evaluate the performance of the model selection approach based on the $\ell_1$-distance between the ESD and its theoretical limit, we performed simulation experiments with 1 000 graphs of sizes n = , , 500 generated by the following models: Erdős-Rényi (ER) ( p = β ), geometric (GRG) ( r = β ), $d$-regular (DR) ( d = β ), Watts-Strogatz (WS) ( p r = β ), and Barabási-Albert (BA) ( p s = + β ), where β = 0.1, 0.5, 0.9. Tables 1, 2, and 3 show the confusion matrices obtained by the methods based on β = 0.1, 0.5, and 0.9, respectively.

Table 1: Model selection for graphs generated with β = 0.1. The number of graphs correctly classified by the model selection approach based on the empirical spectral density (left) and the empirical cumulative distribution (right). For each model, we generated 1 000 graphs of sizes n = , , with p = β for the Erdős-Rényi (ER), the radius r = β for the geometric (GRG), the degree d = β for the $d$-regular (DR), the probability of reconnecting edges p r = β for the Watts-Strogatz (WS), and the scale exponent p s = + β for the Barabási-Albert (BA).

n  True model  Selected model: ER DR GRG WS BA
ER 881 | 929 7 | | |
66 0 | | | | | |
050 GRG 4 | 20 0 | |
980 0 | | | | | | | |
16 0 | | | | |
998 7 | | | | | | | | | | | | | | | | | | | | | | | | |
998 6 | | | | | | | | | | | | | | | | | | | | | | | |

We observed in Tables 1, 2, and 3 that the number of hits increases as the graphs become larger. The only exception is when β = .
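The selection rule and the confusion matrices of Tables 1 to 3 can be sketched as follows. The distance values are hypothetical inputs made up for the illustration and are not results from the experiments. The function names are likewise assumptions. In practice, each distance would be the minimum attained when fitting the corresponding model to the observed graph.

```python
def select_model(min_distances):
    """Pick the model whose fitted spectrum is closest to the observed one.

    `min_distances` maps a model name to the l1 distance attained at that
    model's fitted parameter."""
    return min(min_distances, key=min_distances.get)


def confusion_matrix(true_models, selected_models, models):
    """Rows: true model; columns: selected model; entries: counts."""
    table = {t: {s: 0 for s in models} for t in models}
    for t, s in zip(true_models, selected_models):
        table[t][s] += 1
    return table


MODELS = ["ER", "DR", "GRG", "WS", "BA"]

# Hypothetical minimized distances for three observed graphs (illustrative
# numbers only, not taken from the paper).
fitted = [
    ("ER", {"ER": 0.05, "DR": 0.09, "GRG": 0.40, "WS": 0.31, "BA": 0.52}),
    ("ER", {"ER": 0.11, "DR": 0.08, "GRG": 0.37, "WS": 0.29, "BA": 0.55}),
    ("WS", {"ER": 0.33, "DR": 0.30, "GRG": 0.21, "WS": 0.04, "BA": 0.47}),
]
truth = [t for t, _ in fitted]
chosen = [select_model(d) for _, d in fitted]
table = confusion_matrix(truth, chosen, MODELS)
print(chosen)
print(table["ER"])
```

The second hypothetical graph is deliberately misclassified as DR, to mirror the kind of ER and DR confusion that the tables report.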
7. Discussion
By reviewing results on the convergence properties of the ESD of random graphs, we observed that the Wigner semicircle law could be applied to the ER and DR models when the adjacency matrices are adequately normalized, as shown in Table 4. For the BM random graph, we can express the limiting ESD of the centered and normalized matrix through the Stieltjes transform under certain conditions (blocks with the same size, same probabilities of connecting vertices from different groups, and inter-group probability smaller than the intra-group probabilities). Besides, our simulation experiments suggested that the ESDs of non-centered and centered BM adjacency matrices converge similarly. The ESD may also converge to the limiting ESD obtained through the Stieltjes transform when the probabilities of connecting vertices from different groups are different, and when the inter-group probability $p$ is larger than the intra-group probabilities $p_m$.

The ESDs of the geometric random graph and the linear preferential attachment models converge weakly almost surely to a limiting density function as $n \to \infty$. However, an analytical expression for this limiting function is unknown.

Table 2: Model selection for graphs generated with β = 0.5. The number of graphs correctly classified by the model selection approach based on the empirical spectral density (left) and the empirical cumulative distribution (right). For each model, we generated 1 000 graphs of sizes n = , , with p = β for the Erdős-Rényi (ER), the radius r = β for the geometric (GRG), the degree d = β for the $d$-regular (DR), the probability of reconnecting edges p r = β for the Watts-Strogatz (WS), and the scale exponent p s = + β for the Barabási-Albert (BA).

n  True model  Selected model: ER DR GRG WS BA
ER 948 | 911 52 | 89 0 | | | | | | | |
050 GRG 2 | | | | | | | | | | | | | | | |
951 51 | 49 0 | | | | | | | | | | | | | | | | | | | | | | | |
922 92 | 78 0 | | | | | | | | | | | | | | | | | | | | | | |

Table 3: Model selection for graphs generated with β = 0.9. The number of graphs correctly classified by the model selection approach based on the empirical spectral density (left) and the empirical cumulative distribution (right). For each model, we generated 1 000 graphs of sizes n = , , with p = β for the Erdős-Rényi (ER), the radius r = β for the geometric (GRG), the degree d = β for the $d$-regular (DR), the probability of reconnecting edges p r = β for the Watts-Strogatz (WS), and the scale exponent p s = + β for the Barabási-Albert (BA).

n  True model  Selected model: ER DR GRG WS BA
ER 1000 | | | | | | | 996 0 | | |
050 GRG 71 | | | | | | 22 0 | | |
978 0 | | | | | | | | | | | | | | | | | | | | | |
59 0 | | | 941 0 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |

Table 4: Convergences of ESD in different models.

Model | Matrix | Limiting ESD
ER | $A/\sqrt{n}$ | $\mu_{p,sc}(x) = \frac{1}{2\pi p(1-p)} \sqrt{[4p(1-p) - x^2]_+}$
ER | $A/\sqrt{np(1-p)}$ | $\mu_{sc}(x) = \frac{1}{2\pi} \sqrt{[4 - x^2]_+}$
DR | $A$ | $\mu_d(x) = \frac{d}{2\pi (d^2 - x^2)} \sqrt{[4(d-1) - x^2]_+}$
DR | $A/\sqrt{d-1}$ | $\mu_d(x) = \left(1 + \frac{1}{d-1} - \frac{x^2}{d}\right)^{-1} \frac{1}{2\pi} \sqrt{[4 - x^2]_+}$
BM | $(A - E(A))/\sqrt{np^\star(1-p^\star)}$ | expression through Stieltjes transform

There is still no theoretical result on the ESD convergence for other models, such as the WS and BA. However, simulation experiments suggest that the ESD of those models approaches a fixed distribution as the number of vertices increases [18].

Therefore, there are scenarios in which we may assume that the ESD converges weakly to a limiting distribution, as required by Theorems 5.1 and 5.2. If all assumptions hold, then the minimizer of the $\ell_1$-distance between the observed graph's eigenvalue distribution and the limiting ESD of the model converges in probability to the true parameter as $n \to \infty$.

When the assumptions of Theorems 5.1 and 5.2 are satisfied, our simulations in Section 6.2 show that the estimated parameters approach the true values as the graph size increases. When we cannot obtain a limiting spectral distribution, we estimate it empirically by randomly sampling graphs from the random graph model (as described in Algorithm 4.1). In this case, the estimated parameter also approaches the true parameter as the number of vertices increases, as shown by the simulation experiments in Section 6.3.

The model-fitting procedure's performance depends on the choice of the ESD or the CDF. The convergence of the estimated parameter based on the ESD is, in general, faster for the ER model. The performance of the model-fitting procedures based on the ESD and on the CDF was similar for the WS model. For the GRG and BA random graph models, we observed a better performance by using the CDF. The reason is that ESD estimation and bandwidth choice are sometimes difficult. In that case, using the cumulative distribution of the eigenvalues instead of the spectral density is more convenient.

One of the main limitations of the procedure in Algorithm 4.1 is its high computational cost. In the following complexity analysis, we neglect the number of models $N$, which is supposed to be fixed (and small in general). The spectral decomposition has complexity $O(n^3)$. Thus, when the analytical form of the limiting spectral distribution is known, the complexity of Algorithm 4.1 is $O(n^3)$, while when we use a Monte Carlo approximation for the limiting ESD, it increases to $O(n^3 |\tilde{\Theta}| M)$. Future research on reducing the computational cost of Algorithm 4.1 could explore either a faster approximation of the spectrum of each graph and thus its ESD, or an approximation of this ESD. Indeed, a faster approximation of the ESD does not require the direct computation of all eigenvalues [27, 29], and has complexity of the order of $O(|E| C)$, where $C$ is associated with the number of random vectors and the number of moments used by the approximation algorithm. For large graphs, we have $C \ll |E|$. Then, if this approximation is used in our model fitting procedure, the final computational cost could be reduced to $O(|E|)$ (when the analytic form of the limiting ESD is known) or $O(|E| |\tilde{\Theta}| M)$ (when a Monte Carlo approximation is needed). Notice, however, that more experiments are necessary to validate these approximations.

Furthermore, the procedure in Algorithm 4.1 might not be the most recommended approach for classical models such as the ER and BM. The reason is that there are already accurate and efficient parameter estimators [1, 34] for them. However, Algorithm 4.1 gains in generality. Every time we fix a parametric model, we can build a procedure to estimate the parameter and a divergence between the model and the observed graph. This generality is particularly useful when we consider a list of random graph models and want to choose the one that best approximates the observed graph.

Indeed, the experiments in Section 6.4 suggest that choosing the model that minimizes the $\ell_1$-distance between spectral densities may achieve a hit rate of 100% when the graph size is large enough. However, it is essential to notice that certain combinations of parameters may generate similar graphs from different models, such as the ER and DR graphs.

We focused on studying the ESD of the graph adjacency matrix. However, the assumptions of Theorems 5.1 and 5.2 may also hold for other graph matrices, such as the Laplacian and normalized Laplacian, as discussed by [15, 23]. Therefore, the results are general in terms of different random graph models and different graph matrices whose ESDs satisfy the assumptions of Theorems 5.1 and 5.2.
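As a consistency check on the two DR rows of Table 4, the worked computation below applies the change of variables $y = x/\sqrt{d-1}$ to the Kesten-McKay density. It is a sketch under the assumption that the second DR row is the first one rescaled in this way. The symbol $\tilde{\mu}_d$ for the rescaled density is introduced here only for the illustration.

```latex
\mu_d(x) = \frac{d}{2\pi\,(d^2 - x^2)} \sqrt{[4(d-1) - x^2]_+},
\qquad
\tilde{\mu}_d(y) = \sqrt{d-1}\;\mu_d\!\left(\sqrt{d-1}\,y\right)
= \frac{d\,(d-1)}{d^2 - (d-1)\,y^2} \cdot \frac{1}{2\pi} \sqrt{[4 - y^2]_+}
= \left(1 + \frac{1}{d-1} - \frac{y^2}{d}\right)^{-1} \frac{1}{2\pi} \sqrt{[4 - y^2]_+}.
```

The last equality uses $\frac{d^2 - (d-1)y^2}{d(d-1)} = \frac{d}{d-1} - \frac{y^2}{d}$. For fixed $y$ the prefactor tends to 1 as $d \to \infty$, so the rescaled density approaches the semicircle density, which is consistent with the remark that the semicircle law applies to adequately normalized DR adjacency matrices.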
8. Conclusions
We reviewed the convergence properties of the ESDs of random graphs. Our review and simulation study suggest that the ESDs of different random graph models converge to a limiting density function. Based on the ESD's convergence properties, we prove that the minimizer of the $\ell_1$-distance between the ESD and the limiting density function is a consistent estimator of the random graph model parameter. This result also holds for the CDF of the eigenvalues. The main advantage of this model-fitting approach is its generality.

Furthermore, our simulation experiments suggest that the procedures proposed by Takahashi et al. [35] may be modified to rely on the $\ell_1$-distance between eigenvalue distribution functions to select a random graph model.

In our computational experiments, we used code from the statGraph package available at CRAN (https://cran.r-project.org/web/packages/statGraph/). A zip file containing the R code for all the experiments is available at .

Acknowledgment
This work was supported by São Paulo Research Foundation [grant numbers 2015/21162-4, 2017/12074-0, 2018/21934-5, 2019/22845-9, 2020/08343-8], National Council for Scientific and Technological Development [grant number 303855/2019-3], Coordination for the Improvement of Higher Education Personnel (Finance code 001), Alexander von Humboldt Foundation, and the Academy of Medical Sciences - Newton Fund.

REFERENCES

1. Ambroise, C. & Matias, C. (2012) New consistent and asymptotically normal parameter estimates for random-graph mixture models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), (1), 3–35.
2. Arnold, L. (1967) On the asymptotic distribution of the eigenvalues of random matrices. Journal of Mathematical Analysis and Applications, (2), 262–268.
3. Avrachenkov, K., Cottatellucci, L. & Kadavankandy, A. (2015) Spectral properties of random matrices for stochastic block model. In The 2015 4th International Workshop on Physics-Inspired Paradigms in Wireless Communications and Networks, pages 537–544.
4. Bai, Z. D. (1999) Methodologies in spectral analysis of large dimensional random matrices, a review. Statistica Sinica, (3), 611–662.
5. Barabási, A.-L. & Albert, R. (1999) Emergence of Scaling in Random Networks. Science, (5439), 509–512.
6. Bauerschmidt, R., Knowles, A. & Yau, H.-T. (2017) Local Semicircle Law for Random Regular Graphs. Comm. Pure Appl. Math., (10), 1898–1960.
7. Bhamidi, S., Evans, S. N. & Sen, A. (2012) Spectra of Large Random Trees. Journal of Theoretical Probability, (3), 613–654.
8. Bordenave, C. (2008) Eigenvalues of Euclidean random matrices. Random Structures & Algorithms, (4), 515–532.
9. Breiger, R. L., Boorman, S. A. & Arabie, P. (1975) An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, (3), 328–383.
10. Chatterjee, S. & Diaconis, P. (2013) Estimating and understanding exponential random graph models. Ann. Statist., (5), 2428–2461.
11. Chung, F. & Lu, L. (2002) Connected Components in Random Graphs with Given Expected Degree Sequences. Annals of Combinatorics, (2), 125–145.
12. Csárdi, G. & Nepusz, T. (2006) The igraph software package for complex network research. R package version 1.2.6.
13. Dacunha-Castelle, D. & Duflo, M. (1986) Probability and statistics. Vol. II. Springer-Verlag, New York.
14. Davidson, E. & Levin, M. (2005) Gene regulatory networks. PNAS, (14), 4935–4935.
15. Ding, X. & Jiang, T. (2010) Spectral distributions of adjacency and Laplacian matrices of random graphs. The Annals of Applied Probability, (6), 2086–2117.
16. Dumitriu, I. & Pal, S. (2012) Sparse regular random graphs: Spectral density and eigenvectors. The Annals of Probability, (5), 2197–2235.
17. Erdős, P. & Rényi, A. (1959) On random graphs. Publ. Math. Debrecen, , 290–297.
18. Farkas, I. J., Derényi, I., Barabási, A.-L. & Vicsek, T. (2001) Spectra of “real-world” graphs: Beyond the semicircle law. Physical Review E, (2).
19. Faust, K. & Wasserman, S. (1992) Blockmodels: Interpretation and evaluation. Social Networks, (1), 5–61. Special Issue on Blockmodels.
20. Fienberg, S. E. & Wasserman, S. S. (1981) Categorical Data Analysis of Single Sociometric Relations. Sociological Methodology, , 156–192.
21. Frank, O. & Harary, F. (1982) Cluster inference by using transitivity indices in empirical graphs. J. Amer. Statist. Assoc., (380), 835–840.
22. Grenander, U. (2008) Probabilities on algebraic structures. Mineola, NY: Dover Publications, reprint of the 1963 original edition.
23. Gu, J., Jost, J., Liu, S. & Stadler, P. F. (2016) Spectral classes of regular, random, and empirical graphs. Linear Algebra and its Applications, , 30–49.
24. Hamidouche, M., Cottatellucci, L. & Avrachenkov, K. (2019) Spectral Analysis of the Adjacency Matrix of Random Geometric Graphs. In , pages 208–214. IEEE Press.
25. Holland, P. W., Laskey, K. B. & Leinhardt, S. (1983) Stochastic blockmodels: First steps. Social Networks, (2), 109–137.
26. Hückel, E. (1931) Quantentheoretische Beiträge zum Benzolproblem. Zeitschrift für Physik, (3), 204–286.
27. Lin, L., Saad, Y. & Yang, C. (2016) Approximating Spectral Densities of Large Matrices. SIAM Review, (1), 34–65.
28. McKay, B. D. (1981) The expected eigenvalue distribution of a large regular graph. Linear Algebra and its Applications, , 203–216.
29. Newman, M. E. J., Zhang, X. & Nadakuditi, R. R. (2019) Spectra of random networks with arbitrary degrees. Phys. Rev. E, (4), 042309.
30. Pellegrini, M., Haynor, D. & Johnson, J. M. (2004) Protein interaction networks. Expert Rev Proteomics, (2), 239–249.
31. Penrose, M. (2003) Random geometric graphs, volume 5 of Oxford Studies in Probability. Oxford: Oxford University Press.
32. Preciado, V. M. & Rahimian, M. A. (2017) Moment-Based Spectral Analysis of Random Graphs with Given Expected Degrees. IEEE Transactions on Network Science and Engineering, (4), 215–228.
33. Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis. Chapman and Hall, Boca Raton.
34. Snijders, T. A. B. & Nowicki, K. (1997) Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure. J. of Classification, (1), 75–100.
35. Takahashi, D. Y., Sato, J. R., Ferreira, C. E. & Fujita, A. (2012) Discriminating Different Classes of Biological Networks by Analyzing the Graphs Spectra Distribution. PLoS ONE, (12), e49949.
36. Tran, L. V., Vu, V. H. & Wang, K. (2013) Sparse Random Graphs: Eigenvalues and Eigenvectors. Random Struct. Algorithms, (1), 110–134.
37. van den Heuvel, M. P. & Hulshoff Pol, H. E. (2010) Exploring the brain network: A review on resting-state fMRI functional connectivity. European Neuropsychopharmacology, (8), 519–534.
38. Von Collatz, L. & Sinogowitz, U. (1957) Spektren endlicher grafen. Abh. Math. Semin. Univ. Hambg., (1), 63–77.
39. Watts, D. J. & Strogatz, S. H. (1998) Collective dynamics of ‘small-world’ networks. Nature, (6684), 440–442.
40. White, H. C., Boorman, S. A. & Breiger, R. L. (1976) Social Structure from Multiple Networks. I. Blockmodels of Roles and Positions. American Journal of Sociology, (4), 730–780.
41. Wigner, E. P. (1958) On the Distribution of the Roots of Certain Symmetric Matrices. Annals of Mathematics, (2), 325–327.
42. Zhu, Y. (2020) A graphon approach to limiting spectral distributions of Wigner-type matrices. Random Structures & Algorithms, 56