Community Detection: Exact Recovery in Weighted Graphs
Mohammad Esmaeili and Aria Nosratinia
Department of Electrical and Computer Engineering, The University of Texas at Dallas
Email: {Esmaeili, Aria}@utdallas.edu
Abstract—In community detection, the exact recovery of communities (clusters) has been mainly investigated under the general stochastic block model with edges drawn from Bernoulli distributions. This paper considers the exact recovery of communities in a complete graph in which the graph edges are drawn from either a set of Gaussian distributions with community-dependent means and variances, or a set of exponential distributions with community-dependent means. For each case, we introduce a new semi-metric that describes sufficient and necessary conditions of exact recovery. The necessary and sufficient conditions are asymptotically tight. The analysis is also extended to incomplete, fully connected weighted graphs.
I. INTRODUCTION
A main thrust of the community detection literature has been the stochastic block model with graph edges drawn from Bernoulli distributions [1]–[7], under various recovery metrics [8]–[17] and algorithms [18]–[22]. The exact recovery threshold of the general stochastic block model was derived in [1] by approximating Binomial distributions with Poisson distributions and utilizing the Chernoff-Hellinger divergence.

While binary edges represent several practical applications and are analytically more tractable, there are many real-world graphs in which edge weights are better modeled by continuous values. For example, brain networks are intrinsically weighted, reflecting a continuous distribution of connectivity strengths between different brain regions [23]. Applications in communications, e.g., data forwarding in Delay Tolerant Networks (DTN) and worm containment in Online Social Networks (OSN) [24], are also well represented by continuous-valued weighted graphs. The edges of social media networks can be of different types, such as simple, weighted, directed, and multi-way (i.e., connecting more than two entities), depending on the network creation process [25]. In biology, community detection is applied to weighted gene networks to reveal cancers and anomalous tissues [26]. For these applications, the stochastic block model with continuous probability density functions, such as Gaussian distributions, is the more appropriate choice.

For community detection from continuous-valued weighted graphs, only a few information-theoretic results are known, mostly under Gaussian distributions. In [27], weak recovery and exact recovery of a hidden community are investigated while the edges are drawn from two different Gaussian distributions. This is a symmetric version of the submatrix localization (also known as noisy biclustering) problem [18], [28], [29]. In the submatrix localization problem, the task is to detect a small block (or blocks) with an atypical mean within a large Gaussian matrix.
Binary symmetric communities with Gaussian distributions are investigated in [30]. The problem of detecting a sparse principal component, based on a sample from a multivariate Gaussian distribution in high dimensions, is considered in [31].

Community detection in a more general setting similar to [1], under well-known continuous probability density functions, is an interesting and challenging problem from both algorithmic and information-theoretic perspectives. This paper investigates this problem and obtains information limits for exact recovery of communities.

The contributions of this paper are as follows. First, we analyze the exact recovery of node labels in a complete graph in which the edge weights are drawn from either a set of Gaussian or a set of exponential distributions whose parameters are determined by the latent labels. Under this model, sufficient and necessary conditions for exact recovery are derived. Second, we extend the results to fully connected but incomplete weighted graphs, by showing that under some conditions the inter- and intra-community probability distributions can be approximated by Gaussian distributions. The contributions of this paper and the techniques used here are widely applicable to other high-dimensional inference problems such as sparse PCA, Gaussian mixture clustering, tensor PCA, and other community detection problems with continuous distributions.

II. SYSTEM MODEL & MAIN RESULTS
Notation: $\mathbb{P}$ indicates the probability operator and $P$ a probability distribution, which is identified by the choice of its variables whenever there is no confusion. A matrix $A$ has columns $A_i$ and elements $A_{ij}$. $\mathbb{R}$ is the set of real numbers, $\mathbb{R}_+$ is the set of non-negative real numbers, and $\mathbb{R}_{++}$ is the set of positive real numbers.

We start by considering a complete graph with $n$ nodes. The graph nodes are divided into $K$ communities, where $K$ is finite. Let $Q$ be a $K \times K$ matrix with entries $Q_{ij}$. A node from community $i$ is connected to a node in community $j$ by a weighted edge drawn from distribution $Q_{ij}$. In this paper, $Q_{ij}$ belongs to either a set of Gaussian or a set of exponential distributions. Let $\mathbf{p} \triangleq [p_1, p_2, \cdots, p_K]$, where $p_i$ denotes the size of community $i$. It is assumed that the size of each community is proportional to $n$, i.e., $p_i = \lfloor \rho_i n \rfloor$, where $\rho_i \in (0,1)$ and $\sum_{i=1}^{K} \rho_i = 1$.

When $Q_{ij}$ belongs to the set of Gaussian distributions, $Q_{ij} = \mathcal{N}(\bar{\mu}_{ij}, \bar{\sigma}_{ij}^2)$. For this case, we define matrices $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ with entries $\mu_{ij} = p_i \bar{\mu}_{ij}$ and $\Sigma_{ij} = p_i \bar{\sigma}_{ij}^2$, respectively. When $Q_{ij}$ belongs to the set of exponential distributions, $Q_{ij} = \mathrm{Exp}(\tilde{\lambda}_{ij})$. For this case, we define a matrix $\boldsymbol{\lambda}$ with entries $\lambda_{ij} = \tilde{\lambda}_{ij}$.

Under the model with Gaussian distributions, assume that each edge is removed by a Bernoulli random variable. Then an edge from a node in community $i$ to a node in community $j$ is removed with probability $1 - \theta_{ij}$. To have a fully connected graph, we consider a regime in which $\theta_{ij} = c_{ij} \frac{\log n}{n}$, where $c_{ij}$ is a constant. For this case, we define a matrix $\boldsymbol{\theta}$ with entries $\theta_{ij}$. In this paper, this model is called an incomplete but fully connected weighted graph with Gaussian distributions.

Now, we summarize the main results of this paper.
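Before stating the results, the Gaussian-weight sampling model above can be illustrated concretely. The following is a minimal sketch with hypothetical parameters (the community fractions, means, and variances are arbitrary choices for illustration, not values from this paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: K = 2 communities, n = 200 nodes.
n = 200
rho = np.array([0.6, 0.4])                  # community fractions, sum to 1
mu_bar = np.array([[2.0, 0.0],              # community-dependent edge means
                   [0.0, 2.0]])
sigma_bar = np.array([[1.0, 1.0],           # community-dependent edge std devs
                      [1.0, 1.5]])

# Community sizes p_i = floor(rho_i * n); labels assigned in blocks.
sizes = np.floor(rho * n).astype(int)
labels = np.repeat(np.arange(len(rho)), sizes)

# Edge (u, v) carries a weight drawn from N(mu_bar[i, j], sigma_bar[i, j]^2),
# where i, j are the communities of u and v; symmetrize, no self-loops.
W = rng.normal(mu_bar[labels][:, labels], sigma_bar[labels][:, labels])
W = np.triu(W, 1)
W = W + W.T
```

For any node `u`, the per-community weight sums used later in the proofs can then be read off as `[W[u, labels == j].sum() for j in range(len(rho))]`.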
For convenience, define the following semi-metrics:
$$D_g(\tilde{\boldsymbol{\mu}}, \hat{\boldsymbol{\mu}}, \tilde{\boldsymbol{\Sigma}}, \hat{\boldsymbol{\Sigma}}) \triangleq \max_{t \in [0,1]} \sum_{k=1}^{K} \Bigg\{ \frac{\tilde{\mu}_k^2 \hat{\Sigma}_k t + \hat{\mu}_k^2 \tilde{\Sigma}_k (1-t)}{2 \tilde{\Sigma}_k \hat{\Sigma}_k} - \frac{\big[\tilde{\mu}_k \hat{\Sigma}_k t + \hat{\mu}_k \tilde{\Sigma}_k (1-t)\big]^2}{2 \tilde{\Sigma}_k \hat{\Sigma}_k \big[\hat{\Sigma}_k t + \tilde{\Sigma}_k (1-t)\big]} - \frac{1}{2} \log\Bigg( \frac{\tilde{\Sigma}_k^{1-t} \hat{\Sigma}_k^{t}}{\tilde{\Sigma}_k (1-t) + \hat{\Sigma}_k t} \Bigg) \Bigg\},$$

$$D_e(\tilde{\boldsymbol{\lambda}}, \hat{\boldsymbol{\lambda}}, \mathbf{p}) \triangleq \max_{t \in [0,1]} \sum_{k=1}^{K} p_k \log\Bigg( \frac{(1-t)\tilde{\lambda}_k + t\hat{\lambda}_k}{\tilde{\lambda}_k^{1-t} \hat{\lambda}_k^{t}} \Bigg).$$

Theorem 1.
With Gaussian distributions,
• when $D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) = \omega(\log n)$, exact recovery of node labels is possible if and only if
$$\min_{i,j,\, i \neq j} D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) > 0.$$
• when $D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) = O(\log n)$, exact recovery of node labels is possible if and only if
$$\min_{i,j,\, i \neq j} \lim_{n \to \infty} \frac{D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j)}{\log n} > 1.$$

Theorem 2.
With exponential distributions,
• when $D_e(\lambda_i, \lambda_j, \mathbf{p}) = \omega(\log n)$, exact recovery of node labels is possible if and only if
$$\min_{i,j,\, i \neq j} D_e(\lambda_i, \lambda_j, \mathbf{p}) > 0.$$
• when $D_e(\lambda_i, \lambda_j, \mathbf{p}) = O(\log n)$, exact recovery of node labels is possible if and only if
$$\min_{i,j,\, i \neq j} \lim_{n \to \infty} \frac{D_e(\lambda_i, \lambda_j, \mathbf{p})}{\log n} > 1.$$

Remark 1.
For both the Gaussian and the exponential cases, when the related semi-metric is $\omega(\log n)$, the exact recovery condition is equivalent to $Q_i \neq Q_j$ for all $i \neq j$.

Corollary 1.
For a fully connected but incomplete weighted graph whose edge weights are Gaussian distributed, exact recovery of node labels is possible if and only if
$$\min_{i,j,\, i \neq j} \lim_{n \to \infty} \frac{D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j)}{\log n} > 1,$$
where
$$\mu_{ij} = p_i \bar{\mu}_{ij} \theta_{ij} \quad \forall i,j, \qquad \Sigma_{ij} = p_i \theta_{ij} \big[\bar{\sigma}_{ij}^2 + (1 - \theta_{ij}) \bar{\mu}_{ij}^2\big] \quad \forall i,j.$$

III. PROOFS
At each node, our problem is equivalent to testing a hypothesis $H$ indicating which community the node belongs to, out of the set of $K$ communities. In our setting, this is a Bayesian problem with prior $\mathbb{P}(H = i) = \rho_i$. For each node, let $W$ be a random vector with entries $W_i$ representing the summation of edge weights connecting the node of interest to nodes in community $i$.

Assume that all node labels are revealed except for one, whose community membership $H$ is to be derived based on an observation of $W$. The maximum a posteriori (MAP) estimator is $\arg\max_i \rho_i P(w \mid H = i)$. A simple comparison can eliminate a candidate, i.e., if
$$\rho_i P(w \mid H = i) < \rho_k P(w \mid H = k), \qquad (1)$$
then $H \neq i$. Therefore, a set of pairwise comparisons of the hypotheses reveals the MAP. Assume that the true hypothesis is $H = i$. Denote by $B(i,k)$ the region of $W$ for which (1) is satisfied, i.e., $H = i$ has a worse metric compared with $H = k$. Also denote by $B(i)$ the region of $W$ where the overall MAP estimator is in error. Then the probability of error is
$$P_e = \sum_i \rho_i \, \mathbb{P}(W \in B(i) \mid H = i).$$
Since $B(i) \subset \bigcup_{k=1}^{K} B(i,k)$,
$$P_e \le \sum_i \sum_{k,\, k \neq i} \rho_i \, \mathbb{P}(W \in B(i,k) \mid H = i). \qquad (2)$$
Define
$$I(w, i, k) \triangleq \min\{\rho_i P(w \mid H = i), \, \rho_k P(w \mid H = k)\},$$
and note that
$$I(w, i, k) = \begin{cases} \rho_i P(w \mid H = i) & \text{when } W \in B(i,k), \\ \rho_k P(w \mid H = k) & \text{when } W \in B^c(i,k). \end{cases} \qquad (3)$$
Substituting (3) into (2),
$$P_e \le \int \sum_i \sum_{k > i} I(w, i, k) \, dw. \qquad (4)$$
The error is bounded from below by
$$P_e \ge \frac{1}{K-1} \int \sum_i \sum_{k > i} I(w, i, k) \, dw, \qquad (5)$$
because
$$\sum_{k,\, k \neq i} \mathbb{P}(W \in B(i,k) \mid H = i) \le (K-1) \, \mathbb{P}(W \in B(i) \mid H = i).$$
Therefore, the error probability is bounded by controlling
$$\int I(w, i, k) \, dw. \qquad (6)$$

A. Proof of Theorem 1
For a node in community $i$, the edge sums $W_j$ are distributed according to $\mathcal{N}(p_j \bar{\mu}_{ji}, p_j \bar{\sigma}_{ji}^2)$, and are independent of each other. We collect these edge sums into the vector $W$, which obeys a multivariate Gaussian distribution with mean denoted $\mu_i$ and covariance matrix $\mathrm{diag}(\Sigma_i)$. Then
$$f(w; \mu_i, \Sigma_i) \triangleq P(w \mid H = i) = \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi \Sigma_{ki}}} \exp\left( -\frac{(w_k - \mu_{ki})^2}{2 \Sigma_{ki}} \right),$$
where $\mu_{ki} = p_k \bar{\mu}_{ki}$ and $\Sigma_{ki} = p_k \bar{\sigma}_{ki}^2$.

Lemma 1.
Let $\tilde{\mu}, \hat{\mu} \in \mathbb{R}^K$, $\tilde{\Sigma}, \hat{\Sigma} \in \mathbb{R}_{++}^K$, and $\tilde{\rho}, \hat{\rho} \in \mathbb{R}_{++}$. If either $\tilde{\mu} \neq \hat{\mu}$ or $\tilde{\Sigma} \neq \hat{\Sigma}$, then
$$\int_{\mathbb{R}^K} \min\{\tilde{\rho} f(w; \tilde{\mu}, \tilde{\Sigma}), \, \hat{\rho} f(w; \hat{\mu}, \hat{\Sigma})\} \, dw \le e^{-D_g(\tilde{\mu}, \hat{\mu}, \tilde{\Sigma}, \hat{\Sigma}) + c_1},$$
$$\int_{\mathbb{R}^K} \min\{\tilde{\rho} f(w; \tilde{\mu}, \tilde{\Sigma}), \, \hat{\rho} f(w; \hat{\mu}, \hat{\Sigma})\} \, dw \ge e^{-D_g(\tilde{\mu}, \hat{\mu}, \tilde{\Sigma}, \hat{\Sigma}) + c_2},$$
where $c_1$ and $c_2$ are some constants.

Proof. Define
$$g_1(t) \triangleq \left( \frac{f(w; \tilde{\mu}, \tilde{\Sigma})}{f(w; \hat{\mu}, \hat{\Sigma})} \right)^{1-t}, \quad g_2(t) \triangleq \left( \frac{f(w; \hat{\mu}, \hat{\Sigma})}{f(w; \tilde{\mu}, \tilde{\Sigma})} \right)^{t}, \quad g_3(t) \triangleq f(w; \tilde{\mu}, \tilde{\Sigma})^{t} f(w; \hat{\mu}, \hat{\Sigma})^{1-t},$$
in which the dependence of $g_1(t)$, $g_2(t)$, and $g_3(t)$ on $w$ is suppressed for notational convenience. Note that $g_3(t)$ can be restated as
$$g_3(t) = e^{-\sum_{k=1}^{K} D_k(t)} \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi \sigma_k(t)}} \exp\left( -\frac{(w_k - \mu_k(t))^2}{2 \sigma_k(t)} \right),$$
where
$$\mu_k(t) \triangleq \frac{\tilde{\mu}_k \hat{\Sigma}_k t + \hat{\mu}_k \tilde{\Sigma}_k (1-t)}{\hat{\Sigma}_k t + \tilde{\Sigma}_k (1-t)}, \qquad \sigma_k(t) \triangleq \frac{\tilde{\Sigma}_k \hat{\Sigma}_k}{\hat{\Sigma}_k t + \tilde{\Sigma}_k (1-t)},$$
$$D_k(t) \triangleq \frac{\tilde{\mu}_k^2 \hat{\Sigma}_k t + \hat{\mu}_k^2 \tilde{\Sigma}_k (1-t)}{2 \tilde{\Sigma}_k \hat{\Sigma}_k} - \frac{\big[\tilde{\mu}_k \hat{\Sigma}_k t + \hat{\mu}_k \tilde{\Sigma}_k (1-t)\big]^2}{2 \tilde{\Sigma}_k \hat{\Sigma}_k \big[\hat{\Sigma}_k t + \tilde{\Sigma}_k (1-t)\big]} - \frac{1}{2} \log\left( \frac{\tilde{\Sigma}_k^{1-t} \hat{\Sigma}_k^{t}}{\tilde{\Sigma}_k (1-t) + \hat{\Sigma}_k t} \right).$$

Lemma 2.
For any $t \in [0,1]$, $\min\{g_1(t), g_2(t)\} \le 1$.

Proof. Both $g_1(t)$ and $g_2(t)$ are monotonic in $t$, and their ratio $g_1(t)/g_2(t)$ does not depend on $t$; thus $\min\{g_1(t), g_2(t)\}$ is also monotonic in $t$. Since $g_1(1) = g_2(0) = 1$, for all $t \in [0,1]$ we have $\min\{g_1(t), g_2(t)\} \le 1$.

It can be shown that for any $t \in [0,1]$,
$$\int_{\mathbb{R}^K} \min\{\tilde{\rho} f(w; \tilde{\mu}, \tilde{\Sigma}), \, \hat{\rho} f(w; \hat{\mu}, \hat{\Sigma})\} \, dw \le \max\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}^K} \min\{f(w; \tilde{\mu}, \tilde{\Sigma}), \, f(w; \hat{\mu}, \hat{\Sigma})\} \, dw = \max\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}^K} g_3(t) \min\{g_1(t), g_2(t)\} \, dw \le \max\{\tilde{\rho}, \hat{\rho}\} \, e^{-\sum_{k=1}^{K} D_k(t)},$$
where the last inequality holds due to Lemma 2 and
$$\int_{\mathbb{R}^K} \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi \sigma_k(t)}} \exp\left( -\frac{(w_k - \mu_k(t))^2}{2\sigma_k(t)} \right) dw = 1.$$
When $t = t^*$ is chosen to minimize $e^{-\sum_{k=1}^{K} D_k(t)}$,
$$\int_{\mathbb{R}^K} \min\{\tilde{\rho} f(w; \tilde{\mu}, \tilde{\Sigma}), \, \hat{\rho} f(w; \hat{\mu}, \hat{\Sigma})\} \, dw \le \max\{\tilde{\rho}, \hat{\rho}\} \, e^{-D_g(\tilde{\mu}, \hat{\mu}, \tilde{\Sigma}, \hat{\Sigma})}.$$

To prove the second half, note that
$$\min\{g_1(t^*), g_2(t^*)\} = g_2(\tau^*), \qquad (7)$$
where $\tau^* \triangleq t^*$ if $\min\{g_1(t^*), g_2(t^*)\} = g_2(t^*)$; otherwise $\tau^* \triangleq t^* - 1$. Hence, at $t^*$,
$$\int_{\mathbb{R}^K} \min\{\tilde{\rho} f(w; \tilde{\mu}, \tilde{\Sigma}), \, \hat{\rho} f(w; \hat{\mu}, \hat{\Sigma})\} \, dw \ge \min\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}^K} \min\{f(w; \tilde{\mu}, \tilde{\Sigma}), \, f(w; \hat{\mu}, \hat{\Sigma})\} \, dw = \min\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}^K} g_3(t^*) \min\{g_1(t^*), g_2(t^*)\} \, dw = \min\{\tilde{\rho}, \hat{\rho}\} \, e^{-D_g(\tilde{\mu}, \hat{\mu}, \tilde{\Sigma}, \hat{\Sigma})} \int_{\mathbb{R}^K} \prod_{k=1}^{K} \frac{1}{\sqrt{2\pi\sigma_k(t^*)}} e^{-\frac{(w_k - \mu_k(t^*))^2}{2\sigma_k(t^*)}} \, g(w_k, \tau^*) \, dw,$$
where
$$g(w_k, \tau^*) = \left( \frac{\tilde{\Sigma}_k}{\hat{\Sigma}_k} \right)^{\tau^*/2} e^{-\left[ \frac{(w_k - \hat{\mu}_k)^2}{2\hat{\Sigma}_k} - \frac{(w_k - \tilde{\mu}_k)^2}{2\tilde{\Sigma}_k} \right] \tau^*}.$$
Since $g(w_k, \tau^*)$ is a non-negative and integrable function of $w_k$, applying a generalized variant of the mean value theorem, there exists $w_k^*$ such that
$$\int_{\mathbb{R}} \frac{1}{\sqrt{2\pi\sigma_k(t^*)}} e^{-\frac{(w_k - \mu_k(t^*))^2}{2\sigma_k(t^*)}} \, g(w_k, \tau^*) \, dw_k = g(w_k^*, \tau^*).$$
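As an aside, the two-sided bound of Lemma 1 can be checked numerically in the scalar case $K = 1$ (a hypothetical sanity check with arbitrary parameters, not part of the proof): the integral of the pointwise minimum should not exceed $\max\{\tilde{\rho},\hat{\rho}\}e^{-D_g}$, and $-\log$ of it should match $D_g$ up to an additive constant.

```python
import numpy as np

# Hypothetical K = 1 parameters for the two populations (tilde vs hat).
mu_t, mu_h = 0.0, 3.0        # means
S_t, S_h = 1.0, 2.0          # variances
rho_t, rho_h = 0.5, 0.5      # priors

def D(t):
    # Per-coordinate exponent D_k(t), specialized to K = 1.
    t1 = (mu_t**2 * S_h * t + mu_h**2 * S_t * (1 - t)) / (2 * S_t * S_h)
    t2 = (mu_t * S_h * t + mu_h * S_t * (1 - t)) ** 2 \
         / (2 * S_t * S_h * (S_h * t + S_t * (1 - t)))
    t3 = 0.5 * np.log(S_t ** (1 - t) * S_h ** t / (S_t * (1 - t) + S_h * t))
    return t1 - t2 - t3

ts = np.linspace(0.0, 1.0, 2001)
D_g = D(ts).max()                      # semi-metric: max over t in [0, 1]

# Quadrature for the left-hand side of Lemma 1 on a fine grid.
w = np.linspace(-25.0, 28.0, 400001)
f_t = np.exp(-(w - mu_t) ** 2 / (2 * S_t)) / np.sqrt(2 * np.pi * S_t)
f_h = np.exp(-(w - mu_h) ** 2 / (2 * S_h)) / np.sqrt(2 * np.pi * S_h)
I = np.minimum(rho_t * f_t, rho_h * f_h).sum() * (w[1] - w[0])

# Upper bound from the proof: I <= max{rho} * exp(-D_g).
assert I <= max(rho_t, rho_h) * np.exp(-D_g)
```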
It can be shown that at $\tau^*$, $g(w_k^*, \tau^*)$ is a positive constant. Therefore,
$$\int_{\mathbb{R}^K} \min\{\tilde{\rho} f(w; \tilde{\mu}, \tilde{\Sigma}), \, \hat{\rho} f(w; \hat{\mu}, \hat{\Sigma})\} \, dw \ge \min\{\tilde{\rho}, \hat{\rho}\} \, e^{-D_g(\tilde{\mu}, \hat{\mu}, \tilde{\Sigma}, \hat{\Sigma}) + c},$$
where $c$ is a constant.

Using Lemma 1 and the bounds (4) and (5), for some constants $c_1$ and $c_2$,
$$P_e \le e^{-D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) + c_1}, \qquad P_e \ge e^{-D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) + c_2}.$$
When $D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) = \omega(\log n)$, as $n$ goes to infinity, exact recovery is possible if and only if
$$\min_{i,j,\, i \neq j} D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) > 0.$$
If $\mu_i$ is close to $\mu_j$ and $\Sigma_i$ is close to $\Sigma_j$, then $D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j) = O(\log n)$. In this regime, exact recovery is possible if and only if
$$\min_{i,j,\, i \neq j} \lim_{n \to \infty} \frac{D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j)}{\log n} > 1.$$

B. Proof of Theorem 2
If the node of interest belongs to community $i$, then $W_j$ is distributed according to $\mathrm{Gamma}(p_j, \lambda_{ji})$. The vector $W$ has independent Gamma entries with different means $p_j / \lambda_{ji}$. Under $H = i$, the random vector $W$ is drawn from a multivariate Gamma distribution with shape parameter $\mathbf{p}$ and rate parameter $\lambda_i \in \mathbb{R}_{++}^K$. Then
$$f(w; \mathbf{p}, \lambda_i) \triangleq P(w \mid H = i) = \prod_{k=1}^{K} \frac{\lambda_{ki}^{p_k}}{\Gamma(p_k)} \, w_k^{p_k - 1} e^{-\lambda_{ki} w_k}.$$

Lemma 3.
Let $\tilde{\lambda}, \hat{\lambda} \in \mathbb{R}_{++}^K$, $\mathbf{p} \in \mathbb{R}_{++}^K$, and $\tilde{\rho}, \hat{\rho} \in \mathbb{R}_{++}$. If $\tilde{\lambda} \neq \hat{\lambda}$,
$$\int_{\mathbb{R}_+^K} \min\{\tilde{\rho} f(w; \mathbf{p}, \tilde{\lambda}), \, \hat{\rho} f(w; \mathbf{p}, \hat{\lambda})\} \, dw \le e^{-D_e(\tilde{\lambda}, \hat{\lambda}, \mathbf{p}) + c_1},$$
$$\int_{\mathbb{R}_+^K} \min\{\tilde{\rho} f(w; \mathbf{p}, \tilde{\lambda}), \, \hat{\rho} f(w; \mathbf{p}, \hat{\lambda})\} \, dw \ge e^{-D_e(\tilde{\lambda}, \hat{\lambda}, \mathbf{p}) + c_2},$$
where $c_1$ and $c_2$ are some constants.

Proof. Define
$$g_1(t) \triangleq \left( \frac{f(w; \mathbf{p}, \hat{\lambda})}{f(w; \mathbf{p}, \tilde{\lambda})} \right)^{1-t}, \quad g_2(t) \triangleq \left( \frac{f(w; \mathbf{p}, \tilde{\lambda})}{f(w; \mathbf{p}, \hat{\lambda})} \right)^{t}, \quad g_3(t) \triangleq f(w; \mathbf{p}, \tilde{\lambda})^{1-t} f(w; \mathbf{p}, \hat{\lambda})^{t},$$
in which the dependence of $g_1(t)$, $g_2(t)$, and $g_3(t)$ on $w$ is suppressed for notational convenience. Notice that Lemma 2 also holds in this case. For any $t \in [0,1]$,
$$\int_{\mathbb{R}_+^K} \min\{\tilde{\rho} f(w; \mathbf{p}, \tilde{\lambda}), \, \hat{\rho} f(w; \mathbf{p}, \hat{\lambda})\} \, dw \le \max\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}_+^K} \min\{f(w; \mathbf{p}, \tilde{\lambda}), \, f(w; \mathbf{p}, \hat{\lambda})\} \, dw = \max\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}_+^K} g_3(t) \min\{g_1(t), g_2(t)\} \, dw \le \max\{\tilde{\rho}, \hat{\rho}\} \, e^{-\sum_{k=1}^{K} p_k \log\left( \frac{(1-t)\tilde{\lambda}_k + t\hat{\lambda}_k}{\tilde{\lambda}_k^{1-t} \hat{\lambda}_k^{t}} \right)},$$
where the last inequality holds due to Lemma 2 and
$$\int_{\mathbb{R}_+^K} \prod_{k=1}^{K} \frac{(\lambda_k(t))^{p_k}}{\Gamma(p_k)} \, w_k^{p_k - 1} e^{-w_k \lambda_k(t)} \, dw = 1,$$
where $\lambda_k(t) \triangleq (1-t)\tilde{\lambda}_k + t\hat{\lambda}_k$. When $t = t^*$ is chosen to maximize $\sum_{k=1}^{K} p_k \log\left( \frac{(1-t)\tilde{\lambda}_k + t\hat{\lambda}_k}{\tilde{\lambda}_k^{1-t} \hat{\lambda}_k^{t}} \right)$,
$$\int_{\mathbb{R}_+^K} \min\{\tilde{\rho} f(w; \mathbf{p}, \tilde{\lambda}), \, \hat{\rho} f(w; \mathbf{p}, \hat{\lambda})\} \, dw \le \max\{\tilde{\rho}, \hat{\rho}\} \, e^{-D_e(\tilde{\lambda}, \hat{\lambda}, \mathbf{p})}.$$
Notice that (7) also holds in this case.
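Before completing the lower bound, the upper bound just derived can be sanity-checked numerically for $K = 1$ (a hypothetical example with arbitrary rates and an integer shape, not part of the proof):

```python
import numpy as np
from math import gamma

# Hypothetical K = 1 parameters: shape p, rates (tilde vs hat), priors.
p = 2
lam_t, lam_h = 1.0, 3.0
rho_t, rho_h = 0.5, 0.5

# D_e specialized to K = 1: max over t in [0, 1] of
# p * log(((1-t)*lam_t + t*lam_h) / (lam_t**(1-t) * lam_h**t)).
ts = np.linspace(0.0, 1.0, 2001)
vals = p * np.log(((1 - ts) * lam_t + ts * lam_h)
                  / (lam_t ** (1 - ts) * lam_h ** ts))
D_e = vals.max()

# Gamma(p, lam) densities on a fine grid over the positive reals.
w = np.linspace(1e-9, 40.0, 400001)
f_t = lam_t ** p / gamma(p) * w ** (p - 1) * np.exp(-lam_t * w)
f_h = lam_h ** p / gamma(p) * w ** (p - 1) * np.exp(-lam_h * w)
I = np.minimum(rho_t * f_t, rho_h * f_h).sum() * (w[1] - w[0])

# Upper bound from the proof: I <= max{rho} * exp(-D_e).
assert I <= max(rho_t, rho_h) * np.exp(-D_e)
```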
Hence, at $t^*$,
$$\int_{\mathbb{R}_+^K} \min\{\tilde{\rho} f(w; \mathbf{p}, \tilde{\lambda}), \, \hat{\rho} f(w; \mathbf{p}, \hat{\lambda})\} \, dw \ge \min\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}_+^K} \min\{f(w; \mathbf{p}, \tilde{\lambda}), \, f(w; \mathbf{p}, \hat{\lambda})\} \, dw = \min\{\tilde{\rho}, \hat{\rho}\} \int_{\mathbb{R}_+^K} g_3(t^*) \min\{g_1(t^*), g_2(t^*)\} \, dw = \min\{\tilde{\rho}, \hat{\rho}\} \, e^{-D_e(\tilde{\lambda}, \hat{\lambda}, \mathbf{p})} \int_{\mathbb{R}_+^K} \prod_{k=1}^{K} \frac{(\lambda_k(t^*))^{p_k}}{\Gamma(p_k)} \, w_k^{p_k - 1} e^{-w_k \lambda_k(t^*)} \, g(w_k, \tau^*) \, dw,$$
where
$$g(w_k, \tau^*) = \left( \frac{\tilde{\lambda}_k}{\hat{\lambda}_k} \right)^{p_k \tau^*} e^{-w_k \tau^* (\tilde{\lambda}_k - \hat{\lambda}_k)}.$$
Since $g(w_k, \tau^*)$ is a non-negative and integrable function of $w_k$, applying a generalized variant of the mean value theorem, there exists $w_k^*$ such that
$$\int_{\mathbb{R}_+} \frac{(\lambda_k(t^*))^{p_k}}{\Gamma(p_k)} \, w_k^{p_k - 1} e^{-w_k \lambda_k(t^*)} \, g(w_k, \tau^*) \, dw_k = g(w_k^*, \tau^*).$$
It can be shown that at $\tau^*$, $g(w_k^*, \tau^*)$ is a positive constant. Therefore,
$$\int_{\mathbb{R}_+^K} \min\{\tilde{\rho} f(w; \mathbf{p}, \tilde{\lambda}), \, \hat{\rho} f(w; \mathbf{p}, \hat{\lambda})\} \, dw \ge \min\{\tilde{\rho}, \hat{\rho}\} \, e^{-D_e(\tilde{\lambda}, \hat{\lambda}, \mathbf{p}) + c},$$
where $c$ is a constant.

Using Lemma 3 and the bounds (4) and (5), for some constants $c_1$ and $c_2$,
$$P_e \le e^{-D_e(\lambda_i, \lambda_j, \mathbf{p}) + c_1}, \qquad P_e \ge e^{-D_e(\lambda_i, \lambda_j, \mathbf{p}) + c_2}.$$
When $D_e(\lambda_i, \lambda_j, \mathbf{p}) = \omega(\log n)$, as $n$ goes to infinity, exact recovery is possible if and only if
$$\min_{i,j,\, i \neq j} D_e(\lambda_i, \lambda_j, \mathbf{p}) > 0.$$
If $\lambda_i$ is close to $\lambda_j$, then $D_e(\lambda_i, \lambda_j, \mathbf{p}) = O(\log n)$. In this regime, $\lim_{n \to \infty} D_e(\lambda_i, \lambda_j, \mathbf{p}) / \log n$ is a constant and exact recovery is possible if and only if
$$\min_{i,j,\, i \neq j} \lim_{n \to \infty} \frac{D_e(\lambda_i, \lambda_j, \mathbf{p})}{\log n} > 1.$$

IV. INCOMPLETE BUT FULLY CONNECTED WEIGHTED GRAPHS
Let $X \sim \mathrm{Bern}(\theta)$ and $Y \sim \mathcal{N}(\mu, \sigma^2)$. Then $Z \triangleq XY$ is a random variable with probability density function
$$f_Z(z) = \theta f_Y(z) + (1 - \theta) \delta(z),$$
where $f_Y(y)$ is the probability density function of $Y$ and $\delta(z)$ is the Dirac delta function. Then the probability density function of $\sum_{i=1}^{n} Z_i$ is
$$\sum_{i=0}^{n} \binom{n}{i} \theta^i (1-\theta)^{n-i} \{f_Y(z)\}^{\circledast i} \circledast \delta(z), \qquad (8)$$
where $\circledast$ denotes the convolution operator. In (8), for each $i$, $\{f_Y(z)\}^{\circledast i}$ is a Gaussian probability density function with mean $i\mu$ and variance $i\sigma^2$. If $\theta$ is on the order of $\frac{\log n}{n}$ and $\mu/\sigma$ is not too large, then the probability density function (8) is well-enough approximated by a Gaussian distribution with mean $n\mu\theta$ and variance $n\theta[\sigma^2 + (1-\theta)\mu^2]$. Figure 1 compares the probability density function (8) and its Gaussian approximation under the conditions mentioned above.

Using this approximation and following Theorem 1, when $Q_{ij} = \mathcal{N}(\bar{\mu}_{ij}, \bar{\sigma}_{ij}^2)$, exact recovery of node labels is possible if and only if
$$\min_{i,j,\, i \neq j} \lim_{n \to \infty} \frac{D_g(\mu_i, \mu_j, \Sigma_i, \Sigma_j)}{\log n} > 1,$$
where $\mu_{ij} = p_i \bar{\mu}_{ij} \theta_{ij}$ and $\Sigma_{ij} = p_i \theta_{ij} [\bar{\sigma}_{ij}^2 + (1 - \theta_{ij}) \bar{\mu}_{ij}^2]$.

Fig. 1: True distribution (8) and its Gaussian approximation for $\theta = \frac{\log n}{n}$, $n = 10000$, and different values of $\mu/\sigma$: (a) $\mu = 0$, $\sigma = 1$; (b) $\mu = 4$, $\sigma = 1$; (c) $\mu = 6$, $\sigma = 1$.

REFERENCES

[1] E. Abbe and C. Sandon, "Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery," in IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS), 2015, pp. 670–688.
[2] P. W. Holland, K. B. Laskey, and S. Leinhardt, "Stochastic blockmodels: First steps,"
Social Networks, vol. 5, no. 2, pp. 109–137, 1983.
[3] B. Hajek, Y. Wu, and J. Xu, "Exact recovery threshold in the binary censored block model," IEEE, 2015, pp. 99–103.
[4] A. Saade, M. Lelarge, F. Krzakala, and L. Zdeborová, "Spectral detection in the censored block model," IEEE, 2015, pp. 1184–1188.
[5] M. Esmaeili, H. Saad, and A. Nosratinia, "Community detection with side information via semidefinite programming," IEEE, 2019, pp. 420–424.
[6] P. Fronczak, A. Fronczak, and M. Bujok, "Exponential random graph models for networks with community structure," Physical Review E, vol. 88, no. 3, p. 032810, 2013.
[7] M. Esmaeili and A. Nosratinia, "Community detection with secondary latent variables," 2020, pp. 1355–1360.
[8] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, "Inference and phase transitions in the detection of modules in sparse networks," Physical Review Letters, vol. 107, no. 6, p. 065701, 2011.
[9] E. Mossel, J. Neeman, and A. Sly, "Reconstruction and estimation in the planted partition model," Probability Theory and Related Fields, vol. 162, no. 3–4, pp. 431–461, 2015.
[10] L. Massoulié, "Community detection thresholds and the weak Ramanujan property," in Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, 2014, pp. 694–703.
[11] E. Mossel, J. Neeman, and A. Sly, "A proof of the block model threshold conjecture," Combinatorica, vol. 38, no. 3, pp. 665–708, 2018.
[12] H. Saad, A. Abotabl, and A. Nosratinia, "EXIT analysis for belief propagation in degree-correlated stochastic block models," IEEE, 2016, pp. 775–779.
[13] E. Mossel and J. Xu, "Density evolution in the degree-correlated stochastic block model," in Conference on Learning Theory, 2016, pp. 1319–1356.
[14] S.-Y. Yun and A. Proutiere, "Community detection via random and adaptive sampling," in Conference on Learning Theory, 2014, pp. 138–175.
[15] E. Abbe, "Community detection and stochastic block models: Recent developments," The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6446–6531, 2017.
[16] E. Abbe, A. S. Bandeira, and G. Hall, "Exact recovery in the stochastic block model," IEEE Transactions on Information Theory, vol. 62, no. 1, pp. 471–487, 2015.
[17] E. Mossel, J. Neeman, and A. Sly, "Consistency thresholds for the planted bisection model," in Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, 2015, pp. 69–75.
[18] Y. Chen and J. Xu, "Statistical-computational tradeoffs in planted problems and submatrix localization with a growing number of clusters and submatrices," The Journal of Machine Learning Research, vol. 17, no. 1, pp. 882–938, 2016.
[19] E. Mossel, J. Neeman, and A. Sly, "Belief propagation, robust reconstruction and optimal recovery of block models," in Conference on Learning Theory, 2014, pp. 356–370.
[20] M. Esmaeili, H. Saad, and A. Nosratinia, "Exact recovery by semidefinite programming in the binary stochastic block model with partially revealed side information," in ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 3477–3481.
[21] A. A. Amini, E. Levina et al., "On semidefinite relaxations for the block model," The Annals of Statistics, vol. 46, no. 1, pp. 149–179, 2018.
[22] B. Hajek, Y. Wu, and J. Xu, "Achieving exact cluster recovery threshold via semidefinite programming," IEEE Transactions on Information Theory, vol. 62, no. 5, pp. 2788–2797, 2016.
[23] C. Nicolini, C. Bordier, and A. Bifone, "Community detection in weighted brain connectivity networks beyond the resolution limit," NeuroImage, vol. 146, pp. 28–39, 2017.
[24] Z. Lu, X. Sun, Y. Wen, G. Cao, and T. La Porta, "Algorithms and applications for community detection in weighted networks," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 11, pp. 2916–2926, 2014.
[25] S. Papadopoulos, Y. Kompatsiaris, A. Vakali, and P. Spyridonos, "Community detection in social media," Data Mining and Knowledge Discovery, vol. 24, no. 3, pp. 515–554, 2012.
[26] L. Cantini, E. Medico, S. Fortunato, and M. Caselle, "Detection of gene communities in multi-networks reveals cancer drivers," Scientific Reports, vol. 5, p. 17386, 2015.
[27] B. Hajek, Y. Wu, and J. Xu, "Information limits for recovering a hidden community," IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 4729–4745, 2017.
[28] C. Butucea, Y. I. Ingster, and I. A. Suslina, "Sharp variable selection of a sparse submatrix in a high-dimensional noisy matrix," ESAIM: PS, vol. 19, pp. 115–134, 2015. [Online]. Available: https://doi.org/10.1051/ps/2014017
[29] M. Kolar, S. Balakrishnan, A. Rinaldo, and A. Singh, "Minimax localization of structural information in large noisy matrices," in Advances in Neural Information Processing Systems, 2011, pp. 909–917.
[30] Y. Wu and J. Xu, "Statistical problems with planted structures: Information-theoretical and computational limits," arXiv preprint arXiv:1806.00118, 2018.
[31] Q. Berthet, P. Rigollet et al., "Optimal detection of sparse principal components in high dimension,"