A Practical Two-Sample Test for Weighted Random Graphs

Abstract

Network (graph) data analysis is a popular research topic in statistics and machine learning. In application, one is frequently confronted with graph two-sample hypothesis testing where the goal is to test the difference between two graph populations. Several statistical tests have been devised for this purpose in the context of binary graphs. However, many of the practical networks are weighted and existing procedures can't be directly applied to weighted graphs. In this paper, we study the weighted graph two-sample hypothesis testing problem and propose a practical test statistic. We prove that the proposed test statistic converges in distribution to the standard normal distribution under the null hypothesis and analyze its power theoretically. The simulation study shows that the proposed test has satisfactory performance and it substantially outperforms the existing counterpart in the binary graph case. A real data application is provided to illustrate the method.

arXiv preprint [stat.ME]

Mingao Yuan^{a,*}, Qian Wen^{a}

^{a} Department of Statistics, North Dakota State University, Fargo, ND, USA, 58102.


Keywords: two-sample hypothesis test, random graph, weighted graph

1. Introduction

A graph or network $G = (V, E)$ is a mathematical model that consists of a set $V$ of nodes (vertices) and a set $E$ of edges. In recent decades, it has been widely used to represent a variety of systems in various fields [20, 10, 12, 19, 9]. For instance, in social networks, a node denotes an individual and an edge represents the interaction between two individuals [12]; in brain graphs, a node may be a neural unit and the functional link between two units forms an edge [14]; in co-authorship networks, the authors of a collection of articles are the nodes and an edge is defined to be the co-authorship of two authors [20]. Due to these widespread applications, network data analysis has drawn a lot of attention in both the statistics and machine learning communities [1, 2, 5, 7, 13, 18, 24].

Most of the existing literature focuses on mining a single network, for example community detection [1, 2, 7, 5] and global testing of community structures [7, 18, 13, 24]. In practice, a number of graphs from multiple populations may be available. For example, in the 1000 Functional Connectomes Project, 1093 fMRI (weighted) networks were collected from subjects located in 24 communities [8]; to study the relation between Alzheimer's disease and a functional disconnection of distant brain areas, dozens of functional connectivity (weighted) networks from patients and control subjects were constructed [21]. In this case, a natural and fundamental question is to test the difference between two graph populations, known as graph two-sample hypothesis testing.

A few works deal with the graph two-sample hypothesis testing problem [8, 22, 15, 16]. Specifically, [8] first investigated this problem and proposed a $\chi^2$-type test. In [22], the authors developed a kernel-based test statistic for random dot product graph models. Under a more general setting, [15] studied the graph two-sample test from a minimax testing perspective and proposed testing procedures based on graph distances such as the Frobenius norm or the operator norm. The thresholds of the test statistics in [15] could be calculated by concentration inequalities, which usually makes the tests very conservative [16]. To overcome this issue, [16] derived the asymptotic distribution of the test statistic and proposed practical test methods that outperform existing methods.

In practice, most graphs are weighted [8, 21, 23, 4, 3]. The testing procedures in [16, 15, 22, 11] are designed for binary (unweighted) graphs and can't be directly applied to weighted graphs (see Section 3 for an example). Consequently, before using these tests, one has to artificially convert weighted graphs into binary graphs, which can result in a loss of information [23, 4, 3]. Motivated by the $T_{fro}$ test in [16], we propose a powerful test statistic for weighted graph two-sample hypothesis testing. Under the null hypothesis, the proposed test statistic converges in distribution to the standard normal distribution, and the power of the test is theoretically characterized. The simulation study shows that the test can achieve high power and that it substantially outperforms its counterpart in the binary graph case. Besides, we apply the proposed test to a real data set.

The rest of the paper is organized as follows. In Section 2, we formally state the weighted graph two-sample hypothesis testing problem and present the theoretical results. In Section 3, we present the simulation study results and the real data application. The proof of the main results is deferred to Section 4.

* Corresponding author.
Preprint submitted to Journal Name, February 1, 2021.

2. Weighted Graph Two-Sample Hypothesis Test

For convenience, let $X \sim F$ denote that the random variable $X$ follows distribution $F$, and let $\mathrm{Bern}(r)$ denote the Bernoulli distribution with success probability $r$.

Let $V = \{1, 2, \dots, n\}$ be a vertex (node) set and $G = (V, E)$ denote an undirected graph on $V$ with edge set $E$. The adjacency matrix of graph $G$ is a symmetric matrix $A \in \{0, 1\}^{n \times n}$ such that $A_{ij} = 1$ if $(i, j) \in E$ and $0$ otherwise. The graph $G$ is binary or unweighted, since $A_{ij}$ only records the existence of an edge. If $A_{ij} \sim \mathrm{Bern}(p_{ij})$, $0 \le p_{ij} \le 1$, then the graph $G$ is called an inhomogeneous random graph (inhomogeneous Erdős-Rényi graph). Let $\mu = (\mu_{ij})_{1 \le i < j \le n}$ […]

In the binary graph case ($Q_{ij}(\mu_{ij}) = \mathrm{Bern}(\mu_{ij})$, $1 \le i < j \le n$), several testing procedures for (1) are available in the literature. For $m \to \infty$ and small $n$, a $\chi^2$-type test was proposed in [8]. For $m = 1$ and $n \to \infty$, under the random dot product model, a nonparametric test statistic was developed in [22], and a test based on eigenvalues of the adjacency matrix under the inhomogeneous random graph model can be found in [16]. A more practical case is small $m$ ($m \ge 2$) and $n \to \infty$. In this case, a test called $T_{fro}$ was proposed in [15] and its asymptotic behavior was studied in [16]. Recently, [11] proposed a test statistic based on the largest eigenvalue of a Wigner matrix.

In this work, we study (1) for a broad class of distributions $Q$ and focus on the regime $m \ge 2$, $n \to \infty$. The sample size $m$ could be either fixed or tend to infinity along with $n$.

To define the test statistic, the two samples $G_k, H_k$ ($1 \le k \le m$) are randomly partitioned into two parts, denoted as $G_k, H_k$ ($1 \le k \le m/2$) and $G_k, H_k$ ($m/2 < k \le m$), with a slight abuse of notation. Let $s_n^2 = \sum_{1 \le i < j \le n}$ […] $\sum_{k > m/2} (A_{G_k,ij} - A_{H_k,ij})^2$. We propose the following test statistic for (1): $T_n = \sum_{1 \le i < j \le n}$ […]
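The construction above can be sketched numerically. Because the displayed formulas are incomplete in this copy, the sketch below rests on an assumed reading of them: $T_{ij}$ is the product of the two half-sample sums of $A_{G_k,ij} - A_{H_k,ij}$, $s_n^2$ sums the products of the two half-sample sums of squared differences, $T_n = \sum_{i<j} T_{ij} / s_n$, and $T_{fro}$ uses the same numerator with $t_n^2$ built from $A_{G_k,ij} + A_{H_k,ij}$. The function names, the one-sided rejection rule and the value $\epsilon = 0.5$ in the demo are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from statistics import NormalDist


def split_statistics(A_G, A_H, seed=None):
    """Return (T_n, T_fro) for two samples of symmetric weight matrices.

    A_G, A_H: arrays of shape (m, n, n). The forms of T_n, s_n and t_n
    used here are an assumed reading of the definitions in the text.
    """
    A_G = np.asarray(A_G, dtype=float)
    A_H = np.asarray(A_H, dtype=float)
    if A_G.shape != A_H.shape or A_G.ndim != 3 or A_G.shape[0] < 2:
        raise ValueError("need two samples of equal shape (m, n, n) with m >= 2")
    m, n, _ = A_G.shape

    # Random partition of the sample indices into two parts.
    perm = np.random.default_rng(seed).permutation(m)
    first, second = perm[: m // 2], perm[m // 2:]

    rows, cols = np.triu_indices(n, k=1)   # node pairs with i < j
    diff = (A_G - A_H)[:, rows, cols]      # shape (m, n(n-1)/2)
    total = (A_G + A_H)[:, rows, cols]

    # Numerator: sum over pairs of the product of the two half-sample sums.
    numer = np.sum(diff[first].sum(axis=0) * diff[second].sum(axis=0))
    # s_n^2 (assumed form): products of half-sample sums of squared differences.
    s2 = np.sum((diff[first] ** 2).sum(axis=0) * (diff[second] ** 2).sum(axis=0))
    # t_n^2 (assumed form of the T_fro normalisation): sums of A_G + A_H.
    t2 = np.sum(total[first].sum(axis=0) * total[second].sum(axis=0))

    T_n = numer / np.sqrt(s2) if s2 > 0 else float("nan")
    T_fro = numer / np.sqrt(t2) if t2 > 0 else float("nan")
    return float(T_n), float(T_fro)


def sample_beta_block_graphs(m, n, within, across, seed=None):
    """Draw m symmetric weighted graphs with two equal-sized node blocks.

    Weights are Beta(*within) inside a block and Beta(*across) between blocks.
    """
    rng = np.random.default_rng(seed)
    block = np.arange(n) < n // 2
    same = block[:, None] == block[None, :]
    a = np.where(same, within[0], across[0])
    b = np.where(same, within[1], across[1])
    W = np.triu(rng.beta(a, b, size=(m, n, n)), k=1)  # keep i < j only
    return W + np.transpose(W, (0, 2, 1))             # symmetrise, zero diagonal


if __name__ == "__main__":
    alpha = 0.05
    z = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value (assumed rule)
    eps = 0.5                            # hypothetical shift, for illustration only
    G = sample_beta_block_graphs(4, 50, (2, 3), (1, 3), seed=0)
    H = sample_beta_block_graphs(4, 50, (2 + eps, 3 + eps), (1 + eps, 3 + eps), seed=1)
    T_n, T_fro = split_statistics(G, H, seed=2)
    print(f"T_n = {T_n:.3f}, T_fro = {T_fro:.3f}, reject H0 with T_n: {T_n > z}")
```

When a denominator is zero (which can happen for very sparse binary graphs) the sketch returns NaN rather than raising, which mirrors the NA entries discussed in Section 3.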

Suppose n = o (cid:0) X ≤ i Z (1 − α ) where Z (1 − α ) is the 100(1 − α )% quantile of the standard normal distribution.4ondition (3) could be simpliﬁed in the binary case. Suppose Q ij ( µ ij ) = Bern ( µ ij ) and µ ,ij = µ ,ij = µ ij ≤ − δ for some δ ∈ (0 ,

1) under H . Then η ij ≤ σ ij ≤ µ ij . In this case, condition (3) reduces to n = o ( k µ k F ). Here k µ k F denotes the Frobenius norm of matrix µ . To see this, let C be a genericconstant, then0 . δ k µ k F = δ X ≤ i

Suppose n = o (cid:0) m P ≤ i

0) for $t = 1, 2$. Then

$$V_{ij} = \mu_{1,ij}(1 - \mu_{1,ij}) + \mu_{2,ij}(1 - \mu_{2,ij}) + (\mu_{1,ij} - \mu_{2,ij})^2 = (\mu_{1,ij} + \mu_{2,ij})(1 + o(1)).$$
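As a check on the expansion of $V_{ij}$ above, the short derivation below spells out the intermediate steps. It assumes (our reading of the truncated condition) that $A_{G_k,ij}$ and $A_{H_k,ij}$ are independent Bernoulli variables with means $\mu_{1,ij}$ and $\mu_{2,ij}$, both of order $o(1)$.

```latex
\begin{align*}
V_{ij} &= E\big(A_{G_k,ij} - A_{H_k,ij}\big)^2
        = \mathrm{Var}(A_{G_k,ij}) + \mathrm{Var}(A_{H_k,ij})
          + \big(\mu_{1,ij} - \mu_{2,ij}\big)^2 \\
       &= \mu_{1,ij}(1 - \mu_{1,ij}) + \mu_{2,ij}(1 - \mu_{2,ij})
          + (\mu_{1,ij} - \mu_{2,ij})^2 \\
       &= \mu_{1,ij} + \mu_{2,ij} - 2\mu_{1,ij}\mu_{2,ij}, \\
0 \le 2\mu_{1,ij}\mu_{2,ij}
       &\le (\mu_{1,ij} + \mu_{2,ij}) \max(\mu_{1,ij}, \mu_{2,ij})
        = o\big(\mu_{1,ij} + \mu_{2,ij}\big).
\end{align*}
```

The last line shows that the cross term is negligible relative to $\mu_{1,ij} + \mu_{2,ij}$, which gives the factor $(1 + o(1))$.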

5n this case, n = o ( m P ≤ i

The quantity λ n in Theorem 2.2 completely characterizes thepower of our test. For binary graphs, the sparsity may increase or decreasethe power, dependent on the model settup. To see this, we consider two sce-narios below.(a) Suppose µ ,ij = τ a n for a constant τ > and µ ,ij = a n with a n = o (1) , ≤ i < j ≤ n . By (8), it follows that λ n = mna n ( τ − τ + 1) [1 + o (1)] . For ﬁxed sample size m and the number of nodes n , the power of our teststatistic declines as the networks get sparser (smaller a n ).(b) Suppose µ ,ij = a n + b n and µ ,ij = a n − b n with a n = o (1) and b n = o (1) , ≤ i < j ≤ n . Then by equation (8), one has λ n = mn b n a n [1 + o (1)] . The ratio b n a n controls the power, if the sample size m and the number ofnodes n are held constant. Model 1: a n = n . n , b n = √ a n , then b n a n = 1 and λ n = mn [1 + o (1)] . Model 2: a n = n . n , b n = q a n log n , then b n a n = n and λ n = mn n [1 + o (1)] . Clearly, Model 1 is sparser than Model 2 but our testachieves higher power under Model 1 than Model 2 based on Theorem 2.2. Remark 2.

Recall that the T fro test in [16] is deﬁned as T fro = P ≤ i m ( A G k ,ij + A H k ,ij ) . The diﬀerence between T n and T fro lies in the diﬀerence between s n and t n .Note that s n in T n is proved to be a consistent estimator of the variance of P ≤ i

3. Simulation and Real Data

In this section, we evaluate the finite sample performance of the proposed test $T_n$ and compare it with the test $T_{fro}$ in [16] by simulation. Besides, we apply our test method to a real data set.

3.1. Simulation

Throughout this simulation, we set the nominal type one error $\alpha$ to be 0.05. The empirical size and power are obtained by repeating the experiment 1000 times. We take $n = 10, 30, 50, 100, 200, 300$ and $m = 2, 4, 14$.

In the first simulation, we generate $G_1, \dots, G_m \sim \mathcal{G}_1 = (V, Q, \mu_1)$ with $Q_{ij}(\mu_{1,ij}) = \mathrm{Beta}(a, b)$ for $1 \le i < j \le n/2$ or $n/2 < i < j \le n$, and $Q_{ij}(\mu_{1,ij}) = \mathrm{Beta}(c, d)$ for $1 \le i \le n/2 < j \le n$. Denote the graph model as $\mathcal{G}(\mathrm{Beta}(a, b), \mathrm{Beta}(c, d))$. For a fixed constant $\epsilon$ ($\epsilon \ge 0$), we generate $H_1, \dots, H_m \sim \mathcal{G}_2 = (V, Q, \mu_2)$, with $Q_{ij}(\mu_{2,ij}) = \mathrm{Beta}(a + \epsilon, b + \epsilon)$ for $1 \le i < j \le n/2$ or $n/2 < i < j \le n$, and $Q_{ij}(\mu_{2,ij}) = \mathrm{Beta}(c + \epsilon, d + \epsilon)$ for $1 \le i \le n/2 < j \le n$. Denote the graph model as $\mathcal{G}(\mathrm{Beta}(a + \epsilon, b + \epsilon), \mathrm{Beta}(c + \epsilon, d + \epsilon))$.

Note that the constant $\epsilon$ ($\epsilon \ge 0$) characterizes the difference between $\mu_{1,ij}$ and $\mu_{2,ij}$ with fixed $a, b, c, d$, since the mean of $\mathrm{Beta}(a + \epsilon, b + \epsilon)$ is

$$\mu(\epsilon) = \frac{a + \epsilon}{a + b + 2\epsilon}.$$

For fixed $a \ne b$, $\mu(\epsilon)$ is a monotone function of $\epsilon$ ($\epsilon \ge 0$): it increases when $a < b$ and decreases when $a > b$. A larger $\epsilon$ therefore implies a larger difference in the means, and consequently the power of the test $T_n$ is supposed to increase.

We take $a = 2, b = 3, c = 1, d = 3$ and $a = 9, b = 3, c = 3, d = 2$ to yield right-skewed and left-skewed beta distributions respectively. The simulation results are summarized in Table 1 and Table 2, where the sizes (powers) are reported in the column(s) with $\epsilon = 0$ ($\epsilon > 0$). The sizes and powers of $T_{fro}$ are all zero, which indicates that this test (designed for binary graphs) does not apply to weighted graphs (see Remark 2 for an explanation). On the contrary, all the sizes of the proposed test $T_n$ are close to 0.05, which implies that the null distribution is valid even for small networks (small $n$) and small sample sizes (small $m$). Besides, the power can approach one, which shows the consistency of the proposed test $T_n$. The parameters $\epsilon$, $n$ and $m$ have a significant influence on the power: as any one of them increases with the rest held constant, the power of $T_n$ gets higher.

In the second simulation, we generate binary graphs to compare the performance of $T_n$ and $T_{fro}$. Specifically, we generate $G_1, \dots, G_m \sim \mathcal{G}_1 = (V, Q, \mu_1)$ with $Q_{ij}(\mu_{1,ij}) = \mathrm{Bern}(a)$ for $1 \le i < j \le n/2$ or $n/2 < i < j \le n$, and $Q_{ij}(\mu_{1,ij}) = \mathrm{Bern}(b)$ for $1 \le i \le n/2 < j \le n$. Denote the graph model as $\mathcal{G}(\mathrm{Bern}(a), \mathrm{Bern}(b))$. For a constant $\epsilon$ ($\epsilon \ge 0$), $H_1, \dots, H_m$ are generated from $\mathcal{G}_2 = (V, Q, \mu_2)$, with $Q_{ij}(\mu_{2,ij}) = \mathrm{Bern}(a + \epsilon)$ for $1 \le i < j \le n/2$ or $n/2 < i < j \le n$, and $Q_{ij}(\mu_{2,ij}) = \mathrm{Bern}(b + \epsilon)$ for $1 \le i \le n/2 < j \le n$. Denote the graph model as $\mathcal{G}(\mathrm{Bern}(a + \epsilon), \mathrm{Bern}(b + \epsilon))$.

We take $a = 0.05, b = 0.01$ (Table 3); $a = 0.…, b = 0.05$ (Table 4); and $a = 0.…, b = 0.…$ (Table 5). The results are summarized in Tables 3, 4 and 5, where the sizes (powers) are reported in the column(s) with $\epsilon = 0$ ($\epsilon > 0$). For $a = 0.05, b = 0.01$ and $a = 0.…, b = 0.05$, the networks are so sparse that the denominators of $T_n$ and $T_{fro}$ may be zero for smaller $n$. Consequently, $T_n$ and $T_{fro}$ may not be available, and we denote them as NA in Table 3 and Table 4.

The sizes of $T_n$ fluctuate around 0.05 and the pattern of powers resembles that in Table 1 and Table 2. Since the networks are binary, the test $T_{fro}$ is applicable. For denser networks, the test seems to be quite conservative, since almost all the sizes are less than 0.04 in Table 4 and almost all the sizes are zero in Table 5 (see Remark 2 for an explanation). This undermines its power significantly. On the contrary, the proposed test $T_n$ has satisfactory power and outperforms $T_{fro}$ substantially. For sparser networks in Table 3, the sizes of $T_{fro}$ are closer to 0.05 and its powers are close to those of $T_n$. This simulation shows the advantage of the proposed test $T_n$ over $T_{fro}$ under the setting of binary graphs.

3.2. Real Data

In this section, we consider applying the proposed method to a real-life data set that can be downloaded from a public database (http://fcon_1000.projects.nitrc.org/indi/retro/cobre.html). This data set contains raw anatomical and functional scans from 146 subjects (72 patients with schizophrenia and 74 healthy controls). After a series of processing steps done by [6], only 124 subjects (70 patients with schizophrenia and 54 healthy controls) were kept. In their study, 263 brain regions of interest were chosen as nodes, and connectivity between nodes was measured by edge weights that represent the Fisher-transformed correlation between the fMRI time series of the nodes after passing to ranks [17].

As the healthy group (54 networks) and the patient group (70 networks) have different sample sizes, our test statistic is not directly applicable. We adopt the following two methods to solve this issue. The first way is to randomly sample 16 networks from the 54 networks in the healthy group and unite them with the 54 networks of the healthy group to yield 70 samples.
Then the healthy group and the patient group have equal sample sizes and we can calculate the test statistics. This process is repeated 100 times and the five-number summary of the test statistics is presented in Table 6. The second method is to randomly sample 54 networks from the schizophrenia patient group and then calculate the test statistics based on the sampled 54 networks and the 54 networks in the healthy group. The random sampling procedure is repeated 100 times and the five-number summary of the test statistics is presented in Table 7. All the calculated test statistics $T_n$ and $T_{fro}$ are much larger than 1.96, which leads to the same conclusion that the patient population significantly differs from the healthy population at significance level $\alpha = 0.05$. Moreover, the proposed test statistic $T_n$ is almost twice $T_{fro}$, implying that our test is more powerful in detecting the population difference.

The computation of the proposed test statistic requires randomly splitting the two samples into two groups. In order to evaluate the effect of the random splitting on the proposed test, we randomly sample 54 networks from the patient group, denoted as $G_k$ ($1 \le k \le 54$), and let $H_k$ ($1 \le k \le 54$) be the 54 networks in the healthy group. Consider $G_k, H_k$ ($1 \le k \le 54$) as the two samples. We randomly partition the two samples into two groups, denoted as $\tilde{G}_k, \tilde{H}_k$ ($1 \le k \le 27$) and $\tilde{G}_k, \tilde{H}_k$ ($27 < k \le 54$), and then compute the test statistics $T_n$ and $T_{fro}$. This procedure is repeated 100 times and the five-number summary of the 100 calculated test statistics is recorded in Table 8. The same conclusion can be drawn based on the 100 statistics, implying that random splitting doesn't significantly affect the proposed method.

Additionally, to compare the performance of $T_n$ and $T_{fro}$ in the binary graph setting, we artificially transform the weighted graphs to binary graphs by thresholding as follows. For a given threshold $\tau$, if the absolute value of an edge weight is greater (smaller) than $\tau$, then the edge is transformed to 1 (0). Smaller (larger) $\tau$ yields denser (sparser) networks. We take the threshold values $\tau \in \{0.01, 0.03, 0.1, 0.3, 0.5, 0.7, 0.9\}$. For each $\tau$, we calculate the test statistics as in Table 6 and the results are summarized in Table 9. The threshold $\tau$ dramatically affects the conclusion. For $0.03 \le \tau \le 0.7$, both $T_n$ and $T_{fro}$ reject the null hypothesis $H_0$ that the two network populations are the same, with $T_n$ more powerful than $T_{fro}$ in most cases. However, for $\tau = 0.01$ (denser networks), $T_n$ rejects the null hypothesis $H_0$, while $T_{fro}$ fails to reject $H_0$. This analysis underlines the importance of developing testing procedures for weighted networks, as artificially transforming weighted networks to unweighted networks may lead to contradictory conclusions.

4. Proof of Main Results

Proof of Theorem 2.1:

We employ the Lindeberg central limit theorem to prove Theorem 2.1. For brevity, write $D_k = A_{G_k,ij} - A_{H_k,ij}$.

Firstly, note that under $H_0$, we have $\sigma_n^2 = E[s_n^2] = \sum_{1 \le i < j \le n}$ […] $\sum_{k > m/2} D_k^2$ […] $\sum_{k > m/2} E D_k^2$ […]

By the Cauchy-Schwarz and Markov inequalities,

$$E\big[T_{ij}^2 \, I[|T_{ij}| > \epsilon \sigma_n]\big] \le \sqrt{E T_{ij}^4 \; P[|T_{ij}| > \epsilon \sigma_n]} \le \sqrt{E T_{ij}^4 \, \frac{E T_{ij}^4}{\epsilon^4 \sigma_n^4}} = \frac{E T_{ij}^4}{\epsilon^2 \sigma_n^2}.$$

Notice that

$$E T_{ij}^4 = \sum_{k_1, k_2, k_3, k_4 \le m/2} E\big[D_{k_1} D_{k_2} D_{k_3} D_{k_4}\big] \times \sum_{k_1, k_2, k_3, k_4 > m/2} E\big[D_{k_1} D_{k_2} D_{k_3} D_{k_4}\big].$$

Since the $D_k$ are independent with mean zero under $H_0$, for distinct $k_1, k_2, k_3, k_4 \in \{1, 2, \dots, m\}$,

$$E\big[D_{k_1}^3 D_{k_2}\big] = 0, \qquad E\big[D_{k_1}^2 D_{k_2} D_{k_3}\big] = 0, \qquad E\big[D_{k_1} D_{k_2} D_{k_3} D_{k_4}\big] = 0.$$

11s a result, it follows X k ,k ,k ,k ≤ m E ( A G k ,ij − A H k ,ij )( A G k ,ij − A H k ,ij )( A G k ,ij − A H k ,ij )( A G k ,ij − A H k ,ij )= X k = k = k = k ≤ m E ( A G k ,ij − A H k ,ij ) ( A G k ,ij − A H k ,ij ) + X k = k = k = k ≤ m E ( A G k ,ij − A H k ,ij ) ( A G k ,ij − A H k ,ij ) + X k = k = k = k ≤ m E ( A G k ,ij − A H k ,ij ) ( A G k ,ij − A H k ,ij ) + X k = k = k = k ≤ m E ( A G k ,ij − A H k ,ij ) = X k = k = k = k ≤ m E ( A G k ,ij − A H k ,ij ) + 3 X k = k ≤ m E ( A G k ,ij − A H k ,ij ) ( A G k ,ij − A H k ,ij ) Then we have E T ij = h X k ≤ m E ( A G k ,ij − A H k ,ij ) + 3 X k = k ≤ m E ( A G k ,ij − A H k ,ij ) ( A G k ,ij − A H k ,ij ) i × h X k> m E ( A G k ,ij − A H k ,ij ) + 3 X k = k > m E ( A G k ,ij − A H k ,ij ) ( A G k ,ij − A H k ,ij ) i = (cid:16) m σ ij + 3 mη ij (cid:17) = m σ ij + 6 m σ ij η ij + 9 m η ij , where η ij = E ( A G k ,ij − A H k ,ij ) . Hence, by condition (3), it follows that1 σ n X ≤ i ǫσ n ] ≤ ǫ σ n X ≤ i

1) by proving that s n = (1 + o p (1)) σ n . Note that for i < j and k < l , E (cid:2) ( T ij − m σ ij )( T kl − m σ kl ) (cid:3) = 0 , if { i, j } 6 = { k, l } . E (cid:2) s n − σ n (cid:3) = E (cid:2) X ≤ i

1) by Slutsky’s theorem.

Proof of Theorem 2.2:

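Under $H_1$, we have

$$\Lambda_{ij} = E[T_{ij}] = E\Big[\sum_{k \le m/2} (A_{G_k,ij} - A_{H_k,ij})\Big] \, E\Big[\sum_{k > m/2} (A_{G_k,ij} - A_{H_k,ij})\Big] = \sum_{k \le m/2} E(A_{G_k,ij} - A_{H_k,ij}) \sum_{k > m/2} E(A_{G_k,ij} - A_{H_k,ij}) = \frac{m^2}{4} (\mu_{1,ij} - \mu_{2,ij})^2,$$

and

$$V_{ij} = E(A_{G_k,ij} - A_{H_k,ij})^2 = E(A_{G_k,ij} - \mu_{1,ij} + \mu_{1,ij} - \mu_{2,ij} + \mu_{2,ij} - A_{H_k,ij})^2 = E(A_{G_k,ij} - \mu_{1,ij})^2 + E(\mu_{1,ij} - \mu_{2,ij})^2 + E(\mu_{2,ij} - A_{H_k,ij})^2 = \sigma_{1,ij}^2 + \sigma_{2,ij}^2 + (\mu_{1,ij} - \mu_{2,ij})^2,$$

where the cross terms vanish because $A_{G_k,ij}$ and $A_{H_k,ij}$ are independent and centered at $\mu_{1,ij}$ and $\mu_{2,ij}$ respectively. Then $\sigma_{1,n}^2 = E s_n^2 = \frac{m^2}{4} \sum_{1 \le i < j \le n}$ […]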

13s a result, under H , the test statistic is decomposed as T n = P ≤ i

The authors are grateful to the Editor, the Associate Editor and the Referees for helpful comments that significantly improved this manuscript.

References

[1] Abbe, E. (2018). Community Detection and Stochastic Block Models: Recent Developments. Journal of Machine Learning Research, 18: 1-86.
[2] Agarwal, S., Branson, K. and Belongie, S. (2006). Higher order learning with graphs. Proceedings of the International Conference on Machine Learning, 17-24.
[3] Aicher, C. (2014). The Weighted Stochastic Block Model. Applied Mathematics Graduate Theses and Dissertations, 50.
[4] Aicher, C., Jacob, A. and Clauset, A. (2015). Learning Latent Block Structure in Weighted Networks. Journal of Complex Networks, 3, 221-248.
[5] Amini, A., Chen, A. and Bickel, P. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Annals of Statistics, 41(4), 2097-2122.
[6] Arroyo Relión, J. D., Kessler, D., Levina, E. and Taylor, S. F. (2019). Network classification with applications to brain connectomics. The Annals of Applied Statistics.
[7] […] Journal of Royal Statistical Society, Series B, 78, 253-273.
[8] Ginestet, C. E., Li, J., Balachandran, P., Rosenberg, S. and Kolaczyk, E. D. (2017). Hypothesis testing for network data in functional neuroimaging. The Annals of Applied Statistics, 11(2):725–750.
[9] Chen, J. and Yuan, B. (2006). Detecting functional modules in the yeast protein-protein interaction network. Bioinformatics, 22(18):2283–2290.
[10] Costa, L. F., Oliveira, O. N. Jr., Travieso, G., Rodrigues, F. A., Villas Boas, P. R., Antiqueira, L., Viana, M. P. and Correa Rocha, L. E. (2011). Analyzing and modeling real-world phenomena with complex networks: a survey of applications. Adv Phys […]
[11] […] https://arxiv.org/pdf/1911.03783.pdf
[12] Fortunato, S. (2010). Community detection in graphs. Phys Rep […]
[13] […] https://arxiv.org/pdf/1710.00862.pdf
[14] Garcia, J., Ashourvan, A., Muldoon, S., Vettel, J. and Bassett, D. (2018). Applications of community detection techniques to brain graphs: Algorithmic considerations and implications for neural function. Proc IEEE Inst Electr Electron Eng, 106:846-867.
[15] Ghoshdastidar, et al. (2019). Two-sample hypothesis testing for inhomogeneous random graphs. arXiv:1707.00833.
[16] Ghoshdastidar, D. and Luxburg, V. U. (2018). Practical Methods for Graph Two-Sample Testing. NIPS 2018, 3019–3028, Montréal, Canada.
[17] Arroyo Relión, J. D. (2019). graphclass: Network classification. R package version 1.1.
[18] Lei, J. (2016). A goodness-of-fit test for stochastic block models. The Annals of Statistics, 44(1):401–424.
[19] Ma'ayan, A. (2011). Introduction to Network Analysis in Systems Biology. Sci Signal, 4(190): tr5.
[20] Newman, M. E. J. (2004). Coauthorship networks and patterns of scientific collaboration. PNAS, 101: 5200-5205.
[21] Stam, C. J., Jones, B. F., Nolte, G., Breakspear, M. and Scheltens, P. (2007). Small-world networks and functional connectivity in Alzheimer's disease. Cerebral Cortex, 17(1):92–99.
[22] Tang, M., Athreya, A., Sussman, D. L., Lyzinski, V. and Priebe, C. E. (2017). A nonparametric two-sample hypothesis testing problem for random graphs. Bernoulli, 23(3):1599–1630.
[23] Thomas, A. C. and Blitzstein, J. K. (2011). Valued ties tell fewer lies: Why not to dichotomize network edges with thresholds. arXiv:1101.0788.
[24] Yuan, M. and Nan, Y. (2020). Test dense subgraphs in sparse uniform hypergraph. Communications in Statistics - Theory and Methods, DOI:10.1080/03610926.2020.1723637.

Table 1: Simulated size and power with graphs generated from G(Beta(2, 3), Beta(1, 3)) and G(Beta(2 + ε, 3 + ε), Beta(1 + ε, 3 + ε)).

m    n    Method   ε = 0 (size)   ε = 0.… (power)   ε = 0.… (power)   ε = 0.… (power)
2    10   T_fro    0.000          0.000             0.000             0.000
2    30   T_fro    0.000          0.000             0.000             0.000
2    50   T_fro    0.000          0.000             0.000             0.000
4    10   T_fro    0.000          0.000             0.000             0.000
4    30   T_fro    0.000          0.000             0.000             0.000
4    50   T_fro    0.000          0.000             0.000             0.000
14   10   T_fro    0.000          0.000             0.000             0.000
14   30   T_fro    0.000          0.000             0.000             0.000
14   50   T_fro    0.000          0.000             0.000             0.000
[…]

Table 2: Simulated size and power with graphs generated from G(Beta(9, 3), Beta(3, 2)) and G(Beta(9 + ε, 3 + ε), Beta(3 + ε, 2 + ε)).

m    n    Method   ε = 0 (size)   ε = 0.… (power)   ε = 0.… (power)   ε = 0.… (power)
2    10   T_fro    0.000          0.000             0.000             0.000
2    30   T_fro    0.000          0.000             0.000             0.000
2    50   T_fro    0.000          0.000             0.000             0.000
4    10   T_fro    0.000          0.000             0.000             0.000
4    30   T_fro    0.000          0.000             0.000             0.000
4    50   T_fro    0.000          0.000             0.000             0.000
14   10   T_fro    0.000          0.000             0.000             0.000
14   30   T_fro    0.000          0.000             0.000             0.000
14   50   T_fro    0.000          0.000             0.000             0.000
[…]

Table 3: Simulated size and power with graphs generated from G(Bern(0.05), Bern(0.01)) and G(Bern(0.05 + ε), Bern(0.01 + ε)).

m    n    Method   ε = 0 (size)   ε = 0.… (power)   ε = 0.… (power)   ε = 0.… (power)
2    10   T_fro    NA             NA                NA                NA
2    30   T_fro    NA             NA                NA                NA
2    50   T_fro    NA             0.038             0.094             0.236
4    10   T_fro    NA             NA                NA                NA
4    30   T_fro    0.032          0.058             0.132             0.342
4    50   T_fro    0.043          0.078             0.314             0.750
14   10   T_fro    NA             0.069             0.189             0.392
14   30   T_fro    0.038          0.263             0.872             1.000
14   50   T_fro    0.049          0.613             1.000             1.000
[…]

Table 4: Simulated size and power with graphs generated from G(Bern(0.…), Bern(0.05)) and G(Bern(0.… + ε), Bern(0.05 + ε)).

m    n    Method   ε = 0 (size)   ε = 0.… (power)   ε = 0.… (power)   ε = 0.… (power)
2    10   T_fro    NA             NA                NA                NA
2    30   T_fro    0.025          0.031             0.036             0.054
2    50   T_fro    0.031          0.033             0.039             0.101
2    100  T_fro    0.030          0.038             0.123             0.311
4    10   T_fro    NA             NA                NA                NA
4    30   T_fro    0.034          0.040             0.056             0.148
4    50   T_fro    0.030          0.037             0.120             0.328
4    100  T_fro    0.040          0.074             0.402             0.895
14   10   T_fro    0.039          0.044             0.075             0.156
14   30   T_fro    0.031          0.091             0.432             0.886
14   50   T_fro    0.035          0.199             0.874             1.000
[…]

Table 5: Simulated size and power with graphs generated from G(Bern(0.…), Bern(0.…)) and G(Bern(0.… + ε), Bern(0.… + ε)).

m    n    Method   ε = 0 (size)   ε = 0.… (power)   ε = 0.… (power)   ε = 0.… (power)
2    10   T_fro    0.000          0.000             0.000             0.000
2    30   T_fro    0.000          0.000             0.000             0.002
2    50   T_fro    0.000          0.000             0.002             0.001
4    10   T_fro    0.001          0.001             0.001             0.002
4    30   T_fro    0.001          0.001             0.002             0.002
4    50   T_fro    0.001          0.002             0.003             0.030
14   10   T_fro    0.000          0.001             0.002             0.014
14   30   T_fro    0.000          0.013             0.174             0.559
14   50   T_fro    0.000          0.083             0.812             0.997
[…]

Table 6: Repeat sampling 16 networks from 54 networks in the healthy group.

Method   Min.   1st Qu.   Median   3rd Qu.   Max.
T_n      […]
T_fro    […]

Table 7: Repeat sampling 54 networks from 70 networks in the patient group.

Method   Min.   1st Qu.   Median   3rd Qu.   Max.
T_n      […]
T_fro    […]

Table 8: Random splitting of two samples G_k, H_k, 1 ≤ k ≤ 54.

Method   Min.   1st Qu.   Median   3rd Qu.   Max.
T_n      […]
T_fro    […]

Table 9: Transforming weighted graphs to unweighted graphs with different threshold τ.

Method              Min.    1st Qu.   Median   3rd Qu.   Max.
T_n (τ = 0.01)      9.70    16.40     19.61    22.10     31.88
T_fro (τ = 0.01)    0.42    0.70      0.83     0.93      1.31
T_n (τ = 0.03)      11.91   17.32     21.20    24.90     33.23
T_fro (τ = 0.03)    1.54    2.17      2.62     3.05      3.95
T_n (τ = 0.1)       11.86   19.66     23.01    25.78     38.03
T_fro (τ = 0.1)     4.88    7.81      9.01     9.93      14.10
T_n (τ = 0.3)       14.06   26.37     29.64    32.42     41.33
T_fro (τ = 0.3)     11.81   20.90     23.20    25.34     32.16
T_n (τ = 0.5)       11.86   16.57     18.19    19.21     24.43
T_fro (τ = 0.5)     10.14   13.81     15.10    16.14     20.14
T_n (τ = 0.7)       5.60    6.93      7.32     7.93      9.37
T_fro (τ = 0.7)     6.55    8.02      8.53     9.08      10.60
T_n (τ = 0.9)       0.47    1.79      2.19     2.52      3.36
T_fro (τ = 0.9)     […]