Coactivated Clique Based Multisource Overlapping Brain Subnetwork Extraction
Chendi Wang, Rafeef Abugharbieh
Biomedical Signal and Image Computing Lab, UBC, [email protected], [email protected]
Abstract
Subnetwork extraction using community detection methods is commonly used to study the brain's modular structure. Recent studies indicate that certain brain regions interact with multiple subnetworks. However, most existing methods are designed for non-overlapping subnetwork extraction. In this paper, we present an approach for overlapping brain subnetwork extraction using cliques, which we define as co-activated node groups performing multiple tasks. We propose a multisource subnetwork extraction approach based on the co-activated clique, which (1) uses task co-activation and task connectivity strength information for clique identification, (2) automatically detects cliques of different sizes, which has more neuroscientific justification, and (3) shares the subnetwork membership, derived from a fusion of rest and task data, among the nodes within a clique for overlapping subnetwork extraction. On real data, compared to commonly used overlapping community detection techniques, we show that our approach improves subnetwork extraction in terms of group-level and subject-wise reproducibility. We also show that our multisource approach identifies subnetwork overlaps within brain regions that match well with hubs defined using functional and anatomical information, which enables us to study the interactions between subnetworks and how hubs play their role in information flow across different subnetworks. We further demonstrate that the assignments of interacting/individual nodes using our approach correspond with the posterior probability derived independently from our multimodal random walker based approach.

Keywords: Clique, Overlapping Brain Subnetwork Extraction, Multisource Fusion, Functional Connectivity, Hypergraph

arXiv [q-bio.NC]

Introduction
Mainstream brain subnetwork extraction methods, and the standard definition of modularity, assume nonoverlapping subnetworks. However, studies have shown evidence of overlapping brain subnetworks, so methods for nonoverlapping subnetwork extraction are limited by neglecting inclusive relationships [1]. Emerging approaches discover overlapping modular network structure, in which a single node may belong to more than one module. We summarize here some representative approaches used in brain subnetwork extraction; detailed information can be found in a review paper on general overlapping community detection [2].

The Clique Percolation Method (CPM) is one of the earliest methods for overlapping community detection [3]. It is based on the assumption that communities tend to be comprised of overlapping sets of cliques, i.e., fully connected subgraphs, and it identifies overlapping communities by searching for connected cliques. First, all cliques of a fixed size k are detected, and a clique adjacency matrix is constructed by taking each clique as a vertex in a new graph. Two cliques are considered connected if they share k-1 nodes. Communities then correspond to the connected components of the clique adjacency matrix. Since a vertex can be in multiple cliques simultaneously, mapping the communities from the clique level back to the node level may result in nodes being assigned to multiple communities [2, 4]. The limitation of CPM is that it operates on binarized graph edges and thus cannot handle weighted graphs [5].

A new definition of modularity has been proposed to discover overlapping subnetworks based on unbiased cluster coefficients using resting state connectivity [1]. However, methods based on the modularity function Q suffer from degenerate partitions and the resolution limit [4].
Another line of studies transforms a network into its corresponding line graph, where the nodes represent the connections in the original network. Nonoverlapping community detection on the line graph (modularity maximization in [6] and agglomerative hierarchical clustering in [7]) then results in overlapping subnetworks in the original network. These approaches inherit the limitations of the nonoverlapping community detection applied to the line graph (the resolution limit for modularity maximization and local sub-optima for hierarchical clustering).

Fuzzy community detection algorithms quantify the strength of association between all pairs of communities and nodes [2]. Fuzzy k-means clustering [8] and fuzzy affinity propagation [9] have been applied to overlapping brain subnetwork extraction. However, one has to use an ad hoc threshold to extract interacting or independent nodes from the membership vector.

Local expansion and optimization algorithms grow a natural community [10] or a partial community based on local benefit functions [2]. One example is Connected Iterative Scan (CIS), which has been explored for brain subnetwork extraction [11]. Taking each node as a partial subnetwork, CIS expands the subnetwork by using a local function to determine whether other nodes belong to it, forming a densely connected group of nodes. Its limitation is its sensitivity to a density factor that controls subnetwork size [12]. Another example is the Order Statistics Local Optimization Method (OSLOM) [13], which tests the statistical significance of a subnetwork against a global randomly generated null model during community expansion. OSLOM has been shown to outperform many state-of-the-art community detection techniques.

In previous work from our lab, the Replicator Dynamics (RD) concept from theoretical biology for modeling the evolution of interacting and self-replicating entities was used to identify subnetworks.
Further, the RD formulation was extended to enable overlaps between subnetworks by incorporating a graph augmentation strategy [14]. This approach, Stable Overlapping Replicator Dynamics (SORD) [12], has demonstrated its superiority over many commonly used overlapping subnetwork extraction methods, including OSLOM.

Most of the aforementioned algorithms are based on a single source, such as resting state functional connectivity. Coupled Stable Overlapping Replicator Dynamics (CSORD), the multimodal version of SORD, is one of the few overlapping methods that considers multisource information. CSORD is based on survival probabilities of different genders in evolution and graph augmentation [14]. However, the theoretical basis of its overlapping assumption, graph augmentation, has relatively indirect neuroscientific justification. We here explore integrating multisource information for overlapping subnetwork extraction by using the straightforward clique concept.

Figure 1: The schematic illustration of multisource clique based overlapping subnetwork extraction approach.

A recent study has indicated that nodes repeatedly activated in different tasks could be canonical network components in the pre-existing repertoires of intrinsic subnetworks [15]; we argue that the clique concept closely resembles such groups of nodes. Based on the basic observation that typical communities consist of several cliques that tend to share many of their nodes [3], a clique-based approach is a straightforward way to find overlapping brain subnetworks. However, the existing clique-based subnetwork extraction approach, CPM (kclique) [3], has three major limitations: it can only handle binary graphs, not weighted graphs; the clique size k is fixed and needs to be adjusted for different types of networks; and it only uses uni-source information.
To tackle the aforementioned limitations, we propose a multisource subnetwork extraction approach based on the co-activated clique, which (1) uses task co-activation and task connectivity strength information for clique identification, (2) automatically detects cliques of different sizes, which has more neuroscientific justification, and (3) shares the subnetwork membership, derived from the multisource hypergraph based approach we recently proposed [16], among nodes within a clique for overlapping subnetwork extraction. The schematic illustration of our approach is shown in Figure 1.

We first detect co-activated groups of brain nodes across different tasks based on an activation fingerprint idea, and then identify densely connected cliques based on task-induced weighted connectivity. Core cliques are further detected using clique properties that we define. Because the nodes in a fully connected clique are closely related, they should belong to the same subnetworks; we thus share the subnetwork membership of nodes within a clique to facilitate overlapping subnetwork assignment. The initial subnetwork membership of each node is derived from a non-overlapping subnetwork extraction technique, which is based on the fusion of resting state connectivity and task information embedded with high order relations using a hypergraph (see details and notation in [16]). Our approach differs from the traditional uni-source kclique method in that it uses both the task co-activation information and the connectivity weights (the kclique method uses only the binarized connectivity). The co-activated cliques derived using our approach have flexible sizes, which has more neuroscientific justification than a fixed k. In addition, we explore whether our proposed idea of sharing subnetwork membership among clique nodes can generate more straightforward and biologically meaningful results than the existing multisource method CSORD.
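For reference, the kclique (CPM) baseline [3] discussed above can be sketched as follows. This is a minimal illustration in plain Python, assuming an unweighted edge list and a brute-force clique search that is only practical for small graphs. The function name `k_clique_communities` and the toy graph are illustrative assumptions and are not taken from [3].

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Clique percolation on an unweighted graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate all fully connected node groups of size k (brute force).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: two cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) == k - 1:
            parent[find(a)] = find(b)

    # Each connected component of cliques maps back to one community of nodes.
    groups = {}
    for idx, clique in enumerate(cliques):
        groups.setdefault(find(idx), set()).update(clique)
    return sorted(groups.values(), key=sorted)


# Hypothetical toy graph: two groups of triangles that meet only at node 3.
toy_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
             (3, 4), (3, 5), (4, 5)]
communities = k_clique_communities(toy_edges, k=3)
```

With k = 3, node 3 ends up in both communities, which is how CPM produces overlaps. Note that the edge weights play no role and k has to be chosen in advance, the two limitations that motivate the co-activated clique.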
We define cliques as co-activated groups of brain nodes that are densely connected. We first identify the co-activated groups of brain nodes (coarse cliques) using an activation fingerprint idea. We then refine the coarse cliques into cliques, within which nodes are densely connected to each other based on task-induced connectivity information. We denote the clique set as $CS$ and the coarse clique set as $CCS$. Given $T$ different tasks, one can construct a hypergraph with an $N \times T$ incidence matrix $H$, where $N$ is the number of brain regions and $T$ is the number of tasks (hyperedges $e$); $h(v, e) = 1$ when the brain region node $v$ is activated in the task corresponding to hyperedge $e$. The task-induced connectivity matrix $C_{task}$ is generated by removing all inter-block rest periods from all regions' time courses, concatenating the time courses through block/event durations across all the tasks, and computing pairwise Pearson's correlations. The underlying assumption of our clique identification is that nodes in the same clique should be co-activated across tasks $t$ times, $t = 1, \ldots, T$, where $t$ indicates the number of tasks in which the nodes are co-activated. Clique identification involves two steps: (1) pre-selecting sets of coarse cliques in all $T$ layers, and (2) refining the coarse cliques into cliques.

The approach starts with a pre-selection of coarse cliques $CCS$, which might include loosely connected nodes that are co-activated. Take each row of the incidence matrix $H$ as an activation fingerprint vector $f$ corresponding to the task activation pattern of a node. For example, if a node is activated in the 1st, 3rd and 6th of the seven tasks, the corresponding $f = [1010010]$.
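A minimal sketch of the activation fingerprint, assuming a small hypothetical activation table (node names and task sets are made up for illustration). The element-wise AND and the task count computed at the end correspond to the pairwise co-activation quantities of Equations 1 and 2.

```python
def fingerprint(active_tasks, n_tasks):
    """Binary activation fingerprint: entry t is 1 if the node is active in task t+1."""
    return [1 if t in active_tasks else 0 for t in range(1, n_tasks + 1)]


def co_activation(f_i, f_j):
    """Element-wise AND of two fingerprints (tasks in which both nodes are active)."""
    return [a & b for a, b in zip(f_i, f_j)]


T = 7
# Hypothetical activation table: node name -> set of 1-based task indices.
activations = {"a": {1, 3, 6}, "b": {1, 3, 4, 6}, "c": {2, 6}}
f = {node: fingerprint(tasks, T) for node, tasks in activations.items()}

sf_ab = co_activation(f["a"], f["b"])        # co-activation fingerprint of (a, b)
nt_ab = sum(sf_ab)                           # number of tasks shared by a and b
nt_ac = sum(co_activation(f["a"], f["c"]))   # number of tasks shared by a and c
```

Node "a" reproduces the fingerprint of the example in the text, [1, 0, 1, 0, 0, 1, 0]. It shares three tasks with "b" and one task with "c", so the pair (a, b) would be considered in layer t = 3 and the pair (a, c) in layer t = 1.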
We next apply a bit-wise AND to the fingerprints of a node pair $\{i, j\}$, which gives an output fingerprint vector of co-activation patterns $SF_{ij}$:

$$SF_{ij} = f_i \wedge f_j, \tag{1}$$

where $f_i$ and $f_j$ are the activation fingerprint vectors of nodes $i$ and $j$, and $SF$ is the matrix containing the co-activation fingerprint vectors of all node pairs. We then define a matrix $NT$ which counts the number of co-activated tasks between two nodes:

$$NT_{ij} = \sum_{t=1}^{T} SF_{ij}(t), \tag{2}$$

where $SF_{ij}$ is the co-activation fingerprint vector of length $T$. Next, we define the node set $PS^{=t}$, which contains nodes that are co-activated together exactly $t$ times, as:

$$PS^{=t} = \bigcup_{\forall i,j \ \mathrm{s.t.}\ NT_{ij} = t} \{i, j\}, \tag{3}$$

and the node set $PS^{>t}$, which contains nodes that are co-activated together more than $t$ times, as:

$$PS^{>t} = \bigcup_{\forall i,j \ \mathrm{s.t.}\ NT_{ij} > t} \{i, j\}. \tag{4}$$

Based on the definitions above, we follow the four steps below to identify the coarse cliques.

Step 1. We extract $M_t$ pre-selected sets of co-activated coarse cliques from the nodes in $PS^{=t}$. We identify $\{CCS^{=t}_1, CCS^{=t}_2, \ldots, CCS^{=t}_{M_t}\}$ by ensuring that all the node pairs within a given set share the same co-activation fingerprint vector in $SF$:

$$CCS^{=t}_m = \{p^m_1, p^m_2, \ldots, p^m_{N_m} \mid \exists\, p^m_i, p^m_j \in PS^{=t}, \ \mathrm{s.t.}\ SF_{p^m_i p^m_j} = SF_{p^m_1 p^m_2}\}, \tag{5}$$

where $m = 1, \ldots, M_t$. The minimal rank of $CCS^{=t}_m$ is 2, corresponding to a coarse clique that contains only one node pair. The nodes identified in a coarse clique are fully connected to each other in the sense that they share the same co-activation pattern.

Step 2. Similarly, we extract $M_t$ extended sets of co-activation coarse cliques, $\{CCS^{>t}_1, CCS^{>t}_2, \ldots, CCS^{>t}_{M_t}\}$, from the nodes in $PS^{>t}$, based on the co-activation patterns between nodes in $CCS^{=t}_m$ and $PS^{>t}$:

$$CCS^{>t}_m = \bigcup_{\forall i \in CCS^{=t}_m,\ \exists j \in PS^{>t},\ \mathrm{s.t.}\ SF_{ij} \wedge SF_{p^m_1 p^m_2} = SF_{p^m_1 p^m_2}} \{j\}. \tag{6}$$

We do not consider the coarse clique set selection for the nodes which only exist in the node set $PS^{>t}$ for the $t$-th layer, since those will be selected in the pre-selected sets of layer $t + 1$.

Step 3. We then generate $M_t$ coarse clique sets by merging the pre-selected and extended sets:

$$CCS^{t} = \{CCS^{t}_1, CCS^{t}_2, \ldots, CCS^{t}_{M_t}\}, \qquad CCS^{t}_m = CCS^{=t}_m \cup CCS^{>t}_m, \quad m = 1, \ldots, M_t. \tag{7}$$

Step 4. Extract the coarse clique set $CCS$ across layers in the order from $T$ to 1:

$$CCS = \bigcup_{t = T, \ldots, 1} CCS^{t}. \tag{8}$$

The second part of clique identification is to refine the coarse cliques into cliques. After we extract $CCS = \{CCS_1, CCS_2, \ldots, CCS_M\}$, there still exist loosely connected nodes in the coarse cliques, mostly from lower layers where $t$ is small, especially when $t = 1$. Hence, we subsequently extract cliques based on the strength information from the task-induced connectivity matrix $C_{task}$ and hypergraph properties. We formulate a coarse clique set $CCS_k = \{p^k_1, p^k_2, \ldots, p^k_{M_k}\}$, which contains $M_k$ nodes, as an $M_k \times M_k$ simple graph whose edge weights are the pairwise task-induced connectivity strengths. We next apply local thresholding [17] to the $M_k \times M_k$ connectivity matrix $C_{task-k}$ to find the most closely connected nodes for each node, and binarize the thresholded matrix to generate an adjacency matrix $A_{task-k}$. We then transform $A_{task-k}$ into its hypergraph $H_{task-k}$ using $A = HWH^{T} - D_v$, where the locations with 1 in each hyperedge correspond to the nodes that comprise a fully connected subgraph, i.e., a clique $CS_c$. We extract $N_c$ cliques:

$$CS = \{CS_1, CS_2, \ldots, CS_{N_c}\}. \tag{9}$$

We present three properties that can be derived to study the cliques for further network analysis.

(1) Co-activation times $N^{COA}_c$ of a clique $CS_c$, i.e., the number of ones in the clique co-activation fingerprint:

$$CSF_c = \bigwedge_{\forall i \in CS_c} f_i, \tag{10}$$

then the co-activation times:

$$N^{COA}_c = \sum_{t=1}^{T} CSF_c(t). \tag{11}$$

(2) Activation times in a clique:

$$N^{A}_c = \frac{1}{|CS_c|} \sum_{\forall p \in CS_c} \ \sum_{t = 1, \ldots, T} f_p(t). \tag{12}$$

(3) Clique overlap ratio: the number of times a clique overlaps with other cliques divided by the size of the clique, i.e., the number of nodes within the clique. We first define the set of cliques to which node $i$ belongs as a label set:

$$LC_i = \{c^i_1, c^i_2, \ldots, c^i_{N_i}\}, \quad c^i_k \in 1, \ldots, N_c, \tag{13}$$

where $LC_i$ is an empty set when node $i$ does not belong to any clique. We then define the clique overlap ratio as:

$$RCO_c = \frac{1}{|CS_c|} \left| \bigcup_{\forall p \in CS_c} LC_p \right|. \tag{14}$$

Core Clique Identification

Based on the clique properties, we further identify core cliques out of the clique set for the subsequent overlapping subnetwork extraction. We argue that core cliques should have relatively high co-activation times, high activation times, and a high clique overlap ratio. We therefore devise a core clique selection criterion based on a combination of the clique properties. We normalize all the property values into the range [0, 1] by dividing individual values by the maximum across all the cliques. The criterion is set as:

$$\rho = \frac{\operatorname{median}_{\forall i \in CS}\{N^{COA}_i\}}{\max_{\forall i \in CS}\{N^{COA}_i\}} + \frac{\operatorname{median}_{\forall i \in CS}\{N^{A}_i\}}{\max_{\forall i \in CS}\{N^{A}_i\}} + \frac{\operatorname{median}_{\forall i \in CS}\{RCO_i\}}{\max_{\forall i \in CS}\{RCO_i\}}. \tag{15}$$

Any clique $c$ which satisfies the criterion

$$\frac{N^{COA}_c}{\max_{\forall i \in CS}\{N^{COA}_i\}} + \frac{N^{A}_c}{\max_{\forall i \in CS}\{N^{A}_i\}} + \frac{RCO_c}{\max_{\forall i \in CS}\{RCO_i\}} > \rho \tag{16}$$

is selected into the core clique set.

Based on the identified core cliques, we further deploy a subnetwork membership sharing technique to identify overlapping subnetworks.
The underlying rationale is that the nodes residing within the same clique behave very similarly to perform some basic functions in tasks; thus, they should be within the same subnetworks. In a brain graph with $N$ nodes, let $C_{rest}$ be an $N \times N$ resting state connectivity matrix, and suppose we have already labeled the non-overlapping subnetwork membership of each node using $C_{rest}$. We have also defined the clique membership of a node $i$ as $LC_i$ in Equation 13. We then share the subnetwork membership of the nodes within a clique to facilitate overlapping subnetwork assignment.

First, $M_s$ subnetworks are extracted using a non-overlapping community detection approach applied to $C_{rest}$. We define the subnetwork membership of a node $i$ as:

$$label(i) = s, \quad i \in 1, \ldots, N, \quad s \in 1, \ldots, M_s. \tag{17}$$

Next, we share the subnetwork membership label $label(i)$ of a node $i$ with the label set of the remaining nodes in the cliques to which node $i$ belongs:

$$LS(i) = \bigcup_{\forall c \in LC_i} \ \bigcup_{\forall p \in CS_c} label(p), \tag{18}$$

and

$$label(i) = label(i) \cup LS(i). \tag{19}$$

We have also explored replacing the resting state connectivity matrix $C_{rest}$ with the multisource connectivity matrix $C_{t-r}$ defined in [16]. We argue that, when identifying the non-overlapping subnetwork membership, we should further integrate the rest data with the activation information from task data and the high order relation information represented by the hypergraph.

We first compare our multisource clique based approach against the uni-source kclique method [3], which is the closest straightforward way to identify overlapping subnetworks. Next, we compare against SORD, which has been shown to outperform state-of-the-art techniques such as OSLOM, and CSORD (the multisource version of SORD) [18], to see whether our proposed approach has more direct biological intuition for overlapping subnetwork extraction. We also examine the nodes within subnetwork overlaps derived by our approach by assessing the probability of a node belonging to subnetworks using our recently proposed multimodal Random Walker (RW) approach [19], to verify that our overlapping subnetwork assignments correspond with the posterior probability.
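The clique properties, the core clique criterion and the membership sharing (Equations 10 to 19) can be sketched together on a toy example. This is a minimal illustration under stated assumptions: the cliques and initial labels are given rather than computed, each property is divided by its own maximum as described in the normalization step, sharing is restricted to the selected core cliques, and all function names and data are hypothetical.

```python
from statistics import median


def clique_properties(cliques, f):
    """Return (N_COA, N_A, RCO) per clique and the label sets LC_i (Eqs. 10-14)."""
    n_tasks = len(next(iter(f.values())))
    lc = {}  # Eq. 13: node -> indices of the cliques containing it
    for c, clique in enumerate(cliques):
        for p in clique:
            lc.setdefault(p, set()).add(c)
    n_coa, n_a, rco = [], [], []
    for clique in cliques:
        # Eqs. 10-11: AND of member fingerprints, then count the ones.
        csf = [min(f[p][t] for p in clique) for t in range(n_tasks)]
        n_coa.append(sum(csf))
        # Eq. 12: mean number of activated tasks per member node.
        n_a.append(sum(sum(f[p]) for p in clique) / len(clique))
        # Eq. 14: number of distinct cliques touched by members, per member.
        rco.append(len(set().union(*(lc[p] for p in clique))) / len(clique))
    return (n_coa, n_a, rco), lc


def core_cliques(props):
    """Eqs. 15-16, with each property divided by its own maximum (assumption)."""
    rho = sum(median(v) / max(v) for v in props)
    n = len(props[0])
    scores = [sum(v[c] / max(v) for v in props) for c in range(n)]
    return [c for c in range(n) if scores[c] > rho]


def share_labels(labels, cliques, selected):
    """Eqs. 17-19: pool the initial labels of all nodes in each selected clique."""
    shared = {i: {s} for i, s in labels.items()}
    for c in selected:
        pooled = {labels[p] for p in cliques[c]}
        for p in cliques[c]:
            shared[p] |= pooled
    return shared


# Hypothetical toy data: 6 nodes, T = 4 tasks, 3 cliques, 3 initial subnetworks.
f = {0: [1, 1, 1, 0], 1: [1, 1, 1, 0], 2: [1, 1, 0, 0],
     3: [1, 0, 0, 0], 4: [1, 0, 0, 1], 5: [0, 0, 0, 1]}
cliques = [{0, 1, 2}, {2, 3}, {4, 5}]
labels = {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3}

props, lc = clique_properties(cliques, f)
core = core_cliques(props)
overlapping = share_labels(labels, cliques, core)
```

In this toy example cliques 0 and 1 pass the criterion while clique 2 does not, so nodes 2 and 3 inherit each other's labels and become interacting nodes, whereas nodes 4 and 5 keep their single initial label.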
We used the resting state functional Magnetic Resonance Imaging (fMRI) and task fMRI scans of 77 unrelated healthy subjects from the Human Connectome Project (HCP) dataset [20]. Two sessions of resting state fMRI, 30 minutes each, and seven sessions of task fMRI data were available for multisource integration. The seven tasks are working memory (total time: 10:02), gambling (6:24), motor (7:08), language (7:54), social cognition (6:54), relational processing (5:52) and emotion processing (4:32). Preprocessing already applied to the HCP fMRI data includes gradient distortion correction, motion correction, spatial normalization to Montreal Neurological Institute (MNI) space with nonlinear registration based on a single spline interpolation, and intensity normalization [21]. Additionally, we regressed out motion artifacts, mean white matter and cerebrospinal fluid confounds, and principal components of high variance voxels using compCor [22]. Next, we applied a bandpass filter with cutoff frequencies of 0.01 and 0.1 Hz to the resting state fMRI data. For task fMRI data, we performed similar temporal processing, except that a high-pass filter at 1/128 Hz was used. The data were further demeaned and normalized by the standard deviation. We then used the Harvard-Oxford (HO) atlas [23], which has 112 regions of interest (ROIs), to define the brain region nodes. We chose the well-established HO atlas because it samples from every major brain system and, compared to other commonly used anatomical atlases, was built from the highest number of subjects using both manual and automatic labelling techniques. Voxel time courses within ROIs were averaged to generate region time courses. The region time courses were demeaned and normalized by the standard deviation. Group level time courses were generated by concatenating the time courses across subjects. The Pearson's correlation values between the region time courses were taken as estimates of the Functional Connectivity (FC) matrices.
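A minimal sketch of the connectivity estimate, assuming region time courses stored as plain Python lists (the three toy series are hypothetical). The optional clipping of negative correlations mirrors the treatment of negative connectivity used in this work.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation of two equally long, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def connectivity(time_courses, clip_negative=True):
    """Pairwise correlation matrix of region time courses."""
    n = len(time_courses)
    fc = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            r = pearson(time_courses[i], time_courses[j])
            # Negative correlations are set to zero when clip_negative is True.
            fc[i][j] = max(r, 0.0) if clip_negative else r
    return fc


# Three hypothetical region time courses (regions 0 and 2 are anti-correlated).
tc = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
fc = connectivity(tc)
```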
Negative elements in all connectivity matrices were set to zero due to the currently unclear interpretation of negative connectivity [24]. For task activation, we applied activation detection to the seven available tasks following the steps described in [16].

We further applied local thresholding [17] to $\bar{C}_{task}$, setting the graph density to 0.1, to generate the hypergraph when identifying cliques from the coarse clique set. We selected a relatively strict threshold so that only the most closely connected nodes form cliques. The value 0.1 was chosen by cross-validation on inter-subject reproducibility within the range from 0.03 to 0.2 at an interval of 0.01. The non-overlapping subnetworks were derived from the resting state connectivity matrix $C_{rest}$ or the multisource connectivity matrix $C_{t-r}$ (generated using strategies from [16]) using Normalized cuts (Ncuts), with the number of subnetworks set to 7, the same as the number of available tasks.

We compared overlapping subnetwork extraction using our proposed Multisource Clique-based Subnetwork Extraction (MCSE) approach, with $C_{t-r}$ or $C_{rest}$, against the uni-source kclique approach [3]; SORD [12], which has been demonstrated to outperform state-of-the-art overlapping community detection methods including OSLOM when applied to brain subnetwork extraction; and CSORD [18], the multisource extension of SORD. The two uni-source approaches extract overlapping subnetworks using resting state data. The parameters for kclique were set by cross-validation, based on inter-subject reproducibility, over the clique size k from the suggested range [3, ..., 6] [3] and over reasonable graph densities from 0.03 to 0.2 at an interval of 0.01. SORD and CSORD applied 100 bootstraps by sampling with replacement, as suggested in [12]. We also evaluated the probability of a node being assigned to a subnetwork using our recently proposed RW based approach [19] to examine the proposed clique-based overlapping subnetwork identification. All statistical comparisons are based on the Wilcoxon signed rank test with significance declared at an α of 0.05 with Bonferroni correction.

Figure 2: Group-level subnetwork extraction reproducibility based on data from two different sessions. MCSE outperforms all other contrasted approaches.

We quantitatively evaluated the contrasted approaches based on test-retest reliability and inter-subject reproducibility, since ground truth subnetworks are unknown for real human brain data.
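Both reproducibility measures are reported as a Dice Similarity Coefficient (DSC) between two sets of subnetworks. A minimal sketch of such a score follows. The matching rule used here (each reference subnetwork is paired with its best-overlapping counterpart and the scores are averaged) is an assumption made for illustration, since the text does not spell out how subnetworks are matched, and the two toy partitions are hypothetical.

```python
def dice(a, b):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) of two node sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_best_dice(reference, candidate):
    """Average, over reference subnetworks, of the best DSC with any candidate."""
    best = [max(dice(r, c) for c in candidate) for r in reference]
    return sum(best) / len(best)


# Hypothetical overlapping subnetworks from two sessions (node 3 overlaps).
session1 = [{0, 1, 2, 3}, {3, 4, 5}]
session2 = [{0, 1, 2}, {3, 4, 5, 6}]
score = mean_best_dice(session1, session2)
```

Because subnetworks are compared as node sets, overlapping nodes need no special handling: a node that belongs to two subnetworks simply contributes to both set comparisons.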
We first assessed the test-retest reliability based on group level subnetworks extracted separately from two sessions of rest and task data (each of the seven tasks includes two sessions of fMRI data) using the Dice Similarity Coefficient (DSC). The subnetworks extracted from the first session's data are taken as the "ground truth", against which the subnetworks from the second session are compared. We found that our proposed MCSE outperforms all other contrasted approaches, achieving a DSC between subnetworks extracted from the two sessions of 0.8917 with $C_{t-r}$ and 0.8865 with $C_{rest}$, against 0.7514 for kclique, 0.8378 for SORD, and 0.8514 for CSORD; see Figure 2.

We assessed the inter-subject reproducibility by comparing the subnetwork extraction results using subject-wise data against the group level data (Figure 3). The average DSC between subject-wise and group level subnetworks across the 77 subjects for the five approaches is: MCSE with $C_{t-r}$ at 0.7024 ± …, $C_{rest}$ at 0.6281 ± …, … ± …, … ± …, and … ± …. MCSE with $C_{t-r}$ and $C_{rest}$ are found to achieve statistically higher inter-subject reproducibility than the contrasted approaches based on the Wilcoxon signed rank test at p < … and p < …. MCSE with $C_{t-r}$ outperforms $C_{rest}$ at p < ….

Figure 3: Subject-wise level inter-subject reproducibility of subnetwork extraction. Our proposed MCSE approach outperforms existing state-of-the-art overlapping community detection methods.

We further examined the biological meaning of the overlapping subnetworks found using all five methods, i.e., our proposed MCSE with $C_{t-r}$, MCSE with $C_{rest}$, kclique, SORD and CSORD (Figure 4). We first measured the overlapping ratio by dividing the number of nodes residing in the subnetwork overlaps by the total number of brain regions detected in subnetworks. The ratios of the five methods are 0.3482, 0.3482, 0.4444, 0.4328 and 0.2885, respectively. Our proposed approach thus generates a ratio of interacting nodes, i.e., nodes residing within subnetwork overlaps, similar to that of the existing overlapping methods. We note that CSORD generated a relatively smaller number of interacting nodes; a possible reason is that its strict stability selection excluded some meaningful nodes, which were taken as falsely detected nodes arising from noise [18].

By examining the locations of those interacting nodes, we found that our proposed MCSE with $C_{t-r}$ approach identified subnetwork overlaps within the pre- and postcentral gyri, medial superior frontal cortex, inferior frontal gyrus, superior parietal lobule, precuneous, lateral occipital cortex, occipital pole and frontal orbital cortex, which match well with functional hubs previously identified by graph-theoretical analysis based on the degree of the voxels [25]. In addition, the insula, putamen, thalamus and supramarginal gyrus were found within subnetwork overlaps, which match well with the connector hubs identified using centrality measures [26].
The results of MCSE with $C_{rest}$ are very similar to those with $C_{t-r}$, except that the precuneous cortex was missed and the temporal pole was misclassified into the subnetwork overlaps. This result confirms the benefit of integrating the information from both task and rest data. Both MCSE methods also identified the lingual gyrus and the nearby fusiform cortex as interacting nodes. The lingual gyrus was identified as a hub based on cortical thickness correlation [27], and the fusiform cortex within occipitotemporal cortex has been found to be an intermediary "hub" linking visual and higher linguistic representations [28].

Figure 4: Visualization of task activation and overlapping subnetworks extracted by our proposed approach and three contrasted methods. Panels: (a) Task activation, (b) MCSE with $C_{t-r}$, (c) MCSE with $C_{rest}$, (d) kclique, (e) SORD, (f) CSORD. The brain is visualized in the axial view. Our proposed MCSE approach outperforms existing state-of-the-art overlapping community detection methods by detecting well-known hubs which reside within subnetwork overlaps.

As for the traditional kclique approach, biologically meaningful subnetwork overlaps were found within the inferior frontal gyrus, superior and middle temporal gyri [25], supramarginal gyrus, insula [26], inferior temporal gyrus [29], and occipitotemporal cortex [28]. kclique failed to identify all the other aforementioned (connector) hubs that were found using our proposed methods. Instead, regions not normally considered to reside in subnetwork overlaps were found, such as the temporal fusiform cortex, central opercular cortex, and parietal operculum cortex.
On the other hand, the kclique approach detected the angular gyrus (functioning as a semantic hub) within subnetwork overlaps.

SORD was able to find subnetwork overlaps within the inferior, superior and middle temporal gyri, superior parietal lobule, lateral occipital cortex, occipital pole and lingual gyrus, which match well with functional hubs, but failed to find other hub regions identified by MCSE. Instead, SORD detected as interacting nodes many regions that are not normally considered hubs, such as the intracalcarine cortex and cuneal cortex in the visual system, and regions in the language related system, including the central opercular cortex, parietal operculum cortex, planum polare, planum temporale, Heschl's gyrus, and supracalcarine cortex.

With a relatively lower overlapping ratio, CSORD identified biologically meaningful subnetwork overlaps within regions such as the pre- and postcentral gyri, middle temporal gyrus, angular gyrus and lateral occipital cortex, while failing to find any other hubs. Similar to SORD, it included in the subnetwork overlaps some regions in the language related system, such as the central opercular cortex, parietal operculum cortex, and planum temporale. We did not discover the single subnetwork constituting the visual corticostriatal loop, striatothalamo-cortical loop, and cerebello-thalamo-cortical loop that was found in [18]. The reason could be that this connection is reflected in Anatomical Connectivity (AC) rather than in task functional connectivity.

Collectively, our proposed MCSE approach is able to identify subnetwork overlaps that comprise more biologically meaningful brain regions, such as hubs, than the contrasted methods.

Comparison Between the Subnetwork Overlaps and the RW Posterior Probability
We also examined the overlapping subnetworks derived from our approach, MCSE with $C_{t-r}$, by assessing the probability of a node belonging to subnetworks using our recently proposed multimodal RW approach [19], to verify that our overlapping subnetwork assignments correspond with the posterior probability. The underlying rationale is that for an interacting node, which resides within the subnetwork overlaps, the probability of belonging to a subnetwork will be distributed across the subnetworks it resides in. On the other hand, an individual node, which does not reside within subnetwork overlaps, is more likely to have a dominant probability of being assigned to one particular subnetwork. Hence, the difference between the two highest probabilities of a node being assigned to subnetworks indicates how likely the node is to reside within subnetwork overlaps: interacting nodes tend to have a smaller difference between the two highest probabilities.

We define the degree of overlapping confidence as one minus the difference between the two highest probabilities of a node being assigned to subnetworks. The nodes identified within the subnetwork overlaps (interacting nodes) and outside of the overlaps (individual nodes) are considered as two populations. For each population, the average overlapping confidence is defined in Equation 20:

$$overConf = \frac{1}{|S|} \sum_{i \in S} \left(1 - (p^{max}_i - p^{smax}_i)\right), \tag{20}$$

where $S$ is a set of nodes, either the nodes residing within or those outside the subnetwork overlaps, $p^{max}_i$ is the maximal probability of node $i$ belonging to a subnetwork, and $p^{smax}_i$ is the second maximal probability.
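Equation 20 can be sketched directly. The posterior probabilities below are hypothetical values chosen for illustration, not results from the RW approach [19].

```python
def over_conf(posteriors):
    """Average of 1 - (p_max - p_second_max) over a set of nodes (Eq. 20)."""
    total = 0.0
    for probs in posteriors:
        top, second = sorted(probs, reverse=True)[:2]
        total += 1.0 - (top - second)
    return total / len(posteriors)


# Hypothetical posteriors over three subnetworks (each row sums to 1).
interacting = [[0.45, 0.40, 0.15], [0.50, 0.30, 0.20]]
individual = [[0.90, 0.05, 0.05], [0.80, 0.10, 0.10]]

conf_interacting = over_conf(interacting)
conf_individual = over_conf(individual)
```

A node whose probability mass is split between two subnetworks yields a value close to 1, while a node dominated by a single subnetwork yields a value close to 0, so the interacting population in this toy example scores higher than the individual one.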
Thus, the interacting node population is expected to have a higher $overConf$ than the individual node population.

We first derived the probabilities of each node being assigned to all possible subnetworks using our recently proposed multimodal RW approach [19], where the two sources of connectivity matrices are $C_{rest}$ and $\bar{C}_{task}$, matching how $C_{t-r}$ was generated in our approach. The number of seeds within each subnetwork, $n_k$, was set to [2, …, … $C_{t-r}$.

We found that the overlapping confidence of the interacting nodes, with an average $overConf$ of 0.6884, is statistically higher than that of the individual nodes, with an average $overConf$ of 0.6338, based on the Wilcoxon signed rank test at p = 0.006; see Figure 5. This finding confirms that the overlapping subnetwork assignments based on our proposed MCSE match the probability derived independently from our RW based approach.

Figure 5: Overlapping confidence of interacting nodes (blue) versus individual nodes (red) derived by MCSE with $C_{t-r}$. The probability of a node being assigned to subnetworks was derived by the RW based approach [19].

The traditional definition of a clique is a fully connected subgraph identified from the connections between brain regions, mostly based on resting state connectivity. In our approach, we present a novel way to identify cliques based on the similarity of activation patterns between nodes. We argue that the clique concept closely resembles the canonical network components that are recruited selectively and repeatedly in different task-induced activities [15]. Unlike the traditional kclique method [3], our clique-based approach uses both the task activation information and the task-induced connectivity strength, rather than only the binarized connectivity information. In addition, the cliques derived using our approach have flexible sizes, determined automatically, which has more neuroscientific justification than the fixed clique size in [3].
Moreover, we estimate properties of the cliques to indicate their importance, which gives us better control over falsely including spurious cliques caused by noise. We did find cliques within brain areas that match well with hubs, which indicates that our approach can identify subnetwork overlaps with more biological meaning than the traditional kclique method.

Multisource Information Integration Improves the Overlapping Subnetwork Extraction
Compared to the widely used overlapping community detection methods, our approach integrates information from multiple sources. We used both task information (including task activation and connectivity strength) for clique identification and resting state connectivity information for subnetwork membership sharing. The results on reproducibility and biological meaning indicate that our multisource approach, especially MCSE with C t-r, outperforms uni-source methods such as kclique and SORD, the latter of which has been shown to give better overlapping brain subnetwork extraction results than state-of-the-art techniques such as OSLOM. We note that our multisource approach further outperformed the multisource version of SORD, CSORD. A possible reason is that the clique-based idea and the sharing of node subnetwork membership are more straightforward and have more direct biological intuition than relying on the survival probabilities of different genders in evolution, which is used in CSORD.

We have identified subnetwork overlaps within brain regions that match well with hubs defined using functional, structural and anatomical information. The results enable us to study the interaction and integration between subnetworks and how interacting nodes (or important hubs) play their roles in the information flow across different subnetworks. We further demonstrated that the assignments of interacting/individual nodes using our proposed MCSE correspond with the posterior probability derived independently from our previously proposed RW based approach [19]. The finding of more distinguishable overlapping confidence between the two populations of nodes when the number of seeds was set within a range of [6, 8] confirms the merit of using multiple seeds within a reasonable range (not including connector hubs) in the RW based approach.
We also observed that the uni-source traditional kclique approach has high computational complexity when the graph density increases, since a large number of fully connected subgraphs then exist. The computational complexity of both SORD and CSORD increases as the number of bootstrap samples increases [5]. In contrast, the computation time of our proposed MCSE is quite reasonable and not sensitive to the graph density.

In terms of the coverage of the brain area in the subnetwork extraction results, SORD and CSORD neglected some brain regions that were not selected as significant nodes by the stability selection. While these two approaches offered this extra feature, they sometimes falsely missed important nodes and failed to cover the whole brain for analysis.
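To illustrate why enumeration-based clique methods become costly on denser graphs, the following minimal sketch counts fully connected node groups of a fixed size k in nested graphs obtained by thresholding one random weight matrix at increasing densities. It is a brute-force illustration only, not the kclique implementation of [3] and not part of MCSE; the function names, the graph size and the densities are assumptions chosen for the example.

```python
import itertools
import random


def threshold_graph(weights, density):
    """Keep edge (i, j), i < j, when its random weight is below `density`."""
    n = len(weights)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if weights[i][j] < density}


def count_k_cliques(n_nodes, edges, k):
    """Brute-force count of fully connected k-node groups (hypothetical helper)."""
    count = 0
    for nodes in itertools.combinations(range(n_nodes), k):
        # combinations() yields sorted tuples, so each pair matches the (i, j), i < j storage
        if all(pair in edges for pair in itertools.combinations(nodes, 2)):
            count += 1
    return count


def clique_counts_by_density(n_nodes=20, k=3, densities=(0.1, 0.3, 0.5, 0.7), seed=0):
    """Count k-cliques at each density; the graphs are nested because weights are fixed."""
    rng = random.Random(seed)
    weights = [[rng.random() for _ in range(n_nodes)] for _ in range(n_nodes)]
    return [count_k_cliques(n_nodes, threshold_graph(weights, d), k)
            for d in densities]


if __name__ == "__main__":
    print(clique_counts_by_density())
```

Because the edge sets are nested, the count can only grow with density, which is the effect that makes exhaustive clique enumeration expensive on dense connectivity graphs.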
In this work, we presented an approach to identify cliques based on task information and to extract overlapping subnetworks using both task and rest data. However, an ideal multimodal framework would also integrate AC into the fusion for detecting overlapping subnetworks. The challenge is to discover the relationship between AC and task activation, which would enable the clique identification to incorporate anatomical information. Our future work will focus on integrating AC into task-activation based clique identification, or into the multimodal subnetwork membership assignment, e.g., using the multimodal RW approach or the multislice approach [30].
We proposed an approach for multisource overlapping brain subnetwork extraction using canonical network components, i.e., cliques, which we defined based on task co-activation. Based on the clique concept, we investigated overlapping subnetworks using a label sharing scheme that incorporates the rest data information and task data embedded with higher order relations. We have demonstrated that integrating multimodal/multisource information and using high order relations result in better subnetwork extraction in terms of the overlaps with well-established brain systems, test-retest repeatability, inter-subject reproducibility and biological meaning.

AC Anatomical Connectivity
CIS Connected Iterative Scan
CPM Clique Percolation Method
CSORD Coupled Stable Overlapping Replicator Dynamics
DSC Dice Similarity Coefficient
FC Functional Connectivity
fMRI Functional Magnetic Resonance Imaging
HCP Human Connectome Project
HO Harvard-Oxford
MCSE Multisource Clique-based Subnetwork Extraction
MNI Montreal Neurological Institute
Ncuts Normalized cuts
OSLOM Order Statistics Local Optimization Method
RD Replicator Dynamics
ROI Region of Interest
RW Random Walker
SORD Stable Overlapping Replicator Dynamics
References
[1] Ferrarini, L., Veer, I.M., Baerends, E., van Tol, M.J., Renken, R.J., van der Wee, N.J., Veltman, D., Aleman, A., Zitman, F.G., Penninx, B.W., et al.: Hierarchical functional modularity in the resting-state human brain. Human brain mapping (7) (2009) 2220–2231
[2] Xie, J., Kelley, S., Szymanski, B.K.: Overlapping community detection in networks: The state-of-the-art and comparative study. Acm computing surveys (csur) (4) (2013) 43
[3] Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. arXiv preprint physics/0506133 (2005)
[4] Sporns, O., Betzel, R.F.: Modular brain networks. Annual review of psychology (2016) 613–640
[5] Yoldemir, B.: Multimodal fusion for assessing functional segregation and integration in the human brain. PhD thesis, University of British Columbia (2016)
[6] Evans, T., Lambiotte, R.: Line graphs, link partitions, and overlapping communities. Physical Review E (1) (2009) 016105
[7] Ahn, Y.Y., Bagrow, J.P., Lehmann, S.: Link communities reveal multiscale complexity in networks. arXiv preprint arXiv:0903.3178 (2009)
[8] Zhang, S., Wang, R.S., Zhang, X.S.: Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A: Statistical Mechanics and its Applications (1) (2007) 483–490
[9] Ding, F., Luo, Z., Shi, J., Fang, X.: Overlapping community detection by kernel-based fuzzy affinity propagation. In: Intelligent Systems and Applications (ISA), 2010 2nd International Workshop on, IEEE (2010) 1–4
[10] Lancichinetti, A., Fortunato, S., Kertész, J.: Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics (3) (2009) 033015
[11] Yan, X., Kelley, S., Goldberg, M., Biswal, B.B.: Detecting overlapped functional clusters in resting state fmri with connected iterative scan: a graph theory based clustering algorithm. Journal of neuroscience methods (1) (2011) 108–118
[12] Yoldemir, B., Ng, B., Abugharbieh, R.: Stable overlapping replicator dynamics for brain community detection. IEEE transactions on medical imaging (2) (2016) 529–538
[13] Lancichinetti, A., Radicchi, F., Ramasco, J.J., Fortunato, S.: Finding statistically significant communities in networks. PloS one (4) (2011) e18961
[14] Torsello, A., Bulo, S.R., Pelillo, M.: Beyond partitions: Allowing overlapping groups in pairwise clustering. In: Pattern Recognition, 2008. ICPR 2008. 19th International Conference on, IEEE (2008) 1–4
[15] Park, B., Kim, D.S., Park, H.J.: Graph independent component analysis reveals repertoires of intrinsic network components in the human brain. PloS one (1) (2014) e82873
[16] Wang, C., Abugharbieh, R.: Hypergraph based subnetwork extraction using fusion of task and rest functional connectivity. arXiv preprint arXiv:1801.05017 (2018)
[17] Wang, C., Ng, B., Abugharbieh, R.: Modularity reinforcement for improving brain subnetwork extraction. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer (2016) 132–139
[18] Yoldemir, B., Ng, B., Abugharbieh, R.: Coupled stable overlapping replicator dynamics for multimodal brain subnetwork identification. In: International Conference on Information Processing in Medical Imaging, Springer (2015) 770–781
[19] Wang, C., Ng, B., Abugharbieh, R.: Multimodal brain subnetwork extraction using provincial hub guided random walks. In: International Conference on Information Processing in Medical Imaging, Springer, Cham (2017) 287–298
[20] Van Essen, D.C., Smith, S.M., Barch, D.M., Behrens, T.E., Yacoub, E., Ugurbil, K., Consortium, W.M.H., et al.: The wu-minn human connectome project: an overview. Neuroimage (2013) 62–79
[21] Glasser, M.F., Sotiropoulos, S.N., Wilson, J.A., Coalson, T.S., Fischl, B., Andersson, J.L., Xu, J., Jbabdi, S., Webster, M., Polimeni, J.R., et al.: The minimal preprocessing pipelines for the human connectome project. Neuroimage (2013) 105–124
[22] Behzadi, Y., Restom, K., Liau, J., Liu, T.T.: A component based noise correction method (compcor) for bold and perfusion based fmri. Neuroimage (1) (2007) 90–101
[23] Desikan, R.S., Ségonne, F., Fischl, B., Quinn, B.T., Dickerson, B.C., Blacker, D., Buckner, R.L., Dale, A.M., Maguire, R.P., Hyman, B.T., et al.: An automated labeling system for subdividing the human cerebral cortex on mri scans into gyral based regions of interest. Neuroimage (3) (2006) 968–980
[24] Skudlarski, P., Jagannathan, K., Calhoun, V.D., Hampson, M., Skudlarska, B.A., Pearlson, G.: Measuring brain connectivity: diffusion tensor imaging validates resting state temporal correlations. Neuroimage (3) (2008) 554–561
[25] Buckner, R.L., Sepulcre, J., Talukdar, T., Krienen, F.M., Liu, H., Hedden, T., Andrews-Hanna, J.R., Sperling, R.A., Johnson, K.A.: Cortical hubs revealed by intrinsic functional connectivity: mapping, assessment of stability, and relation to alzheimer’s disease. Journal of Neuroscience (6) (2009) 1860–1873
[26] GeethaRamani, R., Sivaselvi, K.: Human brain hubs (provincial and connector) identification using centrality measures. In: Recent Trends in Information Technology (ICRTIT), 2014 International Conference on, IEEE (2014) 1–6
[27] He, Y., Chen, Z., Evans, A.: Structural insights into aberrant topological patterns of large-scale cortical networks in alzheimer’s disease. Journal of Neuroscience (18) (2008) 4756–4766
[28] Mano, Q.R., Humphries, C., Desai, R.H., Seidenberg, M.S., Osmon, D.C., Stengel, B.C., Binder, J.R.: The role of left occipitotemporal cortex in reading: reconciling stimulus, task, and lexicality effects. Cerebral Cortex (4) (2012) 988–1001
[29] Yun, J.Y., Kim, J.C., Ku, J., Shin, J.E., Kim, J.J., Choi, S.H.: The left middle temporal gyrus in the middle of an impaired social-affective communication network in social anxiety disorder. Journal of Affective Disorders (2017) 53–59
[30] Mucha, P.J., Richardson, T., Macon, K., Porter, M.A., Onnela, J.P.: Community structure in time-dependent, multiscale, and multiplex networks. science 328