LOCUS: A Novel Decomposition Method for Brain Network Connectivity Matrices using Low-rank Structure with Uniform Sparsity
Yikai Wang [email protected]
Department of Biostatistics & Bioinformatics, Emory University, Atlanta, GA 30322, USA
Ying Guo [email protected]
Department of Biostatistics & Bioinformatics, Emory University, Atlanta, GA 30322, USA
Abstract
Network-oriented research has been increasingly popular in many scientific areas. In neuroscience research, imaging-based network connectivity measures have become the key for understanding brain organizations, potentially serving as individual neural fingerprints. There are major challenges in analyzing connectivity matrices including the high dimensionality of brain networks, unknown latent sources underlying the observed connectivity, and the large number of brain connections leading to spurious findings. In this paper, we propose a novel blind source separation method with low-rank structure and uniform sparsity (LOCUS) as a fully data-driven decomposition method for network measures. Compared with the existing method that vectorizes connectivity matrices ignoring brain network topology, LOCUS achieves more efficient and accurate source separation for connectivity matrices using low-rank structure. We propose a novel angle-based uniform sparsity regularization that demonstrates better performance than the existing sparsity controls for low-rank tensor methods. We propose a highly efficient iterative Node-Rotation algorithm that exploits the block multi-convexity of the objective function to solve the non-convex optimization problem for learning LOCUS. We illustrate the advantage of LOCUS through extensive simulation studies. Application of LOCUS to the Philadelphia Neurodevelopmental Cohort neuroimaging study reveals biologically insightful connectivity traits which are not found using the existing method.
Keywords: blind source separation, low-rank, matrix factorization, network connectivity, neuroimaging
1. Introduction
In this paper, we present a blind source separation (BSS) method for decomposing imaging-based brain network measures to identify underlying source signals characterizing connectivity traits. Our work is motivated by brain network research in neuroimaging studies. In recent years, network-oriented analyses have become an important research field in neuroscience for understanding brain organization and its involvement in neurodevelopment and mental disorders (Bullmore and Sporns, 2009; Deco et al., 2011; Satterthwaite et al., 2014b; Kemmer et al., 2015; Wang et al., 2016; Wang and Guo, 2019). In neuroimaging studies, network measures are derived from various neuroimaging modalities to reflect functional or structural connections between a set of nodes or brain regions. The network measures are typically encoded as symmetric matrices where the entries represent brain connectivity between pairs of nodes or regions in the brain. For example, derived from functional magnetic resonance imaging (fMRI) or electroencephalogram (EEG), functional connectivity (FC) measures the dependence between temporal dynamics in neural processing of spatially disjoint brain regions (Biswal et al., 1995; Lang et al., 2012; Kemmer et al., 2018; Kundu et al., 2019). The commonly used FC measures include the Pearson correlation matrix or partial correlation matrix (Smith et al., 2011; Church et al., 2008; Seeley et al., 2009; Wang et al., 2016). These connectivity measures contain important information about the structure of brain organizations and could potentially serve as an individual's neural fingerprint that helps inform disease diagnosis or treatment planning (Finn et al., 2015; Amico and Goñi, 2018b).

There are several major challenges in network analysis in neuroimaging studies.
First, to fully capture the whole brain organization, connectivity matrices are usually high dimensional (Chung, 2018). For example, the voxel-level brain network based on fMRI data contains tens of thousands of nodes and nearly half a billion edges. More commonly, atlas-based brain networks are constructed based on a brain atlas or node system such as the automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002) and the more recent Power's node system (Power et al., 2011) and Glasser's atlas (Glasser et al., 2013). Although the number of nodes is reduced dramatically in these atlas-based networks, there are still hundreds of nodes and hundreds of thousands of edges (Chung, 2018; Solo et al., 2018; Wu et al., 2013; Wang et al., 2016). The high dimensionality makes it challenging to analyze the brain network for scientific discoveries. Secondly, the brain connectome is a complex organization encompassing many underlying neural circuits. The observed connectivity matrix, measuring the overall connectivity patterns across the brain, represents aggregated information from various underlying neural circuits (Figure 1). Currently, there is a lack of methods for reliably decomposing the observed connectivity matrices to recover those underlying neurocircuitry subsystems. Neuroscience literature has suggested that demographic- or disease-related alterations in brain networks usually happen in certain small-scale neural circuits instead of in the whole brain connectome. Without consistent dissection of neural circuits, it is often not clear which neural circuits are primary drivers of subpopulation differences and clinical symptoms. Thirdly, given the large number of edges present in brain networks, there is a high possibility of spurious findings in terms of outcome-related brain connections across the network in neuroscience research.

In this paper, we propose a novel low-rank decomposition of brain connectivity with uniform sparsity (LOCUS) method.
LOCUS is a fully data-driven blind source separation method for decomposing brain network measures. Specifically, LOCUS decomposes a subject's connectivity matrix, $Y$, into a linear combination of latent source signals, $\{S_\ell\}_{\ell=1}^{q}$, weighted by mixing coefficients $\{a_\ell\}_{\ell=1}^{q}$, i.e. $Y = \sum_{\ell=1}^{q} a_\ell S_\ell + \text{error}$. Here, each of the source signals $S_\ell$ represents an underlying neurocircuitry trait and the mixing coefficient $a_\ell$ represents the subject-specific loading on the trait.

Figure 1: Illustration of decomposing the brain functional network into various underlying neurocircuitry traits.

Figure 2: Example connectivity traits extracted from resting-state fMRI data.

We propose to model the source signals using a low-rank structure $X_\ell D_\ell X_\ell'$ where $D_\ell$ is a diagonal matrix. This is well motivated by the observation that brain connectivity traits often have block-diagonal or banded structure (Figure 2) that can be efficiently captured with a low-rank factorization (Zhou et al., 2013). The low-rank structure leads to a significant reduction in the number of parameters, hence improving accuracy and reliability in the recovery of underlying connectivity traits. Moreover, we propose a novel regularization method to achieve sparsity in the extracted latent source signals with the low-rank structure for reducing spurious findings. Our regularization method directly targets element-wise sparsity of the connectivity traits $S_\ell$ modeled via the low-rank structure $X_\ell D_\ell X_\ell'$. This novel sparsity regularization is conceptually intuitive and significantly reduces false positive connections when extracting the connectivity traits. The learning of LOCUS is formulated as a non-convex optimization problem. We show that the optimization function has a block multi-convex structure (Gorski et al., 2007). Based on this property, we develop an efficient node-alternating algorithm with closed-form solutions at each iteration for estimating the parameters in LOCUS.

Currently, the most popular blind source separation method for decomposing neuroimaging data is independent component analysis (ICA). In neuroscience research, ICA is widely used for dimension reduction, denoising and extraction of latent neural components. ICA has achieved great success in many neuroimaging applications (Beckmann and Smith, 2004; Guo, 2011; Contreras et al., 2017; Mejia et al., 2019; Wang and Guo, 2019; Lukemire et al., 2020).
However, existing ICA applications have mainly focused on decomposing observed neural activity signals such as the blood oxygen level-dependent (BOLD) series from fMRI or the electrode signal series from EEG. Brain connectivity data has quite different properties as compared to activity data. For example, a connectivity matrix is typically symmetric and its diagonal elements (i.e. self-connections) are often not of interest. Therefore, the relevant information in a connectivity matrix is captured by its lower or upper triangular matrix (Amico et al., 2017). Traditional matrix decomposition approaches such as ICA cannot be directly applied to brain connectivity matrices. Recently, Amico et al. (2017) proposed a connectivity independent component analysis framework (connICA) for brain connectivity data. connICA first vectorizes the connectivity matrices and then utilizes existing ICA algorithms for decomposition. Ignoring the dependence structure across edges in the brain network, the connICA method treats each edge independently and has a large number of parameters, which reduces accuracy and reliability in extracting connectivity components. connICA also does not include sparsity regularization, leading to noisy estimates and spurious findings in connectivity analysis.

Our proposed LOCUS is one of the first statistically principled BSS methods for brain connectivity measures and has several innovative contributions over existing decomposition methods: 1) LOCUS achieves more efficient and accurate source separation for connectivity matrices by assuming a low-rank factorization structure for the extracted latent connectivity components/traits. The low-rank structure results in a dramatic decrease in the number of parameters and hence leads to more accurate recovery of underlying connectivity traits.
2) Compared to a vectorization approach such as connICA, the low-rank structure provides another major advantage by taking into account the dependence structure across edges that involve the same brain region. This allows LOCUS to borrow information across edges, leading to more robust estimation. 3) LOCUS utilizes a novel sparsity regularization method to reduce false positive connections when extracting the connectivity traits, thus alleviating the burden of searching for an optimal threshold in the existing methods.
In recent years, there has been methodological development in using tensor or low-rank factorization for studying network data in neuroimaging analysis (Durante et al., 2017; Zhou et al., 2013; Sun and Li, 2017; Wang et al., 2017). Our proposed LOCUS approach has several key distinctions and contributions over these methods. The existing methods mainly focus on using the low-rank structure to reduce the size of the high-dimensional parameters in tensor regressions or to model binary brain networks. Specifically, existing tensor regression methods in brain network analysis (Zhou et al., 2013; Sun and Li, 2017) either treat brain connectivity matrices as a high-dimensional predictor or model them as a tensor outcome. The low-rank structure is assumed for the regression parameters to efficiently learn the high-dimensional parameters in the regression setting. In some other work, Bayesian methods are developed for binary brain networks obtained by thresholding the observed brain connectivity measures (Durante et al., 2017; Wang et al., 2017). These papers mainly adopt low-rank factorization in modeling the parameters characterizing the probability mass function of the binary network data.

Compared with the existing low-rank factorization methods, the proposed LOCUS has a fundamentally different goal that aims to conduct fully data-driven decomposition of multi-subject connectivity data to extract population-level latent connectivity traits corresponding to various neural circuits. As another distinction, LOCUS generates a very useful output: mixing coefficients, which are not obtainable from other low-rank network methods. The coefficients represent subject-specific trait loadings on the connectivity traits, quantifying the prominence or presence of the traits in individual subjects. By associating subjects' trait loadings with clinical measures, investigators can identify neural circuits associated with clinical groups or clinical symptoms, which offers appealing clinical interpretability.
Currently, there are two major approaches of sparsity penalization for low-rank factorization (Raskutti and Yuan, 2015; Wang et al., 2018): the vector-wise sparsity constraint that aims to achieve element-wise sparsity of the column vectors $X_\ell$ (Zhou et al., 2013; Sun and Li, 2017; Li et al., 2018), and the low-rank matrix constraint that aims to recover a low-rank matrix via regularization on $D_\ell$ such as minimizing the nuclear norm of $S_\ell$ (Chen et al., 2013; Rabusseau and Kadri, 2016; Fan et al., 2017; Yuan and Zhang, 2016). Both of the existing sparsity controls have limitations for achieving our goal, which is to recover parsimonious connectivity traits $S_\ell$. The low-rank constraint that aims to achieve low rankness in $S_\ell$ does not necessarily ensure sparsity in $S_\ell$ (Sun and Li, 2017). The vector-wise sparsity control might not generate the desired sparsity pattern on the reconstructed sources since they depend on the dot product between the matrices.

Our purpose is to decompose the brain connectivity matrices to extract the latent sources corresponding to various brain neurocircuitry traits which are of main interest. Therefore, we provide a more intuitive way of sparsity regularization by directly penalizing the latent sources $S_\ell$. Through extensive simulation studies, we show this new sparsity method demonstrates more accurate and robust performance in recovering underlying traits than the commonly used sparsity controls. Furthermore, compared with some existing sparsity controls which require numerical methods to solve, our new sparsity penalization allows for explicit analytic solutions to the optimization function in the estimation, which increases computational efficiency.

In summary, we propose a novel blind source separation method for decomposing brain connectivity matrices. Our contributions include the following.
First, to the best of our knowledge, LOCUS is the first formal signal separation approach with a low-rank structure for decomposing brain connectivity matrices. The low-rank structure leads to considerable reduction in parameters and improved accuracy, allowing LOCUS to achieve more efficient and reliable source separation for connectivity metrics. Secondly, we propose a novel sparsity regularization method that aims to control the element-wise or uniform sparsity of the reconstructed matrix from the low-rank factorization. Compared with the commonly used sparsity methods that focus on controlling sparsity on the vector or diagonal matrix components in the low-rank factorization, our method aims to control the sparsity in the overall matrix reconstructed from the low-rank structure, which directly targets the output, i.e. the connectivity trait, that we are interested in. This new sparsity control has shown highly promising results in the simulation studies and provides a new type of sparsity regularization that can be generally applied in models involving a low-rank structure such as tensor regressions or covariance modeling. Thirdly, we prove the block multi-convexity of the LOCUS objective function and develop an efficient node-rotation algorithm to break down the original non-convex optimization problem into a series of sub-convex problems with closed-form solutions. The proposed model and the estimation algorithm demonstrate superior performance in recovering the underlying source signals through our extensive simulation studies.

The overall structure of the paper is organized as follows. Section 2 introduces the methodology including the model specification, the Bayesian perspective of LOCUS, and the algorithm for estimation. Section 3 presents the simulation study. Section 4 presents the real data application. Section 5 provides discussion and conclusions.
2. Methodology
We define the vector norm $\|x\|_h = (\sum_i |x_i|^h)^{1/h}$ with an integer $h \geq 1$, and we denote the Frobenius norm of a matrix by $\|X\|_F = (\sum_{ij} X_{ij}^2)^{1/2}$. Suppose we observe brain connectivity data from $N$ subjects. Let $Y_i^*$ denote a $V \times V$ symmetric brain connectivity matrix for the $i$th ($i = 1, \ldots, N$) subject with $Y_i^*(u, v) \in \mathbb{R}$ representing the strength of connection between nodes $u$ and $v$ in the brain. $Y_i^*$ is obtained by performing proper transformations on brain connectivity measures. Since the diagonal of $Y_i^*$, which represents self-relationships in the network, is typically not of interest, we define a vector $Y_i$ based on the upper triangular elements of $Y_i^*$, i.e. $Y_i = L(Y_i^*)$ where $L(Y_i^*) = [Y_i^*(1,2), Y_i^*(1,3), \ldots, Y_i^*(V-1, V)]'$. Here, $L: \mathbb{R}^{V \times V} \to \mathbb{R}^p$ with $p = V(V-1)/2$.

In this section, we introduce the LOCUS framework for decomposing multi-subject connectivity matrices while assuming a low-rank structure for the underlying connectivity traits.
We also propose a novel sparsity regularization method that aims to achieve element-wise sparsity on the reconstructed connectivity trait based on the low-rank structure.
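To make the vectorization concrete, the operator $L$ and its inverse can be sketched in a few lines of NumPy. The function names `vec_upper` and `unvec_upper` are illustrative choices of ours, not part of the paper:

```python
import numpy as np

def vec_upper(Y_star):
    """Sketch of the operator L: map a V x V symmetric connectivity matrix
    to its p = V(V-1)/2 strictly-upper-triangular entries, ordered row by
    row as [Y(1,2), Y(1,3), ..., Y(V-1,V)]."""
    V = Y_star.shape[0]
    iu = np.triu_indices(V, k=1)   # indices strictly above the diagonal
    return Y_star[iu]

def unvec_upper(y, V):
    """Inverse map: rebuild the symmetric matrix (with zero diagonal)."""
    S = np.zeros((V, V))
    iu = np.triu_indices(V, k=1)
    S[iu] = y
    return S + S.T
```

For $V = 3$, `vec_upper` returns the length-3 vector of off-diagonal connections, and `unvec_upper` recovers the original matrix up to its (discarded) diagonal.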
We propose the following LOCUS model to decompose the multi-subject connectivity matrices to extract latent connectivity sources. Motivated by the observed patterns of brain connectivity traits which often have block-diagonal or banded structure (Figure 2), we model connectivity sources with a low-rank structure which can considerably reduce the number of parameters while capturing the network characteristics. Specifically,

$$Y_i = \sum_{\ell=1}^{q} a_{i\ell} S_\ell + e_i, \quad (1)$$

where $S_\ell \in \mathbb{R}^p$ ($\ell = 1, \ldots, q$) is the source signal of the $\ell$th connectivity source or trait and we assume independence across the $q$ traits, $\{a_{i\ell}\}$ are the mixing coefficients or trait loadings which mix the traits to generate the observed connectivity, and $e_i \in \mathbb{R}^p$ is an error term independent of the source signals. The number of latent sources, i.e. $q$, can be determined using existing methods such as the Laplace approximation method (Minka, 2000; Shi and Guo, 2016; Wang and Guo, 2019). We can also rewrite the LOCUS model in (1) across subjects as

$$Y = AS + E, \quad (2)$$

where $Y = [Y_1, \ldots, Y_N]' \in \mathbb{R}^{N \times p}$ is the multi-subject connectivity data, $S = [S_1, \ldots, S_q]' \in \mathbb{R}^{q \times p}$ is the connectivity traits matrix, $A = \{a_{i\ell}\} \in \mathbb{R}^{N \times q}$ is the mixing/loading matrix, and $E = [e_1, \ldots, e_N]' \in \mathbb{R}^{N \times p}$.

We model the connectivity source signals via a low-rank structure, i.e.

$$S_\ell = L(X_\ell D_\ell X_\ell'), \quad (3)$$

where $X_\ell = [X_\ell^{(1)}, \ldots, X_\ell^{(R_\ell)}] \in \mathbb{R}^{V \times R_\ell}$ with $R_\ell < V$, and each column $X_\ell^{(r)}$ ($r = 1, \ldots, R_\ell$) is a $V \times 1$ vector with $\|X_\ell^{(r)}\|_2 = 1$ for identifiability purposes. $D_\ell$ is a diagonal matrix with diagonal elements $d_\ell = (d_\ell^{(1)}, \ldots, d_\ell^{(R_\ell)})$. The low-rank structure implies the $V$ nodes reside in a reduced subspace of $R_\ell$ dimensions, i.e. $S_\ell = L(\sum_{r=1}^{R_\ell} d_\ell^{(r)} X_\ell^{(r)} X_\ell^{(r)\prime})$, where the $r$th column $X_\ell^{(r)} \in \mathbb{R}^{V \times 1}$ represents the coordinates of the $V$ nodes in the $r$th dimension and $d_\ell^{(r)}$ reflects the contribution of the $r$th dimension in generating $S_\ell$. Each row of $X_\ell$, i.e. $X_\ell(v)'$ with $X_\ell(v) \in \mathbb{R}^{R_\ell \times 1}$, represents the coordinates of the $v$th node in the $R_\ell$-dimensional latent subspace. As shown in Figure 2, the network property and topological structures may vary considerably across different connectivity traits. Therefore, we let the subspace rank $R_\ell$ be specific to each latent source to accommodate such differences. Figure 3 illustrates the framework of the LOCUS method.

Figure 3: Illustration of the LOCUS framework for decomposing subjects' brain functional connectivity (FC) matrices into latent source signal matrices representing various underlying neurocircuitry traits. Each latent source signal matrix is further modeled via a symmetric low-rank factorization.

The decomposition part of the LOCUS model in (1) takes a similar form as probabilistic ICA (Hyvärinen et al., 2001; Beckmann and Smith, 2004), which is a classical blind source separation method. However, there are key distinctions between LOCUS and ICA. The existing connectivity ICA methods such as connICA (Amico et al., 2017) simply vectorize a connectivity matrix and assume the $p$ elements in $S_\ell$ are independent random samples from the same latent variable, ignoring the network topology structure across the brain. In comparison, LOCUS models the source signal using a low-rank structure which is well motivated by the characteristics of the connectivity metrics. The low-rank structure offers several major advantages.
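As an illustration of the generative side of equations (1)–(3), the following sketch simulates toy multi-subject data from the LOCUS model. The dimensions $V = 20$, $q = 3$, $N = 50$, the common rank $R_\ell = 2$, and the noise scale are arbitrary assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
V, q, N = 20, 3, 50                     # nodes, latent traits, subjects (toy sizes)
p = V * (V - 1) // 2                    # number of unique connections
iu = np.triu_indices(V, k=1)            # index set used by the operator L

S = np.zeros((q, p))
for l in range(q):
    R_l = 2                             # trait-specific rank, R_l < V
    X_l = rng.standard_normal((V, R_l))
    X_l /= np.linalg.norm(X_l, axis=0)  # unit-norm columns, as in eq. (3)
    D_l = np.diag(rng.uniform(1.0, 3.0, R_l))
    S_full = X_l @ D_l @ X_l.T          # symmetric rank-R_l trait matrix
    S[l] = S_full[iu]                   # S_l = L(X_l D_l X_l'), eq. (3)

A = rng.standard_normal((N, q))         # subject-specific trait loadings
E = 0.1 * rng.standard_normal((N, p))   # noise
Y = A @ S + E                           # observed multi-subject data, eq. (2)
```

Each row of `Y` is one subject's vectorized connectivity; each trait matrix has rank at most $R_\ell$, which is the structure LOCUS exploits.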
First, it leads to a significant reduction of the number of parameters from a quadratic function of the number of nodes $V$ to a linear function of $V$, hence improving reliability in estimation. Furthermore, the low-rank structure offers appealing interpretations for neurocircuitry structures. Each of the $R_\ell$ subspace dimensions can potentially reveal an underlying neurophysiological event that contributes to the connectivity trait, where $X_\ell^{(r)}$ captures the activity of the $V$ nodes in the $r$th neural event, allowing us to identify key nodes involved in a connectivity trait. The connection between two nodes in the $\ell$th trait depends on the similarity in the activity of the nodes across the $R_\ell$ neural events, i.e. $S_\ell(u, v) = \sum_{r=1}^{R_\ell} d_\ell^{(r)} X_{\ell u}^{(r)} X_{\ell v}^{(r)}$, which coincides with the definition of brain connectivity in neuroscience (Friston et al., 1993; Friston, 2011). The contribution coefficient $d_\ell^{(r)}$ represents the significance of the $r$th neural event in defining the connectivity trait. Therefore, LOCUS reveals underlying neural events and brain nodes contributing to the connectivity trait. The existing ICA methods do not provide such interpretations and insights.

In LOCUS, we model each latent source $S_\ell$ with a low-rank factorization. To obtain parsimonious results for the connectivity traits, we propose a new sparsity method that aims to achieve element-wise sparsity on the reconstructed connectivity traits based on the low-rank structure. Currently, the commonly adopted sparsity regularization methods in low-rank factorization and tensor decomposition methods fall into two major categories: the vector-wise sparsity constraint or the low-rank matrix constraint (Raskutti and Yuan, 2015; Wang et al., 2018).
The vector-wise sparsity methods aim to achieve element-wise sparsity of the column vectors $\{X_\ell^{(r)}\}$ by minimizing penalization terms based on the L1 or L2 norms of $\{X_\ell^{(r)}\}$ (Zhou et al., 2013; Li et al., 2018; Fan and Li, 2001; Zhang et al., 2010). The low-rank constraint methods aim to recover a low-rank matrix via regularization on $D_\ell$ such as minimizing the nuclear norm of $D_\ell$ (Chen et al., 2013; Rabusseau and Kadri, 2016; Fan et al., 2017; Yuan and Zhang, 2016). In LOCUS, our goal is to recover parsimonious connectivity traits $S_\ell$. The existing vector-wise sparsity control and low-rank constraint methods have certain limitations in achieving this goal. The vector-wise sparsity control, which aims to achieve sparsity in the column vectors $\{X_\ell^{(r)}\}$, does not necessarily lead to the desired sparsity in the connectivity traits, which depends on the dot product between the vectors. The nuclear norm method that aims to achieve low rankness in $S_\ell$ does not necessarily lead to sparse estimates for the connectivity traits.

To achieve sparsity in the extracted connectivity traits to reduce false positives and improve the robustness of the results, we propose a novel sparsity regularization that aims to achieve element-wise sparsity on the reconstructed connectivity trait matrix based on the low-rank structure, i.e. $S_\ell = L(X_\ell D_\ell X_\ell')$. The objective function for estimating the LOCUS model with the new sparsity control is defined as follows,

$$\min_{A, \{X_\ell, D_\ell\}} \sum_{i=1}^{N} \Big\| Y_i - \sum_{\ell=1}^{q} a_{i\ell} L(X_\ell D_\ell X_\ell') \Big\|_2^2 + \phi \sum_{\ell=1}^{q} \sum_{u<v} \big| X_\ell(u)' D_\ell X_\ell(v) \big|,$$

where the penalty term directly imposes element-wise sparsity on the entries $S_\ell(u, v) = X_\ell(u)' D_\ell X_\ell(v)$ of the reconstructed traits.

Prior to LOCUS decomposition, we take several preprocessing steps that are commonly adopted in blind source separation, which include centering, dimension reduction and whitening.
The preprocessing is generally performed to facilitate the subsequent decomposition by reducing the computational load and avoiding overfitting (Hyvärinen et al., 2001). Following the preprocessing procedure from previous work (Beckmann and Smith, 2004; Shi and Guo, 2016; Wang and Guo, 2019), we first demean the group connectivity data $Y$ and then perform a dimension reduction and whitening procedure on the demeaned data. That is, $\tilde{Y} = HY$, where $H = (\Lambda_q - \tilde{\sigma}^2 I_q)^{-1/2} U_q'$. Here $U_q$ and $\Lambda_q$ contain the first $q$ eigenvectors and eigenvalues based on the singular value decomposition of $Y$. The residual variance, $\tilde{\sigma}^2$, represents the variability in $Y$ that is not explained by the extracted $q$ latent sources and is estimated by the average of the smallest $N - q$ eigenvalues from the decomposition. The preprocessed data $\tilde{Y}$ is of dimension $q \times p$ where each column corresponds to one of the $p$ connections in the brain.

With the preprocessing, the model in (2) can be re-expressed on the reduced and sphered space as follows:

$$\tilde{Y} = \tilde{A} S + \tilde{E}, \quad (7)$$

where $\tilde{A} = HA$ and $\tilde{E} = HE$. Due to the whitening in the preprocessing, the mixing matrix on the reduced space $\tilde{A} = \{\tilde{a}_{i\ell}\} \in \mathbb{R}^{q \times q}$ is orthogonal (Hyvärinen and Oja, 2000; Beckmann and Smith, 2004). Note that the dimension reduction in the preprocessing is performed on the row space of $Y$, which corresponds to the subject domain, and does not affect the column space of $Y$, which corresponds to the connectivity domain.
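This preprocessing step can be sketched as below, assuming the standard PCA-based whitening form of Beckmann and Smith (2004); the function name `locus_preprocess` and the exact estimator of $\tilde{\sigma}^2$ follow our reading of the text and should be treated as an approximation of the authors' implementation:

```python
import numpy as np

def locus_preprocess(Y, q):
    """Sketch of the centering + dimension-reduction + whitening step.
    Y is N x p (subjects by connections); returns the q x p reduced data
    tilde{Y} = H Y with H = (Lambda_q - sigma2 I)^(-1/2) U_q'."""
    Yc = Y - Y.mean(axis=0)                 # demean the group data
    U, s, _ = np.linalg.svd(Yc, full_matrices=False)
    lam = s ** 2                            # eigenvalues of Yc Yc'
    sigma2 = lam[q:].mean()                 # residual variance: avg of the
                                            # smallest N - q eigenvalues
    H = np.diag((lam[:q] - sigma2) ** -0.5) @ U[:, :q].T
    return H @ Yc                           # q x p whitened data
```

The reduction acts only on the subject (row) dimension, so the $p$ connectivity columns are preserved, consistent with the note above.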
The optimization function for LOCUS with the preprocessed data is then

$$\min_{\tilde{A}, \{X_\ell, D_\ell\}} \sum_{i=1}^{q} \Big\| \tilde{Y}_i - \sum_{\ell=1}^{q} \tilde{a}_{i\ell} L(X_\ell D_\ell X_\ell') \Big\|_2^2 + \phi \sum_{\ell=1}^{q} \sum_{u<v} \big| X_\ell(u)' D_\ell X_\ell(v) \big|. \quad (8)$$

With an orthogonal mixing matrix $\tilde{A}$, the optimization in (8) is equivalent to

$$\min_{\tilde{A}, \{X_\ell, D_\ell\}} \sum_{\ell=1}^{q} \big\| \tilde{Y}' \tilde{A}_\ell - L(X_\ell D_\ell X_\ell') \big\|_2^2 + \phi \sum_{\ell=1}^{q} \sum_{u<v} \big| X_\ell(u)' D_\ell X_\ell(v) \big|. \quad (9)$$

We propose an efficient algorithm to solve the optimization problem in (9). Denote $\Theta = [\tilde{A}, \{X_\ell, D_\ell\}]$ as the parameters to learn. We propose the following iterative estimation algorithm which includes 3 major updating steps. A summary of the algorithm is presented in Algorithm 1.

Step 1: Updating $X_\ell$. We propose a novel node-rotation algorithm that updates $X_\ell$ at one node $v$ while conditioning on the rest of the nodes and then rotating across the nodes. This algorithm exploits the conditional convexity in $X_\ell(v)$ given the other parameters. Specifically, at the $t$th iteration, we update $\hat{X}_\ell^{(t)}(v)$, $v = 1, \ldots, V$, conditioning on $\tilde{A}$, $D_\ell$ and $X_\ell(-v)$ estimated from the $(t-1)$th iteration, where $X_\ell(-v)$ is $X_\ell$ with the $v$th row removed. The updated $\hat{X}_\ell^{(t)}(v)$ is obtained via the following,

$$\min_{X_\ell(v)} \big\| \tilde{Y}'_{\{v\}} \hat{\tilde{A}}_\ell - \hat{X}_\ell(-v) \hat{D}_\ell X_\ell(v) \big\|_2^2 + \phi \sum_{u=1, u \neq v}^{V} \big| \hat{X}_\ell(u)' \hat{D}_\ell X_\ell(v) \big|, \quad (10)$$

where $\tilde{Y}_{\{v\}}$ is a $q \times (V-1)$ sub-matrix of $\tilde{Y}$ which includes the subset of columns in $\tilde{Y}$ that correspond to connections involving node $v$.
A detailed derivation from (9) to (10) is provided in the Appendix. It is straightforward to show that the optimization in (10) is convex.

We propose the following procedure for solving (10). First, we define $B_{\ell\{v\}} \in \mathbb{R}^{V-1}$ to represent $\hat{X}_\ell(-v) \hat{D}_\ell X_\ell(v)$. We obtain the estimate $\hat{B}_{\ell\{v\}}$ via the optimization below, which is a rewrite of (10),

$$\min_{B_{\ell\{v\}} \in \mathbb{R}^{V-1}} \big\| \tilde{Y}'_{\{v\}} \tilde{A}_\ell - B_{\ell\{v\}} \big\|_2^2 + \phi \big\| B_{\ell\{v\}} \big\|_1. \quad (11)$$

Following Fan and Li (2001), we derive the following analytic solution for (11),

$$\hat{B}_{\ell\{v\}} = \mathrm{diag}\big( \mathrm{sgn}( \tilde{Y}'_{\{v\}} \tilde{A}_\ell ) \big) \, \delta\Big( \big| \tilde{Y}'_{\{v\}} \tilde{A}_\ell \big| - \tfrac{\phi}{2} 1_{V-1} \Big),$$

where sgn represents the sign function applied to each element and $\delta$ denotes a rectifier function ($\delta(x) = x$ if $x > 0$ and $\delta(x) = 0$ otherwise). We then project $\hat{B}_{\ell\{v\}}$ to the low-rank space spanned by $\hat{X}_\ell(-v) \hat{D}_\ell$ to obtain the estimate for $X_\ell(v)$, i.e.,

$$\hat{X}_\ell^{(t)}(v) = \hat{D}_\ell^{-1} \big( \hat{X}_\ell(-v)' \hat{X}_\ell(-v) \big)^{-1} \hat{X}_\ell(-v)' \hat{B}_{\ell\{v\}}. \quad (12)$$

The optimization in (11) obtains an intermediate estimate $\hat{B}_{\ell\{v\}}$ of the $\ell$th latent source signal which satisfies the desired element-wise sparsity but does not have the low-rank structure. Then, in (12), we project the intermediate estimate onto the low-rank space to obtain an updated latent subspace coordinate estimate for the $v$th node in the $\ell$th source signal. After updating $X_\ell(v)$, we rotate to the next node and repeat the procedure described above across nodes $v = 1, \ldots, V$ to obtain the updated estimate for $X_\ell$.
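The two-step node update in (11)–(12), soft-thresholding followed by projection back onto the low-rank space, can be sketched as follows. The helper name `update_node` is hypothetical, and the $\phi/2$ threshold follows the standard Fan and Li (2001) form of the L1 proximal step:

```python
import numpy as np

def update_node(y_v, X_mv, d, phi):
    """One node-rotation update for X_l(v) (sketch of eqs. (11)-(12)).
    y_v  : length V-1 vector, tilde{Y}'_{v} tilde{A}_l (edges touching node v)
    X_mv : (V-1) x R_l matrix, X_l with the v-th row removed
    d    : length R_l vector, diagonal of D_l
    phi  : sparsity tuning parameter."""
    # Eq. (11): element-wise soft-thresholding, B = sgn(y) * (|y| - phi/2)_+
    B = np.sign(y_v) * np.maximum(np.abs(y_v) - phi / 2.0, 0.0)
    # Eq. (12): least-squares projection onto the space spanned by X_l(-v) D_l,
    # i.e. X(v) = D^{-1} (X'X)^{-1} X' B, solved as (X'X) D X(v) = X'B
    XtX = X_mv.T @ X_mv
    return np.linalg.solve(XtX @ np.diag(d), X_mv.T @ B)
```

With `phi = 0` the update reduces to the plain least-squares projection; a large `phi` shrinks the intermediate edge estimate, and hence the node coordinate, to zero, which is the mechanism producing element-wise sparsity in the trait.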
An advantage ofthe proposed Node-Rotation algorithm is that it has analytic solutions and does not needgradient-based numerical approximation, which makes it highly efficient and reliable. Step 2 : Updating D (cid:96) . The second step is to update the diagonal matrix D (cid:96) for (cid:96) =1 , ..., q , given the estimate of X (cid:96) from the t th iteration and the estimate of (cid:101) A from the t − D (cid:96) , i.e. d (cid:96) = diag( D (cid:96) ) via thefollowing, min d (cid:96) ∈R R(cid:96) (cid:13)(cid:13)(cid:13) (cid:101) Y (cid:48) ˆ (cid:101) A (cid:96) − ˆ Z (cid:96) d (cid:96) (cid:13)(cid:13)(cid:13) + φ (cid:107) ˆ Z (cid:96) d (cid:96) (cid:107) , (13) OCUS: Low-rank connectivity decomposition with uniform sparsity where ˆ Z (cid:96) ∈ R p × R (cid:96) with the r th column of ˆ Z (cid:96) being L ( ˆ X ( r ) (cid:96) ˆ X ( r ) (cid:48) (cid:96) ) ( r = 1 , . . . , R (cid:96) ). A similarprocedure for solving (10) can be adopted for solving (13). Step 3 : Updating (cid:101) A . This step is to update mixing matrix (cid:101) A given the estimates of X (cid:96) and D (cid:96) from the t th iteration. Specifically,ˆ (cid:101) A = (cid:101) Y ˆ S ( t ) (cid:48) ( ˆ S ( t ) ˆ S ( t ) (cid:48) ) − , (14)where ˆ S ( t ) = [ ˆ S ( t )1 , . . . , ˆ S ( t ) q ] (cid:48) with ˆ S (cid:96) = L ( ˆ X ( t ) (cid:96) ˆ D ( t ) (cid:96) ˆ X ( t ) (cid:48) (cid:96) ). The solution from (14) is thenorthogonalized to obtain the updated mixing matrix. Algorithm 1 An Iterative Node-Rotation Algorithm for Learning LOCUS Initial : Initialize ˆ (cid:101) A (0) , { ˆ X (0) (cid:96) , ˆ D (0) (cid:96) } based on estimates from existing methods such asconnICA. repeat For (cid:96) = 1 ...q ,For v = 1 , . . . , V , Step 1. 
      Update $X_\ell(v)$:
      $$\hat{B}^{(t)}_{\ell\{v\}} = \arg\min_{B \in \mathbb{R}^{V-1}} \big\|\tilde{Y}'_{\{v\}}\tilde{A}_\ell - B\big\|_2^2 + \phi \big\|B\big\|_1,$$
      $$\hat{X}^{(t)}_\ell(v) = \hat{D}^{-1}_\ell\big(\hat{X}_\ell(-v)'\hat{X}_\ell(-v)\big)^{-1}\hat{X}_\ell(-v)'\hat{B}^{(t)}_{\ell\{v\}}.$$
    End for $v$
    Step 2. Update $D_\ell$:
    $$\hat{d}^{(t)}_\ell = \mathrm{diag}(\hat{D}^{(t)}_\ell) = \arg\min_{d_\ell \in \mathbb{R}^{R_\ell}} \big\|\tilde{Y}'\hat{\tilde{A}}_\ell - \hat{Z}_\ell d_\ell\big\|_2^2 + \phi \big\|\hat{Z}_\ell d_\ell\big\|_1.$$
  End for $\ell$
  Step 3. Update $\tilde{A}$: $\hat{\tilde{A}}^{(t)} = \tilde{Y}\hat{S}^{(t)\prime}(\hat{S}^{(t)}\hat{S}^{(t)\prime})^{-1}$, where $\hat{S}^{(t)} = [\hat{S}^{(t)}_1, \ldots, \hat{S}^{(t)}_q]'$ with $\hat{S}^{(t)}_\ell = L(\hat{X}^{(t)}_\ell\hat{D}^{(t)}_\ell\hat{X}^{(t)\prime}_\ell)$. Perform an orthogonal transformation on $\hat{\tilde{A}}^{(t)}$.
until $\|\hat{\tilde{A}}^{(t)} - \hat{\tilde{A}}^{(t-1)}\|_F / \|\hat{\tilde{A}}^{(t-1)}\|_F < \epsilon$ and $\|\hat{S}^{(t)} - \hat{S}^{(t-1)}\|_F / \|\hat{S}^{(t-1)}\|_F < \epsilon$

The LOCUS learning in (8) is a non-convex optimization problem. To solve it, we propose the above estimation algorithm based on the block multi-convex structure of the objective function. A function is block multi-convex if it is convex with respect to each of its individual arguments while holding all others fixed. The formal definition of block multi-convexity (Gorski et al., 2007) is provided below.

Definition 1 (Block Multi-Convexity) Define a partition of the set $x = \{x_1, \ldots, x_p\} \in \mathbb{R}^p$ as a collection of disjoint non-empty subsets $\{x_1, \ldots, x_h\}$ of $x$ with 1) $\bigcup_{i=1}^h x_i = x$; 2) $x_i \subseteq x$ and $x_i \neq \emptyset$; 3) $x_i \cap x_j = \emptyset$ for $i \neq j$.
A function $f(x_1, \ldots, x_p): \mathbb{R}^p \mapsto \mathbb{R}$ is a block multi-convex function if there exists a partition $\{x_1, \ldots, x_h\}$ of $\{x_1, \ldots, x_p\}$ such that $f$ is convex with respect to each individual $x_i$, $i = 1, \ldots, h$, while holding all others fixed.

Proposition 1 shows that the LOCUS objective function has this block multi-convexity structure.

Proposition 1 Let $f(\tilde{A}, \{X_\ell, D_\ell\})$ be the objective function in (8). The function $f$ is block multi-convex with respect to the partition $P = \{X_1(1), \ldots, X_1(V), \ldots, X_q(V), d_1, \ldots, d_q, \tilde{A}\}$.

The proof of Proposition 1 is presented in the Appendix. Our proposed Node-Rotation iterative estimation algorithm exploits the block multi-convexity of the objective function to solve the non-convex optimization problem for learning LOCUS.

In LOCUS, the rank parameters $\{R_\ell\}_{\ell=1}^q$ control the dimension of the reduced subspace of the low-rank structure for the latent connectivity sources. Given the differences in topology and structure across neurocircuitry traits, it would be oversimplified to specify a common rank parameter for all connectivity sources. Therefore, we propose an adaptive selection approach to choose $\{R_\ell\}$ for the $q$ latent sources. Specifically, when updating $\hat{S}_\ell$ ($\ell = 1, \ldots, q$), we denote by $\hat{S}^*_\ell$ the latent source estimated without the low-rank structure assumption, which can be obtained from the intermediate source estimates in (11) that have the desired element-wise sparsity but not the low-rank structure. The rank $R_\ell$ is chosen to achieve a desired level of closeness between the unstructured $\hat{S}^*_\ell$ and the latent source with the low-rank structure, i.e. $\hat{S}_\ell = L(\hat{X}_\ell\hat{D}_\ell\hat{X}'_\ell)$.
Specifically, $R_\ell$ is selected to be the smallest integer value such that
$$\|\hat{S}_\ell - \hat{S}^*_\ell\|_2 / \|\hat{S}^*_\ell\|_2 \leq 1 - \rho, \tag{15}$$
where $\rho \in (0, 1)$ is a proportion parameter controlling the desired level of closeness between the unstructured and low-rank structured latent sources. Once the proportion parameter $\rho$ is specified, the proposed approach adaptively selects the rank for each latent source, flexibly accommodating connectivity traits with different topological characteristics.

With the proposed approach, the tuning parameters for learning the LOCUS model include $\phi$ and $\rho$. We propose to select these parameters via a BIC-type criterion that balances model fit against model sparsity,
$$\mathrm{BIC} = -2\sum_{i=1}^N \log\Big( g\big(Y_i;\, \sum_{\ell=1}^q \hat{a}_{i\ell}\hat{S}_\ell,\, \hat{\sigma}^2 I_p\big) \Big) + \log(N)\sum_{\ell=1}^q \|\hat{S}_\ell\|_0, \tag{16}$$
where $g$ denotes the pdf of a multivariate Gaussian distribution, $\hat{\sigma}^2 = \frac{1}{Np}\sum_i \|Y_i - \sum_{\ell=1}^q \hat{a}_{i\ell}\hat{S}_\ell\|_2^2$, and $\|\cdot\|_0$ denotes the $L_0$ norm. This criterion balances model fit against model sparsity, and similar versions have been commonly employed for tuning parameter selection in sparse tensor decomposition (Allen, 2012; Kim et al., 2013; Sun and Li, 2017).

Figure 4: Generated underlying true source signals of the two scenarios in the simulation study.

3. Simulation Study

In this section, we investigate the performance of our model through simulation studies. We compare the performance of LOCUS with two other source separation methods: connICA (Amico et al., 2017), a recently developed connectivity ICA method, and the dictionary learning (DL) method (Mairal et al., 2009), a popular sparse decomposition method.
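As an illustration, the adaptive rank selection (15) and the BIC-type criterion (16) described above can be sketched as follows. This is a minimal sketch with our own function names: the rank-$R$ approximation of the symmetric unstructured source is taken via eigenvalue truncation, and the Gaussian log-likelihood uses the plug-in variance $\hat{\sigma}^2$.

```python
import numpy as np

def select_rank(S_star, rho):
    """Smallest R with ||S_R - S*|| / ||S*|| <= 1 - rho, as in (15).
    S_star is the symmetric V x V unstructured source estimate; S_R is its
    best rank-R approximation via eigenvalue truncation."""
    w, U = np.linalg.eigh(S_star)
    order = np.argsort(-np.abs(w))       # largest-magnitude eigenvalues first
    for R in range(1, len(w) + 1):
        idx = order[:R]
        S_R = (U[:, idx] * w[idx]) @ U[:, idx].T
        if np.linalg.norm(S_R - S_star) / np.linalg.norm(S_star) <= 1 - rho:
            return R
    return len(w)

def locus_bic(Y, A_hat, S_hat):
    """BIC-type criterion (16): -2 Gaussian log-likelihood + L0 penalty.
    Y: (N, p) data, A_hat: (N, q) loadings, S_hat: (q, p) vectorized sources."""
    N, p = Y.shape
    resid = Y - A_hat @ S_hat
    sigma2 = np.sum(resid ** 2) / (N * p)        # plug-in noise variance
    loglik = -0.5 * N * p * (np.log(2 * np.pi * sigma2) + 1)
    return -2 * loglik + np.log(N) * np.count_nonzero(S_hat)
```

In a grid search over $(\phi, \rho)$, one would fit the model for each pair and keep the pair minimizing `locus_bic`.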
Moreover, to evaluate the performance of the proposed angle-based sparsity regularization, we compare LOCUS with two commonly used penalization approaches, i.e. the vector-wise L1 penalty and the nuclear norm penalty. Specifically, the vector-wise L1 regularization aims to achieve sparsity in the vectors of the low-rank structure, with penalty term $\sum_{\ell=1}^q \|X_\ell\|_1$. The nuclear norm aims to achieve low-rankness in the source signals, with penalty term $\sum_{\ell=1}^q \|D_\ell\|_*$, where $\|\cdot\|_*$ denotes the nuclear norm. As with LOCUS, the BIC criterion was used to select the tuning parameters for these existing sparsity regularizations.

We specified $V = 50$, $q = 3$, and two sample sizes, $N = 50$ and $N = 100$.

Table 1: Simulation results for comparing LOCUS with other methods with 100 simulation runs for Scenario I. Values presented are mean and standard deviation of correlations between true and estimated latent sources and loading matrices.

Scenario I thus represents a realistic setting for brain imaging applications. In Scenario II, we considered three latent source signals with a diagonal triangle shape, an off-diagonal circle shape, and a long-range hollow square shape, shown in Figure 4(B). These shapes have been shown to be particularly challenging patterns to capture with the low-rank structure (Zhou et al., 2013). Therefore, Scenario II allows us to evaluate the robustness of the proposed LOCUS model under challenging settings that deviate from the low-rank assumption. The mixing coefficients were again sampled from estimates from real imaging data. Furthermore, we added zero-mean Gaussian noise to the mixture of signals, with the variance specified based on the signal-to-noise ratio observed in real data. Specifically, we considered three variance settings, ranging from $\sigma^2 = 1$ to $\sigma^2 = 6$, corresponding to low, medium, and high variance levels, respectively.
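The generative model for each simulated dataset, a noisy linear mixture of the latent sources, can be sketched as below. This is a minimal sketch with our own names in which, unlike the study design above, the mixing coefficients are drawn i.i.d. standard normal rather than sampled from real-data estimates.

```python
import numpy as np

def simulate_mixture(S, N=50, sigma2=1.0, seed=0):
    """Generate N observed connectivity vectors Y_i = sum_l a_il S_l + E_i
    with zero-mean Gaussian noise of variance sigma2.

    S : (q, p) array whose rows are the vectorized true latent sources.
    Returns the (N, p) data matrix and the (N, q) mixing matrix."""
    rng = np.random.default_rng(seed)
    q, p = S.shape
    A = rng.standard_normal((N, q))                      # subject loadings
    E = rng.normal(scale=np.sqrt(sigma2), size=(N, p))   # additive noise
    return A @ S + E, A
```

Varying `N` and `sigma2` over the grid above reproduces the kind of sample-size and noise-level settings used in the study.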
In summary, we have $2 \times 2 \times 3 = 12$ simulation settings in total (two scenarios, two sample sizes, and three variance levels).

Table 2: Simulation results for comparing LOCUS with other methods with 100 simulation runs for Scenario II. Values presented are mean and standard deviation of correlations between true and estimated latent sources and loading matrices.

Following previous work (Beckmann and Smith, 2005; Guo, 2011; Wang and Guo, 2019), we evaluate the performance of each method based on the correlations between the truth and the model-based estimates of the source signals and mixing coefficients. We further examine the standard deviation of these correlations across 100 simulation runs to evaluate the robustness of the methods. Results are summarized in Tables 1 and 2. Furthermore, to illustrate the accuracy of the methods in recovering the patterns in the latent sources, we randomly selected 4 simulation runs to display the estimated source signals under the high variance setting (Figures 5 and 6).

In recent years, there has been increasing emphasis on reproducibility in neuroscience research. To this end, we evaluate the reproducibility of the latent sources extracted by each method across the simulation runs using a reliability index proposed by Kemmer et al. (2018). The index provides a scaled and chance-corrected measure for assessing the reproducibility of latent sources extracted with blind source separation methods.
Specifically, the reliability index of the $\ell$th latent source estimate is defined as:
$$\mathrm{RI}_\ell = \frac{\frac{1}{B}\sum_{b=1}^B h\big(S_\ell, \hat{S}^{(b)}_\ell\big) - \frac{1}{Bq}\sum_{b=1}^B\sum_{j=1}^q h\big(S_\ell, \hat{S}^{(b)}_j\big)}{1 - \frac{1}{Bq}\sum_{b=1}^B\sum_{j=1}^q h\big(S_\ell, \hat{S}^{(b)}_j\big)}, \tag{17}$$
where $B = 100$ is the number of replications (simulation runs in this case), $S_\ell$ is the true latent source, $\hat{S}^{(b)}_\ell$ is the latent source estimated from the $b$th simulation run, and $h(\cdot)$ is a measure of similarity between the true and estimated sources, which can be specified as the Pearson correlation (Wang et al., 2016) or the Jaccard index (Real and Vargas, 1996). The reliability index reflects the similarity between the true and estimated source signals, removing by-chance similarity between the true source and any source estimate among the $q$ extracted latent sources. It is further scaled by its maximum possible value so that it typically ranges from 0 to 1, where $\mathrm{RI}_\ell = 0$ indicates that the $\ell$th latent source is not reproducible across simulation runs after correcting for by-chance correlations, and $\mathrm{RI}_\ell$ close to 1 indicates that the component is highly reproducible across replication data. Figure 7 displays the average reproducibility of the latent sources estimated via the different methods under the various simulation settings.

Based on the results in Tables 1 and 2, the proposed LOCUS method provides more accurate estimates of the latent sources and the mixing coefficients, or subject-specific loadings, compared with the existing connICA and dictionary learning methods. The correlation standard deviation of LOCUS is generally lower than that of connICA and dictionary learning, indicating that the results from LOCUS have better stability. Figures 5 and 6 show that LOCUS produces more accurate results in recovering the underlying connectivity traits in both Scenarios I and II.
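The reliability index (17) can be computed as sketched below. This is a minimal sketch with our own names, assuming $h$ is the Pearson correlation; matching the $\ell$th estimate in each replication to the best-correlated extracted component is an additional assumption on our part, since the matching rule is not specified in (17) itself.

```python
import numpy as np

def reliability_index(S_true, S_hats):
    """Reliability index (17) with h = Pearson correlation (sketch).

    S_true : (p,) true latent source
    S_hats : list of B arrays, each (q, p), one per replication; the matched
             estimate S_hat_l^(b) is taken as the best-correlated component.
    """
    h = lambda a, b: np.corrcoef(a, b)[0, 1]
    sims = np.array([[h(S_true, Sb[j]) for j in range(Sb.shape[0])]
                     for Sb in S_hats])       # (B, q) similarity table
    matched = sims.max(axis=1).mean()   # (1/B) sum_b h(S_l, S_hat_l^(b))
    chance = sims.mean()                # (1/(Bq)) sum_b sum_j h(S_l, S_hat_j^(b))
    return (matched - chance) / (1.0 - chance)
```

When every replication recovers the true source exactly, the index equals 1; when the matched similarity is no better than the average across all components, it equals 0.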
In comparison, connICA generates considerably noisier estimates and shows cross-talk between some of its extracted traits. Dictionary learning produces less accurate estimates than LOCUS in the activated regions of the latent sources and also more false positive findings in the de-activated regions.

Figure 5: Estimated latent signals of 4 randomly selected simulation runs in Scenario I with high-level variance across all methods.

Figure 6: Estimated latent signals of 4 randomly selected simulation runs in Scenario II with high-level variance across all methods.

Figure 7: Reproducibility results on latent sources for both scenarios. The first row presents the averaged adjusted Pearson correlation between true and estimated latent sources, while the second row presents the adjusted Jaccard index.

Compared with LOCUS, the estimated latent sources of dictionary learning are noisy even with the L1 penalty. This is because it does not model the sources with the low-rank structure as LOCUS does, which leads to a considerably larger number of parameters to estimate and noisier results. Furthermore, latent sources extracted by the proposed LOCUS method consistently demonstrate higher reproducibility than those estimated by connICA and dictionary learning across all simulation settings (Figure 7). For example, in Scenario I, with sample size $N = 100$ and the high variance level, the correlation-based reliability index remains as high as 0.94 for LOCUS's estimates, while the reliability is only 0.52 and 0.71 for connICA and dictionary learning, respectively.

Results in Tables 1 and 2 show that the proposed novel sparse regularization of LOCUS leads to more accurate estimates of the latent sources and subject-specific loadings compared with the existing vector-wise L1 regularization and the nuclear norm sparsity control.
Figures 5 and 6 show that the proposed sparsity regularization produces more accurate and precise results in recovering the underlying connectivity traits in both Scenarios I and II. In comparison, the vector-wise sparsity leads to structured errors in estimating the latent sources and fails to detect the true patterns for some of the connectivity traits. This unsatisfactory performance arises because the vector-wise penalty aims to achieve element-wise sparsity in the vectors $\{X_\ell\}$; in doing so, it often induces structured inaccuracy in estimating the latent sources, which are inner products of the vectors. The nuclear norm sparsity method accurately recovers the activated regions in Scenario I, where the latent sources have low-rank structure. However, its estimates are noisier than those of LOCUS and have more false positive signals in the non-activated regions. This is because nuclear norm sparsity aims to achieve low-rankness in the estimated latent sources and does not necessarily lead to element-wise sparsity in the sources. For Scenario II, where the patterns are challenging for the low-rank structure to capture, the performance of nuclear norm sparsity is less satisfactory, showing cross-talk between the estimated sources. Finally, latent sources estimated with the proposed LOCUS sparsity regularization consistently demonstrate better reproducibility than those estimated with the vector-wise L1 penalty and the nuclear norm penalty across all simulation settings (Figure 7).

4. Application to rs-fMRI Data from the Philadelphia Neurodevelopmental Cohort (PNC)

We applied the proposed method to resting-state fMRI (rs-fMRI) data collected in the Philadelphia Neurodevelopmental Cohort (PNC) study.
The PNC is a collaborative project between the Brain Behavior Laboratory at the University of Pennsylvania and the Children's Hospital of Philadelphia (CHOP), funded by NIMH through the American Recovery and Reinvestment Act of 2009 (Satterthwaite et al., 2014a,b). The PNC study includes a population-based sample of individuals aged 8–21 years selected among those who received medical care in the Children's Hospital of Philadelphia network in the greater Philadelphia area; the sample is stratified by sex, age, and ethnicity. A subset of participants from the PNC were recruited into a multimodality neuroimaging study which included resting-state fMRI (rs-fMRI).

Prior to analysis, we performed quality control on the rs-fMRI, including displacement analysis to remove images with excessive motion (Satterthwaite et al., 2014b; Wang et al., 2016). Among the subjects who had rs-fMRI scans, 515 participants' data met the quality control criterion and were used in the following analysis. Among these subjects, 290 (56%) were female, and the mean age was 14.51 years (SD = 3.32).

The rs-fMRI data were processed using a standard preprocessing procedure. Specifically, skull stripping was performed on the T1 images to remove extra-cranial material, and the first four volumes of the functional time series were removed to adjust for initial stabilization, leaving 120 volumes for subsequent preprocessing. The anatomical image was registered to the 8th volume of the functional image and subsequently spatially normalized to the MNI standard brain space. These normalization parameters were then applied to the functional images, which were smoothed with a 6 mm FWHM Gaussian kernel. Motion correction was applied to the functional images. A validated confound regression procedure (Satterthwaite et al., 2015) was performed on each subject's time series data to remove confounding factors, including motion, global effects, and white matter (WM) and cerebrospinal fluid (CSF) nuisance signals.
Furthermore, motion-related spike regressors were included to bound the observed displacement. Lastly, the functional time series data were band-pass filtered to retain frequencies between 0.01 and 0.1 Hz, the relevant frequency range for rs-fMRI.

In our paper, we adopt Power's 264-node brain parcellation system (Power et al., 2011) for connectivity analysis. Each node is a 10 mm diameter sphere in the standard MNI space representing a putative functional area, and the collection of nodes provides good coverage of the whole brain. The nodes are assigned to 10 functional modules that correspond to the major resting-state networks (Smith et al., 2009). The functional modules include the medial visual network ("Med Vis"), occipital pole visual network ("OP Vis"), lateral visual network ("Lat Vis"), default mode network ("DMN"), cerebellum ("CB"), sensorimotor network ("SM"), auditory network ("Aud"), executive control network ("EC"), and right and left frontoparietal networks ("FPR" and "FPL"). The 232 of the 264 nodes that are associated with the resting-state networks were used for our connectivity analysis. We extract the fMRI time series from each node and obtain a 232 × 232 connectivity matrix for each subject by evaluating the pairwise correlations between the node-specific fMRI series. Fisher's Z transformation is applied to the correlations to obtain the connectivity data for the LOCUS decomposition.

Figure 8: Reproducibility analysis for 18 matched latent sources from LOCUS and connICA. For the matched latent sources, LOCUS tends to have significantly higher reproducibility compared to the connICA approach (p-value: 0.005 for Pearson correlation, 0.010 for Jaccard index).

We apply the proposed LOCUS method to decompose the preprocessed functional connectivity data from the PNC study.
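The subject-level connectivity construction described above, pairwise node correlations followed by Fisher's Z transformation, can be sketched as below. This is a minimal sketch with our own function names; zeroing the diagonal before the `arctanh` transform (to avoid infinite values on the unit self-correlations) is our own convention.

```python
import numpy as np

def connectivity_matrix(ts):
    """Build a node x node connectivity matrix from fMRI time series.

    ts : (T, V) array of T time points for V nodes.  Pairwise Pearson
    correlations are Fisher z-transformed (arctanh); the diagonal is
    zeroed first so the transform stays finite."""
    r = np.corrcoef(ts, rowvar=False)     # (V, V) correlation matrix
    np.fill_diagonal(r, 0.0)
    return np.arctanh(r)                  # Fisher's Z transformation

def upper_triangle(C):
    """Vectorize the upper triangle, i.e. the p = V(V-1)/2 unique edges."""
    iu = np.triu_indices(C.shape[0], k=1)
    return C[iu]
```

With $V = 232$ nodes this yields $p = 232 \times 231 / 2 = 26{,}796$ edges per subject, which are the inputs to the decomposition.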
We also implement connICA, which represents the current leading method in neuroimaging for decomposing brain connectivity data. We extract 30 latent sources in both the LOCUS and connICA analyses. For comparison purposes, we match the estimated latent sources from LOCUS and connICA based on their correlations and identify 18 matched sources with correlations of 0.6 and above. We evaluate the reproducibility of the extracted latent sources by re-sampling the observed data. Specifically, we randomly bootstrap the data set 200 times and apply LOCUS and connICA to extract latent sources from each of these bootstrap samples. We evaluate the reproducibility of the results across the bootstrap samples using the reliability index in (17).

Figure 8 compares the reproducibility of the 18 matched latent sources extracted by LOCUS and connICA using reliability indices based on both the Pearson correlation and the Jaccard index. The connectivity traits from LOCUS show significantly higher reproducibility compared with those from connICA (p-value: 0.005 for Pearson correlation, 0.010 for Jaccard index). For most of the 18 traits, the LOCUS estimates demonstrate about a 15% to 50% increase in reliability as compared with those from connICA.

Figure 9 displays 6 highly reproducible connectivity traits with a correlation-based reliability index of 0.7 or higher. Compared with the noisy estimates from connICA, the latent sources extracted by LOCUS clearly reveal the neural circuits contributing to each of the connectivity traits. In Figure 10, we map the LOCUS-estimated top 1% of edges from each of the 6 connectivity traits onto the brain using BrainNet Viewer (Xia et al., 2013). Trait 1 mainly consists of connections involving nodes from the FPL and Aud functional modules. Trait 2 mainly consists of connections within the Med Vis module and connections between Med Vis and several brain regions, including the other visual modules, i.e.
Op Vis and Lat Vis, and medial frontal cortex regions in the EC module. Trait 3 is a cerebellum-related connectivity trait including connections within CB and also between CB and other brain regions. Trait 4 consists of connections between Op Vis and other brain regions. Trait 5 mainly consists of edges connecting Med Vis with sensory and motor modules, including SM, CB, and Aud, and also with anterior cingulate regions in EC. Trait 6 mainly consists of connections among the cognitive functional modules, including EC, DMN, FPL, and FPR. In addition to improving the accuracy and reliability in extracting the connectivity traits, LOCUS also generates subject-specific trait loadings that show stronger associations with subjects' demographic characteristics. For example, the LOCUS-derived subject loadings for latent source 3, i.e. the cerebellum-related connectivity trait, are found to be significantly associated with subjects' age.

Figure 9: Heatmap of six matched latent sources between LOCUS and connICA with high reproducibility, where these six latent sources estimated from LOCUS have a Pearson-based reproducibility higher than 0.7.

Figure 10: Visualizing the top 1% of brain connectivities of the 6 matched latent signals based on LOCUS using BrainNet Viewer (blue edges are negative, red edges are positive).

These results are consistent with the existing neuroscience literature (Ingalhalikar et al., 2014). LS8 mainly contains connections involving the EC network, including connections within EC and connections between EC and SM, Op Vis, and Aud. The subject loadings of LS8 are significantly associated with age.

5. Discussion

In this paper, we propose a novel signal separation framework designed for decomposing imaging-based brain connectivity matrices to reveal underlying connectivity traits. The proposed LOCUS method has several key innovations.
Motivated by the observed characteristics of connectivity data, LOCUS uses a low-rank structure to significantly reduce the number of parameters and improve accuracy and precision, leading to more efficient and reliable source separation for connectivity metrics. Moreover, we propose a novel angle-based sparsity regularization for the low-rank decomposition. This regularization is methodologically appealing because it directly targets the connectivity traits in its sparsity control, and it shows better performance than existing sparsity regularization methods. Furthermore, unlike many existing sparsity regularizations that require numerical methods to solve, our new sparsity penalization leads to explicit analytic solutions to the optimization problems in the estimation, which increases computational efficiency.

Figure 11: Two estimated latent sources based on LOCUS which are not identified by connICA. These two latent sources have relatively high reproducibility and are significantly associated with subjects' clinical outcomes, i.e. gender and age.

Figure 12: Visualizing the top 1% of brain connectivities of the 2 estimated latent signals from LOCUS which are not identified by connICA (blue edges are negative, red edges are positive).

We show that the optimization function of LOCUS has a block multi-convex structure and propose a novel Node-Rotation algorithm for learning the LOCUS model. We conduct extensive simulation studies with data generated from various types of underlying source signals. The proposed LOCUS method demonstrates superior performance over existing methods in both the simulation studies and the real data application. The sparsity penalization term in the current paper is based on the L1 norm. The proposed method and algorithm can be readily extended to alternative types of norms such as L2, MCP, and SCAD.
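As an illustration of this extension, the element-wise thresholding operators corresponding to the L1, SCAD, and MCP penalties can be sketched as below. This is a hedged sketch: the algorithm above uses the L1 soft-thresholding form, and swapping in the SCAD or MCP operators is our illustration of the extension, not the paper's implementation.

```python
import numpy as np

def soft_threshold(z, lam):
    """L1 (lasso) thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding operator (Fan and Li, 2001), with the usual a = 3.7."""
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) <= 2 * lam,
                    soft_threshold(z, lam),
                    np.where(np.abs(z) <= a * lam,
                             ((a - 1) * z - np.sign(z) * a * lam) / (a - 2),
                             z))

def mcp_threshold(z, lam, gamma=3.0):
    """MCP thresholding operator (Zhang, 2010)."""
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) <= gamma * lam,
                    soft_threshold(z, lam) / (1.0 - 1.0 / gamma),
                    z)
```

Unlike the L1 operator, SCAD and MCP leave large signals unshrunk, which is the usual motivation for the extension.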
Furthermore, the proposed LOCUS is applicable to various types of connectivity measures, such as structural connectivity from DTI or functional connectivity measured by mutual information. Finally, the proposed angle-based sparsity regularization can be generally applied to tensor-decomposition methods that involve a low-rank structure, providing a useful new alternative to existing sparsity penalization methods. We plan to release an R package on the Comprehensive R Archive Network (CRAN) and Neuroconductor to facilitate the implementation of the LOCUS method.

Appendix

1. Proof of Lemma 1:

Denote $L(X_\ell D_\ell X'_\ell)$ as $S_\ell$. Given the orthogonality of $\tilde{A}$, we have
$$\sum_{i=1}^q \Big\|\tilde{Y}_i - \sum_{\ell=1}^q \tilde{a}_{i\ell} L(X_\ell D_\ell X'_\ell)\Big\|_2^2 = \sum_{i=1}^q \Big\|\tilde{Y}_i - \sum_{\ell=1}^q \tilde{a}_{i\ell} S_\ell\Big\|_2^2 = \|\tilde{Y} - \tilde{A}S\|_F^2 = \|\tilde{A}(\tilde{A}'\tilde{Y} - S)\|_F^2$$
$$= \big\|[\tilde{Y}'\tilde{A}_1, \ldots, \tilde{Y}'\tilde{A}_q]' - [S_1, \ldots, S_q]'\big\|_F^2 = \sum_{\ell=1}^q \|\tilde{Y}'\tilde{A}_\ell - S_\ell\|_2^2 = \sum_{\ell=1}^q \big\|\tilde{Y}'\tilde{A}_\ell - L(X_\ell D_\ell X'_\ell)\big\|_2^2,$$
where $\tilde{A}_\ell$ is the $\ell$th column of $\tilde{A}$, and the fourth equality uses the orthogonality of $\tilde{A}$. This finishes the proof of Lemma 1.

2.
Derivation of the update for $X_\ell(v)$ in the Node-Rotation algorithm:

To see the linkage between (9) and (10), we first rewrite (9) in the following edge-wise form:
$$\big\|\tilde{Y}'\tilde{A}_\ell - L(X_\ell D_\ell X'_\ell)\big\|_2^2 + \phi \sum_{u < v} \ldots$$

3. Proof of Proposition 1 (Block Multi-Convexity):

It is straightforward to show that the following proof of block multi-convexity applies to both the LOCUS model (2) on the original data and the model (7) on the preprocessed data. We use the notation of the original data model in the following. Let $f(\cdot)$ be the objective function in (8). We define the parameter partition $P = \{X_1(1), \ldots, X_1(V), \ldots, X_q(V), d_1, \ldots, d_q, A\}$, where $X_\ell(v)$ ($v = 1, \ldots, V$) is the $v$th element of $X_\ell$ ($\ell = 1, \ldots, q$) and $d_\ell = \mathrm{diag}(D_\ell)$. We show that the function $f$ is convex with respect to each individual argument in $P$ while holding the others fixed.

First, we show the convexity of $f$ w.r.t. $X_\ell(v)$ given the other terms, by rewriting the function $f$ as
$$f = \sum_i \sum_u \ldots$$

Acknowledgments

Research reported in this publication was supported by the National Institute of Mental Health of the National Institutes of Health under Award Numbers R01MH105561 and R01MH118771. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Philadelphia Neurodevelopmental Cohort: support for the collection of the data sets was provided by grant RC2MH089983 awarded to Raquel Gur and RC2MH089924 awarded to Hakon Hakonarson. All subjects were recruited through the Center for Applied Genomics at The Children's Hospital of Philadelphia.

References

Genevera Allen. Sparse higher-order principal components analysis. In Artificial Intelligence and Statistics, pages 27–36, 2012.

Enrico Amico and Joaquín Goñi.
Mapping hybrid functional-structural connectivity traits in the human connectome. Network Neuroscience, 2(3):306–322, 2018a.

Enrico Amico and Joaquín Goñi. The quest for identifiability in human functional connectomes. Scientific Reports, 8(1):8254, 2018b.

Enrico Amico, Daniele Marinazzo, Carol Di Perri, Lizette Heine, Jitka Annen, Charlotte Martial, Mario Dzemidzic, Murielle Kirsch, Vincent Bonhomme, Steven Laureys, et al. Mapping the functional connectome traits of levels of consciousness. NeuroImage, 148:201–211, 2017.

Christian F Beckmann and Stephen M Smith. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Transactions on Medical Imaging, 23(2):137–152, 2004.

Christian F Beckmann and Stephen M Smith. Tensorial extensions of independent component analysis for multisubject fMRI analysis. NeuroImage, 25(1):294–311, 2005.

Bharat Biswal, F Zerrin Yetkin, Victor M Haughton, and James S Hyde. Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magnetic Resonance in Medicine, 34(4):537–541, 1995.

Ed Bullmore and Olaf Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience, 10(3):186, 2009.

Kun Chen, Hongbo Dong, and Kung-Sik Chan. Reduced rank regression via adaptive nuclear norm penalization. Biometrika, 100(4):901–920, 2013.

Moo K Chung. Statistical challenges of big brain network data. Statistics & Probability Letters, 136:78–82, 2018.

Jessica A Church, Damien A Fair, Nico UF Dosenbach, Alexander L Cohen, Francis M Miezin, Steven E Petersen, and Bradley L Schlaggar. Control networks in paediatric Tourette syndrome show immature and anomalous patterns of functional connectivity.
Brain, 132(1):225–238, 2008.

Joey A Contreras, Joaquín Goñi, Shannon L Risacher, Enrico Amico, Karmen Yoder, Mario Dzemidzic, John D West, Brenna C McDonald, Martin R Farlow, Olaf Sporns, et al. Cognitive complaints in older adults at risk for Alzheimer's disease are associated with altered resting-state networks. Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring, 6:40–49, 2017.

Gustavo Deco, Viktor K Jirsa, and Anthony R McIntosh. Emerging concepts for the dynamical organization of resting-state activity in the brain. Nature Reviews Neuroscience, 12(1):43, 2011.

Daniele Durante, David B Dunson, and Joshua T Vogelstein. Nonparametric Bayes modeling of populations of networks. Journal of the American Statistical Association, 112(520):1516–1530, 2017.

Jianqing Fan and Runze Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456):1348–1360, 2001.

Jianqing Fan, Wenyan Gong, and Ziwei Zhu. Generalized high-dimensional trace regression via nuclear norm regularization. arXiv preprint arXiv:1710.08083, 2017.

Emily S Finn, Xilin Shen, Dustin Scheinost, Monica D Rosenberg, Jessica Huang, Marvin M Chun, Xenophon Papademetris, and R Todd Constable. Functional connectome fingerprinting: identifying individuals using patterns of brain connectivity. Nature Neuroscience, 18(11):1664, 2015.

Karl J Friston. Functional and effective connectivity: a review. Brain Connectivity, 1(1):13–36, 2011.

KJ Friston, CD Frith, PF Liddle, and RSJ Frackowiak. Functional connectivity: the principal-component analysis of large (PET) data sets. Journal of Cerebral Blood Flow & Metabolism, 13(1):5–14, 1993.

Matthew F Glasser, Stamatios N Sotiropoulos, J Anthony Wilson, Timothy S Coalson, Bruce Fischl, Jesper L Andersson, Junqian Xu, Saad Jbabdi, Matthew Webster, Jonathan R Polimeni, et al. The minimal preprocessing pipelines for the human connectome project.
NeuroImage, 80:105–124, 2013.

Gyuhyeong Goh, Dipak K Dey, and Kun Chen. Bayesian sparse reduced rank multivariate regression. Journal of Multivariate Analysis, 157:14–28, 2017.

Jochen Gorski, Frank Pfeuffer, and Kathrin Klamroth. Biconvex sets and optimization with biconvex functions: a survey and extensions. Mathematical Methods of Operations Research, 66(3):373–407, 2007.

Ying Guo. A general probabilistic model for group independent component analysis and its estimation methods. Biometrics, 67(4):1532–1542, 2011.

Aapo Hyvärinen and Erkki Oja. Independent component analysis: algorithms and applications. Neural Networks, 13(4):411–430, 2000.

Aapo Hyvärinen, Juha Karhunen, and Erkki Oja. Independent Component Analysis, volume 46. John Wiley & Sons, 2001.

Madhura Ingalhalikar, Alex Smith, Drew Parker, Theodore D Satterthwaite, Mark A Elliott, Kosha Ruparel, Hakon Hakonarson, Raquel E Gur, Ruben C Gur, and Ragini Verma. Sex differences in the structural connectome of the human brain. Proceedings of the National Academy of Sciences, 111(2):823–828, 2014.

Phebe B Kemmer, Ying Guo, Yikai Wang, and Giuseppe Pagnoni. Network-based characterization of brain functional connectivity in Zen practitioners. Frontiers in Psychology, 6, 2015.

Phebe Brenne Kemmer, Yikai Wang, F DuBois Bowman, Helen Mayberg, and Ying Guo. Evaluating the strength of structural connectivity underlying brain functional networks. Brain Connectivity, 8(10):579–594, 2018.

Hyon-Jung Kim, Esa Ollila, and Visa Koivunen. Sparse regularization of tensor decompositions. In , pages 3836–3840. IEEE, 2013.

Prantik Kundu, Brenda E Benson, Dana Rosen, Sophia Frangou, Ellen Leibenluft, Wen-Ming Luh, Peter A Bandettini, Daniel S Pine, and Monique Ernst. The integration of functional brain activity from adolescence to adulthood. Journal of Neuroscience, 38(14):3559–3570, 2018.

Suprateek Kundu, Joshua Lukemire, Yikai Wang, and Ying Guo.
A novel joint brain network analysis using longitudinal Alzheimer's disease data. Scientific Reports, 9(1):1–18, 2019.
Elmar Wolfgang Lang, Ana Maria Tomé, Ingo R Keck, JM Górriz-Sáez, and Carlos García Puntonet. Brain connectivity analysis: a short survey. Computational Intelligence and Neuroscience, 2012:8, 2012.
Xiaoshan Li, Da Xu, Hua Zhou, and Lexin Li. Tucker tensor regression and neuroimaging analysis. Statistics in Biosciences, 10(3):520–545, 2018.
Joshua Lukemire, Yikai Wang, Amit Verma, and Ying Guo. HINT: A hierarchical independent component analysis toolbox for investigating brain functional networks using neuroimaging data. Journal of Neuroscience Methods, page 108726, 2020.
Julien Mairal, Francis Bach, Jean Ponce, and Guillermo Sapiro. Online dictionary learning for sparse coding. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 689–696, 2009.
Amanda F Mejia, Mary Beth Nebel, Yikai Wang, Brian S Caffo, and Ying Guo. Template independent component analysis: Targeted and reliable estimation of subject-level brain networks using big data population priors. Journal of the American Statistical Association, pages 1–27, 2019.
Thomas P Minka. Automatic choice of dimensionality for PCA. In NIPS, volume 13, pages 598–604, 2000.
Trevor Park and George Casella. The Bayesian lasso. Journal of the American Statistical Association, 103(482):681–686, 2008.
Jonathan D Power, Alexander L Cohen, Steven M Nelson, Gagan S Wig, Kelly Anne Barnes, Jessica A Church, Alecia C Vogel, Timothy O Laumann, Fran M Miezin, Bradley L Schlaggar, et al. Functional network organization of the human brain. Neuron, 72(4):665–678, 2011.
Guillaume Rabusseau and Hachem Kadri. Low-rank regression with tensor responses. In Advances in Neural Information Processing Systems, pages 1867–1875, 2016.
Garvesh Raskutti and Ming Yuan. Convex regularization for high-dimensional tensor regression. arXiv preprint arXiv:1512.01215, 639, 2015.
Raimundo Real and Juan M Vargas.
The probabilistic basis of Jaccard's index of similarity. Systematic Biology, 45(3):380–385, 1996.
Theodore D Satterthwaite, Mark A Elliott, Kosha Ruparel, James Loughead, Karthik Prabhakaran, Monica E Calkins, Ryan Hopson, Chad Jackson, Jack Keefe, Marisa Riley, et al. Neuroimaging of the Philadelphia Neurodevelopmental Cohort. Neuroimage, 86:544–553, 2014a.
Theodore D Satterthwaite, Daniel H Wolf, David R Roalf, Kosha Ruparel, Guray Erus, Simon Vandekar, Efstathios D Gennatas, Mark A Elliott, Alex Smith, Hakon Hakonarson, et al. Linked sex differences in cognition and functional connectivity in youth. Cerebral Cortex, 25(9):2383–2394, 2014b.
William W Seeley, Richard K Crawford, Juan Zhou, Bruce L Miller, and Michael D Greicius. Neurodegenerative diseases target large-scale human brain networks. Neuron, 62(1):42–52, 2009.
Ran Shi and Ying Guo. Investigating differences in brain functional networks using hierarchical covariate-adjusted independent component analysis. The Annals of Applied Statistics, 10(4):1930, 2016.
Stephen M Smith, Peter T Fox, Karla L Miller, David C Glahn, P Mickle Fox, Clare E Mackay, Nicola Filippini, Kate E Watkins, Roberto Toro, Angela R Laird, et al. Correspondence of the brain's functional architecture during activation and rest. Proceedings of the National Academy of Sciences, 106(31):13040–13045, 2009.
Stephen M Smith, Karla L Miller, Gholamreza Salimi-Khorshidi, Matthew Webster, Christian F Beckmann, Thomas E Nichols, Joseph D Ramsey, and Mark W Woolrich. Network modelling methods for fMRI. Neuroimage, 54(2):875–891, 2011.
Victor Solo, Jean-Baptiste Poline, Martin A Lindquist, Sean L Simpson, F DuBois Bowman, Moo K Chung, and Ben Cassidy.
Connectivity in fMRI: blind spots and breakthroughs. IEEE Transactions on Medical Imaging, 37(7):1537–1550, 2018.
Will Wei Sun and Lexin Li. STORE: sparse tensor response regression and neuroimaging analysis. The Journal of Machine Learning Research, 18(1):4908–4944, 2017.
Nathalie Tzourio-Mazoyer, Brigitte Landeau, Dimitri Papathanassiou, Fabrice Crivello, Olivier Etard, Nicolas Delcroix, Bernard Mazoyer, and Marc Joliot. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage, 15(1):273–289, 2002.
Lu Wang, Daniele Durante, Rex E Jung, and David B Dunson. Bayesian network-response regression. Bioinformatics, 33(12):1859–1866, 2017.
Yao Wang, Deyu Meng, and Ming Yuan. Sparse recovery: from vectors to tensors. National Science Review, 5(5):756–767, 2018.
Yikai Wang and Ying Guo. A hierarchical independent component analysis model for longitudinal neuroimaging studies. NeuroImage, 189:380–400, 2019.
Yikai Wang, Jian Kang, Phebe B Kemmer, and Ying Guo. An efficient and reliable statistical method for estimating functional connectivity in large scale brain networks using partial correlation. Frontiers in Neuroscience, 10, 2016.
Guo-Rong Wu, Sebastiano Stramaglia, Huafu Chen, Wei Liao, and Daniele Marinazzo. Mapping the voxel-wise effective connectome in resting state fMRI. PLoS ONE, 8(9):e73670, 2013.
Mingrui Xia, Jinhui Wang, and Yong He. BrainNet Viewer: a network visualization tool for human brain connectomics. PLoS ONE, 8(7):e68910, 2013.
Ming Yuan and Cun-Hui Zhang. On tensor completion via nuclear norm minimization. Foundations of Computational Mathematics, 16(4):1031–1068, 2016.
Cun-Hui Zhang et al. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2):894–942, 2010.
Hua Zhou, Lexin Li, and Hongtu Zhu. Tensor regression with applications in neuroimaging data analysis.
Journal of the American Statistical Association, 108(502):540–552, 2013.