Variability in higher order structure of noise added to weighted networks
Ann S. Blevins, Jason Z. Kim, Danielle S. Bassett
January 12, 2021
From spiking activity in neuronal networks to force chains in granular materials, the behavior of many real-world systems depends on a network of both strong and weak interactions. These interactions give rise to complex and higher-order system behaviors, and are encoded using data as the network's edges. However, distinguishing between true weak edges and low-weight edges caused by noise remains a challenge. We address this problem by examining the higher-order structure of noisy, weak edges added to model networks. We find that the structure of low-weight, noisy edges varies according to the topology of the model network to which it is added. By investigating this variation more closely, we see that at least three qualitative classes of noise structure emerge. Furthermore, we observe that the structure of noisy edges contains enough model-specific information to classify the model networks with moderate accuracy. Finally, we offer network generation rules that can drive different types of structure in added noisy edges. Our results demonstrate that noise does not present as a monolithic nuisance, but rather as a nuanced, topology-dependent, and even useful entity in characterizing higher-order network interactions. Hence, we provide an alternate approach to noise management by embracing its role in such interactions.

Introduction
In weighted network analyses, it is incredibly difficult to distinguish between weak edges that correspond to significant system features and weak edges that are simply false positives or noise. Previous research in neuroscience [49, 24, 7], social networks [25, 20], and molecular biology [35] has demonstrated the importance of weak, real edges to the function of a network. However, weighted networks collected from data are also often plagued by noise, either in the form of imprecise edge weights or spurious connections. Unfortunately, such noise often affects the signal-to-noise ratio of weak edges more than strong edges, thus complicating our understanding of weak edge network structure.

How does one deal with unwanted noise in estimates of network structure in real systems? Current methods often attempt to remove noisy edges via thresholding [8, 14, 50, 3, 55] by taking into account edge weight, density, group similarity, or network measures [8]. However, thresholding presents a challenge, as one may remove too many real edges or keep too many spurious edges [62]. Alternatively, many theoretical studies focus on structural properties of noisy edges independent of and isolated from any data; most commonly, efforts in this space study random graphs. One can predict graph properties of random networks [58, 46, 19, 11], which can be useful in distinguishing real networks from random graphs. However, noisy edges in real networks do not exist in isolation, but are intertwined with real edges. Consequently, neither thresholding nor studying noise in isolation is without flaws when distinguishing the structure of real weak edges from that of noisy weak edges. Instead of isolating noise, can we understand its role in higher-order network structure to mitigate, and potentially even take advantage of, noisy edges?

Here we address the above questions by welcoming noise into our experiments and descriptions of network structure (Fig. 1).
Specifically, we ask: what (if anything) can noise added to a real network tell us about the underlying real network structure? If the added noise has the same structure regardless of the topology of the real network, then the answer is nothing. The advantage of this scenario is that we could potentially identify an appropriate threshold for our data fairly easily and in a data-driven manner. Instead, if the structure of the added noise varies based on the topology of the real network, then the noise may carry information about the real weighted network. Indeed, given that a binary graph is defined by a list of either its edges or its non-edges, it is possible that the noise filling the empty edges of a real weighted network holds information related to the topology of the real edges.

To explore this possibility, we ask whether the higher-order structure of noise varies across model network topologies in the controlled context of adding noise to model networks. We tested twelve model networks that spanned different node strength distributions, reliance on distances, and more. Using the network models tested, we identify at least three common profiles of added noise structure as assessed by persistent homology. Notably, we find that these patterns correspond to the topologies of the model networks to which the noise was added. We find that enough information exists within the structure of added noise alone to reasonably distinguish between network models, and that this information stems both from features persisting from the model network and those formed by noisy edges. Furthermore, we provide generative rules that explain the different patterns of added noise structure. Finally, we remark on the ability of noise to create misleading structure within weak network edges and discuss the implications for data analysis.
To better understand the effect of noise added to networks, we precisely measure the changes in network structure that arise from noisy, weak edges, and compare them to an expected structure. We begin with a weighted, completely connected model network generated from a specific set of rules, which has a predictable structure (Fig. 1d, navy). Next, we create a space for weak, noisy edges by thresholding this model network at a chosen edge density ρ_T to keep only the strongest ρ_T fraction of edges (rose). We then create the added noise network (gold) by assigning a random weight to any edge not included in the ρ_T-thresholded weighted network. All edges in the added noise network have edge weight less than any edge in the ρ_T-thresholded network. Finally, we combine the ρ_T-thresholded network and the added noise network to yield a combined weighted network (rose and gold), which now has a precise cutoff below which all edge weights are randomly chosen. Said another way, if we begin with an empty network and add edges to this graph in order of decreasing edge weight from the combined weighted network, the first ρ_T |E| edges will come from modeled edges, while the latter (1 − ρ_T)|E| edges will come from noisy edges only. Then, if we measure network structure along this expanded version of the combination network (Fig. 1d, right), we will be able to distinguish between structure attributed to the model edges and that created by the added noise.

Figure 1: Our experiments query the structure of noise added to model networks. (a) A schematic of real data from a structural brain network. Such data are often assumed to consist of strong and likely real edges (b), along with weak edges (c) that could arise simply due to noise in the system or in the measurements. Edge density is denoted by ρ. (d) In our experiment, we begin with a model network (navy) that has been thresholded to a specific edge density ρ_T (rose), and fill the thresholded edges with weaker, noisy edges (gold) to form a combined network (rose-gold). (e) Then we will measure the structure of the noisy edges (gold).

Real data is often characterized by three features: weighted relationships between nodes, higher-order interactions, and topological constraints such as wiring distance. Given these features, we use persistent homology to study weighted network structure. Persistent homology [10, 66, 43] records the longevity of topological cavities that form and collapse throughout the graph filtration G_0 ⊆ G_1 ⊆ ⋯ ⊆ G_{|E|}. This filtration is the formalization of the expanded view of a weighted network discussed in Fig. 1, in which G_i is the binary graph containing the i strongest edges in the weighted network (see Fig. 2a, Methods, and [51, 23, 45, 29]). This persistent homology approach has been previously used to identify differences in cognition across individuals based on resting state functional connectivity [4], to find percolation properties of porous materials [48], to differentiate neuron morphologies [33], and to understand many other real-world systems [13, 28, 44]. We note that in our setup, only the rank order of edges induced by the original edge weights is preserved, so that the specific edge weights and their generating distribution do not affect the outcome. For this work, we compute the persistent homology in several dimensions: dimension 1 (gaps surrounded by edges), 2 (voids surrounded by filled triangles), 3 (voids surrounded by collections of tetrahedra), and 4 (a higher-dimensional analog). A filtration of graphs can be translated into a filtration of higher-order complexes called simplicial complexes, on which we can compute persistent homology, by assigning a clique of k + 1 nodes to a k-simplex (Fig.
2a; see Methods for more details).

The persistent homology of a weighted network is a collection of half-open intervals [b_l, d_l), called the barcode, that denote the birth b_l and death d_l of the l-th persistent cavity in dimension k. We visualize this output as either the barcode itself or a Betti curve summarization (Fig. 2a, bottom). In the barcode visualization, each bar corresponds to a persistent cavity and extends from the bar's birth to its death. In the Betti curve plot, β_k(ρ) counts the number of persistent cavities alive at edge density ρ. Unlike many other graph metrics, persistent homology incorporates the strong and weak interactions in a holistic manner by considering how the entire filtration fits together, which both generates a unique perspective on network structure similarity [51] and allows us to more precisely understand the interplay between edges corresponding to real data and edges added randomly.

As motivating examples, we present the persistent homology of two random systems [30, 31]. For both systems, we summarize their persistent homology using Betti curves as shown in Fig. 2. In the first system, we create a noisy network by assigning to every edge a weight sampled uniformly at random from (0, 1). In the second system, we create a random geometric network. We note that the Betti curves of the random geometric model differ from those of the IID noise model, and that the peaks decrease with increasing dimension.

Figure 2: Through the lens of persistent homology, we imagine three possibilities for the structure of noise added to model networks. (a) From a weighted network, we construct a graph filtration (top) by adding edges to an empty graph one at a time in order of decreasing edge weight. From this filtration we create a sequence of clique complexes (middle) on which we compute the persistent homology (bottom). We show the barcode (horizontal lines) and Betti curve β(ρ) for the resulting persistent homology. (b, c) An example matrix (left) and Betti curves (right) of the IID noise (b) and random geometric (c) network models. Solid lines indicate the average over 500 replicates and shaded areas correspond to ± standard deviation. (d) Stylized possibilities for the structure of added noise as perceived by the Betti curves: a reversion to IID noise Betti curves (left), a condensed collection of all four Betti curves following an IID noise pattern (middle), or a Betti curve pattern completely unlike that of IID noise (right).

Combining persistent homology with the experimental approach set up in Fig. 1, we imagine three possibilities for the impact of added noise on the Betti curves of model networks, illustrated with stylized Betti curves in Fig. 2d. First, after ρ_T the Betti curves could quickly revert to the expected IID noise pattern at that edge density (Fig. 2d, left). Then we would see that the noise section of the Betti curves looks similar to a copy-and-pasted version of that same section of the IID noise Betti curves. Second, we could imagine that if the weighted network model has one densely connected community, there will be so much empty space that randomly adding edges will create a smaller version of the IID noise network (Fig. 2d, middle). We might expect this scenario to produce the entire sequence (non-zero β_k for k = 1, …, 4) of increasing Betti curves all condensed after ρ_T. Third, perhaps neither of the above is correct, and in fact the added noise section of the filtration may show no resemblance to the IID noise Betti curves (Fig. 2d, right). Which of these three possibilities actually occurs?

We test which of the three scenarios described above occurs for 12 graph models (including IID noise, see Methods) and 17 values of ρ_T; here we highlight results for ρ_T = 0.5. We include results for all models in Fig. A4.

Repeating the experiment across all 17 values of ρ_T (Fig. 3b), we observe three qualitative classes of added noise structure (Fig. 3c) within the non-IID noise models. We will refer to the three qualitative classes as the random reversion, coned, and random condensed classes. First, the Betti curves of the added noise networks on the assortative, core-periphery, and, to a lesser extent, the disassortative models mirror those of the IID noise model but scaled and sometimes slightly shifted; these models comprise the random reversion class, named for the return of the Betti curves to those of the random IID noise model. Second, noise added to the configuration and the dot product models generates Betti curve peaks that decrease with increasing dimension and dramatically shift rightwards as ρ_T increases. We will refer to this second set of models as the coned class, following Ref. [51]. Third, the distance-based models (random geometric, cosine geometric, ring lattice, squared Euclidean, and RMSD) constitute the random condensed class. Here, the added noise produces an increasingly compressed collection of IID noise-like Betti curves in all four dimensions as ρ_T increases. In summary, we found that the structure of noise added to model weighted networks varies across network models and values of ρ_T.

We demonstrated that the structure of noise varies based on the model to which it is added. But does the structure of added noise vary enough that it can accurately classify the network models? In the binary case, there is as much information about a binary graph from knowing all the edges that exist as there is from knowing which edges do not exist. Our experiment could be interpreted as a weighted extension of this idea, in which we know the weights of model network edges, and then all the open space (edges with weight 0 after thresholding at ρ_T) is filled by random weights.
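The thresholding-and-filling construction described above (Fig. 1d) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: the function name and the choice of a uniform distribution for the noise weights are our illustrative assumptions.

```python
import numpy as np

def combined_network(model_weights, rho_T, rng=None):
    """Keep the strongest rho_T fraction of edges of a complete weighted
    network, and fill every removed edge with a strictly weaker random
    'noise' weight (hypothetical sketch of the paper's construction)."""
    rng = np.random.default_rng(rng)
    n = model_weights.shape[0]
    iu = np.triu_indices(n, k=1)            # upper-triangle edge list
    w = model_weights[iu].astype(float)
    m = len(w)                              # |E| for a complete graph
    keep = int(round(rho_T * m))            # number of model edges kept
    order = np.argsort(w)[::-1]             # edge indices, strongest first
    out = np.zeros(m)
    out[order[:keep]] = w[order[:keep]]     # model section, weights intact
    w_min = out[order[:keep]].min() if keep else 1.0
    # noise section: uniform weights strictly below the weakest kept edge
    out[order[keep:]] = rng.uniform(0.0, w_min, size=m - keep)
    combined = np.zeros((n, n))
    combined[iu] = out
    return combined + combined.T
```

Because only the rank order of edges matters for the filtration, the particular noise distribution is immaterial as long as every noise weight falls below the weakest kept model weight.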
The randomly-weighted edges and the model-weighted edges together form a complete (weighted) graph. Therefore, despite the fact that edge weights are chosen at random for the added noise, we expect that at ρ_T near 0.5 the added noise persistent homology will be enough to classify the model networks.

To test these ideas, we classify the networks using a Gaussian mixture model. We use three features derived from the persistent homology in dimension k, for k = 1, …, 4, at each value of ρ_T; see the confusion matrices in Fig. 4. We find that classification based on the added noise barcodes alone achieves moderate accuracy.

Figure 3: Betti curves of added noise vary based on the underlying network model. (a) Examples of network models that support varying structure of added noise. Solid lines indicate the mean Betti curve and shaded regions correspond to ± standard deviation. (b) Repeating across values of ρ_T, we create a summarized visualization of the Betti curves. (c) Mean Betti curves for each of the 11 tested network models (IID noise not included). Betti curve opacity indicates ρ_T. Betti curve peaks are marked with a dot to emphasize the scaling and shifting of Betti curves as ρ_T varies.

Figure 4: Classification based on the added noise persistent homology can distinguish between network models. Confusion matrices showing the results of the classifications based on (a) added noise barcodes, (b) the barcode from model weights only, (c) the entire barcode, and (d) barcodes generated from the isolated added noise networks. Matrix entries show the number of corresponding true models that were classified as the corresponding predicted model, out of 250 (see Methods).

Some models were more accurately classified using model weights than the added noise portion of the barcode. Finally, when we classify using the entirety of the barcodes (Fig. 4c), we find that the prediction accuracy is at least as good as or improved over both the added noise and model weights sections for eight of the twelve models. Indeed, using all barcodes results in the highest accuracy over a broad range of ρ_T.

We next classify using only the noisy edges, with all model edges set to 0. As we have previously discussed, the set of added noise edges on any weighted network is itself a weighted graph. The above classification experiment used the added noise structure that was dependent, that is, that existed atop a weighted network, such that the added noise existed within the context of the model network. That approach leaves open the question of whether this added noise graph as a standalone weighted network (independent of any model edges already added) contains the same amount of information as the added noise network (noise in the context of the network model). To address this question, we extracted the added noise network from each graph model for all values of ρ_T and computed their persistent homology. We show the Betti curves across values of ρ_T for these isolated noise networks in Fig. A5. We find that classifying the topological features from the added noise independent of model edges (noise (isolated)) does a fine job, suggesting that the isolated noise at each ρ_T contains an impressive amount of distinguishing power.

Finally, as a byproduct of studying the isolated added noise networks (Fig. A5), we learn that the disassortative and assortative models are, in a way, weighted complements of each other.
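The excerpt does not specify which three per-dimension features enter the classifier, so as an illustration (our assumption, not the authors' choice) we take the Betti-curve peak height, the density at which the peak occurs, and the total persistence, and we substitute a nearest-centroid rule as a simplified stand-in for the Gaussian mixture model:

```python
import numpy as np

def barcode_features(bars, grid=np.linspace(0, 1, 101)):
    """Summarize a dimension-k barcode (list of (birth, death) pairs in
    edge-density units) by three assumed features: Betti-curve peak
    height, peak location, and total persistence."""
    betti = np.array([sum(b <= r < d for b, d in bars) for r in grid])
    peak = betti.argmax()
    total = sum(d - b for b, d in bars)
    return np.array([betti[peak], grid[peak], total])

def nearest_centroid(train_X, train_y, x):
    """Toy stand-in for the paper's Gaussian mixture classifier: assign x
    to the class whose mean feature vector is closest."""
    labels = sorted(set(train_y))
    cents = {c: train_X[[y == c for y in train_y]].mean(axis=0)
             for c in labels}
    return min(labels, key=lambda c: np.linalg.norm(x - cents[c]))
```

Any off-the-shelf mixture-model implementation could replace the centroid rule; the point here is only that low-dimensional summaries of the barcode suffice as classifier inputs.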
Indeed, the space left by the thresholded disassortative network at particular values of ρ_T exists within four large groups of nodes, so that the weighted random complement is indeed structured as four communities. We see evidence of this type of community structure in the Betti curves, particularly in the two peaks observed in the Betti curves [51] of the isolated noise networks (Fig. A5). Taken together, we learn that the structure of noise added to model networks contains information not seen in the structure of the model network itself. In the particular case of model networks that produce little persistent homology, the structure of their empty edges as seen by the added noise persistent homology can improve classification.

Determining the source of information contained within added noise topology

Given the ability of the added noise persistent homology features to reasonably classify the underlying network models, we next aim to clarify more precisely the reason for this result. We can label persistent cavities (bars in the barcode) that exist during the added noise section as one of two types: first, a so-called noise-exclusive persistent cavity, whose birth and death times are both within the added noise portion of the filtration (b, d > ρ_T); and second, a crossover cavity that was born within the model section (b < ρ_T) but dies within the added noise section (d > ρ_T; see Methods). We expect that, for those graph models that have them, the crossover bars should hold the majority of the classification information, since they are formed with model-weight edges. After classification on these slices of the barcodes, we find that indeed the crossover bars can be used to classify the five models that consistently produce crossover bars nearly perfectly (Fig. A15). Surprisingly, the classification using the noise-exclusive portion (Fig. A16) is nearly identical to that of using the added noise portion of the barcode (Fig. 4a) for all models.
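The partition of bars into crossover and noise-exclusive types is a direct comparison of each interval [b, d) against ρ_T, and can be sketched as follows (the function name is ours):

```python
def split_bars(bars, rho_T):
    """Partition persistence intervals [b, d) by their relation to rho_T.

    'crossover' bars are born among model edges (b < rho_T) but die in
    the added-noise section (d > rho_T); 'noise-exclusive' bars are born
    and die entirely within the added-noise section (b, d > rho_T);
    the remainder live entirely within the model section."""
    crossover = [(b, d) for b, d in bars if b < rho_T < d]
    noise_exclusive = [(b, d) for b, d in bars if b > rho_T and d > rho_T]
    model_only = [(b, d) for b, d in bars if d <= rho_T]
    return crossover, noise_exclusive, model_only
```

Classifying on each slice separately is then just a matter of feeding the corresponding sub-barcode into the feature pipeline.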
Together, our results suggest that using only those persistent features generated within the added noise section of the barcode is sufficient for moderate accuracy in classifying this set of model networks.

Finally, we ask how much information is contained in simply the binary network at ρ_T by comparing the persistent homology from the added noise to that from a randomized model weights experiment. In this experiment, G_0 is empty and the binary graph at ρ_T is the same model network at ρ_T as before, but the ordering of model edges added has been randomized (Fig. A6). We find that the classification run on the persistent homology features of these randomized model weights (Fig. A6, A10) performs similarly to that of the added noise features when ρ_T = 0.5. In addition to the classification results using added noise or isolated noise, these findings strengthen the intuition that the binary network at ρ_T constrains the possible persistent homology outcomes generated by randomly adding edges.

In summary, our classification experiments suggest that the distinguishing information of the added noise structure comes from both crossover and noise-exclusive bars. The randomized model network experiment could be viewed as a step-wise random graph process in which exactly one interior graph G_i, 0 < i < |E|, is fixed. Therefore, results from the randomized model network experiments additionally suggest that even having one graph predetermined between G_0 and G_{|E|} can greatly alter the topology of the filtration between G_0, G_i and G_i, G_{|E|}.

Next, we ask how one might obtain each of the three main noise profiles observed in Fig. 3a,c. The first, the random reversion profile, in which the added noise section (ρ > ρ_T) is similar to an IID noise Betti curve copied at that edge density (Fig. 3a), can be replicated using matrix blocks that are themselves created through a random process [2]. The assortative model shows this trend for the largest range of ρ_T, since it has four blocks of highly weighted edges and the rest are weak but randomly weighted edges (Fig. A1). The core-periphery model shows the same pattern but for a smaller range of ρ_T, because it has seven blocks of highly weighted edges and thus its natural break between the high- and low-weighted edges occurs at a larger edge density than for the assortative model. The disassortative model only has four low-weight blocks, and we observe that at very few values of ρ_T does it support added noise that also produces this similar Betti curve pattern.

For the random condensed and coned profiles, we first investigated how the propensity to fill triangles would contribute to the added noise profile.
Both the distance-based and coned network models have a strong tendency to form triangles, either by the influence of the triangle inequality in the former or the large clique size in the latter [51]. We create a random model network weighted so that at each step in the filtration, the next edge added completes an open triangle (three nodes connected by two edges) with probability p, and connects a randomly selected pair of non-adjacent nodes with probability 1 − p (or if no open triangles exist). If the new edge will complete a triangle, the open triangle is chosen with probability weighted by the product of the open triangle edge weights (see Methods). This rule states that it is more likely for the new edge to form a triangle with two edges that were added early in the filtration than late in the filtration. We call this model the weighted probabilistic triangle model (Fig. 5a). In Fig. 5b we show the resulting Betti curves across values of ρ_T. Varying p allows us to interpolate between the IID noise network (p = 0) and a model that supports an added noise profile of the coned type (decreasing Betti peaks at larger p; see Fig. A7). Taken to the most extreme, we create a weighted clique graph in which at each step in the filtration, the newest edge contributes to building one growing clique. This weighted clique model additionally shows an added noise profile similar to the weighted probabilistic triangle model with p > 0.85 (Fig. 5c). Indeed, as discussed in the supplement, upper and lower bounds on the Betti numbers for this weighted clique network can be derived in a similar fashion to those for the IID noise graph (see Section 10.2.4).

What additional process or constraint underlies the distance-based network models that is not captured by the above weighted probabilistic triangle model? Although distance-based networks do indeed fill triangles quickly, the alternative to completing a triangle in an embedded network is far from adding a random edge anywhere in the network, as is the case in the above weighted probabilistic triangle model. Instead, in embedded networks we often see multiple pockets of clustered nodes arise and eventually connect.

Figure 5: Cliques and triangles support different patterns of added noise Betti curves. (a) The weighted probabilistic triangle model proceeds by adding edges either at random or to complete an open triangle. The likelihood of completing a specific open triangle is based on the weight of its edges. (b) Betti curves at varying values of ρ_T for the weighted probabilistic triangle model (left and middle, two values of p). Average (solid lines) and ± standard deviation (filled area) for Betti curves of the weighted probabilistic triangle model with p = 1 (right). (c) Betti curves at varying values of ρ_T for the clique graph. Inset shows the adjacency matrix. (d) The random m-clique model creates a weighted network by selecting m nodes at random, increasing all edge weights between these nodes by 1, and repeating the process. (e) Betti curves across all values of ρ_T (left) and for a single value of ρ_T (right) for the random m-clique model with m = 25. Legends for all plots are the same as for those in panel (b).

We aimed to capture this process at a basic, non-embedded level by creating a model that constructs a weighted network by adding pockets of densely connected nodes to the network; we call the model the random m-clique model (Fig. 5d). In this model, we choose a random set of m nodes, increase all edge weights between these m nodes by 1, and repeat this process until a desired network density is reached (see Methods). We record the persistent homology of these models and their added noise for varying values of ρ_T and m in Fig. A8, and we show the m = 25 case in Fig. 5e. We find that for parameter values near m = 25, the added noise Betti peaks show an increasing pattern similar to that seen with the distance-based models. We note that the random m-clique model does not fully capture the extent to which the Betti curve peaks shift rightwards with increasing ρ_T in the distance-based models, suggesting that adding random m-cliques alone is not sufficient to completely recreate the observed phenomenon.

In sum, through the above generative graph models we have determined processes by which we can drive the structure of added noise towards any of the three observed profiles.
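The random m-clique generation rule is simple enough to state directly in code. The sketch below follows the description above (choose m nodes, increment all pairwise weights, repeat until a target density is reached); the function name and stopping criterion are our illustrative choices.

```python
import numpy as np

def random_m_clique_network(n, m, target_density, rng=None):
    """Random m-clique model (Fig. 5d): repeatedly choose m nodes at
    random and increment every edge weight among them by 1, stopping
    once the desired fraction of possible edges has nonzero weight."""
    rng = np.random.default_rng(rng)
    W = np.zeros((n, n))
    possible = n * (n - 1) // 2
    while np.count_nonzero(np.triu(W, 1)) / possible < target_density:
        nodes = rng.choice(n, size=m, replace=False)
        ix = np.ix_(nodes, nodes)
        W[ix] += 1                      # strengthen the chosen clique
    np.fill_diagonal(W, 0)              # no self-loops
    return W
```

Edges covered by many overlapping cliques accumulate large weights and hence enter the filtration early, which is what drives the clique-like early structure.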
We close with a more realistic example of added noise on networks, and highlight situations in which added noise may erroneously suggest non-existent network features. Above we considered the situation in which there exists a sharp, binary distinction between the model network and added noise sections of the filtration. The resulting Betti curves often show an obvious point at which the trends change drastically. Though nicer to study numerically, this situation is unlikely in real data. It is more likely that as we move along the filtration, adding stronger then weaker edges, the proportion of noisy edges increases until at some point the last real edge has been added and all the later (weaker) edges are noise.

We examine this overlapping noise scenario by creating network models as before, but instead of switching from only model edges to only noise edges at ρ_T (Fig. 6a, left), we now set an increasing noise interval [ρ_a, ρ_b] (Fig. 6a, right). For edges added at densities ρ < ρ_a, model network edges are added in the usual ordering. If ρ_a ≤ ρ ≤ ρ_b, then with probability p_ρ = (ρ − ρ_a)/(ρ_b − ρ_a) we choose the next edge in the filtration at random, and with probability 1 − p_ρ we choose the next edge based on the ordered model edges. For ρ > ρ_b, all further edges are chosen at random.

We show all Betti curves generated by this process in Fig. A9 and highlight a few interesting results in Fig. 6b,c. First, because we have an expectation for edge density intervals with non-zero persistent homology, persistent cavities that are born exceptionally late can be considered significant [52]. Following this concept, both the quantitative and qualitative evaluations of, for example, the uniform configuration persistent homology output shown in Fig. 6b, left, could suggest that interesting or significant features exist after ρ = 0.2. Similar inferences might be drawn from the Betti curves of the geometric configuration model in Fig. 6b, right. However, we would be wrong to assign significance to the late-born features in these plots. All of these late-born persistent cavities are generated by added noise.

Second, consider the Betti curve plots in Fig. 6c. Given the double peaks of the Betti curves [51], one could reasonably conclude that all four have a considerable modular structure. However, again we would be wrong. Three of the four are distance-based models, while only the assortative model (run without noise) actually contains distinct modules (Fig. 6c, left). The second peaks seen in the Betti curves generated by the distance-based models are solely produced by added noise, suggesting that qualitative interpretations of Betti curves arising from noisy data may be misleading.

Together, our results reveal the importance of a careful eye when interpreting the structure of weak edges in complex systems. Particularly, noise in the weak edges can falsely appear as significant structural features.

In this work we investigated the structure of random edges added to pre-existing model network edges. We determined that the existing model structure dictates the topology of its added noise, and consequently that the structure of the added noise alone carries distinguishing information about the network model. We then identified generative processes for creating the three main patterns of added noise, and finally highlighted consequences of variable noise structure for the analysis of weighted graphs.
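The gradual model-to-noise transition over [ρ_a, ρ_b] amounts to a simple rule for ordering edges in the filtration, and can be sketched as follows (a minimal sketch assuming ρ_a < ρ_b; the function name is ours):

```python
import random

def noisy_edge_order(model_edges, rho_a, rho_b, rng=None):
    """Build a filtration ordering with a gradual model-to-noise
    transition over the density interval [rho_a, rho_b] (Fig. 6a, right).

    model_edges: all edges of the complete graph, ranked strongest first.
    At density rho, the next edge is chosen uniformly at random from the
    remaining edges with probability p_rho = (rho - rho_a)/(rho_b - rho_a),
    clipped to [0, 1]; otherwise it is the next strongest model edge."""
    rng = rng or random.Random()
    remaining = list(model_edges)           # still in model order
    ordering = []
    total = len(model_edges)
    while remaining:
        rho = len(ordering) / total
        p = min(1.0, max(0.0, (rho - rho_a) / (rho_b - rho_a)))
        if rng.random() < p:
            # noise: pick any remaining edge uniformly at random
            ordering.append(remaining.pop(rng.randrange(len(remaining))))
        else:
            # model: take the next strongest edge
            ordering.append(remaining.pop(0))
    return ordering
```

Below ρ_a the ordering is purely the model ordering, above ρ_b it is purely random, and in between the two regimes are mixed with a linearly increasing noise probability.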
Implications for data analysis
The increasing use of weighted networks in applied sciences, from molecular biology [21] to transportation [59], suggests that weak edges and topological variability [60] will continue to be studied in the future. A major consequence of our results is that, given the variable structure of added noise atop networks, extreme care must be taken when analyzing weighted networks with possible noise contamination. Though here we used persistent homology to query network structure, we expect that added noise structure as viewed by many graph metrics will vary based on real network topology, as seen previously in the binary case [62, 54]. Consequently, we suggest that one consider the structure of noise dependent on one's system, as perceived by one's structural measure of choice, before assigning value to features formed by weak edges. On the other hand, our experiments suggest an avenue for improving the detection of a threshold (or range of densities) that separates data edges from noise edges [56, 63, 65]. If one knows the expected structure of added noise on one's system, then one could use this information to determine at what threshold in a new dataset the noise begins.

Figure 6: Noise overlapping real networks can complicate interpretations of network structure. (a) Previous experiments used a sharp threshold ρ_T to separate model from noise edges (left), but a more realistic scenario is to have an increasing likelihood of noisy edges over some interval [ρ_a, ρ_b] (right). (b) Betti curves from the uniform configuration model with noise interval ending at ρ_b = 0.5 (left) and the geometric configuration model with noise interval ending at ρ_b = 0.3 (right). (c) Betti curves for the (left to right) assortative model with no noise added and the squared Euclidean model with three different noise intervals. See panel (b) for legend.

Implications for theoretical work
Though motivated by problems in data analysis, our work also lays the foundation for additional interesting theoretical questions. First, one could interpret the experiments performed in this paper as querying the change of network structure caused by combining or shifting between two network models [61]. Here we combined one network model with random IID noise, but one could easily repeat these experiments with any pair of graph models. For example, how does the structure of a ring lattice combined with a modular network compare to that of a ring lattice combined with a random geometric model? Such questions could be helpful for understanding systems such as the brain that naturally switch between states [5, 12].

Second, an alternative way to interpret the patterns of added noise structure is that the model edges force or restrict the noise into a particular shape based on the topology of the empty space left by the arrangement of model edges. Following this line of thought, the random reversion class of models could be seen as having a deferential structure, in that the added noise was quickly able to revert to its natural architecture, and the other two model classes could be interpreted as having a forceful structure, which dominates the ability of the added noise to revert. Indeed, we observed that even with only 15% of edges added, the geometric configuration model has influenced the structure of the 85% of randomly added edges, so that its added noise does not follow the IID noise pattern. We leave the question of how exactly the topology of the model edges dictates (or does not dictate) the topology of the added noise for future work. Additionally, one might ask whether real-world sparse networks [6, 40] have a dominant structure that protects their architecture against random fluctuations.

Third, given a sparse network, one can use randomly added edges and the crossover bar concept to help determine geometrical properties of the network's topological cavities [42]. Specifically, since we add edges at random, generating a distribution of death times for each topological cavity would suggest a cavity geometry that is more or less susceptible to randomly dying. One would expect cavities with large minimal generators to be unlikely to die via random edge addition, whereas cavities with multiple small minimal generators (for example, a narrow tube) would likely die very soon from randomly adding edges, since there are more opportunities to tessellate such a cavity.

Noisy networks and cognition
Finally, our experiments suggest interesting directions for future work in cognitive science and neuroscience. Studying the added noise structure is equivalent to studying the empty space left by a network, making the above analyses particularly interesting for systems in which the sparsity of edges is a feature. For example, in one's brain network the structure of both the present edges and the empty space changes over development [15]. Specifically, network edges are pruned as a person ages and learns [53], which in turn alters the structure of the non-edges in their network [39]. Is the structure of the empty space in brain networks more or less important than the structure of those edges that exist? Finally, in cognitive science one could interpret learning as beginning with a noisy knowledge network in which some edges incorrectly connect disconnected concepts. Learning would then proceed by rewiring that network to a final, correct knowledge network [34, 37]. Studying this phenomenon is effectively the reverse of the experiments presented here, in which one begins with a noisy network and ends with an expected model network.
Conclusion
In conclusion, our work shows that the persistent homology of noise added to networks varies based on the real network topology. Additionally, we find generative network rules that produce networks supporting differing structures of noise. Finally, our results offer a reason to examine how the structure of noise added to a real network may influence the structural measure in question, in order to make real features present in weak edges clearly distinguishable from features created by noise alone.
Computations were performed in julia, with the exception of the classification experiments, which were performed in MATLAB. We use the Eirene software [27] for all persistent homology computations.
We chose weighted network models that show a variety of real-world properties, including the structural feature of modularity and the physical feature of weights that decrease with distance. For all models the final graph contained N = 70 nodes, and 500 replicates were created. We chose these values for nodes and replicates to balance topological richness, reliable estimates of variance, and computation time. If a model yielded non-unique graph weights, random noise was added such that all edge weights would be unique and that the ranking of edges with unique edge weights would remain the same. See Section 10.1.5 of the Supplementary Information and Figs. A1, A2, and A3 for more details. We separate the descriptions of the models into two sections: distance-agnostic graph models and distance-based graph models.

Distance-agnostic graph models.
The following models are created without any notion of a formal distance between nodes.

• Assortative. The assortative model was constructed following an implementation of the weighted stochastic block model (WSBM) [2] in which four high-weight blocks were positioned along the diagonal.

• Core periphery. This model was also constructed with the WSBM [2] approach, but high-weight blocks were positioned along the top and left edge of the adjacency matrix to form a core and periphery. One fourth of nodes formed the core, and the rest formed the periphery.

• Disassortative. The inverse of the assortative model, in which nodes within a community connect strongly to nodes outside of their community.

• Discrete uniform configuration. A weighted configuration model with node strengths drawn from the discrete uniform distribution.

• Dot product. Here we chose N points at random in R^dim (here dim = 3). We then weighted edges between two nodes as the dot product of the associated vectors.

• Geometric configuration. A weighted configuration model with node strengths drawn from a geometric distribution.

• IID noise. All edge weights were chosen at random from the uniform distribution on (0, 1).

Distance-based graph models.
The following models are created by first choosing N points in R^dim, then calculating a distance d(u, v) between every pair of points using the definitions below, and finally taking the reciprocal of that distance as the edge weight between the two nodes. By this process, two nodes that are close together, as determined by the distance metric, will parent an edge with a large weight, whereas nodes that are far apart will parent edges with small weights. For simplicity, we chose dim = 3 for all models.

• Cosine geometric. Given nodes v, u with associated vectors v, u ∈ R^dim, respectively, the cosine distance is d(v, u) = 1 − (v · u)/(||v|| ||u||).

• Random geometric (Euclidean distance). Given nodes v, u with associated vectors v, u ∈ R^dim, respectively, d(v, u) = sqrt( Σ_{i=1}^{dim} (v_i − u_i)^2 ).

• Ring lattice. We labeled N vertices as 1, …, N, and connected nodes in one large ring. We assigned the edge weight between nodes i and j as the inverse hop distance along this ring, and assumed that hopping is allowed only between neighboring nodes [57].

• Root mean squared deviation (RMSD). Given nodes v, u with associated vectors v, u ∈ R^dim, respectively, d(v, u) = sqrt( (1/dim) Σ_{i=1}^{dim} (v_i − u_i)^2 ).

• Squared Euclidean distance. Given nodes v, u with associated vectors v, u ∈ R^dim, respectively, d(v, u) = Σ_{i=1}^{dim} (v_i − u_i)^2.

We created three models with the intention of finding simple rules that would give rise to a particular added noise persistent homology pattern. First, the weighted probabilistic triangle graph takes one parameter p that controls triangle formation.
The goal of this model is to form a weighted network such that, when we expand to the filtration, at each step in the filtration the new edge has probability p of forming a triangle. Beginning with N nodes and 0 edges, at each step we add an edge that will create a new triangle with probability p, and otherwise add an edge at random. If we are to add a new triangle, we check to see whether there are any open triangles in the graph (that is, where two edges connect three nodes); if there are no open triangles, we add an edge at random. If there are open triangles, we pick one open triangle to fill with probability proportional to the product of the two edge weights of the open triangle edges. The new edge is assigned a weight lower than any previously added edge.

The second and third models are complementary to the first. In the second model, the random m-clique networks take one parameter m that controls the size of cliques to be added. Beginning with an empty network, we randomly choose m nodes and add a value of 1 to each edge weight connecting the m nodes. This process repeats until no more than 12% of edges were empty. In the third model, we create a weighted clique model in which, throughout the filtration, each new edge contributes to forming one growing clique. For example, once three edges have been added the graph is a 3-clique, the first 6 edges will form a 4-clique, the first 10 edges added will form a 5-clique, and so on.

Persistent homology [18, 43] measures the birth and death of persistent cavities that arise and evolve throughout a sequence of simplicial complexes in which simplices may be added at each step (a filtered simplicial complex). Here we form this sequence from a weighted network by creating a graph filtration

G_1 ⊆ G_2 ⊆ · · · ⊆ G_{|E|},    (1)

where G_i is the binary graph containing the edges with the i highest weights in the weighted network, and then taking the clique complex of each G_i [23, 45, 29, 30]. We compute the persistent homology using the Eirene [27] package in julia.
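The rules of the weighted probabilistic triangle graph can be made concrete in a short sketch. The paper's implementation is in julia; the Python below is our own illustrative reimplementation of the stated rules, and the function name and the decreasing-integer weighting scheme (any scheme assigning each new edge a weight below all previous edges would do) are our choices, not the paper's.

```python
import itertools
import random

def probabilistic_triangle_graph(n, p, seed=0):
    """Sketch of the weighted probabilistic triangle model: with probability
    p the new edge closes an open triangle (chosen with probability
    proportional to the product of the two existing edge weights); otherwise
    the new edge is chosen at random. Each new edge gets a weight strictly
    lower than all previous edges, so edge order equals filtration order."""
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(range(n), 2))
    weights = {}                      # (u, v) with u < v -> weight
    next_weight = len(all_pairs)      # strictly decreasing integer weights

    def open_triangles():
        # Pairs of edges sharing one node whose closing edge is absent.
        tris = []
        for (a, b), (c, d) in itertools.combinations(weights, 2):
            shared = {a, b} & {c, d}
            if len(shared) == 1:
                y, z = ({a, b} | {c, d}) - shared
                closing = (min(y, z), max(y, z))
                if closing not in weights:
                    tris.append((closing, weights[(a, b)] * weights[(c, d)]))
        return tris

    while len(weights) < len(all_pairs):
        new_edge = None
        if rng.random() < p:
            tris = open_triangles()
            if tris:
                edges, scores = zip(*tris)
                new_edge = rng.choices(edges, weights=scores, k=1)[0]
        if new_edge is None:  # no triangle chosen (or none available)
            new_edge = rng.choice([e for e in all_pairs if e not in weights])
        weights[new_edge] = next_weight
        next_weight -= 1
    return weights
```

The quadratic open-triangle scan keeps the sketch readable; it is only practical for small n.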
See the Supplementary Methods in the Appendix for more details.

Following Ref. [1], we use the three barcode summaries listed below. Intuitively, each returns a description of the amount of persistent homology in each dimension, but they vary by their weighting of each bar [b_l, d_l) of the barcode.

• Betti bar, β̄_k. Let M be the total number of persistent cavities in dimension k. Then

β̄_k = Σ_{l=1}^{M} (d_l − b_l).

The β̄_k value sums the lifetimes of all bars in dimension k.

• Mu bar, μ̄_k. Let M be the total number of persistent cavities in dimension k. Then

μ̄_k = Σ_{l=1}^{M} b_l (d_l − b_l).

The μ̄_k value scales each bar's lifetime by the birth time and then sums these weighted lifetimes.

• Nu bar, ν̄_k. Let M be the total number of persistent cavities in dimension k and L the number of edges in the complete graph. Then

ν̄_k = Σ_{l=1}^{M} (L − d_l)(d_l − b_l).

The ν̄_k value scales each bar's lifetime based on the death time of that bar, and then sums the scaled lifetimes.

We seek a simple and generative method to classify our networks to ensure flexibility in incorporating new network models or data that may be generated from a different underlying distribution. As such, we model the distribution of features for each network model using a multivariate Gaussian model, and collect these models into a Gaussian mixture model [47]. For prediction, we use features from a held-out test set of network features, and assign a predicted label based on the class that generates the highest posterior probability. These predicted labels are then used to generate the confusion matrices of the main text.
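The per-class Gaussian classifier described above reduces to a few lines. The original classification code was written in MATLAB; this Python sketch uses our own hypothetical function names, and the `ridge` argument stands in for the paper's small bias added to each covariance matrix.

```python
import numpy as np

def fit_class_gaussians(features_by_class, ridge=1e-3):
    """Fit one multivariate Gaussian per network model class.

    `features_by_class` maps a class label to an (n_samples, n_features)
    array of barcode-summary features. A small ridge on the covariance
    keeps each Sigma_j positive definite, as in the paper."""
    models = {}
    for label, X in features_by_class.items():
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
        models[label] = (mu, cov)
    return models

def predict(models, x):
    """With equal mixture weights, the class of maximum posterior
    probability is simply the class with the largest Gaussian
    log-density at x."""
    def log_density(mu, cov):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (diff @ np.linalg.solve(cov, diff)
                       + logdet + len(x) * np.log(2 * np.pi))
    return max(models, key=lambda c: log_density(*models[c]))
```

Working in log densities avoids the underflow that the explicit Gaussian formula would produce for high-dimensional, well-separated features.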
For each network instantiation i belonging to network model j, we collect a 12-dimensional vector x_ij of features β̄_k, μ̄_k, and ν̄_k for k = 1, …, 4. Then, we compute the mean μ_j and covariance Σ_j of the features for the N_train = 250 networks belonging to model j, to yield the Gaussian probability distribution of network model j as

G_j(x | μ_j, Σ_j) = (1 / sqrt( (2π)^12 |Σ_j| )) exp( −(1/2) (x − μ_j)^⊤ Σ_j^{−1} (x − μ_j) ).

Because some features are all 0 at some thresholds, we bias the covariance matrix by adding a small multiple of the identity matrix to ensure Σ_j is positive definite. Next, we collect these models together into the following equally weighted Gaussian mixture:

p(x) = Σ_{j=1}^{J} (1/J) G_j(x | μ_j, Σ_j),

where J = 12 is the total number of network models we used. Finally, for a test set of N_test = 250 networks for each model class j (for a total of J N_test = 3000 network instantiations), we computed the posterior probability for each of the mixture components, and assigned a label to each network corresponding to the class of its maximum posterior probability. While the confusion matrices represent one selection of training and testing sets, we measure the distribution of performance by taking the total classification accuracy for each feature set across 100 random and disjoint sets of training and testing sets (see Fig. A10).

Below we detail the inputs to each of the eight classification experiments performed in this work. For each, we take a specific subset of barcodes and use them to compute β̄_k, μ̄_k, and ν̄_k. Recall that the barcode in dimension k is a set of pairs (b_l, d_l) corresponding to the birth and death time of persistent cavity l, respectively.

• Added noise (in context).
We computed persistent homology on the entire graph sequence from ρ = 0 to ρ = 1, where at ρ = ρ_T edge additions became random. To understand the information contained in the added noise portion of the barcode, we replaced any b_l, d_l < ρ_T with (ρ_T |E| + 1)/|E| (the density at which the first noisy edge is added). Any cavity with b_l = d_l will contribute 0 to each of the three summary statistics. Any bar with b_l < ρ_T and d_l > ρ_T will effectively be shortened so that the persistent cavity is born at (ρ_T |E| + 1)/|E|.

• Model weights before ρ_T. From the barcodes computed using the full filtration, we replaced any b_l, d_l > ρ_T with ρ_T. Any bar with b_l < ρ_T and d_l > ρ_T will effectively be shortened so that it dies at ρ_T.

• All barcodes. Here we kept the barcodes as they were originally calculated.

• Added noise (isolated). Persistent homology was calculated on the added noise graph alone, so we used these barcodes for the classification.

• Crossover bars. We filtered the added noise barcode to only include bars with original birth b_l < ρ_T and original death d_l > ρ_T. Because we begin with the added noise portion of the barcode, the smallest b_l possible is (ρ_T |E| + 1)/|E|.

• Noise exclusive bars. We filtered the barcode to only include bars with b_l > ρ_T and d_l > ρ_T.

• Randomized model edge weights. First, we randomized the ordering of the model edges by randomizing their weights. We then computed the persistent homology on this randomized model, and finally we replaced every b_l, d_l > ρ_T with ρ_T, so that any bar with b_l < ρ_T and d_l > ρ_T was effectively shortened so that it would die at ρ_T.

All data can be generated using the open code hosted at https://github.com/asizemore/Noise_and_TDA. All code to generate data and perform analyses can be found at https://github.com/asizemore/Noise_and_TDA. Interactive visualizations are hosted at https://asizemore.github.io/noise_and_tda_supplement/.

The authors especially thank Dr. Erin Teich, Dr. Linden Parkes, Darrick Lee, Dr. Jakob Hansen, Zoe Cooperband, and Dr. Lia Papadopolous for their helpful comments and insightful feedback. This work was funded by the Army Research Office through contract number W911NF-16-1-0474. DSB and ASB also acknowledge additional support from the John D. and Catherine T. MacArthur Foundation, the Alfred P. Sloan Foundation, the ISI Foundation, the Paul Allen Foundation, the Army Research Laboratory (W911NF-10-2-0022), the Army Research Office (Bassett-W911NF-14-1-0679, DCIST-W911NF-17-2-0181), and the National Science Foundation (NSF PHY-1554488, BCS-1631550, and IIS-1926757). The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agencies.
Recent work in several fields of science has identified a bias in citation practices such that papers from women and other minority scholars are under-cited relative to the number of such papers in the field [38, 36, 9, 16, 17]. Here we sought to proactively consider choosing references that reflect the diversity of the field in thought, form of contribution, gender, race, ethnicity, and other factors. First, we obtained the predicted gender of the first and last author of each reference by using databases that store the probability of a first name being carried by a woman [17, 64]. By this measure (and excluding self-citations to the first and last authors of our current paper), our references contain 14.79% woman(first)/woman(last), 6.37% man/woman, 23.63% woman/man, and 55.22% man/man. This method is limited in that a) names, pronouns, and social media profiles used to construct the databases may not, in every case, be indicative of gender identity and b) it cannot account for intersex, non-binary, or transgender people.

References

[1] Aaron Adcock, Erik Carlsson, and Gunnar Carlsson. The ring of algebraic functions on persistence bar codes.
Preprint, http://comptop.stanford.edu/u/preprints/multitwo, 2012.
[2] Christopher Aicher, Abigail Z Jacobs, and Aaron Clauset. Adapting the stochastic block model to edge-weighted networks. arXiv preprint arXiv:1305.5782, 2013.
[3] Aaron F Alexander-Bloch, Nitin Gogtay, David Meunier, Rasmus Birn, Liv Clasen, Francois Lalonde, Rhoshel Lenroot, Jay Giedd, and Edward T Bullmore. Disrupted modularity and local connectivity of brain functional networks in childhood-onset schizophrenia. Frontiers in Systems Neuroscience, 4:147, 2010.
[4] Keri L Anderson, Jeffrey S Anderson, Sourabh Palande, and Bei Wang. Topological data analysis of functional MRI connectivity in time and space domains. In International Workshop on Connectomics in Neuroimaging, pages 67–77. Springer, 2018.
[5] Kanika Bansal, Javier O Garcia, Steven H Tompson, Timothy Verstynen, Jean M Vettel, and Sarah F Muldoon. Cognitive chimera states in human brain networks. Science Advances, 5(4):eaau8535, 2019.
[6] D S Bassett, J A Brown, V Deshpande, J M Carlson, and S T Grafton. Conserved and variable architecture of human white matter connectivity. NeuroImage, 54(2):1262–1279, 2011.
[7] Danielle S Bassett, Brent G Nelson, Bryon A Mueller, Jazmin Camchong, and Kelvin O Lim. Altered resting state complexity in schizophrenia. NeuroImage, 59(3):2196–2207, 2012.
[8] Cécile Bordier, Carlo Nicolini, and Angelo Bifone. Graph analysis and modularity of brain functional connectivity networks: searching for the optimal threshold. Frontiers in Neuroscience, 11:441, 2017.
[9] Neven Caplar, Sandro Tacchella, and Simon Birrer. Quantitative evaluation of gender bias in astronomical publications from citation counts. Nature Astronomy, 1(6):0141, 2017.
[10] Gunnar Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46(2):255–308, 2009.
[11] Fan Chung and Xing Peng. Decomposition of random graphs into complete bipartite graphs. SIAM Journal on Discrete Mathematics, 30(1):296–310, 2016.
[12] Eli J Cornblath, Arian Ashourvan, Jason Z Kim, Richard F Betzel, Rastko Ciric, Azeez Adebimpe, Graham L Baum, Xiaosong He, Kosha Ruparel, Tyler M Moore, et al. Temporal sequences of brain activity at rest are constrained by white matter structure and modulated by cognitive demands. Communications Biology, 3(1):1–12, 2020.
[13] Carina Curto. What can topology tell us about the neural code? Bulletin of the American Mathematical Society, 54(1):63–78, 2017.
[14] Marcel A de Reus and Martijn P van den Heuvel. Estimating false positives and negatives in brain networks. NeuroImage, 70:402–409, 2013.
[15] Emily L Dennis, Neda Jahanshad, Katie L McMahon, Greig I de Zubicaray, Nicholas G Martin, Ian B Hickie, Arthur W Toga, Margaret J Wright, and Paul M Thompson. Development of brain structural connectivity between ages 12 and 30: a 4-Tesla diffusion imaging study in 439 adolescents and adults. NeuroImage, 64:671–684, 2013.
[16] Michelle L Dion, Jane Lawrence Sumner, and Sara McLaughlin Mitchell. Gendered citation patterns across political science and social science methodology fields. Political Analysis, 26(3):312–327, 2018.
[17] Jordan D Dworkin, Kristin A Linn, Erin G Teich, Perry Zurn, Russell T Shinohara, and Danielle S Bassett. The extent and drivers of gender imbalance in neuroscience reference lists. arXiv preprint arXiv:2001.01002, 2020.
[18] Herbert Edelsbrunner and John Harer. Persistent homology: a survey. Contemporary Mathematics, 453:257–282, 2008.
[19] Paul Erdős and Alfréd Rényi. On random graphs, I. Publicationes Mathematicae (Debrecen), 6:290–297, 1959.
[20] Noah Friedkin. A test of structural features of Granovetter's strength of weak ties theory. Social Networks, 2(4):411–422, 1980.
[21] Tova Fuller, Peter Langfelder, Angela Presson, and Steve Horvath. Review of weighted gene coexpression network analysis. In Handbook of Statistical Bioinformatics, pages 369–388. Springer, 2011.
[22] Robert Ghrist. Barcodes: the persistent topology of data. Bulletin of the American Mathematical Society, 45(1):61–75, 2008.
[23] Chad Giusti, Eva Pastalkova, Carina Curto, and Vladimir Itskov. Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences, 112(44):13455–13460, 2015.
[24] Alexandros Goulas, Alexander Schaefer, and Daniel S Margulies. The strength of weak connections in the macaque cortico-cortical network. Brain Structure and Function, 220(5):2939–2951, 2015.
[25] Mark S Granovetter. The strength of weak ties. American Journal of Sociology, 78(6):1360–1380, 1973.
[26] Aric Hagberg, Pieter Swart, and Daniel S Chult. Exploring network structure, dynamics, and function using NetworkX. Technical report, Los Alamos National Lab. (LANL), Los Alamos, NM (United States), 2008.
[27] Gregory Henselman and Robert Ghrist. Matroid filtrations and computational persistent homology. arXiv preprint arXiv:1606.00199, 2016.
[28] Kathryn Hess. Topological adventures in neuroscience. In Topological Data Analysis, pages 277–305. Springer, 2020.
[29] Danijela Horak, Slobodan Maletić, and Milan Rajković. Persistent homology of complex networks. Journal of Statistical Mechanics: Theory and Experiment, 2009(03):P03034, 2009.
[30] Matthew Kahle. Topology of random clique complexes. Discrete Mathematics, 309(6):1658–1671, 2009.
[31] Matthew Kahle. Random geometric complexes. Discrete & Computational Geometry, 45(3):553–573, 2011.
[32] Matthew Kahle. Sharp vanishing thresholds for cohomology of random flag simplicial complexes. 2012.
[33] Lida Kanari, Adélie Garin, and Kathryn Hess. From trees to barcodes and back again: theoretical and statistical perspectives. arXiv preprint arXiv:2010.11620, 2020.
[34] Christopher W Lynn, Ari E Kahn, Nathaniel Nyema, and Danielle S Bassett. Abstract representations of events arise from mental errors in learning and memory. Nature Communications, 11(1):1–12, 2020.
[35] Xiaoke Ma and Lin Gao. Discovering protein complexes in protein interaction networks via exploring the weak ties effect. BMC Systems Biology, 6(S1):S6, 2012.
[36] Daniel Maliniak, Ryan Powers, and Barbara F Walter. The gender citation gap in international relations. International Organization, 67(4):889–922, 2013.
[37] André Melo and Heiko Paulheim. Detection of relation assertion errors in knowledge graphs. In Proceedings of the Knowledge Capture Conference, pages 1–8, 2017.
[38] Sara McLaughlin Mitchell, Samantha Lange, and Holly Brus. Gendered citation patterns in international relations journals. International Studies Perspectives, 14(4):485–492, 2013.
[39] Sarah E Morgan, Simon R White, Edward T Bullmore, and Petra E Vértes. A network neuroscience approach to typical and atypical brain development. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 3(9):754–766, 2018.
[40] Cian Naik, François Caron, and Judith Rousseau. Sparse networks with core-periphery structure. arXiv preprint arXiv:1910.09679, 2019.
[41] Mark EJ Newman. The structure and function of complex networks. SIAM Review, 45(2):167–256, 2003.
[42] Ippei Obayashi. Volume-optimal cycle: tightest representative cycle of a generator in persistent homology. SIAM Journal on Applied Algebra and Geometry, 2(4):508–534, 2018.
[43] Nina Otter, Mason A Porter, Ulrike Tillmann, Peter Grindrod, and Heather A Harrington. A roadmap for the computation of persistent homology. EPJ Data Science, 6(1):17, 2017.
[44] Alice Patania, Giovanni Petri, and Francesco Vaccarino. The shape of collaborations. EPJ Data Science, 6(1):18, 2017.
[45] Giovanni Petri, Martina Scolamiero, Irene Donato, and Francesco Vaccarino. Topological strata of weighted complex networks. PLoS One, 8(6):e66506, 2013.
[46] Abolfazl Ramezanpour, V Karimipour, and Alireza Mashaghi. Generating correlated networks from uncorrelated ones. Physical Review E, 67(4):046107, 2003.
[47] Douglas A Reynolds. Gaussian mixture models. Encyclopedia of Biometrics, 741, 2009.
[48] Vanessa Robins, Mohammad Saadatfar, Olaf Delgado-Friedrichs, and Adrian P Sheppard. Percolating length scales from topological persistence analysis of micro-CT images of porous materials. Water Resources Research, 52(1):315–329, 2016.
[49] Emiliano Santarnecchi, Giulia Galli, Nicola Riccardo Polizzotto, Alessandro Rossi, and Simone Rossi. Efficiency of weak brain connections support general cognitive functioning. Human Brain Mapping, 35(9):4566–4582, 2014.
[50] M Ángeles Serrano, Marián Boguñá, and Alessandro Vespignani. Extracting the multiscale backbone of complex weighted networks. Proceedings of the National Academy of Sciences, 106(16):6483–6488, 2009.
[51] Ann Sizemore, Chad Giusti, and Danielle S Bassett. Classification of weighted networks through mesoscale homological features. Journal of Complex Networks, page cnw013, 2016.
[52] Ann E. Sizemore, Chad Giusti, Ari Kahn, Jean M. Vettel, Richard F. Betzel, and Danielle S. Bassett. Cliques and cavities in the human connectome. Journal of Computational Neuroscience, Nov 2017.
[53] Alexander H Stephan, Ben A Barres, and Beth Stevens. The complement system: an unexpected role in synaptic pruning during development and disease. Annual Review of Neuroscience, 35:369–389, 2012.
[54] Martijn P van den Heuvel, Siemon C de Lange, Andrew Zalesky, Caio Seguin, BT Thomas Yeo, and Ruben Schmidt. Proportional thresholding in resting-state fMRI functional connectivity networks and consequences for patient-control connectome studies: issues and recommendations. NeuroImage, 152:437–449, 2017.
[55] Bernadette C M van Wijk, Cornelis J Stam, and Andreas Daffertshofer. Comparing brain networks of different size and connectivity density using graph theory. PLoS One, 5(10):e13701, 2010.
[56] Bo Wang, Armin Pourshafeie, Marinka Zitnik, Junjie Zhu, Carlos D Bustamante, Serafim Batzoglou, and Jure Leskovec. Network enhancement as a general method to denoise weighted biological networks. Nature Communications, 9(1):1–8, 2018.
[57] Duncan J Watts and Steven H Strogatz. Collective dynamics of 'small-world' networks. Nature, 393(6684):440–442, 1998.
[58] Eugene P Wigner. On the distribution of the roots of certain symmetric matrices. Annals of Mathematics, pages 325–327, 1958.
[59] Yingying Xing, Jian Lu, and Shendi Chen. Weighted complex network analysis of Shanghai rail transit system. Discrete Dynamics in Nature and Society, 2016, 2016.
[60] Lin Yan, Yusu Wang, Elizabeth Munch, Ellen Gasparovic, and Bei Wang. A structural average of labeled merge trees for uncertainty visualization. IEEE Transactions on Visualization and Computer Graphics, 26(1):832–842, 2019.
[61] Jiaxuan You, Rex Ying, Xiang Ren, William L Hamilton, and Jure Leskovec. GraphRNN: generating realistic graphs with deep auto-regressive models. arXiv preprint arXiv:1802.08773, 2018.
[62] Andrew Zalesky, Alex Fornito, Luca Cocchi, Leonardo L Gollo, Martijn P van den Heuvel, and Michael Breakspear. Connectome sensitivity or specificity: which is more important? NeuroImage, 142:407–420, 2016.
[63] An Zeng and Giulio Cimini. Removing spurious interactions in complex networks. Physical Review E, 85(3):036101, 2012.
[64] Dale Zhou, Max A Bertolero, Jennifer Stiso, Eli J Cornblath, Erin G Teich, Ann S Blevins, Virtualmario, Christopher Camp, Jordan D Dworkin, and Danielle S Bassett. Gender diversity statement and code notebook v1.1. https://github.com/dalejn/cleanBib.
[65] Fang Zhou, Sébastien Mahler, and Hannu Toivonen. Simplification of networks by edge pruning. In Bisociative Knowledge Discovery, pages 179–198. Springer, 2012.
[66] Afra Zomorodian and Gunnar Carlsson. Computing persistent homology. Discrete & Computational Geometry, 33(2):249–274, 2005.
Persistent homology measures higher-order structure in weighted networks by detecting topological cavities that form at different edge weight (or density) thresholds. Those cavities that exist across thresholds are called persistent cavities. The three main steps of persistent homology for weighted graphs as performed in this paper are (i) creating a sequence of binary graphs from the weighted network, (ii) transforming this sequence of binary graphs into a sequence of simplicial complexes, and (iii) detecting persistent cavities that form and collapse throughout the sequence [23, 45, 29, 30]. We discuss these three steps in more detail below, but we advise the interested reader to also consult Refs. [22, 43, 45].
Given a weighted network in which all edge weights are unique, the edge weights induce an ordering on the edges from greatest to least. We can follow this ordering to create a sequence of binary graphs with one new edge added at each step. More rigorously, we construct

G_1 ⊆ G_2 ⊆ · · · ⊆ G_{|E|},    (2)

where G_i is the binary graph containing the strongest (i.e., highest weighted) i edges in the original weighted network. In our case, we always have |E| = (N choose 2) = N(N − 1)/2, so that the last graph G_{|E|} is the complete graph on N nodes.

Next we transform our graphs into simplicial complexes so that we can detect higher-order structure. A simplicial complex is similar to a graph in that it records connectivity between nodes, but in a simplicial complex a set of k nodes can participate in a polyadic relation called a simplex. Specifically, a simplicial complex is a set K = (V, S) where V is the vertex set and S is the set of simplices, such that if σ ⊂ V is a simplex (σ ∈ S), then for any τ ⊆ σ, τ is also a simplex (τ ∈ S). Geometrically, a k-simplex is the convex hull of k + 1 affinely positioned points, which we interpret as a building block on k + 1 nodes within the complex. Intuitively, a 0-simplex is a node, a 1-simplex an edge, a 2-simplex a filled triangle, a 3-simplex a filled tetrahedron, and so on.

We can transform a binary graph G into a simplicial complex X(G) by adding a k-simplex between k + 1 nodes whenever the k + 1 nodes are all-to-all connected (form a (k + 1)-clique). The constructed X(G) is called the clique complex or flag complex of G. Now that we can take an arbitrary binary graph G and create from it a simplicial complex X(G), we perform this step on all binary graphs in Eq. 2. The result is a sequence of simplicial complexes

X(G_1) ⊆ X(G_2) ⊆ · · · ⊆ X(G_{|E|}),    (3)

where each X(G_i) is the clique complex of G_i.
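Steps (i) and (ii) can be made concrete in a few lines of code. This Python sketch is ours (the paper's computations use Eirene in julia); the function names are hypothetical, and we truncate the complex at a maximum dimension since a complete graph's clique complex is otherwise exponentially large.

```python
import itertools

def filtration_order(weighted_edges):
    """Step (i): order edges from strongest to weakest so that G_i
    contains the i highest-weight edges, as in Eq. 2."""
    return [e for e, w in sorted(weighted_edges.items(), key=lambda kv: -kv[1])]

def clique_complex(nodes, edges, max_dim=3):
    """Step (ii): build the clique (flag) complex X(G) of a binary graph.
    Every (k + 1)-clique becomes a k-simplex; we stop at max_dim."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    simplices = {0: [(v,) for v in nodes],
                 1: [tuple(sorted(e)) for e in edges]}
    for k in range(2, max_dim + 1):
        # A (k + 1)-tuple is a simplex iff every pair within it is an edge.
        simplices[k] = [
            s for s in itertools.combinations(sorted(nodes), k + 1)
            if all(b in adj[a] for a, b in itertools.combinations(s, 2))
        ]
    return simplices
```

Applying `clique_complex` to each prefix of `filtration_order` yields the sequence of Eq. 3, on which persistent homology is computed.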
Beginning with one simplicial complex X(G_i), a topological cavity of dimension k > 0 is an arrangement of k-simplices that is not filled in by (k + 1)-simplices. A cavity in dimension 1 could manifest as a ring of 1-simplices (edges), the interior of a loop of 2-simplices such that the inside is empty, or even a tube of 2-simplices. In dimension 2, a topological cavity could similarly manifest as a shell of 2-simplices (such as an octahedron), or perhaps as a shell of 2-simplices with many 3-simplices attached on the outside. Cavities in higher dimensions are defined analogously.

Persistent homology tracks the cavities throughout the sequence of simplicial complexes. Consider the step X(G_i) → X(G_{i+1}), where the → indicates the mapping sending nodes and simplices in X(G_i) to their counterparts in X(G_{i+1}). With the addition of the (i + 1)th edge, three non-exclusive topological situations could occur. First, the edge addition could create a new k-simplex that completes the shell of a k-cavity. In other words, this new edge could form a k-cavity. Second, some k-cavity that existed in X(G_i) may still be a cavity (non-tessellated shell), although it might be smaller given the addition of simplices added with the new edge. In this case we say that the k-cavity persists from X(G_i) to X(G_{i+1}). Third and finally, the new edge may add simplices such that a cavity that existed in X(G_i) becomes tessellated, or filled in, in X(G_{i+1}). Here we say that edge i + 1 killed the k-cavity. When we extend this process across the entire sequence of simplicial complexes from Eq. 3, we recover the formation, persistence, and tessellation of topological cavities. The collection of persistent topological cavities is called the persistent homology of the weighted network.

For any persistent cavity, the edge density at which the cavity first appears is called the birth, and the edge density at which the cavity is killed is called the death. The persistent homology records the (birth, death) pair of every persistent cavity.
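Once a barcode is in hand as a list of (birth, death) pairs, both the number of cavities alive at a given density and the three barcode summaries used in the main text reduce to one-line sums. A sketch (function names are ours), using the half-open convention [b_l, d_l) for each bar:

```python
def betti_curve(barcode, rho):
    """beta_k(rho): number of persistent cavities alive at density rho,
    counting each bar over the half-open interval [b, d)."""
    return sum(1 for b, d in barcode if b <= rho < d)

def betti_bar(barcode):
    """Betti bar: sum of all bar lifetimes (d_l - b_l)."""
    return sum(d - b for b, d in barcode)

def mu_bar(barcode):
    """Mu bar: lifetimes weighted by birth, sum of b_l * (d_l - b_l)."""
    return sum(b * (d - b) for b, d in barcode)

def nu_bar(barcode, L):
    """Nu bar: lifetimes weighted by (L - d_l), with L the number of
    edges in the complete graph."""
    return sum((L - d) * (d - b) for b, d in barcode)
```

Note that a zero-length bar (b_l = d_l) contributes 0 to all three summaries, matching the treatment of such bars in the classification experiments.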
The collection of persistent k-cavities is called the barcode and is visually represented as a sequence of horizontal lines in which each line represents one persistent cavity and extends from the persistent cavity's birth to its death.

Barcodes as mathematical objects lack certain helpful properties, such as a unique mean, so we often represent the persistent homology in dimension k as a Betti curve β_k. At edge density ρ, β_k(ρ) is the number of persistent cavities alive at density ρ. Though Betti curves do not retain the persistence information of the barcodes, they are helpful for visualization and interpretation of the persistent homology of graph models because we can compute the mean and standard error across replicates.

To supplement our descriptions in the Methods section, we here provide extended details regarding our generation of network models. As stated above, every network contained N = 70 nodes, and from each model network we generated 500 replicates. If the edge weights were not unique, random noise was added to ensure uniqueness while retaining the relative ordering of edges.

IID noise
Each edge weight was drawn independently and uniformly at random from [0, 1].

Weighted stochastic block model networks
Using code from [2] (rewritten in Julia), we created the assortative, core-periphery, and disassortative models. All three models consisted of different arrangements of 16 blocks describing edges among four groups of N/4 nodes (with group sizes adjusted when N is not divisible by 4). Each block was either a high-weight or a low-weight block. The high-weight blocks consisted of entries drawn from a normal distribution with parameters µ = 20, σ = 5 for the assortative and disassortative models, and µ = 15, σ = 5 for the core-periphery model. The low-weight blocks contained entries drawn from a normal distribution with µ = 10 and σ = 5.

High- and low-weight blocks were arranged based on the specific model type. For the assortative model, four high-weight blocks were placed along the diagonal, and the rest of the matrix was filled with low-weight blocks. For the disassortative model, four low-weight blocks were placed on the diagonal and the rest of the matrix was filled with high-weight blocks. Finally, the core-periphery model contained high-weight blocks along the top row (of four rows) and along the first column (of four columns) of blocks, while the rest was filled with low-weight blocks. Additionally, see Fig. A1 for block arrangements.

Configuration models
For both the geometric configuration and discrete uniform configuration models, we formed networks using Julia code inspired by Ref. [41] and the NetworkX implementation [26]. To create a network, we first drew node strengths from a distribution and set this vector as the target node strength vector; for the geometric configuration model, the chosen distribution was a geometric distribution with p = 0.01 then scaled by 100, whereas for the discrete uniform configuration model, the chosen distribution was a discrete uniform distribution with parameters a = 0, b = 1000. Beginning with an empty graph, we repeatedly joined pairs of randomly chosen nodes with edges of weight 1, counting l connections formed between a pair of nodes as one edge with weight l. This process continued until the strength of each node in the network matched the target node strength defined by the target node strength vector. See Refs. [41, 26] for more details and https://github.com/asizemore/Noise_and_TDA for code.

Dot product
To create the dot product network, we chose N points uniformly at random from ℝ^dim. Here, N = 70 and dim = 3. Each point v⃗ chosen in ℝ^dim was associated with a node v in the network. We then assigned the edge weight between v and u to be v⃗ · u⃗.

Distance-based models
The five distance-based models rely on different distance metrics to determine edge weights. Below we detail each in turn. For all except the weighted ring lattice, the model begins with choosing N points v⃗ uniformly at random from ℝ^dim, and associating these points (v⃗) with nodes (v). For the results reported in this paper, dim = 3. Each model has an associated distance metric d(v⃗, u⃗), and the edge weight between nodes v, u for the model is assigned to be 1/d(v⃗, u⃗). See below for the chosen distance functions.

• For the cosine geometric model, d(v⃗_i, v⃗_j) = 1 − (v⃗_i · v⃗_j) / (||v⃗_i|| ||v⃗_j||).

• For the random geometric model, d(v⃗, u⃗) = sqrt( Σ_{i=1}^{dim} (v_i − u_i)^2 ).

• For the root mean squared deviation model, d(v⃗, u⃗) = sqrt( (1/dim) Σ_{i=1}^{dim} (v_i − u_i)^2 ).

• For the squared Euclidean distance model, d(v⃗, u⃗) = Σ_{i=1}^{dim} (v_i − u_i)^2.

Finally, forming the weighted ring lattice network can be imagined as placing N nodes uniformly around a circle, and then connecting node pairs with edges weighted according to their distance. Specifically, given nodes v_1, ..., v_N, we connect all pairs (v_i, v_j), i < j, by an edge with weight 1 if j − i = 1 or i = 1, j = N. The distance d(v_i, v_j) is then the hop distance between nodes v_i and v_j on this ring. For the final weighted network, we assign the edge weight between v_i and v_j to equal 1/d(v_i, v_j). For example, an edge between two nodes that are two hops apart on the ring has a weight of 1/2.

Below we include additional plots and results that are not shown in the main text, but that serve to support our observations and inferences.
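Assuming the weight rule above (weight = 1/d), the distance functions can be sketched as follows in Python (the paper's code is in Julia; the helper names here are our own):

```python
import math

def cosine_d(v, u):
    """Cosine geometric model: 1 minus the cosine similarity."""
    dot = sum(a * b for a, b in zip(v, u))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return 1 - dot / (norm(v) * norm(u))

def euclidean_d(v, u):
    """Random geometric model: Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, u)))

def rmsd_d(v, u):
    """Root mean squared deviation between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, u)) / len(v))

def sq_euclidean_d(v, u):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(v, u))

def ring_weight(i, j, N):
    """Weighted ring lattice: edge weight 1 / (hop distance) between
    nodes i and j (numbered 1..N) on an N-cycle."""
    hops = min(abs(i - j), N - abs(i - j))
    return 1.0 / hops
```

For instance, on the N = 70 ring, nodes v_1 and v_70 are one hop apart and receive weight 1, while nodes two hops apart receive weight 1/2.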
In order to investigate added noise from a variety of perspectives, we chose network models with a wide range of properties. In Figs. A1, A2, and A3 we show an example weighted network created from each model (top). We additionally show the strength distribution (middle) and edge weight distribution (bottom) of one weighted network serving as an example of the given model.

Figure A1:
Network properties for the first four of twelve models.
For each model, we show the adjacency matrix (top), node strength distribution (middle), and edge weight distribution (bottom) of an example graph.

Figure A2: (Continued from previous) Network properties for the middle four of twelve models.
For each model, we show the adjacency matrix (top), node strength distribution (middle), and edge weight distribution (bottom) of an example graph.

Figure A3: (Continued from previous) Network properties for the final four of twelve models.
For each model, we show the adjacency matrix (top), node strength distribution (middle), and edge weight distribution (bottom) of an example graph.

Figure A4:
Betti curves for all models at a fixed value of ρ_T, with the mean ± standard deviation shown with shaded regions. See Fig. 3 for more details.

To supplement Fig. 3, we include the Betti curves for all models at the value of ρ_T used for all experiments discussed in the main text. Given the large number of replicates, experimental setups, and values of ρ_T tested, we additionally host interactive visualizations at https://asizemore.github.io/noise_and_tda_supplement/.

In Section 3.4 we took the added noise portion of the thresholded model network runs and computed the persistent homology of the added noise network in isolation. Said another way, after constructing the gold section of Fig. 1, we took all of the gold edges to be one weighted network. If the model was thresholded at ρ_T, the isolated noise network would have density 1 − ρ_T. In the same style as Fig. 3c, we show the average isolated noise network Betti curves for all tested values of ρ_T in Fig. A5.

Figure A5: Betti curves generated from isolated noise networks across values of ρ_T. See Fig. 3 for plot details.

In Section 3.4 we also investigated how randomizing the model edge weights would affect the classification results. In Fig. A6 we show the average Betti curves generated by randomizing model weights across all tested values of ρ_T.

Figure A6: Betti curves generated from randomized model weight experiments across values of ρ_T. See Fig. 3 for plot details.

As discussed in Section 3.5, the weighted probabilistic triangle model produces a network in which the value of the parameter p affects the likelihood of forming triangles as the filtration unfolds. We show average Betti curves across tested values of ρ_T for all investigated values of p in Fig. A7.

Figure A7: Betti curves generated from the weighted probabilistic triangle model across values of p, ρ_T. See Fig. 3 for plot details.

Also in Section 3.5, we introduced the m-clique and weighted clique models.
We show the average Betti curves for all tested values of ρ_T and m in Fig. A8.

Figure A8: Betti curves generated from the m-clique and weighted clique models across values of m, ρ_T. See Fig. 3 for plot details.

Finally, in Section 3.6 we discussed how a more realistic addition of noise to a model network could influence interpretations of Betti curves. After adding overlapping noise to all network models, we computed the persistent homology for multiple values of ρ_a, ρ_b. See Fig. A9 for the resulting Betti curves.

Figure A9: Betti curves generated from adding noise to model networks in which the probability of noise increased linearly from 0 to 1 within the interval [ρ_a, ρ_b]. All tested values of [ρ_a, ρ_b] shown. See Fig. 3 for plot details.

Classification across values of ρ_T

The classification experiments discussed above were performed for all values of ρ_T. For most experiments, confusion matrices for a single value of ρ_T are shown in the main text.

Figure A10: Classification accuracy for experiments in the main text.
Means and standard deviations of test classification accuracy across 100 random disjoint pairs of training and testing sets. Each point represents the mean test classification accuracy (over 100 sets) across all 12 network types when the classifier was trained on 250 graph instantiations (for a total of 12 × 250 = 3000 networks in the training set), and tested on a separate 250 graph instantiations (for a total of 3000 networks in the testing set), for one threshold value and for one set of features.

Finally, we plot confusion matrices for all values of ρ_T for experiments using the model weights (Fig. A11), added noise (Fig. A12), the entire filtration (Fig. A13), and the isolated added noise (Fig. A14) barcodes, in addition to those using crossover (Fig. A15) or noise-exclusive (Fig. A16) bars. We show the confusion matrices for the randomized model weights classification experiment in Fig. A17.

In Section 3.5, we discussed a growing clique model in which at each step in the filtration we expand the single clique. Said another way, the growing graph proceeds such that if we have added exactly C(n, 2) = n(n − 1)/2 edges for some n ≤ N, then the added edges form an n-clique. At these specific points in the filtration, we know that every node within the clique connects to every other node in the clique, and no other edges exist within the graph (all other nodes are isolated). This rigidity offers an opportunity to mathematically determine how noise added to the network at this point will evolve. In what follows, we will offer a brief beginning to this venture.

Suppose that we have a clique with L < N nodes, and all other M = N − L nodes are isolated. If we create the clique complex from this clique, we have one maximal (L − 1)-simplex and M maximal 0-simplices. Denote the number of k-simplices ((k + 1)-cliques) in a binary graph by f_k. From Morse theory we know that

f_k − f_{k−1} − f_{k+1} ≤ β_k ≤ f_k,

where β_k is the k-th Betti number. Therefore, determining expectations for f_k for all k will provide information about β_k and the Euler characteristic.
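As a numerical illustration of these Morse bounds, consider the boundary of an octahedron, a shell of 2-simplices with f-vector (6, 12, 8) and β_2 = 1. The small helper below (our own sketch, not from the paper's code) evaluates the bounds from an f-vector.

```python
def morse_bounds(f, k):
    """Weak Morse bounds on the k-th Betti number from the f-vector f,
    where f[k] counts k-simplices: f_k - f_{k-1} - f_{k+1} <= beta_k <= f_k."""
    fk = f[k] if k < len(f) else 0
    fkm = f[k - 1] if k >= 1 else 0
    fkp = f[k + 1] if k + 1 < len(f) else 0
    return fk - fkm - fkp, fk

# Octahedron boundary: 6 vertices, 12 edges, 8 triangles; beta_2 = 1.
lo, hi = morse_bounds([6, 12, 8], 2)
```

Here the bounds are 8 − 12 − 0 = −4 ≤ β_2 = 1 ≤ 8: valid, though loose.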
Figure A11: Classification results for all values of ρ_T using barcode summaries computed from bars in the model weights section of the filtration.

Figure A12: Classification results for all values of ρ_T using barcode summaries computed from bars in the added noise section of the filtration.

Figure A13: Classification results for all values of ρ_T using barcode summaries computed from all bars.

Figure A14: Classification results for all values of ρ_T using barcode summaries from the persistent homology of the added noise network in isolation.

Figure A15: Classification results for all values of ρ_T using barcode summaries of those bars in the barcode that are born in the model weights section but die in the added noise section of the filtration. All birth densities are set to ρ_T.

Figure A16: Classification results for all values of ρ_T using barcode summaries of those bars that are born and killed at or after ρ_T.

Figure A17: Classification results for all values of ρ_T using barcode summaries computed from bars in the first section of the filtration. For these experiments, the model weights were randomized before the persistent homology was computed.

Next, let us consider how f_k evolves within random noise. Given a random graph on n nodes in which each edge exists with probability p, the expected number of k-simplices is

E[f_k] = p^{m_k} C(n, k+1),

where C(a, b) denotes the binomial coefficient and m_k = C(k+1, 2) is the number of edges within a k-simplex. These results for the random IID graph were shown in Ref. [30]. In contrast, in our case we have a clique on L nodes together with M isolated vertices, where L + M = N, so we still have work left to do.

Returning to our situation, let p be the probability that each edge not in the clique on L exists. That is, if p = 0 we have only the clique edges in the graph, but if p = 1 all edges in the network will exist. Begin with k = 1, an edge. To determine E[f_1] we must count the elements of three sets: (i) the number of edges within the clique (between two nodes of set L), (ii) the number of edges that have one node from L and one from M, and (iii) the number of edges within the originally isolated node set (between two nodes of set M). Writing these out formally, we have

E[f_1] = C(L, 2) + pLM + p C(M, 2).

For 2-simplices we have a similar decomposition, drawing on the fact that each 2-simplex has between 3 and 0 nodes in L (and, correspondingly, between 0 and 3 nodes in M). The edges within L already exist, so we are only asking about the probability of edges forming that are not between two L nodes. Again writing the above formally, we have

E[f_2] = C(L, 3) + p^2 M C(L, 2) + p^3 L C(M, 2) + p^3 C(M, 3).

Indeed, any k-simplex will consist of between 0 and k + 1 nodes from L, and consequently between k + 1 and 0 nodes in M. Using this notion, we can finally write the expectation of f_k for such a graph with a clique on L nodes and random edges added elsewhere with probability p as

E[f_k] = C(L, k+1) + p^{m_k} C(M, k+1) + Σ_{i=0}^{k−1} p^{m_k − m_i} C(L, i+1) C(M, k − i).
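The closed-form expectation above is easy to implement and sanity-check: at p = 1 the graph is complete, so the Vandermonde identity collapses the sum to E[f_k] = C(N, k+1), while at p = 0 only the clique contributes C(L, k+1). A Python sketch (our own helper, not from the paper's code):

```python
from math import comb

def expected_fk(L, M, p, k):
    """E[f_k] for a graph with an L-clique, M isolated nodes, and every
    non-clique edge present independently with probability p (k >= 1)."""
    m = lambda j: comb(j + 1, 2)            # edges within a j-simplex
    total = comb(L, k + 1)                  # all k+1 nodes in the clique
    total += p ** m(k) * comb(M, k + 1)     # all k+1 nodes outside it
    for i in range(k):                      # i+1 nodes in L, k-i in M
        total += p ** (m(k) - m(i)) * comb(L, i + 1) * comb(M, k - i)
    return total

# p = 1: complete graph on N = L + M nodes, so E[f_k] = C(N, k+1).
# p = 0: only the clique contributes, so E[f_k] = C(L, k+1).
```

For example, with L = 5, M = 3, and k = 2, setting p = 1 recovers C(8, 3) = 56 and setting p = 0 recovers C(5, 3) = 10.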