Sketching Merge Trees for Scientific Data Visualization
Mingzhe Li
School of Computing, University of Utah, Salt Lake City, UT, USA
[email protected]
Sourabh Palande
School of Computing, University of Utah, Salt Lake City, UT, USA
[email protected]
Bei Wang
School of Computing, University of Utah, Salt Lake City, UT, USA
[email protected]
Abstract
Merge trees are a type of topological descriptor that records the connectivity among the sublevel sets of scalar fields. In this paper, we are interested in sketching a set of merge trees. That is, given a set T of merge trees, we would like to find a basis set S such that each tree in T can be approximately reconstructed from a linear combination of merge trees in S. A set of high-dimensional vectors can be sketched via matrix sketching techniques such as principal component analysis and column subset selection. However, up until now, topological descriptors such as merge trees have not been known to be sketchable. We develop a framework for sketching a set of merge trees that combines the Gromov-Wasserstein framework of Chowdhury and Needham with techniques from matrix sketching. We demonstrate the applications of our framework in sketching merge trees that arise from data ensembles in scientific simulations.

Theory of computation: Randomness, geometry and discrete structures: Computational geometry
Keywords and phrases
Merge trees, sketching, topological data analysis, applications, experiments, implementations
Funding
DOE DE-SC0021015
Acknowledgements
We thank Jeff Phillips for discussions involving column subset selection and Benwei Shi for his implementation of length squared sampling. We also thank Ofer Neiman for sharing the code on low stretch spanning trees. We thank Lin Yan for sharing ensemble datasets and generating the scalar field visualizations shown in Figure 2 and Figure 6.
Topological descriptors such as merge trees, contour trees, Reeb graphs, and Morse–Smale complexes serve to describe and identify characteristics associated with scalar fields, with many applications in the analysis and visualization of scientific data [48, 56]. In this paper, we are interested in sketching a set of topological descriptors.

We are motivated by the idea of matrix sketching. A set of high-dimensional vectors is sketchable via matrix sketching techniques such as principal component analysis (PCA) and column subset selection (CSS), as illustrated in Figure 1 (gray box). Given a dataset of N points with d features, represented as a d × N matrix A (with row-wise zero empirical mean), together with a parameter k, PCA aims to find a k-dimensional subspace H of R^d that minimizes the average squared distance between the points and their corresponding projections onto H. Equivalently, for every input point a_i (a column vector of A), PCA finds a k-dimensional embedding y_i (a column vector of Y) along the subspace H to minimize ||A − Â||_F = ||A − BY||_F. Here B is a d × k matrix whose columns b_1, b_2, ..., b_k form an orthonormal basis for H, and Y is a k × N coefficient matrix, whose column y_i encodes the coefficients for approximating a_i using the basis from B. That is, a_i ≈ â_i = Σ_{j=1}^{k} b_j Y_{j,i}.

Another technique we discuss is CSS, whose goal is to find a small subset of the columns in A to form B such that the projection error of A onto the span of the chosen columns is minimized, that is, to minimize ||A − Â||_F = ||A − BY||_F, where we restrict B to come from columns of A. Such a restriction is important for data summarization, feature selection, and interpretable dimensionality reduction.
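The PCA-style sketch A ≈ BY described above can be sketched in a few lines of numpy; the function name and the synthetic low-rank data below are illustrative assumptions, not part of the paper's implementation.

```python
import numpy as np

def pca_sketch(A, k):
    """Sketch A (d x N, columns = points) with a rank-k PCA basis.

    Returns B (d x k, orthonormal columns spanning the subspace H) and
    Y (k x N) such that A is approximated by B @ Y. Assumes the rows of
    A already have zero empirical mean.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    B = U[:, :k]   # top-k left singular vectors form an orthonormal basis
    Y = B.T @ A    # coefficients of each column of A in the basis B
    return B, Y

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 50))  # rank-3 data
A -= A.mean(axis=1, keepdims=True)                               # center rows
B, Y = pca_sketch(A, k=3)
err = np.linalg.norm(A - B @ Y)  # near zero: rank-3 data, rank-3 sketch
```

For CSS, one would additionally restrict the columns of B to be columns of A itself, trading some approximation error for interpretability.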
Thus, with either PCA or CSS, given a set of high-dimensional vectors, we could find a set of basis vectors such that each input vector can be approximately reconstructed from a linear combination of the basis vectors.

Now, what if we replace a set of high-dimensional vectors by a set of objects that encode topological information of data, specifically topological descriptors? Until now, topological descriptors have not been known to be sketchable. In this paper, we focus on merge trees, which are a type of topological descriptor that records the connectivity among the sublevel sets of scalar fields. We address the following question: given a set T of merge trees, can we find a basis set S such that each tree in T can be approximately reconstructed from a linear combination of merge trees in S?
Figure 1
The overall pipeline for sketching a set of merge trees.
Our overall pipeline is illustrated in Figure 1 and detailed in Section 4. In steps 1 and 2, given a set of N merge trees T = {T_1, T_2, ..., T_N} as input, we represent each merge tree T_i as a metric measure network and employ the Gromov-Wasserstein (GW) framework of Chowdhury and Needham [25] to map it to a column vector a_i in the data matrix A. In step 3, we apply matrix sketching techniques, in particular column subset selection (CSS) and non-negative matrix factorization (NMF), to obtain an approximated matrix Â, where A ≈ Â = B × Y. In step 4, we convert each column in Â into a merge tree (referred to as a sketched merge tree) using spanning trees, in particular minimum spanning trees (MST) and low-stretch spanning trees (LSST). Finally, in step 5, we return a set of basis merge trees S by applying LSST or MST to each column b_j in B. Each entry Y_{j,i} in the coefficient matrix Y defines the coefficient for basis tree S_j in approximating T_i. Thus, intuitively, with the above pipeline, given a set of merge trees, we could find a set of basis trees such that each input tree can be approximately reconstructed from a linear combination of the basis trees.

Our contribution is two-fold. First, we combine the notion of GW distances with matrix sketching techniques to give a class of algorithms for sketching a set of merge trees. Second, we provide experimental results that demonstrate the utility of our framework in sketching merge trees that arise from scientific simulations. Understanding the sketchability properties of merge trees can be particularly useful for the analysis and visualization of scientific ensembles, where our framework can be used to find good ensemble representatives and to identify outliers.

In this section, we review relevant work on merge trees, Gromov-Wasserstein distances, graph alignment and averaging, matrix sketching, and spanning trees.
Merge Trees.
Merge trees (also known as barrier trees [40] or join trees [21]) are a type of topological descriptor that records the connectivity among the sublevel sets of scalar fields. They are rooted in Morse theory [65], which characterizes scalar field data by the topological changes in its sublevel sets at isolated critical points. Other types of topological descriptors for scalar fields include contour trees [21], Reeb graphs [72], and Morse–Smale complexes [34, 33]. In this paper, instead of a direct comparison between a pair of merge trees using existing metrics for merge trees or Reeb graphs (e.g., [9, 11, 8, 10, 75, 24, 66, 28, 67, 12]), we treat merge trees as metric measure networks and utilize the Gromov-Wasserstein framework described in Section 3 to obtain their alignment and vector representations.
Gromov-Wasserstein Distances.
Gromov introduced Gromov-Hausdorff (GH) distances [43] while presenting a systematic treatment of metric invariants for Riemannian manifolds. GH distances can be employed as a tool for shape matching and comparison (e.g., [17, 58, 59, 62, 63]): shapes are treated as metric spaces, and two shapes are considered equal if they are isometric. Memoli [60] modified the formulation of GH distances by introducing a relaxed notion of proximity between objects, thus generalizing GH distances to the notion of Gromov-Wasserstein (GW) distances for practical considerations. Since then, GW distances have had a number of variants based on optimal transport [77, 78] and measure-preserving mappings [61]. Apart from theoretical explorations [60, 76], GW distances have been utilized in the study of graphs and networks [49, 80, 81], machine learning [38, 19], and word embeddings [6]. Recently, Memoli et al. [64] considered the problem of approximating (sketching) metric spaces using GW distances. Their goal was to approximate a (single) metric measure space modeling the underlying data by a smaller metric measure space. The work presented in this paper focuses on approximating a set of merge trees (modeled as a set of metric measure networks) with a smaller set of merge trees.
Aligning and Averaging Graphs.
Graph alignment or graph matching is a key ingredient in performing comparisons and statistical analysis on the space of graphs (e.g., [37, 44]). It is often needed to establish node correspondences between graphs of different sizes. The works most relevant here are the ones that utilize Riemannian geometry in studying statistics on graphs. Jain and Obermayer [51] imposed a metric structure on the space of graphs for performing alignment and estimating the mean of a distribution on attributed graphs. Guo et al. [47, 46] utilized the metric structure in [51] to quantify graph differences and to compute optimal deformations between graphs. They further established a framework for computing the mean, covariance, and PCA of graphs. For merge trees specifically, Gasparovic et al. [41] studied the metric 1-center of a set of labeled merge trees, which found applications in visualization [82]. However, these works focused on aligning graph nodes or tree nodes over permutations, thus creating "hard matchings" [25]. On the other hand, the approaches based on GW distances [69, 25] are rooted in measure theory and employ probabilistic or "soft matchings" of nodes. Information in a graph can be captured by a symmetric positive semidefinite matrix that encodes distances or similarities between pairs of nodes. Dryden et al. [32] described a way to perform statistical analysis and to compute the mean of such matrices. Agueh et al. [4] considered barycenters of several probability measures, whereas Cuturi et al. [27] and Benamou et al. [13] developed efficient algorithms to compute such barycenters. Peyre et al. [69] combined these ideas with the notion of GW distances [60] to develop GW averaging of distance/similarity matrices. Chowdhury and Needham [25] built upon the work in [69] and provided a GW framework to compute a Fréchet mean among these matrices using measure couplings. In this paper, we utilize the GW framework proposed in [25] for "soft matching" among merge trees.
Matrix Sketching.
Many matrix sketching techniques build upon numerical linear algebra and vector sketching. For simplicity, we formulate the problem as follows: given a d × N matrix A, we would like to approximate A using fewer columns, as a d × k matrix B, such that A and B are considered to be close with respect to some problem of interest. Basic approaches for matrix sketching include truncated singular value decomposition (SVD), column or row sampling [30, 31], random projection [73], and frequent directions [42, 55]; see [70, 79] for surveys. The column (or row) sampling approach carefully chooses a subset of the columns (or rows) of A proportional to their importance, where the importance is determined by the squared norm (e.g., [30]) or the (approximated) leverage scores (e.g., [31]). The random projection approach takes advantage of the Johnson-Lindenstrauss (JL) Lemma [53] to create an N × k linear projection matrix S (e.g., [73]), where B = AS. The frequent directions approach [55, 42] focuses on replicating properties of the SVD. The algorithm processes each column of A one at a time while maintaining the best rank-k approximation as the sketch.

Spanning Trees of Weighted Graphs.
Given an undirected, weighted graph G, a spanning tree is a subgraph of G that is a tree connecting all the vertices of G with the minimum possible number of edges. We consider two types of spanning trees: the minimum spanning tree (MST) and the low stretch spanning tree (LSST) [1, 2, 3]. Whereas the MST tries to minimize the sum of edge weights in the tree, the LSST tries to minimize the stretch (relative distortion) of pairwise distances between the nodes of G. LSSTs were initially studied in the context of road networks [5]. They also play an important role in fast solvers for symmetric diagonally dominant (SDD) linear systems [35, 54]. An LSST does not stretch distances of G too much, in the sense that the distance between any two nodes in the tree is at most a small constant times the distance between the same nodes in G. Alon et al. [5] studied LSSTs in the context of the k-server problem on a road network and provided the first theoretical bound on the average stretch of LSSTs. Elkin et al. [35] used LSSTs to improve the running time for solving SDD linear systems [74]. Koutis et al. [54] further improved upon the running time of LSST computation to provide a faster SDD solver. In a series of papers, Abraham and coauthors [1, 2, 3] proposed incrementally faster algorithms to compute LSSTs, while at the same time improving the bound on the average stretch. Most of these works focused on bounding the average stretch. The problem of finding a spanning tree that minimizes the max stretch, referred to as the Minimum Max-Stretch spanning Tree (MMST), is known to be NP-hard. However, Emek and Peleg [36] presented an approximation algorithm for MMST, which is a variation of the LSST.

We begin by reviewing the notion of a merge tree that arises from a scalar field. We then introduce the technical background needed to map a merge tree to a column vector in the data matrix.
We primarily utilize the Gromov-Wasserstein (GW) framework of Chowdhury and Needham [25], which builds upon theoretical results on GW distances [59, 60], with a few ingredients from Peyre et al. [69].
Figure 2
Two examples of merge trees from scalar (height) fields. For each example, from left to right: 2D scalar field visualization, merge trees embedded in the graphs of the scalar fields, and abstract visualization of merge trees as rooted trees equipped with height functions.
Merge Trees.
Let f : M → R be a scalar field defined on the domain of interest M, where M can be a manifold or a subset of R^d. For our experiments in Section 5, M ⊂ R^2. Merge trees capture the connectivity among the sublevel sets of f, i.e., M_a = f^{-1}(−∞, a]. Formally, two points x, y ∈ M are equivalent, denoted by x ∼ y, if f(x) = f(y) = a, and x and y belong to the same connected component of the sublevel set M_a. The merge tree, T(M, f) = M/∼, is the quotient space obtained by gluing together points in M that are equivalent under the relation ∼. To describe a merge tree procedurally: as we sweep the function value a from −∞ to ∞, we create a new branch originating at a leaf node for each local minimum of f. As a increases, such a branch is extended as its corresponding component in M_a grows, until it merges with another branch at a saddle point. If M is connected, all branches eventually merge into a single component at the global maximum of f, which corresponds to the root of the tree. For a given merge tree, the leaves, internal nodes, and root node represent the minima, merging saddles, and global maximum of f, respectively. Figure 2 displays a set of two scalar fields with their corresponding merge trees embedded in the graphs of the scalar fields. In a nutshell, a merge tree T is a rooted tree equipped with a scalar function defined on its node set, f : V → R.
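The procedural sweep described above can be sketched with a union-find pass; the path-graph domain, the sample values, and the function name below are illustrative assumptions (real merge tree computation works on a mesh or grid and handles ties and degeneracies).

```python
def merge_tree_1d(f):
    """Sketch: merge events of a scalar function sampled on a path graph.

    Sweeps function values from low to high with union-find. A new branch
    starts at each local minimum; branches merge at saddles. Returns a
    list of (saddle_index, (root_a, root_b)) merge events. Assumes the
    sampled values are distinct.
    """
    n = len(f)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    events = []
    processed = [False] * n
    for i in sorted(range(n), key=lambda k: f[k]):   # sweep a from -inf up
        processed[i] = True
        roots = {find(j) for j in (i - 1, i + 1) if 0 <= j < n and processed[j]}
        if len(roots) == 2:      # two branches meet: i is a merging saddle
            events.append((i, tuple(sorted(roots))))
        for r in roots:          # glue i's component onto its neighbors
            parent[r] = i
    return events

# Local minima at indices 0 and 2 (values 0.0 and 0.5) merge at saddle 1.
events = merge_tree_1d([0.0, 2.0, 0.5])
```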
Gromov-Wasserstein Distance for Measure Networks.
The Gromov-Wasserstein (GW) distance was proposed by Memoli [59, 60] for metric measure spaces. Peyre et al. [69] introduced the notion of a measure network and defined the GW distance between such networks. The key idea is to find a probabilistic matching between a pair of networks by searching over the convex set of couplings of the probability measures defined on the networks. A finite, weighted graph G can be represented as a measure network using a triple (V, W, p), where V is the set of n nodes, W is a weighted adjacency matrix, and p is a probability measure supported on the nodes of G. For our experiments, p is taken to be uniform, that is, p = (1/n) 1_n, where 1_n = (1, 1, ..., 1)^T ∈ R^n.

Let G_1(V_1, W_1, p_1) and G_2(V_2, W_2, p_2) be a pair of graphs with n_1 and n_2 nodes, respectively. Let [n] denote the set {1, 2, ..., n}, and let V_1 = {x_i}_{i ∈ [n_1]} and V_2 = {y_j}_{j ∈ [n_2]}. A coupling between probability measures p_1 and p_2 is a joint probability measure on V_1 × V_2 whose marginals agree with p_1 and p_2. That is, a coupling is represented as an n_1 × n_2 non-negative matrix C such that C 1_{n_2} = p_1 and C^T 1_{n_1} = p_2. Given a matrix C, its binarization is an n_1 × n_2 binary matrix, denoted as C_{>0}: this matrix has 1 where C > 0, and 0 elsewhere.

The distortion of a coupling C with an arbitrary loss function L is defined as [69]

E(C) = Σ_{i,k ∈ [n_1], j,l ∈ [n_2]} L(W_1(i, k), W_2(j, l)) C_{i,j} C_{k,l}.   (1)

Let C = C(p_1, p_2) denote the collection of all couplings between p_1 and p_2. The Gromov-Wasserstein discrepancy [69] is defined as

D(C) = min_{C ∈ C} E(C).   (2)

In this paper, we consider the quadratic loss function L(a, b) = |a − b|^2. The Gromov-Wasserstein distance [25, 60, 69] d_GW between G_1 and G_2 is defined as

d_GW(G_1, G_2) = (1/2) min_{C ∈ C} Σ_{i,k ∈ [n_1], j,l ∈ [n_2]} |W_1(i, k) − W_2(j, l)|^2 C_{i,j} C_{k,l}.   (3)

It follows from the work of Sturm [76] that such minimizers always exist and are referred to as optimal couplings. Memoli [60] showed that (d_GW)^{1/2} defines a distance on the "space of metric measure spaces quotient by measure-preserving isometries" [69].
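The distortion of Equation (1) under the quadratic loss can be evaluated without the explicit four-index sum by expanding |a − b|^2 = a^2 − 2ab + b^2. The minimal numpy sketch below (with made-up weight matrices, not the paper's solver) compares a perfect matching against the independent product coupling.

```python
import numpy as np

def gw_distortion(W1, W2, C):
    """Distortion E(C) of coupling C under quadratic loss (Eq. 1):
    sum_{i,k,j,l} |W1[i,k] - W2[j,l]|^2 * C[i,j] * C[k,l]."""
    p1, p2 = C.sum(axis=1), C.sum(axis=0)       # marginals of the coupling
    term_a = (W1**2 @ p1) @ p1                  # sum of a^2 terms
    term_b = (W2**2 @ p2) @ p2                  # sum of b^2 terms
    cross = 2.0 * np.sum((W1 @ C @ W2.T) * C)   # sum of 2ab terms
    return term_a + term_b - cross

# Two identical 2-node networks with uniform measures.
W1 = np.array([[0.0, 1.0], [1.0, 0.0]])
W2 = W1.copy()
p = np.full(2, 0.5)
C_id = np.diag(p)          # identity matching: zero distortion
C_prod = np.outer(p, p)    # independent coupling: positive distortion
e_id, e_prod = gw_distortion(W1, W2, C_id), gw_distortion(W1, W2, C_prod)
```

Minimizing this quantity over all couplings (Eq. 2) requires a non-convex solver such as the entropic projected-gradient schemes of Peyre et al. [69], which is beyond this sketch.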
Alignment and Blowup.
Given a pair of graphs G_1 and G_2, a coupling C ∈ C(p_1, p_2) can be used to align their nodes. For each node x_i ∈ V_1, let u_i = |{y_j ∈ V_2 : C(i, j) > 0}| denote the number of nodes in V_2 that have a nonzero coupling probability with x_i. We define a new node set V_1' = ∪_{x_i ∈ V_1} {(x_i, l) : 1 ≤ l ≤ u_i}, which, roughly speaking, contains u_i copies of each x_i ∈ V_1. For x_i ∈ V_1, let {y_{i_l}}_{l ∈ [u_i]} denote the nodes in V_2 such that C(x_i, y_{i_l}) > 0. We define a probability measure p_1' on V_1' as p_1'((x_i, l)) = C(i, i_l). Finally, for x, x' ∈ V_1, l ∈ [u_x], and l' ∈ [u_{x'}], we define a new weight matrix W_1' such that W_1'((x, l), (x', l')) = W_1(x, x'). The new graph G_1' = (V_1', W_1', p_1') is the blowup of G_1; G_1' is a possibly enlarged version of G_1. Similarly, we can construct the blowup G_2' = (V_2', W_2', p_2') of G_2.

An optimal coupling C expands naturally to a coupling C' between p_1' and p_2'. After taking appropriate blowups, C' can be binarized to an n' × n' permutation matrix (where n_1, n_2 ≤ n'), and used to align the nodes of the two blown-up graphs. The GW distance is given by a formulation equivalent to Equation 3 based on an optimal coupling,

d_GW(G_1, G_2) = (1/2) Σ_{i,j} |W_1'(i, j) − W_2'(i, j)|^2 p'(i) p'(j).   (4)

Fréchet Mean.
Given a collection of graphs G = {G_1, G_2, ..., G_N}, a Fréchet mean [25] Ḡ of G is a minimizer of the functional F(H, G) = (1/N) Σ_{i=1}^{N} d_GW(G_i, H) over the space N of measure networks,

Ḡ = argmin_{H ∈ N} (1/N) Σ_{i=1}^{N} d_GW(G_i, H).   (5)

Chowdhury and Needham [25] defined the directional derivative and the gradient of the functional F(H, G) at H and provided a gradient descent algorithm to compute the Fréchet mean. Their iterative optimization begins with an initial guess H_0 of the Fréchet mean. At the k-th iteration, there is a two-step process: each G_i is first blown up and aligned to the current Fréchet mean, H_k; then H_k is updated using the gradient of the functional F(H_k, G) at H_k. Such a two-step process is repeated until convergence, where the gradient vanishes. For the complete algorithmic and implementational details, see [25]. If Ḡ = (V̄, W̄, p̄) is the Fréchet mean, then we have

W̄(i, j) = (1/N) Σ_{k=1}^{N} W_k'(i, j),

where W_k' is the weight matrix obtained by blowing up and aligning G_k ∈ G to Ḡ. That is, when all the graphs in G are blown up and aligned to Ḡ, the weight matrix of Ḡ is given by a simple element-wise average of the weight matrices of the graphs.
Figure 3
An optimal coupling between two simple merge trees (a) and (c). The coupling matrix is visualized in (b): yellow means high and dark blue means low probability. Couplings between the Fréchet mean T̄ with T_1 and T_2 are shown in (d) and (e), respectively.

A Simple Example.
We give a simple example involving a pair of merge trees in Figure 3. T_1 in (a) and T_2 in (c) contain 8 and 6 nodes, respectively (nodes are labeled starting with a 0 index). The optimal coupling C obtained by the gradient descent algorithm of [25] is visualized in (b); C is an 8 × 6 matrix. Several nodes of T_1 are each matched to the node of T_2 with which they have the highest coupling probability (red, orange, and blue stars). One node of T_1 is coupled with two nodes of T_2 (green stars), one with a higher probability (0.08) than the other. We align T_1 and T_2 to their Fréchet mean (denoted as T̄) via blowups. T̄ has 12 nodes. This gives rise to a coupling matrix between T̄ and T_1 (of size 12 × 8) and between T̄ and T_2 (of size 12 × 6), shown in (d) and (e): each node of T̄ is matched with a node of T_1 in (d) and with a node of T_2 in (e). Now both trees T_1 and T_2 are blown up to be T_1' and T_2', each with 12 nodes, and can be vectorized into column vectors of the same size.

Given a set T of N merge trees as input, our goal is to find a basis set S such that each tree in T can be approximately reconstructed from a linear combination of merge trees in S. We propose to combine the GW framework [25] with techniques from matrix sketching to achieve this goal. We detail our pipeline to compute S, as illustrated in Figure 1.
Step 1: Representing Merge Trees as Metric Measure Networks.
The first step is to represent merge trees as metric measure networks as described in Section 3. Each merge tree T ∈ T can be represented using a triple (V, W, p), where V is the node set, W is a matrix of pairwise distances between its nodes, and p is a probability distribution on V. In this paper, we define p as a uniform distribution, i.e., p = (1/|V|) 1_{|V|}. To define W, recall that each node x ∈ V is associated with a scalar value f(x). For a pair of adjacent nodes x, x' ∈ V, W(x, x') = |f(x) − f(x')|; i.e., the weight W(x, x') is the absolute difference in function value between them. We define W in such a way to encode information in f, which is inherent to a merge tree. For a pair of nonadjacent nodes x, x' ∈ V, W(x, x') is the shortest path distance between them in T, where a shortest path between x and x' by construction goes through their lowest common ancestor in T.
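Step 1 can be sketched as follows; the tiny hand-built tree, its function values, and the helper name `tree_to_network` are illustrative assumptions rather than the authors' code.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def tree_to_network(edges, f):
    """Represent a merge tree as a metric measure network (W, p).

    Adjacent nodes x, x' get weight |f(x) - f(x')|; nonadjacent nodes get
    the shortest path distance in the tree. p is uniform on the nodes.
    Assumes adjacent nodes have distinct function values (zero entries in
    the adjacency matrix are read as "no edge").
    """
    n = len(f)
    adj = np.zeros((n, n))
    for x, y in edges:
        adj[x, y] = adj[y, x] = abs(f[x] - f[y])
    W = shortest_path(adj, method="D", directed=False)  # Dijkstra
    p = np.full(n, 1.0 / n)
    return W, p

# A tiny merge tree: leaves 0 and 1 merge at saddle 2; node 3 is the root.
edges = [(0, 2), (1, 2), (2, 3)]
f = [0.0, 1.0, 2.0, 5.0]
W, p = tree_to_network(edges, f)
```

In a tree, the path between any two nodes is unique, so the shortest path distance here is simply the sum of edge weights along that unique path through the lowest common ancestor.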
Step 2: Merge Tree Vectorization via Blowup and Alignment to the Fréchet Mean.
The second step is to convert each merge tree into a column vector of the same size via blowup and alignment to the Fréchet mean. Having represented each merge tree as a metric measure network, we can use the GW framework to compute a Fréchet mean of T, denoted as T̄ = (V̄, W̄, p̄). Let n̄ = |V̄|. In theory, n̄ may become as large as Π_{i=1}^{N} |V_i|. In practice, n̄ is chosen to be much smaller; in our experiments, we choose n̄ to be a small constant factor (2 or 3) times the size of the largest input tree. The optimal coupling C between T̄ and T = T_i is an n̄ × n_i matrix with at least n̄ nonzero entries. If the number of nonzero entries is greater than n̄, we keep only the largest value in each row. That is, if a node of T̄ has a nonzero probability of coupling with more than one node of T, we consider only the mapping with the highest probability, so that each coupling matrix C has exactly n̄ nonzero entries. We then blow up each T to obtain T' = (V', W', p'), and align T' with T̄. The above procedure ensures that each blown-up tree T' has exactly n̄ nodes, and the binarized coupling matrix C' between T̄ and T' induces a node matching between them. We can now vectorize (i.e., flatten) each W' (an n̄ × n̄ matrix) to form a column vector a ∈ R^d of the matrix A (where d = n̄^2), as illustrated in Figure 1 (step 2). Each a is a vector representation of the input tree T with respect to the Fréchet mean T̄.

Step 3: Merge Tree Sketching via Matrix Sketching.
The third step is to sketch merge trees by applying matrix sketching to the data matrix A, as illustrated in Figure 1 (step 3). By construction, A is a d × N matrix whose column vectors a_i are vector representations of T_i. (In practice, d = n̄(n̄ + 1)/2, since each W' is symmetric.) We apply matrix sketching techniques to approximate A by Â = B × Y. In our experiments, we use two linear sketching techniques, namely, column subset selection (CSS) and non-negative matrix factorization (NMF). See Section 6 for implementation details.

Using CSS, the basis set is formed by sampling k columns of A. Let B denote the matrix formed by k columns of A, and let Π = BB^+ denote the projection onto the k-dimensional space spanned by the columns of B. The goal of CSS is to find B such that ||A − ΠA||_F is minimized. We experiment with two variants of CSS. In the first variant, referred to as Length Squared Sampling (CSS-LSS), we sample (without replacement) columns of A with probabilities q_i proportional to the square of their Euclidean norms, i.e., q_i = ||a_i||^2 / ||A||_F^2. We modify the algorithm slightly such that before selecting a new column, we factor out the effects from columns that are already chosen. In the second variant, referred to as Iterative Feature Selection (CSS-IFS), we use the algorithm proposed by Ordozgoiti et al. [68]. Instead of selecting columns sequentially as in CSS-LSS, CSS-IFS starts with a random subset of k columns. Then each selected column is either kept or replaced with another column, based on the residual after the other selected columns are factored out simultaneously.

In the case of NMF, the goal is to compute non-negative matrices B and Y such that ||A − Â||_F = ||A − BY||_F is minimized. We use the implementation provided in the decomposition module of the scikit-learn package [26, 39]. The algorithm initializes matrices B and X = Y^T and minimizes the residual Q = A − BX^T + b_j x_j^T alternately with respect to the column vectors b_j and x_j of B and X, respectively, subject to the constraints b_j ≥ 0 and x_j ≥ 0.
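The length squared sampling variant can be sketched as follows. The residual-deflation step (factoring out already-chosen columns before each draw) is our reading of the modification mentioned in the text, and the function name and toy data are ours.

```python
import numpy as np

def css_length_squared(A, k, seed=None):
    """Column subset selection by length squared sampling (CSS-LSS sketch).

    Draws k columns with probability proportional to their squared norms
    in the current residual, deflating the residual by each chosen column
    before the next draw. Returns the chosen column indices and the
    coefficient matrix Y, so that A is approximated by A[:, idx] @ Y.
    """
    rng = np.random.default_rng(seed)
    R = A.astype(float).copy()        # residual matrix
    idx = []
    for _ in range(k):
        norms = (R**2).sum(axis=0)
        norms[idx] = 0.0              # sample without replacement
        q = norms / norms.sum()       # q_i = ||r_i||^2 / ||R||_F^2
        j = rng.choice(A.shape[1], p=q)
        idx.append(j)
        b = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(b, b @ R)    # factor out the chosen direction
    B = A[:, idx]
    Y = np.linalg.lstsq(B, A, rcond=None)[0]  # projection coefficients
    return idx, Y

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 8))  # rank-2 data
idx, Y = css_length_squared(A, k=2, seed=0)
err = np.linalg.norm(A - A[:, idx] @ Y)  # small: 2 columns span rank-2 A
```

Because B is restricted to actual columns of A, each basis element corresponds directly to an input item, which is what makes CSS attractive for selecting representative merge trees.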
Step 4: Reconstructing Sketched Merge Trees.
For the fourth step, we convert each column in Â into a sketched merge tree. Let Â = BY, where the matrices B and Y are obtained using CSS or NMF. Let â = â_i denote the i-th column of Â. We reshape â as an n̄ × n̄ weight matrix Ŵ. We then obtain a tree structure T̂' from Ŵ by computing its MST or LSST. A practical consideration is the simplification of a sketched tree T̂' coming from NMF. Without simplification, T̂' is an approximation of the blown-up tree T', so it contains many more nodes compared to the original tree T. Some of these are internal nodes with exactly one parent node and one child node. In some cases, the distance between two nodes is almost zero. We further simplify T̂' to obtain a final sketched tree T̂ by removing internal nodes and nodes that are too close to each other; see Section 6 for details.
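The reshape-then-MST conversion of step 4 can be sketched with scipy's minimum spanning tree routine; the toy column vector below is a stand-in for an actual column of Â, and the helper name is ours.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def column_to_tree(a_hat):
    """Reshape a sketched column into an n x n weight matrix and extract
    its MST, returned as a sorted edge list (i, j, weight) with i < j."""
    n = int(round(np.sqrt(a_hat.size)))
    W_hat = a_hat.reshape(n, n)
    W_sym = (W_hat + W_hat.T) / 2.0       # symmetrize numerical noise
    mst = minimum_spanning_tree(W_sym).tocoo()
    return sorted((min(i, j), max(i, j), w)
                  for i, j, w in zip(mst.row.tolist(),
                                     mst.col.tolist(),
                                     mst.data.tolist()))

# Toy 3-node weight matrix flattened into a column of length 9.
a_hat = np.array([0.0, 1.0, 4.0,
                  1.0, 0.0, 2.0,
                  4.0, 2.0, 0.0])
edges = column_to_tree(a_hat)  # MST keeps the two lightest edges
```

The LSST alternative would extract a spanning tree minimizing average stretch instead of total edge weight; no such routine ships with scipy, so it is omitted here.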
Step 5: Returning Basis Trees.
Finally, we return a set of basis merge trees S using information encoded in the matrix B. Using CSS, each column b_j of B corresponds directly to a column in A; therefore, the set S is trivially formed by the corresponding merge trees from T. Using NMF, we obtain each basis tree by applying MST or LSST to the columns b_j of B, with appropriate simplification, as illustrated in Figure 1 (step 5).

Error Analysis.
For each of our experiments, we compute the global sketch error ε = ||A − Â||_F, as well as the column-wise sketch error ε_i = ||a_i − â_i||_2, where ε^2 = Σ_{i=1}^{N} ε_i^2. By construction, ε_i ≤ ε. For merge trees, we measure the GW distance between each tree T_i and its sketched version T̂_i, that is, τ_i = d_GW(T_i, T̂_i), referred to as the element-wise GW loss. The global GW loss is defined to be τ = Σ_{i=1}^{N} τ_i. For theoretical considerations, see the discussion in Section 7.
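The relation between the global and column-wise sketch errors is just the Frobenius norm identity; a minimal numerical check (with a random matrix standing in for A and a perturbed copy standing in for Â):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 4))
A_hat = A + 0.1 * rng.standard_normal((5, 4))  # stand-in for a sketch of A

eps = np.linalg.norm(A - A_hat)           # global sketch error (Frobenius)
eps_i = np.linalg.norm(A - A_hat, axis=0)  # column-wise sketch errors

# The global error decomposes over columns: eps^2 == sum_i eps_i^2,
# hence each eps_i <= eps, as stated in the error analysis.
```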
A Simple Synthetic Example.
We give a simple synthetic example to illustrate our pipeline. A time-varying scalar field f is a mixture of 2D Gaussians that translate and rotate on the plane. We obtain a set of merge trees (indexed from 0) from 12 consecutive time steps, referred to as the Rotating Gaussian dataset. In Figure 4(a) and (b), we show the scalar fields and the corresponding merge trees, respectively. Each merge tree is computed from −f; thus, its leaves correspond to the local maxima (red), internal nodes are saddles (white), and the root node is the global minimum (blue) of f (see Figure 4(a)).

Since the dataset is quite simple, a few basis trees are sufficient to get good sketching results. With two basis trees (k = 2), both CSS algorithms select the same set S of two trees, whose topological structures are recognizably distinct among the input trees. For k = 3, CSS-LSS and CSS-IFS return slightly different sets of three basis trees. In Figure 4(a) and (b), we highlight the three basis trees selected with CSS-IFS and their corresponding scalar fields, respectively, with green boxes. These basis trees clearly capture rather unique structural representatives in the set. It is important to note that although two of the basis trees share a similar tree topology, they are of different heights (30 versus 24), and their branching nodes (black arrows) are positioned differently w.r.t. the root. We also show a few input trees (blue boxes) and their sketched versions (red boxes) in Figure 4(c). The input and the sketched trees are almost indistinguishable for three of them (ignoring differences in tree layouts), but there are minor differences for the fourth. We also visualize the data matrices A, Â, and B, and highlight the coefficient matrix Y in Figure 4(d). The Fréchet mean tree T̄ contains 11 nodes.
Figure 4
Rotating Gaussian: CSS-IFS with MST. (a) Visualizing a time-varying mixture of Gaussian functions together with (b) their corresponding merge trees. (c) Examples of input merge trees (blue boxes) with their sketched versions (red boxes). (d) Visualizing data matrices associated with the sketching, while highlighting the coefficient matrix Y.

The matrix Y in Figure 4(d), for instance, shows that one input tree (orange box) is a linear combination of the three basis trees, with coefficients including −0.01 and 0.57. In addition, columns in Y with high (yellow or light green) coefficients w.r.t. the given basis may be grouped together, forming clusters whose elements look structurally similar. On the other hand, using NMF with k = 2, we display the data matrices together with the basis trees (obtained via MST) in Figure 5. The most interesting aspect of using NMF is that the basis trees (green boxes) are not elements of T; however, they very much resemble the basis trees obtained by the CSS algorithms. In addition, columns in Y with high coefficients w.r.t. the same basis tree may be grouped together, showing structural similarities among several of the input trees.
Figure 5
Rotating Gaussian: data matrices together with basis trees returned by NMF.
We demonstrate the applications of our merge tree sketching framework with three ensemble datasets from scientific simulations. The key takeaway is that for each ensemble dataset, our framework obtains basis merge trees that capture structural transitions and structural diversity among the ensemble members. In general, CSS gives better sketched trees than NMF; Iterative Feature Selection (CSS-IFS) performs slightly better than Length Squared Sampling (CSS-LSS), based on error analysis. MST gives more visually appealing trees in practice than LSST. All source code and data will be made publicly available.
Figure 6
Visualizing a scalar field f that arises from a vorticity field of (a) the Heated Cylinder dataset and (b) the Corner Flow dataset, and from a velocity magnitude field of (c) the Red Sea dataset, respectively. Critical points (after de-noising) are shown in red (maxima) and white (saddles). Global minima are in blue. Land is colored blue in (c). All merge trees are computed from −f.

Heated Cylinder Dataset. Two of our datasets come from numerical simulations available online (https://cgl.ethz.ch/research/visualization/data.php). The first dataset, referred to as the Heated Cylinder with Boussinesq Approximation (Heated Cylinder in short), comes from the simulation of a 2D flow generated by a heated cylinder using the Boussinesq approximation [45, 71]. The dataset shows a time-varying turbulent plume containing numerous small vortices. We convert each time instance of the flow into a scalar field using its vorticity. We generate a set of merge trees from these scalar fields based on 31 time steps (corresponding to steps 600-630 of the original 1310 time steps). This set captures the evolution of small vortices over time. An instance of the scalar field after de-noising is illustrated in Figure 6(a).

Corner Flow Dataset.
The second dataset, referred to as the Cylinder Flow Around Corners (Corner Flow in short), arises from the simulation of a viscous 2D flow around two cylinders [7, 71]. The channel into which the fluid is injected is bounded by solid walls. A vortex street is initially formed at the lower left corner, which then evolves around the two corners of the bounding walls. We generate a set of merge trees from the vorticity fields of the first 100 time instances, which describe the formation of a one-sided vortex street on the upper right corner; see Figure 6(b) for an example.
Red Sea Dataset. The third dataset, referred to as the Red Sea eddy simulation (Red Sea in short) dataset, originates from the IEEE Scientific Visualization Contest 2020 (https://kaust-vislab.github.io/SciVis2020/). The dataset was generated using a high-resolution MITgcm (Massachusetts Institute of Technology general circulation model), together with remote sensing satellite observations. It is used to study the circulation dynamics and eddy activities of the Red Sea (see [50, 83, 84]). For our analysis, we use merge trees that arise from the velocity magnitude fields of an ensemble (named –) with 60 time steps. Figure 6(c) shows a later time step that captures the formation of various eddies, which are circular movements of water important for transporting energy and biogeochemical particles in the ocean.

Given merge trees T = { T , . . . , T } from the Heated Cylinder dataset, we apply both CSS and NMF to obtain a set of basis trees S and reconstruct the sketched trees. We compare all sketching techniques by setting k = 5, all of which give fairly good sketching results.

Coefficient Matrices and Basis Trees from CSS and NMF.
As shown in Figure 7, CSS-IFS produces five basis trees: T , T , T , T , and T . Figure 7(d) visualizes these basis trees while highlighting their structural differences (purple circles). The coefficient matrix Y in Figure 7(c) contains a number of yellow or light green blocks, indicating that consecutive input trees are grouped together into clusters and represented by one or two shared basis trees. The columns of Y corresponding to the basis trees are highlighted with orange boxes in Figure 7(c). The basis trees appear to be good representatives of the clusters. Further investigation of the input trees reveals that these basis trees capture structural transitions in the underlying scalar fields, even though such changes are not easy to detect directly from the corresponding scalar fields (see Figure 9). In Figure 8, we see noticeable structural transitions (purple circles) among trees from adjacent time steps: T → T , T → T , T → T , etc. The input trees with similar structures can be clustered based on such transitions, where the basis trees are selected as representatives of each cluster. On the other hand, CSS-LSS gives five basis trees, S = { T , T , T , T , T }, which are quite similar to the results of CSS-IFS, as shown in Figure 10. Using NMF, we show the five basis trees together with the coefficient matrix Y in Figure 11. It is interesting to notice the star-like basis trees, although it is unclear if there is an intuitive explanation.

Sketched Trees. As illustrated in Figure 7(a) and (b), the column-wise sketch error and the element-wise GW loss can be used to guide our investigation into individual trees. Using
Figure 7
Heated Cylinder: CSS-IFS with MST. (a) Column-wise sketch error, (b) element-wise GW loss, (c) coefficient matrix Y, (d) basis trees, (e) input trees (blue boxes) with their sketched versions (red boxes), (f) weight matrices associated with T during the sketching process.
Figure 8
Heated Cylinder: CSS-IFS with MST. Instances of structural transitions among consecutive merge trees.
CSS-IFS, we give an example of a well-sketched tree (T ) with a couple of outliers (T and T ) based on the error analysis. In Figure 7(e), we compare each input tree (blue box) against its sketched version (red box). T and its sketched version T̂ are indistinguishable. Even though T and T are considered outliers relative to the other input trees, they both share similar subtrees (whose roots are pointed to by black arrows) with their sketched versions. In Figure 7(f), we further investigate the weight matrices from different stages of the sketching pipeline for T . From left to right, we show the weight matrix W of the input tree, its blow-up matrix W (which is linearized to the column vector a), the approximated column vector â after sketching (reshaped into a square matrix), the weight matrix Ŵ of the MST derived from the reshaped â, the weight matrix of the MST after simplification, and the root alignment of Ŵ w.r.t. T . We observe minor changes between W (blue box) and Ŵ (red box), which explains the structural variation between T and T̂ .

Figure 9
Heated Cylinder: scalar fields that correspond to the basis trees selected by CSS-IFS.

Figure 10
Heated Cylinder: CSS-LSS with MST.

Figure 11
Heated Cylinder: NMF with MST.

Reconstructing Merge Trees with LSST.
For comparative purposes, we demonstrate the tree reconstruction results using LSST instead of MST for CSS-IFS, as shown in Figure 12. The reconstructed tree using LSST for T (blue box) is visually less appealing compared to the reconstruction using MST. The star-like features are most likely a consequence of the petal decomposition algorithm of LSST. However, the weight matrix view shows that the LSST preserves distances fairly well (red box vs. blue box).
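The reconstruction step discussed here (reshape a sketched column vector â into a weight matrix Ŵ, then take its MST) is easy to prototype. A minimal sketch, assuming numpy and scipy are available (`tree_from_sketched_column` is a hypothetical helper, not code from the released implementation):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def tree_from_sketched_column(a_hat, n):
    """Reshape a sketched column vector into an n x n weight matrix W_hat
    and extract a tree structure by computing its MST (the other option
    at this step is an LSST)."""
    W_hat = np.abs(a_hat.reshape(n, n))   # sketching may introduce small negatives
    W_hat = 0.5 * (W_hat + W_hat.T)       # symmetrize approximation noise
    np.fill_diagonal(W_hat, 0.0)
    mst = minimum_spanning_tree(W_hat).toarray()
    return W_hat, mst + mst.T             # symmetric tree weight matrix
```

On a noise-free column (exact pairwise distances), the recovered MST coincides with the original tree edges.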
Figure 12
Heated Cylinder: CSS-IFS with LSST.
Error Analysis.
Finally, we discuss our results quantitatively. The global sketch error for CSS-IFS, CSS-LSS, and NMF is 70.17, 93.22, and 53.91, respectively. Using MST, the global GW loss for CSS-IFS, CSS-LSS, and NMF is 0.15, 0.25, and 0.33, respectively. Using LSST, the global GW loss for CSS-IFS, CSS-LSS, and NMF is 0.39, 0.35, and 0.47, respectively. Therefore, for this particular dataset, NMF gives the best matrix approximation but less visually appealing merge trees, while CSS-IFS performs best among the three sketching techniques in terms of preserving tree topology as measured by the GW distance.
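Both quantities compared above come straight from the residual matrix A − BY; a minimal numpy sketch (`sketch_errors` is a hypothetical helper name):

```python
import numpy as np

def sketch_errors(A, B, Y):
    """Global sketch error ||A - BY||_F plus per-column (per-tree) errors;
    the latter are used to flag poorly sketched trees as outliers."""
    R = A - B @ Y
    return np.linalg.norm(R), np.linalg.norm(R, axis=0)
```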
The Corner Flow dataset contains 100 merge trees. We start with an error analysis of the coefficient matrices, and then describe the sketching results in detail.
Error Analysis with Coefficient Matrices.
First, we compare the coefficient matrices generated using both CSS algorithms, for k = 15 and k = 30, respectively. In Figure 13(a), the matrix Y for CSS-IFS contains a larger number of yellow rows (indicating large coefficients) in comparison with Figure 13(b), which means that consecutive input trees can be reconstructed with a small number of shared basis trees. This result implies that CSS-IFS produces basis columns that are better cluster representatives than those of CSS-LSS. In terms of errors, CSS-IFS has a smaller global GW loss: 14.41 (k = 15) and 10.90 (k = 30) for CSS-IFS, versus 23.– (k = 15) and 16.21 (k = 30) for CSS-LSS, respectively. CSS-IFS also has a smaller global sketch error (6886.–) than CSS-LSS (11641.–) for k = 15. The column-wise sketch error is visualized in Figure 14(a), and the element-wise GW loss is shown in Figure 14(b).

Basis Trees.
We report the sketching results with 15 basis trees under CSS-IFS. The basis trees are T , T , T , T , T , T , T , T , T , T , T , T , T , T , and T ; see Figure 14(c)-(d). Similar to the Heated Cylinder dataset, we observe noticeable structural transitions among some pairs of adjacent input trees, which partition the set into clusters with similar
Figure 13
Corner Flow dataset: CSS-IFS (a) and CSS-LSS (b) with MST. Visualizing coefficient matrices Y with 15 and 30 basis trees, respectively.

structures (blocks of yellow). As illustrated in Figure 14(c), the basis trees (orange boxes) serve as good cluster representatives, as they are mostly selected among the yellow blocks. In particular, they capture structural transitions in the set, which are highlighted with purple circles in Figure 14(d). Such structural transitions may not be obvious in their scalar field counterparts; see Figure 15.

Figure 14
Corner Flow: CSS-IFS with MST. Visualizing (a) column-wise sketch error, (b) element-wise GW loss, (c) coefficient matrix Y with weights associated with basis trees, and (d) a subset of basis trees. Purple circles highlight structural changes between basis trees.

Outliers among Sketched Merge Trees.
Finally, we investigate individual sketched trees. We use the column-wise sketch error in Figure 14(a) together with the element-wise GW loss in Figure 14(b) to select outliers among the sketched trees, such as T , T , T , and T (magenta boxes). Figure 14(c) shows that such outlier trees are linear combinations of basis trees with fairly uniform weights. In Figure 16(b), we observe that these outlier trees are less visually
Figure 15
Corner Flow: CSS-IFS with MST. Scalar fields that correspond to the basis trees.

appealing, where their sketched versions (red boxes) contain a number of star-like features. For instance, Figure 16(d) shows a noticeable amount of change when comparing the merge tree weight matrices for T (blue box) and T̂ (red box). On the other hand, in Figure 16(a), trees with lower element-wise GW loss are structurally similar to the basis trees, and thus their topology is well approximated. For instance, the sketched tree T̂ (red box) is indistinguishable w.r.t. the original input T .
Figure 16
Corner Flow: CSS-IFS with MST. Individual sketched trees with high and low errors.
The Red Sea dataset comes with 60 merge trees. The input set does not exhibit a natural clustering structure, because many adjacent time steps give rise to trees with noticeable structural changes. Nevertheless, we report our sketching results and discuss how such results improve with an increasing number of basis trees.
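A cheap way to anticipate how much improvement more basis trees can buy is the SVD baseline: the Frobenius error ||A − A_k||_F of the best rank-k approximation A_k lower-bounds the error of any sketch with k basis columns. A numpy sketch (`best_rank_k_errors` is a hypothetical helper):

```python
import numpy as np

def best_rank_k_errors(A, ks):
    """||A - A_k||_F for the best rank-k approximation A_k (via truncated SVD),
    a lower bound on the error of any k-column sketch of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    errors = []
    for k in ks:
        A_k = (U[:, :k] * s[:k]) @ Vt[:k]   # rank-k reconstruction
        errors.append(np.linalg.norm(A - A_k))
    return errors
```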
Coefficient Matrices.
Using both CSS-IFS and CSS-LSS, we compare the coefficient matrices Y for k = 10, 15, and 30, respectively; see Figure 17. For k = 10, S contains T , T , T , T , T , T , T , T , T , T . For k = 15, S includes T , T , T , T , T , T , T , T , T , T , T , T , T , T , and T . Since CSS-IFS is not a deterministic algorithm, as we increase k, the basis set for k = 10 does not necessarily form a subset of the basis set for k = 15. The input trees appear to have diverse structures, since the clustering among them is unclear. This phenomenon is evident in the lack of long yellow rows in the matrices Y. It is also interesting to notice that, for both CSS algorithms, there exists a subset of consecutive columns that contains few selected basis columns (orange boxes for k = 30). In general, the global sketch error and GW loss improve as we increase the number of basis trees. With 15 basis trees, the global GW loss for CSS-IFS, CSS-LSS, and NMF is 1.–, 1.–, and 1.–, and the global sketch error is –.71, 1408.21, and 1115.–, respectively. CSS-IFS best preserves the tree topology.
Figure 17
Red Sea dataset: CSS-IFS (a), CSS-LSS (b). Visualizing coefficient matrices Y with 10, 15, and 30 basis trees, respectively.

Basis Trees and Sketched Trees.
We now discuss basis trees obtained using CSS-IFS with 15 basis trees. In Figure 18(d), we show a subset of the basis trees, T , T , T , T , and T , to demonstrate their diverse structures. We include column-wise sketch errors and element-wise GW losses in Figure 18(a) and Figure 18(b), respectively. Similar to the Corner Flow dataset, we can use these errors to study examples of well-sketched trees (green boxes) and outliers (magenta boxes). Well-sketched trees are those with low GW losses and sketch errors, such as T , T , and T . Given the diversity of the input trees, with just 15 basis trees, we still observe very similar subtree structures among these trees (whose roots are pointed to by black arrows); see Figure 18(e).
Figure 18
Red Sea dataset: CSS-IFS, k = 15. Visualizing (a) column-wise sketch error, (b) element-wise GW loss, (c) coefficient matrix Y, (d) a subset of basis trees, and (e) examples of well-sketched trees. Basis trees are encoded by green boxes. Input trees and their sketched versions are encoded by blue and red boxes, respectively.

In this section, we provide some implementation details for the various algorithms employed in our merge tree sketching framework.
Matrix Sketching Algorithms.
We use two variants of column subset selection (CSS) algorithms, as well as non-negative matrix factorization (NMF), to sketch the data matrix A. Here, we provide pseudocode for these matrix sketching algorithms.

Modified Length Squared Sampling (CSS-LSS)
1. s ← 0; B is an empty matrix; A′ = A.
2. s ← s + 1. Select the column c from A′ with the largest squared norm (or select c randomly with probability proportional to its squared norm) and add it as a column to B. Remove c from A′.
3. For each remaining column c′ in A′ (i.e., c′ ≠ c), factor out the component along c as:
   a. u ← c / ||c||
   b. c′ ← c′ − ⟨u, c′⟩ u
4. While s < k, go to step 2.

Iterative Feature Selection (
CSS-IFS)
1. Choose a subset of k column indices r = { i_1, i_2, . . . , i_k } uniformly at random.
2. Construct the subset B_r = [ a_{i_1}, a_{i_2}, . . . , a_{i_k} ] of A with columns indexed by r.
3. Repeat for j = 1, 2, . . . , k:
   a. Let X_jl denote the matrix formed by replacing column a_{i_j} with column a_l in B_r, where l ∈ [n] \ r. Let X+_jl denote its Moore-Penrose pseudoinverse.
   b. Find w = argmin_{l ∈ [n] \ r} ||A − X_jl X+_jl A||_F.
   c. B_r ← X_jw.
   d. r ← ( r \ { i_j } ) ∪ { w }.

Non-Negative Matrix Factorization (NMF)
1. Given A and k, initialize B ∈ R^{d×k} and Y = X^T ∈ R^{k×N} using the non-negative double singular value decomposition algorithm of Boutsidis and Gallopoulos [15].
2. Normalize the columns of B and X to unit L2 norm. Let E = A − BX^T.
3. Repeat until convergence: for j = 1, 2, . . . , k,
   a. Q ← E + b_j x_j^T.
   b. x_j ← [ Q^T b_j ]_+.
   c. b_j ← [ Q x_j ]_+.
   d. b_j ← b_j / ||b_j||.
   e. E ← Q − b_j x_j^T.
Here, [ Q ]_+ means that all negative elements of the matrix Q are set to zero.

LSST Algorithm.
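As context for the construction below: the quantity an LSST is designed to control is the average stretch of tree distances relative to the original graph distances. A numpy/scipy sketch that measures this quantity for an ordinary MST (`average_stretch` is a hypothetical helper for intuition only; the petal-decomposition LSST bounds the average stretch by O(log n log log n)):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path

def average_stretch(W):
    """Average stretch of the MST of a weighted graph W (dense matrix,
    zeros = missing edges): mean over node pairs of
    (tree distance) / (graph distance)."""
    D_graph = shortest_path(W, directed=False)
    T = minimum_spanning_tree(W).toarray()
    D_tree = shortest_path(T + T.T, directed=False)  # distances restricted to the tree
    iu = np.triu_indices(W.shape[0], k=1)            # each unordered pair once
    return float(np.mean(D_tree[iu] / D_graph[iu]))
```

For a unit-weight 4-cycle, any spanning tree forces one pair to a stretch of 3, so the average stretch is 4/3.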
We construct low stretch spanning trees (LSST) using the petal decomposition algorithm of Abraham and Neiman [3]. Given a graph G, its LSST is constructed by recursively partitioning the graph into a series of clusters called petals. Each petal P(x_0, t, r) is determined by three parameters: the center of the current cluster x_0, the target node of the petal t, and the radius of the petal r. A cone C(x_0, x, r) is the set of all nodes v such that d(x_0, x) + d(x, v) − d(x_0, v) ≤ r. A petal is defined as a union of cones of varying radii. Suppose x_0 → x_1 → · · · → x_k = t is the sequence of nodes on the shortest path between nodes x_0 and t. Let d_k denote the distance d(x_k, t). Then the petal P(x_0, t, r) is defined as the union of cones C(x_0, x_k, (r − d_k)/2) for all x_k such that d_k ≤ r. Beginning with a vertex x_0 specified by the user, the algorithm partitions the graph into a series of petals. When no more petals can be obtained, all the remaining nodes are put into a cluster called the stigma. A tree structure, rooted in the stigma, is constructed by connecting the petals and the stigma using some of the intercluster edges. All other edges between clusters are dropped. This process is applied recursively within each petal (and the stigma) to obtain a spanning tree structure.

Merge Tree Simplification.
To reconstruct a sketched tree, we reshape the sketched column vector â of Â into an n × n matrix Ŵ, and obtain a tree structure T̂ by computing its MST or LSST. T̂ is an approximation of the blow-up tree T. To get a tree approximation closer to the original input tree T, we further simplify T̂ as described below. The simplification process has two parameters. The first parameter α is used to merge internal nodes that are too close (≤ α) to each other. Let R be the diameter of T̂ and n the number of nodes in T̂. α is set to be c_α R/n for c_α ∈ { 0.–, –, – }. A similar parameter was used in simplifying the LSST in [3]. The second parameter β = c_β R/n is used to merge leaf nodes that are too close (≤ β) to their parent node, where c_β ∈ { 0.–, –, – }. Let Ŵ be the weight matrix of T̂. The simplification process is as follows:
1. Remove from T̂ all edges (u, v) where Ŵ(u, v) ≤ α.
2. Merge all leaf nodes u with their respective parent node v if Ŵ(u, v) ≤ β.
3. Remove all the internal nodes.
The tree T̂ obtained after simplification is the final sketched tree.

Merge Tree Layout.
To visualize both input merge trees and sketched merge trees, we experiment with a few strategies. To draw an input merge tree T equipped with a function defined on its nodes, f : V → R, we set each node u ∈ V to be at location (x_u, y_u), where y_u = f(u) and x_u is chosen within a bounding box while avoiding edge intersections. The edge (u, v) is drawn proportional to its weight W(u, v) = |f(u) − f(v)| = |y_u − y_v|. To draw a sketched tree as a merge tree, we perform the following steps:
1. Fix the root of the sketched tree at (0, 0).
2. The y-coordinate of each child node is determined by the weight of the edge between the node and its parent.
3. The x-coordinate, which determines the left-to-right ordering of the child nodes, is computed using a heuristic strategy described below.
   a. Sort the child nodes by their distance to the parent node in descending order.
   b. Suppose the order of the child nodes after sorting is c_1, c_2, . . . , c_t. If t is odd, we reorder the nodes from left to right as c_t, c_{t−2}, c_{t−4}, . . . , c_1, c_2, c_4, . . . , c_{t−3}, c_{t−1}. If t is even, we reorder the nodes as c_{t−1}, c_{t−3}, c_{t−5}, . . . , c_1, c_2, c_4, . . . , c_{t−2}, c_t.
The idea is to keep the child nodes that have a larger distance to the parent near the center, to avoid edge crossings between sibling nodes and their subtrees. Our layout strategy assumes that the trees are rooted. However, T̂, which is our approximation of T, is not rooted. In our experiments, we use two different strategies to pick a root for T̂ and to align T and T̂ for visual comparison. Using the balanced layout strategy, we pick the node u of T̂ that minimizes the sum of distances to all other nodes, and set u to be the balanced root of T̂. Similarly, we find the balanced root v of the input tree T. T and T̂ are then drawn using the balanced roots. Using the root alignment strategy, we compute the optimal coupling between T and T̂ using the GW framework. Then we find the node u from T̂ that has the highest coupling probability with the root node v in T, and lay out T̂ rooted at u.

Programming Language and Included Packages.
Our framework is mainly implemented in Python. The code to compute the LSST and MST from a given weight matrix is implemented in Java. The CSS-IFS algorithm for matrix sketching is implemented in MATLAB. For data processing and merge tree visualization, we use Python packages including numpy, matplotlib, and networkx. In addition, the GW framework of Chowdhury and Needham [25] requires the Python Optimal Transport (POT) package.
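The POT package supplies the GW solvers; for a fixed coupling, the squared-loss GW objective that we report as the GW loss can also be evaluated in plain numpy. A small illustrative sketch (`gw_loss` is a hypothetical helper, not the optimized implementation):

```python
import numpy as np

def gw_loss(C1, C2, coupling):
    """Squared-loss Gromov-Wasserstein objective for a fixed coupling:
    sum over i, j, k, l of (C1[i, k] - C2[j, l])^2 * coupling[i, j] * coupling[k, l]."""
    diff = C1[:, None, :, None] - C2[None, :, None, :]
    return float((diff ** 2
                  * coupling[:, :, None, None]
                  * coupling[None, None, :, :]).sum())
```

With identical weight matrices and the identity coupling, the loss vanishes, which is why aligned trees incur a small element-wise GW loss.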
We discuss some theoretical considerations in sketching merge trees. In the first two steps of our framework, we represent merge trees as metric measure networks and vectorize them via blow-up and alignment to a Fréchet mean using the GW framework [25]. Each merge tree T = (V, W, p) ∈ T is mapped to a column vector a in the matrix A, where W captures the shortest path distances using function value differences as weights. The computation of the Fréchet mean is an optimization process, but the blow-up of T and its alignment to the mean do not change the underlying distances between the tree nodes, which are encoded in W. Therefore, reshaping the column vector a back to a pairwise distance matrix and computing its corresponding MST fully recovers the original input merge tree. In the third step, we sketch the matrix A using either NMF or CSS. Both matrix sketching techniques (albeit with different constraints) aim to obtain an approximation Â = BY of A that minimizes the error ε = ||A − Â||_F. Let A_k denote the (unknown) best rank-k approximation of A. In the case of CSS, the theoretical upper bound was given as a multiplicative error of the form ε ≤ ε_k · ||A − A_k||_F, where ε_k depends on the choice of k [29, 16], or as an additive error ε ≤ ||A − A_k||_F + ε_{k,A}, where ε_{k,A} depends on k and ||A||_F [30, 57]. ||A − A_k||_F is often data dependent. In the case of NMF, a rigorous theoretical upper bound on ε remains unknown. Given an approximation Â of A, the next step is to reconstruct a sketched merge tree from each column vector â of Â. We reshape â into an n × n matrix Ŵ and construct a sketched tree T̂ by computing the MST or the LSST of Ŵ.
The distance matrix D̂ of the sketched tree T̂ thus approximates the distance matrix Ŵ of the blow-up tree T. When a sketched merge tree is obtained via an LSST, there is a theoretical upper bound on the relative distortion of the distances [3], that is, θ ≤ O(log n log log n) for θ = (n choose 2)^(−1) Σ_{x,x′} D̂(x, x′)/Ŵ(x, x′). When a sketched merge tree is obtained via an MST, the theoretical bounds on ||Ŵ − D̂||_F are unknown, although, in practice, MST typically provides better sketched trees in comparison with LSST, as demonstrated in Section 5. Finally, although the smoothing process does not alter the tree structure significantly, it does introduce some error in the final sketched tree, whose theoretical bound is not yet established. Therefore, while we have obtained good experimental results in sketching merge trees, there is still a gap between theory and practice for individual sketched trees. Filling such a gap is left for future work.

In this paper, we present a framework to sketch merge trees. Given a set T of merge trees of (possibly) different sizes, we compute a basis set of merge trees S such that each tree in T can be approximately reconstructed using S. We demonstrate the utility of our framework in sketching merge trees that arise from scientific simulations. Our approach is flexible enough to be generalized to sketch other topological descriptors such as contour trees, Reeb graphs, and Morse–Smale graphs (e.g., [23]), which is left for future work.

References
[1] Ittai Abraham, Yair Bartal, and Ofer Neiman. Embedding metrics into ultrametrics and graphs into spanning trees with constant average distortion. Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 502–511, 2007.
[2] Ittai Abraham, Yair Bartal, and Ofer Neiman. Nearly tight low stretch spanning trees. IEEE Symposium on Foundations of Computer Science, pages 781–790, 2008.
[3] Ittai Abraham and Ofer Neiman. Using petal-decompositions to build a low stretch spanning tree. ACM Symposium on Theory of Computing, pages 395–406, 2012.
[4] Martial Agueh and Guillaume Carlier. Barycenters in the Wasserstein space. SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011.
[5] Noga Alon, Richard M. Karp, David Peleg, and Douglas West. A graph-theoretic game and its application to the k-server problem. SIAM Journal on Computing, 24(1):78–100, 1995.
[6] David Alvarez-Melis and Tommi Jaakkola. Gromov-Wasserstein alignment of word embedding spaces. Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1881–1890, 2018.
[7] Irene Baeza Rojo and Tobias Günther. Vector field topology of time-dependent flows in a steady reference frame. IEEE Transactions on Visualization and Computer Graphics, 26(1):280–290, 2020.
[8] U. Bauer, B. Di Fabio, and C. Landi. An edit distance for Reeb graphs. Proceedings of the Eurographics Workshop on 3D Object Retrieval, pages 27–34, 2016.
[9] Ulrich Bauer, Xiaoyin Ge, and Yusu Wang. Measuring distance between Reeb graphs. Proceedings of the 30th Annual Symposium on Computational Geometry, page 464, 2014.
[10] Ulrich Bauer, Claudia Landi, and Facundo Memoli. The Reeb graph edit distance is universal. Foundations of Computational Mathematics, 2017.
[11] Ulrich Bauer, Elizabeth Munch, and Yusu Wang. Strong equivalence of the interleaving and functional distortion metrics for Reeb graphs. In Lars Arge and János Pach, editors, volume 34 of Leibniz International Proceedings in Informatics (LIPIcs), pages 461–475, Dagstuhl, Germany, 2015. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
[12] Kenes Beketayev, Damir Yeliussizov, Dmitriy Morozov, Gunther Weber, and Bernd Hamann. Measuring the distance between merge trees. Topological Methods in Data Analysis and Visualization III: Theory, Algorithms, and Applications, Mathematics and Visualization, pages 151–166, 2014.
[13] Jean-David Benamou, Guillaume Carlier, Marco Cuturi, Luca Nenna, and Gabriel Peyré. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2):A1111–A1138, 2015.
[14] Philip Bille. A survey on tree edit distance and related problems. Theoretical Computer Science, 337(1-3):217–239, 2005.
[15] Christos Boutsidis and Efstratios Gallopoulos. SVD based initialization: A head start for nonnegative matrix factorization. Pattern Recognition, 41(4):1350–1362, 2008.
[16] Christos Boutsidis, Michael W. Mahoney, and Petros Drineas. An improved approximation algorithm for the column subset selection problem. Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 968–977, 2009.
[17] Alexander M. Bronstein, Michael M. Bronstein, and Ron Kimmel. Efficient computation of isometry-invariant distances between surfaces. SIAM Journal on Scientific Computing, 28(5):1812–1836, 2006.
[18] Horst Bunke and Kaspar Riesen. Graph edit distance: Optimal and suboptimal algorithms with applications. Analysis of Complex Networks: From Biology to Linguistics, pages 113–143, 2009.
[19] Charlotte Bunne, David Alvarez-Melis, Andreas Krause, and Stefanie Jegelka. Learning generative models across incomparable spaces. International Conference on Machine Learning, pages 851–861, 2019.
[20] Gabriel Cardona, Arnau Mir, Francesc Rosselló, Lucía Rotger, and David Sánchez. Cophenetic metrics for phylogenetic trees, after Sokal and Rohlf. BMC Bioinformatics, 14(1):3, 2013.
[21] Hamish Carr, Jack Snoeyink, and Ulrike Axen. Computing contour trees in all dimensions. Computational Geometry, 24(2):75–94, 2003.
[22] Mathieu Carrière and Steve Oudot. Local equivalence and intrinsic metrics between Reeb graphs. In Boris Aronov and Matthew J. Katz, editors, volume 77 of Leibniz International Proceedings in Informatics (LIPIcs), pages 25:1–25:15, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
[23] Michael J. Catanzaro, Justin M. Curry, Brittany Terese Fasy, Janis Lazovskis, Greg Malen, Hans Riess, Bei Wang, and Matthew Zabka. Moduli spaces of Morse functions for persistence. Journal of Applied and Computational Topology, 4:353–385, 2020.
[24] Frédéric Chazal, David Cohen-Steiner, Marc Glisse, Leonidas J. Guibas, and Steve Y. Oudot. Proximity of persistence modules and their diagrams. Proceedings of the 25th Annual Symposium on Computational Geometry, pages 237–246, 2009.
[25] Samir Chowdhury and Tom Needham. Gromov-Wasserstein averaging in a Riemannian framework. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 842–843, 2020.
[26] Andrzej Cichocki and Anh-Huy Phan. Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 92(3):708–721, 2009.
[27] Marco Cuturi and Arnaud Doucet. Fast computation of Wasserstein barycenters. Proceedings of the 31st International Conference on Machine Learning, PMLR, 32(2):685–693, 2014.
[28] Vin de Silva, Elizabeth Munch, and Amit Patel. Categorified Reeb graphs. Discrete & Computational Geometry, pages 1–53, 2016.
[29] Amit Deshpande and Santosh Vempala. Adaptive sampling and fast low-rank matrix approximation. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 292–303, 2006.
[30] Petros Drineas, Ravi Kannan, and Michael W. Mahoney. Fast Monte Carlo algorithms for matrices II: Computing a low-rank approximation to a matrix. SIAM Journal on Computing, 36:158–183, 2006.
[31] Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, and David P. Woodruff. Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13:3441–3472, 2012.
[32] Ian L. Dryden, Alexey Koloydenko, and Diwei Zhou. Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Annals of Applied Statistics, 3(3):1102–1123, 2009.
[33] Herbert Edelsbrunner, John Harer, Vijay Natarajan, and Valerio Pascucci. Morse-Smale complexes for piece-wise linear 3-manifolds. Proceedings of the 19th Annual Symposium on Computational Geometry, pages 361–370, 2003.
[34] Herbert Edelsbrunner, John Harer, and Afra J. Zomorodian. Hierarchical Morse-Smale complexes for piecewise linear 2-manifolds. Discrete & Computational Geometry, 30:87–107, 2003.
[35] Michael Elkin, Yuval Emek, Daniel A. Spielman, and Shang-Hua Teng. Lower-stretch spanning trees. Proceedings of the 27th Annual ACM Symposium on Theory of Computing, 2005.
[36] Yuval Emek and David Peleg. Approximating minimum max-stretch spanning trees on unweighted graphs. SIAM Journal on Computing, 38(5):1761–1781, 2009.
[37] Frank Emmert-Streib, Matthias Dehmer, and Yongtang Shi. Fifty years of graph matching, network alignment and network comparison. Information Sciences, 346–347:180–197, 2016.
[38] Danielle Ezuz, Justin Solomon, Vladimir G. Kim, and Mirela Ben-Chen. GWCNN: A metric alignment layer for deep shape analysis. Computer Graphics Forum, 36:49–57, 2017.
[39] Cédric Févotte and Jérôme Idier. Algorithms for nonnegative matrix factorization with the β-divergence. Neural Computation, 23(9):2421–2456, 2011.
[40] Christoph Flamm, Ivo L. Hofacker, Peter F. Stadler, and Michael T. Wolfinger. Barrier trees of degenerate landscapes. Zeitschrift für Physikalische Chemie, 216(2), 2002.
[41] Ellen Gasparovic, Elizabeth Munch, Steve Oudot, Katharine Turner, Bei Wang, and Yusu Wang. Intrinsic interleaving distance for merge trees. arXiv preprint arXiv:1908.00063, 2019.
[42] Mina Ghashami, Edo Liberty, Jeff M. Phillips, and David P. Woodruff. Frequent directions: Simple and deterministic matrix sketching. SIAM Journal on Computing, 45(5):1762–1792, 2016.
[43] Mikhail Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces, volume 152 of Progress in Mathematics. Birkhäuser, Boston, USA, 1999.
[44] Shawn Gu and Tijana Milenković. Data-driven network alignment. PLoS ONE, 15(7):e0234978, 2020.
[45] Tobias Günther, Markus Gross, and Holger Theisel. Generic objective vortices for flow visualization. ACM Transactions on Graphics, 36(4):141:1–141:11, 2017.
[46] Xiaoyang Guo and Anuj Srivastava. Representations, metrics and statistics for shape analysis of elastic graphs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.
[47] Xiaoyang Guo, Anuj Srivastava, and Sudeep Sarkar. A quotient space formulation for statistical analysis of graphical data. arXiv preprint arXiv:1909.12907, 2019.
[48] C. Heine, H. Leitte, M. Hlawitschka, F. Iuricich, L. De Floriani, G. Scheuermann, H. Hagen, and C. Garth. A survey of topology-based methods in visualization. Computer Graphics Forum, 35(3):643–667, 2016.
[49] Reigo Hendrikson. Using Gromov-Wasserstein distance to explore sets of networks. Master's thesis, University of Tartu, 2016.
[50] Ibrahim Hoteit, Xiaodong Luo, Marc Bocquet, Armin Köhl, and Boujemaa Ait-El-Fquih. Data assimilation in oceanography: Current status and new directions. In Eric P. Chassignet, Ananda Pascual, Joaquin Tintoré, and Jacques Verron, editors, New Frontiers in Operational Oceanography. GODAE OceanView, 2018.
[51] Brijnesh J. Jain and Klaus Obermayer. Learning in Riemannian orbifolds. arXiv preprint arXiv:1204.4294, 2012.
[52] Tao Jiang, Eugene L. Lawler, and Lusheng Wang. Aligning sequences via an evolutionary tree: complexity and approximation. Proceedings of the 26th Annual ACM Symposium on Theory of Computing, pages 760–769, 1994.
[53] William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26:189–206, 1984.
[54] Ioannis Koutis, Gary L. Miller, and Richard Peng. A nearly-m log n time solver for SDD linear systems. Proceedings of the IEEE 52nd Annual Symposium on Foundations of Computer Science, 2011.
[55] Edo Liberty. Simple and deterministic matrix sketching.
Proceedings of the 19th ACM SIGKDDInternational Conference on Knowledge Discovery and Data Mining , pages 581–588, 2013. Shusen Liu, Dan Maljovec, Bei Wang, Peer-Timo Bremer, and Valerio Pascucci. Visualizinghigh-dimensional data: Advances in the past decade.
IEEE Transactions on Visualization andComputer Graphics , 23(3):1249–1268, 2017. Michael W. Mahone and Petros Drineas. Structural properties underlying high-qualityrandomized numerical linear algebra algorithms. In M. Kane P. Buhlmann, P. Drineas andM. van de Laan, editors,
Handbook of Big Data , pages 137–154. Chapman and Hall, 2016. Facundo Mémoli.
Estimation of distance functions and geodesics and its use for shapecomparison and alignment: theoretical and computational results . PhD thesis, University ofMinnesota, 2005. Facundo Mémoli. On the use of Gromov-Hausdorff distances for shape comparison.
EurographicsSymposium on Point-Based Graphics , pages 81–90, 2007. Facundo Mémoli. Gromov-Wasserstein distances and the metric approach to object matching.
Foundations of Computational Mathematics , 11(4):417–487, 2011. Facundo Mémoli and Tom Needham. Gromov-Monge quasi-metrics and distance distributions.arXiv preprint arXiv:1810.09646, 2020. Facundo Mémoli and Guillermo Sapiro. Comparing point clouds.
Proceedings of the Euro-graphics/ACM SIGGRAPH Symposiumon Geometry Processing , pages 32–40, 2004. Facundo Mémoli and Guillermo Sapiro. A theoretical and computational framework forisometry invariant recognition of point cloud data.
Foundations of Computational Mathematics ,5:313–347, 2005. Facundo Memoli, Anastasios Sidiropoulos, and Kritika Singhal. Sketching and clusteringmetric measure spaces. arXiv preprint arXiv:1801.00551, 2018. J. Milnor.
Morse Theory . Princeton University Press, New Jersey, 1963. Dmitriy Morozov, Kenes Beketayev, and Gunther Weber. Interleaving distance between mergetrees.
Proceedings of Topology-Based Methods in Visualization (TopoInVis) , 2013. Elizabeth Munch and Anastasios Stefanou. The ‘ ∞ -cophenetic metric for phylogenetic trees asan interleaving distance. In Research in Data Science , Association for Women in MathematicsSeries, pages 109–127. Springer International Publishing, 2019. Bruno Ordozgoiti, Sandra Gómez Canaval, and Alberto Mozo. A fast iterative algorithm forimproved unsupervised feature selection.
IEEE 16th International Conference on Data Mining ,pages 390–399, 2016. Gabriel Peyré, Marco Cuturi, and Justin Solomon. Gromov-Wasserstein averaging of kerneland distance matrices.
Proceedings of the 33rd International Conference on Machine Learning,PMLR , 48:2664–2672, 2016. Jeff M. Phillips. Coresets and sketches. In
Handbook of Discrete and Computational Geometry ,chapter 48. CRC Press, 3rd edition, 2016. S. Popinet. Free computational fluid dynamics.
ClusterWorld , 2(6), 2004. URL: http://gfs.sf.net/ . G. Reeb. Sur les points singuliers d’une forme de pfaff completement intergrable ou d’unefonction numerique (on the singular points of a complete integral pfaff form or of a numericalfunction).
Comptes Rendus Acad.Science Paris , 222:847–849, 1946. Tamás Sarlós. Improved approximation algorithms for large matrices via random projections.
Proceedings of 47th IEEE Symposium on Foundations of Computer Science , pages 143–152,2006. Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning,graph sparsification, and solving linear systems. In
Proceedings of the 26th Annual ACMSymposium on Theory of Computing , 2004. Raghavendra Sridharamurthy, Talha Bin Masood, Adhitya Kamakshidasan, and Vijay Natara-jan. Edit distance between merge trees.
IEEE Transactions on Visualization and ComputerGraphics , 2018. Karl-Theodor Sturm. The space of spaces: curvature bounds and gradient flows on the spaceof metric measure spaces. arXiv preprint arXiv:1208.0434, 2012. Vayer Titouan, Nicolas Courty, Romain Tavenard, and Rémi Flamary. Optimal transport forstructured data with application on graphs.
International Conference on Machine Learning ,pages 6275–6284, 2019. Vayer Titouan, Rémi Flamary, Nicolas Courty, Romain Tavenard, and Laetitia Chapel. SlicedGromov-Wasserstein.
Advances in Neural Information Processing Systems , pages 14726–14736,2019. David P. Woodruff. Sketching as a tool for numerical linear algebra.
Foundations and Trendsin Theoretical Computer Science , 10(1-2):1–157, 2014. Hongteng Xu, Dixin Luo, and Lawrence Carin. Scalable Gromov-Wasserstein learning forgraph partitioning and matching.
Advances in Neural Information Processing Systems , pages3046–3056, 2019. Hongteng Xu, Dixin Luo, Hongyuan Zha, and Lawrence Carin. Gromov-Wasserstein learningfor graph matching and node embedding.
International Conference on Machine Learning ,pages 6932–6941, 2019. Lin Yan, Yusu Wang, Elizabeth Munch, Ellen Gasparovic, and Bei Wang. A structural averageof labeled merge trees for uncertainty visualization.
IEEE Transactions on Visualization andComputer Graphics , 26(1):832–842, 2020. Peng Zhan, George Krokos, Daquan Guo, and Ibrahim Hoteit. Three-dimensional signature ofthe Red Sea eddies and eddy-induced transport.
Geophysical Research Letters , 46(4):2167–2177,2019. . Li, S. Palande, and B. Wang 27 Peng Zhan, Aneesh C. Subramanian, Fengchao Yao, and Ibrahim Hoteit. Eddies in the red sea:A statistical and dynamical study.