Data-Driven Space-Filling Curves
DData-Driven Space-Filling Curves
Liang Zhou, Chris R. Johnson, and Daniel Weiskopf
Data-driven space-filling curvePeano-Hilbert curveFunctional boxplots
Volume renderer
Volume rendering with brushing-and-linking
Line chart view
Fig. 1. Visualizations of an ensemble of volumetric data. The key idea is to employ a mapping from 3D space (as in the volumerenderings) to 1D space via space-filling curves, which then allows us to show boxplots of the data distributions. The ensembleis generated by sampling from Gaussian distributions of data values in the nucleon data with varying extents of uncertainty. Theboxplots of the ensemble linearized with the Peano-Hilbert curve (bottom) do not preserve the coherency of 3D features—the smalltorus structure of high intensity cannot be readily identified. In contrast, our data-driven space-filling curve method (top) preservesfeatures from 3D even in the 1D linearized representation as high intensities are more concentrated. This observation is confirmed bybrushing-and-linking—the torus could be covered by one brush and its surroundings with two brushes with our method (see the volumerendering on the right and the yellow and purple regions in “Data-driven space-filling curve”), whereas multiple brushes are required bythe Peano-Hilbert curve (yellow and purple regions in “Peano-Hilbert curve”).
Abstract —We propose a data-driven space-filling curve method for 2D and 3D visualization. Our flexible curve traverses the dataelements in the spatial domain in a way that the resulting linearization better preserves features in space compared to existingmethods. We achieve such data coherency by calculating a Hamiltonian path that approximately minimizes an objective function thatdescribes the similarity of data values and location coherency in a neighborhood. Our extended variant even supports multiscale datavia quadtrees and octrees. Our method is useful in many areas of visualization, including multivariate or comparative visualization,ensemble visualization of 2D and 3D data on regular grids, or multiscale visual analysis of particle simulations. The effectiveness of ourmethod is evaluated with numerical comparisons to existing techniques and through examples of ensemble and multivariate datasets.
Index Terms —Space-filling curves, comparative visualization, ensemble visualization, multivariate visualization
NTRODUCTION
Space-filling curves (SFCs) linearize an n -D image through a one-to-one mapping into one dimension. Such linearization is useful invisualization as a tool for dimensionality reduction for 2D and 3Ddatasets. In this paper, we propose a data-driven space-filling curvemethod for data on regular or multiscale grids. Our main goal is topreserve spatial coherency (i.e., locality) and data coherency (i.e., datafeatures) at the same time. We construct a faithful representation ofthe original 3D or 2D data after linearization. The method is intendedfor easy feature identification in the 1D visualization of space-fillingcurves—as line-plots—and facilitates subsequent user interactions, e.g.,brushing-and-linking in comparative visualizations.An important factor for choosing an appropriate space-filling curveis how well the locality of a dataset is preserved. Among typical space-filling curves, the Peano-Morton curve does not effectively preservethe locality, whereas the Peano-Hilbert curve [10] is considered tohave better locality. Therefore, the Peano-Hilbert curve is popular invisualization. However, these space-filling curves ignore the content of • Liang Zhou and Chris R. Johnson are with SCI Institute, University of Utah.E-mail: lzhou; [email protected].• Daniel Weiskopf is with Visualization Research Center (VISUS), Universityof Stuttgart. E-mail: [email protected] received xx xxx. 201x; accepted xx xxx. 201x. Date of Publicationxx xxx. 201x; date of current version xx xxx. 201x. For information onobtaining reprints of this article, please send e-mail to: [email protected] Object Identifier: xx.xxxx/TVCG.201x.xxxxxxx a dataset.This issue is illustrated in Fig. 1. It shows the visualization of anensemble of a nucleon volumetric dataset generated by sampling fromGaussian distributions with varying extents of uncertainty: the vol-ume rendering generated with a blue-white-red color map is shownin Fig. 1 (left). Boxplots along the Peano-Hilbert curve are shownin Fig. 1 (central bottom), where the feature coherency in 3D is notpreserved—the small torus structure of high intensity cannot be iden-tified. In fact, the torus is split into distant pieces in the 1D spaceand multiple brushes are required to select the feature (yellow areasin the “Peano-Hilbert curve” of Fig. 1). By contrast, with our method(Fig. 1 (central top)), the torus can be identified as a feature spanning amuch smaller range in 1D, and can be selected with a single brush (asseen in the yellow region) thanks to better preservation of features inthe spatial domain. With brushing-and-linking, the same regions arehighlighted in yellow in 3D (Fig. 1 (right)) using linearizations with ourmethod and the Peano-Hilbert curve. The better feature preservation ofour method is also demonstrated with the purple brushes.Our main contribution is a data-driven space-filling curve approachthat comprises two variants of techniques: one for 2D and 3D regulargrids, and another for 2D and 3D multiscale data. For regular grids, ourmethod generates Hamiltonian cycles by replacing a minimum spanningtree using an objective function that combines locality and positionterms; for multiscale data—quadtrees and octrees—our method findsadaptive Hamiltonian paths across data scales in a greedy fashion. Toenable the calculation of Hamiltonian paths for multiscale data, wemake a second contribution: a simple and efficient technique that findsa Hamiltonian path given only the entry and exit edges (2D) and faces a r X i v : . [ c s . G R ] S e p . ELATED W ORK
Space-filling curves [24], discovered by Peano [19], are traditionaltopics in mathematics but now have various applications across differentareas in computer science. Well-known space-filling curves include thePeano curve [19], the Gray code ordering [8], and the Peano-Hilbertcurve [10]. These methods consider only spatial discretization onregular grids. Adaptively refined space-filling curves are availablefor multiscale data structures, specifically, quadtrees and octrees, fordynamic load balancing for high-performance computing [4]. However,these methods use static configurations that are independent of thecontent of the data and relate only to the size of the data.The context-based space-filling curve [5] is one of the few examplesof a data-dependent curve. It targets to improve autocorrelation in2D image and video encoding. There, a “dual graph” (we use thisredefinition by Dafner et al. [5] throughout our paper) is generated fromthe input image and then a minimum spanning tree of the graph is found,where weights are determined by an objective function. Finally, thespace-filling curve is constructed by replacing the minimum spanningtree with a Hamiltonian path from a Hamiltonian cycle. However,this method is limited to 2D data and does not support multiscaledata, making it unsuitable for many visualization applications. Unlikethis method, our data-driven space-filling curves support 3D volumedata and multiscale data of 2D and 3D, which are not possible withthe context-based space-filling curves [5]. In addition, our methodintroduces a new objective function that achieves both feature andlocality coherency, making it more flexible than the context-basedmethod. Another example is an approximation method of the space-filling curve with a data-driven metric [26]. However, only simple 2Dexamples with distributed points are demonstrated and it is unclear howthe method could be extended to more complex data such as images andvolumes. A random space-filling curve method [13] based on the “coverand merge” strategy is not data-driven but inspires the computationalframework of the context-based space-filling curve [5] as well as ourregular grid techniques.Space-filling curves are useful for many visualization purposes. Theyfacilitate comparative visualizations due to locality preservation, i.e.,points that are close on the space-filling curve are close in the original2D/3D space (not necessarily the other way around). Space-fillingcurves are used in ensemble visualization of 3D volumetric data [6, 28].Peano-Hilbert curves are calculated for 3D ensemble data of multiplelevels-of-details, and the linearized results are visualized as interactiveenhanced line charts [6] making comparisons of 3D members possi-ble. Similarly, a nonlinear compression method is available for thelinearized 3D ensemble calculated using Peano-Hilbert curves [28].Hilbert attention maps [16] use Peano-Hilbert curves to visualize time-varying eye-tracking data sampled on 2D regular grids as a static image,allowing features of interest that span a small neighborhood to betraced easily in the attention maps. For all methods above, brushing-and-linking is used as the major exploration approach that relates the1D linearization and the original data. Since our technique improvesspace-filling curves implementations for visualization, all of these visu-alization applications could potentially benefit from our method.Hamiltonian paths and cycles form the computational basis of ourmethod. A Hamiltonian path/cycle is a path/cycle that visits eachnode in a graph exactly once, and a Hamiltonian cycle can be easilyconverted to a Hamiltonian path by a single cut on the cycle. Thecomputation of the general Hamiltonian path problem is NP-hard [1].For restricted scenarios, however, more efficient solutions are possible.The existence/nonexistence of a Hamiltonian path is proven for 2D h ttps://github.com/zhou-l/DataDrivenSpaceFillCurve.git grid graphs [11]; for 3D graphs of even-numbered nodes along eachdimension, a Hamiltonian path can be generated from a Hamiltonian cy-cle [2]. However, these methods require specified entry and exit nodes,which is infeasible for data-driven space-filling curves for multiscaledata. This is because if a path leaves a block of finer nodes and entersto a block of coarser nodes, we only know the exiting face of the blockof finer nodes and the entering face of the block of coarser nodes. Wepropose a more flexible Hamiltonian path generation method—for both2D and 3D regular grids, given only edges/faces of entry and exit of abounding rectangle/box—as a building block for our method.Ensemble visualization, an active and challenging visualizationtopic [18], is one of the target applications of our technique. Besidesthe aforementioned methods using space-filling curves [6, 28], thereare alternative techniques that use depth-based statistics [9, 14, 21, 29],scatterplots and parallel coordinates [23], trend graphs and parallelcoordinates [17], and a flexible linked-view system with a configurablecollection of statistical representations [20]. Depth-based statistics isa fundamental building block for ensemble visualization. The com-putation and visualization of depth-based statistics is available for 1Dfunctions [27], 2D surfaces [9], 2D contours [29], 3D contours [21],and 2D and 3D curves [14]. In our paper, we employ a 3D extension ofthe surface boxplot [9] together with our data-driven curves to visualizeensemble datasets. ROBLEM F ORMULATION
To support regular grids data and multiscale data with a unified repre-sentation, we model the input data in 2D and 3D as a graph: G = ( V , E , L ) , where vertices V are nodes/vertices of the grid, edges E connect neigh-boring vertices (typically 4-neighbor and 6-neighbor for 2D and 3Ddata, respectively), and L is the level of the scale of the vertex. Ourformulation facilitates a flexible multiscale representation with theper-vertex scale L , as shown in Fig. 2. V G EV G L = 3 L = 2 L = 1 E quadtree P min Fig. 2. A 2D multiscale graph G = ( V , E , L ) on a quadtree with a Hamilto-nian path P min . Regular grids are a special case of G where the level is constant( L =
1) for all vertices and the graph becomes a grid graph (Fig. 3). G Grid graph G c Circuit graph u u ca b w C i C j L = 1 V E C
Fig. 3. (Left) A 2D graph on a regular grid G = ( V , E , ) with correspondingdata values (black = 0, red = 1), and its associated circuit graph G c (right)of circuits C . Adjacent circuits of C i are drawn in light blue. The edgeweights of data values between circuits C i and C j are shown on the right. Typical space-filling curves [10, 19, 24] focus on the geometry of thecurves, i.e., the geometry of V and E , which concerns only the preser-vation of locality but not data features as the curves are ignorant of theunderlying data s ( V ) . Our goal is to generate space-filling curves that d) Hamiltonian cycle/path C i C j − .
33 0 . − .
33 0 . − .
33 0 .
67 0 . − . − . − . G c G ′ c Circuit graph(a) Dual graph of the ciruit graph (b) Minimum spanning tree .
67 0 00 . . (c) Merging circuits merge Fig. 4. Intermediate steps of the framework of [5] on 2D regular grids. The dual graph G (cid:48) c (a) of circuits graph G c is a directed graph with our newweights (labeled on edges). Then, the minimum spanning tree (b) is found on the dual graph. Next, the Hamiltonian cycle is generated by mergingcircuits using the minimum spanning tree (c). Finally, (d) the Hamiltonian cycle is converted into a Hamiltonian path by making a cut anywhere on thecycle (shown as X). preserve both locality and data features. This requires the scan order tobe updated according to the data. We focus on data-driven space-fillingcurves that traverse through connected nodes within the graph G . Coun-terexamples (curves that jump between unconnected nodes) are knownto have poor locality coherency, e.g., the scanline and the Peano-Mortoncurve. In our case, the space-filling curve problem is equivalent toa Hamiltonian path problem [11] with coherency preservation, whichallows us to formulate the generation of data-driven space-filling curvesas an optimization problem of Hamiltonian paths with an objectivefunction that takes measures of both locality preservation and data-feature preservation into account. Finding the minimum total weight ofall possible Hamiltonian paths is hard on regular grids [5] and also onmultiscale grids.We denote a Hamiltonian path P through all vertices V as a sequence: P = ( v , v ,..., v n ) , where v i ∈ V is adjacent to v i + for 1 ≤ i < n . We aim to find a path P min that minimizes an objective function f ( P ) : P min = argmin P f ( P ) . The objective function is formulated to be the sum of weights W that iscomprised of a feature preservation term N that concerns data values s ( v ) of vertex v , and a locality preservation term R : f ( P ) = n − ∑ i = W ( v i , v i + ) , (1) W ( v i , v i + ) = ( − α ) N ( s ( v i ) , s ( v i + )) + α R ( v i , v i + ) , where α ∈ [ , ] is a user-set blend factor. Our locality preservationterm is a simplified, first-order locality measure. The true localitymeasure of space-filling curves is multiscale, and, therefore, muchmore complicated. However, our simplified model still yields betterpositional coherency compared to the scanline and the context-basedmethod, as shown in Section 6.Solving the minimization problem is infeasible except for extremelysmall datasets, and, therefore, we find an approximate optimum of theobjective function. For regular grids, the optimum of f ( P ) is approxi-mated by adopting the strategy used by the context-based method [5]but with our new objective function and an extension to 3D. Stepsinvolved in the framework of regular grids are illustrated in Fig. 4.In Section 4, we briefly review the setup of the framework of theregular grid and elaborate on our new objective function and its impact.The rationale and details of this framework can be found elsewhere [5].For multiscale data, we propose approximately minimizing the objec-tive function f ( P ) using a top-down and recursive greedy algorithm,which is explained in Section 5. PACE -F ILLING C URVE G ENERATION FOR R EGULAR G RIDS
The steps for computing data-driven space-filling curves on regulargrids are described in Algorithm 1. Our new contributions are a newobjective function as explained in Section 4.1 and the extension to 3Ddetailed in Section 4.2.We briefly review the computational framework [5] using a 2Dexample as illustrated in Fig. 4. For a regular grid G with an even number of vertices in each dimension, we first convert it to a graph G c of small circuits C (Fig. 3), and then compute the dual graph G (cid:48) c (referto the redefinition in the context-based method [5]) of G c (Fig. 4 (a)).With the dual graph of small circuits, we are able to find P min byconstructing the minimum spanning tree of G (cid:48) c . The width and heightof G (cid:48) c are w d and h d respectively; each node of G (cid:48) c corresponds toa circuit C k for k ∈ { ,..., w d × h d } of 2 × v j adjacent to vertex v i in P isnow transformed to evaluating the weight on circuits W ( C i , C j ) , where C i and C j are adjacent, and the dual of C i is already in the minimumspanning tree (Fig. 3 (right)). A minimum spanning tree is the treethat minimizes the sum of weights among all possible trees [25], i.e., itguarantees to find W ( C i , C i + ) as the minimum among all W ( C i , C j ) ineach step. The minimum spanning tree is built by joining edges of G (cid:48) c using Prim’s algorithm (Fig. 4 (b)). Next, the minimum spanning treeis converted to a Hamiltonian cycle by merging the circuits accordingto the minimum spanning tree with the cover-and-merge strategy [13](Fig. 4 (c)). Finally, a Hamiltonian path P min is created by making asingle cut anywhere in the Hamiltonian cycle [5] (Fig. 4 (d)). Algorithm 1
Data-Driven SFC for Regular Grids procedure DD SFCR EG G RID ( G ) G (cid:48) c ← buildSmallCircsDualGraph ( G ) (cid:46) G —2D/3D gridgraph, G (cid:48) c —dual graph of small circuits graph of G W ← calculateDualGraphWeights ( G (cid:48) c ) (cid:46) W —weights on G (cid:48) c MST ← findMinSpanTree ( G (cid:48) c , W ) (cid:46) MST—minimumspanning tree P min ← mergeHamCycleAndCut (MST, v s ) (cid:46) v s —entryvertex, P min —data-driven SFC return P min In our new objective function, the definition of weights of circuits onregular grids by adding the circuit C j to the minimum spanning treereads: W ( C i , C j ) = ( − α ) N ( C i , C j ) + α R ( C i , C j ) , α ∈ [ , ] , (2)where C i and C j are adjacent circuits, and the dual of C i is already inthe minimum spanning tree.For value coherency, we reuse the definition of value weights ofcircuits of the context-based method [5]. The value weight that growsthe minimum spanning tree with C j reads: N ( C i , C j ) = | u | + | u | + | w | + | w | + | c | − | b | − | a | , (3) S C j S C i C j C i Fig. 5. G (cid:48) c is partitioned (the blue dash-dot line) into blocks denoted bytheir centers (e.g., S C i and S C j ) to accommodate the new positional termof the objective function. urs Context-based Peano-Hilbert Scanline (b)(c)(a) (d)(e) Autocorrelation of data valueAutocorrelation of distance
Fig. 6. Comparison of linearization methods applied to a slice of the nucleon dataset (a). Our new data-driven space-filling curve is compared toprevious methods: the context-based space-filling curve, the Peano-Hilbert curve, and scanline ordering. The spatial layout of the respective curves isshown in (b), and the linearization of the data values in (c). The spatial layout (b) is colored by the traversal order of curves (the horizontal axis of (c))with the parula colormap (right of (b) Ours). Autocorrelations of value (d) and radial distance (e) quantify data coherency and locality preservation,respectively. The plots show that our approach provides the best compromise between the two conflicting goals. Note that the autocorrelations ofdata values of our method and the context-based method are largely overlapping.
Spatial layoutLinearizations of data valuesAutocorrelations of radial distanceNucleon slice
Fig. 7. The effect of different α (left–right: 0, 0.01, 0.03, 0.1, 0.3, and 1.0) on the scan order of our space-filling curves for the nucleon slice data (left).The order is color-coded using the same color map as in Fig. 6. where u , u , w , and w belong to edges along the growing directionof the minimum spanning tree, whereas a , b , and c belong to facesacross the growing direction. All values above are differences of datavalues s ( v ) of vertices along corresponding edges in the grid graph asdefined in Fig. 3 (right).To measure the positional coherency, we first partition the dual graphinto blocks with width w b and height h b , and denote the block centerof circuit C k as S C k , as shown in Fig. 5. Then, we derive our positionalcoherency term that measures the distance of the 2D position of thecircuit to the block center. The positional term is defined as follows: R ( C i , C j ) = R pos ( C j ) = || ( C j . x , C j . y ) − ( S C j . x , S C j . y ) || , (4)where R pos ( C j ) measures the positional difference as the spatial dis-tance between C j and the center of the block S C j . Since an edge weightis required for finding the minimum spanning tree, we assign R pos ( C j ) to the edge C i – C j in the dual graph to facilitate a unified weight defini-tion with the value term.A comparison of our data-driven curve for 2D regular grids andother linearization techniques is shown in Fig. 6. It can be seen thatour method (Fig. 6 (Ours)) yields coherent results and correctly revealsthe two peaks as coherent and neighboring features. The context-basedspace-filling curve (Fig. 6 (Context-based)) also reveals such struc-tures but its spatial layout is not localized (Fig. 6 (b, Context-based)),which is confirmed by a similar autocorrelation of value (Fig. 6(d))and a inferior autocorrelation of radial distance (Fig. 6(e)) comparedto our new method). This indicates that our new method yields morecoherent results than the context-based method. The scanline order(Fig. 6 (Scanline)) generates a cluttered line chart that goes up anddown and it is not possible to see the data content; the Peano-Hilbert curve (Fig. 6 (Peano-Hilbert)) fails to show the two bright regions asneighboring features, and the concentrated overall structure is shownalong the whole span of the line chart.In our method, the blend factor α allows the user to flexibly changethe importance of value coherency and positional coherency, which isnot possible in the context-based space-filling curve [5]. Fig. 7 showsthe effect of α on the traversal order of an image. The impact of α on value coherency and positional coherency is data-dependent andnonlinear. We empirically used an α value of 0.1 (except for Fig. 16,where α =
0) as tests on datasets for evaluation (Section 6) show thatsuch a value yields a good balance between the positional coherencyand data-value coherency. We recommended using α = . Because a data-driven or context-based space-filling curve technique for3D data on regular grids is useful for visualization applications [6, 28],we extend our data-driven space-filling curve to 3D regular grids. Fig. 8shows a comparison of linearizations of a synthetic volume data—asphere with increasing data value from exterior to interior (Fig. 8 (d))—using our data-driven method, the Peano-Hilbert curve, and scanlineordering. It can be seen that our method (Fig. 8 (a)) best preserves thevalue signature of the sphere as a concentrated continuous single peakwith least noise, which is not possible with the Peano-Hilbert curve(Fig. 8 (b)) or the scanline (Fig. 8 (c)), the sphere is split into manypieces make the feature unidentifiable.We extend the computational framework of the aforementioned2D method (Algorithm 1) to 3D. Here, G c —the equivalent to the urs Peano-Hilbert Scanline Fig. 8. Linearizations (top) of a synthetic volume data of a sphere(left) with our data-driven curve, the Peano-Hilbert curve, and scanlineordering. The scan orders of curves are shown in the bottom row. a a a a b b b b u u u u w w w w c c c c C j C i (a)(b) Parallel Nonparallel (c)
Fig. 9. The value weights of cubes on 3D regular grids (a). The cubesneed to be converted into cycles during merging. There exist (b) six cycleconfigurations of a unit cube, and the cycles are merged with (c) twoassociation rules. circuit graph in 2D—is now comprised of small unit cubes—of 2 × × N ( C i , C j ) = ∑ r = ( | u r | + | w r | + | c r | − | b r | − | a r | ) , R ( C i , C j ) = R pos ( C j ) , R pos ( C j ) = || ( C j . x , C j . y , C j . z ) − ( S C j . x , S C j . y , S C j . z ) || . (5)As an analogy to the 2D case, u r and w r are edges along the growing di-rection, while a r , b r , and c r are faces across the growing direction. Thevalue weights of cubes on 3D regular grids are illustrated in Fig. 9 (a).Instead of four neighbors (top, down, left, right) in the 2D case, sixneighbors (with front and back as additional neighbors) are used in the3D case when building the minimum spanning tree.The fact that a 3D unit cube is no longer directly a cycle as in the2D case makes the conversion from the minimum spanning tree to aHamiltonian cycle more complicated. There exist six possible cycleconfigurations in a unit cube as shown in Fig. 9 (b).After building the minimum spanning tree, we grow the Hamiltoniancycle by traversing the tree and associating unit cycles with a randomconfiguration (from the six configurations) with association rules [2].Two association rules for two neighboring unit cycles are adopted(Fig. 9 (c)): if parallel neighboring edges exist, we break the paralleledges and associate the four endpoints; if parallel edges do not exist, weneed to break the neighboring edges and associate all eight endpoints.Our data-driven technique could be extended for n data attributes,where multidimensional data values live in regular grids in a 2D or3D spatial domain. The vertices in the graph now have vector-baseddata values, and the weights of the objective function can be defined ascertain metrics of the vectors, e.g., L2-norm. Our current visualizationis based on the linearization of one data attribute for multidimensionaldata, and one member (typically, the median) for ensemble datasets. ULTISCALE D ATA -D RIVEN S PACE -F ILLING C URVES
Our data-driven curve for a multiscale data is equivalent to finding aminimum Hamiltonian path that traverses every leaf node in a quadtreeor octree T . However, the aforementioned regular grid strategy is notapplicable to build a dual graph for multiscale graphs because theyoften do not contain even-numbered vertices along each dimension(Fig. 10 (a)), and, therefore, a Hamiltonian cycle does not exist. As aresult, we resort to an approximation strategy to the minimum Hamilto-nian path on multiscale grids with top-down adaptive refinement. Algorithm 2
Data-Driven SFC for Multiscale Data procedure SFC
MULTISCALE ( { I , I , ··· , I L c } , T ) (cid:46) T: treestructure (quadtree/octree) P top ← findTopLevelSFC ( I L c ) (cid:46) computes P top —the top levelSFC (data-driven) P min ← [] (cid:46) P min —SFC of the whole data v last ← (cid:46) v last —the last SFC node for i in range(1, length( P top )) do block ← T ( P top [ i ] )) (cid:46) retrieves the corresponding block ofthe current SFC node from the tree Pz ← refine ( { I , I , ··· , I L c } , T , block, v last ) (cid:46) Pz —SFC ofthe current block computed by adaptive refinement (data-driven) P min ← [ P min , Pz ] (cid:46) appends P min with Pz v last ← P min [ last ] (cid:46) records the last member of P min return P min The process of our mutliscale data-driven space-filling curve methodis summarized in Algorithm 2. Given a multiscale dataset on a quadtree(Fig. 10 (a)) or octree whose nodes are of levels 1 ≤ L ≤ L c , where1 is for the finest level and L c is for the coarsest level, we preparean image/volume pyramid of L c levels { I , I , ··· , I L c } for subsequentcomputations. First, the top-level space-filling curve P top of the coarsestlevel I L c is found (Fig. 10 (b)). Based on the number of nodes in thecoarsest pyramid level, the path is calculated using either the regular-grid-based data-driven curve method as described in Section 4 or thegeneral Hamiltonian path method as discussed in Section 5.1. Then,we adaptively refine each element of the top-level curve P top (i.e., amultiscale node in the corresponding quadtree/octree)—at each level, aminimum Hamiltonian path is found with our flexible Hamiltonian pathmethod that improves the Hamiltonian path method on grid graphs [11,15] (Fig. 10 (c)). Finally, the linearization is achieved: both the datavalue and the scale of the vertex are recorded (Fig. 10 (d)).The objective function is approximately minimized during the pro-cess. The data value coherency term is minimized approximately withthe flexible Hamiltonian path generation for each level in a block, andby finding the best entry node during adaptive refinement; the local-ity term is implicitly minimized by the hierarchical block-by-blockadvancing of the curve similar to the Peano-Hilbert curve.In the rest of this section, we explain the flexible Hamiltonian pathgeneration method, and then, the adaptive refinement process; finally,we discuss scenarios when the multiscale technique or the regular gridstechnique should be used. Typical Hamiltonian path methods solve the problem, i.e., ( G , v s , v t ) , ona regular grid G with distinct, explicitly given entry and exit vertices v s and v t . However, this is not appropriate for our method as the adjacentvertices are of different scales. For example, as shown in Fig. 10,the exit vertex of the top-level block 3 (Fig. 10 (b)) cannot be knownbeforehand, but only the exit face of the block is known given thetop-level space-filling curve. Therefore, in our formulation, we rewritethe Hamiltonian path problem as ( G , v s , F t ) , (6)where F t is the exit side/face of the bounding rectangle/box of G . Thetask is then to calculate the minimum path from v s to a valid vertex on F t . b) Top-level SFC(a) Input multiscale data (c) Adaptive, data-driven refinement of each node in the top-level SFC (d) Linearization of data values (blue) and scale levels (red)12 34 Find the best entry to the next top-level node Recursively refine the nodeRecursively refine the nodeRecursively refine the node Find the best entry to the next top-level node
Fig. 10. Steps to compute a data-driven space-filling curve for multiscale data.
We have to compute the minimum path among all possible pathsfrom entry point v s to all valid vertices on F t . Here, the objectivefunction f ( P ) is simplified to describe only the data value coherencyof the sequence, and it is defined as the sum of gradient magnitude ofdata values s ( V ) along the path: f ( P ) = n − ∑ i = || s ( v i + ) − s ( v i ) || . (7)Therefore, a smoothly changing path is encouraged and a path thatfluctuates significantly is punished.We can find (or show the nonexistence of) a Hamiltonian paththrough given entry and exit vertices for small grids directly by us-ing exhaustive search. A larger graph has to be partitioned into smallerones: in practice, the largest graph that can be solved directly is 8 × × × F t onthe left and top for 3D graphs are shown in (c) and (d). Since the (a) (b)(c) (d) Fig. 11. Flexible Hamiltonian paths for 2D and 3D grid graphs. Exit sidesare on the (a) right and (b) left for these examples of 2D graphs. Theexample 3D graph has exit faces on the (c) left and at the (d) top. flexible Hamiltonian path method is the building block of our multiscalespace-filling curve techniques, exhaustive search is implemented in annon-recursive fashion using stacks to improve efficiency.
The refinement method refine —as described in Algorithm 3—is thecore of the space-filling curve for multiscale data. If any of the nodeswithin the block is not a leaf node, it has to be refined all the way downto the finest level in a data-driven fashion. The key is to determine thesuitable entry node for blocks at different levels (the findBestEntry function in Algorithm 3): we keep track of the last vertex v last in theHamiltonian path and utilize it to find the matching entry node in thenext block, i.e., the node within the adjacent block to v last that has theminimum difference to its data value. The combination of this pro-cess and the Hamiltonian path generation function linearizeHamPath (Section 5.1) approximates the minimization. Algorithm 3
Refinement of a Multiscale Block procedure REFINE ( { I , I , ··· , I L c } , T , block, v last ) v s ← findBestEntry ( v last ) (cid:46) finds the best entry vertex v s (data-driven) if needRefine (block) then P curr ← [] (cid:46) P curr —SFC of the current multiscale block P currTop ← linearizeHamPath ( I block , T , v s ) (cid:46) finds thetop-level SFC P currTop of the current data block I block (data-driven) for i in range(1, length( P currTop )) do ctopBlock ← T( P currTop [ i ] ) (cid:46) ctopBlock: top-levelsub-block within the current block if needRefine (ctopBlock) then if i ≥ then v last ← P currTop [ i − ] P finer ← refine ( { I , I , ··· , I L c } , T , ctopBlock, v last ) (cid:46) recursively computes the SFC of ctopBlock P curr ← [ P curr , P finer ] (cid:46) appends P curr with P finer else P curr ← [ P curr , P currTop [ i ]] (cid:46) appends P curr with P currTop [ i ] else P curr ← linearizeHamPath ( I block , block, v s ) (cid:46) data-driven return P curr Fig. 12 (Quadtree) shows the linearization with our data-driven tech-nique for quadtree on a synthetic image (Fig. 12 (Quadtree, first col-umn)). The resulting multiscale linearizations and their reconstructedlinearizations are shown in the second column of Fig. 12 (row 1). Here,“reconstructed” refers to generating the linearization back to the regulargrid using the visit order of the coordinates of multiscale nodes and uad t r ee Spatial Configuration O c t r ee LinearizationInput Data
Fig. 12. Data-driven space-filling curves for quadtree and octree. The input data are shown in the first column, the linearizations in the secondcolumn, and the spatial configurations of the space-filling curves in the third column. their scale information. It can be seen that our technique preservesthe value signatures of five disks—each as a peak—which is moreprominent in the reconstructed space-filling curve. Fig. 12 (row 1,third column) shows the geometry of the space-filling curve over thequadtree color mapped by the traversal order (from blue to yellow). Anoctree data of five spheres is shown in Fig. 12 (Octree): the linearizationusing our technique Fig. 12 (Octree, second column, top) preservesthe value pattern of five spheres, more evident in the reconstructed ver-sion Fig. 12 (Octree, second column, bottom). The spatial configurationof the space-filling curve is shown in Fig. 12 (Octree, third column),with the color map showing the scan order.
VALUATION
Data value Radial Euclidean distance D (a) (b) D (c) (d) Fig. 13. Autocorrelations of data value (first column) and radial distance(second column) for our 2D techniques (first row) and our 3D techniques(second row). Note that larger autocorrelation means better coherency.
Our method is evaluated by numerical comparison of autocorrela-tions of our techniques ( α = . u ( i ) that measure the data coherency of space-filling curves; 2) autocorrelation of radialEuclidean distances of elements in the linearization t ( i ) that measuresthe spatial coherency, i.e., locality, of the curves. The definitions of thetwo measures are shown as follows: u ( i ) = s ( P ( i )) , t ( i ) = || [ P ( i ) . x , P ( i ) . y , P ( i ) . z ] − [ , , ] || . Note that the distance measure t ( i ) is applicable only to regular gridsand P ( i ) . z = ; andan image of 5 randomly placed disks)—are used in the evaluation of2D methods, and 5 volumetric datasets (fuel, neghip, nucleon, heartischemia, and a procedural volume generated with a tangle function;all downsampled to 32 ) are used for 3D. Autocorrelations are shownin Fig. 13, where the horizontal axis is the lag (shift) of the signal, andthe vertical axis is the value of normalized autocorrelation.Averaged autocorrelations of data values in 2D (Fig. 13 (a)) indicatethat our regular grid method (green) has almost the same feature co-herency as the context-based curve (purple) as they are overlapping, andboth perform much better than the Peno-Hilbert curve (blue) and thescanline (red); our data-driven method for quadtree (black) performsbetter than the Peano-Hilbert curve and scanline. In terms of averagedautocorrelations of radial distance (Fig. 13 (b)), the Peano-Hilbert curve(blue) performs best and is followed by our data-driven method (green),and then the context-based method (purple); the scanline method (red)has much worse performances than other techniques.For 3D data, the evaluation compares our regular grid-based data-driven technique, our multiscale technique for octree, the Peano-Hilbertcurve, and the scanline. As shown in Fig. 13 (c), our regular-gridmethod (green) tops other techniques for averaged autocorrelation ofdata value, and our octree technique (black) follows, and then, thePeano-Hilbert curve (blue), and the scanline (red). For autocorrelationsof radial distance (Fig. 13 (d)), our regular grid method is better thanthe scanline but out-performed by the Peano-Hilbert curve.The evaluation confirms that our data-driven technique balances thedata value coherency and locality coherency, and is more flexible thanexisting techniques. The comparisons also suggest that our regulargrid techniques have better data value coherency performance thanour multiscale techniques—the former is preferred when high-quality h ttp://schorsch.efi.fh-nuernberg.de/data/volume/ peed (a) VelocityXVelocityY
Highest pressureLow density
Density (b)Negative velocity
High speed
VelocityZ Pressure
Low density
Fig. 14. Visualization of an SPH simulation of dam break: (a) rendering of particles with brushed regions, (b) multivariate line charts generated usingour octree-based data-driven space-filling curve. linearization is needed for volumetric data and computational time isnot a limiting factor.Our multiscale techniques are most suitable for multiscale data bynature, e.g., multiscale simulation of particles. An octree can be builtto have a few (even just one) particles in the finest level node so thataccurate data values of almost all particles could be preserved. However,for regular grids, this is more difficult if not impossible. Either moreparticles are averaged out or a very fine grid has to be built. In addition,multiscale techniques are faster than regular grid techniques as fewernodes have to be visited in multiscale structures compared to regulargrids for the same input data.In contrast, our regular grid techniques yield more coherent lin-earization results than our multiscale curves. This is due to that themultiscale curve uses the top-down approach to ensure a Hamiltonianpath exists, but it breaks coherent features in space on certain occasionsas demonstrated in Fig. 12 (Octree). The aforementioned numericalcomparison of coherency confirms that the regular grid techniques havehigher coherency than multiscale techniques.Therefore, we recommend using the multiscale technique for in-trinsically multiscale data, especially, point datasets, and for reducingcomputation time. The regular grids techniques are recommended forhigher linearization quality for images and volumetric data on uniformgrids. Furthermore, preprocessing the input data with a segmenta-tion could improve the coherency, and, potentially, the efficiency ofour method (for regular grids, fewer comparisons are needed whenneighbors are homogeneous).
ISUALIZATION , U
SER I NTERACTION , AND I MPLEMENTA - TION
Our method facilitates visualizations that use the horizontal axis for anyspatial configuration along the space-filling curve, freeing the verticalspace to visualize values aligned within the spatial configuration. Formultivariate data, each variable can be visualized as a line plot (Fig. 14),whereas for ensemble data, functional boxplots are used (Fig. 1,15,16).Here, we employ the surface boxplot [9] for 2D ensemble datasets; anextension of the method [9] to 3D is applied to 3D ensemble datasets.The conventional color scheme used for boxplots is adopted.We have built an interactive visualization tool to support the explo-ration of space-filling curve-based visualization of datasets. The toolcomprises three linked views: a line plot view that shows the linearizeddata; a 3D renderer that renders volumetric data with direct volumerendering, and particle datasets with polygon-based rendering; a 2Drenderer that shows the data slice. The line plot view is linked with 2Dand 3D renderers using the scan order of the space-filling curve thatrecords the pixel ID of the line plot and its associated pixels/voxelsin the original data. Brushing-and-linking allows us to brush regionsin the line plot view, highlighting them in the 2D and 3D renderings.Interactive zooming and panning are supported in the line plot view sothat both the overall structure and details of the line visualizations canbe examined.Our space-filling curve techniques were implemented using Matlab. The visualization tool is built using C++, Qt, and OpenGL, and isaccelerated by the GPU. The QCustomPlot library [7] was used to aidthe implementation of the line plot view. Our method was tested ona 2019 13-inch Macbook Pro with 2.3 GHz Intel i5 CPU, 8 GB mainmemory, and an Intel Iris 655 integrated GPU. The data-driven space-filling curve only needs to be computed once for a given dataset, andthe computation time depends on the number of vertices in the graphrepresentation of the dataset. Timings of generating data-driven space-filling curves for examples of the paper are summarized in Table 1. Fullinteractivity was achieved for the exploration of all examples.
Table 1. Computation time of data driven space-filling curves on exampledatasets.
Dataset Size TimeNucleon slice 64 ×
64 pixels 12sNucleon 32 × ×
32 voxels 24sBrain atlas 176 ×
208 pixels 3m39sSPH 4000 particles/11796 octree nodes 43sMyocardial ischemia 128 × ×
128 voxels 4h31m
XAMPLES
We demonstrate the usefulness of our method with examples of mul-tiscale multivariate particle data, ensemble of medical images on 2Dregular grids, and volumetric ensemble datasets on regular grids. Vi-sualizations of ensemble datasets are based on linearizations of themedian members. The smooth particle hydrodynamics (SPH) datasetshown in Fig. 14 is a timestep in a dam break simulation [22]; thedataset contains particles with six attributes: density, pressure, speed,and velocity in X , Y , and Z directions, respectively. The data is decom-posed into an octree and linearized using our octree-based data-drivencurves. Data values of all attributes are linearized with the spatial layoutof the space-filling curve of the pressure attribute. We highlight regionsthat are distinct from their neighborhood in the linearizations: lowvalues of density, high values of pressure, and high values of speed andvelocity. Here, the most prominent feature is the highest pressure region(brushed in purple) with values over 50,000 as shown in the zoom-in.The particles within the brushed regions can be seen in the 3D rendering(Fig. 14 (a)). Our method yields a new visual debugging method thatshows clear, non-occluded quantities of each attribute embedded in aspatial context. It could complement non-spatial multivariate plots [22]for a more comprehensive visual debugging system.Fig. 15 shows the visualization of a series of open-access MRIslices [12]. The boxplot linearized with our method (Fig. 15 (b, center))exhibits a more coherent feature that is more concentrated than in thePeano-Hilbert curve linearization (Fig. 15 (b, bottom)). As shown inthe zoom-in of Fig. 15 (b, top), the outlier (the red curve) has widerlow-value regions (inside the gray boxes) than the band as shown inthe zoom-in. With brushing-and-linking, it is confirmed that the outlierimage (Fig. 15 (a, top)) has larger lateral ventricle area than the median eano-Hilbert curve boxplot (a) (b) OutlierMedian
Data-driven space-filling curve boxplot
Fig. 15. Visualization of a brain atlas with our data-driven space-filling curve and a Peano-Hilbert curve (with images padded to size of 256 × (a) (b) (c)Data-driven space-filling curve boxplotPeano-Hilbert curve boxplot Fig. 16. Ensemble visualizations of a heart ischemia simulation. The median member is volume-rendered in (a)—with 1D functional boxplotslinearized with (b) our data-driven space-filling curve (top) and a Peano-Hilbert curve (bottom). The ischemic region that has potential value greaterthan 3 eV is selected in the boxplots and highlighted in (c). image (Fig. 15 (a, bottom)) with the brush on the boxplot linearizedwith our data-driven space-filling curve.The myocardial ischemia dataset was generated by image-based,experimentally derived cardiac electrical potential simulation [3]. Weuse a subset of ensemble runs of the simulation and sample the dataon regular grids for our experiment. Here, we are interested in theacute ischemic regions associated with mean potentials greater or equalto 3 eV. As shown in Fig. 16 (b, top), the linearized 3D boxplot us-ing data-driven technique yields more concentrated global featuresthan the linearization with the Peano-Hilbert curve (Fig. 16 (b, bot-tom)). The region of interest (high potential regions) is bounded in asmall neighborhood with our method that could be selected with onebrush (Fig. 16 (b, top)), whereas the Peano-Hilbert curve yields a morescattered result—a large number of brushes are required (Fig. 16 (b,bottom)). The volume rendering (Fig. 16(a)) of the median ensemblemember shows that the region of interest is spatially continuous (whitein the rendering); the highlighted regions in space (Fig. 16 (c)) verifythat our method gives good coherency of the feature.
ONCLUSION AND F UTURE W ORK
We have introduced data-driven space-filling curves for 2D and 3Dvisualization. We have designed our methods to preserve coherency ofboth data value and locality after the mapping from the spatial domainto 1D. The methods are applicable for data on regular grids and in mul-tiscale. We have modeled the problem as finding a Hamiltonian paththat approximates the minimum of an objective function that blendsa data value term and a locality term. Two variants of techniques are available for regular grids and multiscale data (quadtrees and octrees).The effectiveness of our method has been evaluated by comparing toexisting methods on various datasets with qualitative visual comparisonand quantitative comparisons of autocorrelations. We have confirmedthat existing methods cannot preserve both data features and localityafter linearization. Through multivariate and ensemble visualization ex-amples with a wide range of real-world datasets, we have demonstratedthe usefulness of our data-driven space-filling curves.In the future, we would like to extend our method for time-varyingdata to understand the coherency in time. The positional term for regulargrids requires uniform blocks of a user-defined size, which should beimproved to be data-driven. The method could be used for multi-fieldvisualization such that different field data, e.g., scalar, vector, and tensor,could be visualized in a linear layout for non-occluded comparisons andinvestigation of correlations. Finally, we would also like to accelerateour method with parallel computing to support larger datasets. A CKNOWLEDGMENTS
This work is supported in part by the Intel Graphics and VisualizationInstitutes of XeLLENCE, the National Institute of General MedicalSciences of the National Institutes of Health under grant numbers P41GM103545 and R24 GM136986 and the Department of Energy undergrant number DE-FE0031880, and by the Deutsche Forschungsgemein-schaft (DFG, German Research Foundation) Project-ID 251654672 –TRR 161 (project B01). The GPU used for this research was donatedby the Nvidia Corporation. The SPH dataset is provided by StefanReinhardt.
EFERENCES [1] B. Bollobas.
Graph Theory: An Introductory Course . Springer-Verlag,New York, 1979. doi: 10.1007/978-1-4612-9967-7[2] S. Briais, S. Caron, J.-M. Cioranesco, J.-L. Danger, S. Guilley, J.-H. Jour-dan, A. Milchior, D. Naccache, and T. Porteboeuf. 3D hardware canaries.In E. Prouff and P. Schaumont, eds.,
Cryptographic Hardware and Em-bedded Systems – CHES 2012 , pp. 1–22. Springer, Berlin, Heidelberg,2012.[3] B. Burton, K. Aras, W. Good, J. Tate, B. Zenger, and R. MacLeod. Aframework for image-based modeling of acute myocardial ischemia us-ing intramurally recorded extracellular potentials.
Annals of BiomedicalEngineering , 2018. doi: 10.1007/s10439-018-2048-0[4] P. M. Campbell, K. D. Devine, J. E. Flaherty, L. G. Gervasio, and J. D.Teresco. Dynamic octree load balancing using space-filling curves. Techni-cal Report CS-03-01, Williams College Department of Computer Science,2003.[5] R. Dafner, D. Cohen-Or, and Y. Matias. Context-based space filling curves.
Computer Graphics Forum , 19(3):209–218, 2000. doi: 10.1111/1467-8659.00413[6] I. Demir, C. Dick, and R. Westermann. Multi-charts for comparative 3Densemble visualization.
IEEE Transactions on Visualization and ComputerGraphics
IEEETransactions on Software Engineering , 14(10):13811393, 1988. doi: 10.1109/32.6184[9] M. Genton, C. Johnson, K. Potter, G. Stenchikov, and Y. Sun. Surfaceboxplots.
Statistical Journal , 3(1):1–11, 2014.[10] D. Hilbert. Ueber die stetige Abbildung einer Line auf ein Fl¨achenst¨uck.
Mathematische Annalen , 38(3):459–460, 1891. doi: 10.1007/BF01199431[11] A. Itai, C. H. Papadimitriou, and J. L. Szwarcfiter. Hamilton paths ingrid graphs.
SIAM Journal on Computing , 11(4):676–686, 1982. doi: 10.1137/0211056[12] D. S. Marcus, T. H. Wang, J. Parker, J. G. Csernansky, J. C. Morris, andR. L. Buckner. Open access series of imaging studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and dementedolder adults.
Journal of Cognitive Neuroscience , 19(9):1498–1507, 2007.doi: 10.1162/jocn.2007.19.9.1498[13] Y. Matias and A. Shamir. A video scrambling technique based on spacefilling curves (extended abstract). In
Advances in Cryptology — CRYPTO’87 , pp. 398–417, 1988. doi: 10.1007/3-540-48184-2 35[14] M. Mirzargar, R. T. Whitaker, and R. M. Kirby. Curve boxplot: Gen-eralization of boxplot for ensembles of curves.
IEEE Transactions onVisualization and Computer Graphics , 20(12):2654–2663, 2014. doi: 10.1109/TVCG.2014.2346455[15] W. F. Mitchell. Hamiltonian paths through two- and three-dimensionalgrids.
Journal of Research of the National Institute of Standards andTechnology , 110(2):127–136, 2005. doi: 10.6028/jres.110.012[16] R. Netzel and D. Weiskopf. Hilbert attention maps for visualizing spa-tiotemporal gaze data. In
Second Workshop on Eye Tracking and Visual-ization (ETVIS) , pp. 21–25, 2016. doi: 10.1109/ETVIS.2016.7851160[17] H. Obermaier, K. Bensema, and K. I. Joy. Visual trends analysis in time-varying ensembles.
IEEE Transactions on Visualization and ComputerGraphics , 22(10):2331–2342, 2016. doi: 10.1109/TVCG.2015.2507592[18] H. Obermaier and K. I. Joy. Future challenges for ensemble visualization.
IEEE Computer Graphics and Applications , 34(3):8–11, 2014. doi: 10.1109/MCG.2014.52[19] G. Peano. Sur une courbe, qui remplit toute une aire plane.
MathematischeAnnalen , 36(1):157–160, 1890. doi: 10.1007/BF01199438[20] K. Potter, A. Wilson, P.-T. Bremer, D. Williams, C. Doutriaux, V. Pascucci,and C. Johnson. Ensemble-Vis: A framework for the statistical visual-ization of ensemble data. In
Proceedings of the 2009 IEEE InternationalConference on Data Mining Workshops , pp. 233–240, 2009.[21] M. Raj, M. Mirzargar, J. S. Preston, R. M. Kirby, and R. T. Whitaker.Evaluating shape alignment via ensemble visualization.
IEEE ComputerGraphics and Applications , 36(3):60–71, 2016. doi: 10.1109/MCG.2015.70[22] S. Reinhardt, M. Huber, O. Dumitrescu, M. Krone, B. Eberhardt, andD. Weiskopf. Visual Debugging of SPH Simulations. In , pp. 117–126, 2017. doi: 10.1109/iV.2017.20 [23] P. Rosen, B. Burton, K. Potter, and C. Johnson. muView: A visual analysissystem for exploring uncertainty in myocardial ischemia simulations. In
Visualization in Medicine and Life Sciences III , pp. 49–69. Springer Nature,2016. doi: 10.1007/978-3-319-24523-2 3[24] H. Sagan.
Space-Filling Curves . Springer-Verlag, New York, 1994. doi:10.1007/978-1-4612-0871-6[25] R. Sedgewick.
Algorithms . Addison-Wesley Longman Publishing Co.,Inc., Boston, USA, 1984.[26] E. Skubalska-Rafajowicz. Applications of the space-filling curves withdata driven measure-preserving property.
Nonlinear Analysis: Theory,Methods & Applications , 30(3):1305–1310, 1997. doi: 10.1016/S0362-546X(97)00277-0[27] Y. Sun and M. G. Genton. Functional boxplots.
Journal of Computationaland Graphical Statistics , 20(2):316–334, 2011. doi: 10.1198/jcgs.2011.09224[28] J. Weissenb¨ock, B. Fr¨ohler, E. Gr¨oller, J. Kastner, and C. Heinzl. Dynamicvolume lines: Visual comparison of 3D volumes through space-fillingcurves.
IEEE Transactions on Visualization and Computer Graphics ,25(1):1040–1049, 2019. doi: 10.1109/TVCG.2018.2864510[29] R. T. Whitaker, M. Mirzargar, and R. M. Kirby. Contour boxplots: Amethod for characterizing uncertainty in feature sets from simulationensembles.