Evaluating Ordering Strategies of Star Glyph Axes
Matthias Miller, Xuan Zhang, Johannes Fuchs, Michael Blumenschein
Matthias Miller * University of Konstanz
Xuan Zhang † RWTH Aachen University
Johannes Fuchs ∗ University of Konstanz
Michael Blumenschein ∗ University of Konstanz
ABSTRACT
Star glyphs are a well-researched visualization technique to represent multi-dimensional data. They are often used in small multiple settings for a visual comparison of many data points. However, their overall visual appearance is strongly influenced by the ordering of dimensions. To this end, two orthogonal categories of layout strategies are proposed in the literature: order dimensions by similarity to get homogeneously shaped glyphs vs. order by dissimilarity to emphasize spikes and salient shapes. While there is evidence that salient shapes support clustering tasks, the evaluation and direct comparison of data-driven ordering strategies have not received much research attention. We contribute an empirical user study to evaluate the efficiency, effectiveness, and user confidence in visual clustering tasks using star glyphs. In comparison to similarity-based ordering, our results indicate that dissimilarity-based star glyph layouts support users better in clustering tasks, especially when clutter is present.
Keywords:
Star glyph, axes ordering, quantitative evaluation
INTRODUCTION
Data glyphs are compact visual representations of multi-dimensional data points. Due to their small graphical appearance, they can be used in various settings like within node-link diagrams [12], treemaps [14], tables [23], or geographic maps [16]. For instance, star glyphs are employed in the medical domain [30] and can be used to show the spatial distribution of food production [28].
Due to their use of visual variables, star glyphs [33] are an adequate choice to encode single data points comprising numerical data. The glyph's axes represent the data dimensions, and their lengths encode numeric values. Since glyphs are versatile, different design variations of star glyphs emerged in literature. Many have already been extensively analyzed by the community (e.g., [16], see [17] for a full enumeration). However, there is not much empirical research about the effect of the axes ordering strategy on visual comparison tasks. The ordering influences the shape of a star glyph and affects its readability and similarity judgment [24, 25]. Therefore, we need (task-based) guidelines to arrange the dimensions in star glyphs [35].
Numerous ordering strategies for star glyphs have been proposed [3, 4, 15, 22, 24, 25, 29, 35, 37] which can be grouped into similarity-based (short: SIM), favoring homogeneous shapes, and dissimilarity-based orderings (short: DIS), emphasizing spikes and salient shapes. Some approaches also discuss symmetry, monotonicity, convexity and concavity, feature saliency, and user-driven relationships among neighboring dimensions. The ordering strategies typically analyze the relationship among all pair-wise dimensions and then adjust the axes of every star glyph simultaneously according to a metric (e.g., SIM or DIS). However, this also means that not all glyphs will result in the desired shape. In particular, outliers may be encoded by shapes which the reordering algorithm is trying to avoid.
We address the research question: "Which ordering strategy is most useful for similarity search and data grouping tasks (clustering) using star glyphs?" According to the task taxonomy by Andrienko and Andrienko [2], similarity search and grouping are among the most common analysis tasks for glyphs [16]. While different strategies have been proposed, they are not yet evaluated by empirical studies. Klippel et al. [24, 25] evaluated the influence of a star glyph's shape in grouping tasks. Although they found that salient shapes, e.g., having spikes, can support grouping tasks, they did not apply a dimension ordering strategy that considers these salient properties. Sorting the dimensions by dissimilarity favors the spiky design, which Klippel et al. state to be promising for grouping. We compare this ordering strategy with the similarity-based design which is often proposed in the literature [3, 6, 15, 35]. We conducted an empirical user study with 15 participants to evaluate the efficiency, data clustering quality, noise identification quality, and user confidence between the two different strategies (first independent variable). Our results show that star glyphs, ordered by a dissimilarity-based layout, support users better in a clustering task.
Real-world data often contains non-relevant dimensions with clutter and noise that may distort interesting patterns [18]. Additionally, clusters often do not span across all dimensions but exist only in subspaces [26]. Therefore, we investigate the impact of clutter on cluster identification and reordering strategies as a second independent variable. We use the term clutter dimensions to describe attributes that do not discriminate clusters but hinder the comprehension of feature relationships in the data [29]. Therefore, we also investigate the influence of clutter separately, and in combination with the ordering strategies. For replicability and reproducibility, the material of the study (benchmark data, study results, analysis scripts, and code) is publicly available at https://osf.io/bje89.
RELATED WORK
Finding an optimal star glyph axes ordering has been proven to be NP-complete [3], and more research is required [31, 35]. It is related to the ordering of axes in parallel coordinates [3, 11, 21, 34, 38], RadViz [1, 9, 10, 20], ArcViz [27], and other axes-based radial visualizations as summarized by Behrisch et al. [5]. Ordering algorithms typically define an objective function, modeling a good dimension order (according to their interpretation), and apply a heuristic to find an axes order which maximizes the objective function [35].
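To make the "objective function plus heuristic" pattern concrete, the following is a minimal sketch and not an algorithm taken from any of the cited works. It assumes a row-major dataset with values in a common range, uses the summed Euclidean distance between neighboring axes (wrapping around the glyph) as a hypothetical objective, and applies a simple greedy nearest-neighbor heuristic. The function names `dimension_distance`, `circular_cost`, and `greedy_order` are illustrative.

```python
import math


def dimension_distance(data, i, j):
    """Euclidean distance between columns i and j of a row-major dataset."""
    return math.sqrt(sum((row[i] - row[j]) ** 2 for row in data))


def circular_cost(data, order):
    """Assumed objective: summed distance between neighboring axes.

    The last axis is treated as a neighbor of the first one, because the
    axes of a star glyph are arranged in a circle.
    """
    n = len(order)
    return sum(
        dimension_distance(data, order[k], order[(k + 1) % n]) for k in range(n)
    )


def greedy_order(data, similar=True):
    """Greedy heuristic: start at dimension 0 and repeatedly append the
    closest (similar=True) or farthest (similar=False) unused dimension.

    This is a heuristic; it does not guarantee an optimal circular cost.
    """
    n_dims = len(data[0])
    order = [0]
    remaining = set(range(1, n_dims))
    pick = min if similar else max
    while remaining:
        last = order[-1]
        # sorted() makes tie-breaking deterministic
        nxt = pick(sorted(remaining), key=lambda d: dimension_distance(data, last, d))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

Calling `greedy_order(data, similar=True)` yields an order that tends to place similar dimensions next to each other, while `similar=False` tends to alternate dissimilar dimensions; `circular_cost` can then be used to compare the two results.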
Different visual characteristics can be subject to shape optimization when applying specific ordering strategies of the star glyph axes. Ward [35] summarizes four major strategies which have been extended by others: user- and data-driven, correlation- and similarity-driven, spikes and salient shapes, and symmetry-driven.
User-driven dimension orderings enable experts to adjust the shape of a star glyph based on their domain knowledge [32]. Users can select a data point to sort the data dimensions in ascending or descending order (data-driven) to reveal patterns between records [35].
Correlation- and similarity-driven strategies improve star glyphs by adjacent placement of similar axes to support understanding of clusters, outliers, and relationships [7]. Ankerst et al. propose heuristic algorithms based on similarity for star glyphs to improve the overall perception [3]. Similarly, Artero et al. use similarity heuristics of attributes to apply dimension ordering and take perceptual aspects such as Gestalt Laws into account by applying dimension reduction [4]. Yang et al. combine similarity-based ordering with a hierarchical structure of the dimensions to enable interactive exploration of high-dimensional subspaces [37]. Friendly and Kwan argue that using correlation-based ordering in star glyphs supports the identification of shape irregularities [15]. The authors did not conduct a survey to underpin their statement.
Spikes and salient shapes such as "has-one-spike" are helpful in visual grouping tasks of data points according to Klippel et al. [24]. They argue that concavity is more suitable for comparability than convexity, which is especially true for the "star" glyph, due to the large variations between adjacent dimensions. The salience of dissimilar neighboring axes shall enhance the comparison speed and help to detect changes. Klippel et al. showed that the star glyph shape with eight dimensions influences classification tasks [25]. In particular, in contrast to earlier work stating that similarity-driven orderings improve high-dimensional visualizations, dissimilarity between neighboring axes contributes salient properties that are perceptually more noticeable.
Symmetry-driven reordering methods help to reduce the visual complexity of star glyphs and, therefore, support comparison tasks by improving memorability [22]. By providing some examples, Peng et al. argue that orderings with simple and symmetric as well as monotonic shapes of the star glyphs facilitate the identification of value differences between multiple dimensions [29]. They emphasize that symmetry and similarity are primary factors to identify patterns. For this, Gestalt Laws are a solid foundation for perception design [36]. Peng et al. state that star glyphs can be optimized by aligning the symmetry on the vertical or horizontal axis [29]. An additional rotation optimization step can be included in the pipeline to find the best global rotation for all star glyphs of a dataset. Rotation can be applied, e.g., on top of similarity or spike-based ordering.
While many ordering strategies, algorithms, and heuristics have been proposed for star glyph dimension ordering, empirical evaluation is missing. Previous approaches mainly argue by showing examples or providing arguments w.r.t., e.g., Gestalt laws. While this is useful to find differences between strategies, we also need empirical evidence to directly compare strategies respecting scalability, performance, analysis tasks, data characteristics, and user perception [35]. We are only aware of two studies conducted by Klippel et al. [24, 25]. Their results indicate that spikes and salient shapes have a positive effect on visual grouping tasks and that colored axes positively affect the processing speed and reduce the negative influence of shape saliency on rotated data glyphs. However, Klippel et al. did not directly compare different ordering strategies or evaluate them against a benchmark. Instead, they designed different star glyph shapes and analyzed how participants grouped them by their understanding of similarity during an exploratory analysis task. In our study, we aim to close this research gap by comparing two proposed reordering strategies using a controlled, empirical user experiment.
Figure 1: Comparison of similarity (SIM) and dissimilarity (DIS) based ordering using the same data records.
EMPIRICAL USER STUDY
We evaluate whether a similarity- (homogeneous shape, short: SIM) or dissimilarity-based layout (spike and salient shape, short: DIS) is more efficient and effective for a visual clustering task. We designed our study based on Klippel et al.'s work [24, 25]. We adopted the task, user interface, glyph design (including colored axes), and datasets' dimensionality (eight dimensions). However, in contrast, we applied two different reordering algorithms (SIM and DIS) and different clutter levels as independent factors, and evaluated the results using a benchmark dataset.
The participants had to manually assign star glyphs into reasonable clusters and identify noise, i.e., items not belonging to any cluster. In the study, we used the term group instead of cluster. To assess the performance, we use four dependent variables as quality measures: (i) task completion time, (ii) quality of groups, (iii) quality of identified noise, and (iv) the confidence of the participants.
Participants.
We recruited 15 participants from the local student population (seven female, two bachelor, twelve master, one PhD student). The age ranged from 20 to 27 years with a median of 23. The participants had different backgrounds in data analysis and visualization: ten had general knowledge in data analysis, four had data visualization experience, and one had used star glyphs before. All participants received a compensation of 10 EUR.
Glyph Design and Implementation.
The glyphs are designed analogously to Klippel et al.'s work [24, 25] using a contour, gray background, and colored axes. We used ColorBrewer [19] to select diverging colors and applied the ordering algorithm by Ankerst et al. [3]. The Euclidean distance is used to measure the (dis)similarity between dimensions. We ran an exhaustive search to find the permutation with the highest (SIM) and lowest (DIS) similarity. An example of star glyphs with the two orderings is depicted in Fig. 1. Orientation (rotation) of the star glyphs is not considered and chosen randomly. All orderings are precomputed so as not to influence the run time during the study. We provide the study and the ordering strategy implementation on our websites.
Hypotheses.
We address the following two hypotheses:
H1.
Clutter negatively influences visual comparison. With increasing clutter, the performance of grouping tasks drops, independent of the axes ordering. In particular, we expect users to be (a) slower, (b) less accurate in grouping accuracy, (c) less accurate in noise identification, and (d) less confident of their grouping.
H2.
Klippel et al. [24, 25] argue that spikes and salient shapes support users in similarity estimation and grouping tasks. Therefore, the performance of users should increase with a dissimilarity-based ordering. Furthermore, we hypothesize that the salient shapes should support users even more if the data contains clutter since sharp edges are more perceptually apparent. In particular, we expect users to be (a) faster, (b) more accurate in grouping accuracy, (c) more accurate in noise identification, and (d) more confident of their grouping when dimensions are ordered by dissimilarity.
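The exhaustive ordering search described under Glyph Design and Implementation can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' published implementation: it assumes that similarity of an ordering is scored by the summed Euclidean distance between neighboring dimensions, that the arrangement is circular (the last axis neighbors the first), and that SIM corresponds to the smallest and DIS to the largest total distance. The names `column_distance` and `exhaustive_orders` are hypothetical.

```python
import math
from itertools import permutations


def column_distance(data, i, j):
    """Euclidean distance between dimensions (columns) i and j."""
    return math.sqrt(sum((row[i] - row[j]) ** 2 for row in data))


def exhaustive_orders(data):
    """Return (sim_order, dis_order) found by checking every circular order.

    Assumption: an order is scored by the summed distance between
    neighboring axes, including the wrap-around pair. Dimension 0 is kept
    in the first position because rotating a circular order does not
    change its score.
    """
    n = len(data[0])
    dist = [[column_distance(data, i, j) for j in range(n)] for i in range(n)]
    best_sim = None  # (total distance, order) with the smallest total
    best_dis = None  # (total distance, order) with the largest total
    for tail in permutations(range(1, n)):
        order = (0,) + tail
        total = sum(dist[order[k]][order[(k + 1) % n]] for k in range(n))
        if best_sim is None or total < best_sim[0]:
            best_sim = (total, order)
        if best_dis is None or total > best_dis[0]:
            best_dis = (total, order)
    return best_sim[1], best_dis[1]
```

For eight dimensions this loop visits 7! = 5040 candidate orders, which is small enough to precompute once per dataset before a study session.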
We manually created 18 different datasets using the PCDC tool [8]. Every dataset contains 50 records of which 2–7 data points are selected as noise (randomly distributed across all dimensions). The remaining data points are grouped into 2, 3, or 4 clusters with similar cluster sizes. Besides, we introduced clutter dimensions which do not discriminate any cluster, since we uniformly distributed all data points across the clutter dimensions. Six datasets contain no clutter dimension, six contain one, and six contain two clutter dimensions. For instance, in the condition with two clutter dimensions, a dataset consists of six dimensions discriminating the clusters, while the remaining two introduce clutter. Thus, we generated the datasets to keep the number of dimensions consistent. To verify the manually created clusters, we ran DBSCAN [13] (parameters: minPts = ε = . 5) on all datasets.
Study and ordering implementation websites: http://subspace.dbvis.de/sg-study and */sg-ordering.
Figure 2: Study prototype. Users can group visually similar star glyphs using drag&drop. Noise points remain in the left panel.
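The structure of such a benchmark dataset can be illustrated with a small generator. This is a hypothetical stand-in and not the PCDC tool or the authors' manual procedure: the cluster spread of 0.03, the value range [0, 1], the round-robin cluster assignment, and the function name `make_dataset` are all assumptions chosen only to show how discriminating dimensions, uniformly distributed clutter dimensions, and noise records fit together.

```python
import random


def make_dataset(n_records=50, n_dims=8, n_clutter=2, n_clusters=3,
                 n_noise=5, seed=0):
    """Generate records with clusters, clutter dimensions, and noise.

    Returns (records, labels); label -1 marks a noise record.
    All numeric settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_signal = n_dims - n_clutter  # dimensions that discriminate clusters
    centers = [
        [rng.uniform(0.15, 0.85) for _ in range(n_signal)]
        for _ in range(n_clusters)
    ]
    records, labels = [], []
    for r in range(n_records - n_noise):
        c = r % n_clusters  # round-robin keeps cluster sizes similar
        # Values near the cluster center on the discriminating dimensions,
        # clamped to the [0, 1] range
        signal = [min(1.0, max(0.0, rng.gauss(m, 0.03))) for m in centers[c]]
        # Clutter dimensions are uniform, so they carry no cluster information
        clutter = [rng.uniform(0.0, 1.0) for _ in range(n_clutter)]
        records.append(signal + clutter)
        labels.append(c)
    for _ in range(n_noise):
        # Noise records are uniformly distributed across all dimensions
        records.append([rng.uniform(0.0, 1.0) for _ in range(n_dims)])
        labels.append(-1)
    return records, labels
```

Because the total number of dimensions is fixed, raising `n_clutter` from 0 to 2 reduces the number of discriminating dimensions from eight to six, which mirrors the three clutter conditions.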
Tasks and Procedure.
Each study took an hour on average. Participants filled out a consent form and a demographics questionnaire and reported on previous knowledge in data analysis, information visualization, and star glyphs. Afterward, we described how to read the visual encoding of a star glyph using an artificial car dataset as an example. Specifically, we clarified that star glyphs with similar shapes on different axes are not similar (rotation invariance). Finally, the participants performed three training trials before the study was recorded.
To conduct the study, we used a 27-inch screen with 2560x1440 resolution and a mouse to execute given tasks. Every participant had to perform 18 trials, leading to 15 participants × 18 trials = 270 trials for the entire study. In between two trials, we showed a blank screen with the term 'break' to motivate the participants to have regular breaks. Each trial consisted of manually grouping all 50 star glyphs of one dataset into distinct groups and noise. Then, the participants stated the confidence level about their selection on a 7-point Likert scale. We did not provide the number of clusters per dataset and explicitly told the participants that there might be glyphs which do not belong to any group (noise).
Fig. 2 shows our interface. Participants were able to add new or delete groups. Glyphs can be interactively assigned to groups by drag&drop. If a glyph was considered to be noise, then it remained in the left panel. Participants were able to undo or change a grouping also using drag&drop. In the study, we did not constrain the task completion time. We ended the study with an interview about the participants' strategy and preferences regarding the SIM and DIS ordering by showing examples. Questions and answers were recorded.
Randomization.
Each participant performed 18 trials, i.e., the grouping task on all benchmark datasets was equally distributed between SIM and DIS. We randomized the order of the trials as follows: First, we grouped the datasets into their level of difficulty based on the amount of clutter (zero, one, or two clutter dimensions). Then, participants performed the trials with increasing difficulty, i.e., six trials without clutter, then six with one clutter dimension, and finally six with two clutter dimensions. For every clutter condition, we randomized the dataset order and randomly assigned 3 × SIM and 3 × DIS. We attached the randomization algorithm and our configuration in the supplementary material. A summary of our trials:
3 levels of difficulty (clutter: zero, one, or two dimensions) ×
2 ordering strategies (SIM, DIS) ×
3 trials (2, 3, 4 clusters) ×
15 participants =
270 trials in total
Data Collection, Post-Processing, and Analysis.
In each trial, we recorded the grouping task completion time, the selected groups and noise, and the participants' confidence. Some participants created groups with only one or two glyphs. Thus, in a post-processing step, we converted such small groups into noise to execute a more coherent analysis. We measured the quality of the identified noise by computing the Jaccard index between noise and ground truth noise.
Figure 3: Cluster quality analysis. Difference between zero, one, and two clutter dimensions as well as DIS and SIM ordering.
The grouping quality is also based on the Jaccard index between the grouping and ground truth. However, since participants could have also selected too few or too many groups, we structured our quality computation as a two-step process: First, we computed the average Jaccard index of each group to its best match in the ground truth. Second, we computed the average Jaccard index of every ground truth cluster to its best match in the selection. Using this method, we considered too few, too many groups, as well as too few and too many records per group. The final clustering quality is the average score of both steps.
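The post-processing and the two-step quality computation described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis script: groups are modeled as sets of record identifiers, the threshold `min_size=3` encodes "groups with only one or two glyphs become noise", and the handling of empty inputs (score 1.0 for two empty sets, 0.0 when no groups exist) is an assumption. The function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |a ∩ b| / |a ∪ b| of two collections of record ids."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: two empty sets count as a perfect match
    return len(a & b) / len(union) if union else 1.0


def fold_small_groups(groups, noise, min_size=3):
    """Convert groups with fewer than min_size glyphs into noise."""
    kept = [set(g) for g in groups if len(g) >= min_size]
    new_noise = set(noise)
    for g in groups:
        if len(g) < min_size:
            new_noise |= set(g)
    return kept, new_noise


def best_match_average(sources, targets):
    """Average Jaccard index of each source group to its best-matching target."""
    if not sources:
        return 0.0  # assumption: no groups at all scores zero
    return sum(
        max((jaccard(s, t) for t in targets), default=0.0) for s in sources
    ) / len(sources)


def clustering_quality(selected, truth):
    """Average of both directions: selection to truth and truth to selection."""
    return 0.5 * (
        best_match_average(selected, truth) + best_match_average(truth, selected)
    )
```

Scoring both directions is what penalizes too few as well as too many groups: merging two ground truth clusters into one group lowers both averages, while splitting off an extra group lowers the selection-to-truth average. The noise quality is simply `jaccard(selected_noise, truth_noise)` after `fold_small_groups` has been applied.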
We executed a statistical analysis to summarize the study results. We report all statistically significant findings (p < .05) and some interesting trends visible in the data. We check for normal distribution using a one-sample Kolmogorov-Smirnov test. For a better comparison, we always report the median (x̄) and, additionally, the mean (µ) for normally distributed samples. Analysis scripts and detailed results can be found in the supplementary material.
Statistical tests.
Confidence is measured on a Likert scale. Therefore, a Pearson's Chi-squared test is used for the analysis. Given the non-normal nature of the measures time, cluster quality, and noise identification quality w.r.t. the three clutter levels, we used a non-parametric Friedman's test and a Wilcoxon signed rank test with Bonferroni correction (post-hoc). The same measures also do not follow a normal distribution w.r.t. the strategies SIM and DIS. Hence, we used a Wilcoxon signed rank test with continuity correction. Considering SIM and DIS within individual clutter levels reveals normally distributed samples for the measures time, quality of clustering, and quality of noise identification. Hence, we use a paired t-test for the statistical analysis of these samples.
Task Completion Time.
H1a.
Task completion time increased with clutter levels , but notsignificantly: ( ¯ x = . s ), (179 . s ), and (184 . s ). H2a.
Using the ordering
DIS (176 . s ) users completed the grouping task slightly faster than SIM (178 . s ), but only for datasets with clutter dimensions. : DIS (168 . s ) vs. SIM (158 . s ), : 180 . s / 179 . s , and : 180 . s / 190 . s . Differences are not significant.
Cluster Quality.
An overview of the cluster quality is depicted in Fig. 3.
H1b.
There were significant effects of clutter level on cluster qual-ity ( χ ( , N = ) = . , p < . ( ¯ x = .
85) compared to ( . p < . ( . p < . and ( p < . H2b.
When comparing ordering strategies , participants were moreaccurate with
DIS ( ¯ x = .
69) compared to
SIM ( ¯ x = . p < . and , but not . : DIS ( ¯ x = . µ = .
81) vs.
SIM ( ¯ x = . µ = . : DIS ( ¯ x = . µ = .
66) vs.
SIM ( ¯ x = . µ = . : DIS ( ¯ x = . µ = .
57) vs.
SIM ( ¯ x = . µ = . p < .
Noise Identification Quality.
H1c. There was a significant effect of clutter level on noise identification ( χ ( , N = ) = . , p < . ( ¯ x = .
8) compared to ( ¯ x = . p < . ( ¯ x = . p < . and ( p < . H2c.
In general, there is no difference between
SIM and
DIS w.r.t.noise identification quality (both ¯ x = .
5) There are no differencesfor (both µ = . DIS ¯ x = . SIM ¯ x = .
8) and (both ¯ x = . µ = . clutter condition, there was also a significanteffect of ordering strategy on noise identification ( t ( ) = . p = . DIS were more accurate ( ¯ x = . µ = .
39) in comparison to
SIM ( ¯ x = . µ = . p < .
Confidence.
H1d.
There was a significant effect of clutter level on confidence ( χ ( , N = ) = . , p < . ( ¯ x =
2) compared to (1, p < . H2d.
There is no significant effect between
SIM (1) and
DIS (1).While there is also no effect within the different clutter levels ( :2 / : 1 /
1, and : 1 / SIM without clutter dimensions and more confident with
DIS with increasing clutter.
Ordering Preferences.
11 out of 15 participants reported that they could see the clusters more clearly with dissimilarity reordering because they could use the orientation of the spikes as a determining factor. Some participants reported that they generally found the grouping tasks challenging, and they were not quite sure about the results. Interestingly, most of them said they had personal preferences towards the patterns with smoother and more convex shapes, namely the patterns produced by similarity reordering.
Similarity Estimation Strategies.
The strategies reported by the participants can be grouped into three categories: (1) the majority of participants focused primarily on the spikes' orientation; (2) participants reported that they tried to find the center of a star glyph and observed at which position of the glyph the center lies and how the gray area around the center is shaped; (3) a few participants searched for unique shape parts and matched them with others.
DISCUSSION & FUTURE WORK
In summary, our study revealed two major findings.
Clutter Analysis.
Clutter negatively influences the visual comparison of star glyphs. There is a significant drop in cluster quality, noise identification quality, and confidence with an increasing number of clutter dimensions. Also, task completion time changed considerably, although the change is not statistically significant. Therefore, we can partially confirm our hypotheses H1a – H1d.
We expected these results as more clutter hampers similarity estimation in clustering tasks. As a result, cluster performance drops. While this is a general problem in information visualization [18], it particularly affects star glyphs as clutter may change their shape considerably. Glyph designers should, therefore, think of using automatic algorithms to remove clutter dimensions, if possible.
Ordering Analysis.
There are differences between the two evaluated ordering strategies. Generally, the quality of the clustering was significantly more accurate with DIS, in particular for datasets containing clutter (one or two clutter dimensions). Participants also performed the task slightly faster using DIS. However, they were on average 10 seconds faster with SIM in non-cluttered datasets. We can see that DIS significantly supports noise identification for a cluttered dataset, but we cannot see a difference for the other clutter conditions. While many participants reported that they prefer a dissimilarity-based layout, we cannot see a significant result from the study. However, analyzing the Likert-scale distributions reveals a tendency that participants are more confident with SIM for clutter-free datasets and with DIS for cluttered datasets. Across all trials, we can confirm the hypotheses H2b and H2c, but completion time (H2a) and confidence (H2d) depend on the properties of the dataset.
These results are in line with Klippel et al. [24, 25]. We found it interesting that the difference between SIM and DIS is even more striking in cluttered datasets. The spikes seem to help users in identifying clutter dimensions and improving the overall clusters. However, we could also see that, without clutter, participants were faster and more confident using a similarity-based ordering. The remaining question is whether it would be possible to combine SIM and DIS into a single ordering strategy. Our study did not reveal whether participants need as many spikes as possible or whether a few important spikes are enough to improve the cluster quality. Further research needs to be done in this area. Another relevant question is how the rotation of entire glyphs influences grouping quality in clustering tasks; further investigation in this direction is advisable, as, for example, already started by Fuchs et al. [16].
Design Considerations.
With the results gained from our study, we derive the following design considerations:
(1) As the performance of users drops considerably when clutter dimensions are present, glyph designers should try to avoid clutter by applying a feature selection method first, if possible.
(2) Since, for datasets with clutter, salient shapes and spikes support grouping tasks, we recommend using DIS strategies.
(3) For datasets without clutter, we did not find a clear difference between SIM and DIS. As SIM seems to be slightly faster and less error-prone to rotation [24, 25], we recommend using this strategy.
Limitations.
We identified two main threats to our results' validity.
(1) The number of trials (270) is rather small, in particular, for the effectiveness and efficiency analysis of a specific clutter level. This affects not only the statistical analysis, but outliers may also distort the results. The number of trials per participant cannot be increased with the current study design; otherwise, the study would take much longer than one hour. Therefore, we suggest repeating the study with more participants to increase the number of trials.
(2) While we designed our datasets with different cluster structures and distributions, we limited them to eight dimensions, as did Klippel et al. [25]. There might be differences for datasets with fewer, more, or an odd number of dimensions.
CONCLUSION AND FUTURE WORK
We conducted an empirical user study to evaluate the impact of clutter and axes ordering on clustering performance with star glyphs. Our results show that users perform better when the glyphs represent salient shapes and spikes, which is achieved by a dissimilarity-based ordering of the dimensions. Furthermore, we found that there is a significant impact of clutter on the clustering performance in general.
As future work, we plan to extend and re-run the study based on our discussed limitations and include other reordering strategies as well. Beyond that, we want to investigate whether there is an influence of the data characteristics and rotation (e.g., favor symmetrical glyph shapes) on the ordering strategy. If so, we are interested in developing techniques to select the most useful ordering strategy based on the given data and task. Finally, automatic ordering strategies should be compared to user-driven axes arrangements, which are determined by experts based on their domain knowledge.
ACKNOWLEDGMENTS
We thank David Pomerenke and the anonymous reviewers for their valuable feedback and support. Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Projektnummer 251654672 – TRR 161 (Project A03).
REFERENCES
[1] G. Albuquerque, M. Eisemann, D. J. Lehmann, H. Theisel, and M. A. Magnor. Improving the Visual Analysis of High-dimensional Datasets Using Quality Measures. In Proc. of the IEEE Conf. on Visual Analytics Science and Technology, pp. 19–26. IEEE Computer Society, 2010. doi: 10.1109/VAST.2010.5652433
[2] N. V. Andrienko and G. L. Andrienko. Exploratory Analysis of Spatial and Temporal Data: A Systematic Approach. Springer, 2006. doi: 10.1007/3-540-31190-4
[3] M. Ankerst, S. Berchtold, and D. A. Keim. Similarity Clustering of Dimensions for an Enhanced Visualization of Multidimensional Data. In IEEE Symp. on Information Visualization, pp. 52–60, 1998. doi: 10.1109/INFVIS.1998.729559
[4] A. O. Artero, M. C. F. de Oliveira, and H. Levkowitz. Enhanced High Dimensional Data Visualization through Dimension Reduction and Attribute Arrangement. In Intl. Conf. on Information Visualisation, pp. 707–712, 2006. doi: 10.1109/IV.2006.49
[5] M. Behrisch, M. Blumenschein, N. W. Kim, L. Shao, M. El-Assady, J. Fuchs, D. Seebacher, A. Diehl, U. Brandes, H. Pfister, T. Schreck, D. Weiskopf, and D. A. Keim. Quality metrics for information visualization. Comput. Graph. Forum, 37(3):625–662, 2018. doi: 10.1111/cgf.13446
[6] I. Borg and T. Staufenbiel. Performance of Snow Flakes, Suns, and Factorial Suns in the Graphical Representation of Multivariate Data. Multivariate Behavioral Research, 27(1):43–55, 1992. doi: 10.1207/s15327906mbr2701_4
[7] R. Borgo, J. Kehrer, D. H. S. Chung, E. Maguire, R. S. Laramee, H. Hauser, M. Ward, and M. Chen. Glyph-based Visualization: Foundations, Design Guidelines, Techniques and Applications. In Eurographics - State of the Art Reports, pp. 39–63, 2013. doi: 10.2312/conf/EG2013/stars/039-063
[8] S. Bremm, M. Hess, T. von Landesberger, and D. W. Fellner. PCDC - On the Highway to Data - A Tool for the Fast Generation of Large Synthetic Data Sets. In EuroVis Workshop on Visual Analytics. Eurographics Association, 2012. doi: 10.2312/PE/EuroVAST/EuroVA12/007-011
[9] L. D. Caro, V. Frías-Martínez, and E. Frías-Martínez. Analyzing the Role of Dimension Arrangement for Data Visualization in Radviz. In Advances in Knowledge Discovery and Data Mining, pp. 125–132. Springer, 2010. doi: 10.1007/978-3-642-13672-6_13
[10] S. Cheng, W. Xu, and K. Mueller. RadViz Deluxe: An Attribute-Aware Display for Multivariate Data. Processes, 5(4):75, 2017. doi: 10.3390/pr5040075
[11] A. Dasgupta and R. Kosara. Pargnostics: Screen-Space Metrics for Parallel Coordinates. IEEE Trans. on Vis. and Comp. Graph., 16(6):1017–1026, 2010. doi: 10.1109/TVCG.2010.184
[12] R. F. Erbacher. Glyph-based generic network visualization. In Visualization and Data Analysis, pp. 228–237, 2002. doi: 10.1117/12.458790
[13] M. Ester, H. Kriegel, J. Sander, and X. Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proc. of the Intl. Conf. on Knowledge Discovery and Data Mining, pp. 226–231, 1996.
[14] F. Fischer, J. Fuchs, and F. Mansmann. ClockMap: Enhancing Circular Treemaps with Temporal Glyphs for Time-Series Data. In Eurographics Conference on Visualization, EuroVis 2012, Vienna, Austria, June 5-8, 2012. doi: 10.2312/PE/EuroVisShort/EuroVisShort2012/097-101
[15] M. Friendly and E. Kwan. Effect ordering for data displays. Computational Statistics & Data Analysis, 43(4):509–539, 2003. doi: 10.1016/S0167-9473(02)00290-6
[16] J. Fuchs, P. Isenberg, A. Bezerianos, F. Fischer, and E. Bertini. The Influence of Contour on Similarity Perception of Star Glyphs. IEEE Trans. on Vis. and Comp. Graph., 20(12):2251–2260, 2014. doi: 10.1109/TVCG.2014.2346426
[17] J. Fuchs, P. Isenberg, A. Bezerianos, and D. A. Keim. A Systematic Review of Experimental Studies on Data Glyphs. IEEE Trans. on Vis. and Comp. Graph., 23(7):1863–1879, 2017. doi: 10.1109/TVCG.2016.2549018
[18] S. García, J. Luengo, and F. Herrera. Data Preprocessing in Data Mining, vol. 72 of Intelligent Systems Reference Library. Springer, 2015. doi: 10.1007/978-3-319-10247-4
[19] M. Harrower and C. A. Brewer. ColorBrewer.org: An Online Tool for Selecting Colour Schemes for Maps. The Cartographic Journal, 40(1):27–37, 2003.
[20] P. Hoffman, G. G. Grinstein, K. A. Marx, I. Grosse, and E. Stanley. DNA visual and analytic data mining. In IEEE Visualization Proc., pp. 437–442, 1997. doi: 10.1109/VISUAL.1997.663916
[21] A. Inselberg and B. Dimsdale. Parallel Coordinates: A Tool for Visualizing Multi-dimensional Geometry. In Proc. IEEE Visualization, pp. 361–378, 1990. doi: 10.1109/VISUAL.1990.146402
[22] G. Kayaert and J. Wagemans. Delayed shape matching benefits from simplicity and symmetry. Vision Research, 49(7):708–717, 2009. doi: 10.1016/j.visres.2009.01.002
[23] C. Kintzel, J. Fuchs, and F. Mansmann. Monitoring large IP spaces with ClockView. In Intl. Symp. on Visualization for Cyber Security, p. 2, 2011. doi: 10.1145/2016904.2016906
[24] A. Klippel, F. Hardisty, R. Li, and C. Weaver. Colour-Enhanced Star Plot Glyphs: Can Salient Shape Characteristics Be Overcome? Cartographica, 44(3):217–231, 2009. doi: 10.3138/carto.44.3.217
[25] A. Klippel, F. Hardisty, and C. Weaver. Star Plots: How Shape Characteristics Influence Classification Tasks. Cartography and Geographic Information Science, 36(2):149–163, 2009. doi: 10.1559/152304009788188808
[26] H. Kriegel, P. Kröger, and A. Zimek. Clustering High-Dimensional Data: A Survey on Subspace Clustering, Pattern-Based Clustering, and Correlation Clustering. ACM Transactions on Knowledge Discovery from Data, 3(1):1:1–1:58, 2009. doi: 10.1145/1497577.1497578
[27] T. V. Long. ArcViz: An Extended Radial Visualization for Classes Separation of High Dimensional Data. In Intl. Conf. on Knowledge and Systems Engineering, pp. 158–162, 2018. doi: 10.1109/KSE.2018.8573428
[28] T. Opach, S. Popelka, J. Dolezalova, and J. K. Rød. Star and polyline glyphs in a grid plot and on a map display: which perform better? Cartography and Geographic Information Science, 45(5):400–419, 2018. doi: 10.1080/15230406.2017.1364169
[29] W. Peng, M. O. Ward, and E. A. Rundensteiner. Clutter Reduction in Multi-Dimensional Data Visualization Using Dimension Reordering. In IEEE Symp. on Information Visualization, pp. 89–96, 2004. doi: 10.1109/INFVIS.2004.15
[30] T. Ropinski and B. Preim. Taxonomy and usage guidelines for glyph-based medical visualization. In Simulation and Visualization, pp. 121–138, 2008.
[31] T. Rzeźniczak. Evaluation of multidimensional visualization techniques for medical patterns representation. Journal of Theoretical and Applied Computer Science, 7(4):70–85, 2013.
[32] D. Sacha, A. Stoffel, F. Stoffel, B. C. Kwon, G. P. Ellis, and D. A. Keim. Knowledge Generation Model for Visual Analytics. IEEE Trans. on Vis. and Comp. Graph., 20(12):1604–1613, 2014. doi: 10.1109/TVCG.2014.2346481
[33] J. Siegel, E. Farrell, R. Goldwyn, and H. Friedman. The Surgical Implications of Physiologic Patterns in Myocardial Infarction Shock. Surgery, 72(1):126–141, 1972.
[34] A. Tatu, G. Albuquerque, M. Eisemann, P. Bak, H. Theisel, M. A. Magnor, and D. A. Keim. Automated Analytical Methods to Support Visual Exploration of High-Dimensional Data. IEEE Trans. on Vis. and Comp. Graph., 17(5):584–597, 2011. doi: 10.1109/TVCG.2010.242
[35] M. O. Ward. Multivariate Data Glyphs: Principles and Practice. In Handbook of Data Visualization, pp. 179–198. Springer, 2008. doi: 10.1007/978-3-540-33037-0_8
[36] C. Ware. Information Visualization: Perception for Design. San Francisco, CA: Morgan Kaufmann, 2004.
[37] J. Yang, W. Peng, M. O. Ward, and E. A. Rundensteiner. Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration of High Dimensional Datasets. In IEEE Symp. on Information Visualization, pp. 105–112, 2003. doi: 10.1109/INFVIS.2003.1249015
[38] Z. Zhang, K. T. McDonnell, and K. Mueller. A Network-Based Interface for the Exploration of High-Dimensional Data Spaces. In