Blue Noise Plots
Christian van Onzenoodt, Gurprit Singh, Timo Ropinski, Tobias Ritschel
Ulm University · Max-Planck Institute for Informatics, Saarbrücken · University College London

Figure 1:
Blue Noise Plots (left) prevent clutter and provide visually more appealing results than frequently used jitter plots (right), shown here on the geyser data set (duration in seconds). Importantly, Blue Noise Plots are unbiased, in the sense that no data point is ever changed and strictly all points of the sample are presented.
Abstract
We propose Blue Noise Plots, two-dimensional dot plots that depict data points of univariate data sets. While often one-dimensional strip plots are used to depict such data, one of their main problems is visual clutter resulting from overlap. To reduce this overlap, jitter plots were introduced, whereby an additional, non-encoding plot dimension is added, along which the data-point-representing dots are randomly perturbed. Unfortunately, this randomness can suggest non-existent clusters, and often leads to visually unappealing plots in which overlap might still occur. To overcome these shortcomings, we introduce Blue Noise Plots, where random jitter along the non-encoding plot dimension is replaced by optimizing all dots to keep a minimum distance in 2D, i.e., blue noise. We evaluate the effectiveness as well as the aesthetics of Blue Noise Plots through both a quantitative and a qualitative user study. The Python implementation of Blue Noise Plots is available here.
1. Introduction
Consider depicting a univariate data set, e.g., observed ages in a cohort, on paper. While we could simply report first-order statistics, such as the mean, this would be an oversimplification in many important cases [Tal07]. Instead, we would like to show all data points, and thus ask for the optimal way to represent these with dots in a two-dimensional plot.

A strip plot simply plots data points on a single horizontal axis, while an observer is free to apply domain knowledge to judge what is the density, what might be modes, or what might be outliers. Recent efforts to communicate data to a wider public, and not just to experts, have resulted in such plots being used increasingly in print media, television, and on the web. Strip plots are most effective when the number of data points is still low enough to be displayed, but high enough to represent the information. The first row of Fig. 2 shows examples of strip plots on a univariate toy data set. Unfortunately, the main disadvantage of using strip plots to depict such data sets is clutter, which often leads to overdraw. As such, if two dots x_i and x_j are closer than what the printer, display or the human visual system can reliably discern, the advantage is lost, since not all data points are effectively communicated. For instance, the data set { , , } results in the same visual representation as the data set { , , , , }, because the individual dots representing the values 2 and 4 would not be discernible when plotting the second data set, due to overdraw.
Figure 2:
Three different plotting approaches (rows) for two different univariate datasets at two different random seeds (columns). Strip plots always fail to convey the datasets, as they look the same. Jitter plots may sometimes fail: dataset B looks like dataset A for seed 2, although they are different. Our Blue Noise Plot never fails.
As a remedy, jitter plots have been proposed [Cha18, Tuk77], which introduce an additional, non-encoding dimension along which dots are randomly perturbed. For our example of univariate data points plotted in two dimensions, this means to simply move the respective dots vertically by a random amount. The second row of Fig. 2 shows examples of such jitter plots on the aforementioned data sets. While jitter plots usually reduce the amount of clutter and are easy to implement, they have three main drawbacks. First, the introduced randomness leads to gaps and clusters, which might be falsely perceived as features present in the data set. Second, no minimum distance between dots is enforced, which in the worst case might lead to overlap, something we have observed in many real-world examples. Lastly, jitter plots often look visually unappealing.

By introducing Blue Noise Plots, we provide a solution to the three main drawbacks of jitter plots. A conceptual comparison is made in the last row of Fig. 2. Computer graphics has explored blue noise [Uli88], that is, dot patterns that are still random, but without dots getting too close to each other. By proposing a modified Lloyd relaxation algorithm [Llo82], we can extend jitter plots to become Blue Noise Plots. Importantly, Blue Noise Plots are unbiased, in the sense that we make no assumption on how data points are represented, no data point is ever changed, and strictly all data points of a data set are presented, setting them apart from other methods [BS04, BW08, KHD∗10, MG13, MPOW17] that either re-sample or change the way data points are presented.

To investigate the impact of Blue Noise Plots on task performance and visual appeal, we have conducted a crowd-sourced user study, whereby the obtained results indicate that Blue Noise Plots are beneficial over conventional jitter plots. Furthermore, we have performed a qualitative study assessing the visual appeal of Blue Noise Plots, and discuss objective quality measures.
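To make the jitter construction concrete before contrasting it with our approach, a minimal sketch in Python follows. This is a hypothetical helper, not the authors' released implementation; the function name and the `height` parameter are our own:

```python
import numpy as np

def jitter_plot_positions(x, height=1.0, seed=None):
    """Jitter construction: the encoding (horizontal) coordinate is the
    data itself; the non-encoding (vertical) coordinate is drawn
    uniformly at random, independently per dot."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = rng.uniform(0.0, height, size=x.shape)
    return np.column_stack([x, y])
```

Because the vertical coordinate is unconstrained noise, nothing prevents two dots from landing almost on top of each other, which is exactly the failure mode discussed above.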
2. Previous Work
Our work addresses the visualization problem of plotting univariate data sets using the computer graphics methodology of blue noise, both of which we will review now.
Visualizing univariate data sets.
To visualize univariate data sets, several different plot types exist. We distinguish between direct and aggregated visualizations. While direct methods allow for the depiction of the individual data points forming the data set, aggregated methods only communicate the data set per se, often in an approximate manner. Several well-known plots fall into the latter category, such as histograms [Pea94], box plots [Spe69], and violin plots [HN98]. Nevertheless, this category is not the focus of our work; we instead aim for direct techniques, which explicitly convey the existing data points. One of these techniques that is frequently used is the strip plot [Cle85], where data points are represented as symbols, usually dots or circles. As stated above, these symbols are simply plotted all along the same dimension, independent of the occurring densities. While strip plots are an intuitive way to convey univariate data sets in a direct manner, they come with severe limitations, as they suffer from overdraw in dense regions. As a consequence, they might introduce distortion, as the maximum density to be communicated is limited by this overdraw. There are two main approaches to deal with the problem of overdraw: the size and/or alpha blending value of the representing symbols can be altered, or data points can be transformed such that overdraw is reduced. While naturally these approaches can be combined, we solely focus on transformation approaches in this paper. Among the techniques applying transformations, the stacked dot plot is the most basic, as it simply stacks the shown dots, which effectively resembles a bar chart for non-continuous data [Cle93, Wil99]. For continuous data sets, more elaborate packing schemes need to be employed in order to obtain an acceptable layout.
So-called beeswarm plots exploit such packing, which enables them to obtain a stacked and dense representation without the need for binning, but their simple construction leads to false features and clustering. In contrast to the stacked representations described above, jitter plots randomly distribute all points in a given range along an additional, non-encoding dimension [Tuk77]. Thus, it also becomes possible to reduce the amount of overdraw, to a degree dependent on the plot size and the occurring densities. Jittering originally goes back to Chambers et al. [Cha18], while Tukey and Tukey additionally exploit constraints [TT90].

Recently, several visualization approaches have been proposed which deal with the shortcomings of dot plots in general. Bachthaler and Weiskopf have introduced continuous scatter plots, which sacrifice the discrete nature of scatter plots in order to obtain a dense visualization [BW08]. Mayorga and Gleicher go further by automatically grouping some dots while keeping others [MG13], which is combined with interactive exploration. Along a similar line, Bertini and Santucci introduce non-uniform samplings in order to communicate density in 2D scatter plots [BS04]. Keim et al. introduce generalized scatter plots, where the dot locations are modified in order to reduce overlap [KHD∗10]. Rapp et al. [RPD20] also address the problem of overdraw in scatter plots: they select a subset of data points from a large data set such that the resulting pattern follows the density, yet has blue noise. Our work does not select subsets, but shows all data points, and introduces an additional, non-encoding dimension so that dots can become blue noise in the first place.

Some forms of box plot also communicate individual data points, e.g., when they are outliers, and can thus be considered a combination of aggregated and direct visualization. Many of the contributions made in this paper can also be applied in this context.
Blue noise.
Dot patterns are often described in terms of their expected power spectrum profile. Patterns exhibiting mostly high-frequency content in their power spectrum are characterized as blue noise. The resulting spatial distribution of dots respects some minimum distance, giving perceptually pleasing patterns [Yel83]. Consequently, blue noise has been widely adopted in many computer graphics applications, including halftoning [Uli88], stippling [Sec02], artistic packing [RRS13], anti-aliasing [DW85] and variance reduction for Monte Carlo rendering [SOA∗].

Algorithm 1 (Lloyd relaxation):
 1: P ← uniform()
 2: repeat
 3:   V = voronoi(P)
 4:   for p_i ∈ P do
 5:     p_i ← 0
 6:     for q_j ∈ V_i do
 7:       p_i ← p_i + q_j / |V_i|
 8:     end for
 9:   end for
10: until converged
11: plot(P)

Algorithm 2 (Jitter Plots):
 1: P ← stack(X, uniform())
 2: plot(P)

Algorithm 3 (Blue Noise Plots):
 1: P ← stack(X, uniform())
 2: repeat
 3:   V = voronoi(P)
 4:   for p_i ∈ P do
 5:     p_i ← 0
 6:     for q_j ∈ V_i do
 7:       p_i ← p_i + q_j / |V_i|
 8:     end for
 9:   end for
10:   P ← stack(X, unstack(P, X))
11: until converged
12: plot(P)

Figure 3:
Our approach (right) is a combination of Lloyd relaxation (left) and jitter plots (middle) with a data-constraint extension. The function stack(A, B) stacks vector A on top of vector B. The function unstack(A, B) returns the vector A with the dimensions from B removed.

There are methods [RRSG15] that treat different dimensions in the dot optimization differently, but without complying to data points, only optimizing for different spectra. In this work, we show that dots distributed while obeying blue noise provide better data visualizations. Therefore, we optimize the dot layout using Lloyd relaxation [Llo82] to obtain a blue noise distribution. However, unlike traditional approaches, our optimization keeps the encoding dimension fixed while optimizing the dot positions along the other, non-encoding dimension. Our optimization runs in a higher dimension than the data and uses a novel distance metric that emphasizes the non-encoding dimension to guide the optimization. Hu et al. [HSVK∗19] are a rare example of a visualization paper that relates to blue noise. Their task might be easier than ours, as they assume sampling from multiple importance functions to produce multi-class blue noise, without adhering to ordinality or coordinates.
3. Blue Noise Plots
We will here give a formal definition of our approach, starting from in- and output (Sec. 3.1), to a variational formulation with constraints (Sec. 3.2) and our implementation to minimize it (Sec. 3.3). The section concludes with several extensions, such as centrality, resembling beeswarm plots, automatically choosing the plot height (Sec. 3.4), as well as introducing a multi-class version (Sec. 3.5).
3.1. Input and Output

Input to our method is a set X = {x_i ∈ R} of univariate data points, where values are assumed to be associated with the horizontal axis. Output is a set Y = {y_i ∈ R} of scalar jitter values associated with the vertical axis. We write P = {p_i = (x_i, y_i) ∈ R²} to denote the combination of data values along the horizontal (encoding) dimension and jitter values along the vertical (non-encoding) dimension into a set of two-dimensional dots.

3.2. Variational Formulation

Optimization is performed over the set of vertical jitter values Y of an output Blue Noise Plot, given the univariate input data set X. We minimize the cost

    arg min_Y  Σ_{y_i ∈ Y}  E_{q ∼ V(p_i, P)} [ κ(p_i, q) ],    (1)

the sum of the expected value E of the κ-distance from the i-th output dot p_i to all sites q in its Voronoi cell V(p_i, P) with respect to all other dots P.

Uniform metric.
For classic Blue Noise Plots, we use

    κ(p, p′) = ‖ (p − p′)ᵀ diag(λ, 1) ‖.    (2)

Here, the constant diagonal matrix diag(λ, 1) emphasises the non-encoding dimension along the vertical direction. We will introduce further metrics in Sec. 3.4 to realize other variants.

3.3. Implementation

While two-dimensional Lloyd relaxation [Llo82] would minimize the distance cost, it unfortunately does not adhere to the hard constraints. In Lloyd relaxation, after a random initialization (Alg. 1, Line 1), every dot is iteratively replaced by its Voronoi cell center (Alg. 1, Lines 2 to 10), followed by a re-computation of the Voronoi cells (Alg. 1, Line 3), in an expectation-maximization procedure [DLR77]. Running it directly will lose the information present in the data sets, as trivially done by jitter plots (Alg. 2). Our main contribution is a solver that extends Lloyd relaxation to produce dot patterns with an even distribution that adhere to the data, as explained in Alg. 3. Including the hard constraint is intuitive: use both dimensions for the relaxation (Alg. 3, Line 1), but never update the encoding dimension (Alg. 3, Line 10). During optimization, dots move vertically, with a single degree of freedom per dot, but their cost computation, including the Voronoi construction, involves 2D. Note how this is different from optimizing only a 2D dot pattern without involving the 1D information. The information is not updated, but it is crucial to include it in the distance computation, so as to satisfy the Lloyd objective in what is perceived: 2D space.

Figure 4:
Lloyd relaxation involves Voronoi cells V (blue, grey and orange areas) computed on the output dot pattern P. These V are sampled using sites q, shown here as rectangles. The optimization relates every dot p_i to all the q_j in its cell V(p_i, P).

In practice, different options to construct Voronoi regions exist. We follow the approach of Balzer et al. [BSD09], but without the capacity constraint. We use 8,192 random 2D points q to discretize the domain. Note that faster GPU methods exist that make use of regular grids [HIKL∗99, LNW∗].

3.4. Plot Height

The output dots P are from a domain that typically is wider than high when using the horizontal axis as the encoding axis and the vertical axis as the non-encoding dimension. This is because most data sets have only a couple of different data points per interval in the data dimension, compared to the total number of data points. Unfortunately, it might not be obvious how to choose an appropriate height for a given data set, both for jitter plots and for Blue Noise Plots. The desired distance between dots is part of the problem definition. Making recommendations on how to choose the distance of two dots such that they become visually discernible is clearly an often-encountered visualization challenge, but out of scope for this work. We will assume it to be known and use a distance of two times the dot radius in all results we show. We will now show how to choose the plot height automatically and optimally.

Figure 5: Auto-height (see text).
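Before turning to the height heuristic, the constrained relaxation of Sec. 3.3 can be sketched compactly. This is an illustrative Python sketch, not the released implementation: the metric weight `lam` stands in for the constant of Eq. 2, and the discrete Voronoi cells are formed by assigning random sites to their nearest dot, as described above:

```python
import numpy as np

def blue_noise_plot_positions(x, height=1.0, n_sites=8192, iters=40,
                              lam=1.0, seed=None):
    """Constrained Lloyd relaxation sketch (cf. Alg. 3): dots keep their
    data (x) coordinate; only the jitter (y) coordinate is moved to the
    y-centroid of the dot's Voronoi cell, discretized by random sites."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    p = np.column_stack([x, rng.uniform(0.0, height, size=x.shape)])
    for _ in range(iters):
        # Random sites discretize the plot domain.
        q = np.column_stack([rng.uniform(x.min(), x.max(), n_sites),
                             rng.uniform(0.0, height, n_sites)])
        # Nearest dot under the weighted metric -> discrete Voronoi cells.
        diff = (q[:, None, :] - p[None, :, :]) * np.array([lam, 1.0])
        owner = np.argmin((diff ** 2).sum(axis=-1), axis=1)
        # Move each dot's y to its cell's y-centroid; x stays fixed (data).
        for i in range(len(p)):
            cell = q[owner == i]
            if len(cell):
                p[i, 1] = cell[:, 1].mean()
    return p
```

The encoding coordinate is never written to during the iteration, which is exactly the hard constraint distinguishing Alg. 3 from plain Lloyd relaxation.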
We assume access to the density function d(x) ∈ R → R⁺, defined on the domain of the data distribution (black in Fig. 5). Typically, we are only given a sample of the distribution, not the true density. Hence the density function needs to be estimated from the sample, for which we use Kernel Density Estimation in practice. Please note that we do not in general rely on density estimation, unless the plot height is chosen automatically or we optimize for centrality (Sec. 3.4.1).

The optimal height depends on the maximal density d_max, the total number of data points n, and the desired distance r between dots. The desired distance is chosen by the user (pink in Fig. 5). It depends on the output medium, whereby a typical choice is to make it twice the size of a dot, so they become discernible. We note that at the data coordinate with density d_max, we need to "stack" r · d_max · n dots (orange in Fig. 5). Ignoring optimal packings with efficiency around 0.9, and assuming a conservative efficiency of 1 instead, stacking these r · d_max · n dots at spacing r requires a height of h = r² · d_max · n (green in Fig. 5).

3.4.1. Centrality

The point density in a Blue Noise Plot might vary. Alternatively, we can restrict the points to not use all the space available along the non-encoding dimension. To this end, for a fixed width (manual or automatic), we limit dots to move less along the non-encoding dimension, resembling the appearance of beeswarm plots. We refer to such a plot as having centrality.

Choosing a varying height is based on a generalization of the aforementioned automatic height (Sec. 3.4). Instead of choosing a single height value h, we choose height as a function h(x) of the data coordinate x itself. This idea is conveniently realized by changing the metric itself to be non-uniform. How distant two dots are depends on the density at the midpoint coordinate x̄ = (x + x′)/2:

    κ(p, p′) = ‖ (p − p′)ᵀ diag(d(x̄), 1) ‖.    (3)

Note that p and p′ are typically close, so even while the function is not a metric for all pairs, it locally is, as the density function is smooth. In particular, it can be chosen arbitrarily smooth by using a smooth kernel in the density estimation, which, in the limit, corresponds to the constant height.

3.5. Multi-class

For data sets comprised of different classes, Blue Noise Plots can be extended to multi-class blue noise [Wei10]. Here, all data points in one class optimize Eq. 1, as do all possible unions of all data points in all classes. The solver implements this by alternating between the individual classes and their unions.
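The automatic height of Sec. 3.4 can be sketched as follows, reading the derivation as stacking k = r · d_max · n dots at spacing r, so h = r² · d_max · n. The function name and grid resolution are our choices; SciPy's `gaussian_kde` stands in for the Kernel Density Estimation mentioned above:

```python
import numpy as np
from scipy.stats import gaussian_kde

def auto_plot_height(x, r, grid_res=512):
    """Automatic plot height (Sec. 3.4): at the densest data coordinate
    about r * d_max * n dots must be stacked; at spacing r this needs a
    height of h = r^2 * d_max * n (packing efficiency taken as 1)."""
    x = np.asarray(x, dtype=float)
    density = gaussian_kde(x)                      # KDE estimate of d(x)
    grid = np.linspace(x.min(), x.max(), grid_res)
    d_max = density(grid).max()
    return r * r * d_max * len(x)
```

The height grows linearly with the dot count n, matching the observation that retaining a fixed dot distance for large data sets is only possible at the expense of a taller plot.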
4. Results
We present both qualitative and quantitative results of our work.
Single-class.
We show results of our approach, as well as existing ones, on typical data sets in Fig. 6. We see that our approach minimizes the amount of visual clutter in the form of overlap for all examples. While this is particularly the case in denser regions, even in sparser regions (for example in Fig. 6c, on the right), dots are more evenly distributed over the available domain, supporting the perception of individual dots.
Automatic height.
As described in Sec. 3.4, we dynamically adapt the height of the plot depending on the number of dots and their distribution. While Fig. 7 shows an example of differently sized subsets of the geyser data set using a constant plot height, Fig. 8 shows the dynamic adaptation. We do so to compromise between a compact plot and room for the dots to relax and thereby reduce overlap. If the number of data points gets large, retaining a fixed distance is only possible at the expense of a tall plot.
Centrality.
Fig. 9 shows results where the height is chosen automatically, but varying with the data dimension. Depending on the reliability of the underlying density estimation, this can be an effective additional cue. At any rate, adding blue noise improves upon jitter in readability and aesthetics.

Figure 6: Comparison of three different data sets (sleep: total sleep (h); geyser: duration (s); tips: total bill ($)), each of them visualized using a traditional jitter plot and our Blue Noise Plot.

Figure 7: Comparison of plots with different numbers of dots. All plots are drawn using the same height, but with different numbers of dots. These plots show a random subset of the geyser data set, visualized using our Blue Noise Plot: a) shows 64 dots, b) shows 128 dots, and c) shows 256 dots.
Multi-class.
Finally, we show an extension to multiple classes of data points. Here, every input point additionally has a class label. We use our method to produce a plot that is blue noise for all classes jointly, as well as for every class on its own. Fig. 10 shows an example of this, where the first two rows show the blue-noise-distributed dots of the individual classes. The third row shows the final plot, where the first two rows are combined. Fig. 11 shows more examples of our Blue Noise Plots, encoding multiple classes. Here, we can also see that our approach nicely distributes all the dots, as well as the dots of the individual classes, over the entire domain. Further, Fig. 11 and Fig. 12 show examples of quantized data. While the blue noise pattern is less prominent in this case, this shows another strength of our approach: in contrast to jitter plots, where the overlap between dots is amplified by this type of data distribution, our approach spreads out the dots vertically.

Figure 8: Comparison of adaptive plots with different numbers of dots. These plots show a random subset of the geyser data set, visualized using our Blue Noise Plot: a) shows 64 dots, b) shows 128 dots, and c) shows 256 dots.

Figure 9: Optimal constant plot height (first and third) and a varying height (second and fourth), both for Blue Noise Plots.
Icons.
Inspired by approaches which represent data points with more complex primitives instead of dots [HHD03, RRS13], we have used our method to place icons, as seen in Fig. 13. Here, every data point has a unique icon, making relations visible without bias or clutter.

Figure 10: Multi-class Blue Noise Plot for the tips data set with two classes, dinner and lunch, encoded into color. The first two rows show the individual classes, the third the combined plot. Please note how this is three visualizations of one data set, fulfilling all intra- and inter-class constraints, as well as the data constraints, simultaneously.

Figure 11: Different examples of multi-class data sets (bill length [penguins], log viral load [covid], years [gapminder]), visualized using jitter plots as well as Blue Noise Plots.
Parameter choice.
A typical result for a data set containing 256 data points, as shown in this paper, requires 40 iterations of Lloyd relaxation with 8,192 Voronoi samples, resulting in a total time of six seconds for a naive, non-parallel implementation.
Analysis.
We analyze our results both from the graphics perspective, using the spectral quality of Blue Noise Plots as dot patterns, as well as with overlap measures used to analyze plots.
Figure 12: Quantized data sets (petal length (cm) [iris], miles/gallon [cars]), such as shown here, where many data points share only a few distinct x values, are difficult to optimize for, but worth addressing: besides being less visually appealing in many cases according to our study, jitter plots are, due to clutter, more difficult to read.
Figure 13: Blue Noise Plots can further be used to position extended primitives instead of dots; here, little icons depicting soccer clubs (wins [Premier League]) or political parties (auxiliary income [bundestag]).
To perform spectral analysis, we compute the expected power spectrum of dots obtained from the geyser dataset. We generate 100 different realizations of the dot patterns, compute their power spectra and average these to get the expected spectrum. Fig. 14 shows these expected power spectra for OUR (right) and JITTER (left). We compare this against vanilla Lloyd relaxation (middle), which is not a plotting method, as it does not produce an unbiased result, but can serve as an upper bound of what we could achieve when using it as a backbone. For JITTER, the spectrum is flat like white noise. OUR approach gives a dark region in the middle of the spectrum, confirming the blue noise behavior. The bright line in the middle is due to the non-uniform density of dots along the horizontal axis, where they obey the data values. If we run Lloyd relaxation without constraining the data along the horizontal axis, the dark region in the middle gets larger (middle) and we get uniform-density points; that is why there is no bright line in the middle spectrum. The anisotropic structure of the dark region is due to the non-square domain of the plot. A domain of length L has a valid spectrum at only every 1/L-th frequency [SSC∗]. For a plot with x-axis ∈ [0, 1) and y-axis ∈ [0, 0.2), the spectrum is valid only at integer frequencies along the x-axis and every (1/0.2 = 5)-th frequency along the y-axis. Lastly, the central dark line in the Lloyd relaxation spectrum (middle) implies denser stratification of the x-axis w.r.t. the y-axis.
Figure 14:
The top part shows the plot, its density function as a blue line, and the unwarped plot points. When performing this on jitter, blue noise based on Voronoi relaxation, and the Blue Noise Plot, we obtain the three spectra seen. A well-distributed dot set should exhibit low energy (black) in the low-frequency regions (center). While being inferior to Lloyd relaxation, we fare substantially better than jitter.
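The expected power spectrum used in this analysis can be estimated by averaging periodograms over several realizations. The following is a generic sketch under our own naming, not the paper's evaluation code, and it assumes dots normalized to the unit domain:

```python
import numpy as np

def expected_power_spectrum(point_sets, max_freq=16):
    """Average the periodogram |sum_j exp(-2 pi i <f, x_j>)|^2 / n of each
    realization over an integer frequency grid. Blue noise shows low
    energy around the DC peak; white noise (jitter) is flat at 1."""
    freqs = np.arange(-max_freq, max_freq + 1)
    fx, fy = np.meshgrid(freqs, freqs)
    spectrum = np.zeros(fx.shape)
    for pts in point_sets:
        phase = -2j * np.pi * (fx[..., None] * pts[:, 0]
                               + fy[..., None] * pts[:, 1])
        amplitude = np.exp(phase).sum(axis=-1)
        spectrum += np.abs(amplitude) ** 2 / len(pts)
    return spectrum / len(point_sets)
```

For a non-square domain, only the frequencies at multiples of one over the domain length are meaningful, which is the validity restriction discussed above.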
We also analyzed our plots using a point overlap metric presented for scatter plots [vOHR20]. In Fig. 15 we see this overlap metric (less is better) applied to the results of OUR and JITTER at different dot counts and for different datasets. This quantifies what was hinted at before: with JITTER, one can get almost-acceptable results as well as very bad ones (as seen by the high variance; once in a while JITTER might discover an accidental Blue Noise Plot), while OUR consistently provides low variance with less overlap. When the dot count increases, the variance of JITTER becomes smaller, but the gap to OUR becomes even wider.
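The metric of [vOHR20] is more involved; as a simple stand-in for intuition, one can report the fraction of dots whose nearest neighbour sits closer than one dot diameter:

```python
import numpy as np

def overlap_fraction(points, radius):
    """Fraction of dots overlapping at least one other dot, i.e. whose
    nearest neighbour is closer than one dot diameter (2 * radius).
    A simplified stand-in, not the metric of [vOHR20]."""
    p = np.asarray(points, dtype=float)
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return float((d.min(axis=1) < 2 * radius).mean())
```

A jitter layout typically scores noticeably worse on such a measure than a relaxed layout of the same data at the same plot size.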
5. User Study

Figure 15: Overlap analysis for the gapminder (continent) data set at point counts of 64, 128, 256 and 512 (see text).

To evaluate Blue Noise Plots, we conducted two user studies. The first is a preference study (Sec. 5.1), indicating that Blue Noise Plots are considered more appealing than jitter plots. The second is a threshold experiment (Sec. 5.2), confirming that users perform better at perceiving the underlying distribution when using our method. In both experiments, we compare OUR approach (Alg. 3) to a JITTER baseline (Alg. 2).
5.1. Preference Study

To evaluate the visual preference, we conducted a user study with a total of N = 12 participants (3 female, 9 male). They were shown nine data sets (tips, titanic, iris, penguin, geyser, car, gapminder, tooth, and diamond), each visualized using both the OUR and the JITTER treatment, presented in a randomized side-by-side layout. They were asked two questions: first, to rate which one is "more visually appealing" on a choice-enforcing four-point Likert scale, ranging from "Strongly agree" to "Strongly disagree"; second, to indicate which treatment, if any, they prefer.
Analysis.
Analyzing responses to the first question using a Mann–Whitney U test, we find OUR to be rated significantly (p < .05) more visually appealing than JITTER, with higher responses for tips (difference between JITTER and OUR of 1.25), penguin (0.75), iris (0.50), tooth (0.58) and gapminder (0.58), and lower responses (no significance) for car (0.08), titanic (0.16), geyser (0.25) and diamonds (0.25).

For the second question, we found a preference for OUR technique in 62.04% of all responses, in 9.26% of the cases a preference towards the JITTER plot, and 28.7% without a preference. When further analysing responses to the second question for the individual data sets, we find preferences of diamond: 83%, gapminder: 75%, geyser: 75%, penguin: 83%, tips: 83%, car: 58%, and tooth: 58%, the latter possibly due to the fact that these data sets do not suffer from overdraw.

Free-text responses.
Afterwards, we gave participants the option to respond to the following question using a free-text field: "Do you prefer one of the options? If yes, why?". While analyzing the free-text responses, we found that our participants appreciated the Blue Noise Plot not only for being "prettier", but also for being easier to understand. They for example stated that the Blue Noise Plot is "definitely prettier", that it looks "cleaner and less noisy" and "more organized". Besides these aesthetic aspects, they also stated that a Blue Noise Plot is "easier to understand", that dots are "more easily distinguishable" and "easier to count". This indicates that our approach might not only be more visually pleasing, but also improves the understanding of the data, informing the next study to confirm these subjective judgements. Further studies would be required to understand preference for variants of our approach, such as centrality or multi-class patterns.
5.2. Threshold Experiment

A total of N = 232 participants from the Amazon MTurk Masters population were simultaneously shown a dot plot on the left and two variants of a distribution on the right (Fig. 16). They were tasked to indicate, in a two-alternative forced choice, which variant of the distribution corresponds to the dot plot. Dot plots randomly used either OUR or JITTER. Distributions comprised B-spline curves with five uniformly-placed control points, drawn as line plots. Their variants result from choosing a random control point in every trial and perturbing it vertically by an offset O. In every trial, a staircase procedure (QUEST [WP83]) was conducted to estimate the threshold of O, i.e., the level of difference of the reference distribution at which different dot plots allow humans to answer correctly in 75% of the cases. A successful treatment would have a lower such threshold, which is the dependent variable we record in units of just-noticeable differences (JND) [OJEF18, HYFC14, KH15].

Figure 16:
Experimental stimulus, showing a dot plot of a givendata set on the left, and two reference distributions to the right.
Data preparation.
For 72 participants, the threshold experiment did not converge after 100 trials. In a staircase procedure without bounds, this indicates they clicked randomly, as any deterministic response will ultimately converge to a value, be it high or low. Filtering resulted in 160 valid responses. Based on timings from piloting, participants were paid $2, for a target rate of $8/hour.
Analysis.
A Mann-Whitney U test finds a significantly smaller threshold for Blue Noise Plots (Mdn = ., IQR = .) than for jitter plots (Mdn = ., IQR = .; U = ., p = .).

Discussion.
At first, this study design can appear contrived, and one may ask why we did not perform a direct comparison. However, there is no single reliable offset O that is valid over all subjects, their viewing conditions, stimuli, training effects, etc. Hence, no such O could be found in a pilot study or by any other process. Consequently, the difference to study needs to adapt to the conditions, and this is exactly what a staircase procedure does. Next, one could wonder why JND is a measure of success. A JND is the smallest change a channel (from algorithm over display to the human visual system) encodes. An efficient visual channel (such as we want our technique to be) aims to reproduce as many different values as possible, to maximize the entropy, realizing communication with a high bandwidth. Hence, our smaller JNDs indicate the task was made easier, as detailed by van Onzenoodt et al. [OJEF18].
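The analysis above relies on a one-sided Mann-Whitney U test over per-participant thresholds. A self-contained sketch of that test (with hypothetical threshold values, not the study's data, and using the usual normal approximation without tie correction):

```python
import math

def mann_whitney_u(a, b):
    """Mann-Whitney U statistic comparing sample `a` against `b`,
    with a one-sided normal-approximation p-value for the
    alternative that values in `a` tend to be smaller.
    (No tie correction; sufficient as an illustration.)"""
    n1, n2 = len(a), len(b)
    # U counts, over all pairs, how often a value from `a` exceeds
    # one from `b`; ties count one half.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    p = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # P(Z <= z)
    return u, p

# Hypothetical per-participant JND thresholds (illustrative only):
jnd_blue = [0.6, 0.7, 0.8, 0.9, 0.7]
jnd_jitter = [1.0, 1.1, 1.2, 0.9, 1.3]
u, p = mann_whitney_u(jnd_blue, jnd_jitter)
```

A small U (few Blue Noise thresholds exceeding jitter thresholds) yields a small one-sided p, matching the reported direction of the effect.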
6. Discussion

Lloyd relaxation backbone. We use Lloyd relaxation as an admittedly simple means to achieve a blue noise spectrum. Many other, more refined techniques have been proposed [DGBOD12, BSD09, Fat11, SGBW10, ÖG12, ZHWW12, LSM∗19].

Point quality. We have shown many examples that clearly outperform jitter plots, a baseline found in countless papers printed every day. We further analyzed the dot quality according to state-of-the-art dot correlation metrics. Still, the result quality of even the most naïve blue noise method could be considered superior to ours, but this is not a plausible comparison to make. General but biased graphics techniques can remove or add dots, move them freely, etc., making the task much easier than our unbiased setting. But even if there is a gap, it is not clear whether the patterns we produce are actually close to the best patterns one could hope for, even with those additional constraints. A reader is encouraged to apply the blue noise Turing test: is it obvious how to move the points to make the pattern actually better for a human? We think, yes, maybe, but in many cases only with diminishing returns compared to the improvement over jitter. Future work might find optimization approaches that yield point sets that are unbiased in our sense, yet of even higher spectral quality.
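The Lloyd-relaxation backbone described at the start of this section can be sketched in a few lines. The version below is an illustrative Monte Carlo approximation, not the paper's actual solver: uniform samples estimate each dot's Voronoi cell, the data coordinate x stays fixed (so the plot remains unbiased), and only the non-encoding y coordinate moves toward its cell's y-centroid. All names and parameters are hypothetical.

```python
import random

def blue_noise_jitter(xs, height=1.0, iters=50, n_samples=4000, seed=0):
    """Lloyd-style relaxation for a Blue Noise Plot sketch: data
    coordinates `xs` are never changed; each dot's y coordinate is
    repeatedly moved to the y-centroid of its Voronoi cell,
    estimated by Monte Carlo sampling of the plot area."""
    rng = random.Random(seed)
    x_min, x_max = min(xs), max(xs)
    ys = [rng.uniform(0.0, height) for _ in xs]        # start from jitter
    for _ in range(iters):
        sums = [[0.0, 0] for _ in xs]                  # y-sum, count per cell
        for _ in range(n_samples):
            px = rng.uniform(x_min, x_max)
            py = rng.uniform(0.0, height)
            # Assign the sample to its nearest dot (its Voronoi cell).
            i = min(range(len(xs)),
                    key=lambda k: (xs[k] - px) ** 2 + (ys[k] - py) ** 2)
            sums[i][0] += py
            sums[i][1] += 1
        # Move each dot vertically to its cell's y-centroid.
        ys = [s / c if c else y for (s, c), y in zip(sums, ys)]
    return ys
```

Dots sharing the same data value separate vertically, which is exactly the overdraw-avoidance effect the plots are built on; a production implementation would use an exact Voronoi computation rather than sampling.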
Visualization impact.
Hu et al. [HSVK∗19] and Reinert et al. [RRS13] have made links between placement of primitives according to data and distribution quality. Our work is agnostic to the way data points are ultimately presented, so it would be important to close the loop and ask what size, color, icons, or animation would allow for efficient visualization of a dot set, given that the spectrum is now high-quality. In particular, our approach can cover higher dimensions, leading to further visualization questions. We think both our and their work will open up new problems and solutions in visualization, optimizing for aesthetics and clutter avoidance.
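The dot correlation metrics referred to in the point-quality discussion above are typically spectral: blue noise is characterized by suppressed power at low frequencies. As a self-contained illustration (not the metric used in the paper), a radially averaged periodogram of a point set in the unit square can be computed as follows:

```python
import cmath
import math

def radial_power_spectrum(points, max_freq=8):
    """Radially averaged periodogram of a 2D point set in [0,1)^2.
    For blue noise, the averages at small radii are low; for a
    regular grid, power concentrates at the grid frequencies."""
    n = len(points)
    radial = {}
    for fx in range(-max_freq, max_freq + 1):
        for fy in range(-max_freq, max_freq + 1):
            if fx == fy == 0:
                continue  # skip the DC term
            # Fourier coefficient of the point set at frequency (fx, fy).
            s = sum(cmath.exp(-2j * math.pi * (fx * x + fy * y))
                    for x, y in points)
            r = round(math.hypot(fx, fy))
            radial.setdefault(r, []).append(abs(s) ** 2 / n)
    return {r: sum(v) / len(v) for r, v in sorted(radial.items())}
```

For a 4x4 regular grid, for example, the low-frequency bins are empty while power spikes at radius 4, illustrating how such a diagnostic separates pattern types.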
7. Conclusions
We improve the visual appeal and functionality of jitter plots by re-casting their randomization into an optimization procedure that places dots "nicely", resulting in an improved depiction of univariate data sets. During our user studies, we found that our Blue Noise Plots were not only considered visually more appealing than frequently used jitter plots, but also easier to interpret. Our user study also supports our hypothesis that our plots enable a more accurate estimation of univariate data sets, compared to jitter plots. While we use one encoding data dimension and one additional, non-encoding dimension to target the important case of 2D visualization, other combinations are possible. For 3D [SLC∗18] or tangible [LIRC12] visualization, the optimization could be extended to fix two data dimensions and optimize a third one. In other future work, instances of randomization in visualization, e.g., in user interfaces and human-computer interaction, even including the physical world, could be moved forward into problems where information is placed neither regularly nor randomly, but inspired by blue noise.
References

[BS04] Bertini E., Santucci G.: By chance is not enough: preserving relative density through nonuniform sampling. In Proc. InfoVis (2004), pp. 622–629.
[BSD09] Balzer M., Schlömer T., Deussen O.: Capacity-constrained point distributions: a variant of Lloyd's method. ACM Trans Graph 28, 3 (2009), 1–8.
[BW08] Bachthaler S., Weiskopf D.: Continuous scatterplots. IEEE Trans Vis and Comp Graph 14, 6 (2008), 1428–1435.
[Cha18] Chambers J. M.: Graphical methods for data analysis. CRC Press, 2018.
[Cle85] Cleveland W. S.: The elements of graphing data. Wadsworth Publ. Co., 1985.
[Cle93] Cleveland W. S.: Visualizing data. Hobart Press, 1993.
[DGBOD12] de Goes F., Breeden K., Ostromoukhov V., Desbrun M.: Blue noise through optimal transport. ACM Trans Graph 31, 6 (2012), 1–11.
[DLR77] Dempster A. P., Laird N. M., Rubin D. B.: Maximum likelihood from incomplete data via the EM algorithm. J Royal Statistical Society: Series B (Methodological) 39, 1 (1977), 1–22.
[DW85] Dippé M. A. Z., Wold E. H.: Antialiasing through stochastic sampling. In Proc. SIGGRAPH (1985), pp. 69–78.
[Fat11] Fattal R.: Blue-noise point sampling using kernel density model. ACM Trans Graph 30, 4 (2011), 1–12.
[HHD03] Hiller S., Hellwig H., Deussen O.: Beyond stippling—methods for distributing objects on the plane. In Comp Graph Forum (2003), vol. 22, pp. 515–22.
[HIKL∗99] Hoff III K. E., Keyser J., Lin M., Manocha D., Culver T.: Fast computation of generalized Voronoi diagrams using graphics hardware. In Proc. SIGGRAPH (1999), pp. 277–86.
[HN98] Hintze J. L., Nelson R. D.: Violin plots: a box plot-density trace synergism. The American Statistician 52, 2 (1998), 181–184.
[HSVK∗19] Hu R., Sha T., van Kaick O., Deussen O., Huang H.: Data sampling in multi-view and multi-class scatterplots via set cover optimization. IEEE Trans Vis and Comp Graph 26, 1 (2019), 739–748.
[HYFC14] Harrison L., Yang F., Franconeri S., Chang R.: Ranking visualizations of correlation using Weber's law. IEEE Trans Vis and Comp Graph 20, 12 (2014), 1943–52.
[KH15] Kay M., Heer J.: Beyond Weber's law: A second look at ranking visualizations of correlation. IEEE Trans Vis and Comp Graph 22, 1 (2015), 469–78.
[KHD∗10] Keim D. A., Hao M. C., Dayal U., Janetzko H., Bak P.: Generalized scatter plots. Information Visualization 9, 4 (2010), 301–11.
[LIRC12] Lee B., Isenberg P., Riche N. H., Carpendale S.: Beyond mouse and keyboard: Expanding design considerations for information visualization interactions. IEEE Trans Vis and Comp Graph 18, 12 (2012), 2689–2698.
[Llo82] Lloyd S.: Least squares quantization in PCM. IEEE Trans Information Theory 28, 2 (1982), 129–137.
[LNW∗10] Li H., Nehab D., Wei L.-Y., Sander P. V., Fu C.-W.: Fast capacity constrained Voronoi tessellation. In Proc. ACM i3D (2010).
[LSM∗19] Leimkühler T., Singh G., Myszkowski K., Seidel H.-P., Ritschel T.: Deep point correlation design. ACM Trans Graph (Proc. SIGGRAPH Asia) 38, 6 (2019), 1–17.
[MG13] Mayorga A., Gleicher M.: Splatterplots: Overcoming overdraw in scatter plots. IEEE Trans Vis and Comp Graph 19, 9 (2013), 1526–38.
[MPOW17] Micallef L., Palmas G., Oulasvirta A., Weinkauf T.: Towards perceptual optimization of the visual design of scatterplots. IEEE Trans Vis and Comp Graph 23, 6 (2017), 1588–99.
[ÖG12] Öztireli A. C., Gross M.: Analysis and synthesis of point distributions based on pair correlation. ACM Trans Graph 31, 6 (2012), 1–10.
[OJEF18] Ondov B., Jardine N., Elmqvist N., Franconeri S.: Face to face: Evaluating visual comparison. IEEE Trans Vis and Comp Graph 25, 1 (2018), 861–71.
[Pea94] Pearson K.: Contributions to the mathematical theory of evolution. Phil. Trans of the Royal Society of London. A 185 (1894), 71–110.
[RPD20] Rapp T., Peters C., Dachsbacher C.: Void-and-cluster sampling of large scattered data and trajectories. IEEE Trans Vis and Comp Graph 26, 1 (2020), 780–9.
[RRS13] Reinert B., Ritschel T., Seidel H.-P.: Interactive by-example design of artistic packing layouts. ACM Trans Graph 32, 6 (2013).
[RRSG15] Reinert B., Ritschel T., Seidel H.-P., Georgiev I.: Projective blue-noise sampling. Comp Graph Forum (2015).
[Sec02] Secord A.: Weighted Voronoi stippling. In Proc. NPAR (2002), pp. 37–43.
[SGBW10] Schmaltz C., Gwosdek P., Bruhn A., Weickert J.: Electrostatic halftoning. Comp Graph Forum 29, 8 (2010), 2313–27.
[SLC∗18] Sicat R., Li J., Choi J., Cordeil M., Jeong W.-K., Bach B., Pfister H.: DXR: A toolkit for building immersive data visualizations. IEEE Trans Vis and Comp Graph 25, 1 (2018), 715–25.
[SOA∗19] Singh G., Oztireli C., Ahmed A. G., Coeurjolly D., Subr K., Deussen O., Ostromoukhov V., Ramamoorthi R., Jarosz W.: Analysis of sample correlations for Monte Carlo rendering. Comp Graph Forum (Proc. EGSR) 38, 2 (2019).
[Spe69] Spear M. E.: Practical charting techniques. McGraw-Hill, 1969.
[SSC∗20] Singh G., Subr K., Coeurjolly D., Ostromoukhov V., Jarosz W.: Fourier analysis of correlated Monte Carlo importance sampling. Computer Graphics Forum 39, 1 (2020), 7–19.
[Tal07] Taleb N. N.: The black swan: The impact of the highly improbable, vol. 2. Random House, 2007.
[TT90] Tukey J., Tukey P.: Strips displaying empirical distributions: I. Textured dot strips. Tech. rep., Bellcore Technical Memorandum, 1990.
[Tuk77] Tukey J. W.: Exploratory data analysis, vol. 2. Addison-Wesley, Reading, MA, 1977.
[Uli88] Ulichney R. A.: Dithering with blue noise. Proc. IEEE 76, 1 (1988).
[vOHR20] van Onzenoodt C., Huckauf A., Ropinski T.: On the perceptual influence of shape overlap on data-comparison using scatterplots. Computers & Graphics (2020).
[Wei10] Wei L.-Y.: Multi-class blue noise sampling. ACM Trans Graph 29, 4 (2010), 1–8.
[Wil99] Wilkinson L.: Dot plots. The American Statistician 53, 3 (1999), 276–281.
[WP83] Watson A. B., Pelli D. G.: QUEST: A Bayesian adaptive psychometric method. Perception & Psychophysics 33, 2 (1983), 113–20.
[Yel83] Yellott J. I.: Spectral consequences of photoreceptor sampling in the rhesus retina. Science 221, 4608 (1983).
[YXX∗20] Yuan J., Xiang S., Xia J., Yu L., Liu S.: Evaluation of sampling methods for scatterplots. IEEE Trans Vis and Comp Graph (2020).
[ZHWW12] Zhou Y., Huang H., Wei L.-Y., Wang R.: Point sampling with general noise spectrum.