Deconstructing Categorization in Visualization Recommendation: A Taxonomy and Comparative Study
Doris Jung-Lin Lee, Vidya Setlur, Melanie Tory, Karrie Karahalios, Aditya Parameswaran
• D. Lee and A. Parameswaran are with the University of California, Berkeley. V. Setlur and M. Tory are with Tableau Research. K. Karahalios is with the University of Illinois, Urbana-Champaign.
• This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Abstract—Visualization recommendation (VisRec) systems provide users with suggestions for potentially interesting and useful next steps during exploratory data analysis. These recommendations are typically organized into categories based on their analytical actions, i.e., operations employed to transition from the current exploration state to a recommended visualization. However, despite the emergence of a plethora of VisRec systems in recent work, the utility of the categories employed by these systems in analytical workflows has not been systematically investigated. Our paper explores the efficacy of recommendation categories by formalizing a taxonomy of common categories and developing a system, Frontier, that implements these categories. Using Frontier, we evaluate workflow strategies adopted by users and how categories influence those strategies. Participants found recommendations that add attributes to enhance the current visualization and recommendations that filter to sub-populations to be comparatively most useful during data exploration. Our findings pave the way for next-generation VisRec systems that are adaptive and personalized via carefully chosen, effective recommendation categories.

Index Terms—Visual analysis; analytical workflow; discovery-driven analysis; visualization recommendations.
1 INTRODUCTION
Exploratory visual analysis is an iterative process of asking and answering questions about data through visualizations, where new questions often arise from unexpected observations. Challenges arise when the current analysis path does not yield interesting observations; this common pain point can cause users to feel stuck or overwhelmed, unsure of what question to ask next [1], [2]. Visualization recommendation (VisRec) systems guide users along their exploration journey by suggesting effective visual encodings [3], [4], [5] or potentially interesting visualizations [6], [7], [8], [9], [10], [11].

Recommendations are often organized into categories based on the analytical actions they embody. Analytical actions can be thought of as transitions between visualization states, corresponding to the operations performed to generate recommendations given the current visualization state. Example categories include Filter, displaying recommendations of sub-populations of the data, derived by adding filters to the current visualization, and Enhance, displaying recommendations of an additional attribute added to the current visualization. Figure 1 illustrates how recommendation categories can support the analysis of a college dataset. A scatterplot of SATAverage and AverageCost can be filtered to specific HighestDegrees or Regions (right side, top) or enhanced by adding the FundingModel attribute to the color channel (right side, bottom). Users can explore their data via moves in the visualization space, selecting visualizations based on analytical actions that represent potential "next steps" in their analysis.

Fig. 1: A screenshot of Frontier with a dataset containing college information. Starting from the Current View displaying a scatterplot of AverageCost versus SATAverage on the left, the user finds an interesting visualization recommended through the Enhance category highlighting the two distinct clusters for Private and Public FundingModels (shown with a red border). This recommendation is generated from the Current View, further "enhanced" by adding FundingModel to the color channel.

Most VisRec systems display a small set of bespoke analytical-action-based recommendation categories without a clear understanding of why the set was selected. This limited selection of categories in existing systems stems from challenges in both development and evaluation. From an evaluation standpoint, determining the value of a given recommendation for a specific user goal is, in general, a challenge in recommender system design [12], but doing so for visual analysis tools is even harder. Unlike web search, where the typical goal is to find a single item (e.g., a movie to watch), insights in visual analytics often arise from multiple visualizations. This design is further complicated by the variety of user goals, ranging from specific inquiries to more complex open-ended objectives such as understanding relationships across attributes [13]. From a development standpoint, there is an interface design and performance cost for dynamically computing large numbers of recommendations [14]. As a result, most systems rely on a small set of fixed recommendation categories.

While prior work has certainly demonstrated the benefits of VisRec in supporting exploratory analysis [4], [5], [6], the design space of VisRec categories has not been thoroughly explored and evaluated. With new VisRec systems being introduced time and again in the visualization and HCI literature [15], there is a pressing need to take a step back to organize and make sense of the design space of analytical-action-based categories in VisRec, and to further evaluate the benefits of various types of recommendation categories as a whole, and relative to each other. This evaluation is crucial for distilling design guidelines for next-generation VisRec systems and for enabling past, present, and future VisRec systems to be understood and compared in the context of an organization.

In this paper, we deconstruct categorization in VisRec systems by comparing and evaluating the value of different recommendation categories in visual analytic workflows. We further investigate how analysis strategies are influenced by employing recommendation categories, as well as the efficacy of various recommendation categories for different task and dataset characteristics.
While recommendation categories such as Filter and Enhance are in fact present in prior systems [2], [4], [5], [7], [16], [17], there has been no systematic organization or comparison of recommendation categories and their underlying analytical actions. Another challenge is that no existing VisRec system comprehensively implements the space of possible categories to compare their effects on analytical workflows. This crucial gap in existing literature motivated our system, Frontier.¹ We developed Frontier as an apparatus for investigating the merits and pitfalls of various recommendation categories in a single system.

Our contributions are summarized as follows:
• We present a taxonomy of common analytical action-based recommendation categories employed in visual analysis, synthesizing existing literature from VisRec and online analytical processing (OLAP). The taxonomy enables us to map out the design space of existing VisRec systems as well as future ones. (Section 3)
• We develop a design probe, Frontier, implementing ten recommendation categories from the taxonomy to explore the usage and impact of these categories in a visual analysis workflow. (Section 4)
• We present a mixed-methods user study to understand how recommendation categories support visual analysis and the relative efficacy of various recommendation categories. (Sections 5, 6)

As part of this work, one of our goals was to take stock of and systematize research in the rapidly-evolving area of VisRec systems. Our findings validate prior conjectures about the value of categorizing recommendations [4], [10] and further reveal that there are substantial differences in the usage of various categories. For instance, participants indicated that the recommendation categories Enhance (adding one attribute) and Filter (displaying data sub-populations) were most useful, while Pivot (swapping an attribute) was one of the least useful. Such findings point to the importance of comparative evaluation across categories and guidelines for improving category design in future VisRec systems, presented in Section 8.

¹ The name Frontier is inspired by how the application of analytical actions offers next steps along potential exploration paths, enabling an explorer to expand the frontier of discovery. In the context of graph search algorithms, the term frontier refers to the nodes that lie between what has been discovered and those as yet undiscovered [18].
2 RELATED WORK
This work draws from work on VisRec systems, faceted search interfaces, and traditional recommender systems.
Manual visualization specification tools [19], [20] require the user to specify the exact data attributes, data subset of interest, and the visual encodings for creating a visualization. This process is often tedious and overwhelming during the early, exploratory stages of analysis, especially for users without visualization design or data exploration experience [2], [5], [7]. To alleviate this issue, VisRec systems suggest visualizations to assist users with visual analysis. VisRec systems can be classified based on whether they suggest visual encodings (i.e., encoding recommenders) or aspects of the data to visualize (i.e., data-based recommenders) [21]. The earliest VisRec systems focused on recommending visual encodings, assuming that the data attributes were already identified by the user [3], [22]. Subsequent work automatically recommended graphical encodings based on perceptual effectiveness [23], [24]. Our focus in this work is on data-based recommenders, as in many recent papers [8], [9], [10], [11], [16], [25], [26], that suggest interesting visualizations based on statistical properties of the data. To narrow the design space of recommendation categories, we leverage visualization best practices for determining visual encodings. Throughout the rest of this paper, we use the term VisRec system to refer to those that employ data-based recommendations.

Some VisRec systems are entirely automatic [9], [27], whereas other, more recent, mixed-initiative systems support some user interaction to guide the recommendations [5], [10], [26]. Mixed-initiative VisRec systems combine manual specification with recommendations [7], [17], [28], [29], [30], [31], [32], [33]. For instance, Voyager [4], [5], [21] suggests visualizations based on user-selected fields and wildcards to iterate through possible data attributes or encodings. Many of these systems organize the resulting recommendations into categories based on their analytical actions [4], [5], [7], [10], [29]. For example, DIVE [7] and Voyager display a set of univariate distributions to help users get an overview of the distributions that exist in the data, as well as recommendations that add an additional attribute to the user-selected visualizations. On the other hand, Zenvisage [26] searches for visualizations based on their similarity with a user-specified pattern over a space of possible filter values. While intuitively valuable, it remains unclear what the effects of different categories are on users and which categories are most helpful. Our work specifically investigates analytical action-based recommendation categories and how they impact mixed-initiative visual analytics workflows.
Facets help users quickly refine their search options by applying multiple filters based on a categorized classification of the information elements in web and product search [34], [35] and text corpora querying [36], [37], [38] interfaces. Faceted search interfaces bear similarities to VisRec systems in that both support progressive disclosure and incremental construction of queries, where users can formulate the equivalent of a sophisticated data query through a series of small, exploratory steps [35], [39]. One key problem in building a faceted search interface is selecting the facets. Some systems simply present the first few alphabetized facets [40]; others present a subset of facets ranked by frequency of use [41], [42], [43]. Prior VisRec systems have drawn an analogy between faceted browsing and exploratory visual analysis [4], [6]. Similarly, our work draws inspiration from principles of facets as recommendation categories to address the complexity of analytical tasks [13]. While facets are constructed based on aspects of items in the corpus (e.g., size and brand for products; topics for articles), in VisRec, categories can be organized based on their relationship with the current visualization or visual characteristics, as discussed in more detail in Section 3.
Unlike search systems that aim to maximize the relevance of retrieved items, the goal of a recommender system is to help users discover items of interest. As a result, an ideal recommender system must strike the right balance between suggesting items that are relevant, as measured by recommender accuracy, and those that are diverse, and are therefore surprising and unexpected. This diversity-accuracy tradeoff has been well-studied in the recommender system literature and has led to metrics beyond accuracy such as serendipity, novelty, coverage, and diversity [48], [49], [50]. The tradeoff manifests itself in visual data exploration as well, where users often want to discover non-obvious, unexpected insights, but still want to see visualizations relevant to the attributes or values that they are interested in. The context-dependent actions in the taxonomy described next are examples of visualization recommendations relevant to the user's context; context-independent actions can reveal more surprising aspects about the data. Akin to diversification in traditional recommender systems, our paper highlights the importance of surfacing a relevant, yet diverse set of potential next steps that satisfies the user's information needs in a visual analytic workflow.

TABLE 1: Survey of recommendation categories from 20 VisRec systems. We included systems that recommend visualization(s) based on the result of some analytical action.

    Category               | Related Work
    -----------------------|--------------------------------------
    Distribution           | [4], [5], [7], [9], [10], [11], [44]
    Correlation            | [5], [6], [7], [9], [10], [45], [46]
    Enhance                | [4], [5], [7], [17], [33]
    Generalize (attribute) | [7], [11], [33]
    Generalize (filter)    | [7]
    Pivot                  | [7], [17], [33]
    Filter (add)           | [16], [17], [27]
    Filter (swap)          | [2]
    Difference             | [8], [27], [30]
    Similarity             | [2], [47]
3 TAXONOMY OF RECOMMENDATION CATEGORIES
Analytical actions correspond to transitions through the visualization space to generate categories of recommendations given a user's current visualization state. While various taxonomies [51], [52], [53] exist for describing the types of tasks (or actions) employed during visual analysis, as stated in Law et al. [47], we are not aware of any taxonomy that encompasses the types of data-based visualization recommendations that can be generated via analytical actions and are reflected in present-day VisRec systems. This section defines such a taxonomy, providing a common vocabulary for the organizing principles behind recommendation categories. The taxonomy arose through a systematic review of 20 VisRec systems detailed in the supplementary material. Through open and axial coding [54], we uncovered ten major types of analytical actions used to generate and group visualization recommendations in existing systems, summarized in Table 1. To keep the design space of recommendation categories tractable, we focused on data-based recommendations driven by the operational and characteristic transitions in the visualization design space. We codified these actions into a taxonomy as seen in Figure 2. The table is not intended to capture a comprehensive set of all analytical action types, but rather to synthesize the most common ones used to categorize recommendations so that we can explore the interaction design space more deeply.

Fig. 2: A taxonomy of common analytical actions used in recommending visualizations for visual analysis. The analytical actions are indicated in blue.

At its highest level, the taxonomy defines two main categories: operational and characteristic. The operational category describes analytical actions that navigate users through the visualization space via operations such as add, remove, and swap. The characteristic category describes actions that reveal certain characteristic patterns in the data, such as skewness and correlation. The taxonomy is further broken down into context-dependent and context-independent categories. Actions are context-dependent if they depend on the current view or visualization (i.e., the selected attributes, values, and visual encodings); they are context-independent if they do not depend on the current view.

Note that the operational actions described above overlap with some of the categories in the chart transition model in GraphScape [24]. However, GraphScape provides a chart transition model that describes visualization edits, whereas our taxonomy describes actions in a VisRec context drawn from existing VisRec systems. Since our focus is less on encoding-based recommendations, we do not consider such aspects of the GraphScape taxonomy as Scale and Mark. Likewise, GraphScape does not include the characteristic actions described in this taxonomy.
3.1 Operational Actions

Operational actions apply data-oriented operations that transition the current view to a related, neighboring part of the visualization space. By definition, analytical actions that are operational must also be context-dependent, as they operate on the current view. As seen in Figure 2, there are three broad categories of operational actions based on whether an attribute or value is added, removed, or swapped, leading to six (3 × 2) individual categories.

The example in Figure 3 demonstrates how operational actions can be thought of as moving along different paths in the attribute or value hierarchy. Every node in the attribute or value hierarchy represents a set of selected attributes or values. A user's current visualization is composed of their position on the attribute hierarchy (i.e., the space of all attribute combinations) and their position on the value hierarchy (i.e., the space of all filter value combinations). Movement through these hierarchies defines the set of possible operational actions. This conceptual model formalizes the space of possible visualizations that are one move away from the current visualization. Our model draws on Online Analytical Processing (OLAP), a sub-field of data management that targets analytical querying of multi-dimensional data. However, unlike OLAP, which only considers the value hierarchy [55], we introduced the analogous attribute hierarchy to help capture common operations in visual analytics. Here are the six operational analytical action categories:
• Enhance: adds an additional attribute to the current view. If the user selects attributes A and B, Enhance displays visualizations involving attributes A, B, and C. This action corresponds to moving down the attribute hierarchy (Figure 3 red).
• Filter (add): adds an additional filter to the current view. If the user selects attributes A and B, Filter (add) displays visualizations involving A, B, and a filter F. In OLAP [55], this is known as a drill-down on the value hierarchy (Figure 3 orange).
• Filter (swap): switches out the filter value to a different value, while keeping the filter attribute fixed. If the user selects attribute A and filter F = V, Filter (swap) displays visualizations involving A and an alternative filter F = V′. This action corresponds to moving horizontally across the value hierarchy to a node with the same filter attribute (Figure 3 purple).
• Generalize (attribute): removes one attribute from the current view to display the more general trend. If the user selects attributes A and B, visualizations involving either A or B are displayed. This action corresponds to moving up the attribute hierarchy (Figure 3 turquoise).
• Generalize (value): removes one filter from the current view to display the more general trend. If the user selects attribute A and a filter F, visualizations involving only A are displayed. In OLAP, the removal of a filter is known as a roll-up on the value hierarchy (Figure 3 pink).
• Pivot: displays visualizations that can be constructed if one of the attributes from the current view is replaced with another attribute. If the user selects attributes A and B, Pivot displays visualizations involving either A and another attribute B′, or B and another attribute A′. This action corresponds to moving horizontally along the attribute hierarchy (Figure 3 green).
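To make the hierarchy model concrete, the sketch below enumerates all states one operational action away from a current view. This is our own illustration rather than Frontier's implementation: the schema, the (attrs, filters) state representation, and the function name are all hypothetical.

```python
# A minimal sketch (not Frontier's implementation) of the six operational
# actions as one-step moves over the attribute and value hierarchies.
SCHEMA_ATTRS = {"Country", "Horsepower", "Cylinder"}
FILTER_VALUES = {"Brand": ["Ford", "Chevrolet"], "Origin": ["Europe", "USA"]}

def operational_moves(attrs, filters):
    """Enumerate all visualization states one operational action away.

    attrs   -- attributes in the current view, e.g. {"Country"}
    filters -- applied filters, e.g. {"Brand": "Ford"}
    """
    moves = []
    for a in SCHEMA_ATTRS - attrs:          # Enhance: down the attribute hierarchy
        moves.append(("Enhance", attrs | {a}, dict(filters)))
    for a in attrs:                         # Generalize (attribute): up the hierarchy
        moves.append(("Generalize (attribute)", attrs - {a}, dict(filters)))
    for a in attrs:                         # Pivot: sideways along the attribute hierarchy
        for b in SCHEMA_ATTRS - attrs:
            moves.append(("Pivot", (attrs - {a}) | {b}, dict(filters)))
    for f, vals in FILTER_VALUES.items():   # Filter (add): drill-down on values
        if f not in filters:
            moves += [("Filter (add)", set(attrs), {**filters, f: v}) for v in vals]
    for f, v in filters.items():            # Filter (swap): sideways across values
        moves += [("Filter (swap)", set(attrs), {**filters, f: w})
                  for w in FILTER_VALUES[f] if w != v]
    for f in filters:                       # Generalize (value): roll-up on values
        moves.append(("Generalize (value)", set(attrs),
                      {k: w for k, w in filters.items() if k != f}))
    return moves

for action, a, f in operational_moves({"Country"}, {"Brand": "Ford"}):
    print(f"{action:22s} attrs={sorted(a)} filters={f}")
```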
3.2 Characteristic Actions

Characteristic analytical actions are designed to surface salient visual and statistical characteristics of the data, sorted based on an interestingness metric that is described in Section 3.3. Characteristic actions that are context-independent are designed with an overview intent, similar to "breadth-first" exploration strategies in web search [14] that are independent of the user's search query. We describe two types of independent actions that highlight patterns that may be of interest to the user:
• Correlation: highlights bivariate relationships between quantitative fields in the data through scatterplots of different combinations of quantitative attributes.
• Distribution: displays the possible univariate distributions in the dataset, with COUNT as the default measure. The visualization can either be a histogram, bar chart, or line chart depending on the data type of the attribute.

Characteristic actions that are context-dependent showcase salient visual characteristics based on the current view.
• Similarity / Difference: highlights data patterns that are visually similar to or different from the current view.

Fig. 3: Operational actions represent transitions through the attribute and value hierarchies. (The attribute hierarchy ranges over combinations such as Country, Horsepower, and Cylinder; the value hierarchy over filters such as Brand=Ford, Origin=Europe, and Cylinder=8.)

3.3 Interestingness Objectives

Within each analytical action category, visualizations are often ranked using some interestingness objective. Given the different chart characteristics for different types of visualizations, the interestingness objective, even for a given action, may be different for different visualization types. For example, a user may be interested in the degree of correlation in a scatterplot, while they may be interested in differences between the bar values in a bar chart. In this paper, we consider commonly occurring basic chart types, including bar charts, histograms, line charts, and scatterplots, typically employed by existing VisRec systems. Even this set results in a considerable number of choices corresponding to every combination of action and visualization type. We identified a small number of classes of objectives that have been used in prior work for these combinations, which we catalog below.

For the characteristic actions, the objective typically captures the salient visual characteristics expressed by a visualization, such as the degree of correlation or skew. Visualizations from the Correlation action are ranked based on monotonicity [25], typically most to least correlated, while those from the Distribution action are ranked from most to least skewed. For the Similarity action, bar and line chart visualizations are ranked based on similarity to the current view, computed via the Euclidean distance between the measure values of the visualizations [2], [26].

For the operational actions, the objective used is typically determined by the visualization type of the recommended visualizations. These objectives capture perceptual characteristics generally associated with something unexpected or insightful in the visualization, including:
• Non-uniformity: For bar/line charts and histograms without a filter, visualizations are ranked highly if they are highly uneven, indicating the presence of outlying categories or shifts in distributions [6], [10].
• Deviation: For bar/line charts and histograms with a filter, the ranking is based on the deviation between the filtered and unfiltered (overall) distributions, based on the intuition that a visualization is potentially interesting if it differs greatly from some expected reference [8], [16].
• Correlation: For uncolored scatterplots, a visualization is ranked higher if it displays a high degree of dependence between the two measures, as measured by mutual information [56], [57] or Spearman's correlation [25].
• Separability: For colored scatterplots, a visualization is ranked higher if the colors for each category distinctly separate clusters of data points in the scatterplot [6], [58].

Supplementary materials provide details as to how these objectives were implemented in our design probe, Frontier.
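As one illustration of these objective classes, the sketch below implements plausible stand-ins using standard statistics (distance from uniform, normalized deviation, Spearman correlation, and Euclidean similarity); separability is omitted since it requires a cluster-separation measure. Frontier's exact scoring functions are described in the supplementary materials, so these should be read as approximations rather than the system's code.

```python
# Illustrative stand-ins for the objective classes above, using standard
# statistics; treat these as approximations, not Frontier's implementation.
import numpy as np
from scipy import stats

def non_uniformity(counts):
    """Bar/line charts and histograms without a filter: distance of the
    normalized distribution from a uniform distribution [6], [10]."""
    c = np.asarray(counts, float) / np.sum(counts)
    return float(np.linalg.norm(c - 1.0 / len(c)))

def deviation(filtered_counts, overall_counts):
    """With a filter: deviation of the filtered from the overall (unfiltered)
    distribution, after normalizing each to sum to one [8], [16]."""
    f = np.asarray(filtered_counts, float)
    o = np.asarray(overall_counts, float)
    return float(np.linalg.norm(f / f.sum() - o / o.sum()))

def correlation(x, y):
    """Uncolored scatterplots: degree of dependence between two measures,
    here via Spearman's rank correlation [25]."""
    rho, _ = stats.spearmanr(x, y)
    return abs(rho)

def dissimilarity(candidate_measures, current_measures):
    """Similarity action: Euclidean distance between measure values [2], [26];
    smaller values rank higher when sorting most-to-least similar."""
    return float(np.linalg.norm(np.subtract(candidate_measures, current_measures)))

# A filtered subgroup that deviates from the overall trend should
# outrank one that mirrors it:
assert deviation([5, 5, 40], [40, 35, 25]) > deviation([10, 12, 11], [40, 35, 25])
```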
4 THE FRONTIER SYSTEM
We introduce a system, Frontier, that provides visualization recommendations across multiple analytical action-based categories. Frontier is a design probe that enables us to systematically explore and compare these categories. For brevity, we refer to the analytical action-based categories displayed in Frontier simply as recommendation categories henceforth.

The Frontier interface is composed of four areas, as illustrated in Figure 4. Starting from the left (Figure 4A), we have the Control Panel, a manual specification interface for specifying the visualization in the Current View (Figure 4B). The Control Panel lists measures and dimensions and allows users to add or remove attributes and values. The Specification Panel (Figure 4A top) allows users to fine-tune their visualization by arranging attributes across specific encoding channels. Users can toggle on and off specific recommendation categories. If a category is not applicable for the given Current View, the cursor icon changes to a forbidden sign upon hover in the Category Menu (Figure 4C). The recommendations are displayed row-by-row on the right (Figure 4D), analogous to faceted web search results, to encourage browsing [35]. We drew inspiration from existing VisRec interfaces [5], [7], [10] and employed guidelines from mixed-initiative interfaces [59], [60] to balance interface usability with comprehensiveness in the display of recommendation categories. We iterated on the design with feedback from an interaction designer and made significant changes to the interface over a period of six months.

Fig. 4: Frontier consists of four areas: Control Panel (A), Current View (B), Category Menu (C), and Recommendations Panel (D).
The following design considerations emerged while iteratively designing Frontier:
• C1: Concise and informative. Recommendation categories should provide a manageable set of options as "next steps" in a user's analytical workflow. Users should never be shown an empty category, nor should they be shown categories with overlapping recommendations.
• C2: Coordinated and actionable. Recommendation categories should be coordinated and consistent with other parts of the system, such as the Category Menu and Current View. The user should be able to bring a recommended visualization into the Current View.
• C3: Interpretable and visually discernible. Recommendation categories should be self-explanatory and display visual indicators that convey their key characteristics or highlight how they differ from the Current View.

These requirements echo design considerations from prior work in mixed-initiative visual analytics systems [5], [45].
Frontier is a web-based system with components described as follows. First, the Data Manager loads the dataset and metadata (i.e., the data type, data model, and default aggregation) and computes statistics (i.e., cardinality, correlation, minimum, maximum). The Context Manager maintains information about the attributes and values that the user has selected. Then the Category Manager determines which recommendation categories to display for a given Current View and maintains a list of categories. Finally, each Category contains information about a specific recommendation category, a sorted list of top-k recommended visualizations, and their associated scores. Details of the system architecture and implementation, along with source code, can be found in the supplementary materials.
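The split of responsibilities can be summarized with a sketch like the following. The component names follow the paper, but the fields and types are illustrative guesses rather than Frontier's actual code (which is web-based).

```python
# A simplified sketch of the four components described above; names follow
# the paper, but the fields and types here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class DataManager:
    """Loads the dataset plus metadata and computes per-attribute statistics."""
    data: list        # rows of the loaded dataset
    metadata: dict    # data type, data model, default aggregation
    statistics: dict  # cardinality, correlation, minimum, maximum

@dataclass
class ContextManager:
    """Tracks the attributes and values the user has selected (Current View)."""
    attributes: set = field(default_factory=set)
    filters: dict = field(default_factory=dict)

@dataclass
class Category:
    """One recommendation category with its sorted top-k visualizations."""
    name: str          # e.g. "Enhance"
    description: str
    recommendations: list = field(default_factory=list)  # (score, vis-spec) pairs

@dataclass
class CategoryManager:
    """Decides which categories apply to a given Current View."""
    categories: list = field(default_factory=list)  # list of Category
```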
Category con-tains information about specific recommendation categories,a sorted list of top- k recommended visualizations, and theirassociated scores. Details of the system architecture andimplementation along with source code can be found in thesupplementary materials. To select a manageable set of recommendation categories(C1) to display to users, we designed the following work-flow. These rules are similar to the ones adopted in Voy-ager [4] and DIVE [7], which first provide an overview viaunivariate distributions, followed by more relevant visual-izations based on subsequent user selection.At the start of the analysis, when no attributes areselected, the
At the start of the analysis, when no attributes are selected, the Correlation and Distribution actions display univariate and bivariate visualizations, enabling users to get an overview of the dataset (Figure 5A, 5B). Operational categories evolve based on the Current View and come into play once any attribute is selected. To avoid redundancy across categories (C1), Frontier only shows context-independent categories when there are no attributes selected. For example, when a user selects a single quantitative attribute, Enhance (Figure 5F) generates a collection of scatterplots, similar to what is shown for Correlation. Similarly, when a categorical attribute is in the Current View, only Pivot recommendations (Figure 5E) are displayed, to avoid operating over the same collection of visualizations as the ones in Distribution.

To make the recommendation categories more succinct and manageable (C1), from Table 1, Frontier consolidates filter (add) and filter (swap) to give Filter, generalize (attribute) and generalize (value) to give Generalize, and similarity and difference to give Similarity. Frontier's Filter adds an additional filter to the Current View when there is no filter in the Current View (Figure 5G). When a filter is in place, Filter keeps the specified filter attribute, while swapping out one of the attribute values to showcase alternative data subsets for comparison. Note that any applied filter is always retained in all actions except in Filter. In Generalize, we display all possible visualizations by removing either one filter or one attribute from the Current View (Figure 5C). In Similarity, visualizations that look most similar to the Current View are ranked highest, but users can reverse the sort order to look at the most dissimilar visualizations (Figure 5D).

Users can double-click any recommendation to bring the visualization into the Current View; this sets elements in the Control Panel to be consistent with those of the selected visualization (C2). We display the axis label of any element that differs between the Current View and the recommendation in blue, to ease comparison and highlight differences (C3).
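A condensed sketch of this selection-and-consolidation workflow appears below. The rule structure follows the description above, while details such as exactly when Generalize is inapplicable are our assumptions.

```python
# A condensed sketch of the category-selection workflow described above.
# The rule structure follows the text; exact conditions in Frontier may differ
# (e.g., precisely when Generalize is hidden is our assumption).
def visible_categories(attributes, filters):
    """Return the recommendation categories to display for a Current View.

    attributes -- set of attributes currently selected
    filters    -- dict of applied filters (attribute -> value)
    """
    if not attributes:
        # Overview first: only the context-independent categories are shown,
        # which also avoids redundancy with Enhance/Pivot later on (C1).
        return ["Correlation", "Distribution"]
    categories = ["Enhance", "Filter", "Similarity", "Pivot"]
    if len(attributes) > 1 or filters:
        # Generalize needs at least one attribute or filter to remove
        # while still leaving a non-empty view behind.
        categories.append("Generalize")
    return categories

# Consolidation of the ten taxonomy actions into displayed categories (C1):
CONSOLIDATED = {
    "Filter":     ["filter (add)", "filter (swap)"],   # add if no filter, else swap
    "Generalize": ["generalize (attribute)", "generalize (value)"],
    "Similarity": ["similarity", "difference"],        # reverse sort shows dissimilar
}

print(visible_categories(set(), {}))           # ['Correlation', 'Distribution']
print(visible_categories({"SATAverage"}, {}))  # operational categories only
```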
5 STUDY DESIGN
We conducted a mixed-methods study to explore how various recommendation categories impact visual analysis workflows. Our primary goal was to study the relative usefulness of various categories, as well as how categorization influences the analysis workflow in general. To study these goals, our design probe, Frontier, implements the categories described in the previous section. Further, to understand the effect of categorization in mixed-initiative VisRec workflows, we included a mixed-initiative VisRec baseline, mirrored off of Frontier, that featured the same recommendations but without any categorization. We describe this baseline later on. We opted against comparing with manual specification tools without recommendations, given the general benefits of VisRec in prior studies [4], [5], [6]. We also opted against comparing with existing VisRec systems that implement a subset of categories, as this would not allow us to tease apart the impact of categorization on analytic workflows.

Overall, our exploratory study aimed to address the following research questions:
• RQ1: How do recommendation categories support and influence analytical workflows? What problem-solving and exploration strategies do users adopt when using recommendation categories in a mixed-initiative context?
• RQ2: What are the differences in user behavior across recommendation categories? What is the value and impact of individual recommendation categories, and how does this vary across tasks and datasets?
We recruited 24 participants (… female, … male) from within a software company. Nine were experienced users of a popular, commercial charting tool, thirteen had limited proficiency, and two had no experience. In a between-subjects design, participants were randomly assigned to use Frontier or Baseline with either the College [61] or Olympic Medals [62] dataset, with six participants per condition-dataset combination. Henceforth, we suffix .F or .B to the participant identifier to indicate whether the participant used the Frontier or the Baseline condition.

Fig. 5: Examples of the recommendation categories implemented in Frontier. (A) Correlation generates scatterplots with bivariate relationships between quantitative fields, ranging from high to low correlation. (B) Distribution shows the possible univariate distributions from the dataset, ranging from skewed to normal distributions. In the following examples, the current view is shown on the left, with the corresponding recommendations shown on the right. (C) Generalize shows possible visualizations when one attribute or filter from the current view is removed (removed attributes shown with strikethroughs). (D) Similarity highlights data patterns ranging from most to least similar to the current view. (E) Pivot shows possible visualizations that can be constructed if one of the current attributes is changed to another (changed attributes shown in blue). (F) Enhance shows possible visualizations when an additional attribute is added to the current view (additional attributes shown in blue). (G) Filter displays the data subsets that can be constructed from the current view when a filter is applied.
There were two main parts to the study: closed-ended tasks and open-ended exploration.
Part 1: Closed-ended tasks
Closed-ended tasks were mainly intended to familiarize participants with the system while also providing some consistent objectives for task comparison. Participants completed four closed-ended questions that covered common visual analytic tasks:
• Q1 (Correlate): find other measures that are linearly correlated with a selected attribute.
• Q2 (Filter Compare): compare bar charts across different data sub-populations.
• Q3 (One vs. All): compare a filtered distribution with the overall distribution.
• Q4 (Pattern): compare the temporal trend across different measures.

For each task, participants answered a multiple-choice question on a paper worksheet. They were instructed to use Frontier to answer the question, but were not told how to do so. All participants used the Cars dataset [63] for the closed-ended tasks. This dataset was chosen because of its simple schema (five measures and five dimensions), clean insight patterns, and because it is commonly used for demonstrating visualization systems [5], [6], [20], thus enabling comparisons.
Part 2: Open-ended exploration
Following the closed-ended tasks, participants completed an open-ended exploration task. This task enabled us to observe how people would choose to use (or not use) the recommendations in a natural analysis flow. Participants explored either the College or Olympic Medals dataset. Instructions were: "We'd like you to explore this data to look for interesting insights. As you work, please let us know what questions or hypotheses you're trying to answer as well as any insights that you are learning about the data." Participants were instructed to talk aloud and star recommendations they found useful.

The two datasets for open-ended exploration were chosen due to their real-world and accessible nature. We chose datasets with different characteristics, enabling us to study a wider range of analytical inquiries: the College dataset has ten measures and six dimensions with low to medium cardinality, while the Medals dataset contains only three measures and twelve dimensions with medium to high cardinality.
In the Frontier condition, participants used the full version of Frontier with all of the recommendation categories. The ordering of recommendation categories within the interface was randomized for each user to minimize the preference for recommendations displayed at the top of the page.

To study the impact of categorization as a whole, we introduced a Baseline condition. This condition displayed the same set of recommendations, except that the recommendation categories were removed so that all the recommendations appeared in a single grid layout.² The goal of this Baseline is to establish a vanilla VisRec system that (i) eliminates the effects of recommendation categories, while preserving certain characteristics for a controlled comparison, in that (ii) it is mixed-initiative and (iii) it displays the same overall set of recommendations.

To understand this condition better, note that the organization of VisRec into categories is a result of both the labeling (i.e., interface elements such as dividers and textual descriptions of the categories) and the ranking within each category. To remove the effects of categorization (i), we not only had to remove the category labels, but also had to shuffle the display order of the recommendations. In both conditions, participants can browse for more visualizations via horizontal scrolling (a single scroll bar for the Baseline, one scroll bar per recommendation category for Frontier). To prevent preferential bias towards top-ranked visualizations, we further ensured that the exact same set of visualizations appeared with and without scrolling across both conditions.

While we acknowledge that our baseline is not perfect, we considered alternative designs, including a no-recommendations baseline or no baseline at all. However, a baseline that only removes the category labels, but does not alter the display order, would only evaluate the effects of explicit category labels and is thus not a meaningfully different baseline. Since the goal of the study is not to demonstrate a performance difference between Frontier and Baseline, we opted for a baseline with recommendations for investigating the research questions around recommendation categories.

² See screenshot in supplementary material. The Baseline interface has a similar layout to several existing VisRec systems [9], [28].
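Conceptually, the Baseline can be derived from Frontier's categorized output as sketched below: flatten the categorized recommendations into one list and shuffle the display order. The function name and data shapes are hypothetical illustrations of the design described above, not the study software.

```python
# A sketch of how the Baseline condition can be derived from Frontier's
# output: same recommendations, category labels stripped, order shuffled.
import random

def baseline_layout(categorized_recs, seed=0):
    """Flatten categorized recommendations into one shuffled grid.

    categorized_recs -- dict mapping category name to its ranked list of
                        visualization specs (what Frontier would display)
    """
    flat = [vis for recs in categorized_recs.values() for vis in recs]
    # Shuffling removes the within-category ranking cues; without it, the
    # display order alone would still leak the category structure.
    random.Random(seed).shuffle(flat)
    return flat

frontier_view = {"Enhance": ["E1", "E2"], "Filter": ["F1", "F2"], "Pivot": ["P1"]}
print(baseline_layout(frontier_view))  # all five specs, in one shuffled order
```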
Sessions lasted approximately one hour, consisting of approximately five minutes of introduction and tutorial, … minutes of closed-ended tasks, … minutes of open-ended exploration, and … minutes of semi-structured interviewing. The tutorial video introduced the interface using the Cars dataset and stated that recommendations were selected based on an interestingness ranking and that the blue text indicated changes from the Current View. For Frontier, the video additionally described each recommendation category. The post-study interview included 7-point Likert scale questions (e.g., overall usability, recommendation usefulness) and open-ended questions on the system design and recommendations. Study scripts and protocols can be found in the supplementary material.
We employed a mixed-methods approach involving both qualitative and quantitative analyses. The primary focus of our work was a qualitative analysis of how recommendations of different categories influenced people's analytical workflows. We conducted a thematic analysis through open-coding of session videos, focusing on strategies participants took to answer their questions.

We thematically classified each participant based on how frequently they engaged with manual controls versus the recommendation panels. To obtain these classifications, we assigned separate labels for characterizing each participant's usage of the Control Panel and the recommendations (1: Majority of the time, 2: Sometimes, 3: Not often). Based on these labels, we grouped the participants by their relative frequency of use, where participants employed a manual-oriented strategy if they exhibited higher usage of the Control Panel than recommendations, balanced if they had comparable usage of both, and recommendation-oriented if they exhibited higher usage of recommendations than the Control Panel. Additionally, we define a visualization as useful if one or more of the following occurred: (a) the participant verbally described an insight, (b) the visualization was brought into view, (c) the visualization was starred, or (d) the participant expressed that it was useful or interesting. We coded insights from the video recordings, reusing the definition of an insight from prior work [44], [64].

The quantitative analysis consisted of Likert question results from the interview as well as counts of expressed hypotheses, data insights, and recommendations participants found useful. We employed statistical testing where appropriate, but considered the quantitative analysis mainly as a complement to our qualitative findings. We adopt a 95% confidence interval for all statistical analyses. Our analysis approach is similar to other studies that employed mixed-methods to investigate analytical workflows [44], [65].
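The strategy-classification rule above can be stated compactly. The sketch below encodes it directly, where lower labels denote more frequent use; the function name is our own.

```python
# A minimal encoding of the strategy-classification rule described above.
# Labels: 1 = majority of the time, 2 = sometimes, 3 = not often.
def classify_strategy(control_panel_usage, recommendation_usage):
    """Classify a participant by relative frequency of use."""
    if control_panel_usage < recommendation_usage:  # lower label = more frequent
        return "manual-oriented"
    if control_panel_usage > recommendation_usage:
        return "recommendation-oriented"
    return "balanced"

assert classify_strategy(1, 3) == "manual-oriented"
assert classify_strategy(2, 2) == "balanced"
assert classify_strategy(3, 1) == "recommendation-oriented"
```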
6 STUDY FINDINGS
To understand how recommendation categories support analytical workflows, we first examine the strategies participants adopted and their motivations for switching between different modes of exploration. Then we delve deeper to examine the specific benefits of recommendation categories and their affordances. Finally, we highlight how users' perceptions regarding the recommendation categories can evolve over the course of an analysis workflow.
Strategies in mixed-initiative recommendation workflows
Based on thematic analysis of how frequently participants engaged with manual control versus recommendation panels, we observe three major strategies that they employed across both the Frontier and Baseline conditions. We generally observe that participants were more inclined to use recommendations in their workflow when using Frontier than in the Baseline. We sought to better understand participants' motivations for opting for different analysis options.

Participants employed a recommendation-oriented strategy for exploring unfamiliar attributes (N F,B = 6, 3 participants)³ during preliminary analysis (N F,B = 5, …) or when they were out of ideas on what to pursue further (N F,B = 5, …). We found a small group of participants (N = 3 for Frontier; N = 1 for Baseline) who relied almost entirely on the recommendations to drive their analyses and used the Control Panel only sparingly. Most of these participants either expressed that they had limited experience with creating visualizations or were unsure what to expect from the dataset. The sentiment expressed by these participants largely corresponded to the challenges that visualization novices face in translating abstract questions about their data to visualization specifications [1]. As P .F explained: "...the recommendations gave me a jumpstart [...] because if I didn't have the recommendations to begin with, I wouldn't even know where to start."

³ We use the notation N F,B to report measurements for Frontier and Baseline respectively. In the example above, N F,B = 6, 3 means that six participants using Frontier and three participants using Baseline used recommendations to explore unfamiliar attributes.

Participants also adopted a balanced strategy, intermixing the use of the Control Panel and recommendations in unexpected ways. Three participants (P .F, P .F, P .B) selected recommended visualizations that were "close enough" to what they wanted, then made minor tweaks using the Control Panel to attain their desired visualization. Participants also created familiar visualizations to trigger desired recommendations. For example, P .F wanted to look for linear trends in the data. They recalled seeing a clean linear trend between ACT and SAT scores previously, so they first created the same visualization via the Control Panel. Then they browsed through recommendations resulting from Similarity in order to find similar visualizations. Participants were able to leverage recommendations effectively in their workflow since the recommendation categories were transparent and interpretable, leading to predictable behavior.

Participants followed a manual-oriented strategy when the perceived cost of engaging in manual specification was lower than the effort it took to interact with the recommendations. This occurred when participants had a specific hypothesis in mind (P .F, P .B, P .B, P .B) or when participants expressed a preference for manual specification due to their familiarity with existing charting interfaces (P .F, P .B, P .B). P .B explained the reason why they adopted a manual-oriented approach:

    If the question that I want to answer is very clear, then I will go do it myself. There are two scenarios that I will switch from the left panel to recommendations. One thing is, I don't know what the next step is and I want insights. Second thing is, I don't know how to do it.
As shown in Figure 6, a similar pattern is also observed for the strategies taken to solve the closed-ended tasks. In tasks where manual specification required significantly more work than simply browsing the recommendations for answers (Correlate, Filter Compare), participants were more likely to adopt a recommendation-oriented strategy. On the other hand, in the One vs. All task, where participants had to compare a filtered and an unfiltered visualization, participants opted for the manual-oriented strategy, as it was fairly easy to remove a filter.

Fig. 6: The number of participants who took a manual-oriented approach in solving the closed-ended questions versus a recommendation-oriented approach.

Participants also adopted a manual-oriented strategy when the perceived effort to interact with recommendations was higher than usual, such as when they were overwhelmed by the large, unorganized panel of recommendations in the Baseline. This is supported by the post-study Likert ratings, where participants reported recommendations in Baseline to be less useful (µ F,B = …, …; σ F,B = …, …; U = …, p < … via Mann-Whitney test) and more overwhelming (µ F,B = …, …; σ F,B = …, …; U = …, p = …) than in Frontier.

Value and impact of recommendation categories
We find that the presence of recommendation categories leads to richer and higher-utility exploration. During open-ended exploration, there were more insights generated via recommendations in Frontier than in the Baseline (N F,B = 171, …; t = …, p < …). A similar trend was observed for the total number of useful visualizations generated via recommendations (N F,B = 149, …; t = 2.47, p < …), also shown in Figure 7 (top).

Fig. 7: Number of useful visualizations for each recommendation category by condition (top) and by dataset (bottom) for open-ended tasks. Top: Enhance and Filter are most useful in Frontier, but to a lesser extent in Baseline. Bottom: Pivot and Distribution are more useful in dimension-heavy datasets (e.g., Medals), whereas Correlation is more useful in measure-heavy datasets (e.g., College).

Our observations suggest that recommendation categories reduced the overhead associated with interpreting the visualization recommendations. While we did not measure the visualization read-time directly due to the exploratory nature of the tasks, several qualitative observations support this idea. First, six out of the 12 participants using Frontier expressed that they appreciated the organization. P noted: "I like being able to really quickly visually inspect a bunch of things, because I can just slide a bunch of stuff past my eyes and be able to pick the stuff that jumped out." In contrast, many participants in the Baseline condition went back and forth multiple times between visualizations to make comparisons and ensure that they had the right answer when solving the closed-ended questions (P , , , ), which, at times, led to mistakes.

During the study, we noticed that some participants appeared to be "stuck" in their analyses if they either: (a) verbally expressed that they were out of ideas, (b) implicitly, had a hypothesizing time of greater than one minute, or (c) expressed reluctance to explore further. In particular, only one out of the 12 participants using Frontier got "stuck" (once), compared to three participants getting "stuck" (a total of five occurrences) in the Baseline condition. This is partly attributed to how Frontier participants repurposed and adapted their workflows to take advantage of the diverse set of actions available through various recommendation categories.

Participants often leveraged categories with the same axes, such as Enhance and Filter, to attain insights involving comparisons across multiple visualizations. For example, P .F was interested in the age distribution of Russian athletes because of their highest medal count. They created a histogram distribution of age for Russia and browsed through the Filter action to see distributions for other countries. They exclaimed: "Oh wow! Italy has some really old people for their medalists". Seeing the Italy age distribution in the context of other age distributions highlighted its uniqueness; the visualization in isolation would have been uninteresting. Such comparisons across visualizations within an axes-consistent category are prevalent and often lead to better distributional awareness and understanding of the general patterns and trends in the dataset.
Evolving perceptions around recommendation categories
We found that participants came into the study with a diverse set of perceptions and expectations about recommendations that evolved throughout the course of the open-ended exploration session. For example, P .F explained that "I feel like these suggestions require a lot more thought process in my head. So for the suggestions, because the one or two times that it doesn't seem a lot useful, I probably disregard it afterwards." P .F echoed a similar sentiment; several uninteresting visualizations early on deteriorated their confidence in Correlation: "I thought that Correlation would be interesting. And it showed me like height and weight correlation and you heard me say I wasn't really interested in that. So it kind of made me nullify the entire Correlation panel together."

We also observed the reverse, where participants with a negative initial impression of recommendations gained more trust and understanding over time. P .B expressed that they had a bias against recommendation systems and were reluctant to look at them. However, finding useful things in the recommendations encouraged them to adopt more of the recommendations in their workflow later on:

    Before I even started the study, I have a bias about recommendation panels. Because most of the time recommendation panels do not show you what you want. So I'm already kind of wanting to do my own thing and do it myself, because my bias is that is more reliable than using a recommendation. [...] Once it started showing things, I was like, 'Oh, that is kind of interesting', or 'Oh, that is kind of relevant'. I started paying a little more attention to it. I had to keep reminding myself like, 'Oh yeah, this is the part of the screen I'm supposed to be looking at, it can actually be useful'.

We also observed similar effects on a per-category level. For example, P .F discovered interesting insights based on Filter and noted in the subsequent analysis that they explicitly focused on the Filter category because they knew it would likely give something interesting. … out of the participants also expressed that there was a learning curve in familiarizing themselves with the recommendation categories. A more longitudinal follow-up study is required to understand how users would interact with the recommendation categories once they become more familiar.

Enhance and Filter were most useful, while Pivot least
As shown in Figure 7, some categories of recommendations were more useful than others. Somewhat surprisingly, while participants reported significantly more useful visualizations in Frontier than in Baseline, especially for Enhance and Filter, we found that the relative ordering of usefulness for the different recommendation types was largely the same independent of the condition. Note that while the categories were not explicitly shown in the Baseline interface, they were logged on the system side for the purpose of this analysis. As shown in Figure 7, Enhance and Filter are significantly more useful than Pivot (t = 4.12, p < 0.05; t = 3.24, p < 0.05, respectively, via t-test).

In both conditions, we observed that participants manually performed analytical sequences that were similar to the analytical actions that produced the recommendations. We saw repeated patterns of participants manually performing the pivot operation on 14 separate occasions and the filtering operation on seven occasions. Given that Pivot was not actually regarded as very useful compared to the other categories (Figure 7), we suspect there may be a difference between the types of operations a user would like to perform manually versus ones that they would prefer to be recommended to them. Regardless, these interaction patterns indicate that Filter and Pivot do indeed resemble the natural, intuitive "next steps" in users' workflows.

Participants' post-study ratings of the different categories in Frontier largely corresponded to the usefulness counts during the session in Figure 7. One exception is that Correlation and Distribution were perceived as easily understandable and useful by more than five out of the eight participants who provided a rating, but had low ranks relative to the usage frequencies captured in Figure 7. The discrepancy likely stems from the fact that these two context-independent categories are only displayed when the Current View is empty, so participants did not see them as frequently as the other categories.
Utility of recommendation categories across datasets
As shown in Figure 7 (bottom), the number of useful visualizations from certain categories depended on the dataset. While the usefulness of each category largely followed the trend observed in Figure 7 (top), the usage of Pivot, Distribution, and Correlation differed across datasets. Given that the College dataset contained large numbers of measures, while the Medals dataset had few measures and more dimensions, it was no surprise that there were more uses of Correlation in the College dataset (N = 8) than in the Medals dataset (N = 3).

On the other hand, Distribution was used more frequently with the Medals dataset (N = 9) than with the College dataset (N = 6), possibly because it also contained bar charts of the count distributions of dimensions (showing a surprising trend that Europe won significantly more medals than any other continent), whereas Distribution for the College dataset showed mostly histograms of measures. Pivot was also used more frequently in the Medals dataset (N = 11) than in the College dataset (N = 6), although we were unable to determine the reason.

7 STUDY LIMITATIONS
While the analytical workflows that participants chose maybe influenced by the category selection logic, we tried tominimize the effect of the display order on category prefer-ences by randomizing the ordering of the various categoriesacross users. We did not investigate the effects of recommen-dation ranking functions, but instead adopted standard datainterestingness metrics from the literature [6], [10], [16], [46],[66]. Future work should explore how the interplay betweenranking functions, visualization types, and category labelsinfluences the usefulness of recommendations.While we employed two different datasets to study taskeffects, future studies with more realistic dataset properties(large, higher-dimensional), larger sample sizes, differentproblem domains, and varying user expertise would behelpful. We have not explicitly controlled for visualizationexpertise, leading to more participants with limited profi-ciency. We also acknowledge potential novelty and unfamil-iarity effects in our short one-hour study: most participantsdid not become fully fluent with all the categories andfeatures provided in
Frontier. The correlation between the 'warm-up' closed-ended task and the participant behavior may also be a potential confound: participants exhibited a strong affinity towards the Control Panel due to preconceptions of and familiarity with existing charting tools. A longitudinal study that examines how categorized recommendations are used in practice is important future work.
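As a concrete illustration of the counterbalancing mentioned above, display order can be randomized per participant with an independently seeded shuffle. The category names match Frontier, but the seeding scheme below is an illustrative assumption, not our exact implementation.

import random

CATEGORIES = ["Enhance", "Filter", "Pivot", "Correlation", "Distribution"]

def category_order(participant_id: int) -> list:
    # Seeding by participant id keeps each ordering reproducible
    # while varying the display order across participants.
    rng = random.Random(participant_id)
    order = CATEGORIES.copy()
    rng.shuffle(order)
    return order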
DESIGN IMPLICATIONS
From our study findings, we first describe the guiding principles for the design of VisRec categories. We then discuss interface considerations for recommendation categories and their potential pitfalls. Finally, we identify opportunities for supporting analytical actions in visual analysis.
Evidence from our study shows how recommendation categories can be powerful constructs that help situate users by establishing a mental framework for reasoning about recommendation results. The semantic grouping and visual affordances of recommendation categories "lift" the visual analysis to operate at the level of analytical actions. By observing how participants switched between manual specification and recommendations, we find that predictable categories reduce users' perceived cost of employing recommendations, which is crucial for establishing an effective mixed-initiative workflow where recommendations can be used seamlessly in conjunction with manual specification. Furthermore, the success of
Enhance and
Filter lends an important lesson for future VisRec systems in designing simple and readily accessible recommendation categories. In particular, our study suggests that transparency and interpretability are essential characteristics that lead to recommendation categories that are predictably useful. One heuristic for evaluating the complexity of a recommendation is to check whether the underlying action addresses a question involving a single element, which can either be a descriptor of the visualization's characteristics or an element that differs from the current view. For example,
Correlation answers the question: “
Which attributes are correlated?" and
Enhance answers: “
What visualizations can be generated by adding one additional attribute?" Participants' failure to adopt
Pivot may be partially due to the multiple degrees of freedom in which attributes could be swapped. Further, there may be difficulty in articulating the analytical question that
Pivot affords. This led to additional cognitive effort to reason about what was retained versus varied. Another contributor might be the drastic encoding change that can occur during swapping. This inconsistent behavior can be jarring and inhibits one's ability to compare across the collection. It remains an open question whether "anchoring" techniques that recommend appropriate encodings based on prior context [33] can be applied to recommended collections to offer a more consistent
Pivot. Encoding inconsistency across collections is never an issue when swapping out values in
Filter, as the visualized attributes are unchanged when we move across the value hierarchy. This may explain why
Filter was useful in providing complementary views on sub-populations of data, often leading to unexpected insights.

While our category selection algorithm takes an overview-first [67] approach, showing context-independent categories at the beginning and context-dependent recommendations once a specified view exists, several users explicitly cleared their selection in order to get the overview from time to time. Additionally, two participants indicated that they hoped to find visualization recommendations that were more unrelated and surprising rather than simple alternatives to their Current View. The diversity-accuracy tradeoff is a classical problem in recommender system design [68]. Supporting a blend of both types of visualization recommendations is a first step towards assisting users with different information preferences and needs. Further research is needed to develop and evaluate these recommendations as well as to validate our proposed taxonomy.
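One simple way to realize such a blend is a greedy, MMR-style re-ranker that interpolates between an interestingness (accuracy) score and a redundancy penalty. The sketch below is a generic illustration under that assumption, not Frontier's ranking function, whose interestingness metrics come from the cited literature.

from typing import Callable, List, TypeVar

T = TypeVar("T")

def mmr_select(candidates: List[T],
               relevance: Callable[[T], float],
               similarity: Callable[[T, T], float],
               k: int,
               lam: float = 0.7) -> List[T]:
    # Greedily pick k items maximizing
    # lam * relevance - (1 - lam) * (max similarity to items already picked).
    # Lower lam yields more diverse, less strictly "accurate" selections.
    selected: List[T] = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(c: T) -> float:
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance(c) - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected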
While recommendations are at most an annoyance when they are not interesting to the user, they can be potentially detrimental if they are used to draw conclusions without deeper examination. P.F summarized this tradeoff between exploration and exploitation: "For recommendation, sometimes you get completely irrelevant things, things that are kind of normal. But then on the other side, you get this serendipitous discovery, which is very cool. [...] I mean, it's also dangerous, because you maybe see something where you should further investigate it if it's really an effect."

Over-reliance on recommendations could be problematic. For example, P.B said that they had built trust that the system would show something interesting. When the interface did not show anything interesting, they quickly moved on to the next hypothesis instead of digging deeper, because they inferred that the system was telling them not to look there. We observed a similar effect during a closed-ended task, where participants were asked to select the data sub-populations that had more 8-cylinder cars (Q2). When Frontier displayed only visualizations for three out of the four multiple-choice answer options in
Filter, many participants who employed recommendations simply drew their conclusions based on the three visualizations included in the category, without verifying the remaining one. The potential for erosion of creativity and critical thinking when interacting with an intelligent system is well known [6], [69]. While this issue is not particular to recommendations based on analytical actions, but a more general phenomenon with recommendations [70], the ease of use and the apparent trust that users perceive from recommendation categories may exacerbate these issues. This challenge points to a need for future research in designing recommendations that provide some notion of coverage or inform users about what has or has not been examined.
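One possible form such a coverage signal could take is sketched below: track which attribute combinations the user has already viewed and surface the unexamined ones. Representing a view as the set of attribute names it shows is an illustrative assumption, not a feature of Frontier.

from itertools import combinations
from typing import FrozenSet, List, Set

class CoverageTracker:
    def __init__(self, attributes: List[str]):
        self.attributes = attributes
        self.examined: Set[FrozenSet[str]] = set()

    def record_view(self, attrs: List[str]) -> None:
        # Remember that this combination of attributes has been examined.
        self.examined.add(frozenset(attrs))

    def unexamined_pairs(self) -> List[FrozenSet[str]]:
        # Attribute pairs the user has not yet looked at; an interface
        # could display these as a "what has not been examined" cue.
        all_pairs = {frozenset(p) for p in combinations(self.attributes, 2)}
        return sorted(all_pairs - self.examined, key=sorted)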
Even though categorized visualization recommendations provide a means of organization, limited screen space makes it impossible to show all categories at once. Furthermore, users typically only peruse the first few items of a recommended list [71]. The diversity of preferences and individual strategies observed in our study suggests that personalized selection of recommendations may be worthwhile. For instance, while ten out of the 24 participants believed that there should be fewer recommendations, two participants (one from each condition) wanted to see more.

In post-study interviews, participants indicated that they wanted a more user-driven approach in creating their own organization. Three wanted the ability to extract selected recommendations into a separate dashboard, tab, or page and rearrange them freely into their own groups; eight wanted to hide some or all parts of the recommendation categories and retrieve them on demand. This points to an interesting future direction towards a hybrid human-recommender workflow. Similar to the variability of people's web search behavior [72], recommendations could be adaptive and personalized to tailor to users' preferences. Personalization yields potential benefits beyond providing adaptive interfaces, namely, in providing optimization opportunities for system scalability. One of the challenges for visualization recommendation systems is the high computational cost associated with searching over a large search space of possible visualizations [6], [8], [73]. Given that there are preferences for certain categories over others for different users, datasets, and tasks, there is an opportunity to reduce the computational cost of an exhaustive search by pruning the recommendation search space.
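A minimal sketch of such pruning, assuming per-user category preferences learned from past sessions: candidate generation is simply skipped for categories whose preference falls below a threshold. Both the threshold and the shape of generate_candidates are illustrative assumptions rather than Frontier's implementation.

from typing import Callable, Dict, List

def recommend(categories: List[str],
              user_prefs: Dict[str, float],
              generate_candidates: Callable[[str], List[str]],
              threshold: float = 0.2) -> Dict[str, List[str]]:
    # Only search categories whose learned preference clears the
    # threshold, avoiding an exhaustive search over every category.
    results: Dict[str, List[str]] = {}
    for cat in categories:
        if user_prefs.get(cat, 0.0) >= threshold:
            results[cat] = generate_candidates(cat)
    return results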
BEYOND THE FINAL FRONTIER
The goal of recommender systems is, in some sense, to anticipate future user needs. Recommendation categories help organize these possible futures as readily available options to drive analytical workflows. We introduced a taxonomy based on prior literature to examine the usefulness of various recommendation categories based on the underlying analytical actions. We implemented
Frontier as a design probe to better understand how users push towards the frontier, taking next steps in their exploration. Our user study confirmed that recommendation categories are indeed useful for facilitating data exploration, helping users understand and interpret the visualizations. While the general utility of categorization was not surprising, we more deeply explored how various categories of visualization recommendations were employed and the diverse workflow strategies that users adopted. Design implications stemming from this study provide unique opportunities for supporting delightful user experiences with next-generation VisRec systems.

REFERENCES

[1] L. Grammel, M. Tory, and M. Storey, "How information visualization novices construct visualizations,"
IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, pp. 943–952, Nov 2010.

[2] D. J.-L. Lee, J. Lee, T. Siddiqui, J. Kim, K. Karahalios, and A. Parameswaran, "You can't always sketch what you want: Understanding sensemaking in visual query systems," IEEE Transactions on Visualization and Computer Graphics, pp. 1–1, 2019.

[3] J. D. Mackinlay, P. Hanrahan, and C. Stolte, "Show Me: Automatic presentation for visual analysis," IEEE Transactions on Visualization and Computer Graphics, vol. 13, no. 6, pp. 1137–1144, 2007.

[4] K. Wongsuphasawat, D. Moritz, A. Anand, J. Mackinlay, B. Howe, and J. Heer, "Voyager: Exploratory analysis via faceted browsing of visualization recommendations," IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 649–658, 2016.

[5] K. Wongsuphasawat, Z. Qu, D. Moritz, R. Chang, F. Ouk, A. Anand, J. Mackinlay, B. Howe, and J. Heer, "Voyager 2: Augmenting visual analysis with partial view specifications," 2017.

[6] Z. Cui, S. K. Badam, A. Yalçin, and N. Elmqvist, "Datasite: Proactive visual data exploration with computation of insight-based recommendations," CoRR, vol. abs/1802.08621, 2018. [Online]. Available: http://arxiv.org/abs/1802.08621

[7] K. Hu, D. Orghian, and C. Hidalgo, "Dive: A mixed-initiative system supporting integrated data exploration workflows," in Proceedings of the Workshop on Human-In-the-Loop Data Analytics. ACM, 2018, p. 5.

[8] M. Vartak, S. Madden, and A. Parameswaran, "SeeDB: Supporting visual analytics with data-driven recommendations," 2015.

[9] G. Wills and L. Wilkinson, "AutoVis: Automatic visualization," Information Visualization, vol. 9, no. 1, pp. 47–69, 2010.

[10] Ç. Demiralp, P. J. Haas, S. Parthasarathy, and T. Pedapati, "Foresight: Rapid data exploration through guideposts," arXiv preprint arXiv:1709.10513, 2017.

[11] J. Seo and B. Shneiderman, "A rank-by-feature framework for interactive exploration of multidimensional data," Information Visualization, vol. 4, no. 2, pp. 96–113, 2005.

[12] S. M. McNee, J. Riedl, and J. A. Konstan, "Making recommendations better: An analytic model for human-recommender interaction," in CHI '06 Extended Abstracts on Human Factors in Computing Systems, ser. CHI EA '06. New York, NY, USA: Association for Computing Machinery, 2006, pp. 1103–1108. [Online]. Available: https://doi.org/10.1145/1125451.1125660

[13] T. Munzner, Visualization Analysis and Design, ser. AK Peters Visualization Series. CRC Press, 2015. [Online]. Available: https://books.google.de/books?id=NfkYCwAAQBAJ

[14] D. Tunkelang, Faceted Search. Morgan and Claypool Publishers, 2009.

[15] D. J.-L. Lee, "Insight machines: The past, present, and future of visualization recommendation," Medium, Feb. 2020.

[16] D. J.-L. Lee, H. Dev, H. Hu, H. Elmeleegy, and A. Parameswaran, "Avoiding drill-down fallacies with VisPilot: Assisted exploration of data subsets," in Proceedings of the 24th International Conference on Intelligent User Interfaces, ser. IUI '19. New York, NY, USA: ACM, 2019, pp. 186–196. [Online]. Available: http://doi.acm.org/10.1145/3301275.3302307

[17] S. van den Elzen and J. J. van Wijk, "Small multiples, large singles: A new approach for visual data exploration," in Computer Graphics Forum, vol. 32, no. 3pt2. Wiley Online Library, 2013, pp. 191–200.

[18] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, Third Edition, 3rd ed. The MIT Press, 2009.

[19] C. Stolte, D. Tang, and P. Hanrahan, "Polaris: A system for query, analysis, and visualization of multidimensional relational databases," IEEE Transactions on Visualization and Computer Graphics, vol. 8, no. 1, pp. 52–65, 2002.

[20] A. Satyanarayan, D. Moritz, K. Wongsuphasawat, and J. Heer, "Vega-Lite: A high-level grammar of interactive graphics," 2017. [Online]. Available: https://vega.github.io/vega-lite/

[21] K. Wongsuphasawat, D. Moritz, A. Anand, J. Mackinlay, B. Howe, and J. Heer, "Towards a general-purpose query language for visualization recommendation," in Proceedings of the Workshop on Human-In-the-Loop Data Analytics. ACM, 2016, p. 4.

[22] J. Mackinlay, "Automating the design of graphical presentations of relational information," ACM Transactions on Graphics, vol. 5, no. 2, pp. 110–141, 1986. [Online]. Available: http://portal.acm.org/citation.cfm?doid=22949.22950

[23] D. Moritz, C. Wang, G. L. Nelson, H. Lin, A. M. Smith, B. Howe, and J. Heer, "Formalizing visualization design knowledge as constraints: Actionable and extensible models in Draco," IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 438–448, 2019.

[24] Y. Kim, K. Wongsuphasawat, J. Hullman, and J. Heer, "GraphScape: A model for automated reasoning about visualization similarity and sequencing," Proc. of ACM CHI 2017, 2017.

[25] L. Wilkinson, A. Anand, and R. Grossman, "Graph-theoretic scagnostics," in IEEE Symposium on Information Visualization (INFOVIS 2005). IEEE, 2005, pp. 157–164.

[26] T. Siddiqui, A. Kim, J. Lee, K. Karahalios, and A. Parameswaran, "Effortless data exploration with Zenvisage: An expressive and interactive visual analytics system," Proceedings of the VLDB Endowment, vol. 10, no. 4, pp. 457–468, 2016.

[27] A. Anand and J. Talbot, "Automatic selection of partitioning variables for small multiple displays," 2015.

[28] A. Key, B. Howe, D. Perry, and C. Aragon, "VizDeck," Proceedings of the 2012 International Conference on Management of Data - SIGMOD '12, p. 681, 2012. [Online]. Available: http://dl.acm.org/citation.cfm?doid=2213836.2213931

[29] M. A. Yalçın, N. Elmqvist, and B. B. Bederson, "Keshif: Rapid and expressive tabular data exploration for novices," IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 8, pp. 2339–2352, 2018.

[30] P.-M. Law, R. C. Basole, and Y. Wu, "Duet: Helping data analysis novices conduct pairwise comparisons by minimal specification," IEEE Transactions on Visualization and Computer Graphics.

US 20180088753 A1, 2018.

[33] H. Lin, D. Moritz, and J. Heer, "Dziban: Balancing agency & automation in visualization design via anchored recommendations," Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems - CHI '20, 2020.

[34] M. A. Hearst, "Design recommendations for hierarchical faceted search interfaces," in Proc. SIGIR 2006, Workshop on Faceted Search, August 2006, pp. 26–30.

[35] K.-P. Yee, K. Swearingen, K. Li, and M. Hearst, "Faceted metadata for image search and browsing," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI '03. New York, NY, USA: ACM, 2003, pp. 401–408. [Online]. Available: http://doi.acm.org/10.1145/642611.642681

[36] N. Cao, J. Sun, Y. Lin, D. Gotz, S. Liu, and H. Qu, "FacetAtlas: Multifaceted visualization for rich text corpora," IEEE Transactions on Visualization and Computer Graphics, vol. 16, no. 6, pp. 1172–1181, Nov 2010.

[37] C. Collins, F. B. Viégas, and M. Wattenberg, "Parallel tag clouds to explore and analyze faceted text corpora," VAST 09 - IEEE Symposium on Visual Analytics Science and Technology, Proceedings, pp. 91–98, 2009.

[38] M. Dörk, N. Henry Riche, G. Ramos, and S. Dumais, "PivotPaths: Strolling through faceted information spaces," IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 12, pp. 2709–2718, Dec 2012.

[39] J. English, M. Hearst, R. Sinha, K. Swearingen, and K.-P. Yee, "Flexible search and navigation using faceted metadata," 2002.

[40] M. Hearst, A. Elliott, J. English, R. Sinha, K. Swearingen, and K.-P. Yee, "Finding the flow in web site search," Commun. ACM.

IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 21–30, 2017.

[45] A. Srinivasan, S. M. Drucker, A. Endert, and J. Stasko, "Augmenting visualizations with interactive data facts to facilitate interpretation and communication," IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 672–681, 2019.

[46] T. N. Dang and L. Wilkinson, "ScagExplorer: Exploring scatterplots by their scagnostics," in Proceedings of the 2014 IEEE Pacific Visualization Symposium, ser. PACIFICVIS '14. Washington, DC, USA: IEEE Computer Society, 2014, pp. 73–80. [Online]. Available: http://dx.doi.org/10.1109/PacificVis.2014.42

[47] P.-M. Law, A. Endert, and J. Stasko, "Characterizing automated data insights," IEEE VIS, arXiv:2008.13060, Aug. 2020.

[48] S. Vargas and P. Castells, "Rank and relevance in novelty and diversity metrics for recommender systems," RecSys, 2011.

[49] M. Kaminskas and D. Bridge, "Diversity, serendipity, novelty, and coverage: A survey and empirical analysis of beyond-accuracy objectives in recommender systems," ACM Transactions on Interactive Intelligent Systems, vol. 7, no. 1, pp. 1–42, 2016.

[50] C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova et al., "Novelty and diversity in information retrieval evaluation," SIGIR, pp. 659–666, 2008.

[51] T. Munzner, "Chapter 3. Why: Task abstraction," in Visualization Analysis and Design, 2014, pp. 42–65. [Online]. Available: http://dx.doi.org/10.1201/b17511-4

[52] M. Brehmer and T. Munzner, "A multi-level typology of abstract visualization tasks," IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 12, pp. 2376–2385, 2013. [Online]. Available: http://ieeexplore.ieee.org/document/6634168/

[53] R. Amar, J. Eagan, and J. Stasko, "Low-level components of analytic activity in information visualization," Proceedings - IEEE Symposium on Information Visualization, pp. 111–117, 2005.

[54] A. Strauss and J. Corbin, Grounded Theory Methodology: An Overview, ser. Handbook of Qualitative Research. Thousand Oaks, CA, US: Sage Publications, Inc, 1994, pp. 273–285.

[55] J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh, "Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals," Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 29–53, Mar 1997. [Online]. Available: doi.org/10.1023/A:1009726021843

[56] K. P. Murphy, Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.

[57] S. Kandel, R. Parikh, A. Paepcke, J. M. Hellerstein, and J. Heer, "Profiler: Integrated statistical analysis and visualization for data quality assessment," in Proceedings of the International Working Conference on Advanced Visual Interfaces, 2012, pp. 547–554.

[58] M. Sedlmair, A. Tatu, T. Munzner, and M. Tory, "A taxonomy of visual cluster separation factors," Computer Graphics Forum, vol. 31, no. 3pt4, pp. 1335–1344, 2012. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-8659.2012.03125.x

[59] E. Horvitz, "Principles of mixed-initiative user interfaces," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI '99. New York, NY, USA: ACM, 1999, pp. 159–166. [Online]. Available: http://doi.acm.org/10.1145/302979.303030

[60] J. Nielsen and R. Molich, "Heuristic evaluation of user interfaces," CHI '90 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.

IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 2122–2131, 2014.

[65] N. Mahyar and M. Tory, "Supporting communication and coordination in collaborative sensemaking," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 1633–1642, 2014.

[66] D. J.-L. Lee, J. Kim, R. Wang, and A. Parameswaran, "ScatterSearch: Visual querying of scatterplot visualizations," in IEEE Symposium on Information Visualization, 2019 (Poster), ser. InfoVis '19, 2019. [Online]. Available: https://arxiv.org/abs/1907.11743

[67] B. Shneiderman, "The eyes have it: A task by data type taxonomy for information visualizations," in Proceedings of the 1996 IEEE Symposium on Visual Languages, ser. VL '96. Washington, DC, USA: IEEE Computer Society, 1996, pp. 336–. [Online]. Available: http://dl.acm.org/citation.cfm?id=832277.834354

[68] G. Adomavicius and Y. Kwon, "Overcoming accuracy-diversity tradeoff in recommender systems: A variance-based approach," in 2008 Workshop on Information Technologies and Systems (WITS 2008), 2008, pp. 151–156.

[69] E. Fast, B. Chen, J. Mendelsohn, J. Bassen, and M. Bernstein, "Iris: A conversational agent for complex tasks," CHI 2018, 2018. [Online]. Available: http://arxiv.org/abs/1707.05015

[70] M. Correll, "Ethical dimensions of visualization research," in Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, ser. CHI '19. New York, NY, USA: Association for Computing Machinery, 2019, pp. 1–13. [Online]. Available: https://doi.org/10.1145/3290605.3300418

[71] T. Zhou, Z. Kuscsik, J.-G. Liu, J. Rushton, and Y.-C. Zhang, "Solving the apparent diversity-accuracy dilemma of recommender systems," Proceedings of the National Academy of Sciences, vol. 107, no. 10, pp. 4511–4515, 2010.

[72] R. W. White and S. M. Drucker, "Investigating behavioral variability in web search," in Proceedings of the 16th International Conference on World Wide Web, ser. WWW '07. New York, NY, USA: ACM, 2007, pp. 21–30. [Online]. Available: http://doi.acm.org/10.1145/1242572.1242576

[73] M. Vartak, S. Huang, T. Siddiqui, S. Madden, and A. Parameswaran, "Towards visualization recommendation systems,"