You can't always sketch what you want: Understanding Sensemaking in Visual Query Systems
Doris Jung-Lin Lee, John Lee, Tarique Siddiqui, Jaewoo Kim, Karrie Karahalios, Aditya Parameswaran
Abstract: Visual query systems (VQSs) empower users to interactively search for line charts with desired visual patterns, typically specified using intuitive sketch-based interfaces. Despite decades of past work on VQSs, these efforts have not translated to adoption in practice, possibly because VQSs are largely evaluated in unrealistic lab-based settings. To remedy this gap in adoption, we collaborated with experts from three diverse domains (astronomy, genetics, and material science) via a year-long user-centered design process to develop a VQS that supports their workflow and analytical needs, and to evaluate how VQSs can be used in practice. Our study results reveal that ad-hoc sketch-only querying is not as commonly used as prior work suggests, since analysts are often unable to precisely express their patterns of interest. In addition, we characterize three essential sensemaking processes supported by our enhanced VQS. We discover that participants employ all three processes, but in different proportions, depending on the analytical needs in each domain. Our findings suggest that all three sensemaking processes must be integrated in order to make future VQSs useful for a wide range of analytical inquiries.
Index Terms: Visual analytics, exploratory analysis, visual queries
Introduction
Line charts are commonly employed during data exploration: the intuitive connected patterns often illustrate complex underlying processes and yield interpretable and visually compelling data-driven narratives [12]. However, discovering line charts that display certain meaningful patterns, trends, or characteristics of interest is often an overwhelming and error-prone process, consisting of manual examination of large numbers of line charts. For example, when trying to find supernovae, which exhibit a unique pattern of brightness over time (an initial peak followed by a long-tail decay), astronomers often have to manually construct and inspect thousands of line chart visualizations to find ones with their desired pattern. To address this exploration challenge, a large number of papers have been dedicated to building Visual Query Systems (VQSs), a term coined by Ryall et al. [41] to describe systems that allow users to specify and search for desired line chart patterns via visual interfaces [9, 11, 18, 20, 25, 27, 41, 47, 49]. These interfaces typically include a sketching canvas where users can draw a pattern of interest, with the system automatically traversing all potential visualization candidates to find those that match the specification.
While these intuitive specification interfaces were proposed as a promising solution to the problem of painful manual exploration of visualizations for time-series analysis [41, 49], to the best of our knowledge, VQSs have not lived up to these expectations and are not very commonly used in practice. One likely reason for the lack of VQS adoption is that prior work has focused almost exclusively on optimizing the pattern-matching algorithms and interactions, with few efforts invested in understanding actual user needs and how VQSs can be used for solving real-world problems.
Our paper seeks to understand how VQSs can actually be used in practice, as a first step towards the broad adoption of VQSs in data analysis. Unlike prior work on VQSs, we set out to not only evaluate VQSs in-situ on real problem domains, but also involve participants from these domains in the VQS design. We present findings from a series of interviews, contextual inquiry, participatory design, and user studies with scientists from three different domains (astronomy, genetics, and material science) over the course of a year-long collaboration. The amount of time we invested in each of these three diverse domains surpasses the norm in this field and is key to uncovering the insights presented in this paper. These domains were selected to capture a diverse set of goals and datasets wherein VQSs can help address important scientific questions, such as: How does a treatment affect the expression of a gene in a breast cancer cell-line? Which battery components have sustainable levels of energy-efficiency and are safe and cheap to manufacture in production?
• Doris and Aditya are with University of California, Berkeley. Email: {dorislee, adityagp}@berkeley.edu
• John, Tarique, Jaewoo and Karrie are with University of Illinois, Urbana-Champaign. E-mail: {lee98, tsiddiq2, jkim475, kkarahal}@illinois.edu
Manuscript received xx xxx. 201x; accepted xx xxx. 201x. Date of Publication xx xxx. 201x; date of current version xx xxx. 201x. For information on obtaining reprints of this article, please send e-mail to: [email protected]. Digital Object Identifier: xx.xxxx/TVCG.201x.xxxxxxx
[Fig. 1: Lifecycle model summarizing our research approach and the outcome of each phase. Our approach: Phase I, Need-Finding (semi-structured interviews, preliminary discussion, contextual inquiry); Phase II, Prototyping (collective brainstorm via participatory design, demos and critiques, iterative prototyping); Phase III, Evaluation (open-ended exploration, pre-study survey, post-study interview). Outcome: Phase I, an understanding of workflows (users, dataset, challenges) and domain/problem characterization; Phase II, the Zenvisage++ system and VQS taxonomy, and data/operation design; Phase III, sensemaking model and behavioral patterns, and grounded evaluation.]
In this work, we adapt methods from user-centered design (UCD) [14, 31, 32], such as interviews, contextual inquiry, and participatory design, into our design-implementation-evaluation cycle [44]; our methodology is summarized in Figure 1. Via contextual inquiry and interviews, we first identified challenges in existing data analysis workflows in these domains that could be potentially addressed by a VQS. Building on top of an existing open-source VQS, Zenvisage [47, 48], we iterated on the design of the VQS with participants over the course of a year to better compose data exploration workflows that lead to insight discovery. Rather than targeting a domain-specific solution, we engaged with multiple domains to observe differences and commonalities across domains and synthesize high-level insights regarding the use of VQSs.
While conducting this multi-phased, mixed-methods research agenda across three diverse use cases was challenging, this endeavor was necessary for addressing the qualitative, participant-centered research questions investigated.
We organize our design study findings into a taxonomy of VQS capabilities, involving three sensemaking processes inspired by Pirolli and Card’s notional model of analyst sensemaking [37]. The sensemaking processes include top-down pattern search (translating a pattern “in-the-head” into a visual query), bottom-up data-driven inquiries (querying or recommending based on data), and context-creation (navigating across different collections of visualizations). We find that prior VQSs have focused on enabling top-down processes (via sketching capabilities), but have largely overlooked the two other processes that we found to be essential in all three domains. These missing capabilities partially explain why prior VQSs have not been widely adopted in practice.
We finally conducted an evaluation study with nine participants using our final VQS prototype to address their research questions on their own datasets. During this study, participants gained novel scientific insights, such as identifying a star that was known to harbor a Jupiter-sized planet, discovering a previously-unknown relationship between solvent properties, and finding characteristic gene expression profiles confirming the results of a related publication.
During this evaluation study, we were somewhat surprised to discover that sketching a pattern for querying is often ineffective on its own. This is because sketching assumes that users know the pattern that they want to sketch and are able to sketch it precisely. However, this is typically not the case in practice.
For example, the geneticists from our study often did not have preconceived knowledge of what to sketch and relied heavily on VQS-recommended common and outlying patterns to jumpstart their queries. Likewise, while the material scientists from our study were interested in datapoints that fall within specific value-ranges, they did not have an a priori notion of what their desired patterns would look like. Overall, participants typically opted to combine sketching with other means of pattern specification; one common mechanism was to drag-and-drop a recommended pattern onto the canvas, and then modify it (e.g., by smoothing it out).
To further understand how participants engaged with VQSs in their analytical workflows, we constructed a Markov model to characterize how participants transitioned between different sensemaking processes during their analysis. We found that participants often constructed a diverse set of analytical workflows tailored to their domains by focusing on a primary sensemaking process, while iteratively interleaving their analysis with the two other processes. This finding points to how all three sensemaking processes, along with seamless transitions between them, are crucial for enabling the effective use and adoption of VQSs for addressing real-world challenges.
To the best of our knowledge, our study is the first to holistically examine how VQSs can be designed to fit the needs of real-world analysts, and how they are actually used in practice.
Working with participants from multiple domains enabled us to compare the differences and commonalities across different domains, thereby identifying general VQS challenges and requirements for supporting common analytical goals. Our contributions include:
• a characterization of the problems addressable by VQSs through design studies with three different domains,
• a taxonomy of essential VQS capabilities, leading to a sensemaking model for VQSs,
• an integrative VQS, zenvisage++, capable of facilitating rapid hypothesis generation and insight discovery, resulting from iteration with end-users,
• study findings on how VQSs are used in practice, leading to the development of a novel sensemaking model for VQSs.
Our work not only opens up a new space of opportunities beyond the narrow use cases considered by prior studies, but also advocates common design guidelines and end-user considerations for building next-generation VQSs.
Related Work
We will now describe past work in visual query systems and existing evaluation methods of visualization systems to provide background and motivation for our work.
Visual Query Systems: Definition and Brief Survey
The term visual query system (VQS) was introduced by Ryall et al. [41] and Correll and Gleicher [9] to describe systems that enable analysts to directly search for line chart visualizations matching a queried pattern, constructed through a visual specification interface. Examples of such systems include TimeSearcher [17, 18], where the query specification mechanism is a rectangular box, with the tool filtering out all of the line charts that do not pass through it, and QuerySketch [49] and Google Correlate [27], where the query is sketched as a pattern on canvas, with the tool filtering out all of the line charts that have a different shape. Subsequent work, including TimeSketch [11], SketchQuery [9], and Qetch [25], recognized the ambiguity in sketching by studying how humans rank similarity in patterns. Finer-grained specification interfaces and pattern-matching algorithms have also been developed to improve the expressiveness of sketched queries and clarify how a sketch should be interpreted. These VQSs include QueryLines [41], where queries can be flexibly composed of soft constraints and preferences, and SoftSelect [20], where users can vary the level of sketch similarity across a search pattern. Beyond sketching, Zenvisage [47, 48], SketchQuery, and TimeSearcher allow users to submit an existing visualization as the query, either via drag-and-drop or double-clicking on the existing visualization. In our work, we built on our system, Zenvisage, since it was open-source, extensible, and included features beyond the pattern match specification typically found in other systems, such as the ability to add data filters and examine recommended patterns [48].
Design and Evaluation Methodologies for Visualization Systems
Visualization systems are typically evaluated via in-lab usability studies or controlled studies against existing visualization baselines [33, 38, 50]. However, successful lab-tested systems do not always translate to community acceptance and adoption. The unrealistic nature of controlled studies has prompted the visualization research community to develop more participant-centered, ethnographic approaches for understanding how analysts perform visual data analysis and reasoning [23, 30, 38, 43, 46]. For example, multi-dimensional, in-depth, long-term case studies (MILCs) combine interviews, surveys, logging, and other empirical artifacts to create a holistic understanding of how a visualization system can be used in its intended environment [46].
In the VQS literature, even though the development and evaluation of advanced VQS algorithms and interactions has been well underway for many years, prior work has yet to characterize and understand the needs of target users and observe how VQSs may be used as part of a real-world workflow, in order to address the initial questions of: 1) whether the problems that VQSs aim to address are even the right ones to address and 2) whether the chosen operations actually solve the user’s problems. In the context of Munzner’s nested model for visualization design and evaluation [30], this gap between research and adoption stems from the common “downstream threat” of jumping prematurely into the deep levels of encoding, interaction, or algorithm design, before a proper domain problem characterization and data/operation abstraction design is performed. Our work fills this crucial gap in the existing literature and highlights how incorrect assumptions adopted by most prior work in this space regarding the first two stages of Munzner’s model may have led to the present-day failures in VQS adoption.
We performed design studies [23, 43, 46] with three different subject areas for domain problem characterization by adopting user-centered design practices.
User-centered design (UCD) [14, 31, 32] is a class of techniques for iteratively designing a product that fits the needs and desires of users. In UCD, users convey their needs to inform design decisions. Through participatory design (PD) [29, 42], we engaged potential stakeholders as active co-designers early on and during every step of the design process, in order to develop a system that they may eventually adopt in their analytical workflows. Participatory design is a well-established UCD approach in the CHI and CSCW communities and has been successfully applied to develop systems for visual analytics [2, 7], tangible museum experiences [8], and scientific collaborations [6, 39].
In order to “[develop] a system model that will support [the] user’s work” that subsequently “fosters participatory design”, Holtzblatt and Jones [19] describe contextual inquiry as a technique where researchers observe participants in their own work environment. Likewise, we first perform contextual inquiry and interviews with participants to understand their research questions and the challenges associated with their existing analytical workflows, and to identify design opportunities for VQSs. To better understand how VQSs can be used in-situ in participants’ existing workflows, we regularly gathered feedback from participants and collaboratively envisioned potential designs by demonstrating preliminary versions of our prototype zenvisage++. Based on our design findings, we contribute to the data/operation abstraction design of VQSs in Munzner’s model by developing a taxonomy for characterizing how analysts make use of VQSs to accomplish their analytical tasks. Finally, we validated our abstraction design with grounded evaluation [21, 38], where participants were invited to bring in their own datasets and research problems that they have a vested interest in to test our final deployed system.
Methods
Via interviews and contextual inquiry in participants’ normal work environments, we first identified the needs and challenges in participants’ existing data analysis workflows. Given these challenges, we collaboratively designed VQS functionalities by engaging with experts from three different domains throughout the design process, leading to a final prototype zenvisage++. After the design phase, we conducted an evaluation study to understand how VQSs are used in real-world analytical workflows. Our research methodology is illustrated in Figure 1; we now describe the study procedure in more detail.
We recruited participants by reaching out to research groups who have experienced challenges in data exploration, via email and word-of-mouth. Based on early conversations with analysts from 12 different potential application areas, we narrowed down to three use cases in astronomy, genetics, and material science through a process similar to the “winnow” stage in Sedlmair et al. [43]. The domains were chosen based on their suitability for VQSs as well as diversity in use cases. Six scientists, with extensive research experience in their respective fields, participated in the design process. We interviewed participants to learn about their dataset and research questions, shadowed participants in conducting their existing analysis workflows, and subsequently discussed the needs and challenges of their use cases. The interviews were semi-structured and focused on how the analytical tasks in their workflows relate to the scientific questions they were interested in.
For iterative prototyping, we built on top of an existing open-source VQS, Zenvisage [47, 48], to create a functional prototype to showcase the capabilities of VQSs. The use of functional prototypes is a common and effective way of engaging with participants, by providing a starting point for collaborative design [8].
We collaborated with each team closely with approximately two 1-hour-long meetings per month, where we learned more about their datasets, objectives, and what additional VQS functionalities could help address their research questions. During these meetings, we collectively brainstormed with participants on the design of the prototype. Participants also had the opportunity to interact with the prototype with the help of a facilitator. Through these exercises, we elicited feedback from participants on how the VQS could better support their scientific goals and identified and incorporated several crucial capabilities into zenvisage++.
After the prototyping phase, we performed a qualitative evaluation to study how analysts interact with different VQS components in practice. Participants used datasets that they have a vested interest in exploring to address unanswered research questions (a total of six different datasets across nine participants). The evaluation study participants included the six scientists from Phase I and II, along with three additional “blank-slate” participants who had never encountered zenvisage++ before. (Details regarding participants can be found in the appendix in Table 3.) The use of all or a subset of the project stakeholders as evaluation participants is typical in participatory design [5]. While the small sample size of participants may be viewed as a limitation, this is a pervading challenge when recruiting domain experts [3, 26]. Nevertheless, even studies with a small group of domain experts are invaluable for understanding expert needs [43].
Evaluation study participants were recruited from each of the three aforementioned research groups, as well as domain-specific mailing lists. Prior to the study, we asked potential participants to fill out a pre-study survey to determine eligibility. Eligibility criteria included: being an active researcher in the subject area with more than one year of experience, and having worked on a research project involving data of the same nature used in the design phase.
At the start of the in-lab evaluation study, participants were provided with an interactive walk-through of zenvisage++ and given approximately ten minutes for a guided exploration of a preloaded real-estate example dataset. After participants had familiarized themselves with the tool, we loaded the participant’s dataset and encouraged them to talk aloud during data exploration, and use external resources as needed. If the participant was out of ideas, we suggested one of the main VQS functionalities that they had not yet used. If this operation was not applicable to their specific dataset, they were allowed to skip the operation after having considered it. The user study lasted for about an hour and ended after they covered all the main functionalities. After the study, we asked participants open-ended questions about their experience.
Current Participant Workflows and Opportunities
In this section, we describe our study participants, their scientific goals, and their preferred analysis workflows, based on Phase I of our study. While we collaborated with each application domain in depth, we focus on the key findings from each domain to highlight their commonalities and differences, in order to provide a backdrop for our VQS findings described later on. Comparing and contrasting the diverse set of questions, datasets, and challenges across these three use cases revealed new cross-disciplinary insights essential to better understand how VQSs can be extended for novel and unforeseen use cases.
Participants and Goals:
The Dark Energy Survey (DES) is a multi-institution project that surveys 300 million galaxies over 525 nights to study dark energy [10]. The telescope used to survey these galaxies also focuses on smaller patches of the sky on a weekly interval to discover astronomical transients, i.e., objects whose brightness changes dramatically as a function of time, such as supernovae or quasars. Their dataset consisted of a large collection of light curves: brightness observations over time, one associated with each astronomical object, plotted as a time series. Over five months, we worked closely with A1, an astronomer on the project’s data management team at a supercomputing facility. Their scientific goal was to identify potential astronomical transients in order to study their properties, i.e., identify patterns in line charts.
Existing Workflow and Design Opportunities:
Since astronomical datasets are often terabytes in scale, they are often processed and stored in highly specialized data management systems in supercomputing centers. As a preliminary step, the astronomer downloads a data sample to explore in a Jupyter notebook, performs data cleaning and wrangling, and verifies data fidelity by computing a set of relevant statistics. Then, to identify transients, the primary scientific goal of their exploration, the astronomer programmatically generates visualizations of candidate objects with matplotlib and visually examines each light curve. If an object of interest is identified through visual analysis, the astronomer may inspect the image of the object to verify that the significant change in brightness was not due to an imaging artifact. While experienced astronomers like A1 who have examined many transient light curves can often distinguish an interesting transient from noise by sight, manually searching for transients is still very time-consuming and error-prone, since the large majority of objects are false positives. A1 immediately recognized the potential of VQSs, since he could use specific pattern search queries to directly identify these rare transients without cumbersome manual examination.
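The kind of pattern search that A1 describes can be illustrated with a small sketch. The snippet below is not the matching algorithm used by Zenvisage or any other VQS discussed here; it assumes a z-normalized Euclidean distance as the similarity measure and light curves sampled at the same epochs, and the object names and brightness values are hypothetical. It ranks light curves against a template with an initial peak followed by a long-tail decay.

```python
import math

def z_normalize(series):
    """Rescale a series to zero mean and unit variance so that only shape matters."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    if std == 0:
        return [0.0] * len(series)  # a constant series has no shape to compare
    return [(v - mean) / std for v in series]

def shape_distance(query, candidate):
    """Euclidean distance between two equal-length series after z-normalization."""
    if len(query) != len(candidate):
        raise ValueError("series must have the same length")
    q, c = z_normalize(query), z_normalize(candidate)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))

def rank_by_similarity(query, light_curves):
    """Return object names ordered from most to least similar to the query."""
    return sorted(light_curves, key=lambda name: shape_distance(query, light_curves[name]))

# Hypothetical template: a sharp rise followed by a long-tail (exponential) decay.
template = [0.0, 1.0] + [math.exp(-0.5 * t) for t in range(1, 7)]

# Hypothetical light curves (brightness at eight epochs), for illustration only.
light_curves = {
    "flat_source": [5.0, 5.1, 4.9, 5.0, 5.1, 5.0, 4.9, 5.0],
    "rising_source": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "transient_like": [10.0, 30.0, 22.0, 17.5, 14.5, 12.8, 11.7, 11.0],
}

ranking = rank_by_similarity(template, light_curves)
print(ranking)
```

In this toy example the hypothetical `transient_like` curve ranks first, because it is roughly a scaled and shifted copy of the template and z-normalization removes differences in offset and amplitude. Real light curves are irregularly sampled and noisy, so a practical system would need resampling, smoothing, or a more tolerant distance measure.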
Participants and Goals:
Gene expression is a common measurement in genetics obtained via microarray experiments [35]. We worked with a graduate student (G1) and professor (G3) at a research university who were using gene expression data to understand how genes are related to phenotypes expressed during early embryonic development. Their data consisted of a collection of gene expression profiles over time for mouse stem cells, aggregated over multiple experiments. Their scientific goal was to correlate gene function with expression profiles (i.e., line charts) by gaining a high-level overview of the expression profile patterns.
[Fig. 2: The zenvisage++ system consists of: (A) data selection panel (where users can select visualized dataset and attributes), (B) query canvas (where the queried data pattern is submitted and displayed), (C) results panel (where the visualizations most similar to the queried pattern are displayed as a ranked list), (D) control panel (where users can adjust various system-level settings), and (E) recommendations (where the typical and outlying trends in the dataset are displayed).]
Existing Workflow and Design Opportunities:
G1 often downloads the raw microarray data from a public database and preprocesses the data using a script written in R. Then, to explore this data, G1 loads the preprocessed gene expression data into a custom desktop application to visualize and cluster the gene expression profiles. Prior to the study, G1 and G3 spent over a month searching for the “right” number of groups to cluster the profiles, by iteratively tuning the parameters on the clustering application and evaluating the output via a mix of application-provided visualizations and programmatically generated statistics. While regenerating their results took no more than 15 minutes every time they made a change, the multi-step, segmented workflow meant that all changes had to be done offline; that is, they could only test out a few variations per week. When we first demonstrated the capabilities of a VQS in our introductory meeting, G3 was astonished to see that on performing an interaction, the recommended visualizations updated almost instantaneously, as opposed to waiting until the next meeting for G1 to regenerate the results. They expressed an interest in VQSs, since the tool had the potential to dramatically speed up their collaborative analysis process.
Participants and Goals:
We collaborated with material scientists at a research university who identify solvents for energy-efficient and safe batteries. These scientists worked on a large simulation dataset containing chemical properties for more than 280,000 solvents [22]. Each row of their dataset corresponded to a unique solvent with 25 different chemical attributes. We worked closely with a postdoctoral researcher (M1), professor (M2), and graduate student (M3) to design a sensible way of exploring their data. They wanted to use VQSs to discover solvents that not only have similar properties to known solvents, but are also more favorable (e.g., cheaper or safer to manufacture). To search for these solvents, they needed to understand how changes in certain chemical properties affect others (expressed as trends in line charts) under specific conditions.
Existing Workflow and Design Opportunities:
M1 typically starts his data exploration process by applying filters to a list of potential battery solvents using SQL queries (e.g., find solvents with boiling point over 300 Kelvin and lithium solvation energy under 10 kcal/mol). By iteratively applying and adjusting different (often complementary) sets of filters, he compares different groups of solvents by observing their properties across a small sample. He manually examines the properties of each individual solvent by inspecting the 3D chemical structure of the solvent in custom software, as well as gathering information regarding the solvent by cross-referencing an external chemical database and existing uses of this solvent in literature. The collected information, including cost, availability, and other physical properties, enabled researchers to select the final set of desirable solvents that could be feasibly experimented with in their lab. While M1 could identify potential solvents through manual lookups and comparisons, M2 and M1 saw the value in VQSs since it was often impossible to manually uncover hidden relationships between different attributes, such as how changes in one property affect the behavior of others for a class of solvents, across large numbers of solvents.
Across the domains, several themes emerged around the bottlenecks that participants experienced in existing workflows.
• Need for Expressive Querying: While there is often a need to compare among large numbers of data instances, it is difficult to express and search for a desired shape-based pattern through programming languages like SQL or Python. And yet, none of the participants had heard of VQSs, let alone used them.
• Need for Integrative Workflows: Users often switched between different analytical tasks, including preprocessing, parameter specification, code execution, and visualization comparisons. The non-interactive nature of their segmented workflows impedes exploratory analysis and hinders collaboration.
• Need for Faceted Exploration:
To deal with the large volume of data present, users have to select particular samples or subsets of data that are “worth investigating”. Often, the choice of what criteria to apply as filters is also exploratory.
These themes seeded the collaborative feature discovery process, leading to the development of the system prototype, described next.
Design Process and System Overview
Given the need for a VQS, we further collaborated with participants to develop features to address their problems and challenges in Phase II of our study. We first provide a high-level system overview of the design product, zenvisage++, then we reflect on our feature discovery process.
The zenvisage++ interface is organized into five major regions, all of which dynamically update upon user interactions. Typically, participants begin their analysis by selecting the dataset and attributes to visualize in the data selection panel (Figure 2A). Then, they specify a pattern of interest as a query (hereafter referred to as pattern query), through either sketching, inputting an equation, uploading a data pattern, or dragging and dropping an existing visualization, displayed on the query canvas (Figure 2B). zenvisage++ performs shape-matching between the queried pattern and other possible visualizations, and returns a ranked list of visualizations that are most similar to the queried pattern, displayed in the results panel (Figure 2C). At any point during the analysis, analysts can adjust various system-level settings through the control panel (Figure 2D) or browse through the list of recommendations provided by zenvisage++ (Figure 2E). For comparison, the existing Zenvisage
| Component | Feature | Purpose | Task Example | Similar Features in Past VQSs |
|---|---|---|---|---|
| Top-Down, Pattern Specification: What is the shape of the pattern query? | Query by Sketch (Figure 2B1) | Freehand sketching for specifying pattern query. | A: Find patterns with a peak and long-tail decay that may be supernovae candidates. | All include sketch canvas except [18]. |
| Top-Down, Pattern Specification | Input Equation (Figure 2A1) | Specify an exact functional form as a pattern query (e.g., y = x^2). | M: Find patterns exhibiting inversely proportional chemical relationship. | - |
| Top-Down, Pattern Specification | Pattern Upload (Figure 2D2) | Upload a pattern consisting of a sequence of points as a query. | A: Find supernovae based on previously discovered sources. | Upload CSV [27] |
| Top-Down, Match Specification: How should the pattern query be matched with other visualizations? | Smoothing (Figure 2D2) | Interactively adjusting the level of denoising on visualizations, effectively changing the degree of shape approximation when performing pattern matching. | A, M: Eliminate patterns matched to spurious noise. | Smoothing [25]; Angular slope queries [18]; Trend querylines [41] |
| Top-Down, Match Specification | Range Selection (Figure 2B2, D4) | Restrict to query only in specific x/y ranges of interest through brushing selected x-range and filtering selected y-range. | A: Matching only around shape exhibiting a peak. M: Matching only around shape regions that exhibit linear or exponential relationships. | Text Entry [25, 49]; Min/max boundaries [41]; Range Brushing [17] |
| Top-Down, Match Specification | Range Invariance (Figure 2D1, 4) | Ignoring vertical or horizontal differences in pattern matching through option for x-range normalization and y-invariant similarity metrics. | A: Searching for existence of a peak above a certain amplitude. G: Searching for a “generally-rising” pattern. | Temporal invariants [9] |
| Context Creation, View Specification: What data to visualize and how should it be displayed? | Data selection (Figure 2A) | Changing the collection of visualizations to iterate over. | M: Explore tradeoffs and relationships between physical attributes. | - |
| Context Creation, View Specification | Display control (Figure 2D4) | Changing the details of how visualizations should be displayed. | M: Non-time-series data should be displayed as scatterplot. | - |
| Context Creation, Slice-and-Dice: How does navigating to another data subset change the query result? | Filter (Figure 2D3) | Display and query only on data that satisfies the composed filter constraints. | A: Eliminate unlikely candidates by navigating to more probable data regions. M, G: Compare how overall patterns change when filtered to particular data subsets. | - |
| Context Creation, Slice-and-Dice | Dynamic Class (Figure 9) | Create custom classes of data that satisfy one or more specified range constraints. Display aggregate visualizations for separate data classes. | A, M: Examine aggregate patterns of different data classes. | - |
| Result Querying: What other visualizations “look similar” to the selected pattern? | Drag-and-drop (Figure 2C, E) | Querying with any selected result visualization as pattern query (either from recommendations or results). |
A, G, M:
Find other objects thatare similar to X; Examine whatother objects similar to X looklike overall. Drag-and-drop [17]Double-Click [9] B o tt o m - U p Recommendation:
What are the key patternsin this dataset?
Representativeand Outliers(Figure 2E) Displaying visualizations ofrepresentative trends and outlierinstances based on clustering. A: Examine anomalies and debugdata errors through outliers.
G, M:
Understand representativetrends common to this dataset(or filtered subset). —-
Table 1: Taxonomy of key capabilities essential to VQSs and major features incorporated via user-centered design. We organize each feature based on its functional component. From leftto right, each of the three sensemaking processes (first column) is broken down into key functional components (second column) in VQSs. Each component addresses a pro-forma questionfrom the system’s perspective. Table cells are further colored according to the sensemaking process that each component corresponds to (Blue: Top-down, Yellow: Context creation, Green:Bottom-up). We list the functional purpose of each feature based on how it is implemented in zenvisage ++ , example use cases from participatory design ( A: astronomy, M: materialscience, G: genetics), and similar features incorporated in past VQSs. Given the exhaustive nature of Table 1, each motivated by example use cases from one or more domains, we furtherorganize the features in terms of the Section 6 sensemaking framework and assess their e ff ectiveness in the Section 7 evaluation study. system from [48] allowed users to query via sketching or drag-and-dropand displayed representative and outlier pattern recommendations, buthad limited capabilities to navigate across di ff erent data subsets and hadfew control settings. Our zenvisage ++ system is open source and avail-able at: http://github.com/zenvisage/zenvisage ; other detailsand documentation can be found at that link. Throughout the design process, we worked closely with participants todiscover VQS capabilities that were essential for addressing their high-level domain challenges. We identified various subtasks based on theparticipant’s workflows, designed sensible features for accomplishingthese subtasks that could be used in conjunction with existing VQScapabilities, and elicited feedback on intermediate feature prototypes.Bodker et al. 
[4] cite the importance of encouraging user participation and creativity in cooperative design through different techniques, such as future workshops, critiques, and situational role-playing. Similarly, our objective was to collect as many feature proposals as possible. We further organized the features we added to zenvisage++ into Table 1 through an iterative coding process [28] by one of the authors. We first collected the list of features, example usage scenarios, and similar capabilities in existing VQSs as open codes, corresponding to individual rows in Table 1. Then, we further organized this list into axial codes representing “components”: core functionalities essential to VQSs (second column in Table 1). Finally, the selective codes capture each of the sensemaking processes (leftmost column in Table 1).

Instead of describing this table in detail, we present a typical example of how this table is organized. From right to left, consider the row corresponding to the Smoothing feature (column 3) in Table 1: one of the common challenges in astronomy and material science is that noise in the dataset can result in large numbers of false-positive matches. To address this issue, smoothing is a feature in zenvisage++ that enables users to adjust data smoothing algorithms and parameters on-the-fly to both denoise the data and change the degree of shape approximation applied when performing pattern matching. Smoothing, along with range selection and range invariance, is part of the match specification component: VQS mechanisms for clarifying how matching should be performed. Both match specification and pattern specification (a description of what the pattern query should look like) are essential components for supporting the sensemaking process top-down pattern search (in blue, as labeled in the leftmost column).

It is important to note that while some of the proposed features in Table 1 (such as data filtering and view specification) are pervasive in general visual analytics (VA) systems [1, 16], they have not been incorporated in present-day VQSs. In fact, one of the key insights here is in recognizing the need for an integrative VQS whose sum is greater than its parts, that encourages analysts to rapidly generate hypotheses and discover insights by facilitating all three sensemaking processes. This finding is partially enabled by the unexpected benefits that come with collaborating with multiple groups of participants during the feature discovery process. Next, we reflect on what worked and what didn’t work in the feature discovery process, to inform similar design studies for visual analytics systems.
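To make the Smoothing example above more concrete, the following minimal sketch shows how a smoothing parameter can change which candidates rank as most similar to a pattern query. It is an illustration only: the moving-average smoother, the z-normalization step, the Euclidean distance, and all function names are assumptions chosen for brevity, not the algorithms or API used in zenvisage++.

```python
import math

def moving_average(values, window):
    """Smooth a series with a centered moving average (window >= 1)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def z_normalize(values):
    """Rescale to zero mean and unit variance so vertical offset and scale are ignored."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_similarity(query, candidates, window=1):
    """Return candidate names ordered from most to least similar to the query.

    `candidates` maps a (hypothetical) name to a series of the same length
    as `query`. A larger `window` applies heavier smoothing to candidates.
    """
    q = z_normalize(query)
    scored = []
    for name, series in candidates.items():
        s = z_normalize(moving_average(series, window))
        scored.append((euclidean(q, s), name))
    return [name for _, name in sorted(scored)]
```

Under these assumptions, raising `window` trades fidelity for robustness: noisy candidates move closer to a smooth query, at the cost of blurring sharp features such as narrow peaks.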
Cross-pollination and Generalization via Parallel Use Cases.
Introducing newly added features to zenvisage++ that addressed a particular domain often resulted in unexpected use cases for other domains. Considering feature proposals from multiple domains can also result in cross-pollination of feature designs, often leading to more generalized design choices. For example, around the same time when we spoke to astronomers who wanted to eliminate sparse time series from their search results, our material science collaborators also expressed a need for inspecting only solvents with properties above a certain threshold. Instead of developing separate domain-specific features, data filtering arose as a crucial, common operation that was later incorporated into zenvisage++ to support this class of queries.

The Hidden Upfront Cost of Domain Integration.
While we expected to spend most of our collaborative design effort on figuring out the mechanics of visual query specification and matching, preparing participant datasets for use in our system by meeting data and system requirements was instead the most time-consuming aspect of this phase. Data requirements include gaining an understanding of the problem domain, understanding the types of data suitable for a VQS, and cleaning and loading of this data. System requirements include features required for the data to be visualized appropriately. Often, participants could only envision the types of queries to issue and how variations to the system could help better address their needs after seeing their data displayed for the first time in the prototype. We also found that the time it took us to satisfy the data and system requirements decreased as we progressed to the later domains, by leveraging existing features in our prototype to satisfy some of the upfront needs.

Build Connectors, not Swiss-Army Knives.
Participants often envisioned how VQSs can be used in conjunction with other resources that they are familiar with, including those used for reference, computing statistics, browsing related datasets, or examining other data attributes or visualization types not supported in the VQS (scatterplots, histograms). The prevalence of external tools for supporting analytical inquiries stems from how analysts often require multiple data sources or data attributes to further develop or verify their hypothesis. For example, to determine whether a particular gene belongs to a regulatory network, G2 not only needed to look at the expression data in the VQS, but also enrichment testing and knockout data. Likewise, others used specialized tools for visualizing telescope images and 3D chemical structures. Rather than forcing our VQS prototype to become a Swiss-army knife, we focused on building connectors that enable smoother transitions between tools. For example, our data upload and pattern upload features invite participants to bring data from an external tool into zenvisage++, while our data export feature allows users to download the similarity, representative trend, and outlier results as CSV files from zenvisage++ into an external tool. For instance, geneticists could export the clusters directly from zenvisage++ as inputs to their downstream regression analysis.

The Art of Problem Selection.
While our collective brainstorming led to the cross-pollination and generalization of features, this technique can also lead to unnecessary features that result in wasted engineering effort. During co-design, there were numerous features proposed by participants, not all of which were incorporated. The reasons for not carrying a feature from design to implementation stage included:

• Nice-to-haves: One of the most common reasons for unincorporated features comes from participants’ requests for nice-to-have features. We use two criteria (necessity and generality across domains) to judge whether to invest in developing a particular feature.
• “One-shot” operations: We decided not to include features that only needed to be performed once and remain fixed thereafter in the analysis workflow. For example, certain preprocessing operations such as filtering null values only needed to be performed once with an external tool, whereas data smoothing is a procedure that requires some degree of tuning and adjustments.
• Substantial research or engineering effort: Some proposed features did not make sense in the context of VQS or required a completely different set of research questions. For example, the question of how to properly compute similarity between time series with non-uniform number of datapoints arose in the astronomy and genetics use case, but requires the development of a novel distance metric and algorithm that is out of the scope of our design study objective.
• Underdeveloped ideas: Other feature requirements came from casual requests that were underspecified. For example, A1 wanted to look for objects that have a deficiency in one band and high emission in another band, but the scientific definition of “deficiency” in terms of brightness levels was ambiguous.

(Footnote: We provide a detailed timeline in Appendix A.)

The decision of whether to invest in developing a feature requires a careful balance between promoting unforeseen features and wasting engineering effort.
Failure to identify these early signs may result in feature implementations that turn out not to be useful for the participants or result in feature bloat.

Sensemaking Model for VQSs

We now revisit Table 1 in an effort to contextualize our design findings using Pirolli and Card’s sensemaking framework [37]. Pirolli and Card’s sensemaking model for expert intelligence analysis distinguishes between information processing tasks that are top-down (from theory to data) and bottom-up (from data to theory). Correspondingly, in the context of VQSs, analysts can query either directly based on a pattern “in their head” [43] via top-down pattern specification or based on the data or visualizations presented to them by the system via bottom-up data-driven inquiry. In addition, when analysts do not know what attributes to visualize, context creation helps analysts navigate across different collections of visualizations to seek visualization attributes of interest. In this section, we first describe the objectives of each sensemaking process, then we discuss how each sensemaking process is composed of functional components that address the problem and dataset characteristics of each domain.

Top-down processes are “goal-oriented” tasks that make use of “analysis or re-evaluation of theories [and] hypotheses [to] generate new searches” [37]. Applying this notion to the context of VQSs, the goal of top-down pattern search is to search for data instances that exhibit a specified pattern, based on the analyst’s intuition about what the desired patterns should look like “in theory” (including visualizations from past experience or abstract conceptions based on external knowledge). Based on this preconceived notion of what patterns to search for, the design challenge is to translate the pattern query from the analyst’s head to a query executable by the VQS. This requires both components
This requires both componentsfor specifying the pattern ( pattern specification ), as well as controlsgoverning how the pattern-matching is performed ( match specification ). Pattern Specification interfaces allow users to submit exact descrip-tions of a pattern query. This is useful when the dataset contains largenumbers of potentially-relevant pattern instances . Since it is often di ffi -cult to sketch precisely, additional shape characteristics of the patternquery (e.g., patterns containing a peak with a known amplitude, orexpressible as a functional form) can be used to further winnow the listof undesired matches. atch Specification addresses the well-known problem in VQSswhere pattern queries are imprecise [9, 11, 20] by enabling users toclarify how pattern matching should be performed. Match specificationis useful when the dataset is noisy . When the pattern query satisfiessome additional constraints (e.g., the pattern is horizontally invariant),adjusting these knobs prune away matches that are false-positives tohelp analysts discover true desired candidates. Usage Scenario:
A1 knows intuitively what a supernova pattern should look like and its detailed shape characteristics, such as the amplitude of the peak and the level of error tolerance for defining a match. He first performs top-down pattern search by querying for transient patterns through sketching, then adjusts the match criterion by choosing to ignore differences along the temporal dimension and changing the similarity metric for flexible matching.

In Pirolli and Card’s sensemaking model, bottom-up processes are “data-driven” tasks initiated by “noticing something of interest in data” [37]. Likewise in VQSs, bottom-up data-driven inquiry is a browsing-oriented sensemaking process that involves tasks that are inspired by system-generated visualizations or results. The design challenge for VQSs to support bottom-up inquiries is to develop the right set of “stimuli” through recommendations that could provoke further data-driven inquiries, as well as low-effort mechanisms to search via these pattern instances through result querying. As we will discuss later, this process is crucial but underexplored in past work on VQSs.

Recommendations display visualizations that may be of interest to users based on the current data context. In zenvisage++, recommendations comprise representative trends and outliers, which are useful for understanding common and outlying behaviors when a small number of common patterns is exhibited in the dataset.

Result querying enables users to query for patterns similar to a selected data pattern from the ranked list of results or recommendations. Typically, analysts select visualizations with semantic or visual properties of interest and make use of result querying to understand characteristic properties of similar instances.
Usage Scenario:
G2 does not have upfront knowledge of what to search for. She learns about the characteristic patterns that exist in the dataset through the representative trends, a form of bottom-up inquiry, as a means to jump-start further queries via result querying, as well as understand groups of data instances with shared characteristics.
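The recommendation component described above (representative trends and outliers based on clustering) can be sketched as follows. This is a hedged illustration, not the zenvisage++ implementation: it assumes a plain k-means with a deterministic start, picks the member nearest each centroid as the representative, and flags the series farthest from their own centroid as outliers. All names and the example data in the usage note are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_series(group):
    """Pointwise mean of a non-empty list of equal-length series."""
    return [sum(col) / len(group) for col in zip(*group)]

def kmeans(series, k, iterations=20):
    """Plain k-means; centroids start at the first k series (deterministic)."""
    centroids = [list(s) for s in series[:k]]
    labels = [0] * len(series)
    for _ in range(iterations):
        # Assignment step: each series joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(s, centroids[c]))
                  for s in series]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [s for s, label in zip(series, labels) if label == c]
            if members:
                centroids[c] = mean_series(members)
    return centroids, labels

def recommend(named_series, k, n_outliers=1):
    """Return (representative names, outlier names) for a dict of series."""
    names = list(named_series)
    series = [named_series[n] for n in names]
    centroids, labels = kmeans(series, k)
    representatives = []
    for c in range(k):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            nearest = min(members, key=lambda i: sq_dist(series[i], centroids[c]))
            representatives.append(names[nearest])
    # Outliers: series that sit farthest from the centroid they were assigned to.
    by_distance = sorted(range(len(series)),
                         key=lambda i: sq_dist(series[i], centroids[labels[i]]),
                         reverse=True)
    return representatives, [names[i] for i in by_distance[:n_outliers]]
```

For instance, given a few rising profiles, a few falling profiles, and one erratic profile, `recommend(data, k=2)` would be expected to return one rising and one falling representative and to flag the erratic profile. Note that in this simple sketch an extreme outlier also pulls its cluster centroid toward itself, which a more careful implementation would want to guard against.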
While top-down and bottom-up processes operate on a collection of visualizations with fixed X and Y attributes, context creation operates in the regime where the analyst may be investigating the relationships between multiple different attributes or values of interest. Context creation enables analysts to navigate across different visualization collections to learn about patterns in different regions of the data. The design challenge of context creation is to help users visualize and compare how data changes between these different contexts by constructing visualization collections with different visual encodings (view specification) or different data subsets (slice-and-dice).

View specification settings alter the encoding for all of the visualizations on the VQS currently being examined. This ability to work with different collections of visualizations is useful when the dataset is multidimensional and the axes of interest are unknown. Modifying the view specification offers analysts different perspectives on the data to locate visualization collections of interest.

Slice-and-Dice empowers users to navigate and compare collections of visualizations constructed from different subsets of the data. Data navigation capabilities are essential when the dataset has large numbers of “support attributes” that may be related to the visualization attributes (e.g., geographical location may influence the time series pattern for housing prices). Analysts can either make use of pre-existing knowledge regarding these support attributes to navigate to a data region that is more likely to contain the desired pattern (e.g., filtering to suburbs to find cheaper housing) or discover unknown patterns and relationships between different data subsets (e.g., housing prices are lower in winter than in summer).

Usage Scenario:
M1 recognizes salient trends in his dataset such as inverse or linear correlations, but does not have fixed attributes that he wants to visualize or a pattern in mind to query with. Given a list of physical properties of potential interest, he performs context creation by switching between different visualized attributes to understand the dataset from alternative perspectives. He can also dynamically create different classes of data (e.g., solvents with low solubility or high capacity) to examine their aggregate patterns.

The three aforementioned sensemaking processes are akin to the well-studied sensemaking paradigms of search (top-down), browse (bottom-up), and faceted navigation (context creation) on the Web [15, 34]. Due to each of their advantages and limitations given different information seeking tasks, search interfaces have been designed to support all three complementary acts and transition smoothly between them to combine the strength of all three sensemaking processes. Our evaluation study reveals that this integrative approach not only accelerates the process of visualization discovery, but also encourages hypothesis generation and experimentation.

Evaluation Study Findings

Based on audio, video screen capture, and click-stream logs recorded during our Phase III evaluation study, we performed thematic analysis via open coding to label every event with a descriptive code. Event codes included specific feature usage, insights, provoked actions, confusion, need for capabilities unaddressed by the system, and use of external tools. To characterize the usefulness of each feature, we further labeled whether each feature was useful to a particular participant’s analysis. A feature was deemed useful if it was either used in a sensible and meaningful way to accomplish a task or address a question during the study, or has envisioned usage outside of the constrained time limit during the study (e.g., if data was available or downstream analysis was conducted).
In this section, we will apply our thematic analysis results to understand how each sensemaking process occurs in practice.

(Footnote: See Appendix D for details on our coding protocol.)

To understand the usefulness of different visual querying modalities, we analyzed their frequency of use in our evaluation study. To our surprise, despite the prevalence of sketch-to-query systems in the literature, only two out of our nine participants found it useful to directly sketch a desired pattern onto the canvas. Most participants did not find direct sketching useful because they often do not start their analysis with a specific pattern in mind. Instead, their intuition about what to query is derived from other visualizations they encountered during exploration, in which case it makes more sense to query using those visualizations as examples directly (e.g., by dragging and dropping that visualization onto the canvas to submit the query). Even if a user has a pattern in mind, translating that pattern into a sketch is often hard to do. For example, A2 wanted to search for a highly-varying signal enveloped by a sinusoidal pattern indicating planetary rotation, which was hard to draw by hand.

We further investigated the processes that participants engaged in to construct pattern queries. Pattern queries can be generated by either top-down (sketching based on user’s in-the-head pattern) or bottom-up (drag-and-drop based on what user observes from data) processes. While our study is not intended as a quantitative study with different querying modalities as conditions, we wanted to get an estimate of the relative frequency of different mechanisms across users. We examined the sequence of interactions that led to each pattern query and labeled each one based on one of the five ways it can be generated: two top-down (sketch-to-query, sketch-to-modify) and three bottom-up (result querying via object of interest, via ranked result, or via recommendations); see Appendix Figure 12 for more details. We find that bottom-up processes are 40% more commonly used than top-down processes for generating a pattern query. Within top-down processes, a pattern query could arise from users directly sketching a new pattern or by modifying an existing sketch. For example, M2 first sketched a pattern to find solvent classes with anticorrelated properties (pattern as a straight line with negative slope) without much success in finding a desired match. So he instead dragged and dropped one of the peripheral visualizations similar to his desired one and then smoothed out the noise in the visualization via sketching, yielding a straight line, as shown in Figure 3 (left). M2 repeated this workflow twice in separate occurrences during the study and was able to derive insights. Likewise, A3 was searching for pulsating stars characterized by dramatic changes in the amplitudes of the light curves. She knows that stellar hotspots also exhibit dramatic amplitude fluctuations, but unlike pulsating stars, the variations happen at regular intervals. Figure 3 (right) illustrates how A3 first picked out a regular pattern (suspected starspot), then modified it slightly so that the pattern looks more “irregular” (to find pulsating stars).

Fig. 3: Example of sketch-to-modify, based on canvas traces from M2 (left) and A3 (right). The original drag-and-dropped query is shown in blue and sketch-modified queries in red.

[Table 2. Rows (existing VQSs): TimeSearcher, QuerySketch, QueryLines, SoftSelect, Google Correlate, TimeSketch, SketchQuery, Qetch, Zenvisage, Zenvisage++. Columns (process and component): Pattern Specification, Match Specification, Result Querying, Recommendation, View Specification, Slice-and-Dice, plus Open-Source. Cell checkmarks omitted.]

Table 2: Table summarizing whether key functional components (columns) are covered by past systems (rows, ordered by recency). Column header colors blue, yellow, green represent the three sensemaking processes. Heavily-used features for context-creation and bottom-up inquiry are largely missing from prior VQSs.

The infrequent use of top-down pattern specification was also reflected in the fact that none of the participants queried using an equation. In both astronomy and genetics, the visualization patterns resulted from complex physical processes that could not be written down as equations analytically.
Even in the case of material science, where analytical relationships do exist, it is challenging to formulate patterns as functional forms in a prescriptive manner.

We found that some users employed match specification to remedy undesired results from their top-down pattern queries. While we did not rigorously study the effects of different analytical parameter settings, we observed that more users refined their matches by adjusting the range and degree of approximation, rather than opting for a different similarity metric. This points to future work in developing more flexible and intuitive vocabularies for modifying the match along the research directions pursued in [9, 25] over incorporating additional complex, off-the-shelf matching objectives in VQSs.

Our findings suggest that while sketching is a useful construct for people to express their queries, the existing ad-hoc, sketch-only model for VQSs is insufficient on its own without data examples that can help analysts jumpstart their exploration. In fact, we found that sketch-to-query only accounted for about a fifth of the total number of visual queries performed during the study. This finding has profound implications on the design of future VQSs, since our comparison of VQS features across existing work (Table 2) suggests that past work has primarily focused on top-down process components, without considering how useful these features are in real-world analytic tasks. We suspect that these limitations may be why existing VQSs are not commonly adopted in practice. Note that we are not advocating for removing the natural and intuitive sketch capabilities from future VQSs completely, but instead focusing future research and design efforts to examine other (often underexplored) VQS sensemaking processes. Such processes could be applied in conjunction with sketching to help analysts more flexibly express their analytical goals, described next.
As alluded to earlier, bottom-up data-driven inquiries and context creation are far more commonly used than top-down pattern search when users have no desired patterns in mind, which is typically the case for exploratory data analysis. In particular, top-down approaches were useful in only 29% of the use cases, whereas bottom-up approaches were useful in 70% of the use cases and context creation in 67%. We now highlight some exemplary workflows demonstrating the efficacy of the latter two sensemaking processes.

Bottom-up pattern queries can come from either the ranked list of results, recommendations, or by selecting a particular object of interest as a drag-and-drop query. The most common use of bottom-up querying is via recommended visualizations. For example, G2 and G3 identified that the three representative patterns recommended in zenvisage++ corresponded to the same three groups of genes discussed in a recent publication [13]: induced genes (profiles with expression levels going up), repressed genes (starting high then decreasing), and transients (rising first then dropping at another time point). The clusters provoked G2 to generate a hypothesis regarding the properties of transients: “Is that because all the transient groups get clustered together, or can I get sharp patterns that rise and ebb at different time points?” To verify this hypothesis, G2 increased the parameter controlling the number of clusters and noticed that the clusters no longer exhibited the clean, intuitive patterns he had seen earlier. G3 expressed a similar sentiment and proceeded by inspecting the visualizations in the cluster via drag-and-drop. He found a group of genes that all transitioned at the same timestep, while others transitioned at different timesteps. By browsing through the ranked list of results, participants were also able to gain a peripheral overview of the data and spot anomalies during exploration.
For example, A1 spotted time series that were too faint to look like stars after applying the filter CLASS_STAR = 1, which led him to discover that all stars have been mislabeled with CLASS_STAR =

Context creation in VQSs enables users to change the “lens” by which they look through the data when performing visual querying, thereby creating more opportunities to explore the data from different perspectives. Echoing the sentiment from past studies in visual analytics regarding the importance of designing features that enable users to select relevant subsets of data [1, 16, 24, 45], we found that all participants found at least one of the features in context creation to be useful.

Both A1 and A2 expressed that context creation through interactive filtering was a powerful way to dynamically test conditions and tune values that they would not have otherwise experimented with, effectively lowering the barrier to the iterative hypothesize-then-compare cycle during sensemaking. During the study, participants used filtering to address questions such as: Are there more genes similar to a known activator when we subselect only the differentially expressed genes? (G2) and Can I find more supernovae candidates if I query only on objects that are bright and classified as a star? (A1). Three participants had also used filtering as a way to query with known individual objects of interest. For example, G2 set the filter as gene = this gene is regulated by the estrogen receptor, when we search for other genes that resemble this gene, we can find other genes that are potentially affected by the same factors.”

While filtering enabled users to narrow down to a selected data subset, dynamic classes (buckets of data points that satisfy one or more range constraints) enabled users to compare relationships between multiple attributes and subgroups of data.
For example, M2 divided solvents in the database into eight different categories based on voltage properties, state of matter, and viscosity levels, by dynamically setting the cutoff values on the quantitative variables to create these classes. By exploring these custom classes, M2 discovered that the relationship between viscosity and lithium solvation energy is independent of whether a solvent belongs to the class of high voltage or low voltage solvents. He cited that dynamic class creation was central to learning about these previously unknown attribute properties:

All this is really possible because of dynamic class creation, so this allows you to bucket your intuition and put that together. [...] I can now bucket things as high voltage stable, liquid stable, viscous, or not viscous and start doing this classification quickly and start to explore trends. [...] look how quickly we can do it!

(Footnote: See Appendix D for details on how this measure was computed.)
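The filtering and dynamic class operations discussed in this part of the study can be illustrated with a short sketch. It assumes each data item is a dictionary holding a few quantitative attributes and one series to visualize; the attribute names, cutoff values, and function names are hypothetical and are not taken from zenvisage++.

```python
from collections import defaultdict

def apply_filter(rows, predicate):
    """Keep only the rows that satisfy a filter constraint."""
    return [row for row in rows if predicate(row)]

def assign_class(row, cutoffs):
    """Label a row 'high' or 'low' on each attribute, relative to its cutoff."""
    return tuple("high" if row[attr] >= cut else "low"
                 for attr, cut in cutoffs.items())

def aggregate_by_class(rows, cutoffs, series_key):
    """Group rows into dynamic classes and average their series pointwise."""
    groups = defaultdict(list)
    for row in rows:
        groups[assign_class(row, cutoffs)].append(row[series_key])
    return {label: [sum(col) / len(col) for col in zip(*members)]
            for label, members in groups.items()}

# Hypothetical solvent records; the values are made up for illustration.
solvents = [
    {"name": "s1", "voltage": 4.5, "viscosity": 0.8, "curve": [1.0, 2.0, 3.0]},
    {"name": "s2", "voltage": 4.8, "viscosity": 0.5, "curve": [3.0, 4.0, 5.0]},
    {"name": "s3", "voltage": 2.0, "viscosity": 2.5, "curve": [5.0, 4.0, 3.0]},
]
cutoffs = {"voltage": 4.0, "viscosity": 1.0}  # user-adjustable thresholds
class_trends = aggregate_by_class(solvents, cutoffs, "curve")
high_voltage_only = apply_filter(solvents, lambda r: r["voltage"] >= 4.0)
```

Changing a value in `cutoffs` and recomputing `class_trends` mirrors, in a simplified way, how a participant might move a threshold and immediately compare the aggregate pattern of each resulting class.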
Given our observations so far as to how participants make use of each sensemaking process in practice, we construct a Markov model to further investigate the interplay between these sensemaking processes in the context of an analysis workflow. Markov models have been used in the past by Reda et al. [40] in a similar manner to analyze interaction sequences from open-ended, exploratory analysis evaluation studies. The goal of such analysis is to quantitatively capture users' “transitions between mental, interaction, and computational states” so that researchers can qualitatively characterize the processes and behavioral patterns “essential to insight acquisition” [40].

To compute the state transition probabilities in the Markov model, we make use of event sequences from the evaluation study, where each event consists of labels describing when specific features were used. Using the taxonomy in Table 1, we map each usage of a feature in zenvisage++ to one of the three sensemaking processes. Each participant's event sequence is divided into sessions, each indicating a separate line of inquiry during the analysis. Based on these event sequences (one for each session), we compute the aggregate state transition probabilities (edge weight labels in Figure 4) to characterize how participants from each domain move between different sensemaking processes.

The transition probability represents the probability that an action from one class would be followed by one from the other. For example, in material science, 60% of the transitions that start from bottom-up exploration lead to context creation, and the rest lead to top-down pattern search. Self-directed edges indicate the probability that the participant would continue with the same type of sensemaking process. 
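The computation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the feature-to-process mapping stands in for the paper's Table 1, the event sequence is made up, and the use of a `"BR"` marker to separate sessions is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from feature names to sensemaking processes
# (a stand-in for the paper's Table 1, which is not reproduced here).
FEATURE_TO_PROCESS = {
    "sketch": "top-down",
    "filter": "context",
    "dynamic_class": "context",
    "drag_and_drop": "bottom-up",
}


def split_sessions(events, break_label="BR"):
    """Split one event list into sessions at session-break markers."""
    sessions, current = [], []
    for event in events:
        if event == break_label:
            if current:
                sessions.append(current)
            current = []
        else:
            current.append(event)
    if current:
        sessions.append(current)
    return sessions


def transition_probabilities(sessions):
    """Estimate P(next process | current process) from consecutive events."""
    counts = defaultdict(Counter)
    for session in sessions:
        states = [FEATURE_TO_PROCESS[f] for f in session]
        # Count each adjacent pair within a session; pairs never span sessions.
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


# Made-up event sequence, for illustration only.
events = ["sketch", "sketch", "filter", "BR",
          "drag_and_drop", "dynamic_class", "filter"]
probs = transition_probabilities(split_sessions(events))
```

Pooling the sessions of all participants in a domain before calling `transition_probabilities` would give one aggregate model per domain.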
For example, when an astronomer performs a top-down pattern search, 64% of the transitions lead to another top-down process and the rest lead to context creation; none lead to a bottom-up process. This high self-directed transition probability reflects how astronomers often need to iteratively refine their top-down query through pattern or match specification when looking for a specific pattern.

To study how important each sensemaking process is for participants' overall analysis, we compute the eigenvector centrality of each graph, displayed as node labels in Figure 4. These values represent the percentage of time the participants spend in each of the sensemaking processes when the transition model has evolved to a steady state [36]. Given that nodes in Figure 4 are scaled by this value, we observe that in all domains there is always a prominent node connected to two less prominent ones, but it is also clear that all three nodes are essential to all domains. Our observation demonstrates how participants often construct a central workflow around a main sensemaking process based on their analytical goals and interleave variations with the two other support processes as they iterate on the analytic task. For example, the material scientists focus on context creation 56% of the time, mainly through dynamic class creation, followed by bottom-up inquiries (such as drag-and-drop) and top-down pattern searches (such as sketch modification). The central process adopted by each domain is tightly coupled with the problem characteristics associated with each domain. For example, without an initial query in mind, geneticists relied heavily on bottom-up querying through recommendations to jumpstart their queries.

The Markov transition model exemplifies how participants adopted a diverse set of workflows based on their unique set of research questions. 
The bi-directional and cyclical nature of the transition graphs in Figure 4 highlights how the three sensemaking processes do not simply follow a linear progression towards finding a single pattern or attribute of interest. Instead, the high connectivity of the transition model illustrates how these three equally important processes form a sensemaking loop, representing iterative acts of dynamic foraging and hypothesis generation. This finding reinforces the importance of each sensemaking process and indicates that future VQSs need to be integrative in supporting all three sensemaking processes to enable a diverse set of potential workflows for addressing a wide range of analytical inquiries.

Footnote: Results were broken down by domain, rather than on an individual basis, since the analytical patterns within the domains are very similar (possibly due to the similarity between analytical inquiries and datasets within the domains).

[Figure 4 node values. Astronomy: top-down 44%, context creation 39%, bottom-up 17%. Material Science: context creation 56%, top-down 27%, bottom-up 17%. Genetics: bottom-up 49%, top-down 31%, context creation 20%.]

Fig. 4: Markov models computed based on evaluation study event sequences, with edges denoting the probability that a participant in a particular domain will go from one sensemaking process to the next. Nodes are scaled according to their eigenvector centrality, representing the percentage of time participants would spend in a particular sensemaking process in steady state. The data consists of 206 event actions taken by participants during the study (80 for astronomy, 65 for genetics, and 61 for material science).
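The steady-state interpretation of the node values can be illustrated with a small sketch. It assumes that the reported values correspond to the stationary distribution of the transition matrix (the left eigenvector with eigenvalue 1), computed here by power iteration. The matrix below is made up: only its first row reuses the astronomy numbers quoted in the text (64% top-down, 36% context creation, 0% bottom-up), and the other rows are placeholders.

```python
def stationary_distribution(P, iterations=1000):
    """Power iteration on a row-stochastic matrix P (list of rows).

    Starts from a uniform distribution and repeatedly applies pi <- pi * P,
    which converges to the steady state for an irreducible, aperiodic chain.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Rows and columns are ordered: top-down, context creation, bottom-up.
# Only the first row comes from the text; the rest is illustrative.
P = [
    [0.64, 0.36, 0.00],
    [0.50, 0.30, 0.20],
    [0.40, 0.40, 0.20],
]
pi = stationary_distribution(P)
```

Each entry of `pi` is the long-run fraction of steps spent in the corresponding process, which is the quantity the node sizes in Figure 4 are described as encoding.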
Although evidence from our evaluation study points to the infrequent use of direct sketch, we have not performed controlled studies with a sketch-only system as a baseline to validate this hypothesis. While we employed quantitative comparisons in various analyses throughout this section, our goal is to gain a formative understanding of VQS usage behavior across our small sample. Future studies with larger and more representative samples are required to generalize these findings. The goal of our study is to uncover qualitative insights that might reveal why VQSs are not widely used in practice; further validation of specific findings is out of the scope of this paper. While concerns regarding study results being focused on zenvisage++ must be acknowledged, we note that zenvisage++ is one of the most comprehensive VQSs to date, covering many of the features from past systems and more (as evident from Table 2). We believe that our integrative VQS, zenvisage++, can serve as a baseline for future research in VQSs to evaluate against and build upon. Given that this paper covered three design studies along with one evaluation study, we were unable to cover each domain to the level of detail typically found in a dedicated design study paper. Instead, our focus was to highlight the differences and similarities among these domains relevant to the capabilities required in VQSs. Future longitudinal studies may also help alleviate the novelty effects that participants may have experienced during the evaluation study. While we have generalized our findings beyond existing work by employing three different and diverse domains, our case studies have so far been focused on scientific data analysis with domain experts, as a first step towards greater adoption of VQSs. Other potential domains that could benefit from VQSs include: financial data for business intelligence, electronic medical records for healthcare, and personal data for quantified self. 
These different domains may each pose different sets of challenges (such as designing for novices) unaddressed by the findings in this paper, pointing to a promising direction for future work.

Conclusion
While VQSs hold tremendous promise in accelerating data exploration, they are rarely used in practice. We worked closely with analysts from three diverse domains to characterize how VQSs can address their analytic challenges, collaboratively design VQS capabilities, and evaluate how VQSs are used in practice. Participants were able to use our final system, zenvisage++, for discovering desired patterns, trends, and valuable insights to address unanswered research questions. Based on these experiences, we developed a sensemaking model for how analysts make use of VQSs. We found that sketch-to-query is not as effective in practice as past work may suggest. Beyond sketching, we find that each sensemaking process fulfills a central role in participants' analysis workflows to address their high-level research objectives. We advocate that future VQSs should invest in understanding and supporting all three sensemaking processes to effectively “close the loop” in how analysts interact and perform sensemaking with VQSs.

References
[1] R. Amar, J. Eagan, and J. Stasko. Low-level components of analytic activity in information visualization. In Information Visualization, 2005. INFOVIS 2005. IEEE Symposium on, pp. 111–117. IEEE, 2005. doi: 10.1109/INFOVIS.2005.24
[2] C. R. Aragon, S. S. Poon, G. S. Aldering, R. C. Thomas, and R. Quimby. Using visual analytics to maintain situation awareness in astrophysics. In
Visual Analytics Science and Technology, 2008. VAST’08. IEEE Symposium on, pp. 27–34. IEEE, 2008. doi: 10.1088 / / / /
IEEE Transactions on Visualization and Computer Graphics, 24(1):278–287, 2018. doi: 10.1109/TVCG.2017.2743990
[4] S. Bodker, K. Gronbaek, and M. Kyng. Cooperative design: Techniques and experiences from the Scandinavian scene. chap. 8. L. Erlbaum Associates Inc., Hillsdale, NJ, USA, 1993.
[5] C. Bossen, C. Dindler, and O. S. Iversen. Evaluation in participatory design: A literature survey. In Proceedings of the 14th Participatory Design Conference: Full Papers - Volume 1, PDC ’16, pp. 151–160. ACM, New York, NY, USA, 2016. doi: 10.1145 /
Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing - CSCW ’16, pp. 1533–1545, 2016. doi: 10.1145 /
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 443–452. ACM, 2012. doi: 10.1145 /
Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing - CSCW ’16, pp. 13–25, 2016. doi: 10.1145 /
Visual Analytics Science and Technology (VAST), 2016 IEEE Conference on, pp. 131–140. IEEE, 2016. doi: 10.1109/VAST.2016.7883519
[10] Drlica Wagner et al. Dark energy survey year 1 results: The photometric data set for cosmology. The Astrophysical Journal Supplement Series, 235(2):33, Apr 2018. doi: 10.3847 / / aab4f5
[11] P. Eichmann and E. Zgraggen. Evaluating Subjective Accuracy in Time Series Pattern-Matching Using Human-Annotated Rankings. Proceedings of the 20th International Conference on Intelligent User Interfaces - IUI ’15, pp. 28–37, 2015. doi: 10.1145 /
Show Me the Numbers: Designing Tables and Graphs to Enlighten. Analytics Press, 2012.
[13] B. S. Gloss, B. Signal, S. W. Cheetham, F. Gruhl, D. C. Kaczorowski, A. C. Perkins, and M. E. Dinger. High resolution temporal transcriptomics of mouse embryoid body development reveals complex expression dynamics of coding and noncoding loci. Scientific Reports, 7(1):6731, 2017. doi: 10.1038/s41598-017-06110-5
[14] J. D. Gould and C. Lewis. Designing for usability—key principles and what designers think. Proceedings of the SIGCHI conference on Human Factors in Computing Systems, 28(3):50–53, 1983. doi: 10.1145 /
Search User Interfaces. Cambridge University Press, New York, NY, USA, 1st ed., 2009.
[16] J. Heer and B. Shneiderman. A taxonomy of tools that support the fluent and flexible use of visualizations. Interactive Dynamics for Visual Analysis, 10:1–26, 2012. doi: 10.1145 /
Discovery Science, pp. 441–446. Springer, Berlin, Heidelberg, 2001.
[18] H. Hochheiser and B. Shneiderman. Dynamic query tools for time series data sets: Timebox widgets for interactive exploration. Information Visualization, 3(1):1–18, 2004.
[19] K. Holtzblatt and S. Jones. Contextual inquiry: A participatory technique for system design. chap. 9. L. Erlbaum Associates Inc., Hillsdale, NJ, USA, 1993.
[20] C. Holz and S. Feiner. Relaxed selection techniques for querying time-series graphs. In Proceedings of the 22nd Annual ACM Symposium on User Interface Software and Technology, UIST ’09, pp. 213–222. ACM, New York, NY, USA, 2009. doi: 10.1145 /
Proceedings of the 2008 conference on BEyond time and errors novel evaLuation methods for Information Visualization - BELIV ’08, p. 1, 2008. doi: 10.1145 /
Topics in Current Chemistry, 376, 04 2018.
[23] H. Lam, E. Bertini, P. Isenberg, C. Plaisant, and S. Carpendale. Empirical studies in information visualization: Seven scenarios. IEEE Transactions on Visualization and Computer Graphics, 18(9):1520–1536, 2012. doi: 10.1109/TVCG.2011.279
[24] D. J. Lee, H. Dev, H. Hu, H. Elmeleegy, and A. Parameswaran. Avoiding drill-down fallacies with vispilot: Assisted exploration of data subsets. In Proceedings of the 24th International Conference on Intelligent User Interfaces, IUI ’19, pp. 186–196. ACM, New York, NY, USA, 2019. doi: 10.1145 / /
CHI 08 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1483–1492, 2008.
[27] M. Mohebbi, D. Vanderkam, J. Kodysh, R. Schonberger, H. Choi, and S. Kumar. Google correlate whitepaper. 2011.
[28] M. J. Muller and S. Kogan. Grounded Theory Method in HCI and CSCW. Human Computer Interaction Handbook, pp. 1003–1024, 2012.
[29] M. J. Muller and S. Kuhn. Participatory design. Communications of the ACM, 36(6):24–28, June 1993. doi: 10.1145 /
IEEE transactions on visualization and computer graphics, 15(6), 2009. doi: 10.1109/TVCG.2009.111
[31] J. Nielsen. Usability inspection methods. In Conference Companion on Human Factors in Computing Systems, CHI ’94, pp. 413–414. ACM, New York, NY, USA, 1994. doi: 10.1145 /
User Centered System Design; New Perspectives on Human-Computer Interaction. L. Erlbaum Associates Inc., Hillsdale, NJ, USA, 1986.
[33] C. North. Toward measuring visualization insight. IEEE Computer Graphics and Applications, 26(3):6–9, 2006. doi: 10.1109/MCG.2006.70
[34] C. Olston and E. H. Chi. ScentTrails. ACM Transactions on Computer-Human Interaction, 10(3):177–197, 2003. doi: 10.1145 /
Nucleic Acids Research, 44(13):e120, 2016. doi: 10.1093/nar/gkw446
[36] B. Pierre. Markov chains: Gibbs fields, Monte Carlo simulation, and queues. Springer, 2011.
[37] P. Pirolli and S. Card. The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. In Proceedings of International Conference on Intelligence Analysis, vol. 5, pp. 2–4, 2005.
[38] C. Plaisant. The challenge of information visualization evaluation. In Proceedings of the Working Conference on Advanced Visual Interfaces, pp. 109–116. ACM, 2004. doi: 10.1145 /
Proceedings of the 2008 ACM Conference on Computer supported Cooperative Work, pp. 361–370. ACM, 2008. doi: 10.1145 /
Information Visualization, 15(4):325–339, 2016. doi: 10.1177 /
CHI’05 Extended Abstracts on Human Factors in Computing Systems, pp. 1765–1768. ACM, 2005. doi: 10.1145 /
Participatory Design: Principles and Practices. L. Erlbaum Associates Inc., Hillsdale, NJ, USA, 1993.
[43] M. Sedlmair, M. Meyer, and T. Munzner. Design study methodology: Reflections from the trenches and the stacks. IEEE Transactions on Visualization and Computer Graphics, 18(12):2431–2440, Dec 2012. doi: 10.1109/TVCG.2012.213
[44] H. Sharp, Y. Rogers, and J. Preece. Interaction Design: Beyond Human Computer Interaction. John Wiley and Sons, Inc., USA, 2007.
[45] B. Shneiderman. Dynamic queries for visual information seeking. IEEE Software, 11(6):70–77, 1994. doi: 10.1109 /
Proceedings of the 2006 AVI workshop on BEyond time and errors: novel evaluation methods for information visualization, pp. 1–7. ACM, 2006. doi: 10.1145 /
Effortless data exploration with zenvisage: an expressive and interactive visual analytics system. Proceedings of the VLDB Endowment, 10(4):457–468, 2016. doi: 10.14778 /
The biennial Conference on Innovative Data Systems Research (CIDR), 2017.
[49] M. Wattenberg. Sketching a graph to query a time-series database. In
CHI’01 Extended Abstracts on Human factors in Computing Systems, pp. 381–382. ACM, 2001. doi: 10.1145 /
Proceedings of the 2008 conference on BEyond time and errors novel evaLuation methods for Information Visualization - BELIV ’08, p. 1, 2008. doi: 10.1145 /

In Appendix A, we first describe additional details about the participatory design process, as well as domain-specific artifacts collected from contextual inquiry. Next, in Appendix B, we articulate the space of problems amenable to VQSs and describe how the sensemaking processes (introduced in Section 6) fit into different parts of the problem space. In Appendix C, we provide supplementary information regarding our analysis methods and results for the evaluation study. In Appendix D, we acknowledge the individuals and agencies that have made this work possible.

A Artifacts from Participatory Design
Information about each participant can be found in Table 3.
Domain            ID  Dataset         Participated in Design  Position      Years of Experience  Dataset Familiarity
Astronomy         A1  DES             ✓                       Researcher    10+                  3
Astronomy         A2  Kepler                                  Postdoc       8                    5
Astronomy         A3  Kepler                                  Postdoc       8                    5
Genetics          G1  Mouse           ✓                       Grad Student  4                    4
Genetics          G2  Cancer                                  Grad Student  2                    2
Genetics          G3  Mouse           ✓                       Professor     10+                  2
Material Science  M1  Solvent (8k)    ✓                       Postdoc       4                    5
Material Science  M2  Solvent (Full)  ✓                       Professor     10+                  5
Material Science  M3  Solvent (Full)  ✓                       Grad Student  3                    5

Table 3: Participant information. The Likert scale used for dataset familiarity ranges from 1 (not familiar) to 5 (extremely familiar).

During the contextual inquiry, participants demonstrated the use of domain-specific tools for conducting analysis in their existing workflow, including:
• Image Cutout Service (Astronomy)
• Short Time-series Expression Miner (Genetics)
• Solubility Database (Material Science)

Fig. 5: Screenshots from contextual inquiry. Left: A1 performs data smoothing to clean the data and then examines a light curve manually using a Jupyter notebook. Right: G2 uses domain-specific software to perform clustering and visualize the outputs.
[Figure 6 panels, one per domain, with rows for desired insights and challenges.]
Astronomy (luminosity over time in days): Discover rare astronomical objects with specific pattern properties in a large dataset containing noisy, non-uniform time series data.
Genetics (expression level over time in hours): Understand characteristic profiles amongst a large number of genes that can rise and peak at different time points.
Material Science: Identify battery candidates from a large, noisy, multidimensional dataset by comparing functional relationships and tradeoffs between multiple attributes.
Fig. 6: Desired insights, problem and dataset challenges for each of the three application domains in our study.

Our collaboration with participants is illustrated in Figure 7, where we began with an existing VQS (Zenvisage, as illustrated in Figure 8) and incrementally incorporated features, such as dynamic class creation (Figure 9), throughout the PD process.
[Figure 7: timeline with one lane each for Genetics, Material Science, and Astronomy, marking the start of each collaboration, the features added during development (for example, dynamic class creation and query by input equations), and the evaluation study.]
Fig. 7: Timeline for progress in participatory design studies.

Fig. 8: The existing Zenvisage prototype allowed users to sketch a pattern in (a), which would then return (b) results that had the closest Euclidean distance from the sketched pattern. The system also displays (c) representative patterns obtained through K-Means clustering and (d) outlier patterns to help the users gain an overview of the dataset.

Fig. 9: Example of dynamic classes. (a) Four different classes with different lithium solvation energies (li) and boiling point (bp) attributes based on user-defined data ranges. (b) Users can hover over the visualizations for each dynamic class to see the corresponding attribute ranges for each class. The visualizations of dynamic classes are aggregated across all the visualizations that lie in that class based on the user-selected aggregation method.

B Characterizing the Problem Space for VQSs
We now characterize the space of problems addressable by VQSs and describe how each sensemaking process fits into the different problem areas that VQSs aim to solve. Visual querying often consists of searching for a desired pattern instance (Z) across a visualization collection specified by some given attributes (X, Y). Correspondingly, we introduce two axes depicting the amount of information known about the visualized attributes and the pattern instance, as shown in Figure 10.

Along the pattern instance axis, the visualization that contains the desired pattern may already be known to the analyst, exist as a pattern in the head of the analyst, or be completely unknown to the analyst. In the known pattern instance region (Figure 10 grey cell), systems such as Tableau, where analysts manually create and examine each visualization one at a time, are better suited than VQSs, since analysts can directly work with the selected instance without having to search for which visualization exhibits the desired pattern. 
We define top-down pattern search as the process where analysts query a fixed collection of visualizations based on their in-the-head pattern (Figure 10 blue). On the other hand, bottom-up data-driven inquiries (Figure 10 green) are driven by recommendations or queries that originate from the data (or equivalently, the visualization), since the pattern of interest is unknown and external to the user.

The second axis, visualized attributes, depicts how much the analyst knows about which X and Y axes they are interested in visualizing. In both the astronomy and genetics use cases, as well as past work in this space, the attribute to be visualized is known, as data was in the form of a time series. In the case of our material science participants, they wanted to explore relationships between different X and Y variables. In this realm of unknown attributes, context creation (Figure 10 yellow) is essential for allowing users to pivot across different visualization collections.
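The two-axis characterization above can be read as a small decision table. The sketch below is only an illustration: the function name, the string labels, and the precedence between the rules are assumptions. In particular, the text does not say how a known pattern instance combined with unknown attributes should be treated, so this sketch treats any known pattern instance as unfit for a VQS.

```python
def sensemaking_process(attributes_known: bool, pattern: str) -> str:
    """Map a point in the problem space to the process that addresses it.

    pattern is one of "known", "in-the-head", or "unknown".
    """
    if pattern not in {"known", "in-the-head", "unknown"}:
        raise ValueError(f"unexpected pattern knowledge: {pattern!r}")
    if pattern == "known":
        # Assumption: a known instance can be inspected directly,
        # so searching with a VQS is unnecessary.
        return "unfit for VQS"
    if not attributes_known:
        # Unknown X and Y attributes call for pivoting across collections.
        return "context creation"
    if pattern == "in-the-head":
        return "top-down pattern search"
    return "bottom-up data-driven inquiry"
```

For example, a time series use case where the analyst has a shape in mind corresponds to `sensemaking_process(True, "in-the-head")`.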
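[Figure 10: grid with Vis-Attributes (X, Y) on one axis (Known, Unknown) and Pattern Instance (Z) on the other (Known, In-the-head, Unknown). Regions: Top-down Pattern Search, Bottom-up Data-driven Inquiry, Context Creation, and Unfit for VQS.]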
Fig. 10: The problem space for VQSs is characterized by how much the analyst knows about the visualized attributes and the pattern instance. Colored areas highlight the three sensemaking processes in VQSs for addressing these characteristic problems. While prior work has focused solely on use cases in the blue region, we envision opportunities for VQSs beyond this to a larger space of use cases covered by the yellow and green regions.

C Evaluation Study Protocol
Here, we detail the procedures that were conducted during the evaluation study. At the beginning of the study, participants were asked a set of pre-study survey questions to collect basic information about their datasets, scientific questions, and existing workflows. While this information was similar to that collected through participatory design and contextual inquiry (Section 4), the pre-study survey ensured that we had background information even for the “blank-slate” participants (who were not part of the earlier design study).
• What is your current role as a scientist? What are some examples of recent questions you have researched?
• Describe the workflow that you currently use to analyze and make sense of this type of data.
• Can you describe an interesting finding you found with your current workflow and the process you took to obtain this insight?

After the tutorial and overview of the system, each participant's selected dataset was loaded. Participants were asked about their familiarity with the dataset and their analytical goals for the session.
• On a scale of 1-5, how familiar are you with this dataset? How long have you been working with this dataset? If you have worked with this dataset before, is there any insight that you already know from this dataset?
• What is your goal for this dataset? What are you hoping to accomplish with this dataset?

During the main experiment, participants engaged in talk-aloud exercises as they explored their data. 
The following two semi-structured interview questions were often posed when participants began a new line of analytical inquiry.
• What is your current goal in this phase of the exploration? What type of insights are you hoping to obtain?
• What actions are you planning to perform? How do you plan to operationalize those goals?

In addition, we occasionally reminded participants that they could ask for help with anything they wanted to accomplish in zenvisage++ but were not sure how to perform. They were also encouraged to use other tools in their existing workflow alongside zenvisage++ to perform their analysis.

At the end of the study, we interviewed participants with a set of open-ended questions regarding their experience with zenvisage++, including:
• How was zenvisage++ different from your existing workflow?
• Can you describe how you would use zenvisage++ in your current workflow?
• On a scale of 1-10, how interested would you be in adopting this tool for your day-to-day workflow?
• What were some insights that you have gained from today's session?
• Given the insights that you have obtained from zenvisage++, are there any additional analyses that you will run downstream before you publish these results? Describe these additional downstream analysis steps.
• What are the pros and cons of using zenvisage++?
• Were there any queries that you were unable to address with zenvisage++ during today's session?
• What are additional features in zenvisage++ that would help with your scientific workflow or serve your scientific need?

D Evaluation Study Analysis Details
We analyzed the transcriptions of the evaluation study recordings through open coding and categorized every event in the user study using the following coding labels:
• Insight (Science) [IS]: Insight that connected back to the science (e.g. “This cluster resembles a repressed gene.”)
• Insight (Data) [ID]: Data-related insights (e.g. 
“A bug in my data cleaning code generated this peak artifact.”)
• Provoke (Science) [PS]: Interactions or observations that provoked a scientific hypothesis to be generated.
• Provoke (Data) [PD]: Interactions or observations that provoked further data actions to continue the investigation.
• Confusion [C]: Participants were confused during this part of the analysis.
• Want [W]: Additional features that the participant wants, which are not currently available in the system.
• External Tool [E]: The use of external tools outside of zenvisage++ to complement the analysis process.
• Feature Usage [F]: One of the features in zenvisage++ was used.
• Session Break [BR]: Transition to a new line of inquiry.

Domain    IS  ID  PS  PD  C  W   E   BR  F
astro     4   12  13  57  2  18  20  22  67
genetics  8   12  7   35  4  13  1   21  52
mat sci   14  8   7   44  8  11  3   12  48

Table 4: Count summary of thematic event codes across all participants of the same domain.

In addition, based on the usage of each feature during the user study, we categorized the features into one of three usage types:
• Practical [P]: Features used in a sensible and meaningful way.
• Envisioned usage [E]: Features that could be used practically if the envisioned data were available or if participants conducted downstream analysis that was not performed due to the limited time during the user study.
• Not useful [N]: Features that are not useful or do not make sense for the participant's research question and dataset.

The feature usage labels for each user are summarized in Figure 11. A feature is regarded as useful if it has a P or E code label. Using the matrix from Figure 11, we compute the percentage of useful features for each sensemaking process as:

(number of P or E labels for features in the process) / (number of features in the process × total number of participants)
[Figure 11: heatmap. Rows (features): Sketch-to-query, Sketch-to-modify, Pattern Upload, Input Equation, Smoothing, Ignore x-range, Narrow x-range, Change Similarity Metric, Change Axis, Filter, Dynamic Class, Drag and Drop, Change Cluster Size, Representative and Outlier, grouped into Top-Down Specification, Context Creation, and Bottom-up Inquiry. Columns (participants): A1, A2, A3 (astro), M1, M2, M3 (matsci), G1, G2, G3 (genetics). Cell values: Not Useful, Envisioned, Practical.]
Fig. 11: Heatmap of features categorized as practical usage (P), envisioned usage (E), and not useful (N). Columns are arranged in the order of subject areas and the features are arranged in the order of the three foraging acts. Participants preferred to query using bottom-up methods such as drag-and-drop over top-down approaches such as sketching or input equations. Participants found that context creation via filter constraints and dynamic class creation were powerful ways to compare between subgroups or filtered subsets.
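One way to compute a percentage of useful features from a matrix like Figure 11 is sketched below. It assumes the measure is the share of (feature, participant) cells labeled P or E within one sensemaking process. The labels in the example are made up and do not reproduce the study's data, and the function name `percent_useful` is hypothetical.

```python
def percent_useful(labels):
    """Percentage of (feature, participant) cells coded P or E.

    labels maps each feature name to a list of per-participant codes,
    each being "P" (practical), "E" (envisioned), or "N" (not useful).
    """
    total_cells = sum(len(codes) for codes in labels.values())
    useful = sum(
        code in ("P", "E") for codes in labels.values() for code in codes
    )
    return 100.0 * useful / total_cells


# Made-up labels for two features and four participants.
context_creation = {
    "Filter": ["P", "P", "E", "N"],
    "Dynamic Class": ["P", "N", "N", "E"],
}
pct = percent_useful(context_creation)
```

When every feature has one code per participant, `total_cells` equals the number of features multiplied by the number of participants.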
[Figure 12: bar chart titled “The Origins of Pattern Query”. One axis shows usage frequency; the other lists the origins: sketch-to-query, sketch-to-modify, object of interest, ranked result, recommendation. Bars are labeled as top-down or bottom-up.]

Fig. 12: The number of times a pattern query originates from one of the workflows. Pattern queries are far more commonly generated via bottom-up than top-down processes.

Astronomy
  Top-Down (Goal): Discover potential supernovae candidates that exhibit a peak-then-decay pattern.
  Context Creation (Support): Examine data regions that are more likely to have supernovae candidates.
  Bottom-Up (Support): Identify and eliminate sources of data anomalies to improve match accuracy for finding candidates.
Material Science
  Top-Down (Support): Find data classes that follow a desired functional pattern to understand which solvent types exhibit certain tradeoffs and relationships.
  Context Creation (Goal): Compare characteristics from different data classes to find a solvent that satisfies desirable properties.
  Bottom-Up (Support): Understand the overall tradeoffs and relationships between data attributes.
Genetics
  Top-Down (Support): Search and browse for genes belonging to the same cluster.
  Context Creation (Support): Compare known properties of genes belonging to different clusters.
  Bottom-Up (Goal): Understand characteristic pattern profiles in the dataset.

Table 5: Example usage scenarios from each domain for each sensemaking process. We find that our participants typically have one focused goal expressible through a single sensemaking process, but since their desired insights may not always be achievable with a single class of operation, they make use of the two other sensemaking processes to support them in accomplishing their main goal.
E Acknowledgments
We thank Chaoran Wang, Edward Xue, and Zhiwei Zhang, who have contributed to the development of zenvisage++, as well as our scientific collaborators, who provided valuable feedback during the design study. We appreciate the constructive feedback from the anonymous reviewers, which significantly improved the quality of this paper. We acknowledge support from grants IIS-1513407, IIS-1633755, IIS-1652750, and IIS-1733878 awarded by the National Science Foundation, and funds from Microsoft, 3M, Adobe, Toyota Research Institute, Google, and the Siebel Energy Institute. The content is solely the responsibility of the authors and does not necessarily represent the o ffiffi