A Visual Analytics Approach to Debugging Cooperative, Autonomous Multi-Robot Systems' Worldviews
Suyun Bae, Federico Rossi, Joshua Vander Hook, Scott Davidoff, Kwan-Liu Ma
University of California, Davis: Suyun "Sandra" Bae, Kwan-Liu Ma
Jet Propulsion Laboratory, California Institute of Technology: Federico Rossi, Joshua Vander Hook, Scott Davidoff

ABSTRACT
Autonomous multi-robot systems, where a team of robots shares information to perform tasks that are beyond an individual robot's abilities, hold great promise for a number of applications, such as planetary exploration missions. Each robot in a multi-robot system that uses the shared-world coordination paradigm autonomously schedules which robot should perform a given task, and when, using its worldview: the robot's internal representation of its belief about both its own state and other robots' states. A key problem for operators is that robots' worldviews can fall out of sync (often due to weak communication links), leading to desynchronization of the robots' scheduling decisions and inconsistent emergent behavior (e.g., tasks not performed, or performed by multiple robots). Operators face the time-consuming and difficult task of making sense of the robots' scheduling decisions, detecting desynchronizations, and pinpointing the cause by comparing every robot's worldview. To address these challenges, we introduce MOSAIC Viewer, a visual analytics system that helps operators (i) make sense of the robots' schedules and (ii) detect and conduct a root cause analysis of the robots' desynchronized worldviews. Over a year-long partnership with roboticists at the NASA Jet Propulsion Laboratory, we conducted a formative study to identify the necessary system design requirements and a qualitative evaluation with 12 roboticists. We find that MOSAIC Viewer is faster and easier to use than the users' current approaches, and that it allows them to stitch low-level details into a high-level understanding of the robots' schedules and to detect and pinpoint the cause of desynchronized worldviews.
Keywords: Multi-Robot Systems, Human-Subjects Qualitative Studies, Debugging.
Index Terms: I.3.8 [Computer Graphics]: Applications
1 INTRODUCTION
Autonomous multi-robot systems (MRS) are systems with two or more autonomous robots (often referred to as agents) that coordinate and share information so as to perform tasks cooperatively. This cooperation, in particular, drives interest in their potential to perform highly complex tasks in diverse contexts, from search and rescue in hazardous environments [4, 23, 47] to team sports [34, 51], and even space missions [56, 62]. The complexity of these systems also introduces a problem of usability for operations, or operability. Researchers must monitor the behavior of individual agents as well as behavior that emerges from their cooperation [32] and see how small changes to their systems affect the overall system performance.

One area where this emergent complexity can be particularly challenging for operators is distributed scheduling [52], i.e., cooperatively deciding which agent should perform a given task and when. To track even a single task on a single agent, operators need to understand and track inter-task dependencies, the precedence constraints that the agents' schedule must satisfy (e.g., scientific data must be collected before it is analyzed and transmitted to Earth). With distributed scheduling, the effort required to understand the state of the system increases quadratically, as tasks assigned to one agent might be shared with or even performed by others. In addition, the overall system's behavior depends not only on each agent's individual state but also on each agent's belief about the state of other agents and of the environment (i.e., a worldview [30]).

Agents' worldviews introduce a second challenge that further complicates tracking agents' tasks, as worldviews can fall out of sync (often due to weak communication links). This leads to desynchronization of the agents' scheduling decisions and inconsistent emergent behavior (e.g., tasks not performed, or performed by multiple agents). To debug these inconsistent behaviors, operators must pinpoint the source of the desynchronization by comparing every agent's worldview. This process is not only critical for debugging and failure detection purposes, but also enormously time-consuming and difficult: operators need to examine the high-dimensional product of every attribute in every agent's worldview.

While previous research has explored ways to represent the views of single agents using text [14, 22, 49] or superimposed over videos [2, 69], we explore how a visual analytics approach [11] can encode the beliefs agents have of themselves and about the state of other agents. To that end, we engaged in a year-long collaboration with a team of MRS researchers and operators at the NASA Jet Propulsion Laboratory (NASA JPL). The collaboration began with a 10-week formative investigation to identify the core challenges of distributed scheduling, utilizing the MOSAIC distributed scheduling framework [75] as a laboratory to explore this objective. This work was followed by six months of iterative co-design with a core MOSAIC team member to produce MOSAIC Viewer, a visual analytics application that helps operators (i) make sense of the agents' schedules and (ii) detect and conduct a root cause analysis of the desynchronized worldviews. To compare worldviews, MOSAIC Viewer draws inspiration from the diff algorithm, which is commonly used for text comparison [35], to emphasize the differences between agents' worldviews.
Lastly, we demonstrate the effectiveness of our method and system with two case studies and evaluate the application through a qualitative study with 12 roboticists at JPL. The study reveals MOSAIC Viewer is easier and faster to use than the users' current text-based approaches. The study helps to explain how applications like MOSAIC Viewer can support worldview desynchronization debugging. In particular, from our evaluation, we find that our tool supports the following practices:

• System speed and interactivity streamline higher-level analyses;
• Trust for summary displays grew with experience;
• Knowing how is not enough: users need to know "why" in order to backtrace the root causes of the problem;
• Different sets of assumptions affect data interpretation.

This particular design study [63] helps researchers understand the fit between the problem of MRS operators debugging desynchronized worldviews and MOSAIC Viewer. In this paper we contribute: (i) a set of system design requirements based on a year-long formative study with domain experts in multi-robot systems; (ii) a visual analytics tool that helps operators understand and compare agents' worldviews with a comparison technique inspired by the diff algorithm; and (iii) a characterization of how the system supports effective troubleshooting, with evidence gathered from a study of the system.

Table 1: Attributes in every agent's worldview. For every attribute, an agent has a value for itself and the presumed values for the other agents.

| Worldview Attribute | Self | Others | Data Structure |
|---|---|---|---|
| Location | The robot's location | Presumed location of other robots | 2D Coordinates |
| Science Zone | The robot's classification of whether it is in a science zone | Presumed classification of whether other robots are in science zones | Boolean Array |
| Battery Level | The robot's battery level | Presumed battery level of other robots | Ordinal Array |
| CPU Utilization | The robot's CPU level | Presumed CPU level of other robots | Ordinal Array |
| Actions | The actions the robot is currently performing | Actions other robots are believed to be performing | Event Sequence |
| Communication | Bandwidth between self and other robots | Presumed bandwidths between other pairs of robots | Graph |
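To make Table 1 concrete, the sketch below shows one way an agent's worldview could be represented in code. This is a minimal illustration, not MOSAIC's actual data model; the field names and types simply mirror the attribute list and data structures in Table 1.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Worldview:
    """One agent's worldview: for each attribute, an ego value for itself
    plus presumed values for every other agent, keyed by agent ID."""
    agent_id: int
    location: Dict[int, Tuple[float, float]]   # 2D coordinates
    in_science_zone: Dict[int, bool]           # boolean array
    battery_level: Dict[int, int]              # ordinal array
    cpu_utilization: Dict[int, int]            # ordinal array
    actions: Dict[int, List[str]]              # event sequence
    bandwidth: Dict[int, Dict[int, float]]     # graph of presumed link bandwidths

    def ego(self, attribute: str):
        """The value this agent holds about itself for a given attribute."""
        return getattr(self, attribute)[self.agent_id]
```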
2 BACKGROUND
To motivate and situate our work, we first discuss the specific challenges of supervising autonomous MRS and unpack the details of the MOSAIC distributed scheduling framework [75] that we use to explore desynchronized worldview debugging.
2.1 Multi-Robot Systems

In contrast to multi-agent systems [46], which are enacted entirely as software, in this work we focus on multi-robot systems that have to negotiate real-world constraints (e.g., limited and time-varying communication bandwidth and dynamic battery levels) that are often not considered in the multi-agent systems literature [10, 38, 44]. While some MRS researchers investigate MRS that utilize explicit [65] or centralized [78] coordination, we focus on MRS that use shared-world coordination [3, 18], which has proven to be highly popular in field applications [7, 58] due to its simplicity, scalability, and resiliency. In this approach, every robotic agent has a worldview [30], an internal representation of the world and of the other agents' states, that is updated through constant communication with other agents. In our case, Table 1 summarizes the different attributes found in an agent's worldview within the MOSAIC framework (described in Sect. 2.2). Based on its own worldview, each agent independently computes the optimal strategy for all agents, and then executes its own part of the computed strategy. If all agents have the same worldview, this results in coordinated behavior.

While a shared-world approach introduces many benefits, a key complexity it introduces is that, if the agents are unable to communicate with each other, their worldviews can fall out of sync, resulting in uncoordinated decisions (e.g., a task may be performed by two agents, or it may not be performed at all). This issue is especially problematic in harsh environments such as underground caves [17, 42], where ensuring constant reliable communication is infeasible. Therefore, in order to understand the overall behavior of an MRS, it is critical to understand the worldview of each agent. Furthermore, in order to mitigate the effect of worldview desynchronization, an operator must be able to identify the cause (e.g., slow propagation of information on low-bandwidth data links, or the failure of an agent's radio) to plan for corrective action. The concurrent, distributed, and complex components of MRS make the debugging process significantly difficult; previous research has identified that these tasks require considerable attention [25] and would benefit from more appropriate, effective tools [73].
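The dependence of shared-world coordination on synchronized worldviews can be illustrated with a toy example. The assignment rule below is ours for illustration, not MOSAIC's actual scheduler: every agent runs the same deterministic rule over its own copy of the worldview, so identical copies yield a conflict-free assignment, while a single stale belief yields duplicated work.

```python
def assign_task(task: str, worldview: dict) -> int:
    # Deterministic toy rule: give the task to the agent believed to have
    # the highest battery level (ties broken by lowest agent ID).
    battery = worldview["battery_level"]
    return min(battery, key=lambda a: (-battery[a], a))

# Two agents with synchronized worldviews reach the same decision.
wv_a = {"battery_level": {0: 3, 1: 5}}
wv_b = {"battery_level": {0: 3, 1: 5}}
assert assign_task("analyze_sample", wv_a) == assign_task("analyze_sample", wv_b)

# After a missed update, agent 0 still holds a stale belief about agent 1:
wv_a["battery_level"][1] = 1
picks = {assign_task("analyze_sample", wv) for wv in (wv_a, wv_b)}
print(picks)  # {0, 1}: each agent now believes a different robot owns the task
```

Because each agent executes only its own part of the plan it computed, the divergence above leads both agents to claim the same task, exactly the duplicated-work failure mode described in the text.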
2.2 The MOSAIC Scheduler

Within the MOSAIC distributed scheduling framework, each agent can perform a set of navigation tasks and science tasks. Navigation tasks, which model activities such as localization and path planning, are mandatory: all agents must perform them. Science tasks, which model the collection and analysis of scientific observables, are optional. Though individual science tasks are not required for mission success, the objective of the MOSAIC framework is to perform as many science tasks as possible. However, an agent can perform science tasks only if it has the time and energy resources to also guarantee the execution of the mandatory navigation tasks.

Both navigation tasks and science tasks have precedence constraints, enforcing that tasks must be accomplished in sequence. With science tasks, performing analysis of scientific measurements requires that the data be collected first. With navigation tasks, performing localization through visual odometry requires collecting camera images first. Hence, we refer to navigation tasks and science tasks as a "chain of tasks" (i.e., several tasks with a chain of inter-task dependencies). A key advantage of MRS is that agents do not need to accomplish each task all by themselves; they may receive assistance from other agents. An agent in a science zone may request assistance with its navigation tasks in order to free up computational resources for science tasks. Certain computational tasks, such as performing visual odometry and analyzing data, are relocatable to other agents. However, not all tasks are relocatable; for instance, tasks requiring the use of an agent's hardware resources (e.g., capturing images or collecting scientific measurements) are not. The MOSAIC scheduler takes all these constraints into account and computes (i) what optional tasks can be performed and (ii) which agents should perform relocatable tasks, based on the agents' capabilities and the communication links between the agents [75].

In this paper, we consider datasets with ten agents generated by running the MOSAIC scheduler in the loop with a multi-robot simulator that captures the availability of science zones, robot battery levels, and communication link bandwidths, which reflects standard practice in robotics research [53, 74, 75]. This level of simulation fidelity is well-matched to the level of abstraction at which MOSAIC operates; while field testing may present different underlying causes for worldview desynchronization, the effect on the data used in this paper (i.e., disagreement between the agents' worldviews) would be indistinguishable from the output of the simulations. Accordingly, the use of a simulator has a negligible impact on the fidelity of worldview debugging. In these datasets, each agent has six attributes in its worldview (Table 1) and is endowed with an agent ID. Each agent wishes to perform three mandatory navigation tasks; agents in "science zones" also wish to perform three optional science tasks. Each set of tasks has a chain of dependency constraints. One agent, the base station, is a special agent that does not need to perform the navigation or science tasks and is equipped with a faster processor. Its purpose is to help the other agents with its computing capabilities. Lastly, we remark that the scale of ten agents is representative of proposed extra-planetary (i.e., beyond Earth) MRS mission concepts under consideration for the next decade [8, 39, 40, 56].
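As a sketch of the task structure just described, the snippet below encodes a three-step science chain with precedence constraints and relocatability flags. The field names are illustrative; the task names follow the science chain shown in Fig. 2, and which steps are relocatable is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    name: str
    kind: str                  # "navigation" (mandatory) or "science" (optional)
    depends_on: Optional[str]  # predecessor in the chain of tasks, if any
    relocatable: bool          # may another agent execute it?

# A three-step science chain: each task must wait for its predecessor.
science_chain = [
    Task("take_sample",    "science", None,             relocatable=False),  # needs on-board hardware
    Task("analyze_sample", "science", "take_sample",    relocatable=True),   # pure computation
    Task("store_sample",   "science", "analyze_sample", relocatable=True),   # assumed relocatable
]
```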
Table 2: Summary of the participants' backgrounds, current tools, and the extent to which they participated in the year-long formative study.

| Participant | MOSAIC Affiliation | Years of Full-Time Professional Experience | PLT | RT | CLI | DBT | DB-LT | FS | Co-Design | FE | User Study |
|---|---|---|---|---|---|---|---|---|---|---|---|
| P0 | Core | 1 – 5 years | ✓ | ✓ | ✓ | – | – | ✓ | ✓ | ✓ | – |
| P1 | Core | 1 – 5 years | – | ✓ | – | – | – | ✓ | – | – | ✓ |
| P2 | Core | Less than 1 year | – | – | – | – | ✓ | – | – | – | ✓ |
| P3 | Non-Core | 5 – 10 years | ✓ | – | – | – | – | ✓ | – | – | ✓ |
| P4 | Non-Core | Less than 1 year | ✓ | – | ✓ | – | – | – | – | – | ✓ |
| P5 | Core | 10 – 15 years | – | – | ✓ | – | – | ✓ | – | ✓ | ✓ |
| P6 | Non-Core | 1 – 5 years | ✓ | ✓ | – | – | – | ✓ | – | – | ✓ |
| P7 | Non-Core | 1 – 5 years | – | ✓ | ✓ | ✓ | – | – | – | – | ✓ |
| P8 | Core | 1 – 5 years | ✓ | – | ✓ | – | – | ✓ | – | – | ✓ |
| P9 | Non-Core | 1 – 5 years | – | ✓ | – | – | – | – | – | – | ✓ |
| P10 | Non-Core | More than 15 years | – | ✓ | ✓ | – | – | – | – | – | ✓ |
| P11 | Non-Core | 1 – 5 years | – | ✓ | – | – | – | – | – | – | ✓ |
| P12 | Core | 10 – 15 years | – | – | ✓ | – | – | ✓ | – | ✓ | ✓ |

Current tools: PLT: MATLAB/Matplotlib (ad-hoc scripting languages and plotting tools); RT: RVIZ and ROS-based plotting tools [55]; CLI: command-line logging tools; DBT: command-line debugging and backtracing tools (GDB); DB-LT: database-backed logging tools. Participation: FS: Formative Study; FE: Formative Evaluation.
3 RELATED WORK
We present MRS supervision tools used in the robotics community and promising visualization techniques for worldview debugging.
In addition to the various multi-purpose robotics toolkits [55], specialized visual analytics tools have been developed to track robots' task completion [49, 68, 70]. These tools include timeline views that are often organized around two underlying data types. First, we observe "agent-centric" (AC) timelines [12, 36], which map timeline rows to individual agents, foregrounding the tasks performed by each agent (either for themselves or to assist other agents). In contrast, the second approach, "task-centric" (TC) [59], organizes timeline rows around tasks and their dependencies and focuses on when tasks are completed, rather than who is executing them.

We find AC timelines to be an incomplete solution for the problem of distributed scheduling, where tasks offloaded to other agents can be difficult to trace. We find a similar limitation with TC timelines for distributed scheduling problems, where backtracing tasks with dependencies to explain failures can be difficult. To that end, MOSAIC Viewer uses an AC-TC hybrid timeline. The timeline includes glyphs that visually encapsulate the completion status of individual tasks and foregrounds task dependencies using interactions [21, 37, 77] (see Sect. 5.3 for more details about the timeline). Even with these adaptations, the timeline view is necessary but not sufficient to support the many tasks robotics researchers and operators face when debugging unexpected behaviors due to worldview desynchronization. To complete the application, we turn our attention to views designed to support worldview debugging.
To mitigate the effect of worldview desynchronization, an operator must be able to identify the root cause of the desynchronization condition (Sect. 2.1). Researchers have investigated methods that display worldview state variables using structured text, through logfile analysis [22] and watchpoints [14], similar to those found in software development IDEs. These initial explorations provide utility, but at the same time require considerable attention and focus [54, 72]. One objective of our research is to explore a more expressive and lower-cognitive-load approach for users to examine the high-dimensional product of agents' worldviews. Visual tracking [2, 69], which overlays line graphs [2] and glyphs [69] that describe an agent's state on top of a video of agents performing tasks, is another approach. While this approach can effectively show the state of individual agents, MRS worldview debugging requires operators to understand agents' beliefs about other agents' states as well.

To the best of our knowledge, no research within the visualization community has explicitly looked at the problem domain of distributed MRS worldview debugging. However, researchers have explored various representations and interaction strategies for data with similar underlying representations. For example, as worldviews have multiple attributes, we build on visual comparison techniques rooted in multivariate data research [43]. In particular, we rely on visual comparison techniques [26] that include explicit encoding to represent the system's consensus on an agent's view of its own state, and juxtaposition to highlight the differences among worldviews. The same strategies for comparing parameters of a dataset have been applied across a number of domains, from malware sampling [27] to time series data [50].

Another thread that defines analytics tools performing multivariate comparisons is the actual algorithm they select to highlight parameter differences. DiffMatrix [66], for example, highlights the difference between two parameter values using the arithmetic subtraction operator, while OnSet [60] uses the union and intersection set operators. Our work, like Vdiff [6] and TACO [50], applies the diff algorithm [35] from text processing (see Sect. 5.2 for details). The system we present extends the work on multivariate comparison into the domain of distributed MRS worldview debugging and contributes a detailed analysis of the fit of our approach to that domain. We now turn our attention to the formative research that informed our understanding of the problem.
4 FORMATIVE STUDY
MOSAIC Viewer is the product of a year-long engagement with its users, organized around three distinct phases. Table 2 summarizes the phases of the engagement and describes how each of the 13 participants engaged with this project. (Note: P0 is a superuser who guided the design process, but did not participate in the user study.) This section describes the first and second phases: a 10-week formative study with 7 domain experts and a 6-month co-design study with 1 domain expert. The third phase (Sect. 6.1) is a formal user study that evaluates how well MOSAIC Viewer supports visual debugging for worldview desynchronization.
The objectives of the first phase, a formative study, were to gain a deeper understanding of the core challenges of distributed scheduling and of the users' needs. The study was organized around an initial contextual inquiry [33], which allowed the research team to observe the roboticists at work in their own environments. For four weeks, the protocol consisted of six sessions of 90 minutes of semi-structured interviews and an artifact walk-through where roboticists shared the tools and processes that define their MRS work practice. The research team took notes and captured images and video recordings to highlight observations for post-analysis.

With a preliminary understanding of the problem domain, the research team then elaborated 12 paper prototypes [57] in order to test initial ideas early and quickly. After four rounds of user testing with all users, these prototypes evolved into higher-fidelity code-based prototypes with real MRS data. From these prototyping sessions, we identified that comparing agents' worldviews is a core component we needed to address in order to help roboticists make sense of the agents' schedules. Then, in the second phase, we engaged in 6 months of detailed co-design sessions [29] with P0 to unpack the specific challenges of worldview comparisons. We found that no existing visualizations met the needs of the problem, and we collaborated with P0 to iteratively prototype different visual representations that compare agents' worldviews. The past designs can be found in the supplementary material. After identifying a potential design, we conducted a formative evaluation [16] with three roboticists, where we revised the system design based on their feedback, such as introducing the hierarchical interactions that we list in Sect. 5.3.

Figure 1: A 50-inch and a 70-inch display. The 15 terminal windows allow P1 and 2 other researchers (not pictured) to debug 3 robots.
The materials captured during phase 1 were annotated with themes, which were then grouped using an emergent theme analysis [1]. One key takeaway of the analysis reveals that roboticists invested substantial amounts of effort and time to track the state of each agent's worldview in order to interpret which tasks are scheduled for which robot. To illustrate, we observed examples where roboticists would need to open 3 or 4 terminal windows of messaging logs per agent and would then recruit multiple expert colleagues, each tasked to monitor a single agent on their own screens (Fig. 1). To elaborate a shared understanding of all the worldviews, including the divergences, the roboticists would collectively, as P1 shares, "shout out the state of what they're seeing" as they visually scan their respective displays to find relevant information.

Roboticists must first assess worldview synchronization, as this dictates the flow for the rest of their analysis. If roboticists conclude that the worldviews are synchronized, they then assess the system's performance (i.e., how many tasks are being performed, and by whom). To perform this objective, we observed researchers first engage in a TC perspective to quickly determine the state of the agents' tasks: they did not seek the low-level details of how a task had been accomplished. Conversely, if an agent failed, researchers switched to an AC perspective to seek the low-level details of an agent's activities that explained the failure. However, if roboticists concluded that agents' worldviews were desynchronized, they would transition into a debugging mode to plan for corrective action (i.e., determine who is out of sync and reason about why the desynchronization occurred). Roboticists would read out single states within each of their robots' worldviews and check for agreement about each state before moving on to the next worldview.
Table 3: Researchers' goals and subgoals.

| Goal | Subgoals |
|---|---|
| Assess worldview synchronization | 1. If desynchronized, who is out of sync with whom? 2. What is the cause of the desynchronization? |
| Make sense of the scheduler's output | 1. Who is doing what? 2. Are agents accomplishing their navigation tasks? 3. Are agents maximizing their science tasks? |
Phase 1 identified that worldview debugging requires considerable cognitive effort, supporting our decision to investigate how a visual analytics tool can support these sets of tasks. In phase 2, over months of iterative co-design with P0, we elaborated on the specific challenges researchers face when debugging MRS (summarized in Table 3). We use these questions to devise a set of requirements for how a visual analytics system can provide the needed support:
R1. Display worldview synchronization state: Determining if the worldviews are in sync is the first step of an in-depth analysis. The system's visual encoding should indicate whether agents are in sync and highlight those who are not.

R2. Support system performance assessment: System performance is based on the number of accomplished tasks. With n agents, the number of tasks and of task transfers can be high. To track who is doing what, the system should (i) differentiate the different tasks agents can perform and (ii) offer the flexibility to switch analytical perspectives (i.e., AC, TC).

R3. Show the differences and similarities of the worldviews: When worldview desynchronization occurs (R1), researchers will need to detect who is out of sync in order to plan for corrective action. Detecting out-of-sync agents requires understanding the similarities and differences of all the worldviews.

R4. Help conduct a root cause analysis of the desynchronized worldviews: After determining which agents are out of sync (R3), operators will need to conduct a root cause analysis to reason about why the desynchronization occurred and plan for corrective action. The system should provide the ability for users to collate and fuse pieces of information in order to properly diagnose the desynchronization condition.
5 APPLICATION
To support analyzing and debugging MRS, we designed a system composed of two components: the Main View (Fig. 2) and the Differential Worldview Comparison (DWC) (Fig. 4). The two components are laid out side by side. Users can first use the Main View to assess whether agents' worldviews are synchronized. If they conclude worldviews are synchronized, they can proceed to assess system performance with the various views displayed in the Main View. Otherwise, users can use DWC to identify worldview differences. After identifying and selecting misbehaving agents in DWC, users can engage with the Main View to conduct a root cause analysis.
5.1 Main View

The Main View (Fig. 2) consists of a summary overview, a scatterplot, a graph, a task abstraction, and a timeline.
Summary Overview.
The summary overview (Fig. 2a) provides a quick assessment of agents' performance of science tasks and worldview synchronization (R1, R2). As navigation tasks are required, the operator can infer how well the MOSAIC scheduler is performing by assessing the number of optional science tasks that agents can accomplish. Thus, for each task in the chain of science tasks, we display how many agents have executed the respective task as a fraction of all eligible agents. In Fig. 2a, 2 out of 4 agents have accomplished the first two science tasks. Of the two, only one has accomplished the third science task (either on board or by delegating it). Below, Worldview Synchronization provides an overview of the agents' worldview synchronization. To help users determine worldview synchronization, if the system detects a desynchronization, it outputs a warning (R1). Users can then turn to DWC for further analysis. To show what is desynchronized, the Worldview Synchronization displays DWC's visual representation for three worldview attributes ('CN': Communication Network, 'BT': Battery Level, and 'SZ': Science Zone). We discuss this further in Sect. 5.2.

Figure 2: The operator is using the Main View to evaluate the system's performance. (a) The summary overview shows the state of the science objectives and the worldview synchronization. (b) The scatterplot abstracts the "behavior" of agents, with the x- and y-axes encoding average CPU load and battery level, respectively. (c) The graph depicts the agents' locations and communication links. (d) The Task Abstraction provides a task-centric perspective of each agent's tasks; squares represent navigation tasks, triangles represent science tasks. (e) The timeline shows agents' activities. Agent 5's science chain of tasks is highlighted in pink. (Note: other agents' timelines are highlighted in different colors for illustrative purposes for the case study in Sect. 6.3.)

Scatterplot.
The scatterplot (Fig. 2b) is broken into four quadrants, with the x- and y-axes showing average CPU load and battery level, respectively. The quadrants are colored with four categorical colors to help operators abstract agents' behavior at a glance with respect to the current time step (R2). CPU load, in particular, shows how busy a given agent is. This information, in addition to battery level, provides two indications: (1) it identifies over-subscribed agents and "bottlenecks"; (2) it identifies agents that are doing the majority of the work for the overall system. For example, agents in the blue quadrant are considered "lazy": with a high battery level and low average CPU load, these agents could be given more tasks if the communication topology allowed for it. In contrast, agents in the yellow quadrant have a low battery level and high average CPU load, indicating they are "overworked". The middle portion of the scatterplot is colored grey, with the center colored black, in order to emphasize extreme behavior.
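A sketch of the quadrant logic follows, assuming both axes are normalized to [0, 1] with 0.5 as the quadrant boundary; the labels for the two remaining quadrants are our own placeholders, as the text names only "lazy" and "overworked".

```python
def quadrant(avg_cpu_load: float, battery_level: float) -> str:
    """Classify an agent by which scatterplot quadrant it falls into."""
    if battery_level >= 0.5 and avg_cpu_load < 0.5:
        return "lazy"              # plenty of battery, little work assigned
    if battery_level < 0.5 and avg_cpu_load >= 0.5:
        return "overworked"        # low battery, high load
    if battery_level >= 0.5 and avg_cpu_load >= 0.5:
        return "busy but healthy"  # placeholder label
    return "depleted"              # placeholder label

print(quadrant(avg_cpu_load=0.2, battery_level=0.9))  # lazy
```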
Graph.
In the graph (Fig. 2c), each agent is represented as a circle with its agent number in the center. The base station is denoted as 'ST'. The colored ring correlates to the quadrant in which an agent is positioned in the scatterplot. The edges between agents represent communication links, and the edge weight encodes the available communication bandwidth. The graph has two kinds of regions to represent the agents' physical environment: solid dark grey filled-in regions are 'science zones', while diagonal hashed regions represent 'communication cut-off zones'. Communication links that cross a communication cut-off zone are severed, simulating the effect of line-of-sight obstructions (e.g., hills). The graph provides basic interactions to zoom in, pan, and see details on demand (i.e., a tooltip) when hovering over nodes or edges. Users can also click on an agent to see its immediate edges in pink, while the opacity of the other edges in the network is lowered.
Task Abstraction and Timeline.
In the Task Abstraction (Fig. 2d), each row represents an agent, and a colored circle next to the agent name corresponds to the quadrant in which the agent is placed in the scatterplot. To help distinguish tasks, there are two sets of shapes: squares represent navigation tasks and triangles represent science tasks (R2). Each chain of tasks is composed of three steps, and for every agent, the respective shape is filled in if the task has been accomplished. This allows operators to quickly assess whether agents have accomplished their navigation tasks and how many science tasks have been performed, from a TC perspective (R2). However, this task abstraction does not tell which agent has completed the task or when it was accomplished. To that end, operators can use the timeline, which shows each individual agent's activity (Fig. 2e) as horizontal bars in seconds. The length of a bar represents the duration of a task, and its position maps when a task started and ended. The shape at the beginning of each task represents the task type (navigation or science), and inside each shape is either a number that represents the nth step of its respective chain of tasks or an asterisk (*) to denote task relocation. At first sight, it is not evidently clear that a task in an agent's timeline may have been relocated from another agent for assistance (e.g., Agent 0 is performing Agent 4's second navigation task at t = …).

5.2 Differential Worldview Comparison

Differential Worldview Comparison (DWC) enacts the concept of the diff algorithm for text comparison [35] to compare the agents' worldviews. For each line in two text files, the diff algorithm either generates nothing, where the two lines are the same, or shows a side-by-side comparison of the two lines, where they differ.
Figure 3: The process from the raw data to DWC. This figure focuses on Agent 2's battery level (Fig. 4). (a) The agents' view of a single attribute is represented as X, an n × n matrix. After applying our variant of the diff function to X, we obtain the Difference Matrix Y (b), where each entry is categorized as either State 1, 2, or 3. We compute the number of occurrences of each state. (c) The information is transformed into DWC.
Figure 4: Differential Worldview Comparison: (BT) Battery Level Panel, (SZ) Science Zone Panel, and (CN) Communication Network Panel.

Analogously, for each attribute in the agents' worldviews, we introduce a variant of the diff function that compares each agent's presumed value for an agent's attribute (e.g., agent B's presumed value for agent A's location) with the ego value of the said attribute (i.e., the value the respective agent has determined for itself; see Table 1). We compare to an agent's ego value because we follow the strong assumption that every agent knows its own state the best. Hence, an agent's ego value for each attribute acts as the basis for comparison. If an agent's presumed value corresponds to the respective ego value, DWC shows nothing; however, if they differ, DWC shows the presumed value. Fig. 3 provides a walk-through of the process from the raw data to DWC.

Of the six attributes listed in Table 1, we focus on the three attributes (battery level, presence in a science zone, and communication network bandwidths) that have a direct impact on the agents' scheduling decisions and are able to explain the majority of the desynchronization scenarios. For each agent i, each individual worldview attribute (e.g., battery level) can be represented as a 1D array $x_i$ with n entries, where n represents the total number of agents (Fig. 3a). The j-th entry in the array, $x_{ij}$, represents agent i's belief about the state of agent j's attribute. For battery level, the entry is an integer; for science zone, it is a boolean; and for communications, the entry is a list of bandwidths from agent j to all other agents. Once all n arrays are concatenated, this becomes an n × n matrix X. In X, row i represents agent i's beliefs about the attribute, and the entry $x_{ij}$ represents agent i's belief about the state of agent j's attribute value. The value $x_{ii}$ is denoted as the ego value for the attribute, as this entry represents what agent i thinks about itself. For example, the highlighted column in Fig. 3a represents every agent's belief about the state of Agent 2's battery level (encoded by color), and $x_{22}$ represents Agent 2's ego value for its battery level.

After compiling X for each attribute, we apply our variant of the diff function to every column j in X, comparing every entry $x_{ij}$ to the column's ego value:

$$\mathrm{diff}(x_{jj}, x_{ij}) = \begin{cases} \text{None}, & \text{if } x_{jj} = x_{ij} \\ x_{ij}, & \text{otherwise.} \end{cases} \tag{1}$$

If $x_{jj} = x_{ij}$, the function does not return anything; otherwise, the value is $x_{ij}$ (i.e., agent i's presumed value for agent j). Once we compute the diff function for every column, the output values are used to create the Difference Matrix Y (Fig. 3b). In Y, each entry is categorized as one of three states: (i) State 1 is the ego view ($y_{ii}$), (ii) State 2 entries ($y_{ij} = \text{None}$) agree with the ego view, and (iii) State 3 entries ($y_{ij} \neq \text{None}$) disagree with the ego view. Then, for every column j in Y, we compute the Similarity Sum and Difference Sum to count the number of entries labeled as State 2 and State 3, respectively:

$$\text{Similarity Sum} = \sum_{i=1}^{n} \left[\, y_{ij} = \text{State 2} \,\right], \tag{2}$$

$$\text{Difference Sum} = \sum_{i=1}^{n} \left[\, y_{ij} = \text{State 3} \,\right]. \tag{3}$$

We now have all the information we need to create DWC.
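The following sketch implements Equations (1) to (3) for one worldview attribute, assuming discrete values compared by exact equality; X[i][j] is agent i's belief about agent j, and the diagonal holds the ego values.

```python
def diff_matrix(X):
    """Equation (1), applied column-wise. Y[i][j] is the ego value on the
    diagonal (State 1), None where agent i agrees with agent j's ego
    value (State 2), and agent i's dissenting belief otherwise (State 3)."""
    n = len(X)
    return [[X[i][j] if (i == j or X[i][j] != X[j][j]) else None
             for j in range(n)] for i in range(n)]

def column_sums(Y, j):
    """Similarity Sum (Eq. 2) and Difference Sum (Eq. 3) for column j."""
    similarity = sum(1 for i in range(len(Y)) if i != j and Y[i][j] is None)
    difference = sum(1 for i in range(len(Y)) if i != j and Y[i][j] is not None)
    return similarity, difference

# Ten agents' beliefs about everyone's discrete battery level.
X = [[3] * 10 for _ in range(10)]
X[7][2] = 1                # agent 7 holds a stale belief about agent 2
Y = diff_matrix(X)
print(column_sums(Y, 2))   # (8, 1): 8 of the 9 others agree with agent 2's ego value
```

With ten agents, each column aggregates the nine off-diagonal beliefs, which matches the x/9 fractions displayed in the panels of Fig. 4.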
DWC displays each attribute as a grid-like panel (Fig. 4). Every panel has n adjacent columns, each representing beliefs about an agent. Each column j is composed of the same four components that make up DWC: Ego View, Similarity View, Difference View, and Deltas. Every panel also contains a 'Delta Line'. Below the Delta Line, the bottom portion of every panel is the 'Summary View' that summarizes the system's synchronization state, and the bottom-most component is the 'Ego View', which visualizes an agent's ego value.

The visual representation of the ego value is specific to each attribute. For Communication Network, the ego view of each agent's communication network (i.e., the bandwidths from the agent to all other agents) is represented as a 1D array of length n. The k-th entry in the array represents the communication bandwidth from agent j to agent k; we encode the bandwidth value with a purple-to-red sequential colormap [31]. In contrast, the data types used in the Battery Level Panel and Science Zone Panel are ordinal and boolean, respectively; for these, we use a square mark to represent the Ego View. Ordinal values are encoded with an orange sequential colormap, and boolean values are encoded with two shades of teal.

Next, above the Ego View, the 'Similarity View' displays the similarities of each agent's worldview through a piling metaphor [5], where the number of horizontal lines represents the Similarity Sum (Equation 2). This information is also captured as a fraction below the Ego View. The piling metaphor fits the needs of the visualization, as it visually aggregates information, reducing visual noise and allowing more emphasis on the differences. This design decision is also based on what we learned from the formative study: researchers are more interested in identifying the differences between agents' worldviews than the similarities.

The next component of DWC is the 'Difference View', which is adjacent to the Similarity View. The Difference View is represented by diagonal hashed lines, and it complements the Similarity View. From the Difference Sum (Equation 3), the height of the Difference View indicates how many agents disagree with an agent's ego value (R1). To see who the contrarians are and what they believe, we look at the top portion of DWC: the Detail View (R1, R3). Above the Delta Line, the Detail View has n rows representing the n agents. For each column j, rows corresponding to an agent that disagrees with the column's ego view (i.e., rows corresponding to an entry in State 3 in Y) report the contrarian agent's belief about agent j's worldview. For example, in Fig. 4 (BT), Agent 7 disagrees with Agents 1 and 4's ego views of their battery levels. The Detail View displays a different color compared to the respective Ego View's at the bottom. By default, rows corresponding to agents that agree with the Ego View are blank, in line with the diff metaphor. However, users can also show values that correspond to the Ego View by toggling the Similarity View.

Our decision to focus on representing the agents' view of each attribute as a single matrix is based on lessons from past designs in the co-design sessions. The past designs layered various visual encodings for each attribute, and users found it difficult to make sense of the layered result and to compare the differences between the agents' worldviews. In addition, we found that operators would focus on a particular attribute depending on the context of the problem.

5.3 User Interactions

The prototype features a rich set of user interactions:
Highlighting.
To support switching analytical perspectives (TC, AC) when making sense of an agent's schedule, users can highlight a single task or highlight an agent's chain of tasks (R2). In Fig. 2, the user highlighted Agent 5's science chain of tasks. Users can also highlight rows in the Detail View in the DWC, as shown in Fig. 4.

Interlink Views.
Visualizations for MRS can be broken down into two aspects: (i) the behavior of a single agent or (ii) the overall behavior of the system [61]. No single tool is capable of providing a complete picture of the system [49]. Hence, we focus on providing the operator with the ability to collate and fuse pieces of information in order to properly diagnose the system by interlinking the views (R4). For instance, when rows in the Communication Network panel are highlighted, the immediate edges of the highlighted agents turn pink in the graph, while the opacity of the other edges is lowered. Fig. 8 showcases this interaction. As another example, the graph, scatterplot, and task abstraction are interlinked: when an agent is selected in any of these views, it is simultaneously highlighted in the other views.
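A minimal sketch of this interlinking, written as a publish/subscribe pattern in which every view re-renders its highlight when a selection event fires; the tool itself is web-based, so this Python version only illustrates the design choice, not the actual implementation.

```python
class SelectionBus:
    """Broadcasts an agent selection to every subscribed view."""
    def __init__(self):
        self.listeners = []

    def subscribe(self, fn):
        self.listeners.append(fn)

    def select(self, agent_id):
        for fn in self.listeners:
            fn(agent_id)

bus = SelectionBus()
bus.subscribe(lambda a: print(f"graph: highlight edges of agent {a}"))
bus.subscribe(lambda a: print(f"scatterplot: highlight point of agent {a}"))
bus.subscribe(lambda a: print(f"task abstraction: highlight row of agent {a}"))
bus.select(5)  # one selection updates all three views at once
```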
5.4 Implementation

We use a MongoDB database to store each agent's reported worldview. From the database, we perform three computational tasks before visualizing the system: (i) compute summary statistics; (ii) chain tasks according to their precedence constraints and inter-task dependencies; and (iii) compare agents' worldviews through the diff function (Sect. 5.2). Afterward, we visualize the results in a web application. The front-end is implemented with a combination of HTML5, CSS, JavaScript, and the JavaScript Data-Driven Documents (D3) library [9]. The back-end runs on a Node.js web server.
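A sketch of the first two pre-visualization steps follows, assuming a pymongo client and a hypothetical worldviews collection whose document fields (time, in_science_zone, completed_tasks, and per-task name/depends_on records) are illustrative rather than MOSAIC Viewer's actual schema; step (iii) reuses the diff_matrix sketch from Sect. 5.2.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
worldviews = client["mosaic"]["worldviews"]  # hypothetical database/collection names

def science_summary(t):
    """(i) Summary statistics: fraction of eligible agents that have
    completed each science task at time step t."""
    docs = list(worldviews.find({"time": t}))
    eligible = [d for d in docs if d["in_science_zone"]]
    return {task: sum(task in d["completed_tasks"] for d in eligible)
                  / max(len(eligible), 1)
            for task in ("take_sample", "analyze_sample", "store_sample")}

def chain_tasks(tasks):
    """(ii) Link tasks into chains by following their precedence
    constraints (each task names its predecessor, if any)."""
    chains = []
    for t in tasks:
        if t["depends_on"] is None:  # head of a chain
            chain, cur = [t], t
            while True:
                nxt = next((u for u in tasks
                            if u["depends_on"] == cur["name"]), None)
                if nxt is None:
                    break
                chain.append(nxt)
                cur = nxt
            chains.append(chain)
    return chains
```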
6 SYSTEM EVALUATION
We are interested in evaluating whether MOSAIC Viewer can help operators successfully answer the critical questions listed in Table 3. We perform two assessments. First, we conduct a qualitative study that consists of a training task scenario and three task scenarios in which data were collected. For each scenario, participants were asked to explore the state of a multi-robot system. Then, based on the user study, we present two case studies that demonstrate our system's efficacy and the workflow of how users interacted with the system.

6.1 Qualitative User Study
Participants.
We recruited 12 participants (3 female, 9 male; ages reported in bins spanning 18 to 44 years) from the population of multi-robot systems researchers and operators at NASA JPL. Of the 12 participants, two were part of the formative evaluation. We elected to include these participants because the system's interactions and visual representations changed after the formative evaluation, and the study was designed to assess their ability to debug worldviews based on scenarios they had not encountered before. A more comprehensive overview of the participants can be found in Table 2.
Conditions and Tasks Design.
To study the usability and outcomes of domain experts using MOSAIC Viewer, we devised three scenarios, each touching upon a common situation operators face. One scenario included all agents in sync. Two scenarios included an "out of sync" condition that emerges from (i) one agent isolated from the others or (ii) a bipartition in the agents' communication network, respectively. For each scenario, we asked operators to assess the system. If they determined worldviews were out of sync, we asked operators to perform a root cause analysis: determining which robots were out of sync and analyzing why they were out of sync (Q1). Participants were also asked to assess whether agents accomplished their science tasks (Q2) and navigation tasks (Q3), as well as whether they understood how tasks were scheduled (Q4). Each participant was required to complete the three scenarios, and the study was counterbalanced to mitigate learning effects. We chose not to compare MOSAIC Viewer against a baseline interface built from existing debugging tools for two reasons: (1) to reduce confounding effects that may emerge from other aspects of the interfaces and (2) to focus the investigation on the qualitative, behavioral insights participants gained from MOSAIC Viewer.
Experimental Setup.
MOSAIC Viewer ran on a mid-2017 MacBook Pro (16 GB, 2.5 GHz processor). The interface was displayed in a browser on a 34-inch display (3440 × …). Participant input was captured through an external keyboard and mouse. For the sake of uniformity, pen and paper were provided to participants regardless of the task.

Figure 5: Participants' feedback about MOSAIC Viewer on a 5-point Likert scale, from strongly disagree to strongly agree. Median ratings are indicated in gray.
Procedure.
Each participant first filled in their background information in a survey form. They were then trained on the interface until they were comfortable using it. Once ready, we provided the three scenarios one by one. For each scenario, the participants were asked to answer the four aforementioned questions and write down their answers on the provided sheet of paper. At the end of each task, participants responded to a questionnaire about the confidence of their answers. After finishing all three tasks, participants answered additional survey questions that prompted their overall thoughts about the system and engaged in a semi-structured exit interview. We used a concurrent think-aloud protocol during the study, and participants were audio- and screen-recorded for the duration of the tasks. All of the participants' answers for the tasks were saved.
6.2 Findings

Our evaluation revealed the following findings:

• Speed and interactivity streamline higher-level analyses: (1) MOSAIC Viewer helps users locate information faster; (2) the interlinked views enable users to formulate a hypothesis about the root cause of the desynchronized worldviews.
• Trust for summary displays grew with experience: Users initially lacked trust in visual representations they were not accustomed to, but this trust grew quickly once they were able to verify their understanding.
• The why: the how is not enough on its own: Understanding MRS agents requires understanding not only how agents interact with each other and how the tasks are scheduled, but also why. One without the other is not complete.
• Different sets of assumptions affect data interpretation: Users bring in different sets of assumptions that mismatch the system's architecture. These mismatches led to incorrect data interpretations.

P6 explains that MOSAIC Viewer meets an unmet need with existing tools, noting: "Every time I build a new capability...there are no tools out of the box that just do it for you. You have to go spend time and build it, so you can properly visualize what our algorithms do."
Navigating information.
All 12 users reported that understanding the Main View's visual encoding required a very low to low level of effort. P7 attributes the high learnability to the fact that "someone actually thought about how to represent these data rather than [users] just plotting the data in a given software". In particular, participants reported the graph and shapes to be intuitive and useful.

8 out of 12 participants helped explain this finding, sharing that their current workflows required them to open multiple terminal windows per agent, often spread across multiple monitors, to unpack what even a single agent is doing. This matches the finding from the formative study. Even when working alone, participants describe terminal logs as time-consuming and inefficient, especially for comparisons. P1 explains this workflow requires visually scanning and remembering the contents, making it mentally taxing. P2 contrasts this multi-screen collaboration experience with MOSAIC Viewer's compact form, observing that "[in MOSAIC Viewer] whatever I want to see, the information is there."

Users contrasted MOSAIC Viewer with terminal logs, which require considerable work from users to find individual pieces of information, let alone combine them into a higher-level analysis. P1 comments, "that isn't possible with the way I [currently] do it". Based on the feedback users provided on a 5-point Likert scale, participants describe MOSAIC Viewer as both easier and faster to use than their current approaches (Md = …, IQR = …) (Fig. 5). Others found the flexibility helped drive ease of use, with P2 observing, "you have more than one way to see [the same] information".

Figure 6: Toggle interaction used in the DWC. (a) shows the default setting, only showing the deltas in the Detail View. (b) shows when P4 toggles the Similarity View, and the agents that agree with Agent 1's ego view of its battery level appear in the Detail View. (c) shows when P4 toggles the Difference View and makes the deltas disappear.
Formulating.
Previous research explains that debugging a distributed multi-robot system requires users to combine macro-level (i.e., societal-level) and micro-level (i.e., agent-level) system state information to form a coherent, unified picture [49]. In our study, users explain that the speed of specific data access enables them to more quickly achieve these higher levels of understanding. To complete each task, users would use the DWC to identify if an agent is out of sync with the other agents. To explain a desynchronization, users would have to combine their domain knowledge and the information provided by each of the various views to conduct a root cause analysis (R4). Users would look at each view to determine if the agent's location, distance, communication bandwidth, or any other variables could explain the desynchronization they identified in the DWC. In Sect. 6.3, we provide a walk-through of how users utilized our system with two case studies.
Independent of which task was presented first, users first interacted with the system in a way that indicated they were verifying the low-level details on which the summary displays were based. The DWC Summary View provides a good example of this observed user behavior. The default setting of the DWC shows only the differences in the Detail View, but users can toggle different parts of the Summary View to see different low-level details (Fig. 6). Another example is the Task Abstraction view. The Task Abstraction view utilizes filled-in shapes to represent the three-step process of the science and navigation chains of tasks; however, the shapes do not indicate when or who has executed the respective task. To verify the abstraction, users can individually click on a shape and see the particular task being highlighted in the timeline visualization, or automatically highlight the entire chain of tasks as shown in Fig. 4.
Fig. 7 shows the average number of clicks users made to either highlight or toggle throughout the user study's three tasks. In the first task, 8 of 12 users used the toggle interaction to validate their understanding of the DWC's visual encoding (Fig. 6). As P4 explains: "The reason why I've been [toggling] is because I'm not sure right now whether I'm seeing the similarity or differences (in the Detail View). The way I can check that is by looking at the color of [Agent 1's ego view of battery level], and toggling the [diagonal lines] to confirm what I'm seeing." By the second task, only 3 users continued to use the toggle interaction to validate their understanding, and 0 users used the toggle interaction by the third task. As P6 explains, "At first, I [toggled] because I wanted to see the whole picture. But once I got used to the system, I don't need to verify. I know from here it's going to show the same thing, so once you get used to it, it's not necessary."

Figure 7: Summary of participants' average number of clicks for two interactions. Participants' average number of clicks decreases over the study. The average for the toggle interaction in the third task is 0.

A similar pattern of use was observed when users were asked whether agents have completed their science objectives. During the first task, 7 out of 12 users used the chain-of-task highlighting interaction to confirm whether an agent's science task abstraction was accurately reflected in the timeline visualization. By the third task, only 3 used the interaction, while the rest used the Task Abstraction exclusively. P8 commented: "It took me a while to really (pause) trust that the little [shapes] were fully representative (of the timeline visualization). I just wanted to double-check that's the right answer."
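This verify-then-trust pattern is easier to follow with a concrete picture of what the toggles reveal. The sketch below shows one way the DWC's split between aggregated similarities and emphasized deltas could be computed; the flat dict-based worldview representation and all names are our own illustrative assumptions, not MOSAIC Viewer's implementation.

```python
# A minimal sketch of the delta logic exposed by the DWC toggles.
# The dict-based worldview representation is a hypothetical stand-in
# for MOSAIC's richer data model.

def worldview_deltas(worldviews):
    """Split attributes into those all agents agree on ('similar')
    and those where at least two agents' beliefs diverge ('different')."""
    similar, different = {}, {}
    attributes = {attr for wv in worldviews.values() for attr in wv}
    for attr in sorted(attributes):
        beliefs = {agent: wv.get(attr) for agent, wv in worldviews.items()}
        if len(set(beliefs.values())) == 1:
            similar[attr] = beliefs    # aggregated; hidden by default
        else:
            different[attr] = beliefs  # the deltas shown by default
    return similar, different

def visible_entries(similar, different, show_similar=False, show_different=True):
    """Mimic the toggles in Fig. 6: the default shows only the deltas."""
    visible = {}
    if show_similar:
        visible.update(similar)
    if show_different:
        visible.update(different)
    return visible

# Example: the two agents disagree about Agent 1's battery level.
worldviews = {
    "agent1": {"agent1.battery": 80, "agent2.battery": 60},
    "agent2": {"agent1.battery": 75, "agent2.battery": 60},
}
similar, different = worldview_deltas(worldviews)
assert list(different) == ["agent1.battery"]
assert list(similar) == ["agent2.battery"]
```

Under this reading, P4's toggling amounts to flipping `show_similar` on and off to confirm which of the two partitions the Detail View is currently rendering.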
While participants shared how MOSAIC Viewer successfully explains what the agents are doing and provides insight into some anomalous behaviors, expert operators wanted even more detail than the system provided. P6 provides a representative quote:

Usually, people using [the system] not only want to see how it's scheduled. I think [answering] how [the tasks] are scheduled, the visualizations are doing that beautifully. But there's some underlying optimization going on where the objective function is trying to maximize something... As a human operating a system, one of the things we always want to do is verification. So my computer is telling me this [solution] is the best. Is it really the best? Can I check the sanity of the solution?

Similarly, 10 participants indicated they would like to know more about the reasoning behind the scheduling optimization, with comments such as, "I feel like 7 and [base station] should be doing better than that and I wonder why they're not", "I want to know why [Agent 5's science task] was transferred to 4. Why not to 0 or 2?", and "Why are there blank spaces in the timeline?". During the in-depth interviews, participants commented that they would like to see explanations behind the scheduler's decisions. For example, when an agent's task is delegated to another agent, P6 and P10 explain they would want to see the low-level details, such as real-time CPU load, of the two agents in order to understand the scheduling optimization.
Despite an unbounded training period, participants who were not familiar with MOSAIC's specific logistics interpreted the data differently. We note two specific cases.
P7 and P9, two non-core MOSAIC participants, incorrectly answered "no" for all three tasks when asked if the agents accomplished their navigation task (Q3). In their think-aloud process, both mentioned that the squares representing the base station's navigation chain of tasks in the Task Abstraction are not filled in, indicating that the base station failed to accomplish its navigation tasks. However, as explained in Sect. 2.2, the base station is a special agent that does not need to accomplish the mandatory navigation task, as its purpose is to help other agents with its fast computation power.
A similar pattern of mental model mismatch occurred with P4. During the first task, P4 provided an incorrect answer to the question related to the agents' science tasks (Q2). In their think-aloud process, P4 observed how Agent 3 accomplished its science chain of tasks with the chain-of-task highlighting and stated the answer to Q2 is "yes". However, according to the Task Abstraction in Fig. 2, Agent 5 did not accomplish its science task. In the exit interview, P4 elaborated that their reasoning was based on the idea that "the agents are working together in tandem", where if one agent accomplished its science chain of tasks, the other agents did as well.

Evaluate System Performance.
Upon launching the system, the operator sees the Main View and DWC side-by-side. The Summary View (Fig. 2a) does not display a synchronization warning, and the DWC also does not show any deltas. The operator concludes the agents' worldviews are in sync and moves on to assess the system's performance (i.e., how many tasks are being performed, and by whom) as part of their next goal (Table 3).
The operator looks at the Task Abstraction (Fig. 2d) to obtain a high-level overview of the system's performance. The filled-in squares in the "NAV" column indicate agents have accomplished their mandatory navigation task. The "SCI" column shows Agent 3, with its three filled-in triangles, is the only agent that fully accomplished its science chain of tasks. However, the Science Objectives (Fig. 2a) and the graph (Fig. 2c) show there are a total of four eligible agents (Agents 1, 3, 5, and 6) that can accomplish science tasks.
The operator engages with the timeline (Fig. 2e) and scatterplot (Fig. 2b) to seek the low-level details that explain why the other three agents failed to accomplish their science tasks. Highlighting Agent 5's partial science chain of tasks (Fig. 2e, colored in pink), the operator sees Agent 5 relocated the second science task to Agent 4. From the graph, the shortest path to send the scientific data from Agent 4 to the base station is through Agent 8 or Agent 1. However, Agent 4 has a weak communication link to both agents and is unable to relay the data within the time gap from t = 23 to t = 27. Next, the scatterplot shows that Agent 1's and Agent 6's battery levels are at 50%, indicating the computation time needed for their tasks will be longer compared to other agents (e.g., Agent 1 takes twice as long as Agent 0 to accomplish its second navigation task). Evidently, Agent 1 and Agent 6 (colored in orange and turquoise, respectively) did not accomplish any science tasks, as both needed more time to accomplish their mandatory navigation tasks.
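To make the operator's reasoning concrete, the sketch below reproduces the two checks as back-of-the-envelope computations: whether a weak link can move the science data within the available time gap, and how a low battery stretches a task's duration. The payload size and the linear battery-to-speed model are hypothetical assumptions of ours; only the t = 23 to t = 27 gap and the 50%-battery doubling come from the case study.

```python
# Back-of-the-envelope versions of the two checks in this walkthrough.
# The 10-unit data volume and the inverse-proportional battery model
# are illustrative assumptions, not MOSAIC's scheduler internals.

def can_relay(data_volume, bandwidth, t_start, t_end):
    """A link can relay the data only if the volume fits within
    bandwidth multiplied by the available time window."""
    return data_volume <= bandwidth * (t_end - t_start)

def task_duration(nominal_duration, battery_level):
    """Assume computation slows in inverse proportion to battery level,
    so a 50% battery doubles a task's nominal duration."""
    return nominal_duration / battery_level

# Agent 4's weak link cannot move a (hypothetical) 10-unit payload
# within the t = 23 to t = 27 gap at a bandwidth of 1 unit per step:
assert not can_relay(data_volume=10, bandwidth=1, t_start=23, t_end=27)

# At 50% battery, Agent 1 needs twice Agent 0's time for the same task:
assert task_duration(4, battery_level=0.5) == 2 * task_duration(4, 1.0)
```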
Plan for Corrective Action.
With the same "out of sync" scenario from the user study, this case study reflects how operators can plan for corrective action (i.e., determine who is out of sync and reason about why the desynchronization occurred) with MOSAIC Viewer. Launching the system, the operator sees the Main View and DWC side-by-side. The Main View signals a desynchronization warning in the Summary Overview, and the DWC (Fig. 4) displays deltas across all three panels. The deltas in the Communication Network Panel (Fig. 8b), specifically, show two distinct sets. Highlighting rows 0, 6, 7, 8, and ST in the Communication Network Panel emphasizes the visualization's white space, revealing a bipartition in the communication network: Agents 0, 6, 7, 8, and ST are out of sync with Agents 1-5, and conversely Agents 1-5 are out of sync with Agents 7 and 8. The deltas in the other two panels show the same visual pattern, and the same set of agents (Agents 0, 6, 7, 8, and ST) is out of sync.

Figure 8: (a) Graph and (b) Communication Network Panel. In (b), the operator highlights rows 0, 6, 7, 8, and ST and identifies a visual pattern of how Set 1 is the complement of Set 2. They hypothesize there is a network bipartition. The immediate edges of these selected agents are highlighted in (a), and the same groups of agents are separated. The weak communication links (indicated by black arrows) further support this hypothesis.

The pattern revealed by the DWC is also displayed in the graph in the Main View (Fig. 8a): the grouping of agents in the DWC is reflected in the two distinct clusters of the graph. The weak communication bandwidth between the two clusters (indicated by black arrows) further supports the hypothesis of a network bipartition. For instance, the bandwidth value between Agents 4 and 7 is 1, a weak value at which agents would not be able to send data across the network and update each other's worldviews, causing the bipartition.
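The bipartition hypothesis the operator forms here amounts to a connectivity check: drop the weak links and see whether the communication graph falls apart into two groups. The sketch below illustrates that check; the adjacency format, the threshold, and all bandwidth values other than the weak value of 1 from the text are illustrative assumptions.

```python
# A minimal sketch of the bipartition check: treat links at or below a
# weakness threshold as absent and group agents into connected components.
from collections import deque

def components(agents, bandwidth, threshold=1):
    """Group agents into connected components, ignoring weak links
    (links whose bandwidth is <= threshold)."""
    def linked(a, b):
        return bandwidth.get((a, b), bandwidth.get((b, a), 0)) > threshold
    unvisited, groups = set(agents), []
    while unvisited:
        seed = unvisited.pop()
        group, frontier = {seed}, deque([seed])
        while frontier:
            a = frontier.popleft()
            for b in [b for b in unvisited if linked(a, b)]:
                unvisited.discard(b)
                group.add(b)
                frontier.append(b)
        groups.append(group)
    return groups

# Two strong clusters joined only by the weak (bandwidth 1) link from
# the case study; all other link values are hypothetical:
links = {("0", "6"): 9, ("6", "7"): 9, ("7", "8"): 9, ("8", "ST"): 9,
         ("1", "2"): 9, ("2", "3"): 9, ("3", "4"): 9, ("4", "5"): 9,
         ("4", "7"): 1}
parts = components(["0", "1", "2", "3", "4", "5", "6", "7", "8", "ST"], links)
assert len(parts) == 2  # a network bipartition, matching Fig. 8
```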
DISCUSSION
In this section, we reflect upon the findings of this work and consider both its extensions and its limitations.
Limitations in Visual Scalability. While the scale of ten agents in our work aligns with current and proposed space exploration concepts [8, 19, 20, 45] within the next 10 years, the current system design might struggle to achieve the same levels of legibility and low cognitive load with swarm robotics [48] mission concepts that include hundreds of agents. Edge bundling [24] and layout algorithms [41] are promising directions to improve dense graph legibility and transparency [13], and jitter [71] can improve dense scatterplot legibility. Critically, however, in swarm robotics applications, agents are rarely endowed with a global worldview; rather, each agent typically relies on much simpler scheduling schemes and reacts based on its own state and its immediate neighbors'. Accordingly, we expect that further research will be required to capture the qualitatively different nature of the underlying problem of debugging autonomous swarms' behaviors.
Foreground scheduler logistics to achieve generalizability. In our findings, we observed differences between how core MOSAIC and non-core researchers interpreted the same visual marks on the display. In the example reported in Sect. 6.2.4, the Task Abstraction shows unfilled squares for the base station. Some non-core team members, seeing the unfilled squares, interpreted that not all robots had completed their navigation tasks. However, core team members saw the same marks and, knowing that the base station does not need to complete the navigation task, made the correct interpretation that all mandatory navigation tasks were completed. Reflecting on the system's visual encoding, we realize this confusion could have been avoided by adding an additional glyph that distinguishes the base station from other agents.
This split in our test population reveals an interesting artifact of our design process. Though we spent nearly a year iteratively designing the system, we never observed this problem in our user testing until we included users outside of the core MOSAIC team. We reflect that while both groups consist of full-time MRS robotics researchers, the core group is more deeply steeped in the architecture and algorithms that drive the MOSAIC scheduler. Designing a more general MRS application that works across any number of scheduling algorithms would require diving deep into the mechanics of those scheduling algorithms. A general system would need to visualize not only the various states of the worldviews, but also many of the underlying mechanisms through which the scheduling system operates. With this approach, designers would be more likely to create a system that functions independently of the background of its users.
Move Beyond the Debugging Use Case.
This work focuses explicitly on explaining what the autonomy has decided. Given the success of this work, participants also expressed a desire to understand, as P7 states, "different perspectives of the same code", extending to understanding why the autonomy made the decisions it did in non-error cases. This points towards supporting the task of scheduling algorithm design, such as adding a "what-if" mode [76], to help roboticists gain insight into tasks such as how different algorithm hyperparameters affect the search of a very large space. This creates an opportunity to build on work in algorithm visualization [28, 64], as well as the visualization of large decision trees [67, 70].
Extend to Other Multi-Robot Systems.
While this system was designed to support users of the MOSAIC scheduler, extending the application to other MRS that use different underlying algorithms, in different contexts of use, represents an interesting challenge. In one exit interview, for example, P5 observed that MOSAIC Viewer might also support agent navigation discrepancies in support of DARPA's Subterranean Challenge (SubT) [15]. P5 reflected on ways in which the contexts share many commonalities that could help the system achieve more generalizability, noting "What you're visualizing is at the core of these multi-robot problems. They are connected by some network infrastructure, they're sharing information in order to come to a decision about what to do. So we want to visualize what were the states at a specific time, what were they doing, what were they planning on doing, and how they were connected with each other." But they also noted other system differences that might not make the contexts topologically equivalent, noting "[SubT's] task network is a bit larger [than MOSAIC's] and not as straightforward as 1, 2, 3. You roll back and there's loops. So it's harder to just lay it out sequentially like [MOSAIC's]".
CONCLUSION
We present MOSAIC Viewer, a visual analytics system that helps users make sense of robots' autonomous scheduling decisions and pinpoint the cause of desynchronized worldviews in a multi-robot system. To compare worldviews, we draw inspiration from the diff algorithm to visually emphasize the differences while aggregating the similarities. This approach allows users to quickly detect the differences and similarities across all the robots' worldviews. The interlinked views help users not only collate and fuse pieces of information from each view in order to conduct a root cause analysis of the desynchronized worldviews, but also understand the behavior of the system at a societal and individual level. Our qualitative user study with domain experts at NASA JPL characterizes and elaborates the usefulness and effectiveness of MOSAIC Viewer.
ACKNOWLEDGEMENT
The development of MOSAIC Viewer was enabled by the JPL/Caltech/ArtCenter data visualization program. We would like to thank Santiago Lombeyda, Hillary Mushkin, Maggie Hendrie, Alessandra Fleck, and Sarah Strickler for their feedback and contributions to the earlier prototypes of MOSAIC Viewer. Also, special thanks to the MOSAIC team! This research is sponsored in part by the U.S. National Science Foundation through grant IIS-1741536. The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (80NM0018D0004).
REFERENCES
[1] D. Altheide, M. Coyle, K. DeVriese, and C. Schneider. Emergent qualitative document analysis. Handbook of Emergent Methods, pp. 127–151, 2008.
[2] B. Annable, D. Budden, and A. Mendes. NUbugger: A visual real-time robot debugging system. In Proc. RoboCup, pp. 544–551. Springer, 2013.
[3] T. Arai, E. Pagello, L. E. Parker, et al. Advances in multi-robot systems. IEEE Transactions on Robotics and Automation, 18(5):655–661, 2002.
[4] B. Argrow, D. Lawrence, and E. Rasmussen. UAV systems for sensor dispersal, telemetry, and visualization in hazardous environments. In Proc. AIAA Aerospace Sciences Meeting and Exhibit, p. 1237, 2005.
[5] B. Bach, N. Henry-Riche, T. Dwyer, T. Madhyastha, J.-D. Fekete, and T. Grabowski. Small MultiPiles: Piling time to explore temporal patterns in dynamic networks. Computer Graphics Forum, 34(3):31–40, 2015.
[6] D. J. Barnes, M. T. Russell, and M. C. Wheadon. Developing and adapting UNIX tools for workstations. In EUUG Conference Proceedings, pp. 321–333, 1988.
[7] J. L. Baxter, E. Burke, J. M. Garibaldi, and M. Norman. Multi-robot search and rescue: A potential field based approach. In Autonomous Robots and Agents, pp. 9–16. Springer, 2007.
[8] S. S. Board, N. R. Council, et al. Vision and Voyages for Planetary Science in the Decade 2013-2022. National Academies Press, 2012.
[9] M. Bostock, V. Ogievetsky, and J. Heer. D3: Data-driven documents. IEEE Transactions on Visualization and Computer Graphics, 17(12):2301–2309, 2011.
[10] B. Burmeister, A. Haddadi, and G. Matylis. Application of multi-agent systems in traffic and transportation. IEE Proceedings - Software Engineering, 144(1):51–60, 1997.
[11] K. A. Cook and J. J. Thomas. Illuminating the path: The research and development agenda for visual analytics. Technical report, Pacific Northwest National Lab., Richland, WA (United States), 2005.
[12] M. Cummings and P. Mitchell. Managing multiple UAVs through a timeline display. In Proc. AIAA Info Tech, p. 7060, 2005.
[13] T. N. Dang, L. Wilkinson, and A. Anand. Stacking graphic elements to avoid over-plotting. IEEE Transactions on Visualization and Computer Graphics, 16(6):1044–1052, 2010.
[14] M. De Rosa, J. Campbell, P. Pillai, S. Goldstein, P. Lee, and T. Mowry. Distributed watchpoints: Debugging large multi-robot systems. In Proc. ICRA, pp. 3723–3729. IEEE, 2007.
[15] Defense Advanced Research Projects Agency. DARPA Subterranean (SubT) Challenge.
[16] N. Elmqvist and J. S. Yi. Patterns for visualization evaluation. Information Visualization, 14(3):250–269, 2015.
[17] A. Emslie, R. Lagace, and P. Strong. Theory of the propagation of UHF radio waves in coal mine tunnels. IEEE Transactions on Antennas and Propagation, 23(2):192–205, 1975.
[18] E. E. Entin and D. Serfaty. Adaptive team coordination. Human Factors, 41(2):312–325, 1999.
[19] eoPortal. CYGNSS (Cyclone Global Navigation Satellite System). URL: https://directory.eoportal.org/web/eoportal/satellite-missions/c-missions/cygnss.
[20] eoPortal. OPAL (Orbiting Picosatellite Automatic Launcher). URL: https://directory.eoportal.org/web/eoportal/satellite-missions/o/opal.
[21] J. A. Fails, A. Karlson, L. Shahamat, and B. Shneiderman. A visual interface for multivariate temporal data: Finding patterns of events across multiple histories. In Proc. VAST, pp. 167–174. IEEE, 2006.
[22] J. Figueiredo, N. Lau, and A. Pereira. Multi-agent debugging and monitoring framework. IFAC Proceedings Volumes, 39(20):114–120, 2006.
[23] J. Gancet, E. Motard, A. Naghsh, C. Roast, M. M. Arancon, and L. Marques. User interfaces for human robot interactions with a swarm of robots in support to firefighters. In Proc. ICRA, pp. 2846–2851. IEEE, 2010.
[24] E. R. Gansner, Y. Hu, S. North, and C. Scheidegger. Multilevel agglomerative edge bundling for visualizing large graphs. In Proc. PacificVis, pp. 187–194. IEEE, 2011.
[25] H. Garcia-Molina, F. Germano, and W. H. Kohler. Debugging a distributed computing system. IEEE Transactions on Software Engineering, (2):210–219, 1984.
[26] M. Gleicher, D. Albers, R. Walker, I. Jusufi, C. D. Hansen, and J. C. Roberts. Visual comparison for information visualization. Information Visualization, 10(4):289–309, 2011.
[27] R. Gove, J. Saxe, S. Gold, A. Long, and G. Bergamo. SEEM: A scalable visualization for comparing multiple large sets of attributes for malware analysis. In Proc. VizSec, pp. 72–79. IEEE, 2014.
[28] S. Grissom, M. F. McNally, and T. Naps. Algorithm visualization in CS education: Comparing levels of student engagement. In Proc. SoftVis, pp. 87–94. ACM, 2003.
[29] J. Halloran, E. Hornecker, G. Fitzpatrick, M. Weal, D. Millard, D. Michaelides, D. Cruickshank, and D. De Roure. Unfolding understandings: Co-designing ubicomp in situ, over time. In Proc. DIS, pp. 109–118. ACM, 2006.
[30] J. Y. Halpern and Y. Moses. Knowledge and common knowledge in a distributed environment. Journal of the ACM, 37(3):549–587, 1990.
[31] M. Harrower and C. A. Brewer. ColorBrewer.org: An online tool for selecting colour schemes for maps. Cartographic Journal, 40(1):27–37, 2003.
[32] J. H. Holland. Emergence: From Chaos to Order. OUP Oxford, 2000.
[33] K. Holtzblatt and H. Beyer. Contextual Design: Defining Customer-Centered Systems. Elsevier, 1997.
[34] C. M. Humphrey, S. M. Gordon, and J. A. Adams. Visualization of multiple robots during team activities. In Proc. HFES, vol. 50, pp. 651–655. SAGE Publications, 2006.
[35] J. W. Hunt and M. D. McIlroy. An Algorithm for Differential File Comparison. Bell Laboratories, Murray Hill, 1976.
[36] J. Jin, R. Sanchez, R. T. Maheswaran, and P. Szekely. VizScript: On the creation of efficient visualizations for understanding complex multi-agent systems. In Proc. IUI, pp. 40–49. ACM, 2008.
[37] J. Jo, J. Huh, J. Park, B. Kim, and J. Seo. LiveGantt: Interactively visualizing a large manufacturing schedule. IEEE Transactions on Visualization and Computer Graphics, 20(12):2329–2338, 2014.
[38] G. Kardas, M. Challenger, S. Yildirim, and A. Yamuc. Design and implementation of a multiagent stock trading system. Software: Practice and Experience, 42(10):1247–1273, 2012.
[39] J. T. Karras, C. L. Fuller, K. C. Carpenter, A. Buscicchio, D. McKeeby, C. J. Norman, C. E. Parcheta, I. Davydychev, and R. S. Fearing. Pop-up Mars rover with textile-enhanced rigid-flex PCB body. In Proc. ICRA, pp. 5459–5466. IEEE, 2017.
[40] J. Kasper, J. Lazio, A. Romero-Wolf, J. Lux, and T. Neilsen. The Sun Radio Interferometer Space Experiment (SunRISE) mission concept. pp. 1–11. IEEE, 2019.
[41] S. Kieffer, T. Dwyer, K. Marriott, and M. Wybrow. HOLA: Human-like orthogonal network layout. IEEE Transactions on Visualization and Computer Graphics, 22(1):349–358, 2015.
[42] H. Kunsei, K. S. Bialkowski, M. S. Alam, and A. M. Abbosh. Improved communications in underground mines using reconfigurable antennas. IEEE Transactions on Antennas and Propagation, 66(12):7505–7510, 2018.
[43] S. Liu, D. Maljovec, B. Wang, P.-T. Bremer, and V. Pascucci. Visualizing high-dimensional data: Advances in the past decade. IEEE Transactions on Visualization and Computer Graphics, 23(3):1249–1268, 2016.
[44] Y. Luo, K. Liu, and D. N. Davis. A multi-agent decision support system for stock trading. IEEE Network, 16(1):20–27, 2002.
[45] A. J. Mannucci, J. Dickson, C. Duncan, and K. Hurst. GNSS geospace constellation (GGC): A CubeSat space weather mission concept. Technical report, Jet Propulsion Laboratory, California Institute of Technology, 2010.
[46] F. Michel, J. Ferber, and A. Drogoul. Multi-agent systems and simulation: A survey from the agent community's perspective. In Multi-Agent Systems, pp. 17–66. CRC Press, 2018.
[47] K. Nagatani, Y. Okada, N. Tokunaga, S. Kiribayashi, K. Yoshida, K. Ohno, E. Takeuchi, S. Tadokoro, H. Akiyama, I. Noda, et al. Multirobot exploration for search and rescue missions: A report on map building in RoboCupRescue 2009. Journal of Field Robotics, 28(3):373–387, 2011.
[48] I. Navarro and F. Matía. An introduction to swarm robotics. ISRN Robotics, 2013.
[49] D. T. Ndumu, H. S. Nwana, L. C. Lee, and J. C. Collis. Visualising and debugging distributed multi-agent systems. In Proc. AAMAS, pp. 326–333, 1999.
[50] C. Niederer, H. Stitz, R. Hourieh, F. Grassinger, W. Aigner, and M. Streit. TACO: Visualizing changes in tables over time. IEEE Transactions on Visualization and Computer Graphics, 24(1):677–686, 2017.
[51] E. Osawa, H. Kitano, M. Asada, Y. Kuniyoshi, and I. Noda. RoboCup: The robot world cup initiative. In Proc. ICMAS, pp. 9–13, 1996.
[52] L. E. Parker. Distributed intelligence: Overview of the field and its application in multi-robot systems. In Proc. AAAI Fall Symposium: Regarding the Intelligence in Distributed Intelligent Systems, pp. 1–6, 2007.
[53] C. Pinciroli, V. Trianni, R. O'Grady, G. Pini, A. Brutschy, M. Brambilla, N. Mathews, E. Ferrante, G. Di Caro, F. Ducatelle, et al. ARGoS: A modular, parallel, multi-engine simulator for multi-robot systems. Swarm Intelligence, 6(4):271–295, 2012.
[54] J. Preece, Y. Rogers, H. Sharp, D. Benyon, S. Holland, and T. Carey. Human-Computer Interaction. Addison-Wesley Longman Ltd., 1994.
[55] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng. ROS: An open-source robot operating system. In Proc. ICRA Workshop on Open Source Software, vol. 3, p. 5. Kobe, Japan, 2009.
[56] A. Rahmani, S. Bandyopadhyay, F. Rossi, J.-P. de la Croix, J. V. Hook, and M. T. Wolf. Space vehicle swarm exploration missions: A study of key enabling technologies and gaps. In Proc. IAC, 2019.
[57] M. Rettig. Prototyping for tiny fingers. Communications of the ACM, 37(4):21–27, 1994.
[58] F. Rossi, R. Zhang, Y. Hindy, and M. Pavone. Routing autonomous vehicles in congested transportation networks: Structural properties and coordination algorithms. Autonomous Robots, 42(7):1427–1442, 2018.
[59] H. A. Ruff and G. L. Calhoun. Human supervision of multiple autonomous vehicles. Technical report, Air Force Research Lab Wright-Patterson AFB OH Human Effectiveness Directorate, 2013.
[60] R. Sadana, T. Major, A. Dove, and J. Stasko. OnSet: A visualization technique for large-scale binary set data. IEEE Transactions on Visualization and Computer Graphics, 20(12):1993–2002, 2014.
[61] M. Schroeder and P. Noy. Multi-agent visualisation based on multivariate data. In Proc. AAMAS, pp. 85–91, 2001.
[62] C. Seah, M. Sierhuis, and W. J. Clancey. Multi-agent modeling and simulation approach for design and analysis of MER mission operations. 2005.
[63] M. Sedlmair, M. Meyer, and T. Munzner. Design study methodology: Reflections from the trenches and the stacks. IEEE Transactions on Visualization and Computer Graphics, 18(12):2431–2440, 2012.
[64] C. A. Shaffer, M. L. Cooper, A. J. D. Alon, M. Akbar, M. Stewart, S. Ponce, and S. H. Edwards. Algorithm visualization: The state of the field. ACM Transactions on Computing Education, 10(3):1–22, 2010.
[65] R. Simmons, D. Apfelbaum, W. Burgard, D. Fox, M. Moors, S. Thrun, and H. Younes. Coordination for multi-robot exploration and mapping. In Proc. AAAI/IAAI, pp. 852–858, 2000.
[66] H. Song, B. Lee, B. H. Kim, and J. Seo. DiffMatrix: Matrix-based interactive visualization for comparing temporal trends. In Proc. EuroVis (Short Papers), 2012.
[67] P. Szekely, R. Maheswaran, C. M. Rogers, and R. Sanchez. Scheduling the activities of distributed teams. 2008.
[68] P. Szekely, C. M. Rogers, and M. Frank. Interfaces for understanding multi-agent behavior. In Proc. IUI, pp. 161–166. ACM, 2001.
[69] A. Tanoto, J. L. Du, T. Kaulmann, and U. Witkowski. MPEG-4-based interactive visualization as an analysis tool for experiments in robotics. In Proc. MSV, pp. 186–192, 2006.
[70] G. Taylor, R. M. Jones, M. Goldstein, R. Frederiksen, and R. E. Wray. VISTA: A generic toolkit for visualizing agent behavior. In Proc. CGF, pp. 29–40, 2002.
[71] M. Trutschl, G. Grinstein, and U. Cvek. Intelligently resolving point occlusion. In Proc. InfoVis, pp. 131–136. IEEE, 2003.
[72] T. S. Tullis. Predicting the Usability of Alphanumeric Displays. PhD thesis, 1984.
[73] M. H. Van Liedekerke and N. M. Avouris. Debugging multi-agent systems. Information and Software Technology, 37(2):103–112, 1995.
[74] J. Vander Hook, W. Seto, V. Nguyen, Z. Hasnain, L. Gallagher, T. Halpin-Chan, V. Varahamurthy, and M. Angulo. Autonomous swarms of high speed maneuvering surface vessels for the central test evaluation improvement program. In Unmanned Systems Technology XXI, vol. 11021, p. 110210M. International Society for Optics and Photonics, 2019.
[75] J. Vander Hook, T. Vaquero, F. Rossi, M. Troesch, M. S. Net, J. Schoolcraft, J.-P. de la Croix, and S. Chien. Mars on-site shared analytics information and computing. In Proc. ICAPS, vol. 29, pp. 707–715, 2019.
[76] J. Wexler, M. Pushkarna, T. Bolukbasi, M. Wattenberg, F. Viégas, and J. Wilson. The What-If Tool: Interactive probing of machine learning models. IEEE Transactions on Visualization and Computer Graphics, 26(1):56–65, 2019.
[77] K. Wongsuphasawat, J. A. Guerra Gómez, C. Plaisant, T. D. Wang, M. Taieb-Maimon, and B. Shneiderman. LifeFlow: Visualizing an overview of event sequences. In Proc. CHI, pp. 1747–1756. ACM, 2011.
[78] A. Yamashita, T. Arai, J. Ota, and H. Asama. Motion planning of multiple mobile robots for cooperative manipulation and transportation.