Context-Responsive Labeling in Augmented Reality
Thomas Köppel* TU Wien, Austria
M. Eduard Gröller† TU Wien, Austria / VRVis Research Center, Austria
Hsiang-Yun Wu‡ TU Wien, Austria

Figure 1: Responsive labeling of the Tokyo Disneyland Dataset in AR. The images (a)-(e) present results of different viewing angles to show consistent label positions when the AR device is rotated toward different directions by the user.

ABSTRACT
Route planning and navigation are common tasks that often require additional information on points of interest. Augmented Reality (AR) enables mobile users to utilize text labels, in order to provide a composite view associated with additional information in a real-world environment. Nonetheless, displaying all labels for points of interest on a mobile device will lead to unwanted overlaps between information, and thus a context-responsive strategy to properly arrange labels is expected. The technique should remove overlaps, show the right level-of-detail, and maintain label coherence. This is necessary as the viewing angle in an AR system may change rapidly due to users' behaviors. Coherence plays an essential role in retaining user experience and knowledge, as well as avoiding motion sickness. In this paper, we develop an approach that systematically manages label visibility and levels-of-detail, as well as eliminates unexpected incoherent movement. We introduce three label management strategies, including (1) occlusion management, (2) level-of-detail management, and (3) coherence management by balancing the usage of the mobile phone screen. A greedy approach is developed for fast occlusion handling in AR. A level-of-detail scheme is adopted to arrange various types of labels. A 3D scene manipulation is then built to simultaneously suppress the incoherent behaviors induced by viewing angle changes. Finally, we present the feasibility and applicability of our approach through one synthetic and two real-world scenarios, followed by a qualitative user study.
Index Terms:
Human-centered computing—Visualization

*e-mail: [email protected]
†e-mail: [email protected]
‡e-mail: [email protected]

1 INTRODUCTION
We schedule and plan routes irregularly in our everyday life. For example, we visit offices, go to restaurants, or see doctors, in order to accomplish necessary tasks. In some cases, such as visiting medical doctors or popular restaurants, one has to wait in a queue until being able to proceed. This is time-inefficient and most people try to avoid it. Normally, if a person needs to decide the next place to visit, he or she can extract knowledge about the targets of interest. Then a decision is made based on the corresponding experience or by referring to locations using a map. 2D maps are one of the most popular methods that describe the geospatial information of objects, to give an overview of the object positions in a certain area. With a 2D map for navigation, users need to remap or translate the objects on the map to the real environment, to understand the relationships and distances to these objects [14]. This inevitably strains our cognition. It is also the reason why some people cannot quickly locate themselves on a 2D map or find the correct direction or orientation immediately.
Augmented Reality (AR) and
Mixed Reality (MR) have been proposed to overlay information directly on the real-world environment with a lower complexity by instructing users in an effective way [10, 28]. In this paper, we use AR as our technique of choice for the explanation. Displaying texts or images in AR or MR allows us to acquire information encoded with geotagged data and stored in GISs. It is also known that using AR for guiding users in exploring the real environment can be more effective in comparison to a 2D representation [8].

In mixed environments, points of interest (POIs) are often associated with text labels [16, 20, 35] in order to present additional information (e.g., name, category, etc.). For example, an Augmented Reality Browser (ARB) facilitates us to embed and show relevant data in a real-world environment. Technically, POIs are registered at certain geographical positions via GPS coordinates. Based on the current position and the viewing angle of the device, the POIs are annotated and the corresponding labels are then projected to the screen of the user's device. Naive labeling strategies can lead to occlusion problems between objects, especially in an environment with a dense arrangement of POIs. Additionally, properly selecting the right level of a label to present information can help to avoid overcrowded situations. Moreover, retaining the consistency between successive frames also enables us to maintain a good user experience and to avoid motion sickness. Based on the aforementioned findings, we summarize that a good AR labeling framework should address:

(P1) The occlusion of densely placed labels in AR space. Occlusion removal has been considered as a primary design criterion in visualization approaches. It reflects user preferences and also allows the system to present information explicitly [45].

(P2) Limited information provided by plain text.
As summarized by Langlotz et al. [20], labels in AR often contain plain text rather than other richer content, such as figures or hybrids of texts and figures.

(P3) Label incoherence due to the movement of mobile devices.
During the interaction with an AR system, the user may frequently change positions or viewing angles. This leads to unwanted flickering that impacts information consistency [16].

In this paper, we develop a context-responsive framework to optimize label placement in AR. By context-responsive, we refer to taking contextual attributes, such as GPS positions, mobile orientations, etc., into account. The system responds to the user with an appropriate positioning of labels. The approach contains three major components: (1) occlusion management, (2) level-of-detail management, and (3) coherence management, which are essential for the approach to be context-responsive. The occlusion management eliminates overlapping labels by adjusting the positions of occluded labels with a greedy approach to achieve a fast performance. Then, a levels-of-detail scheme is introduced to select the appropriate level in a hierarchy and present it based on how densely packed the labels are in the view volume of the user. We construct a 3D scene to manipulate and control the movement of labels, enhancing the user experience.

We introduce a novel approach to manage label placement tailored to AR. It enables an interactive environment with continuous changes of device positions and orientations. A survey by Preim et al. [30] concluded that existing labeling techniques often resolve overlapping labels once the camera stops moving or the camera position is assumed to be fixed to begin with. Approaches often project labels to a 2D plane to determine the occlusions and then perform occlusion removal. However, object movement in 3D is not obvious in the 2D projections of a 3D scene, which leads to temporal inconsistencies that are harmful to label readability [35]. Čmolík et al. [6] summarized the difficulty of retaining label coherence due to many discontinuities of objects projected into 2D images.
As in the sequence of snapshots in Figure 1, we treat labels as objects in a 3D scene and apply our management strategies for better quality control. In summary, the main technical contributions are:

• A fast label occlusion removal technique for mobile devices.
• A clutter-aware level-of-detail management.
• A 3D object arrangement that retains label coherence.
• A prototype to demonstrate the applicability of our approach [17].

The remainder of the paper is structured as follows: Section 2 presents previous work and relates our approach to existing research. An overview of our design principles and system is described in Section 3. In Section 4, we detail the methodology and technical aspects. The implementation is explained and use cases are demonstrated in Section 5, followed by an evaluation in Section 6. The limitations are explained in Section 7, and we conclude this work and provide future research directions in Section 8.
2 RELATED WORK
We present a novel responsive approach considering label occlusion, visual clutter, and coherence simultaneously. We discuss related work to identify our contributions by first covering general navigation techniques, and then specific labeling topics in different applications and spaces.
Spatial cognition studies show how people acquire experience and knowledge to identify where they are, how to continue the journey, and visit places effectively [40]. Maps are classical tools used to detect positions and extract spatial information throughout human history [44], while modern maps often use markers to identify and highlight the locations of POIs. 2D maps may not always be optimal since the 2D information needs to be translated to the real environment [14].

An alternative, or maybe a more intuitive way, is to map the information directly to the physical environment. McMahon et al. [28] compared paper maps and Google Maps to AR, or more specifically hand-held AR [33], which better supports people in terms of activating their navigation skills. Willett et al. [43] introduced embedded data representations, a taxonomy describing the challenges of showing data in physical space, and mentioned that occlusion problems have not yet been fully resolved. Bell et al. [3] proposed a pioneering view-management approach to project objects onto the screen while resolving occlusions or to arrange similar objects close to each other. Guarese and Maciel [14] investigated MR to assist navigation tasks by overlaying the real environment with virtual holograms. Schneider et al. [32] investigated an AR navigation concept, where the system projects the content onto a vehicle's windshield to assist driving behaviors.
Labeling is an automatic approach to position text or image labels in order to efficiently communicate additional information about POIs. It improves clarity and understandability of the underlying information [2]. Internal labels are overlaid onto their reference objects. External labels are placed outside the objects and are connected to them by leader lines. Recently, Čmolík et al. [6] have introduced
Mixed Labeling that facilitates the integration of internal and external labeling in 2D. Labeling techniques have been extensively investigated in geovisualization, where resolving occlusions and leader crossings [21] are primary aesthetic criteria to ensure good readability. Besides 2D labeling, in digital map services, such as Google Maps and other GISs, scales have been considered to improve user interaction. Active range optimization, for example, uses rectangular pyramids to eliminate label-placement conflicts across different zoom levels [1, 46]. Labeling of 3D scenes has been mainly investigated in medical applications [29], usually focusing on complex mesh and volume scenes, as well as intuitiveness for navigation. Maass and Döllner [23] developed a labeling technique to dynamically attach labels to the hulls of objects in a 3D scene. Later they extended this billboard concept by taking occlusion with labels and scene elements into account [24]. The approach by Kouřil et al. [18] annotates a complex 3D scene, involving multiple instances across multiple scales in a dense 3D biological environment. Occlusion in these approaches is detected after projecting objects into 2D, which makes it hard to maintain coherence.

Handheld Augmented Reality has become useful as the computing power of mobile devices has increased. One advantage of using AR is to overlay information directly on the real world that the user is familiar with. For example, White and Feiner [41] proposed
SiteLens, a situated visualization that embeds relevant data of the POIs in AR. Veas et al. [38] investigated outdoor AR applications, where they focused on multiple-view coordination and occlusion with objects in the background. Labels are not fully researched here. As referred to in most of the following papers, occlusions between labels have been considered as a primary issue in AR applications [13, 16, 20]. Grasset et al. [13] proposed a view management technique to annotate landmarks in an image. Edge detection and image saliency are integrated to identify unimportant regions for text label placement. Jia et al. [16] investigated a similar strategy, incorporating human placement preferences as a set of constraints to improve the work by Grasset et al. [13]. Two prototypes are implemented for desktop computers due to the poor temporal performance on mobile devices. Tatzgern et al. [35] developed a pioneering approach that considers labels as 3D objects in the scene to avoid unstable labels due to view changes. The approach estimates the center position of an object and moves labels along a 3D pole, which attaches to the object. Another proposed scenario constrains label movement to a predefined 2D view plane. This technique is limited to annotating objects in front of the camera.

Existing work tends to directly solve label occlusions in 2D or to project labels from 3D to 2D and apply 2D solutions. These techniques cannot avoid label inconsistencies [6]. In contrast to existing approaches, we handle labels as objects in the 3D scene. This allows us to compensate for incoherent label movement caused by viewing angle changes of the device. We integrate the labeling technique into 3D to retain stability and introduce additional visual variables, including text, images, icons, and colors, to enrich the corresponding visual representation. Our label encoding also varies in order to balance information provided by POIs. More design choices will be explained in Section 3.
3 CONTEXT-RESPONSIVE FRAMEWORK
Based on the taxonomy by Wiener et al. [42], our approach supports aided and unaided wayfinding tasks. We can directly highlight the destination label and assist users to combine decision-making processes, memory processes, learning processes, and planning processes for finding the overall best destinations. The effort to identify objects in AR is low [14] because real-world objects can be directly annotated [3, 13], and AR navigation demands less user focus compared to other map techniques [28]. The responsive framework is inspired by Hoffswell et al. [15], who proposed a taxonomy for responsive visualization design, which is essential to present information based on the device context. In principle, our design has three major components, including (1) occlusion management, (2) level-of-detail management, and (3) coherence management, each of which aims to solve the problems (P1)-(P3), respectively. We first introduce the encoding of labels beyond plain text, followed by an overview of the presented approach.
The label encoding reduces the limitations in existing work and solves (P2). We introduce additional types of labels beyond the merely text-based labels reported by Langlotz et al. [20]. We use color to encode scalar variables of each POI [25, 27]. In general, the users can choose a color scheme and a scale according to their preferences. A label consists of several of the following components:

• a text tag containing the name of the POI,
• an iconic image (photo) of the POI,
• an icon encoding the type of the POI, and
• a color-coded rectangle representing a scalar value of the POI.

In Figure 2, labels concerning the
Tokyo Disneyland Dataset are shown. POIs are attractions in this case. Attractions can be categorized into three types, i.e., thrilling, adventure, and children, each of which is depicted through a type icon. Figure 2(a) provides an explanatory label annotating an attraction of the dataset. The text tag depicts the name of the attraction and the waiting time (e.g., Big Thunder Mountain, … min). The iconic image shows a photo of the train of the attraction and the type icon indicates that it is a thrilling attraction. The colored (rectangular) backgrounds of the labels encode the corresponding waiting times.

Figure 2: An example label encoding (Tokyo Disneyland Dataset). (a) Label encoding, three LODs. (b) Super label.
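To make the encoding concrete, the components above can be captured in a small data structure. The sketch below is illustrative only: the field names and the green-to-red color scale are our assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Label:
    """One AR label for a point of interest (POI)."""
    name: str                  # text tag, e.g., the attraction name
    image_path: Optional[str]  # iconic image (photo) of the POI, if any
    poi_type: str              # type icon: "thrilling", "adventure", or "children"
    scalar_value: float        # value behind the colored rectangle, e.g., waiting time in minutes

def color_for(value: float, vmin: float = 0.0, vmax: float = 120.0) -> tuple:
    """Map a scalar (e.g., waiting time) to an RGB color on an assumed
    green-to-red scale; the value is clamped to [vmin, vmax]."""
    t = max(0.0, min(1.0, (value - vmin) / (vmax - vmin)))
    return (t, 1.0 - t, 0.0)  # red channel grows with the encoded value
```

In the paper, the color scheme and scale are user-selectable; the linear ramp here merely stands in for one such choice.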
Figure 3 gives an overview of our approach. We first position labels of POIs in AR (Figure 3(a) as a top view and (b) as a front view) and perform the proposed three management strategies. We process the objects in the 3D scene using a Cartesian world coordinate system, where the xz-plane is parallel to the ground plane. Figure 3(a) depicts a top view of our coordinate system; the x-axis and z-axis define the ground plane and the y-axis points vertically upwards from the ground plane. The input to our system is a set of POIs P = {p_1, p_2, ..., p_n} and a set of labels L = {l_1, l_2, ..., l_n}, for example, manually selected by the users or downloaded from an online database. In the positioning labels in AR preprocessing (Section 4.1), for each POI p_i, the corresponding label l_i is initially placed perpendicularly to the ground plane in the world coordinate system (Figure 3(b)). Currently, each POI p_i has one associated label l_i describing the attributes of the POI. We also assume that the (x, z)-coordinates of each annotated POI are more important than the y-coordinate, since the (x, z)-coordinates are essential to indicate the relative positions of the POIs as suggested by prior work [3, 14].

The occlusion management strategy (Section 4.2) addresses (P1) and resolves occlusions of labels considering the current configuration of the device. The labels are first sorted by distance from the device into a list S, from the nearest to the farthest positions. With this information, we resolve occlusions starting with the closest label and using a greedy approach (see Figure 3(c)). The greedy approach iteratively arranges the labels at the lowest y-positions at which they are visible. This allows effective execution of the occlusion handling on mobile devices, where the computation power is limited compared to desktop computers.
The occlusion strategy provides a solution to otherwise inconsistently moving labels when the viewing angle of the AR device changes [35].

In the level-of-detail management (Section 4.3), we introduce four distinct types of label encodings for (P2) to represent three levels-of-detail (LODs, Figure 2(a)) of an individual label and one super label to indicate an aggregated group of labels for visual clutter reduction (Figure 2(b)). The level-of-detail management depicts a different amount of information for each label (see Figure 3(d)). The LOD of a label l_i is selected according to the distance of the annotated POI to the device and the label density in the view volume. For convenience, we assume that close labels get at least as much screen space as distant labels, since it is natural to show objects larger when they are close by. However, different configurations can also be incorporated by adding rays in the occlusion detection. Super labels (Figure 2(b)) are representative labels that depict a set of aggregated labels in order to reduce visual clutter. Figure 2(b) gives an example of a super label for the Tokyo Disneyland Dataset. The themed area
Adventureland is aggregated and the blue background color of the super label encodes the average waiting time. A color legend at the bottom of the label presents the individual waiting times of the aggregated attractions in this themed area.

Figure 3: The input scenario (a), positioning of labels in AR (b), and the three management strategies of our approach (c)-(e): occlusion management (c), level-of-detail management (d), and coherence management (e).
Positioning labels in AR, occlusion management, and level-of-detail management are smoothly updated in the coherence management module (Figure 3(e)). To avoid flickering that inevitably reduces coherency [16], the labels are not moved or changed immediately, but follow a common animation policy, by strategically updating changes over time (Section 4.4) to solve problem (P3).

4 CONTEXT-RESPONSIVE LABELING MANAGEMENT
Our approach positions labels in AR space, followed by a context-responsive computation. Here we introduce occlusion removal, perform level-of-detail strategies, and enforce coherent label placement. In this section, we detail the proposed technique.
In a preprocessing step, we map the geographical locations from the real world to our Cartesian AR world space. This considers the GPS position of the user's device, the GPS location of the POIs, and the compass orientation of the device [39]. The labels are oriented towards the user's position by aligning the normal vectors of the labels with the AR device in the AR world space. Once this initial label positioning is done, a perspective projection from the AR world space into the screen space of the device is performed. In doing so, we can position the labels in AR spatially relative to the position of the user to support exploration and navigation as shown by Guarese et al. [14]. In principle, existing frameworks, like the
AR + GPS Location SDK package [11] or the
Wikitude AR SDK package [12], can be used to map real-world objects to the AR world space. Unfortunately, in our experiments, these techniques are not stable due to the inaccurate GPS sensor [37] or compass [19] data of mobile devices. To test and assess the quality of the coherence strategies for the occlusion management and level-of-detail management, we predefine the positions of labels at the (x, z)-coordinates in the AR world space. The existing libraries do not provide stable label positions, which would lead to a less coherent behavior that is not caused by the proposed coherence management. Once the labels are placed, we order the labels based on their distance to the user for future computations.

Showing many labels simultaneously on a mobile device will, unfortunately, lead to occlusions of labels, especially if the annotated POIs are close to each other or even hidden by other labels (Figure 4(a)). Point-feature labeling has been extensively investigated due to its NP-hardness, even when looking for an optimal solution just in 2D [5]. In our setting, occlusions change over time, since the users move. Fast responsive management strategies are required to update the scene regularly. Viewing angle and position changes of the user need to be accounted for to guarantee smooth state transitions and to eliminate unwanted flickering. We perform the entire occlusion handling in the 3D scene, overcoming the label positioning inconsistencies caused by viewing angle changes. The occlusion handling consists of two steps: occlusion detection and shift computation.
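Returning briefly to the preprocessing step: the GPS-to-world mapping can be approximated without an SDK by a local equirectangular projection around the device's position. This is an illustrative sketch under that assumption (the paper itself relies on SDKs and, for evaluation, predefined positions); the function and constant names are ours.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed constant)

def gps_to_ar_world(lat: float, lon: float, ref_lat: float, ref_lon: float) -> tuple:
    """Project a GPS coordinate to (x, z) meters relative to a reference point
    (e.g., the device position) using a local equirectangular approximation,
    which is adequate for the short distances of a navigation scene."""
    d_lat = math.radians(lat - ref_lat)
    d_lon = math.radians(lon - ref_lon)
    x = EARTH_RADIUS_M * d_lon * math.cos(math.radians(ref_lat))  # east offset
    z = EARTH_RADIUS_M * d_lat                                    # north offset
    return (x, z)
```

The resulting (x, z) pair maps directly onto the ground plane of the world coordinate system described above; the compass orientation would then rotate this frame to match the device heading.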
We employ ray tracing to detect occlusions, which is different from existing approaches [2]. As the labels have been sorted by the distance to the user, the occlusions are detected and solved iteratively from label l_1 to label l_n of the sorted list S. For each label l_i, the origins of four rays are set to the location of the user's device in AR. The rays run through the corner points of label l_i as shown in Figure 4(b). If another label is hit during the ray traversals, an occlusion occurs. To ensure that all possible occlusions will be detected, we assume that labels closer to the viewer are either larger or as large as labels farther away. This allows us to use just four rays to detect 3D occlusions effectively. The approach works for rectangular shapes or rectangular bounding boxes of polygonal shapes and could be extended to polygons or 3D objects (e.g., buildings in MR). Other configurations can be accommodated by increasing the number of rays. Figure 4(b) gives an example, where label l_1 (orange) is in front of label l_2 (red). In this case, the corner ray 1 of label l_2 collides with label l_1, indicating that label l_1 occludes label l_2. Since we assume that closer labels are always larger or as large as farther away labels, no occluding labels will be missed during the occlusion detection.

Once the occlusions are detected, we can iteratively shift the labels greedily in the order of increasing distance. Since the labels are shifted from the closest to the farthest one, the label l_i will be located either at its initial (x, z)-coordinates or above the previous label l_{i-1} along the y-axis. Figure 4(c) illustrates the basic shift of label l_2. The blue lines represent the corner rays for occlusion detection and the gray line shows the traversed ray for calculating the occlusion-free position of label l_2. Figure 4(d) depicts an occlusion-free result after shifting label l_2, where the shift distance d is |y' − y|. Szirmay-Kalos et al.
[34] proved that the ray-tracing approach requires at least logarithmic computation time in the worst case based on the number of scene objects. On the other hand, modern platforms already provide real-time ray tracing [36]. In our approach, the occlusion management takes O(n²) time if labels are aligned in a sequence along the current viewing direction, since the current label l_i possibly needs to be shifted above each label in front of it. We show a comparison with different label alignments in Section 5. The greedy label placement terminates as soon as no other label in front occludes label l_i.

Labels occupy space that is a scarce resource on a mobile device, especially if many labels should be shown simultaneously. To reduce unwanted visual clutter, we introduce an LOD concept for labels [26] and incorporate a level-of-detail management in the pipeline (Figure 3(d)). The LOD is also computed based on the sorted distances of labels and the label density. The LOD selection consists of two steps: LOD calculation and super label aggregation.

Figure 4: Illustration of the occlusion management. (a) The labeled 3D scenario in top view. Label l_1 and label l_2 are in the current view volume. (b) Occlusion by l_1 (transparent and orange), which is in front of l_2 (red). The ray at corner 1 of l_2 intersects the occluding label l_1. (c) Label l_2 is shifted above the gray ray by the distance d to resolve the occlusion. (d) The blue corner rays do not collide with a label in front anymore.

In our implementation, the lowest LOD occupies the least space and includes a colored rectangle and an icon (Figure 2(a)). The middle LOD presents a colored rectangle, the icon, and an iconic image (photo) of the POI (Figure 2(a)). The highest LOD contains a text tag and occupies the most space (Figure 2(a)). The level-of-detail for each label changes when the user navigates through the scene.
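Returning to the occlusion handling above: the detection-and-shift loop can be sketched in simplified form. Instead of casting explicit corner rays, the sketch exploits the equivalence that a ray from the viewer through a corner of a farther label hits a nearer label exactly when the two labels' projections onto a unit-distance plane overlap. Labels are modeled as viewer-facing rectangles, and the fixed upward shift step (the paper computes the exact shift distance d instead) as well as all names are illustrative assumptions.

```python
def projected(label):
    """Project a viewer-facing label rectangle onto the plane at distance 1.
    label: dict with x, y (lower-left corner), w, h (size), d (distance)."""
    x, y, w, h, d = label["x"], label["y"], label["w"], label["h"], label["d"]
    return (x / d, (x + w) / d, y / d, (y + h) / d)

def overlaps(a, b):
    """Axis-aligned rectangle overlap test in projected coordinates."""
    ax0, ax1, ay0, ay1 = a
    bx0, bx1, by0, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1

def resolve_occlusions(labels, step=1.0):
    """Greedy occlusion removal: process labels nearest-first and shift each
    one upward along the y-axis until it no longer occludes or is occluded
    by any nearer, already placed label (cf. Figure 4)."""
    placed = []
    for lab in sorted(labels, key=lambda l: l["d"]):
        while any(overlaps(projected(lab), projected(p)) for p in placed):
            lab["y"] += step  # shift upward, keeping the (x, z) position
        placed.append(lab)
    return placed
```

As in the paper, the nearest label keeps its initial position, and each subsequent label ends up either unmoved or stacked above the labels in front of it.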
The LOD for each label depends on the distance to the user and the label density. For each label, a virtual view volume aligned to the (x, z) ground plane is constructed to mimic that the user would look into the direction of each label. The horizontal distance along the (x, z) ground plane and the vector from the user to the position of each label are used. If the angle between these two vectors is above a threshold t (45° by default in our system), the label is located outside the aligned view volume. We split the view volume, and each label below the threshold m_1 (20° by default) receives the highest LOD until one label exceeds the angle m_1. The remaining labels are displayed in the middle LOD until reaching the threshold m_2 (30° by default). If a label exceeds m_2, it will be displayed in the lowest LOD. The threshold angles can be changed according to user preferences. The LODs of all labels are consistent when the viewing angle of the device changes for the current user position. The level-of-detail management provides coherent movement when rotating the AR device. The LODs for the labels are updated if the user moves.

To further reduce visual clutter, we introduce super labels that group individual labels (see Section 3.1). The position of a super label is calculated as the average (x, z)-position of the individual labels that are part of the aggregation in the 3D scene. A predefined grouping (i.e., themed areas of amusement parks) of labels is necessary to compute the super labels, while unsupervised clustering algorithms can also be directly applied. We do not aggregate labels of the closest predefined group considering the position of the user. Individual labels in the close surroundings of the user are always displayed and not aggregated, supporting the exploration process. We only aggregate individual labels to super labels if the user is located outside of the respective label group.
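The angle-based LOD calculation can be sketched as follows. This simplification assigns the LOD from the horizontal ground-plane angle alone, using the default thresholds stated above (t = 45°, m_1 = 20°, m_2 = 30°), and omits the density-dependent splitting of the view volume; positions and directions are 2D (x, z) pairs.

```python
import math

def lod_for(label_pos, user_pos, view_dir, t=45.0, m1=20.0, m2=30.0):
    """Select a level-of-detail from the horizontal angle between the device's
    viewing direction and the direction from the user to the label.
    All arguments are (x, z) ground-plane tuples; thresholds are degrees."""
    dx = label_pos[0] - user_pos[0]
    dz = label_pos[1] - user_pos[1]
    to_label = math.atan2(dz, dx)
    facing = math.atan2(view_dir[1], view_dir[0])
    # signed angular difference, wrapped to (-180, 180], then taken absolute
    angle = abs(math.degrees(math.atan2(math.sin(to_label - facing),
                                        math.cos(to_label - facing))))
    if angle > t:
        return "outside"  # label lies outside the aligned view volume
    if angle <= m1:
        return "high"     # text tag, photo, icon, and colored rectangle
    if angle <= m2:
        return "middle"   # photo, icon, and colored rectangle
    return "low"          # icon and colored rectangle only
```

Because the result depends only on the user's position and the label directions, rotating the device does not change the assigned LODs, matching the coherence property described above.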
To avoid unwanted flickering, we incorporate smooth transitions for each movement and change. Smooth transitions are implemented if positions of labels change to be occlusion-free during the interaction with the system, if LODs of labels change, or if labels are aggregated to super labels. We investigated ten different easing functions, including linear, and various quadratic and cubic equations, for the transitions to further increase the coherency. For comparison, we refer readers to the supplementary videos. We believe that the ease-in ease-out sine function (Eq. 1) represents the best easing function as it provides harmonic transitions. The easing function can be changed based on user preferences. Let t_transition be the duration for a transition to be completed. The variables t_start and t_current indicate the start time and the current time during the transition. The function e(t_current) represents the easing function for a smooth transition:

e(t_current) = -0.5 * (cos(π * (t_current - t_start) / t_transition) - 1). (1)

Due to the interaction of the user, occlusion-free label positions may vary from one frame to the next. If the labels would simply be displayed at the newly calculated positions, the labels might abruptly change their positions, which destroys the users' experience since the labels do not move in a coherent way. To allow the user to better keep track of the labels, we implemented smooth transitions from the previous locations of the labels to the newly calculated ones. We interpolate between the original positions and the newly calculated positions of the labels. The position for label l_i is updated every frame until it reaches its destination. Let p_goal(l_i) be the new occlusion-free label position and p_start(l_i) the label position at the start of the transition. We calculate the current label position for label l_i:

p(l_i) = p_start(l_i) + (p_goal(l_i) - p_start(l_i)) * e(t_current). (2)

If the LOD for a label changes, the transition needs to be smoothed to avoid flickering and allow a coherent user experience. The LODs of labels change over time, and we adapt the alpha channel to achieve a smooth transition. In this way, the iconic images, the icons, and the text tags fade in or out using

α(l_i) = { e(t_current) if b = 1; 1 - e(t_current) if b = 0 }, (3)

where α(l_i) is the alpha value of the iconic image, the icon, or the text tag of label l_i. Since our easing function e (in Eq. (1)) returns a value between 0 and 1, the result can be used to set the alpha channel in Eq. (3). The variable b indicates if the object should become invisible (b = 0) or visible (b = 1).

If labels are aggregated to super labels, individual labels will be moved to the respective super label positions in the scene. Simultaneously, we fade in the super labels and fade out the labels by interpolating the alpha channels. If individual labels are aggregated, the labels move towards their super label and disappear. If an aggregation is split up again, coherency is achieved analogously. If the alpha channel of a super label is decreased, the individual labels reappear over time and move back to their respective positions (Eqs. (4), (5), and (6)). Let l_i be a label that will be aggregated into a super label l_s. The alpha values of l_i and l_s and the position of l_i are computed as follows:

α(l_i) = 1 - e(t_current) (4)
α(l_s) = e(t_current) (5)
p(l_i) = p_start(l_i) + (p(l_s) - p_start(l_i)) * e(t_current) (6)

5 EXPERIMENTAL RESULTS
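As an implementation note before the results: the easing-based transitions of Section 4.4 (Eqs. (1) and (2)) translate directly into code. The sketch below is illustrative; the clamping of the normalized time to [0, 1] is our addition to keep finished transitions at their goal.

```python
import math

def ease(t_current, t_start, t_transition):
    """Ease-in ease-out sine easing (Eq. (1)): 0 at the start of the
    transition, 1 once t_transition has elapsed."""
    s = min(max((t_current - t_start) / t_transition, 0.0), 1.0)
    return -0.5 * (math.cos(math.pi * s) - 1.0)

def interpolate(p_start, p_goal, e):
    """Per-frame label position update (Eq. (2)): interpolate each coordinate
    from the start position toward the goal position by the eased fraction e."""
    return tuple(a + (b - a) * e for a, b in zip(p_start, p_goal))
```

The alpha fades of Eqs. (3)-(5) reuse the same easing value, either directly (fade in) or as its complement (fade out).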
To assess the applicability of our technique, we investigate three different use cases, including a (1) Synthetic Dataset, a (2) Local Shops Dataset, and the (3) Tokyo Disneyland Dataset. The Synthetic Dataset shows different variations of label layouts. The Local Shops Dataset provides a real-world example, where the labels are close and next to each other. The Tokyo Disneyland Dataset presents another real-world scenario, where the labels are spread out in the 3D scene. We use Unity as the visualization platform [36] and incorporate the Vuforia engine [31] to arrange objects in AR. The images shown in this section were taken using a Xiaomi Mi A2 device (Qualcomm Snapdragon 660 processor and 4 GB RAM) with Android 10 in portrait mode.
We study three different label layouts of the Synthetic Dataset (Figure 5) and compute the execution time measured on the Xiaomi Mi A2 mobile device. The three layouts are a circle layout, a grid layout, and a line layout, which are computationally increasingly expensive. This assumption is based on the fact that if more labels are hidden in the current viewing direction, more occlusion removal steps are necessary. Figure 6 gives the execution times of all layouts in milliseconds for varying numbers of labels. The labels in this dataset have a height and width of 120 world space units by default in Unity.

The circle layout (Figure 5(a)) requires the least computation time to resolve occlusions since many labels are initially arranged without occlusion issues. The radius of the circle layout is set to 1,000 world space units in this experiment. The grid layout (Figure 5(b)) distributes the labels equally, leading to densely placed labels in the scene. In our setting, the number of labels per row is equal to √n, where n is the total number of labels in Figure 6. If √n is not an integer, the layout contains one partial label row in the grid. The size of the grid is 4,000 × 4,000 world space units, and it includes both near and far labels in the world space. The line layout (Figure 5(c)) represents the worst-case example. The labels are located one after another, which leads to the maximum number of occlusion removal steps. The labels are placed 90 world space units behind each other. As shown in Figure 6, resolving occlusions for the grid layout leads to higher computation times than the circle layout, but lower computation times compared to the line layout. The
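The three synthetic layouts can be generated with a few lines. The following is our own sketch of the setup described above (circle radius, line spacing, and the √n-per-row grid rule follow the text; taking ⌈√n⌉ labels per row and the grid step size are our assumptions), not the authors' Unity code:

```python
import math

def circle_layout(n, radius=1000.0):
    """n labels evenly spaced on a circle of the given radius (Fig. 5(a))."""
    step = 2 * math.pi / n
    return [(radius * math.cos(i * step), radius * math.sin(i * step))
            for i in range(n)]

def grid_layout(n, size=4000.0):
    """n labels on a grid with about sqrt(n) labels per row (Fig. 5(b)).
    We take ceil(sqrt(n)), so a non-square n leaves one partial row."""
    per_row = math.ceil(math.sqrt(n))
    step = size / max(per_row - 1, 1)
    return [((i % per_row) * step, (i // per_row) * step) for i in range(n)]

def line_layout(n, spacing=90.0):
    """n labels placed spacing units one behind another (Fig. 5(c));
    the worst case, since every label initially occludes the next."""
    return [(0.0, i * spacing) for i in range(n)]
```

For example, `line_layout(3)` yields [(0.0, 0.0), (0.0, 90.0), (0.0, 180.0)], the stacked worst-case arrangement.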
Local Shops Dataset contains shop locations, types of shops, and the number of people inside a shop (per m²) of a strip mall (Figure 7). The icons indicate the respective shop types (e.g., clothing, shoes, and groceries). Considering the current COVID-19 regulations, we encode the number of people per m² to identify the customer density, or COVID-19 safety measure, in the shop in real-time. In Figure 7, we use a color scale from white to red. The text displays the name and the measure accordingly. Figure 7 gives an explanatory result in which the device is tilted. As shown there, the placement of the labels is not influenced by the tilt. The rectangular labels remain parallel to the ground.

Figure 5: An example of the Synthetic Dataset in top view with the displayed results beneath. Labels are arranged on a (a) circle, (b) grid, and (c) line.

Figure 6: Computation times for removing occlusions.
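The white-to-red color scale for the customer density can be sketched as a linear interpolation in RGB. This is our own illustration; the maximum-density cap is a placeholder assumption, not a value from the paper:

```python
def density_color(density, max_density=1.0):
    """Map people per m² to a white-to-red RGB color (as in Figure 7).

    density 0 maps to white (255, 255, 255); max_density (an assumed
    cap) maps to pure red (255, 0, 0); green and blue fade linearly.
    """
    t = min(max(density / max_density, 0.0), 1.0)
    fade = round(255 * (1.0 - t))
    return (255, fade, fade)
```

For instance, an empty shop maps to white and a shop at the assumed cap to pure red, with pink tones in between.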
The Tokyo Disneyland is one of the most popular amusement parks in the world. Many visitors often need to line up for hours to enjoy a specific attraction, and many magazines and blogs guide visitors to optimize their one-day visit [4]. The amusement park consists of 35 big attractions, all of which we mark as POIs in our system to give an overview of the park. In the park, themed areas, such as the Westernland, are subregions grouping several attractions for convenience. We use the themed areas of the amusement park to aggregate labels and present each area using the corresponding super label.

Once the label positions in AR have been preprocessed, labels might initially be occluded. Figure 8 compares the results for the same position and viewing angle. Initially, the labels are occluded, as shown in Figure 8(a), and the respective occlusion-free result is given in Figure 8(b). Since the occlusion-free results are independent of the viewing angle of the device, no incoherent label movement occurs when the user rotates the device. The occlusions are resolved for all the labels around the user as explained in Section 3 and Section 4.2. Labels closer to the user are more likely to stay close to their initial positions than labels that are farther away. The two closest labels in Figure 8 are Big Thunder Mountain and Mark Twain's Riverboat, showing an iconic image of a train and a boat, respectively. The positions of these two labels are not changed. Labels that are occluded by these two labels are shifted upwards during the occlusion management. Figure 9 depicts the transition of a super label to its individual labels. The super label represents the Westernland themed area of the Tokyo Disneyland.

Figure 7: An example with a tilted mobile device.

Figure 8: Occlusions that occur in (a) are resolved in (b).

Figure 10 presents different LODs of the themed area
Westernland. Figure 10(a) shows all labels in the lowest LOD, consisting of a colored rectangle encoding the waiting time and an icon indicating the attraction type. This LOD provides the simplest overview of the attractions, and it presents the least amount of information, as only the attraction types and the color encodings are included. Figure 10(b) illustrates the middle LOD, adding an iconic image to the encoding. In this case, the type icon is less dominant than in the lowest LOD. Figure 10(c) depicts the highest LOD, adding a text tag stating the name and the exact waiting time of the attraction in minutes. This LOD provides the most detailed information. However, compared to the lowest and middle LOD, more vertical stacking of labels is necessary to resolve occlusions during the occlusion handling. Figure 10(d) presents the label placement of the themed area Westernland once the dynamic LOD selection is enabled. This solution constitutes a compromise between the presented amount of information and the label displacement. It includes detailed information about close attractions and keeps the vertical stacking of labels low compared to the highest LOD. The preferred LOD might vary depending on the use case and the user's preference (see Section 6). Each LOD has its benefits and drawbacks, with the dynamic LODs being the most versatile, as they present detailed information about close labels and avoid excessive vertical stacking (see Section 6). Figure 11 exemplifies lateral translations of the user and the resulting label arrangements. Figure 11(a) and Figure 11(c) correspond to the initial positions. In Figure 11(b) and Figure 11(d), the user has moved laterally to the right. The label positions are updated smoothly depending on the movement of the user.
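The dynamic LOD selection can be sketched as a simple distance-based rule. The thresholds below are placeholders of our own (the system lets users adjust the switching thresholds), and the function name is ours, not from the paper:

```python
def select_lod(distance, near=500.0, far=1500.0):
    """Pick a level-of-detail from a label's distance to the user.

    Close labels get the highest LOD (rectangle, icon, iconic image,
    and text tag), far labels the lowest (rectangle and icon only).
    The near/far thresholds are illustrative and assumed adjustable.
    """
    if distance <= near:
        return "highest"   # adds the text tag with name and waiting time
    if distance <= far:
        return "middle"    # adds the iconic image
    return "lowest"        # colored rectangle and type icon only

# Hypothetical distances in world space units (not the actual dataset).
lods = {d: select_lod(d) for d in (300.0, 900.0, 2000.0)}
```

With these placeholder thresholds, a label 300 units away is rendered in the highest LOD, one at 900 units in the middle LOD, and one at 2,000 units in the lowest LOD.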
Qualitative Evaluation
We conducted an online survey to evaluate the effectiveness and the applicability of our approach. Primarily, we aim to confirm the appropriateness of the selected design principles, based on users' preferences, by examining task performance in terms of required time and result accuracy. Our hypotheses for the study are summarized as follows:

(H1) The design principle of removing label occlusions has a higher priority than showing precise positions of labels.

(H2) Rich label design in AR leads to a better POI exploration and decision-making experience than plain text labels.

(H3) Users can perform route planning tasks faster with our system than with conventional maps.

We further decompose our hypotheses into four main tasks, as summarized in Table 1, for an online questionnaire. In the future, we plan to conduct an in-person user study. For each measurable task, time and accuracy were collected. After each task, we also asked participants to provide reasons regarding their experience when performing the task. At the end of the entire questionnaire, we requested general feedback and collected some personal information for further analysis (e.g., age, educational background, experience with AR devices, and so forth). Privacy agreements were obtained prior to the user study, and the collected data is carefully stored without identification of the participants. In total, we recruited 30 participants, who are experienced with visualization techniques or are graduate students of visual computing. The age of the participants ranges from 24 to 64 years, with the majority being in their late twenties or early thirties. One limitation of the user study comes from the limited access to a general audience, while experience in visual computing helps the participants to answer the questions smoothly. We used a within-subjects study design, where all variable conditions were tested per participant in order to analyze individual behaviors in more depth. Questions in each task are also randomized to avoid a learning effect. For more details, we refer to the accompanying supplementary materials.
Task 1 – Impact of occlusion on attribute tasks and comparative tasks
  Q1: What is the waiting time of an attraction?
  Q2: Which attraction has the minimal waiting time?
Task 2 – Effectiveness of levels-of-detail
  Q3: Which themed area has the minimal waiting time? (with LOD variations)
  Q4: Which LOD do you prefer?
Task 3 – Effectiveness of 2D maps and our AR encoding
  Q5: Choose the attraction with the minimal waiting time in the specified themed area
Task 4 – Combinatorial features in our system
  Q6: Provide your feedback on different configuration settings

Table 1: Overview of the tasks in the user study.

(H1) addresses the importance of resolving label occlusions in AR. As described in Section 2, existing work concludes that resolving occlusions in AR supports the decision-making process of the users [13]. In Task 1, we show participants a few snapshots (see supplementary materials) of our system and ask them to determine the waiting time of a specified attraction (Q1) and to select the attractions with minimal waiting times (Q2). Only three participants managed to select the correct waiting times when occlusions occurred, and the participants stated that the waiting times were not recognizable in such a situation. Figure 12 summarizes task completion time and accuracy. The time needed to answer the questions decreased (Q1 from 33 s to 12 s, Q2 from 21 s to 13 s), and the number of correct answers increased (Q1 from 10% to 86.67%, Q2 from 3.33% to 100%) when showing results with our occlusion management (Figure 12). 24 participants explicitly stated that it was difficult or impossible to select the correct answers if information is occluded, and 24 participants agree that the occlusion-free positioning eases decision-making processes when investigating the labels.

For hypothesis (H2), we design questions in Task 2, where participants need to take several attributes into account to answer the questions. In Q3, the participants were asked to select the themed area of the amusement park with the lowest average waiting time. We showed participants images with labels of different LOD settings, including text labels, labels with the lowest LOD, and super labels. The time needed to answer the questions for this task is summarized in Figure 12(a). The participants, in general, spent more time if only text labels were present (52 s on average), since they probably had to calculate the correct number to answer the questions properly. If we present information using the lowest LOD, a shorter time (23 s on average) was required in comparison to pure text labels. Using super labels achieved a similar performance; participants spent 21 s to answer the questions. If the waiting time is depicted using text labels or labels in the lowest LOD, the themed area with the minimal average waiting time was correctly selected by 73.33% of the participants. 90% of the participants selected the correct answers if the super labels were shown (Figure 12(b)).

In Q4, we ask participants which LOD they prefer. We presented text labels, labels in one of the three LODs, and labels in dynamic LODs as computed by our level-of-detail management. The dynamic LODs were chosen as the favorite approach by 40% of the participants. 53.33% of the participants preferred the highest LOD. Participants who selected the dynamic LODs as their favorite design emphasized that the vertical stacking of labels is reduced while detailed information about close attractions is preserved. The participants who chose the highest LOD as their favorite design appreciated the detailed information that can be used in decision making. It is surprising that they were not disturbed or annoyed by the excessive vertical stacking of the labels. The dynamic LODs avoid this excessive vertical stacking while presenting more information about close labels and less information about far labels. To check the vertical stacking, Figure 13(a) compares the highest LOD and dynamic LODs. The more information is included for a label, the higher is the chance that the label needs to be shifted upwards and stacked. We recorded the y-coordinate of the highest label of the two methods as a representative value for each themed area. The height of the stacked labels can be effectively reduced when using the dynamic LODs.

Figure 9: Transition from a super label to the individual labels for each POI over time.

Figure 10: A comparison of different LODs and dynamic LODs (applying the level-of-detail management): (a) lowest LOD, (b) middle LOD, (c) highest LOD, (d) dynamic LODs.

Figure 11: Lateral transitions.

For hypothesis (H3), we aim to compare the decision-making effectiveness when using 2D paper maps versus our AR encoding in Task 3. We again measure the task completion time and accuracy between using a Tokyo Disneyland map and our visualization. As preprocessing, we first removed other POIs (e.g., shops or restaurants) and left the 35 big attractions from the official 2D map of the amusement park, to increase the fairness of the comparison. More details about the task are included in the supplementary material. 60% of the participants selected attractions with minimal waiting times of a themed area when using the 2D map, and 83.33% when the AR encoding was employed (Figure 13(b)). The average time that the participants needed to select an attraction using the 2D map was 58 s, while they spent 32 s on average when using our approach, which clearly shows a reduced effort for POI selection (Figure 13(c)).

In the feedback session, participants were allowed to freely comment on the presented approach. Videos were shown highlighting the dynamic behavior of our tool when the user interacts with the system. Two participants mentioned that they prefer 2D maps compared to AR since 2D maps give a global top view. However, they performed the tasks in the user study better with the AR setting. We believe that both 2D maps and AR systems have strengths and weaknesses depending on the tasks and use cases. In our study, we have shown that for navigation purposes, AR systems can be more practical. Two participants also suggested combining 2D maps with AR systems, as done by Veas et al. [38]. This could allow us to exploit the advantages of both approaches and achieve a similar result as in Google Maps and
Google Street View. Other participants would prefer super labels combined with the highest LOD. This could reduce visual clutter, but might lead to a higher vertical stacking of labels compared to dynamic LODs. We, therefore, allow users to adjust the thresholds for switching LODs to accommodate this preference. The occlusion handling and the smooth transitions were positively mentioned by participants in the general feedback. Examples include: "I really like the occlusion management, to my eyes, it's almost seamless." and "Active occlusion handling is much superior to no occlusion handling." Super label aggregation has been another popular and specifically mentioned feature. Participants appreciate the overview of the themed areas, giving feedback such as "I like the super label transitions if there are many attractions because it gives a good overview of an area." and "I like the super label transitions the most." Overall, all participants expressed interest in using our system for navigation purposes.

Figure 12: (a) Task completion times (in seconds) and (b) accuracy of Q1 to Q3. The error bars represent the standard errors.

Figure 13: (a) Combined height of the stacked labels. (b) Accuracy and (c) task completion times (in seconds) of Q5. The error bars show the standard errors.
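The error bars in Figures 12 and 13 show standard errors. For reference, the standard error of the mean is the sample standard deviation divided by √n; the sample values below are hypothetical, not the study data:

```python
import statistics

def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / (len(samples) ** 0.5)

# Hypothetical task completion times in seconds (illustrative only).
times = [12.0, 14.0, 10.0, 12.0]
se = standard_error(times)  # about 0.816
```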
Limitations
The limitations of our system are inherited from the hardware, especially the accuracy of mobile GPS. The position and particularly the rotation data from the available Xiaomi Mi A2 smartphone and the Google Nexus C tablet are not consistent in our experience. A less coherent behavior of our system follows, as the sensor data from each of the two devices is not stable. This, unfortunately, limits the capability to fully utilize the application, although we envision that this will sooner or later be solved by newer technologies. To remove the errors, we thus present the results using predefined label positions in the AR 3D world space. This allows us to avoid those errors induced by the hardware (e.g., changes in the device position and viewing angle) that could influence the coherence of the labels. It will be interesting to collaborate with researchers focusing on high-precision GPS positioning systems.

Another limitation is that the data could contain many POIs with long text descriptions. If each label should be large enough to show the text, not much background information could eventually be depicted. The current aggregation of labels to super labels is straightforward and can easily be extended based on the use cases. One important decision criterion for the occlusion management and the level-of-detail management is the position of the user. The ordering of the labels based on the position of the user influences the resulting labeling. Furthermore, one limitation is the loss of the global overview in AR compared to 2D maps, as mentioned by related work [3, 13, 14] and two user study participants. Users need to interact with the system and look in different directions to see all the labels. The AR view only depicts the labels that are currently in front of the user in the respective view volume. In the future, we could introduce additional labels on the sides of the screen to provide hints to invisible objects.
Conclusion and Future Work
We present a context-responsive labeling framework in Augmented Reality, which allows us to introduce rich-content labels associated with POIs. The label management strategy suppresses label occlusions and incoherent label movements caused by translations and rotations of the device during user interaction. The framework presents an alternative approach for spatial data navigation. The level-of-detail management takes the position of the user and the label density in the view volume into account. The computed levels-of-detail for each label avoid excessive vertical stacking of labels, while still retaining basic information, depending on the object distance. To further reduce visual clutter, we introduce the concept of super labels, which group sets of labels. Smooth transitions have been implemented in our coherence management to avoid flickering and enable stable label movement. The evaluation shows the applicability of the proposed approach.

As a future direction, techniques will be investigated to overcome the drawback of seeing only the labels that are in the current view volume. The user should still anticipate POIs outside the view volume and retain a global overview of the annotated scene, as with 2D maps. One possibility would be to include the technique presented by Lin et al. [22] to depict labels that are currently outside the view volume and place hints at the display border of the device. Considering the positioning accuracy, it would be interesting to include so-called Dual-Frequency GPS [9] or Continuous Operating Reference Stations (CORS) [7], as investigated by related work, to improve the sensor accuracy of mobile devices [19]. A selection scheme with the integration of service providers (e.g., OpenStreetMap or Google Maps with large POI data) could improve the system usability.

Acknowledgments
Part of the research was enabled by VRVis, funded in COMET (879730), a program managed by FFG.

References

[1] K. Been, M. Nöllenburg, S.-H. Poon, and A. Wolff. Optimizing active ranges for consistent dynamic map labeling. Computational Geometry, 43(3):312–328, 2010.
[2] M. A. Bekos, B. Niedermann, and M. Nöllenburg. External labeling techniques: A taxonomy and survey. Computer Graphics Forum, 38(3):833–860, 2019.
[3] B. Bell, S. Feiner, and T. Höllerer. View management for virtual and augmented reality. In Proceedings of the 14th Annual ACM Symposium on User Interface Software and Technology, pp. 101–110, 2001.
[4] T. Bricker. Tokyo Disneyland planning guide. Disney Tourist Blog, 2020.
[5] J. Christensen, J. Marks, and S. Shieber. Placing text labels on maps and diagrams. Graphic Gems, 4:497–504, 1994.
[6] L. Čmolík, V. Pavlovec, H.-Y. Wu, and M. Nöllenburg. Mixed labeling: Integrating internal and external labels. IEEE Transactions on Visualization and Computer Graphics, pp. 1–14, 2020.
[7] P. Dabove and V. Di Pietra. Towards high accuracy GNSS real-time positioning with smartphones. Advances in Space Research, 63(1):94–102, 2019.
[8] A. Devaux, C. Hoarau, M. Brédif, and S. Christophe. 3D urban geovisualization: in situ augmented and mixed reality experiments. In ISPRS Technical Commission IV Symposium, vol. IV-4, pp. 41–48, 2018.
[9] A. Elmezayen and A. El-Rabbany. Precise point positioning using world's first dual-frequency GPS/Galileo smartphone. Sensors, 19(11):2593, 2019.
[10] B. Ens, J. Lanir, A. Tang, S. Bateman, G. Lee, T. Piumsomboon, and M. Billinghurst. Revisiting collaboration through mixed reality: The evolution of groupware. International Journal of Human-Computer Studies, 131:81–98, 2019.
[11] D. Fortes. AR + GPS Location. https://assetstore.unity.com/packages/tools/integration/ar-gps-location-134882, 2020. [Online; accessed 09-June-2020].
[12] W. GmbH. Wikitude Cross Plattform Augmented Reality SDK. 2020. [Online; accessed 25-October-2020].
[13] R. Grasset, T. Langlotz, D. Kalkofen, M. Tatzgern, and D. Schmalstieg. Image-driven view management for augmented reality browsers. In Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), pp. 177–186, 2012.
[14] R. L. M. Guarese and A. Maciel. Development and usability analysis of a mixed reality GPS navigation application for the Microsoft HoloLens. In Proceedings of the Computer Graphics International Conference, pp. 431–437, 2019.
[15] J. Hoffswell, W. Li, and Z. Liu. Techniques for flexible responsive visualization design. pp. 1–13, 2020.
[16] J. Jia, Y. Zhang, X. Wu, and W. Guo. Image-based label placement for augmented reality browsers. In Proceedings of the 4th International Conference on Computer and Communications (ICCC), pp. 1654–1659, 2018.
[17] T. Köppel. Git repository. https://github.com/1327052/ARContextLabeling.git. [Online; accessed 10-February-2021].
[18] D. Kouřil, L. Čmolík, B. Kozlikova, H.-Y. Wu, G. Johnson, D. Goodsell, A. Olson, M. E. Gröller, and I. Viola. Labels on levels: Labeling of multi-scale multi-instance and crowded 3D biological environments. IEEE Transactions on Visualization and Computer Graphics, 25:977–986, 2019.
[19] T. Kuhlmann, P. Garaizar, and U.-D. Reips. Smartphone sensor accuracy varies from device to device in mobile research: The case of spatial orientation. Behavior Research Methods, 2020.
[20] T. Langlotz, T. Nguyen, D. Schmalstieg, and R. Grasset. Next-generation augmented reality browsers: rich, seamless, and adaptive. Proceedings of the IEEE, 102(2):155–169, 2014.
[21] C. Lin. Crossing-free many-to-one boundary labeling with hyperleaders. In , pp. 185–192, 2010.
[22] Y.-T. Lin, Y.-C. Liao, S.-Y. Teng, Y.-J. Chung, L. Chan, and B.-Y. Chen. Outside-in: Visualizing out-of-sight regions-of-interest in a 360° video using spatial picture-in-picture previews. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, pp. 255–265, 2017.
[23] S. Maass and J. Döllner. Dynamic annotation of interactive environments using object-integrated billboards. In Proceedings of the 14th International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision (WSCG), pp. 327–334, 2006.
[24] S. Maass and J. Döllner. Seamless integration of labels into interactive virtual 3D environments using parameterized hulls. In Computational Aesthetics in Graphics, Visualization, and Imaging, 2008.
[25] J. Mackinlay. Automating the design of graphical presentations of relational information. ACM Trans. Graph., 5:110–141, 1986.
[26] K. Matkovic, H. Hauser, R. Sainitzer, and M. E. Gröller. Process visualization with levels of detail. In Proceedings of the IEEE Symposium on Information Visualization, pp. 67–70, 2002.
[27] R. Mazza. Introduction to Information Visualization. Springer Science & Business Media, 2009.
[28] D. D. McMahon, C. C. Smith, D. F. Cihak, R. Wright, and M. M. Gibbons. Effects of digital navigation aids on adults with intellectual disabilities: Comparison of paper map, Google Maps, and augmented reality. Journal of Special Education Technology, 30(3):157–165, 2015.
[29] S. Oeltze-Jafra and B. Preim. Survey of labeling techniques in medical visualizations. In Proceedings of the 4th Eurographics Workshop on Visual Computing for Biology and Medicine, VCBM '14, pp. 199–208, 2014.
[30] B. Preim and P. Saalfeld. A survey of virtual human anatomy education systems. Computers & Graphics, 71:132–153, 2018.
[31] PTC. Vuforia: Market-Leading Enterprise AR. 2020. [Online; accessed 02-June-2020].
[32] M. Schneider, A. Bruder, M. Necker, T. Schluesener, N. Henze, and C. Wolff. A field study to collect expert knowledge for the development of AR HUD navigation concepts. In Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications: Adjunct Proceedings, pp. 358–362, 2019.
[33] M. Sereno, X. Wang, L. Besançon, M. J. Mcguffin, and T. Isenberg. Collaborative work in augmented reality: A survey. IEEE Transactions on Visualization and Computer Graphics, 2021.
[34] L. Szirmay-Kalos and G. Márton. Worst-case versus average case complexity of ray-shooting. Computing, 61(2):103–131, 1998.
[35] M. Tatzgern, D. Kalkofen, R. Grasset, and D. Schmalstieg. Hedgehog labeling: View management techniques for external labels in 3D space. In Proceedings of the 21st IEEE Virtual Reality (VR), pp. 27–32, 2014.
[36] U. Technologies. Unity Website. https://unity.com/de, 2020. [Online; accessed 02-June-2020].
[37] M. Uradziński and M. Bakuła. Assessment of static positioning accuracy using low-cost smartphone GPS devices for geodetic survey points' determination and monitoring. Applied Sciences, 10(15):5308, 2020.
[38] E. Veas, R. Grasset, E. Kruijff, and D. Schmalstieg. Extended overview techniques for outdoor augmented reality. IEEE Transactions on Visualization and Computer Graphics, 18(4):565–572, 2012.
[39] W. Narzt, G. Pomberger, A. Ferscha, D. Kolb, R. Müller, J. Wieghardt, Hortner, and C. Lindinger. Pervasive information acquisition for mobile AR-navigation systems. In Proceedings 5th IEEE Workshop on Mobile Computing Systems and Applications, pp. 13–20, 2003.
[40] D. Waller and L. Nadel, eds. Handbook of Spatial Cognition. American Psychological Association, 2012.
[41] S. White and S. Feiner. SiteLens: situated visualization techniques for urban site visits. pp. 1117–1120, 2009.
[42] J. M. Wiener, S. J. Büchner, and C. Hölscher. Taxonomy of human wayfinding tasks: A knowledge-based approach. Spatial Cognition & Computation, 9(2):152–165, 2009.
[43] W. Willett, Y. Jansen, and P. Dragicevic. Embedded data representations. IEEE Transactions on Visualization and Computer Graphics, 23(1):461–470, 2017.
[44] H.-Y. Wu, B. Niedermann, S. Takahashi, M. J. Roberts, and M. Nöllenburg. A survey on transit map layout – from design, machine, and human perspectives. Computer Graphics Forum, 39(3):619–646, 2020.
[45] H.-Y. Wu, S. Takahashi, D. Hirono, M. Arikawa, C.-C. Lin, and H.-C. Yen. Spatially efficient design of annotated metro maps. Computer Graphics Forum (EuroVis 2013), 32(3):261–270, 2013.
[46] H.-Y. Wu, S. Takahashi, S.-H. Poon, and M. Arikawa. Scale-adaptive placement of hierarchical map labels. In