A privacy-preserving approach to streaming eye-tracking data
To appear in IEEE Transactions on Visualization and Computer Graphics
Brendan David-John, Student Member, IEEE, Diane Hosfelt, Kevin Butler, Senior Member, IEEE, and Eakta Jain, Member, IEEE
[Figure 1: System diagrams of the Gatekeeper model (top) and the standalone privacy mechanism (bottom), showing the flow from eye image capture through gaze estimation and event detection to downstream applications such as redirected walking, gaze-based interfaces, AOI analysis, foveated rendering, gaze prediction, and saliency map generation.]

Fig. 1: Top: The Gatekeeper model protects identity by delivering relevant data at different levels directly through the API, while withholding raw gaze samples that contain biometric features. This approach cannot be used directly with applications that require raw gaze samples. Bottom: In scenarios where a Gatekeeper API cannot be implemented, we instead apply a privacy mechanism to raw gaze samples to serve applications that use gaze samples or event data directly.
Abstract — Eye-tracking technology is being increasingly integrated into mixed reality devices. Although critical applications are being enabled, there are significant possibilities for violating user privacy expectations. We show that there is an appreciable risk of unique user identification even under natural viewing conditions in virtual reality. This identification would allow an app to connect a user's personal ID with their work ID without needing their consent, for example. To mitigate such risks we propose a framework that incorporates gatekeeping via the design of the application programming interface and via software-implemented privacy mechanisms. Our results indicate that these mechanisms can reduce the rate of identification from as much as 85% to as low as 30%. The impact of introducing these mechanisms is less than 1.5° error in gaze position for gaze prediction. Gaze data streams can thus be made private while still allowing for gaze prediction, for example, during foveated rendering. Our approach is the first to support privacy-by-design in the flow of eye-tracking data within mixed reality use cases.

Index Terms — Privacy, Eye Tracking, Eye Movements, Biometrics

©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. DOI: DOI NO. TO APPEAR UPON PUBLICATION

• Brendan David-John is a PhD student at the University of Florida. E-mail: brendanjohn@ufl.edu.
• Diane Hosfelt was a privacy and security researcher at Mozilla at the time of writing. E-mail: [email protected]
• Dr. Kevin Butler is an Associate Professor at the University of Florida. E-mail: butler@ufl.edu
• Dr. Eakta Jain is an Assistant Professor at the University of Florida. E-mail: [email protected]
1 Introduction
As eye trackers are integrated into mixed reality hardware, data gathered from a user's eyes flows from the mixed reality platform to the applications (apps) that use this data. This data is a critical enabler for a number of mixed reality use cases: streaming optimization [63], foveated rendering [10, 66, 67, 79], redirected walking [49, 50, 54, 99], gaze-based interfaces [34, 84, 107], education [81], and social interaction [26, 61, 64, 70, 74]. The eye-tracking data also contains a variety of information about the user that is not necessarily needed by each application. For example, eye movements identify attributes such as gender, bio-markers for various health conditions, and identity. As a result, how this data is handled, and who it is passed along to, has privacy and security implications.

The problem of applications receiving data and passing it along to colluding apps or parent companies erodes public trust in technology, and cannot be "regulated away". It has received public attention in the context of similar personal devices, such as smartphones. Recently, The Weather Channel took location data it mined from users' foot traffic at different businesses, and sold it to hedge funds to inform their investments before quarterly income statements were released. Even with regulation, imagine that the weather app collecting location data colludes with an advertising application that belongs to the same parent company. The user will then be served personalized ads based on her location, such as car ads appearing after a visit to the car dealership for an oil change. Now imagine that the parent company also knows which cars she glanced at while waiting, or that she actually spent most of the time looking at the motorcycle parked out front relative to the other vehicles.

This problem becomes even more severe when we recognize that mixed reality headsets are going to have as much enterprise use as personal use. A user might log in at work to do their job-related training with their known real-world identity, but attend labor union meetings as User X to avoid negative repercussions (see https://tcf.org/content/report/virtual-labor-organizing/). The agent that connects these two identities has the power to "out" the user to her work organization.

In this paper, we have investigated the threat of biometric identification of a user from their eye movements when they are being eye tracked within immersive virtual reality environments. For several mixed reality use cases, raw eye-tracking data does not need to be passed along to the application.
As shown in Figure 1, a Gatekeeper that resides between the eye-tracking platform and applications can alleviate this threat by encapsulating raw data within an application programming interface (API). We have proposed a design for such an API in Section 4. This philosophy of serving data on a "need-to-know basis" is effective in preventing data from being used for purposes other than the originally intended one. However, there remain certain applications that rely on access to raw gaze data. In this case, we have proposed privacy mechanisms to erase identifying signatures from the raw gaze data before it is passed on to the application. We have evaluated how the proposed privacy mechanisms impact utility, i.e., what the application needs gaze data to do. Finally, we have investigated how the proposed privacy mechanisms impact applications that need access to eye events, i.e., eye-tracking data labeled as fixations, saccades, or smooth pursuits.

Our work is part of a broader thrust in the eye tracking and virtual reality communities on characterizing risks related to unregulated massive-scale user eye tracking, and developing technological mitigations for these risks. For risks associated with an adversary gaining access to the eye image itself, we direct readers to the privacy mechanisms presented in [22, 44]. For a differential privacy perspective, we direct readers to [45, 59, 97]. For a differential privacy perspective on the identification of users by colluding apps, we direct readers to the detailed analysis in [15, 97], with the caveat that the utility task considered in this body of work is gaze-based document type classification. In contrast, we focus on utility tasks that are specific to mixed reality. Our goal is to provide a foundation for future researchers and developers to organize their thinking around the risks created by the flow of behavioral data in mixed reality, and the proactive rather than reactive design of mitigation strategies.

2 Eye-Tracking Applications in Mixed Reality
We can expect eye tracking to run as a service within a mixed reality device, analogous to the way that location services run on phones today. Eye tracking is a specific case of more general behavioral tracking services in mixed reality, including head, hand, and body tracking. Mixed reality platforms such as Microsoft and Facebook will collect raw data from the native sensors, process it to perform noise removal and event detection, and pass the processed data up the software stack. Because a rich, self-sustaining mixed reality ecosystem will rely on independent content developers, a mixed reality web browser, akin to a conventional web browser, will provide the software interface to access a wide array of content for consumers. In this section, we highlight critical eye-tracking applications for mixed reality that use aggregate-level, individual-level, and sample-level gaze data.
Aggregate gaze data is collected from many viewers to drive applications such as highlighting salient regions using heatmaps [28, 82, 95], and learning perceptual-based streaming optimizations for 360° content [63, 101]. These applications typically rely on a data collection process conducted in research lab environments for a sample of viewers. Viewer data is then used to train machine-learning models or evaluate the most effective streaming methodology within the dataset. Results from the dataset are then released in aggregate form to inform the deployment of such methods on consumer devices. This provides utility to the consumer without creating privacy risks; however, training data for machine-learning models may pose a risk to privacy [25], as may publicly released datasets that include the raw gaze data used to generate aggregate representations [1, 40, 41, 57, 102].

Eye movement behavior captured by eye-tracking events, such as fixations, saccades, and smooth pursuit, contributes to gaze-based interfaces [34, 77], evaluating training scenarios [19, 30, 43], and identifying neurodegenerative diseases [75] and ASD [18]. Detecting eye-tracking events enables improved techniques for redirected walking [49, 50, 54], a critical application for VR that expands the usable space of a virtual environment within a confined physical environment. The most common method to quantify an individual's gaze behavior is to mark Areas of Interest (AOIs) within content and measure how gaze interacts with each region. Typical metrics for these regions depend on fixation and saccade events only, recording dwell times, the number of fixations or glances, and fixation order [55, 76]. Event data also poses a privacy risk, as it reveals the viewer's intent and preferences based on how gaze interacts with different stimuli.
Multiple key mixed reality applications depend on individual gaze samples from an eye tracker with a sampling rate of at least 60Hz. This includes foveated rendering [10, 66, 67, 79], which is expected to have the biggest impact on deploying immersive VR experiences on low-power and mobile devices. This application relies on gaze samples to determine where the foveal region of the user currently is, and to predict where it will land during an eye movement to ensure that the user does not perceive rendering artifacts [3]. Similarly, gaze prediction models are trained to predict future gaze points while viewing 360° imagery and 3D rendered content [40, 41].

Another key set of applications that require sample-level data are gaze guidance techniques [88, 89]. Gaze guidance takes advantage of sensitivity to motion in the periphery to present a flicker in luminance that will attract the user's eyes, using eye tracking to remove the flicker before the user can fixate upon the region and perceive the cue [8, 38]. This technique enables manipulation of visual attention, and ultimately user behavior. For example, gaze guidance in 2D environments has been shown to improve spatial information recall [7], improve training of novices to identify abnormalities in mammogram images [96], and improve retrieval task performance in real-world environments [12]. Gaze guidance has also been used to enhance redirected walking techniques in VR by evoking involuntary eye movements, and taking advantage of saccadic suppression [99]. Guiding gaze through saccades and manipulating the user allows a 6.4m × … physical space to expand upon the usable area within VR experiences. This application requires an eye tracker sampling rate of 250Hz or more, and requires sample-level data to know precisely when gaze moves towards the periphery cue. Providing sample-level data with high accuracy at this frequency poses a serious risk to user privacy in the form of gaze-based biometric features that can then be extracted from these gaze positions.

Table 1: State-of-the-art gaze-based biometric methods. Key: RBF = Radial Basis Function Network, RDF = Random Decision Forests, STAT = Statistical test, SVM = Support Vector Machine.

Method | Features | Classifier | Dataset | Results
Schroder et al. [93] | Fixation, Saccade | RBF | BioEye 2015, MIT data set | IR: 94.1%, 86.76%
Schroder et al. [93] | Fixation, Saccade | RDF | BioEye 2015, MIT data set | IR: 90.9%, 94.67%
George & Routray [35] | Fixation, Saccade | RBF | BioEye 2015 | IR: 93.5%
Lohr et al. [60] | Fixation, Saccade | STAT | VREM-R1, SBA-ST | EER: 9.98%, 2.04%
Lohr et al. [60] | Fixation, Saccade | RBF | VREM-R1, SBA-ST | EER: 14.37%, 5.12%
Eberz et al. [31] | Fixations, Binocular Pupil | SVM | [31] | EER: 1.88%
Rigas et al. [86] | Fixations, Saccades, Density maps | Multi-score fusion | [86] | EER: 5.8%, IR: 88.6%
Monaco [68] | Gaze Velocity/Acceleration | STAT | EMVIC 2014 | IR: 39.6%
3 Related Work
Human eyes reflect their owner's physical attributes. For example, algorithms can estimate the ages of users by monitoring the change in gaze patterns as they age [73, 106], their gender based on the temporal differences in gaze patterns while viewing faces [92], and their race from the racial classification of the faces they tend to look at [9]. Beyond physical attributes, gaze allows rich insights into psychological attributes, such as neurological [56] and behavioral disorders [27, 72, 80]. The eyes can also reveal whether an individual suffers from an affective disorder: anxious individuals' gaze is characterized by vigilance for threat during free viewing, while depressed individuals' gaze is characterized by reduced maintenance of gaze on positive stimuli [5]. Eye tracking has also been used to investigate gaze behavior in individuals on the autism spectrum, finding that they generally tend to fixate less on faces and facial features [13, 23].

Pupillometry, when combined with scene metadata, could allow algorithms to infer a user's sexual orientation, as shown in clinical studies measuring genital responses, offering a less invasive way to infer an individual's preferences [85]. In addition to allowing sexual orientation inferences, pupillometry can reveal insight into women's hormonal cycles using similar methodology [52]. Pupil size also reveals the user's cognitive load [29] as well as emotional arousal, as shown in studies with images [17, 53] and videos [83]. Interestingly, pupil response seems to be modulated by subconscious processing, changing when the mind wanders [100].

Body mass index (BMI) status appears to influence gaze parameters that are not under conscious control, allowing BMI estimation when presenting individuals with images of foods of differing caloric content [37]. These risks involve knowledge of both eye position and stimuli, whereas user identification can be applied to raw eye movements without knowledge of what the stimulus was.
Gaze patterns can be used to identify individuals, as they contain unique signatures that are not under a user's voluntary control [47, 48]. The Eye Movement Verification and Identification Competitions in 2012 and 2014 challenged researchers to develop algorithms that identified users based on their eye movements when they followed a jumping dot (2012) and when they looked at images of human faces (2014). The best models' accuracy ranged from 58% to 98% for the jumping-dot stimuli, and reached nearly 40% for viewing faces, compared to a 3% random guess probability.

Based on recent surveys on eye movement biometrics [33, 87] as well as our own literature search, we identified algorithms that have been shown to successfully identify individual users from their eye movements; they are summarized in Table 1. These algorithms have been applied to existing gaze-biometric challenge datasets, as well as to natural viewing of image stimuli in 2D (MIT data set). The method with the best biometric performance produces an Equal Error Rate of 1.88% using pupil-based features [31]; however, the majority of consumer applications in mixed reality do not require pupil diameter. Thus, we chose to implement the RBF approach proposed by George and Routray [35], as it relies only on fixation and saccade events. This method also produces impressive results with VR eye-tracking data [60] and natural viewing of 2D images [93].
In recent years, privacy concerns related to eye-tracking applications have grown significantly [16, 42, 44, 51, 58, 98]. In response, researchers have developed methods to enhance the privacy of aggregate features, like saliency heatmaps [59] and event statistics [15, 32, 97]. These methods have been shown to reduce performance in classification of gender and identity; however, they operate only on aggregate gaze data after it has been collected and processed. Recent work by Li et al. has applied formal privacy guarantees to raw streams of gaze, designed to obfuscate the viewer's gaze relative to AOIs within stimuli over time [58]. The ability to protect biometric identity was evaluated empirically on the 360_em dataset [1], reducing identification to chance rate. Our work develops a threat model based on the streaming of gaze samples and the privacy risk related to biometric identification within an XR ecosystem.
4 Designing an API for Gaze Privacy
The typical architecture and data flow in an eye-tracking platform is shown in Figure 1. Existing eye trackers process user data in three stages: eye image capture, which images the user's eye; eye position estimation, which infers the point of regard from the eye image; and event detection, which classifies each point of regard as belonging to a fixation, saccade, blink, etc. When eye trackers were specialty equipment, all this data was made available to the application. These applications were typically research data gathering software. The major difference now is that the applications will have a profit-based business model. This model will naturally create incentives to share user gaze data and make inferences by combining data across devices, for advertising revenue, for example. We have identified privacy risks created by this ecosystem in Section 3. In this section, we define our threat model and propose the design of an application programming interface (API) which adopts a privacy-preserving approach to passing gaze data to downstream applications.
Threat Model
We assume that the components comprising the eye-tracking platform and API are trusted, i.e., the integrity of the hardware and software could be attested through mechanisms such as secure boot [4] and integrity measurement [90], and we assume that the operating system is protected, e.g., through SELinux mandatory access controls [69]. The adversary is capable of examining all data transmitted to the eye-tracking applications, and seeks to use this information to re-identify the user. An adversarial application has the capability to collude with other applications by sharing information through either overt or covert channels [65] in order to re-identify users.

Our privacy-preserving solution is focused on preventing biometric identification of users from their gaze data. First, the eye is imaged by a camera, producing an eye image that is provided to the platform, which processes the image into position coordinates. The platform provides this eye position to trusted applications like the browser, which then pass the eye position on to browser apps that perform tasks such as AOI analysis for performance in training scenarios, saccade detection for redirected walking, and smooth pursuits for gaze-based interaction.
Naïve API Design
The simplest way to provide a gaze API would be to pass along the raw gaze data to applications. At any point in time, the application would be able to request getGazePosition(). From this, the application would be able to compute fixations, saccades, and dwell time; in particular, an AOI application would be able to compute fixations in an AOI, time to first saccade into the AOI, and dwell time in the AOI. Providing raw gaze data also allows for computation of the velocity of eye movements, and other features that are commonly used for identity classification tasks [33, 35, 93]. Exposing raw gaze data in an untrusted context, such as the web, gives arbitrary apps the ability to re-identify users.
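To make the risk concrete, the following sketch (not from the paper; the function names, sample layout, and velocity threshold are illustrative assumptions) shows how an app holding raw (x, y, t) samples could derive velocity-based features of the kind used by gaze biometrics:

# Sketch (not from the paper): how an app with access to raw gaze samples can
# derive velocity-based features of the kind used by gaze biometrics.
import numpy as np

def gaze_velocities(samples):
    """samples: array of shape (n, 3) holding (x, y, t) in degrees and seconds."""
    xy = samples[:, :2]
    t = samples[:, 2]
    disp = np.linalg.norm(np.diff(xy, axis=0), axis=1)  # angular displacement per step
    dt = np.diff(t)
    return disp / dt  # deg/s, a common basis for biometric features

def summary_features(samples, threshold=30.0):
    """Crude fixation/saccade split by a velocity threshold (deg/s)."""
    v = gaze_velocities(samples)
    return {
        "mean_velocity": float(np.mean(v)),
        "peak_saccade_velocity": float(np.max(v[v > threshold], initial=0.0)),
        "fixation_ratio": float(np.mean(v <= threshold)),
    }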
However, we can modify the gaze API to be privacy-preserving by acting as a Gatekeeper. Privacy vulnerabilities are caused by the design assumption that the application is benign and that the data is used only for the purpose for which it is collected. As discussed previously, applications need not be benign, and connecting user data across devices will allow for richer inferences to be made about that user. This threat motivates our proposed Gatekeeper design. An added benefit of our proposed design is that the Gatekeeper model provides desired metrics directly to applications, instead of requiring applications to process streamed user gaze data and calculate the metrics themselves.

Advertisers and other AOI applications are interested in the number of fixations and the dwell time of a fixation in a predetermined AOI. Under the Gatekeeper framework, instead of passing along raw gaze positions, an API allows requests for this information. For example, a getFixations method takes a rectangular area and returns a list of fixations that occurred in that area, and a getDwellTime method takes as input a fixation and returns the dwell time of the fixation in milliseconds. Additionally, we provide a getSaccades method that returns a list of saccades into the AOI. Saccades are a strong classifier feature for identity when raw gaze points are included; however, we mitigate this risk by providing only lower-dimensional summary data.

It is important to note that this API is designed specifically to provide AOI metrics and summary data of eye movement events. The API does not scale to address applications such as platform foveated rendering, which requires raw gaze samples for utility. The Gatekeeper model does support streaming optimizations based on the current gaze position within a discrete set of tiles [20, 78], by providing only information about which tile the user is currently attending to. This type of optimization is critical for low-power devices to ensure high visual quality while preserving precious network resources.
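A minimal sketch of this query interface follows. The method names (getFixations, getDwellTime, getSaccades) follow the text; the AOI and Fixation types and the trusted event store are illustrative assumptions, not a specified implementation:

# Sketch of the Gatekeeper query interface described above.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AOI:
    x: float
    y: float
    width: float
    height: float

@dataclass
class Fixation:
    x: float
    y: float
    start_t: float
    end_t: float

class Gatekeeper:
    def __init__(self, event_store):
        # event_store is the trusted platform-side event detector output
        self._events = event_store

    def _inside(self, aoi: AOI, x: float, y: float) -> bool:
        return aoi.x <= x < aoi.x + aoi.width and aoi.y <= y < aoi.y + aoi.height

    def getFixations(self, aoi: AOI) -> List[Fixation]:
        return [f for f in self._events.fixations() if self._inside(aoi, f.x, f.y)]

    def getDwellTime(self, fixation: Fixation) -> float:
        return (fixation.end_t - fixation.start_t) * 1000.0  # milliseconds

    def getSaccades(self, aoi: AOI) -> List[Tuple[float, float, float]]:
        # Only summary data (landing position and time) is exposed; the raw
        # intra-saccade samples that carry biometric features are withheld.
        return [(s.end_x, s.end_y, s.end_t)
                for s in self._events.saccades() if self._inside(aoi, s.end_x, s.end_y)]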
In some situations, such as gaze-based interfaces and redirected walking, applications will need to be notified when a new fixation or saccade occurs, instead of querying for all fixations or saccades. In this scenario, we can use an EventListener model instead of a query-based model. When a new event occurs, the EventListener will be notified and given the event data, (x, y, t), and a label indicating whether it is a fixation, saccade, or smooth pursuit. More complex eye movements are difficult to detect in real time with the sampling rate of mixed reality eye-tracking devices, and typically are not implemented in real-time applications.

Our typical model for streaming event data is to send an event when the eye movement has concluded. For example, in a gaze-based interface the application needs to be notified that a smooth pursuit occurred, and where it landed. In applications such as redirected walking it is critical to know when a saccade begins, to take advantage of saccadic blindness [49, 50, 54, 99]. In this case, one mode of the EventListener will indicate when a saccade event has started as well as when it has finished, as opposed to only when the saccade has finished.
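The sketch below illustrates this EventListener mode. The callback payload (x, y, t, an event label F/S/SP, and a phase flag for saccade onset versus offset) mirrors the text; the registration API itself is an illustrative assumption:

# Sketch of the EventListener mode described above.
from typing import Callable, List

Listener = Callable[[float, float, float, str, str], None]  # (x, y, t, label, phase)

class GazeEventDispatcher:
    def __init__(self):
        self._listeners: List[Listener] = []

    def addEventListener(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def _emit(self, x: float, y: float, t: float, label: str, phase: str) -> None:
        # label is one of "F", "S", "SP"; phase is "started" or "finished", so a
        # redirected-walking app can react at saccade onset.
        for listener in self._listeners:
            listener(x, y, t, label, phase)

# Example: react during saccadic suppression.
def on_gaze_event(x, y, t, label, phase):
    if label == "S" and phase == "started":
        pass  # apply the redirected-walking rotation here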
Most applications will be able to function with the aforementioned API designs; however, two key mixed reality applications that will require sample-level data are foveated rendering and subtle gaze guidance. Foveated rendering is critical for performance on next-generation wearable VR headsets. In an ideal situation, platforms will use GPU-based foveated rendering, where gaze information is sent to the graphics driver, informing it to do fewer calculations for the parts of the screen that are away from the center of view. This requires cooperation with the graphics hardware driver for optimal performance. Experiments on native platforms show up to a 2.71 times speed-up in frames per second [66]. This will not be possible in all cases, so platforms and browsers will also need to leverage software-based foveated rendering and streaming optimization [71]. In this scenario, gaze samples are transmitted directly to the content or webpage, which then knows where it should render objects in more detail. However, this exposes the raw gaze data to the application and allows the content to perform further processing on the raw gaze information, whether that is user identification or inferring sensitive characteristics. In these scenarios the eye-tracking platform must stream sample-level data, and it is impossible to simply abstract data using a privacy-preserving API. Therefore, we propose the use of a privacy mechanism to manipulate gaze samples as they are streamed to increase privacy.
5 Methodology
In this section, we propose, implement, and evaluate three privacy mechanisms with the goal of mitigating the threats identified in Section 4. Our goal is to reduce the accuracy of user identification based on features derived from common eye events, such as fixations and saccades. We consider the following privacy mechanisms: addition of Gaussian noise to raw gaze data, temporal downsampling, and spatial downsampling. We implement these mechanisms and evaluate them against the baseline identification rate when raw gaze data is passed to the application as is. For each of the privacy mechanisms, we also evaluate the utility of the data that is passed downstream.
We define the data received by the privacy mechanism to be a time series where each tuple is comprised of horizontal and vertical gaze positions (x, y), a time stamp t, and the event label e assigned to the sample: X = {(x_1, y_1, t_1, e_1), (x_2, y_2, t_2, e_2), ..., (x_G, y_G, t_G, e_G)}, a set of G gaze positions. This data is processed via a privacy mechanism and the processed output is a time series X', with additional variables defined in Table 2.

Table 2: Privacy mechanism variable definitions.

Variable | Description
x | Horizontal gaze position
y | Vertical gaze position
t | Timestamp
e | Event label: Fix. (F), Sacc. (S), Smooth Pursuit (SP)
X | Input time series of gaze samples
G | Number of gaze positions in time series
X' | Output privacy-enhanced time series
K | Temporal downsample factor relative to sampling rate
L | Spatial downsample factor relative to a 3840 × 2160 domain
M | Number of rows in equirectangular projection
N | Number of columns in equirectangular projection
δx | Horizontal step size: 360°/N
δy | Vertical step size: 180°/M

The following three privacy mechanisms are explored in this paper.
Additive Gaussian Noise
Noise sampled from a Gaussian distribution of zero mean and standard deviation σ, defined in visual degrees, is added to the gaze positions. Noise is independently sampled for the horizontal and vertical gaze positions as X' = {(x_1 + N(0, σ), y_1 + N(0, σ), t_1, e_1), (x_2 + N(0, σ), y_2 + N(0, σ), t_2, e_2), ..., (x_G + N(0, σ), y_G + N(0, σ), t_G, e_G)}.
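A minimal sketch of this mechanism, assuming the gaze positions are held in NumPy arrays and expressed in visual degrees:

# Additive Gaussian noise as defined above.
import numpy as np

def gaussian_noise_mechanism(x, y, t, e, sigma, rng=None):
    """Add zero-mean Gaussian noise (standard deviation sigma, in visual degrees)
    independently to the horizontal and vertical gaze positions."""
    rng = np.random.default_rng() if rng is None else rng
    x_priv = x + rng.normal(0.0, sigma, size=x.shape)
    y_priv = y + rng.normal(0.0, sigma, size=y.shape)
    return x_priv, y_priv, t, e  # timestamps and event labels pass through unchanged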
Temporal Downsampling
Temporal downsampling reduces the temporal resolution of the eye-tracking data stream. Downsampling is implemented by streaming the data at a frequency of the original sampling rate divided by a scaling parameter K. The output time series is defined as X' = {(x_{K·p+1}, y_{K·p+1}, t_{K·p+1}, e_{K·p+1}), ...} for all integers p ∈ [0, G/K]. For example, with a scaling parameter of two, the private gaze positions are defined as X' = {(x_1, y_1, t_1, e_1), (x_3, y_3, t_3, e_3), (x_5, y_5, t_5, e_5), ...}, retaining only every other gaze sample. For a scaling parameter of three, X' = {(x_1, y_1, t_1, e_1), (x_4, y_4, t_4, e_4), (x_7, y_7, t_7, e_7), ...}.
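A minimal sketch of this mechanism on NumPy arrays:

# Temporal downsampling with scaling parameter K: keep every K-th sample, so a
# 120Hz stream with K = 2 is delivered at 60Hz.
def temporal_downsample(x, y, t, e, K):
    """Retain samples 1, K+1, 2K+1, ... of each array (0-based slicing)."""
    return x[::K], y[::K], t[::K], e[::K]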
Spatial Downsampling
Spatial downsampling reduces the resolution of eye-tracking data to a discrete set of horizontal and vertical gaze positions. Intuitively, the scene is divided into a grid and each gaze sample is approximated by the grid cell that it lies within. Spatial downsampling is performed by defining a target equirectangular domain spanning 180° vertically and 360° horizontally with M rows and N columns. For smaller values of M and N there are fewer possible positions, and thus reduced spatial resolution. Raw gaze positions (x ∈ [0°, 360°), y ∈ [0°, 180°), t) are transformed by first computing the horizontal step size δx = 360°/N and the vertical step size δy = 180°/M. Downsampled gaze positions are then computed as (⌊x/δx⌋·δx, ⌊y/δy⌋·δy, t), where ⌊·⌋ represents the floor function that rounds down to the nearest integer. For the results presented in this paper, we parameterize spatial downsampling as a factor L relative to an equirectangular domain of 3840 × 2160, i.e., M = 2160/L and N = 3840/L. For example, a downsampling factor of L = 2 results in M = 1080 and N = 1920, and L = 3 results in a resolution of M = 720 and N = 1280.
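A minimal sketch of this mechanism, assuming x in [0°, 360°) and y in [0°, 180°); the 3840 × 2160 base resolution follows the parameterization given above:

# Spatial downsampling onto an M x N equirectangular grid.
import numpy as np

def spatial_downsample(x, y, t, M, N):
    dx = 360.0 / N  # horizontal step size (degrees per column)
    dy = 180.0 / M  # vertical step size (degrees per row)
    return np.floor(x / dx) * dx, np.floor(y / dy) * dy, t

def spatial_downsample_by_factor(x, y, t, L):
    return spatial_downsample(x, y, t, M=2160 // L, N=3840 // L)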
In order to evaluate how effectively the privacy mechanisms prevent an adversary from re-identifying the user, we selected five existing datasets of VR eye-tracking data. Table 3 presents the characteristics of each dataset included in the analysis. Datasets were selected to have diversity in the number of participants, the number of stimuli presented, and the task being performed. Four of the datasets are publicly available, while ET-DK2 consists of data previously collected by the authors.

Table 3: Dataset characteristics.

Dataset | Participants | Stimuli | Stimuli per Participant | Duration | Stimulus Type | Task
ET-DK2 | 18 | 50 | 50 | 25s | 360° Images | Free Viewing
VR-Saliency [95] | 130 | 23 | 8 | 30s | 360° Images | Free Viewing
VR-EyeTracking [102] | 43 | 208 | 148 | 20s-70s | 360° Videos | Free Viewing
360_em [1] | 13 | 14 | 14 | 38s-85s | 360° Videos | Free Viewing
DGaze [40] | 43 | 5 | 2 | 180s-350s | 3D Rendered Scene | Free Viewing

The ET-DK2 dataset consists of twenty participants viewing fifty 360° images using an Oculus DK2 HMD with an integrated SMI 60Hz binocular eye tracker. Data was collected under an IRB-approved protocol in December 2017 for the purpose of generating saliency maps from gaze data. Two participants were not included in the analysis: one participant experienced motion sickness, and the data collection software did not log data from all 50 images for the other. The remaining 18 individuals were five females and thirteen males with an average age of 32 and an age range of 23 to 52 years. Each participant viewed images from the Salient360! [82] dataset in random order. Participants were seated in a swivel chair so they could rotate and explore each 360° scene while eye and head movements were recorded. The dataset will be released publicly when the manuscript is published.
All participants performed a 9-point calibration at the beginning of the experiment, and eye-tracking accuracy was validated to less than 2° visual angle before image viewing. Each 360° image was shown for 25 seconds, following the Salient360! [82] protocol. In contrast to their protocol, we varied the starting orientation of the participant within the 360° image across eight orientations instead of holding it constant. Halfway through the experiment participants were given a five-minute break, after which the eye tracker was re-calibrated before viewing the rest of the images. The entire data collection process took approximately 40 minutes, including informed consent and a post-study demographics survey.

The VR-Saliency [95] dataset includes gaze data collected from participants viewing 360° images on a 2D display, in VR while seated in a swivel chair, and in VR while standing. We analyze only the seated VR condition, as it is the only VR condition with raw data available at 120Hz for all stimuli. Free-viewing data was collected in a similar manner to ET-DK2 for the purpose of saliency map generation; however, only eight 360° images were viewed by each participant.

The VR-EyeTracking [102] dataset includes gaze data collected at 100Hz from participants viewing 360° videos. The dataset's application is to train a deep network model for predicting gaze within dynamic VR environments. Unlike ET-DK2 and VR-Saliency, the video stimuli did not have a fixed duration; participants viewed many videos and took many breaks to avoid motion sickness.

The 360_em [1] dataset includes gaze data collected at 120Hz from participants viewing 360° videos. Fourteen of the stimuli consisted of typical 360° videos from YouTube, while one stimulus was created by the authors to elicit specific eye and head movements. The dataset's application is to train and evaluate event detection algorithms, classifying fixation, saccade, smooth pursuit, and OKN events in VR viewing data. For our analysis we only consider the fourteen stimuli downloaded from YouTube.

The DGaze [40] dataset includes gaze data collected at 100Hz from participants who explore and navigate various 3D rendered scenes. Within each environment multiple animals dynamically move around, attracting the visual attention of the participant. Gaze data is used to train and evaluate the DGaze model for gaze prediction. DGaze can predict gaze position given head orientation, or predict the next gaze position given the current gaze position. Gaze prediction by DGaze has been demonstrated in the context of foveated rendering, and can help account for latency in the eye-tracking and rendering pipeline [3, 40, 79].
For each dataset, metrics are computed to identify privacy risks and to evaluate the impact of privacy mechanisms on application utility. Utility measures depend on the application of eye tracking within the datasets, ranging from AOI analysis to gaze prediction. We define a utility metric for each dataset depending on the type of stimuli and application.

5.3.1 Privacy
In our context, privacy refers to how effectively the mechanism prevents an adversary from identifying an individual. Identification is defined as a classification task: an algorithm matches the input to the database and returns the closest match. If the algorithm matches the input to the ground-truth identity, then the comparison is counted as a True Positive; otherwise it is considered a False Negative. The Identification Rate (IR) is the total number of True Positive classifications divided by the total number of comparisons [47, 48, 93]. A high IR indicates accurate classification of identity, and therefore, low privacy.
Predicting future gaze position from eye-tracking data is a critical area of research that has yet to be solved [40, 41]. Using the DGaze dataset we evaluate the ability to predict the ground-truth gaze position 100 ms into the future when gaze data output from a privacy mechanism is used as the testing data, and when it is used as both the training and testing data. Utility is measured as angular gaze prediction error for each input gaze sample, with lower values indicating higher accuracy.

The most common form of eye-tracking analysis is performed using static AOIs defined within image content [55, 76]. AOI analysis is used to study gaze behavior during social interaction [11], while viewing websites [103], and to evaluate content placement in 3D environments [2], among many other applications. A key AOI metric that is robust to fixation detection parameters is dwell time [76]. Dwell time measures how long a viewer's gaze fell within an AOI, and allows for comparison of which AOIs attracted the most attention. We evaluate the loss in utility between ground truth and gaze data output by a privacy mechanism by computing the Root Mean Squared Error (RMSE) between AOI dwell times. AOI utility is measured for the ET-DK2 dataset, as two rectangular AOIs are marked within each image that correspond with a salient object, such as people or natural landmarks, to measure individual viewing behavior within the scene.

Eye-tracking data is also used to generate saliency maps, which represent a probability distribution over visual content that highlights regions most likely to be looked at by a viewer [55]. Saliency maps are generated from aggregate eye-tracking data from many viewers and are used to train and evaluate deep learning models for saliency and scanpath prediction [6, 24]. Saliency metrics are computed for both 360° images (VR-Saliency) and 360° video (VR-EyeTracking and 360_em). We compute KL-Divergence [55] to measure the impact on aggregate-level gaze measures and saliency modeling.

We define two classifiers for biometric identification using a Radial Basis Function (RBF) network [35, 60], with one network to classify fixation events and one to classify saccade events. This method is analogous to a traditional neural network with an input layer representing a feature vector x ∈ R^p containing p fixation or saccade features from a single event, one hidden layer consisting of m nodes, and an output layer containing C class scores, one for each unique individual in the dataset. The output class scores are used to measure which individual the input feature vector is most similar to. Thus, larger scores indicate a higher probability of the fixation or saccade event being from that class, or individual. Each node in the hidden layer is defined by an activation function φ_i(x) and a set of real-valued activation weights w_{i,c}, where i ∈ {1, 2, ..., m} and c ∈ {1, 2, ..., C}. The similarity score for a given class c in the output layer is computed as a weighted sum of all activation functions in the hidden layer,

Score_c(x) = Σ_{i=1}^{m} w_{i,c} · φ_i(x).   (1)

The activation function of each hidden node takes the form of a Gaussian distribution centered around a prototype vector μ_i with spread coefficient β_i. The function is defined as

φ_i(x) = exp(−β_i ||x − μ_i||²),   (2)
with shape coefficient β_i and prototype feature vector μ_i defined prior to training the network. Thus, an RBF network must be constructed in two stages, by first defining the prototypes and then optimizing the activation weights.

First, k-means clustering is applied to a training set of n feature vectors to determine k representative feature vectors per individual [35, 60]. Through this process β_i and μ_i are defined for each of the m = k·C hidden nodes. The activation function φ_i(x) is then defined using the cluster centroid as μ_i, and β_i in terms of σ, where σ is the average distance between all points in the cluster and the centroid μ_i.

Second, the activation weights w_{i,c} are learned from the same set of training data used to define the activation functions. Weights are trained using only fixation or saccade features from the training set. Training can be implemented using gradient descent [94], or by the Moore-Penrose inverse when setting up the network as a linear system [35]. The latter method is implemented in this work by defining the RBF network using an activation output matrix A_{n×m}, whose rows consist of the n training feature vectors input to the m previously defined activation functions, a weight matrix W_{m×C} comprised of activation weights w_{i,c}, and an output matrix Y_{n×C} generated as a one-hot encoding of the ground-truth identity labels. Using matrix multiplication, the following system defines the RBF network: A · W = Y. The weight matrix W is then learned by computing W = A* · Y, where A* is the Moore-Penrose inverse of A computed using MATLAB's pinv implementation. Class score predictions Ŷ are then generated for the testing data Â by computing Â · W = Ŷ. Every sample in the testing set is then classified as the class label with the maximum score. To classify a stream of events, the class scores from all events are first summed together, and then the class with the maximum value is returned. Scores from the fixation RBF and saccade RBF are combined by averaging the scores from the two networks, giving each equal contribution to the final classification.

The evaluation protocol for the RBF-based biometric, illustrated in Figure 2, is derived from [93], where a stream of gaze data collected from multiple participants viewing numerous static images is used for training and testing the identity classification. The size of the training and testing sets is defined by the number of stimuli from which gaze data is used. For example, with a training/testing split of 50%/50%, gaze data from half of the stimuli selected at random is used for training and the other half for testing. Fixation and saccade event data from all C participants are aggregated from the training stimuli and are then used to train the fixation and saccade RBF networks for classifying identity, as described in Section 5.4. Fixation and saccade events from the testing set are input to the trained RBF networks to classify the identity of each participant. Each participant is present in both the training set and the testing set. Identification rate is then computed as the number of correct matches divided by the number of comparisons.

[Figure 2: Fixation and saccade features from the M training stimuli per subject train fixation and saccade RBF networks with 32 hidden nodes per subject; features from the N held-out test stimuli are then classified by subject ID and scored as correct or incorrect matches.]
Fig. 2: Evaluation procedure for the gaze-based biometric classifier.
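The following condensed sketch mirrors the two-stage construction and scoring described above (k-means prototypes per subject, Gaussian activations, and weights solved with the Moore-Penrose pseudoinverse). The parameter choices, including k and the β heuristic, are illustrative assumptions rather than the paper's exact settings:

# Condensed sketch of the two-stage RBF network biometric.
import numpy as np
from scipy.cluster.vq import kmeans2

def build_rbf(features, labels, C, k=32):
    """features: (n, p) event feature vectors; labels: (n,) integer subject ids in [0, C)."""
    centroids, betas = [], []
    for c in range(C):
        subject_feats = features[labels == c]
        cluster_centers, assignment = kmeans2(subject_feats, k, minit="++")
        for i in range(k):
            members = subject_feats[assignment == i]
            sigma = np.mean(np.linalg.norm(members - cluster_centers[i], axis=1)) if len(members) else 1.0
            centroids.append(cluster_centers[i])
            betas.append(1.0 / max(sigma, 1e-6))  # assumed beta heuristic
    centroids, betas = np.array(centroids), np.array(betas)

    def activations(X):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        return np.exp(-betas[None, :] * d2)

    A = activations(features)        # (n, m) activation matrix
    Y = np.eye(C)[labels]            # one-hot identity labels
    W = np.linalg.pinv(A) @ Y        # solve A W = Y via the pseudoinverse
    return lambda X: activations(X) @ W  # per-class scores for new events

def identify(score_fn, test_events):
    """Sum per-event class scores over a stream of events and return the best-matching subject."""
    return int(np.argmax(score_fn(test_events).sum(axis=0)))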
6 Results

In this section we compute privacy and utility metrics to evaluate the proposed privacy mechanisms from Section 5.1 for each dataset listed in Table 3.
In Section 6.1, we first compute the identification rate using the RBF biometric for each dataset without modification, to establish a baseline privacy risk. Then, we compute the identification rate for the privacy mechanisms for different parameter values and discuss observed effects. Last, in Section 6.2 we explore the privacy achieved by each mechanism, and the measured impact on eye-tracking utility.

We evaluate the RBF biometric by splitting gaze data from stimuli viewed by each participant into training and testing sets as described in Section 5.5. For each dataset we evaluate a 75%/25%, 50%/50%, and 25%/75% training/test split, except for DGaze, as each participant only saw two stimuli. Identification rate is computed over ten runs with random stimuli selected as part of the training and test set, to account for variance in stimuli content.

Fig. 3: Mean and standard deviations of identification rates across datasets of 360° images (ET-DK2, VR-Saliency), 360° videos (VR-EyeTracking, 360_em), and 3D rendered scenes (DGaze). Lines for each dataset indicate a baseline of random guessing for the given number of subjects.

Figure 3 presents the mean and standard deviation of identification rates for each dataset, along with baseline rates corresponding to random guessing. For all datasets, identification rates were highest when there was more training data than testing data, i.e., a 75%/25% split. ET-DK2 produced the highest identification rate, with 85% on average, where participants viewed 50 static 360° images. VR-Saliency used a similar protocol with 130 participants; however, only eight images were shown to each individual on average. A lower identification rate of 9% was observed for this dataset, compared to a baseline guess rate of 0.77%. Further analysis comparing identification rates for ET-DK2 using only eight stimuli, and VR-Saliency with eighteen random subjects, closed the gap, producing identification rates of 47% and 22% respectively. Identification rates for the VR-EyeTracking and 360_em datasets are lower on average than for ET-DK2, at 33% and 47%. We observed that DGaze produced an identification rate of 2.7%, showing only slight improvement over a baseline rate of 2.3%. This dataset differs in that participants moved through two 3D rendered virtual scenes using a controller for teleportation for several minutes at a time, instead of viewing many 360° scenes from a fixed viewpoint.

In summary, we observe that using more data for training and viewing many different stimuli produces higher identification rates. Thus, it will become easier and easier to re-identify an individual as a large volume of gaze data is collected in a variety of contexts. Identification rates are as high as 85% depending on the circumstances, highlighting the need to enforce privacy in future mixed reality applications.

Figure 4 presents the mean and standard deviations achieved when privacy mechanisms are applied to each dataset. A training/testing split of 75%/25% is used to generate these results. We observe that Gaussian noise achieves the most privacy, reducing the identification rate of ET-DK2 from 85% to 30% on average. Temporal downsampling is not recommended, as it had the least observed impact on identification rate, and event detection is degraded at sampling rates less than 120Hz [104].
The utility of eye-tracking data depends on the context of the application; thus, we evaluate the impact of our privacy mechanisms at three different scales: sample-level gaze points, individual-level gaze behavior, and aggregate-level gaze behavior over many individuals. First, we evaluate sample-level utility by computing gaze prediction error using the DGaze neural network architecture; then, individual-level utility by computing dwell time for AOIs defined in the ET-DK2 dataset; and finally, we compute aggregate-level utility measures for generating saliency heatmaps of 360° images and video by computing KL-Divergence for the VR-Saliency, VR-EyeTracking, and 360_em datasets. Tables 4, 5, and 6 present the impact of privacy mechanisms on utility based on the parameter that provided the largest decrease in identification rate.

Gaze Prediction
Evaluating gaze prediction accuracy involved configuring the DGaze neural network to predict gaze position 100ms into the future, which as a baseline produces an average gaze prediction error of 4.30°. Gaze prediction error was as high as 9.50° for the Gaussian mechanism, more than double the baseline gaze prediction error reported in [40]. Next, we evaluated performance by re-training the DGaze model from scratch and applying privacy mechanisms to both the training and testing data. This resulted in much lower prediction errors, with results as low as 5.…° (Table 4), which is comparable to the 4.…° reported in [40].

Introducing the privacy mechanism to both training and testing data implies that raw gaze data is not shared with any party during model training and deployment. Our experiments indicate that it is still possible to learn a reasonable gaze prediction model without access to the raw gaze data. Withholding raw gaze data from the training dataset is desirable, as it removes the need to safeguard additional data and alleviates the risk of membership inference attacks [25]. We expect future gaze prediction models will improve in performance, and in turn decrease the absolute gaze prediction error when using gaze data output from the privacy mechanisms.

AOI Analysis
The impact of privacy mechanisms on area of interest (AOI) analysis is measured as the Root Mean Squared Error (RMSE) between AOI metrics. There are several popular AOI metrics suitable for different analyses, such as the number of visits to an AOI [43, 103] and the time to first fixation. For an overview of AOI analysis, see the discussion by Le Meur and Baccino [55]. For an investigation into privacy mechanisms, we select dwell time as a representative AOI metric. Dwell time is the amount of time spent by a user on an AOI, computed as the sum of the durations of all the fixations inside that AOI. The key logical operation is checking whether a fixation location falls within the bounding box that demarcates the AOI, which is the typical first step in all AOI metrics.

If the fixation location is perturbed, such as with the privacy mechanisms proposed above, then we can anticipate an error being introduced in the dwell time computation. We report the RMSE computed between AOI dwell time for each individual on the original dataset and after privacy mechanisms are applied, averaged across all stimuli in the dataset. RMSE in dwell time computation for additive Gaussian noise and temporal downsampling is below 40ms (Tables 4 and 5), which is insignificant for the practical application of AOI metrics, as a fixation itself typically lasts 200ms [91, 105]. However, for spatial downsampling, an RMSE of 247ms is introduced, which is greater than the length of one visual fixation. While being a few fixations off on average may not have a large effect on AOI applications such as evidence-based user experience design, it may be noticeable in scenarios with multiple small AOIs close together, such as figuring out which car the user spent longest looking at on a virtual visit to a car dealership.
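A minimal sketch of the dwell-time metric and the RMSE comparison discussed above; the field layouts are illustrative:

# Dwell time: sum the durations of fixations that fall inside a rectangular AOI,
# then compare dwell times computed on raw and privacy-enhanced data with RMSE.
import math

def dwell_time_ms(fixations, aoi):
    """fixations: iterable of (x, y, duration_ms); aoi: (x0, y0, width, height)."""
    x0, y0, w, h = aoi
    return sum(d for (x, y, d) in fixations if x0 <= x < x0 + w and y0 <= y < y0 + h)

def dwell_time_rmse(dwell_original, dwell_private):
    """RMSE between per-AOI dwell times before and after a privacy mechanism."""
    pairs = list(zip(dwell_original, dwell_private))
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))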
Saliency Map Generation
Saliency maps represent a spatial probability distribution of attention over an image or video. Maps are generated by aggregating fixations from eye-tracking data of multiple observers to highlight regions that attract the most attention in the stimulus [46]. Saliency maps are used directly for gaze prediction [24] and to optimize streaming [63, 101] or rendering [62]. We compute error as the KL-Divergence between a saliency map generated from the original gaze data and the saliency map generated from gaze data after the privacy mechanisms have been applied. KL-Divergence measures the relative entropy between the two saliency maps and is commonly used in loss functions to train deep saliency prediction models and to evaluate learned models [21, 24, 39, 55].

The spatial errors introduced by the privacy mechanism may cause regions highlighted by the saliency map to shift or spread out, leading to larger KL-Divergence values. A recent survey revealed that the best-performing model in predicting human fixations produced a KL-Divergence of 0.48 for the MIT300 dataset, with baseline models producing values of 1.24 or higher [14]. We observed that spatial downsampling produces the largest KL-Divergence on average, of 0.1293, while the Gaussian and temporal downsampling mechanisms produce much smaller values of 0.0367 and 0.0019, respectively. Spatial downsampling introduced errors that are approximately a fourth of the existing gap in fixation prediction. Errors of this magnitude will cause saliency maps generated from spatially downsampled gaze data to deviate from ground truth, and negatively impact the performance of models that use the maps for training.

[Figure 4: three panels plotting identification rate (%) versus the Gaussian noise σ (°), the temporal downsample scale factor K, and the spatial downsampling scale factor L, for ET-DK2, VR-Saliency, VR-EyeTracking, 360_em, and DGaze.]
Fig. 4: Mean and standard deviation of identification rate for each privacy mechanism with different internal parameters. Gaussian noise generates the lowest observed identification rates across all datasets, while temporal downsampling has the least impact.

Table 4: This table illustrates the impact of introducing the Gaussian Noise privacy mechanism on the identification rate as well as on three use cases. The reported numbers are for σ = …°. The second column shows how the identification rate falls after the privacy mechanism is applied. The fourth column reports an error metric that is relevant to that use case.

Mechanism | Identif. Rate | Utility | Impact on Utility | Dataset
Gaussian Noise | 3% → 2% | Gaze Prediction | Avg. Prediction Error Difference = 1.…° | DGaze (Re-trained)
Gaussian Noise | 85% → 30% | AOI Analysis | Dwell Time RMSE = 0.0359s | ET-DK2 (360° images)
Gaussian Noise | 33% → 9% | Generate Saliency Map | KL-Divergence = 0.0367 | VR-EyeTracking (360° videos)

Table 5: This table illustrates the impact of introducing the Temporal Downsample privacy mechanism on the identification rate as well as on three use cases. The reported numbers are for K = 3. The second column shows how the identification rate falls after the privacy mechanism is applied. The fourth column reports an error metric that is relevant to that use case.

Mechanism | Identif. Rate | Utility | Impact on Utility | Dataset
Temporal Downsample | 3% → 3% | Gaze Prediction | Avg. Prediction Error Difference = 0.…° | DGaze (Not Re-trained)
Temporal Downsample | 85% → 79% | AOI Analysis | Dwell Time RMSE = 0.006s | ET-DK2 (360° images)
Temporal Downsample | 9% → 7% | Generate Saliency Map | KL-Divergence = 0.0019 | VR-Saliency (360° images)

Table 6: The lowest achievable identification rate (IR) for the Spatial Downsample was at L = 64, and the corresponding impact on utility is reported below. The arrow indicates the IR before and after the privacy mechanism is applied.

Mechanism | Identif. Rate | Utility | Impact on Utility | Dataset
Spatial Downsample | 3% → 2% | Gaze Prediction | Avg. Prediction Error Difference = 0.…° | DGaze (Re-trained)
Spatial Downsample | 85% → 48% | AOI Analysis | Dwell Time RMSE = 0.2473s | ET-DK2 (360° images)
Spatial Downsample | 47% → 29% | Generate Saliency Map | KL-Divergence = 0.1293 | 360_em (360° videos)

7 Conclusions and Future Work
As eye-tracking technology is built into mixed reality devices, it opens up possibilities for violating user privacy. In this paper, we have examined a specific threat to user privacy: unique user identification based on eye movement data. This identification would enable colluding applications to connect a user logged in "anonymously" with their work ID, for example.
We first determine biometric identification rates across five datasets of eye movements in immersive environments. We show that identification rates can reach as high as 85% depending on the type of stimulus used to elicit the eye movements and the amount of eye movement data collected in total. Our highest identification rates were achieved when viewing many 360° images of short duration (ET-DK2), with all datasets having an identification rate higher than chance except DGaze. We hypothesize this is the result of the DGaze dataset providing viewers only two scenes to explore, containing sparse environments with animals that they can follow around while using teleportation to navigate. In the context of saliency, Borji [14] describes the role that stimuli play in the eye movements elicited from viewers, suggesting that datasets with more diverse stimuli are needed to improve the generalized performance of saliency prediction models. In the context of privacy, this suggests that the presence of biometric features within collected gaze data differs for photorealistic, static, and dynamic stimuli. Given enough eye movement data collected from the right stimuli, there is an appreciable risk of identification.

We propose a Gatekeeper model to mitigate biometric identification by apps that need AOI metrics or event-specific data for their utility. This model provides API calls that return desired metrics and summary information about fixations and saccades to applications without providing streams of raw gaze data, which suffices for certain classes of mixed reality use cases. However, for use cases such as foveated rendering, streaming gaze data is required. We propose that in this case, privacy mechanisms be applied to the raw data stream to reduce the identification rate while maintaining the utility needed for the given application. We evaluated three privacy mechanisms: additive Gaussian noise, temporal downsampling, and spatial downsampling. Our best results used additive Gaussian noise to reduce an identification rate of 85% to 30% while supporting AOI analysis, gaze prediction, and saliency map generation.
Implications
Imagine the scenario described earlier of a worker who anonymously attends labor union meetings as User X. The eye-tracking data collected during a VR union meeting attended by User X is exposed through a database breach or through collusion with the employer, who then discovers a match between User X and their real identity at a rate greater than chance. Even though User X was not the only worker to attend this meeting, the biometric data suggests they were the most likely employee to have attended, turning User X into a scapegoat for the entire group. The individual may then have their reputation tarnished in retaliation by their employer. Our investigations are a first step towards protecting such a user. Though the proposed mechanisms lower identification rates, they do not eliminate the possibility of weak identification. More work is needed to create and evaluate mechanisms that allow users, organizations, and platforms to trust eye tracking, and more broadly behavioral tracking, within mixed reality use cases.
Limitations
Our threat model assumes a trusted platform. In cases where the platform itself cannot be trusted, there is a need for user-implementable solutions, similar in spirit to the user-implementable optical defocus in [42]. Our characterization of the proposed privacy mechanisms is based on one biometric authentication approach (RBFN). As newer methods are developed, we will likely need new privacy mechanisms that can be applied as a software patch to the mixed reality headset. This work also considers each privacy mechanism individually. We expect there will be greater privacy gains when a combination of different privacy mechanisms is applied, as sketched below.
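As a simple illustration of what such a combination could look like, the sketch below chains the helper functions from the earlier mechanism sketch. The composition order and parameter values are illustrative assumptions only; the privacy and utility of any particular combination would still need to be measured empirically.

```python
def combined_mechanism(gaze_xy, sigma_deg=0.5, factor=2, grid_deg=0.5):
    """Illustrative composition: temporal downsampling, then spatial
    quantization, then additive Gaussian noise (reuses the helpers above)."""
    out = temporal_downsample(gaze_xy, factor=factor)
    out = spatial_downsample(out, grid_deg=grid_deg)
    return gaussian_noise(out, sigma_deg=sigma_deg)
```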
Future Work
In addition to exploring combinations of privacy mechanisms, future work might draw inspiration from research in location privacy and investigate adapting location k-anonymity schemes for gaze [36]. It would also be interesting to characterize stimuli as being dangerous from the perspective of biometric signatures, akin to "click-bait". More broadly, while our work considers user privacy, future work might also consider security from a platform's perspective. Consider the case of an attacker injecting gaze positions to fool an AOI metric into thinking that an AOI has been glanced at (for monetization of advertisements). One potential solution to this problem is direct anonymous attestation in a trusted platform module (TPM) to assure gaze consumers that there have been no injections.
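Returning to the k-anonymity direction mentioned above, one very rough, speculative adaptation is sketched below: a coarse gaze region is released only once at least k distinct users have looked inside it. The function, grid size, and thresholding rule are our own illustrative assumptions, not an evaluated design.

```python
import numpy as np

def k_anonymous_gaze_cells(user_gaze, k=5, grid_deg=2.0):
    """Speculative sketch in the spirit of location k-anonymity [36]:
    return the coarse gaze cells visited by at least k distinct users.

    user_gaze: dict mapping user id -> array of (x, y) gaze angles in degrees.
    """
    cell_users = {}
    for uid, samples in user_gaze.items():
        cells = {tuple(c) for c in np.floor(np.asarray(samples) / grid_deg).astype(int)}
        for cell in cells:
            cell_users.setdefault(cell, set()).add(uid)
    return {cell for cell, users in cell_users.items() if len(users) >= k}
```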
Acknowledgments
The authors acknowledge funding from the National Science Foundation (Awards FWHTF-2026540, CNS-1815883, and CNS-1562485), the National Science Foundation GRFP (Awards DGE-1315138 and DGE-1842473), and the Air Force Office of Scientific Research (Award FA9550-19-1-0169).
References
[1] I. Agtzidis, M. Startsev, and M. Dorr. A ground-truth data set and a classification algorithm for eye movements in 360-degree videos. arXiv preprint arXiv:1903.06474, 2019.
[2] R. Alghofaili, M. S. Solah, and H. Huang. Optimizing visual element placement via visual attention analysis. In , pp. 464–473. IEEE, 2019.
[3] E. Arabadzhiyska, O. T. Tursun, K. Myszkowski, H.-P. Seidel, and P. Didyk. Saccade landing position prediction for gaze-contingent rendering. ACM Transactions on Graphics (TOG), 36(4):1–12, 2017.
[4] W. A. Arbaugh, D. J. Farber, and J. M. Smith. A secure and reliable bootstrap architecture. In Proceedings of the 1997 IEEE Symposium on Security and Privacy, pp. 65–71, 1997.
[5] T. Armstrong and B. O. Olatunji. Eye tracking of attention in the affective disorders: A meta-analytic review and synthesis. Clinical Psychology Review, 32(8):704–723, 2012.
[6] M. Assens, X. Giro-i-Nieto, K. McGuinness, and N. E. O'Connor. PathGAN: Visual scanpath prediction with generative adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 0–0, 2018.
[7] R. Bailey, A. McNamara, A. Costello, S. Sridharan, and C. Grimm. Impact of subtle gaze direction on short-term spatial information recall. In Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 67–74, 2012.
[8] R. Bailey, A. McNamara, N. Sudarsanam, and C. Grimm. Subtle gaze direction. ACM Transactions on Graphics (TOG), 28(4):1–14, 2009.
[9] Y. Bar-Haim, T. Ziv, D. Lamy, and R. M. Hodes. Nature and nurture in own-race face processing. Psychological Science, 17(2):159–163, 2006.
[10] B. Bastani, E. Turner, C. Vieri, H. Jiang, B. Funt, and N. Balram. Foveated pipeline for AR/VR head-mounted displays. Information Display, 33(6):14–35, 2017.
[11] J. K. Bennett, S. Sridharan, B. John, and R. Bailey. Looking at faces: Autonomous perspective invariant facial gaze analysis. In Proceedings of the ACM Symposium on Applied Perception, pp. 105–112. ACM, 2016.
[12] T. Booth, S. Sridharan, A. McNamara, C. Grimm, and R. Bailey. Guiding attention in controlled real-world environments. In Proceedings of the ACM Symposium on Applied Perception, pp. 75–82, 2013.
[13] Z. Boraston and S.-J. Blakemore. The application of eye-tracking technology in the study of autism. The Journal of Physiology, 581(3):893–898, 2007.
[14] A. Borji. Saliency prediction in the deep learning era: Successes and limitations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019.
[15] E. Bozkir, O. Günlü, W. Fuhl, R. F. Schaefer, and E. Kasneci. Differential privacy for eye tracking with temporal correlations. arXiv preprint arXiv:2002.08972, 2020.
[16] E. Bozkir, A. B. Ünal, M. Akgün, E. Kasneci, and N. Pfeifer. Privacy preserving gaze estimation using synthetic images via a randomized encoding based framework. In Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 1–5, 2020.
[17] M. Bradley. Natural selective attention: Orienting and emotion. Psychophysiology, 46(1):1–11, 2009.
[18] J. Bradshaw, F. Shic, A. N. Holden, E. J. Horowitz, A. C. Barrett, T. C. German, and T. W. Vernon. The use of eye tracking as a biomarker of treatment outcome in a pilot randomized clinical trial for young children with autism. Autism Research, 12(5):779–793, 2019.
[19] A. Burova, J. Mäkelä, J. Hakulinen, T. Keskinen, H. Heinonen, S. Siltanen, and M. Turunen. Utilizing VR and gaze tracking to develop AR solutions for industrial maintenance. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–13, 2020.
[20] J. Chakareski, R. Aksu, X. Corbillon, G. Simon, and V. Swaminathan. Viewport-driven rate-distortion optimized 360° video streaming. In , pp. 1–7. IEEE, 2018.
[21] F.-Y. Chao, L. Zhang, W. Hamidouche, and O. Deforges. SalGAN360: Visual saliency prediction on 360 degree images with generative adversarial networks. In , pp. 01–04. IEEE, 2018.
[22] A. K. Chaudhary and J. B. Pelz. Privacy-preserving eye videos using rubber sheet model. In ACM Symposium on Eye Tracking Research & Applications, pp. 1–5, 2020.
[23] K. Chawarska and F. Shic. Looking but not seeing: Atypical visual scanning and recognition of faces in 2 and 4-year-old children with autism spectrum disorder. Journal of Autism and Developmental Disorders, 39(12):1663, 2009.
[24] D. Chen, C. Qing, X. Xu, and H. Zhu. SalBiNet360: Saliency prediction on 360° images with local-global bifurcated deep network. In , pp. 92–100. IEEE, 2020.
[25] M. Chen, Z. Zhang, T. Wang, M. Backes, M. Humbert, and Y. Zhang. When machine unlearning jeopardizes privacy. arXiv preprint, 2020.
[26] S. Cho, S.-w. Kim, J. Lee, J. Ahn, and J. Han. Effects of volumetric capture avatars on social presence in immersive virtual environments. In , pp. 26–34. IEEE, 2020.
[27] K. M. Dalton, B. M. Nacewicz, T. Johnstone, H. S. Schaefer, M. A. Gernsbacher, H. H. Goldsmith, A. L. Alexander, and R. J. Davidson. Gaze fixation and the neural circuitry of face processing in autism. Nature Neuroscience, 8(4):519–526, 2005.
[28] E. J. David, J. Gutiérrez, A. Coutrot, M. P. Da Silva, and P. L. Callet. A dataset of head and eye movements for 360 videos. In Proceedings of the 9th ACM Multimedia Systems Conference, pp. 432–437. ACM, 2018.
[29] A. T. Duchowski, K. Krejtz, I. Krejtz, C. Biele, A. Niedzielska, P. Kiefer, M. Raubal, and I. Giannopoulos. The index of pupillary activity: Measuring cognitive load vis-à-vis task difficulty with pupil oscillation. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–13, 2018.
[30] A. T. Duchowski, V. Shivashankaraiah, T. Rawls, A. K. Gramopadhye, B. J. Melloy, and B. Kanki. Binocular eye tracking in virtual reality for inspection training. In Proceedings of the Symposium on Eye Tracking Research & Applications, pp. 89–96, 2000.
[31] S. Eberz, G. Lovisotto, K. B. Rasmussen, V. Lenders, and I. Martinovic. 28 blinks later: Tackling practical challenges of eye movement biometrics. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1187–1199, 2019.
[32] W. Fuhl. Reinforcement learning for the manipulation of eye tracking data. arXiv preprint arXiv:2002.06806, 2020.
[33] C. Galdi, M. Nappi, D. Riccio, and H. Wechsler. Eye movement analysis for human authentication: A critical survey. Pattern Recognition Letters, 84:272–283, 2016.
[34] C. Gebhardt, B. Hecox, B. van Opheusden, D. Wigdor, J. Hillis, O. Hilliges, and H. Benko. Learning cooperative personalized policies from gaze data. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, pp. 197–208, 2019.
[35] A. George and A. Routray. A score level fusion method for eye movement biometrics. Pattern Recognition Letters, 82:207–215, 2016.
[36] A. Gkoulalas-Divanis, P. Kalnis, and V. S. Verykios. Providing k-anonymity in location based services. ACM SIGKDD Explorations Newsletter, 12(1):3–10, 2010.
[37] R. Graham, A. Hoover, N. A. Ceballos, and O. Komogortsev. Body mass index moderates gaze orienting biases and pupil diameter to high and low calorie food images. Appetite, 56(3):577–586, 2011.
[38] S. Grogorick, M. Stengel, E. Eisemann, and M. Magnor. Subtle gaze guidance for immersive environments. In Proceedings of the ACM Symposium on Applied Perception, pp. 1–7, 2017.
[39] J. Gutiérrez, E. J. David, A. Coutrot, M. P. Da Silva, and P. Le Callet. Introducing UN Salient360! benchmark: A platform for evaluating visual attention models for 360° contents. In , pp. 1–3. IEEE, 2018.
[40] Z. Hu, S. Li, C. Zhang, K. Yi, G. Wang, and D. Manocha. DGaze: CNN-based gaze prediction in dynamic scenes. IEEE Transactions on Visualization and Computer Graphics, 26(5):1902–1911, 2020.
[41] Z. Hu, C. Zhang, S. Li, G. Wang, and D. Manocha. SGaze: A data-driven eye-head coordination model for realtime gaze prediction. IEEE Transactions on Visualization and Computer Graphics, 25(5):2002–2010, 2019.
[42] B. John, S. Jörg, S. Koppal, and E. Jain. The security-utility trade-off for iris authentication and eye animation for social virtual avatars. IEEE Transactions on Visualization and Computer Graphics, 2020.
[43] B. John, S. Kalyanaraman, and E. Jain. Look out! A design framework for safety training systems: A case study on omnidirectional cinemagraphs. In , pp. 147–153. IEEE, 2020.
[44] B. John, S. Koppal, and E. Jain. EyeVEIL: Degrading iris authentication in eye tracking headsets. In ACM Symposium on Eye Tracking Research & Applications, p. 37. ACM, 2019.
[45] B. John, A. Liu, L. Xia, S. Koppal, and E. Jain. Let it snow: Adding pixel noise to protect the user's identity. In Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 1–3, 2020.
[46] B. John, P. Raiturkar, O. Le Meur, and E. Jain. A benchmark of four methods for generating 360° saliency maps from eye tracking data. International Journal of Semantic Computing, 13(03):329–341, 2019.
[47] P. Kasprowski and K. Harezlak. The second eye movements verification and identification competition. In IEEE International Joint Conference on Biometrics, pp. 1–6. IEEE.
[48] P. Kasprowski, O. V. Komogortsev, and A. Karpov. First eye movement verification and identification competition at BTAS 2012. In , pp. 195–202. IEEE, 2012.
[49] M. Keyvanara and R. Allison. Transsaccadic awareness of scene transformations in a 3D virtual environment. In ACM Symposium on Applied Perception 2019, pp. 1–9, 2019.
[50] M. Keyvanara and R. Allison. Effect of a constant camera rotation on the visibility of transsaccadic camera shifts. In Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 1–8, 2020.
[51] J. L. Kröger, O. H.-M. Lutz, and F. Müller. What does your gaze reveal about you? On the privacy implications of eye tracking. In IFIP International Summer School on Privacy and Identity Management, pp. 226–241. Springer, 2019.
[52] B. Laeng and L. Falkenberg. Women's pupillary responses to sexually significant others during the hormonal cycle. Hormones and Behavior, 52(4):520–530, 2007.
[53] P. Lang, M. Greenwald, M. M. Bradley, and A. O. Hamm. Looking at pictures: Affective, facial, visceral, and behavioral reactions. Psychophysiology, 30(3):261–273, 1993.
[54] E. Langbehn, F. Steinicke, M. Lappe, G. F. Welch, and G. Bruder. In the blink of an eye: Leveraging blink-induced suppression for imperceptible position and orientation redirection in virtual reality. ACM Transactions on Graphics (TOG), 37(4):66, 2018.
[55] O. Le Meur and T. Baccino. Methods for comparing scanpaths and saliency maps: Strengths and weaknesses. Behavior Research Methods, 45(1):251–266, 2013.
[56] R. J. Leigh and D. S. Zee. The Neurology of Eye Movements. Oxford University Press, USA, 2015.
[57] C. Li, M. Xu, X. Du, and Z. Wang. Bridge the gap between VQA and human behavior on omnidirectional video: A large-scale dataset and a deep learning model. In Proceedings of the 26th ACM International Conference on Multimedia, pp. 932–940, 2018.
[58] J. Li, A. R. Chowdhury, K. Fawaz, and Y. Kim. Kalεido: Real-time privacy control for eye-tracking systems. In USENIX Security Symposium (USENIX Security 20), 2020.
[59] A. Liu, L. Xia, A. Duchowski, R. Bailey, K. Holmqvist, and E. Jain. Differential privacy for eye-tracking data. In ACM Symposium on Eye Tracking Research & Applications, p. 28. ACM, 2019.
[60] D. J. Lohr, S. Aziz, and O. Komogortsev. Eye movement biometrics using a new dataset collected in virtual reality. In Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 1–3, 2020.
[61] S. Lombardi, J. Saragih, T. Simon, and Y. Sheikh. Deep appearance models for face rendering. ACM Transactions on Graphics (TOG), 37(4):68, 2018.
[62] P. Longhurst, K. Debattista, and A. Chalmers. A GPU based saliency map for high-fidelity selective rendering. In Proceedings of the 4th International Conference on Computer Graphics, Virtual Reality, Visualisation and Interaction in Africa, pp. 21–29, 2006.
[63] P. Lungaro, R. Sjöberg, A. J. F. Valero, A. Mittal, and K. Tollmar. Gaze-aware streaming solutions for the next generation of mobile VR experiences. IEEE Transactions on Visualization and Computer Graphics, 24(4):1535–1544, 2018.
[64] A. MacQuarrie and A. Steed. Perception of volumetric characters' eye-gaze direction in head-mounted displays. In Proceedings of 2019 IEEE Virtual Reality (VR). IEEE, 2019.
[65] C. Marforio, H. Ritzdorf, A. Francillon, and S. Capkun. Analysis of the communication between colluding applications on modern smartphones. In Proceedings of the 28th Annual Computer Security Applications Conference, pp. 51–60, 2012.
[66] X. Meng, R. Du, and A. Varshney. Eye-dominance-guided foveated rendering. IEEE Transactions on Visualization and Computer Graphics, 26(5):1972–1980, 2020.
[67] X. Meng, R. Du, M. Zwicker, and A. Varshney. Kernel foveated rendering. Proceedings of the ACM on Computer Graphics and Interactive Techniques, 1(1):1–20, 2018.
[68] J. V. Monaco. Classification and authentication of one-dimensional behavioral biometrics. In IEEE International Joint Conference on Biometrics, pp. 1–8. IEEE, 2014.
[69] J. Morris, S. Smalley, and G. Kroah-Hartman. Linux security modules: General security support for the Linux kernel. In Proceedings of the 2002 USENIX Security Symposium, 2002.
[70] C. Mousas, A. Koilias, D. Anastasiou, B. Rekabdar, and C.-N. Anagnostopoulos. Effects of self-avatar and gaze on avoidance movement behavior. In , pp. 726–734. IEEE, 2019.
[71] J. H. Mueller, P. Voglreiter, M. Dokter, T. Neff, M. Makar, M. Steinberger, and D. Schmalstieg. Shading atlas streaming. In SIGGRAPH Asia 2018 Technical Papers, p. 199. ACM, 2018.
[72] P. Mundy. A review of joint attention and social-cognitive brain systems in typical development and autism spectrum disorder. European Journal of Neuroscience, 47(6):497–514, 2018.
[73] D. Munoz, J. Broughton, J. Goldring, and I. Armstrong. Age-related performance of human subjects on saccadic eye movement tasks. Experimental Brain Research, 121(4):391–400, 1998.
[74] M. Murcia-López, T. Collingwoode-Williams, W. Steptoe, R. Schwartz, T. J. Loving, and M. Slater. Evaluating virtual reality experiences through participant choices. In , pp. 747–755. IEEE, 2020.
[75] J. Orlosky, Y. Itoh, M. Ranchet, K. Kiyokawa, J. Morgan, and H. Devos. Emulation of physician tasks in eye-tracked virtual reality for remote diagnosis of neurodegenerative disease. IEEE Transactions on Visualization and Computer Graphics, 23(4):1302–1311, 2017.
[76] J. L. Orquin, N. J. Ashby, and A. D. Clarke. Areas of interest as a signal detection problem in behavioral eye-tracking research. Journal of Behavioral Decision Making, 29(2-3):103–115, 2016.
[77] Y. S. Pai, B. I. Outram, B. Tag, M. Isogai, D. Ochi, and K. Kunze. GazeSphere: Navigating 360-degree-video environments in VR using head rotation and eye gaze. In ACM SIGGRAPH 2017 Posters, p. 23. ACM, 2017.
[78] G. Papaioannou and I. Koutsopoulos. Tile-based caching optimization for 360° videos. In Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp. 171–180, 2019.
[79] A. Patney, M. Salvi, J. Kim, A. Kaplanyan, C. Wyman, N. Benty, D. Luebke, and A. Lefohn. Towards foveated rendering for gaze-tracked virtual reality. ACM Transactions on Graphics (TOG), 35(6):179, 2016.
[80] K. A. Pelphrey, J. P. Morris, and G. McCarthy. Neural basis of eye gaze processing deficits in autism. Brain, 128(5):1038–1048, 2005.
[81] Y. Rahman, S. M. Asish, N. P. Fisher, E. C. Bruce, A. K. Kulshreshth, and C. W. Borst. Exploring eye gaze visualization techniques for identifying distracted students in educational VR. In , pp. 868–877. IEEE, 2020.
[82] Y. Rai, J. Gutiérrez, and P. Le Callet. A dataset of head and eye movements for 360 degree images. In Proceedings of the 8th ACM on Multimedia Systems Conference, pp. 205–210. ACM, 2017.
[83] P. Raiturkar, A. Kleinsmith, A. Keil, A. Banerjee, and E. Jain. Decoupling light reflex from pupillary dilation to measure emotional arousal in videos. In Proceedings of the ACM Symposium on Applied Perception, pp. 89–96, 2016.
[84] V. Rajanna and J. P. Hansen. Gaze typing in virtual reality: Impact of keyboard design, selection method, and motion. In Proceedings of the Symposium on Eye Tracking Research and Applications, p. 15. ACM, 2018.
[85] G. Rieger, B. M. Cash, S. M. Merrill, J. Jones-Rounds, S. M. Dharmavaram, and R. C. Savin-Williams. Sexual arousal: The correspondence of eyes and genitals. Biological Psychology, 104:56–64, 2015.
[86] I. Rigas, E. Abdulin, and O. Komogortsev. Towards a multi-source fusion approach for eye movement-driven recognition. Information Fusion, 32:13–25, 2016.
[87] I. Rigas and O. V. Komogortsev. Current research in eye movement biometrics: An analysis based on BioEye 2015 competition. Image and Vision Computing, 58:129–141, 2017.
[88] S. Rothe, F. Althammer, and M. Khamis. GazeRecall: Using gaze direction to increase recall of details in cinematic virtual reality. In Proceedings of the 17th International Conference on Mobile and Ubiquitous Multimedia, pp. 115–119, 2018.
[89] S. Rothe, D. Buschek, and H. Hußmann. Guidance in cinematic virtual reality-taxonomy, research status and challenges. Multimodal Technologies and Interaction, 3(1):19, 2019.
[90] R. Sailer, X. Zhang, T. Jaeger, and L. Van Doorn. Design and implementation of a TCG-based integrity measurement architecture. In Proceedings of the 2004 USENIX Security Symposium, 2004.
[91] D. D. Salvucci and J. H. Goldberg. Identifying fixations and saccades in eye-tracking protocols. In Proceedings of the Symposium on Eye Tracking Research & Applications, pp. 71–78, 2000.
[92] N. Sammaknejad, H. Pouretemad, C. Eslahchi, A. Salahirad, and A. Alinejad. Gender classification based on eye movements: A processing effect during passive face viewing. Advances in Cognitive Psychology, 13(3):232, 2017.
[93] C. Schröder, S. M. K. Al Zaidawi, M. H. Prinzler, S. Maneth, and G. Zachmann. Robustness of eye movement biometrics against varying stimuli and varying trajectory length. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pp. 1–7, 2020.
[94] F. Schwenker, H. A. Kestler, and G. Palm. Three learning phases for radial-basis-function networks. Neural Networks, 14(4-5):439–458, 2001.
[95] V. Sitzmann, A. Serrano, A. Pavel, M. Agrawala, D. Gutierrez, B. Masia, and G. Wetzstein. Saliency in VR: How do people explore virtual environments? IEEE Transactions on Visualization and Computer Graphics, 24(4):1633–1642, 2018.
[96] S. Sridharan, R. Bailey, A. McNamara, and C. Grimm. Subtle gaze manipulation for improved mammography training. In Proceedings of the Symposium on Eye Tracking Research and Applications, pp. 75–82, 2012.
[97] J. Steil, I. Hagestedt, M. X. Huang, and A. Bulling. Privacy-aware eye tracking using differential privacy. In ACM Symposium on Eye Tracking Research & Applications. ACM, 2019.
[98] J. Steil, M. Koelle, W. Heuten, S. Boll, and A. Bulling. PrivacEye: Privacy-preserving head-mounted eye tracking using egocentric scene image and eye movement features. In ACM Symposium on Eye Tracking Research & Applications, p. 26. ACM, 2019.
[99] Q. Sun, A. Patney, L.-Y. Wei, O. Shapira, J. Lu, P. Asente, S. Zhu, M. McGuire, D. Luebke, and A. Kaufman. Towards virtual reality infinite walking: Dynamic saccadic redirection. ACM Transactions on Graphics (TOG), 37(4):67, 2018.
[100] S. Uzzaman and S. Joordens. The eyes know what you are thinking: Eye movements as an objective measure of mind wandering. Consciousness and Cognition, 20(4):1882–1886, 2011.
[101] M. Xu, C. Li, S. Zhang, and P. Le Callet. State-of-the-art in 360° video/image processing: Perception, assessment and compression. IEEE Journal of Selected Topics in Signal Processing, 14(1):5–26, 2020.
[102] Y. Xu, Y. Dong, J. Wu, Z. Sun, Z. Shi, J. Yu, and S. Gao. Gaze prediction in dynamic 360° immersive videos. In Proceedings of IEEE CVPR 2018, pp. 5333–5342, 2018.
[103] C. Yangandul, S. Paryani, M. Le, and E. Jain. How many words is a picture worth? Attention allocation on thumbnails versus title text regions. In ACM Symposium on Eye Tracking Research & Applications, pp. 1–5, 2018.
[104] R. Zemblys and O. Komogortsev. Developing photo-sensor oculography (PS-OG) system for virtual reality headsets. In ACM Symposium on Eye Tracking Research & Applications, p. 83. ACM, 2018.
[105] R. Zemblys, D. C. Niehorster, O. Komogortsev, and K. Holmqvist. Using machine learning to detect events in eye-tracking data. Behavior Research Methods, 50(1):160–181, 2018.
[106] A. T. Zhang and B. O. Le Meur. How old do you look? Inferring your age from your gaze. In , pp. 2660–2664. IEEE, 2018.
[107] G. Zhang and J. P. Hansen. Accessible control of telepresence robots based on eye tracking. In ACM Symposium on Eye Tracking Research & Applications, p. 50. ACM, 2019.