Covert Embodied Choice: Decision-Making and the Limits of Privacy Under Biometric Surveillance
Jeremy Gordon
Max Curran
John Chuang
Coye Cheshire
Figure 1: Left: participant in VR experiment. Right: participant's field of view during card selection.
ABSTRACT
Algorithms engineered to leverage rich behavioral and biometric data to predict individual attributes and actions continue to permeate public and private life. A fundamental risk may emerge from misconceptions about the sensitivity of such data, as well as the agency of individuals to protect their privacy when fine-grained (and possibly involuntary) behavior is tracked. In this work, we examine how individuals adjust their behavior when incentivized to avoid the algorithmic prediction of their intent. We present results from a virtual reality task in which gaze, movement, and other physiological signals are tracked. Participants are asked to decide which card to select without an algorithmic adversary anticipating their choice. We find that while participants use a variety of strategies, data collected remains highly predictive of choice (80% accuracy). Additionally, a significant portion of participants became
more predictable despite efforts to obfuscate, possibly indicating mistaken priors about the dynamics of algorithmic prediction.
CCS CONCEPTS
• Human-centered computing → Virtual reality; • Social and professional topics → Surveillance

KEYWORDS
biometrics, prediction, privacy, virtual reality, surveillance
ACM Reference Format:
Jeremy Gordon, Max Curran, John Chuang, and Coye Cheshire. 2021. Covert Embodied Choice: Decision-Making and the Limits of Privacy Under Biometric Surveillance. In CHI Conference on Human Factors in Computing Systems (CHI '21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3411764.3445309
In a world in which sensing devices permeate both public and private domains, consumers [17], renters [13], voters [9], and decision-makers of all kinds are now subject to unprecedented levels of surveillance during day-to-day life: from the tracking of search queries to face and ‘emotion recognition’ [8], to geospatial location and physiology [4]. The data produced by sensors and software capturing these signals are now and will continue to be analyzed by increasingly advanced statistical techniques and algorithms with a specific interest in inferring internal states that might predict the future actions of individuals. Constituting what Zuboff terms the ‘Big Other’, this constellation of tracking infrastructure uses opaque mechanisms to detect and control behavior, and ultimately poses a threat to both individual privacy and democratic norms at large [38].

According to Nissenbaum's contextual integrity heuristic, privacy is best understood as a set of appropriate information flows subject to contextual norms dependent on parameters such as the data subject, parties, information type, and principles of transmission [26]. As proposed by Sedenberg et al. 2017, the privacy threats posed by remote biometric sensing can be seen as shifts or breaches of these contextual norms: “...when biosensed data like emotions or internal physiological states are systematically recorded and analyzed, all signals become magnified beyond their original natural public scope” [34]. Indeed, remote sensing systems of this kind have already generated real-world contexts within which actors may hope to predict internal beliefs or preferences from biometric data, and for which the obfuscation of these beliefs may be beneficial to the individual. One such example is individually targeted price discrimination in a physical retail setting where video feeds may afford gaze tracking (see commercial use of "Smart-shelf" technology reported in 2013 [7]).

Thanks to journalistic and public service efforts, an awareness of these threats is growing, but individuals still hold significant misconceptions about the sensitivity of information that can be inferred from these types of data [23], and about what agency they have in protecting their privacy.

In this work, we designed a Virtual Reality (VR)-based behavioral experiment in which participants completed an iterative binary decision task. In later trials, participants were informed of an adversary that is tracking their behavior and attempting to predict each choice. Questionnaire responses, post-task interviews, and quantitative analyses of biometric data indicate that participants used a variety of behavioral strategies and felt confident in their agency in avoiding prediction. While some participants adjusted their behavior in ways that reduced prediction accuracy, a strong majority of trials could still be successfully classified based on biometric data despite these efforts.
A valuable perspective on the privacy of decisions in a tracked environment comes from a multidisciplinary literature which integrates embodied, enactive, extended, and embedded accounts of cognition (referred to as ‘4E Cognition’). In a 2015 paper, Lepora and Pezzulo introduced the Embodied Choice framework (EC), which recasts decision-making as a dynamic and inherently active process rather than a sequential perception-decision-execution cycle [18]. In a visual decision task, the authors tracked the correlates of active consideration via movements of a mouse towards a target. These ideas are inspired by and consistent with Active Inference, a formal framework modeling situated action in which agents minimize their prediction error by entraining available sensors in such a way as to disambiguate competing perceptual hypotheses [35]. In this framework, percepts can be seen as hypotheses (e.g. the dog is hungry, or the customer is planning to buy some flowers), and the ocular motor outputs that produce saccades (eye movements) are seen as experiments aiming to confirm or deny prior beliefs [12].

In another behavioral experiment looking at embodied decision-making, Beilock and Holt found that the visual perception of stimuli can recruit the motor system into action simulation (the production of potentially detectable micro-motor outputs), which can influence affective judgments [6]. Together, research within this school of thought encourages us to reconsider traditional notions of passive perception followed by active decisions, and instead see decision-making as a continuous and dynamic process. In this light, sensed biometric data may serve as a correlate to deliberative processes. The privacy implications stemming from this reasoning have not yet been sufficiently investigated.
Eye-tracking technology is far from new, but the increasing prevalence of video capture systems and improved algorithms for accurately inferring gaze direction [19] make it especially relevant to conversations around modern algorithmic surveillance. Further, the breadth of application areas being explored for eye-tracking—as a novel UI selection affordance [30], a method to augment user experience via attentional awareness [37], and an opportunity to gain computational efficiencies via foveated rendering [27], among many others—reinforces the belief of some researchers that eye-tracking technology will soon be ubiquitous [20].

Beyond eye-tracking, a critical body of literature has explored the deleterious effects of various regimes of biometric surveillance [22, 31], including works specifically highlighting the dangers when these technologies are targeted at marginalized groups [21], and the unique effects of systems purporting to monitor internal states such as affect [8]. The documentation of surveillance harms offered by these scholars, among others, provided a key motivation for the present work.

Scholars in HCI have offered a variety of perspectives on privacy concerns relating to the use of consumer remote sensing technologies such as Internet of Things (IoT) products like smart speakers, and augmented reality (AR) devices. Denning et al. explored the perceptions and privacy concerns of bystanders to the use of glasses-style AR devices which may record video of people and surroundings [10]. In a 2014 review, Roesner et al. identified several key challenges and risks of AR systems, including adversarial attacks on both input and output channels, and theft or misuse of sensor data [33]. Recent work by Ahmad et al. raises concerns over privacy ambiguity imposed by ‘always-on’ IoT devices, and introduces the concept of tangible privacy—design features that allow bystanders to clearly assess the state of a device's sensors [1].

We chose to implement our study using a VR-based laboratory precisely because of the ease of collecting not only gaze targets, but a rich stream of sensor data allowing us to measure our participants' in-task behavior at a fine level of granularity. Indeed, the potential to develop machine learning-based predictive models trained on VR biometric data, some of it imperceptible to human observers, has not gone unnoticed by businesses and entrepreneurs [24]. The ethical questions raised by this new domain, which is quickly developing on interrelated but discrete paths in the private sector and academia, are diverse and poignant. A first-of-its-kind conference was held in 2018 proposing to construct a “VR bill of rights” [15], and an opinion article by Jeremy Bailenson in the same year highlighted the capabilities and dangers of biometric data collected in VR [3].

While immersive VR clearly comes with a host of potential privacy concerns, our primary interest centers around VR as a research tool. The VR-based setup allows us to probe the sensitivity of collected data, as well as the psychology influencing the perception of surveillance, in both in-home and public settings. Ultimately, we see the virtual laboratory as offering researchers a controllable, replicable setting enabling the study of human behavior in a real world increasingly subject to surveillance.
Though biometric surveillance has attracted extensive scholarship, few quantitative behavioral studies have shed light on the expectations, assumptions, and range of behavioral responses employed by individuals hoping to evade prediction by an adversarial system. In this work, we aim to explore the following questions:

(1) RQ1: Under what conditions might biometric signals such as motor outputs, eye movements, and electrodermal activity expose sensitive information relating to beliefs and immediate choice intentions?
(2) RQ2: How effectively can commonly used machine learning models predict choice intention given behavioral data during decision-making?
(3) RQ3: What strategies are used by participants when instructed to make unpredictable decisions in a simple tracked setting?
(4) RQ4: How effective are the employed strategies at maintaining an individual's privacy of intent?

We note that, as per RQ2, we seek to understand the approximate performance of minimally tuned off-the-shelf tools, rather than to estimate an upper bound on prediction accuracy. The development of novel algorithms that might improve performance on this prediction task was not among the objectives of this work (see Section 4.4 for discussion).
In order to study human behaviors related to the protection of private intent, we developed a game-based experiment in which participants were sufficiently motivated to avoid predictability while behaving in a physical environment. Specifically, we defined the following desiderata to constrain experimental design:

(1) Covert imperative. Participants should be sufficiently incentivized to obscure their intent.
(2) Non-trivial choice-salience. Decisions made in the task must influence a relevant outcome (e.g. compensation); participants must consider an extrinsic value of their choice separate from masking intent.
(3) Embodied. Choices should be enacted by gross motor outputs as opposed to verbal reports to allow for action simulation or other micro-motor outputs to be detectable.

Based on these criteria, we designed a virtual reality task (inspired by the popular game Set) as follows. In each trial, the participant is presented with two cards in a private ‘hand’ facing them, and two cards flat on a table (see Figure 2). During the 10-second decision phase, they must choose one of the two single cards on the table. Then, during the 3-second selection phase, they use the controller to bring the chosen card into their hand to complete a trial. A trial is successful if the resulting three cards in the hand form a complete ‘match’. A successful match is defined as three cards for which both card attributes, count and shape, are either all the same or all different. A trial is completed either upon the participant's releasing their chosen card into their hand, or when the selection timer elapses. A new trial is started after completion of the last, with a new deal of two private and two table cards. See Section 2.4 for a more detailed description of the task and experimental protocol.
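To make the matching rule concrete, the following is a minimal sketch (not code from the study's repository) of how a 'match' can be checked; the dictionary-based card representation and attribute names are hypothetical.

```python
def is_match(cards):
    """Return True if three cards form a valid match: for each attribute
    (count and shape), the values must be all the same or all different."""
    assert len(cards) == 3
    for attr in ("count", "shape"):
        values = [card[attr] for card in cards]
        if len(set(values)) == 2:  # two alike and one different: not a match
            return False
    return True

# Example: two private hand cards plus one candidate card from the table.
hand = [{"count": 1, "shape": "circle"}, {"count": 2, "shape": "circle"}]
table_card = {"count": 3, "shape": "circle"}
print(is_match(hand + [table_card]))  # True: counts all differ, shapes all match
```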
Study participants were recruited via e-mail through an on-campus experimental lab at a public university in the United States. Participants were screened to minimize the risk of adverse effects from the use of VR: participants indicating dizziness or nausea during prior uses of VR, epilepsy, a history of seizures, or that they are prone to motion sickness, were not considered for the full study.

To support the use of parametric tests with our mixed between-subjects and within-subjects design, we selected a target sample size of 35 subjects per group (70 total). Since machine-learning-based classification accuracy was a primary dependent variable, it was critical to ensure both that (1) the training set contained sufficient data with respect to the dimensionality of the feature-space and complexity of the model to effectively generalize, and (2) that the test set was large enough to accurately estimate model performance. Despite stopping data collection early due to the COVID-19 pandemic (with 57 completions), our resultant dataset included 1200 non-practice subject-trials, which we deemed sufficient for our planned analysis.

Two participants were dropped after they indicated in the post-experiment interview that they misunderstood the instructions and believed the task required them to enable the adversary to predict their intent (PP 004), or to provide training labels to the adversary (PP 041).

Participants were randomly placed into one of the two study conditions: early adversary (N=27), or delayed adversary (N=28). The two study conditions operated identically except for the trial number at which the adversary info screen appeared (see 2.4), allowing us to control for potential learning effects. Participants had a mean age of 20.4 years.
After providing consent, each participant went through a self-paced slide-based tutorial using the experiment computer. The tutorial explained the pattern matching task, and provided quizzes to ensure the participant had understood how to identify a match consistent with the rules of the game. The tutorial also detailed how a final success score would be calculated, and the monetary bonus that would be provided depending on this score. Bonuses were calculated based on the percent of successful rounds (a correct match for pre-adversary rounds, and both a correct match and the avoidance of prediction in the adversarial rounds); however, information about the adversary was not presented until later. All participants received the base compensation ($20) for their participation, with performance-based rewards contributing an additional 0–25% of the base level. At the end of the tutorial the experimenter gave the participant an opportunity to ask any clarifying questions.

Next, the experimenter fitted the participant with an Empatica E4 skin conductance and heart rate monitor, and a Vive Pro Eye VR headset. The participant then completed a brief exercise in which they fixated on targets that appeared on the screen in sequence to calibrate the in-unit eye tracking system. Once complete, the participant was asked to use the trigger on the controller to begin the first of four (unscored) practice rounds in the experiment. The practice rounds simulate a normal trial, and allow the participant to become comfortable with the mechanics of the interaction in VR. Practice rounds, like regular rounds, had a 10-second decision timer and a 3-second selection timer, so that participants learned how to make their selection in the allotted time.
Figure 2: Left: schematic of subject during a trial. Top-right: participant in VR. Bottom-right: participant's field of view during card selection.
After completing all four practice rounds, the participants were asked to confirm they understood the task and the interaction, and if so, they used the controller trigger to begin the main experiment. After three trials (early condition) or 22 trials (delayed condition, midway point), the participant was presented with a screen informing them that the task would be changing and that in all remaining trials, an adversary would be “tracking their behavior during the decision phase” and attempting to predict which card they would choose. They were also informed that success on each following trial would be determined by both selecting a correct match and avoiding their choice being predicted by the adversary. After the participant reviewed this information, the experimenter verbally asked whether the information was clear and prompted for any final questions before instructing the participant to continue with the experiment when ready.

All participants completed all 44 non-practice trials. Upon completion of the last trial, a screen appeared summarizing the participant's performance across all trials (counts of correct matches, and successful avoidance of prediction), and displaying their final success score and bonus compensation earned.

After removing the VR headset, the participant was asked to respond to a short questionnaire and structured interview covering their beliefs about the biometric data that was used, strategies they employed, and confidence in their efficacy. Once complete, a debrief form and media release form were signed, and finally the participant was provided their incentive payment and told they may exit the lab. The recruitment process and study protocol were approved by the local ethics review board.

The VR experiment was implemented in C#. Fixation data from the headset's eye tracker was converted into a series of fixation records tracking start and stop timestamps, as well as the scene element fixated. Secondly, raw gaze data was recorded in the update loop, providing a gaze origin and direction vector in world coordinates, as well as an approximate vergence distance allowing tracking of the implied (x, y, z) coordinates of the raw gaze target. Physiology data, including heart rate, inter-beat interval (IBI), and electrodermal activity, was extracted from data files produced by the E4. We extracted EDA spike timestamps using peakutils [25], and computed heart rate variability (HRV) based on IBI records using the HRV Python library [5].
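As a rough illustration of this processing step, the sketch below detects EDA peaks with peakutils and computes RMSSD, one common time-domain HRV summary, directly with NumPy. The threshold and minimum-distance values are illustrative defaults rather than the study's parameters, and the study itself used the hrv package [5] rather than this manual computation.

```python
import numpy as np
import peakutils

def eda_spike_times(eda_signal, timestamps, thres=0.3, min_dist=20):
    """Return timestamps of detected EDA peaks (illustrative parameters)."""
    peaks = peakutils.indexes(np.asarray(eda_signal, dtype=float),
                              thres=thres, min_dist=min_dist)
    return np.asarray(timestamps)[peaks]

def rmssd(ibi_ms):
    """Root mean square of successive differences between inter-beat intervals (ms)."""
    diffs = np.diff(np.asarray(ibi_ms, dtype=float))
    return float(np.sqrt(np.mean(diffs ** 2)))
```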
All data was aligned and registered by comparing synchronized device timestamps and trial-by-trial metadata recorded by the Unity experiment. All experiment code is available in an MIT-licensed repository: https://github.com/onejgordon/cec_vr.

During experimentation, the adversary was programmed to make predictions based on a simple gaze-based heuristic trivial to compute in real time. Because only an aggregate success score was provided to participants after all trials were complete, the specific mechanism of prediction could not influence participant behavior.

The questionnaire and interview were intended as exploratory methods to capture unexpected responses and hear from participants in their own words. While the experimental study and quantitative results are the primary contributions of this work, the qualitative data allowed for more nuanced interpretation and set up possible focus areas for future, more in-depth investigation.

For details on Tobii's G2OM algorithm, see https://vr.tobii.com/sdk/technology/tobii-g2om/
To assess participants' effectiveness at protecting privacy of intent, we trained a discriminative machine learning model to predict a participant's ultimate choice of card. Trials where the participant failed to choose any card before time elapsed were excluded, so the model is required only to correctly predict a binary label corresponding to choose left or choose right. Data available to the predictive model included all physiological and behavioral data collected during the 10-second decision phase, which was clearly demarcated to participants by a text alert and changing background color (see Section 2.6.2 for details). Data captured during the selection phase, such as participants' arm motion to select a card, was not available to the algorithm.

We conducted experiments within two related prediction paradigms, both consistent with plausible real-world settings but presenting unique challenges.

In the first, we trained a participant-agnostic (PA) model to predict participant choice without any identifying information about the participant or their behavior in other trials. The prediction problem is posed in the classical machine-learning format, to learn a predictive model M_PA = P(Y_t | X_t), where Y_t is the choice label for trial t, and X_t is the corresponding feature vector.

In the second paradigm, we trained an ensemble of behavioral-typology (BT) models that first group participants based on several rounds of trial data (including choice labels). The model is then required to predict participant choice for all remaining trials. This paradigm was motivated by the qualitative observation that participants' choices often became predictable to the experimenter once a strategic pattern was detected.

As such, the behavioral-typology model is given trial data and labels from the first 3 adversarial trials completed by each participant. The prediction problem for each trial t becomes M_BT = P(Y_t | X_t, X_{1,2,3}, Y_{1,2,3}). Here, Y_t and X_t are defined as before, and X_{1,2,3} and Y_{1,2,3} are the concatenated data and labels, respectively, from the first three trials.

To test this model, we chose a simple heuristic based on the most choice-informative feature according to exploratory analysis of the training set: the proportion of eye fixations on the chosen card. Participants were assigned a behavioral typology based on this metric (fixations favoring the eventually chosen card, fixations favoring the other card, or mixed).

Both prediction problems used the same dataset, which was produced by randomly assigning 50% of participants into each set (N_train = 28 and N_test = 27), with each trial's set (train vs. test) determined by the participant's assignment.
As such, all models were tested using trials from novel participants, with no opportunity for train-test leakage. The resultant matrix contained 1200 trials, of which 52% were used for training. To compare predictive performance and the success of obfuscation, independent models were trained on pre- and post-adversarial trials separately.

We tried a number of off-the-shelf classification algorithms for this task, and report results for two that performed best overall: Scikit-learn's implementation of the Random Forest Decision Tree (RFDT) and Gradient Boosted Decision Tree (GBDT) classifiers [28]. Due to the limited sample size and lack of a separate evaluation set, only minimal hyper-parameter tuning was performed in order to avoid overfitting the test sample. To assess each model's performance we report accuracy scores (percent of trials correctly predicted) across all test participants' trials.
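A minimal sketch of the participant-agnostic paradigm is shown below: participants (not trials) are split between train and test sets, and an off-the-shelf gradient boosted classifier is fit with near-default settings. The column names and data-frame layout are assumptions for illustration, not the study's actual code.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

def train_participant_agnostic(trials: pd.DataFrame, feature_cols, seed=0):
    """trials: one row per trial with a 'participant' id, a binary 'choice' label,
    and the per-trial feature columns described in the following subsections."""
    rng = np.random.RandomState(seed)
    participants = trials["participant"].unique()
    train_ids = rng.choice(participants, size=len(participants) // 2, replace=False)
    is_train = trials["participant"].isin(train_ids)

    X_train, y_train = trials.loc[is_train, feature_cols], trials.loc[is_train, "choice"]
    X_test, y_test = trials.loc[~is_train, feature_cols], trials.loc[~is_train, "choice"]

    # Minimal tuning, mirroring the use of off-the-shelf classifiers described above.
    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))
```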
Fixations and Gaze. To conform with the training paradigm, which called for the generation of a single binary prediction per trial, we employed a number of common techniques to extract lower-dimensional feature vectors from the high-dimensional raw gaze and fixation data collected during each trial's decision phase. Fixations, which were tallied with a start and stop timestamp and target object, were used to produce features indicating the fixation count (F_i) and fixation fraction (F_i / Σ_j F_j) for each key object in the scene (e.g. left card, right card, holder, table). Since each table card was expected to be a particularly informative region of interest, we also computed the minimum, maximum, and mean duration of fixations on each. Additional fixation-based features included: last and second-to-last fixation object, and percent of trial for which eyes were closed.

In contrast to discrete fixations, features based on raw gaze points have the potential to capture behavior in which participants look near but avoid direct fixation upon an object of interest. We computed simple descriptive features from the mean, mode, and standard deviation of each of the three coordinate dimensions.
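The sketch below illustrates this kind of fixation feature extraction; the record format and object names are hypothetical stand-ins for the scene elements described above.

```python
import numpy as np

SCENE_OBJECTS = ["card_left", "card_right", "holder", "table"]  # illustrative names

def fixation_features(fixations):
    """fixations: list of dicts with 'start', 'stop' (seconds) and 'target' fields,
    covering one trial's decision phase."""
    feats = {}
    total = len(fixations) or 1
    for obj in SCENE_OBJECTS:
        durations = [f["stop"] - f["start"] for f in fixations if f["target"] == obj]
        feats[f"fix_count_{obj}"] = len(durations)          # F_i
        feats[f"fix_frac_{obj}"] = len(durations) / total   # F_i / sum_j F_j
        if obj in ("card_left", "card_right"):
            feats[f"fix_dur_min_{obj}"] = min(durations, default=0.0)
            feats[f"fix_dur_max_{obj}"] = max(durations, default=0.0)
            feats[f"fix_dur_mean_{obj}"] = float(np.mean(durations)) if durations else 0.0
    # Last and second-to-last fixation targets, as described above.
    targets = [f["target"] for f in fixations]
    feats["fix_last"] = targets[-1] if targets else None
    feats["fix_second_last"] = targets[-2] if len(targets) > 1 else None
    return feats
```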
Head and Controller Motion. To capture dynamics of the controller and head-mounted display (HMD), which measures head position and rotation, we included descriptive statistics from the position and rotation of each (e.g. HMD Y-position standard deviation, controller yaw mean, final HMD roll). Position and rotation features included an absolute version, as well as one relative to trial start. We estimated mean and maximum velocity based on position deltas and timestamps.

Finally, to capture more granular information that might indicate the trajectory of head and arm motion, we broke the decision phase into 10 equal-duration segments, and included features for 3-axis position snapshots at each time point.
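A sketch of the segment-snapshot and velocity features is given below, assuming a time-stamped stream of 3-D positions for either the HMD or the controller; it is an approximation of the approach described above, not the study's implementation.

```python
import numpy as np

def trajectory_features(timestamps, positions, n_segments=10):
    """timestamps: (T,) seconds; positions: (T, 3) positions during the decision phase."""
    t = np.asarray(timestamps, dtype=float)
    p = np.asarray(positions, dtype=float)
    feats = {}

    # 3-axis position snapshot at the end of each of n_segments equal-duration bins.
    edges = np.linspace(t[0], t[-1], n_segments + 1)[1:]
    for i, edge in enumerate(edges):
        idx = np.searchsorted(t, edge, side="right") - 1
        for axis, name in enumerate("xyz"):
            feats[f"pos_{name}_seg{i}"] = float(p[idx, axis])

    # Mean and maximum speed estimated from position deltas and timestamps.
    dt = np.diff(t)
    speed = np.linalg.norm(np.diff(p, axis=0), axis=1) / np.where(dt > 0, dt, np.nan)
    feats["vel_mean"] = float(np.nanmean(speed))
    feats["vel_max"] = float(np.nanmax(speed))
    return feats
```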
Our analysis indicates significant changes when comparing behavior between adversarial and non-adversarial trials, as well as substantial individual differences during performance of the task. In Section 3.2 we review results, including the post-task questionnaires and interviews, related to apparent behavioral strategies and avoidance of prediction.
In the following sections we report results of analyses comparing data before and after each participant received the adversarial instruction. Unless stated, analyses include participants from both early adversary and delayed adversary conditions. The physiological and behavioral variables analyzed exist on a vague continuum from signals typically considered involuntary (e.g. heart rate variability and electrodermal activity), to those usually interpreted as the result of intentional, conscious action (e.g. gross arm movements). The very premise of this study rests on the ambiguous middle ground, where participants may attempt to control or alter behaviors they know to be under observation in order to obfuscate their true intent. Which hand motions, head tilts, or saccades are the product of voluntary control versus involuntary physiological or motor outputs cannot be established by our analyses, or indeed any algorithm [16]. Though post-task interviews do shed light on the strategies consciously selected by participants, self-reports provide only a subset of all effects on behavior exhibited. For the purposes of this study, however, we need not discriminate voluntary versus involuntary action, and rather ask only how measurable behaviors change, and to what extent these changes affect the ability of an algorithmic agent to successfully predict a participant's ultimate choice. For an analysis of behavior changes consistent with self-reported strategies, see Section 3.2.
Though a slight peak is observed after adversarial trials start, we found no statistically significant difference in heart rate between non-adversarial and adversarial trials. Results suggest that the number of EDA spikes was reduced post-adversary, though the reduction did not reach statistical significance (independent t-test; see Figure 3).

In adversarial trials, participants tended to move their arm and head less. Specifically, we assessed the range of motion along each rotational dimension (pitch, roll, yaw), as θ_range = max(θ) − min(θ). Range on all three dimensions for both controller (arm) and HMD (head) was significantly lower in post-adversary trials (statistical test results shown in Figure 4).

The distribution of raw gaze data also shifted significantly during adversarial rounds: both the mean (y_mean) and standard deviation (y_SD) of gaze target height decreased significantly.

Figure 3: Analysis of physiology data before and after adversary. (a) Heart rate time-series for all delayed condition participants, across full session duration. Red dashed line indicates adversarial transition. (b) Count of EDA spikes, pre- and post-adversary. Reduction in spikes was not statistically significant according to independent t-test.

Post-task questionnaires and interviews probed participants on how they interpreted the requirements of the post-adversary task, how they modified their behavior (if at all) to avoid prediction, as well as the experience of being asked to make a decision under adversarial surveillance. While the primary motivation for these interviews was to verify participants' comprehension of the task, they contain some useful insights into conscious strategies used, as well as the challenges posed by the task overall.

Questionnaire responses indicate that participants primarily adjusted behavior related to gaze and arm motion (see Figure 5). In interviews, participants identify a range of meta-strategies which we categorize as signal dampening, signal randomizing, and overt misdirection. Signal dampening strategies included the suppression of behaviors that participants believed might give away their intent, such as holding their head still, reducing motion in their arm, pointing their hand in a neutral direction, entraining gaze on a neutral portion of the table or on the timer, and looking near but not directly at the cards on the table. Signal randomizing techniques included adding excess behaviors during each trial, such as shifting gaze widely and continuously and moving their head back and forth at random, as well as randomizing overt asymmetrical behaviors by pointing the controller sometimes at the chosen card, and sometimes away.

Overt misdirection included intentional actions that ‘telegraph’ intent opposite to true choice, such as fixating gaze or head position for long durations on the opposite card, moving the controller towards the opposite card during the decision phase, etc.
Figure 4: Head and controller range pre- (blue) and post-adversary (red). Range for both reduced significantly post-adversary, suggesting support for the signal dampening strategy. p-value for independent-sample t-test and Cohen's d shown above each plot.

Figure 5: Results from post-task survey. Left: Participants reported modifying multiple behaviors, with gaze and arm motion most often selected. Right: 61% of participants responded ’Agree’ or ’Strongly Agree’ when asked if they believed they were able to successfully influence the adversary's ability to predict their choice.

Several participants reported initially looking at or near the table cards just long enough to make their decision, then switching to a misdirection or obfuscation strategy for the remainder of the decision phase: “I would look at both and figure out what the right answer was, and then I would stare at the wrong one, or go off into my own thoughts” (PP 012). Other participants talked about an evolving thought process on obfuscation, and the realization that consistent misdirection may also be predictable: “At first I thought, oh it makes sense to try not to do what you would normally do, like if you're going to pick this, then look at the other one for longer. After a while I was like, oh that's also predictable, so maybe switch it up.” (PP 015).

Over 60% of participants believed they were able to influence the adversary's ability to predict their intent using the strategies they employed.
Figure 6: The increasing overlap of the gaze x-coordinate distribution between left vs. right trials post-adversary illustrates behavior change to deter prediction based on gaze. On average, however, final gaze point was highly predictive of choice both pre- and post-adversary.
While the majority of participants reported that they believed they were able to influence the adversary's prediction, some weren't sure, and others disagreed. In post-task interviews, some of these participants noted that they didn't understand how the adversary was making its predictions, and so felt it wasn't clear how they could effectively respond (PP 046, PP 016, PP 057): “I guess I just didn't really know what it was going to be looking at, so I didn't know what to change” (PP 017). Others found the matching aspect of the task challenging, and felt they needed to focus on selecting the right card rather than avoiding prediction: “I was more preoccupied with getting the right card, so kind of forgot about [the adversary] sometimes” (PP 003). For another, the decision and the mechanics of making their selection took priority: “I was still trying to pick the right card, like grab it in the right amount of time” (PP 016).

These assessments highlight that for some participants, uncertainty about the dynamics of algorithmic prediction undermined their feeling of agency. Additionally, the cognitive load imposed by the combination of the card matching task and time limit was challenging for some participants to balance with the need to obfuscate. In the following section we review results comparing biometric data when left is chosen versus when right is chosen, thus identifying choice-correlated asymmetries likely exploited by the predictive model.
To detect behavioral patterns that might be leveraged by a predictive model, we performed a variety of exploratory analyses comparing raw data and features collected during the decision phase, segmented by participant choice. Any asymmetries seen either across the full training set, or within a substantial sub-group, might indicate strategic behaviors or involuntary predictors correlated with intent. We report several such asymmetries below.
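One simple way to screen for such asymmetries, sketched below under assumed column names, is to compare each feature between left-choice and right-choice trials using an effect size and an independent-samples t-test, mirroring the statistics reported in this section.

```python
import numpy as np
from scipy import stats

def choice_asymmetries(trials, feature_cols, label_col="choice"):
    """Compare each feature between left- and right-choice trials (pandas DataFrame)."""
    left = trials[trials[label_col] == "left"]
    right = trials[trials[label_col] == "right"]
    results = {}
    for col in feature_cols:
        a, b = left[col].dropna(), right[col].dropna()
        pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
        d = (a.mean() - b.mean()) / pooled_sd if pooled_sd > 0 else 0.0  # Cohen's d
        t, p = stats.ttest_ind(a, b)
        results[col] = {"cohens_d": float(d), "t": float(t), "p": float(p)}
    return results
```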
Gaze.
Even when averaging across the full training population, the x-coordinate of the final raw gaze point was highly predictive of choice (see Figure 6).
Longitudinal Observations.
Though comparing summary statistics of gaze and motion data highlights systematic differences in behavior, this approach ignores the richness of temporal visual attention inherent to this decision-making paradigm. By analyzing individual participants' movement and gaze longitudinally (both across trials, and within each trial), additional behavioral regularities appear.
Figure 7: Total fixation duration (top) and fixation time-series (bottom) by trial for 4 selected participants (PPs). Green: fixations on chosen card; Blue: fixations on other card; Gray: fixations on non-card objects (shown in sequence charts only). Trials progress vertically from top to bottom, and the dotted red line indicates the beginning of adversary trials. Note that trials for which no card was selected are omitted.
Figure 7 illustrates trial-wise fixation time-series and total duration by card (chosen versus not chosen) for selected individual participants. These plots demonstrate dynamics of fixations, and behavioral change post-adversary, found to be representative of common behavioral patterns among the training population.

Participants PP 010 and PP 005 both favored fixations on the card they eventually chose prior to the adversary, but show very different behavior post-adversary. Post-adversary, PP 005 reduces fixation duration on their chosen card, but both fixation duration and final fixation (note the regularity in final fixation) still indicate choice on nearly every post-adversary trial. PP 010, on the other hand, is likely using a misdirection strategy and intentionally switches visual attention to the opposite card.

PP 014's fixation behavior illustrates a key finding—in some cases, conscious obfuscation may unintentionally unmask intent. While fixations are well-balanced pre-adversary, the participant's choice becomes significantly more predictable post-adversary under an apparent strategy of fixating on the opposite card.

Overall, these analyses highlight the value of providing temporal gaze sequence features (in addition to duration and fixation count) to the predictive models.
Arm Movement.
An analysis of arm movement (measured by controller trajectories) reveals a substantial behavior change pre- and post-adversary. Trajectories are consistently stereotyped by choice in early non-adversarial trials, and show many participants preemptively (prior to the selection phase) moving their controller over the chosen card in preparation for selection. In post-adversary trials, we see support for self-reported strategies involving constrained controller motion as well as misdirection (e.g. the density of final controller positions over the opposite card in the bottom center plot in Figure 8).

Figure 8: Top-down view of controller trajectories (left), final position (middle), and final velocity (right) at termination of decision phase. Turquoise: right selected; Magenta: left selected. Clear choice correlation can be seen in all three metrics pre-adversary. Post-adversary data exhibits less stereotyped segmentation, but clear differentiation is still visible in both final position and final velocity.

Despite this visible obfuscation behavior, choice-correlation remains in final controller position, and particularly in final velocity. We discuss these findings further in Section 4.
Participant-Agnostic Predictor.
Even in the more challenging participant-agnostic paradigm, where the predictive model must identify features successfully correlating with choice without the context of prior participant behavior, prediction rates were high. The best-performing model was the Gradient Boosted Decision Tree (GBDT), with a prediction accuracy of 93% on pre-adversary trials, and 73% on post-adversary trials (see Figure 9). The most predictive features included last gaze fixation target, mean gaze X-coordinate, and final HMD roll offset.
Behavioral-Typology Predictor.
As anticipated, the behavioral-typology (BT) predictor improved overall prediction accuracy, achieving 72.4% for the most effective (least predictable) strategy typologies, and 82.7% for the least effective behavioral typologies (see Figure 11 for BT model performance details). The majority of participants are reliably predicted in between 75 and 100% of trials using this technique (see accuracy distribution post-adversary in Figure 10). Table 1 provides a summary of model prediction accuracy.
Strategic Efficacy.
We can quantify the efficacy of a participant's change in behavior by comparing our trained model's prediction performance between pre- and post-adversary trials.
Figure 9: Model accuracy comparison for participant-agnostic predictor, with forward addition of feature-sets. Feature sets were added as follows: 1) Gaze features, 2) Fixation features, 3) HMD features, 4) Controller features.

Figure 10: BT model predictability pre-adversary (x-axis) compared to post-adversary (y-axis). Some test participants (red) became more predictable post-adversary. The distribution of predictability (right) illustrates that the majority of participants' choices could be successfully predicted in between 75% and 100% of their trials. Note: the chart at left includes only test participants in the delayed adversary condition, since BT model performance cannot be assessed from the three pre-adversary trials. The chart at right includes all test participants across both conditions.

We define a metric for strategic efficacy (Ψ) as follows:

Ψ = N_correct,pre / N_total,pre − N_correct,post / N_total,post

If efforts to conceal intent are effective, prediction accuracy post-adversary will be lower (positive Ψ).
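A direct reading of the definition, with hypothetical counts for illustration:

```python
def strategic_efficacy(pre_correct, pre_total, post_correct, post_total):
    """Psi: pre-adversary prediction accuracy minus post-adversary accuracy.
    Positive values mean the participant became harder to predict."""
    return pre_correct / pre_total - post_correct / post_total

# Hypothetical participant: predicted in 20/22 pre-adversary and 15/22 post-adversary trials.
print(strategic_efficacy(20, 22, 15, 22))  # ~0.23: obfuscation reduced prediction accuracy
```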
Figure 10 plots BT model accuracy pre- and post-adversary, thus illustrating strategic efficacy (distance from diagonal). Participants for whom prediction accuracy decreased are plotted in green, while increases in prediction accuracy are plotted in red.

Table 1: Summary of predictive model performance. Best-performing models for each prediction paradigm are listed for both pre-adversary (pre) and post-adversary (post) trials: Gradient Boosted Decision Tree (GBDT) for the Participant-Agnostic (PA) predictor, and Random Forest Decision Tree (RFDT) for the Behavioral-Typology (BT) predictor.

Model            Train Accuracy   Test Accuracy
GBDT (PA, Pre)   97.1%            92.0%
GBDT (PA, Post)  72.0%            73.4%
RFDT (BT, Pre)   –                91.0%
RFDT (BT, Post)  –                80.4%

The clustering of test participants below the diagonal illustrates that, through strategic adjustment to behavior, most participants were able to successfully reduce choice prediction, but not by much. Indeed, only one participant (shown at lower right) evaded prediction in more than half of trials, and on average across participants, 80.4% of post-adversary trials were successfully predicted.

Also visible in Figure 10 is the surprising finding that for a number of participants (red dots), efforts to become less predictable had the opposite effect, producing an increased prediction accuracy post-adversary.

To further investigate behavioral and physiological differences that might explain variance in strategic efficacy, we compared two groups of test participants: those less than 70% predictable post-adversary, which we term ‘effective’, and those at least 70% predictable post-adversary, which we term ‘ineffective’. The latter group can be seen as the rightmost disjoint cluster in the histogram in Figure 10. We find that the effective group exhibited a statistically significant reduction in controller movement (yaw standard deviation), along with several other significant differences between the groups.

The experimental paradigm presented here was designed to provide insights into both the sensitivity of behavioral and biometric data when leveraged towards the prediction of future decisions (RQ1/2), the response of individuals confronted with a specific kind of adversarial surveillance (RQ3), as well as the interaction between the two in the effect of awareness of tracking on an individual's ability to maintain private intentions (RQ4).

Classification accuracy results indicate that collected data were extremely predictive of choice (over 90% accuracy) for participants during pre-adversary trials. Though pre-adversary prediction was
performed primarily to produce a baseline against which to compare performance during adversarial trials, this finding alone deserves some consideration. In the absence of any reason to mask intent, behavioral data collected during deliberation was sufficient to predict eventual choice in the vast majority of trials.

Figure 11: Behavioral-Typology model performance for each inferred typology based on proportion of fixations: fixations prefer chosen card, fixations prefer other card, and mixed.

In adversarial trials, after being informed of the imperative to decide which card to select without revealing intent, most participants adopted strategies based on logical assumptions about the dynamics of prediction. Participant interviews revealed a thought process involving the various data they believed might be used by the adversary. The gaze calibration exercise at the beginning of the VR session suggested to many participants that their eye movements may be among the data used by the adversary to anticipate their decision. Seeing their controller in VR was another strong signal that arm motion might be a target of tracking, as well as an opportunity for masking or obfuscation. Some participants observed that during pre-adversary trials their hand would begin to move just before the end of the decision phase, and reported attempting to adjust, invert, or randomize this behavior post-adversary. Despite correctly identifying these signals, however, components of gaze, head, and arm motion all remained correlated with choice when averaged across the test sample. These correlations, in aggregate, allowed the classifiers within both prediction paradigms to successfully predict card choice in spite of the strategies employed.

To understand why, it is instructive to revisit one of the most informative data types captured: gaze fixations. In Figure 7, PP 010 and PP 005 both demonstrate significant adjustments to gaze dynamics upon the beginning of adversarial trials. Both strategies, once identified, remain predictive of choice—PP 005 reduces overall fixations on either card, and appears to randomize fixations while deciding, but fixates their chosen card at the end of each decision phase (green lines at end of post-adversary trials). PP 010 also reduces overall card fixations, and inverts their gaze behavior in a way that remains highly correlated with choice.

Optimal behavior, as any child fond of rock-paper-scissors (or game theory) intuitively knows, depends upon the randomization of behavior in order to minimize correlations between observable data and intended action. Randomization, even of a single series of discrete choices, however, is known to be impossible for humans to achieve [32]. In contrast to univariate randomization, our task is especially challenging due to its multi-dimensionality—it requires successful randomization of all detectable behaviors simultaneously.

Finally, several of the strategies detected appear consistent with the types of intuitive behavior that might be effective at masking intent from a human observer. This is one interpretation of results showing lowered head pitch (Figure 4) and lowered gaze, which fail to obfuscate the target of gaze from eye-tracking hardware like that used in this study.
The prediction accuracy results achieved in our analysis are not meant to establish a ceiling. Rather than developing an optimal behavior prediction tool, the aim of this work was to assess the feasibility of intent prediction, as well as the range of individuals' responses to an explicitly surveilled task paradigm. With a larger training set, it is likely that even the standard statistical techniques used in this analysis would learn an improved behavioral model capable of exploiting the regularities of a wider variety of strategies and therefore achieve prediction accuracy improvements.

An additional limitation stems from the simplistic behavioral-typology inference model, which we based on a single metric (card fixation ratio) indicative of a single behavioral strategy. With a slightly larger participant pool, an unsupervised learning approach identifying natural subject clusters or axes of behavioral variation would likely have produced improved prediction accuracy for the BT model.

While our results suggest dynamics that may extend into real-world contexts in which individuals interact with surveillance systems, the typical considerations of the generalizability of in-lab findings apply here. While a significant performance-based incentive was used to motivate task success, behavior in a setting with more significant potential consequences—as is increasingly relevant to real-world surveillance—might well display dynamics different from those observed in this study. Relatedly, though the game-based setting was conducive to our analysis, studying participants as they make more naturalistic decisions (e.g. who to vote for, or how much to tip) may offer insights that more easily extend to everyday behavior.

We hope future work will build on this preliminary study to develop a more holistic understanding of individuals' reasoning about and response to biometric surveillance.
The kinds of privacy risk supported by our findings are likely best addressed through a combination of public awareness, regulatory action, and the concerted efforts of designers [2, 29, 36].

Among other proposed solutions, Bailenson suggests users may take it upon themselves to use hardware filters capable of adding noise and reducing the fidelity of collected data [3]. Our findings, however, suggest that users may overestimate the efficacy of more intuitive obfuscation strategies, and underestimate the sensitivity of the data collected from them. As such, hoping to encourage costly user action, such as obtaining and using obfuscation devices, may not be a reliable solution.

Public education, however, must certainly play a role. Alphabet's “Digital Transparency in the Public Realm” initiative is one example that hopes to encourage cities and other actors collecting data in public spaces to use a set of colored privacy icons to indicate the types and sensitivity of data being collected [14]. The behaviors and misconceptions illuminated by our results, however, indicate some critical limitations of this effort as well. What will an individual who notices an eye-tracking symbol on a light-post do with this information? Will Facebook require users of its recently announced AR glasses to wear warning labels to inform passersby of ongoing tracking [11]? What use is informing members of the public about surveillance, without instructions for opting out? And if, as was the case for some of our participants, strategies to protect one's privacy instead act to expose sensitive information, might messaging like this cause additional harms?

Even more fundamentally, however, any response to tracking technologies must contend with the fact that what is being collected is a moving target: the signals hidden within raw data evolve with new algorithms, new users, and integrations with additional data sources. The sensitivity emerges not from the data itself, but from what can be done with it. This study highlights the counterintuitive nature of “what can be done,” and as such the challenges privacy advocates must overcome when communicating these ideas to the public.
It is crucial, when conducting work either employing or studying technologies that have been used to harm individuals and communities, to critically examine the potential impact on these populations and society as a whole. We reason that this research is unlikely to reinforce harms, and holds the potential to bring valuable data to academic and public dialogue regarding the capabilities, risks, and vulnerabilities that may be exploited by algorithmic surveillance.

First, this work does not develop any novel algorithm or computational model that might improve predictive performance, but rather explores the efficacy of off-the-shelf machine learning tools. Second, the findings of this work likely parallel research conducted privately within organizations seeking to leverage algorithmic prediction in the interest of financial or political gain. If our speculation related to potential prediction accuracy is correct, we must acknowledge the multiple orders of magnitude that separate the sample size and scope of data used in this study from that available to companies and other organizations already in the business of collecting biometric data from individuals.
In this work, we examined the expectations individuals hold about biometric surveillance, and how these beliefs influence behavioral response to a tracked setting. Our results suggest that participants hold a range of priors about the nature of biosignals that might be leveraged for prediction, and use a wide variety of strategies to attempt to make a future choice less predictable. While some participants questioned their agency in evading the adversary, most modified their behavior and successfully reduced their prediction accuracy. However, data collected remained highly predictive of choice (over 80% mean accuracy), and the majority of participants were correctly predicted by the behavioral-typology model in 75–100% of trials. Importantly, a meaningful subset of participants adopted a strategy that on average increased the model's ability to successfully predict their choice, suggesting the counterintuitive nature of the dynamics of algorithmic prediction.
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 2019236659. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
REFERENCES
[1] Imtiaz Ahmad, Rosta Farzan, Apu Kapadia, and Adam J Lee. 2020. Tangible privacy: Towards user-centric sensor designs for bystander privacy. Proceedings of the ACM on Human-Computer Interaction 4, CSCW2 (2020), 1–28.
[2] Claudio A Ardagna, Marco Cremonini, Sabrina De Capitani di Vimercati, and Pierangela Samarati. 2009. An obfuscation-based approach for protecting location privacy. IEEE Transactions on Dependable and Secure Computing 8, 1 (2009), 13–27.
[3] Jeremy Bailenson. 2018. Protecting Nonverbal Data Tracked in Virtual Reality. JAMA Pediatrics (2018).
[4] Labor History 51, 1 (2010), 87–106.
[5] Rhenan Bartels and Tiago Peçanha. 2020. HRV: a Pythonic package for Heart Rate Variability Analysis. Journal of Open Source Software 5, 51 (2020), 1867.
[6] Sian L. Beilock and Lauren E. Holt. 2007. Embodied Preference Judgments: Can Likeability Be Driven by the Motor System? Psychological Science 18, 1 (Jan. 2007), 51–57. https://doi.org/10.1111/j.1467-9280.2007.01848.x
[7] Clint Boulton. 2013. Snackmaker modernizes the impulse buy with sensors, analytics. CIO Journal 11 (2013).
[8] Joseph Bullington. 2005. ’Affective’ computing and emotion recognition systems: the future of biometric surveillance?. In Proceedings of the 2nd annual conference on Information security curriculum development. 95–99.
[9] David Chaum. 2004. Secret-ballot receipts: True voter-verifiable elections. IEEE Security & Privacy 2, 1 (2004), 38–47.
[10] Tamara Denning, Zakariya Dehlawi, and Tadayoshi Kohno. 2014. In situ with bystanders of augmented reality glasses: Perspectives on recording and privacy-mediating technologies. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2377–2386.
[11] Facebook. 2020. Announcing Project Aria: A Research Project on the Future of Wearable AR. https://about.fb.com/news/2020/09/announcing-project-aria-a-research-project-on-the-future-of-wearable-ar/.
[12] Karl Friston, Rick A. Adams, Laurent Perrinet, and Michael Breakspear. 2012. Perceptions as Hypotheses: Saccades as Experiments. Frontiers in Psychology (2012).
[14] Wired ([n. d.]).
[15] Ian Hamilton. 2018. Privacy Summit At Stanford Will Draft VR ’Bill Of Rights’.
[16] Alicia Juarrero. 2000. Dynamics in action: Intentional behavior as a complex system. Emergence 2, 2 (2000), 24–57.
[17] Malcolm Kirkup and Marylyn Carrigan. 2000. Video surveillance research in retailing: ethical issues. International Journal of Retail & Distribution Management (2000).
[18] Nathan F. Lepora and Giovanni Pezzulo. 2015. Embodied Choice: How Action Influences Perceptual Decision Making. PLOS Computational Biology 11, 4 (April 2015), e1004110. https://doi.org/10.1371/journal.pcbi.1004110
[19] Dongheng Li, David Winfield, and Derrick J Parkhurst. 2005. Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. IEEE, San Diego, California, USA, 79–79.
[20] Daniel J. Liebling and Sören Preibusch. 2014. Privacy Considerations for a Pervasive Eye Tracking World. Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing Adjunct Publication - UbiComp ’14 Adjunct (2014), 1169–1177. https://doi.org/10.1145/2638728.2641688
[21] Mirca Madianou. 2019. The biometric assemblage: Surveillance, experimentation, profit, and the measuring of refugee bodies. Television & New Media 20, 6 (2019), 581–599.
[22] Avi Marciano. 2019. Reframing biometric surveillance: from a means of inspection to a form of control. Ethics and Information Technology 21, 2 (2019), 127–136.
[23] Nick Merrill, John Chuang, and Coye Cheshire. 2019. Sensing is Believing: What People Think Biosensors Can Reveal About Thoughts and Feelings. In Proceedings of the 2019 on Designing Interactive Systems Conference. 413–420.
[24] Betsy Morris. 2016. Virtual-Reality Startup Strivr Raises $5 Million in Initial Funding Round. Wall Street Journal (Dec. 2016).
[25] Lucas Hermann Negri and Christophe Vestri. 2017. lucashn/peakutils: v1.1.0. https://doi.org/10.5281/zenodo.887917
[26] Helen Nissenbaum. 2009. Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.
[27] Anjul Patney, Marco Salvi, Joohwan Kim, Anton Kaplanyan, Chris Wyman, Nir Benty, David Luebke, and Aaron Lefohn. 2016. Towards foveated rendering for gaze-tracked virtual reality. ACM Transactions on Graphics (TOG) 35, 6 (2016), 179.
[28] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825–2830.
[29] James Pierce. 2019. Smart home security cameras and shifting lines of creepiness: A design-led inquiry. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 1–14.
[30] Thammathip Piumsomboon, Gun Lee, Robert W Lindeman, and Mark Billinghurst. 2017. Exploring natural eye-gaze-based interaction for immersive virtual reality. IEEE, 36–39.
[31] Inioluwa Deborah Raji, Timnit Gebru, Margaret Mitchell, Joy Buolamwini, Joonseok Lee, and Emily Denton. 2020. Saving face: Investigating the ethical concerns of facial recognition auditing. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 145–151.
[32] Amnon Rapoport and David V Budescu. 1992. Generation of random series in two-person strictly competitive games. Journal of Experimental Psychology: General.
[33] Roesner et al. 2014. Commun. ACM 57, 4 (2014), 88–96.
[34] Elaine Sedenberg, Richmond Wong, and John Chuang. 2017. A window into the soul: Biosensing in public. arXiv preprint arXiv:1702.04235 (2017).
[35] Anil K Seth. 2014. The cybernetic Bayesian brain. Open MIND. Frankfurt am Main: MIND Group.
[36] Vincent Toubiana, Lakshminarayanan Subramanian, and Helen Nissenbaum. 2011. TrackMeNot: Enhancing the privacy of Web Search. arXiv e-prints (Sept. 2011), arXiv:1109.4677 [cs.CR].
[37] Roel Vertegaal and Jeffrey S Shell. 2008. Attentive user interfaces: the surveillance and sousveillance of gaze-aware objects. Social Science Information 47, 3 (2008), 275–298.
[38] Shoshana Zuboff. 2015. Big other: surveillance capitalism and the prospects of an information civilization.