GazeBase: A Large-Scale, Multi-Stimulus, Longitudinal Eye Movement Dataset
Henry Griffith, Dillon Lohr, Evgeny Abdulin, Oleg Komogortsev
Texas State University, Department of Computer Science, San Marcos, TX, 78666, USA
* corresponding author: Henry Griffith (h [email protected])

ABSTRACT
This manuscript presents GazeBase, a large-scale longitudinal dataset containing 12,334 monocular eye-movement recordings captured from 322 college-aged subjects. Subjects completed a battery of seven tasks in two contiguous sessions during each round of recording, including: 1) a fixation task, 2) a horizontal saccade task, 3) a random oblique saccade task, 4) a reading task, 5/6) two free-viewing tasks using cinematic video, and 7) a gaze-driven gaming task. A total of nine rounds of recording were conducted over a 37 month period, with subjects in each subsequent round recruited exclusively from the prior round. All data were collected using an EyeLink 1000 eye tracker at a 1,000 Hz sampling rate, with a calibration and validation protocol performed before each task to ensure data quality. Due to its large number of subjects and longitudinal nature, GazeBase is well suited for exploring research hypotheses in eye movement biometrics, along with other emerging applications applying machine learning techniques to eye movement signal analysis.
Background & Summary
Due to their demonstrated uniqueness and persistence, human eye movements are a desirable modality for biometric applications. Since their original consideration in the early 2000s, eye movement biometrics have received substantial attention within the security literature. Recent interest in this domain is accelerating, due to the proliferation of gaze tracking sensors throughout modern consumer products, including automotive interfaces, traditional computing platforms, and head-mounted devices for virtual and augmented reality applications. Beyond this increase in sensor ubiquity, eye movements are an advantageous modality for emerging biometric systems due to their ability to support continuous authentication and liveness detection, along with their ease of fusion with other appearance-based traits in both the eye and periocular region.

Despite considerable research progress in eye movement biometrics over the past two decades, several open research areas remain. Namely, ensuring performance robustness with respect to data quality, and further investigating both task dependency and requisite recording duration, is necessary to transition this technology to widespread commercial adoption. Moreover, the exploration of emerging deep learning techniques, which have proven successful for more traditional biometric modalities, has been limited for eye movement biometrics. This investigation is impeded by the challenges associated with the large-scale collection of eye movement data, along with the lack of task-diverse, publicly-available, large-scale data repositories.

To promote further development in eye movement biometrics research, this manuscript describes a newly-released dataset consisting of 12,334 monocular eye movement recordings captured from 322 individuals while performing seven discrete tasks. The considered task battery includes guided stimuli intended to induce specific eye movements of interest, along with multiple objective-oriented and free-viewing tasks, such as reading, movie viewing, and game playing. Hereby denoted as GazeBase, these data were captured over a 37 month period during nine rounds of recording, with two contiguous sessions completed during each recording period. The data collection workflow is summarized in Fig. 1.

Although subsets of this data have been utilized in prior work, this recent dissemination is the first release of the entire set of gaze recordings and corresponding target locations for applicable stimuli. As the experimental parameters of the data collection were chosen to maximize the utility of the resulting data for biometric applications, GazeBase is well suited for supporting further investigation of emerging machine learning biometric techniques in the eye movement domain, such as metric learning. Beyond this target application, the resulting dataset is also useful for exploring numerous additional research hypotheses in various areas of interest, including eye movement classification and prediction. Applications employing machine learning techniques will benefit both from the scale of available data and from the diversity in tasks considered and subjects recorded. Moreover, this dissemination will help improve quality in subsequent research by providing a diverse set of recordings for benchmarking across the community.

Fig. 1: Summary of the GazeBase dataset collection.
Top: Experimental set-up. Middle: Screenshots of the stimuli from four of the seven tasks. a) is a screenshot of the gaze-driven gaming task, b) is a screenshot of the reading task, c-d) is a single screenshot from one of the two video viewing tasks. e-g) shows the stationary bull’s-eye target utilized in the calibration and validation process, along with the fixation and two saccade tasks. The screenshot is obtained during the random saccade task. Bottom: A timeline of the multiple recording rounds (round identifiers are labeled on top of each rectangle).

Methods
Subjects
Subjects were initially recruited from the undergraduate student population at Texas State University through email and targeted in-class announcements. A total of 322 subjects (151 self-identifying as female, 171 self-identifying as male) were enrolled in the study and completed the Round 1 collection in its entirety. Subjects for Rounds 2 - 9 were recruited exclusively from the prior round’s subject pool. All subjects had normal or corrected-to-normal visual acuity. Aggregate subject demographic information is presented in Table 1. The distribution of subjects’ age at the time of the Round 1 collection is shown in Fig. 2. The number of subjects completing the entire task battery in each round is summarized in Table 2, along with the recording dates for each round of collection.

Ethnicity:          Asian   Black   Caucasian   Hispanic   Mixed
Num. of Subjects:   10      32      178         76         27

Table 1: Self-reported ethnicity of subjects
Fig. 2: Distribution of subjects’ ages at the time of enrollment

All subjects provided informed consent under a protocol approved by the Institutional Research Board at Texas State University prior to each round of recording. As part of the consent process, subjects acknowledged that the resulting data may be disseminated in a de-identified form.
Data Acquisition Overview
Data was captured under the supervision of a trained experimental proctor. Before initiating the recording process, the proctor provided a general overview of the experiment to the subject, along with a summary of best practices for maximizing the quality of the captured data. Namely, subjects were instructed to maintain a stable head and body position, and to attempt to avoid excessive blinking. Based upon initial recording experiences, it was ultimately suggested that subjects avoid wearing mascara to the recording session as part of the appointment confirmation email. This suggestion was initiated during the first round of recording and maintained throughout the remainder of the collection.

Subjects wearing eyeglasses were asked to attempt the experiment with glasses removed. This protocol was chosen due to the known challenges associated with recording individuals wearing eyeglasses using the target capture modality. If subjects were unable to complete the experimental protocol with glasses removed, an attempt was made to complete the experiment while wearing eyeglasses. Subjects that could not be successfully calibrated or recorded after multiple attempts were withdrawn from the study. Subjects could also self-withdraw at any point during the recording process. A total of 13 subjects were withdrawn from the initial round of the study. GazeBase contains only data from subjects completing the entire recording protocol for a given round.

Round ID   Num. of Subjects   Date Range
1          322                09/13 - 02/14
2          136                02/14 - 03/14
3          105                03/14 - 04/14
4          101                04/14 - 04/14
5          78                 09/14 - 11/14
6          59                 03/15 - 05/15
7          35                 10/15 - 11/15
8          31                 03/16 - 05/16
9          14                 10/16 - 11/16

Table 2: Total number of subjects and recording date range of each round

Monocular (left) eye movements were captured at a 1,000 Hz sampling rate using an EyeLink 1000 eye tracker (SR Research, Ottawa, Ontario, Canada) in a desktop mount configuration. The EyeLink 1000 is a video oculography device which operates using the pupil-corneal reflection principle, where gaze locations are estimated from pupil-corneal reflection vectors using a polynomial mapping developed during calibration. Stimuli were presented to the subject on a 1680 x 1050 pixel (474 x 297 mm) ViewSonic (ViewSonic Corporation, Brea, California, USA) monitor. Instrumentation control and recording monitoring were performed by the proctor using a dedicated host computer as shown in Fig. 1. Recordings were performed in a quiet laboratory environment without windows. Consistent ambient lighting was provided by ceiling-mounted fluorescent light fixtures.

Subjects were seated 550 mm in front of the display monitor. Subjects’ heads were stabilized using a chin and forehead rest. Once the participant was initially seated, the chin rest was adjusted to level the subject’s left eye at the primary gaze position, located 36 mm above the center of the monitor. This vertical offset from the monitor center was chosen to ensure the comfort of taller participants given restrictions on adjusting the chair and monitor height. Chair height was initially adjusted as necessary to ensure the comfort of the subject, followed by additional fine tuning of the chin rest as required to align the left eye with the primary gaze position.
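The geometry given above (a 1680 x 1050 pixel, 474 x 297 mm display viewed from 550 mm, with the primary gaze position 36 mm above the screen center) is what allows pixel coordinates to be expressed in degrees of visual angle (dva), the unit used throughout the dataset. The following Python sketch shows one plausible form of such a mapping. The function name, the per-axis arctangent, the downward-growing pixel y axis, and the sign conventions are all assumptions made for illustration; the conversion actually applied to GazeBase may differ in its details.

```python
import math

SCREEN_PX = (1680, 1050)    # display resolution in pixels (from the text)
SCREEN_MM = (474.0, 297.0)  # display size in millimetres (from the text)
DISTANCE_MM = 550.0         # eye-to-screen distance (from the text)
PRIMARY_OFFSET_MM = 36.0    # primary gaze position above the screen center


def px_to_dva(px_x, px_y):
    """Map a pixel coordinate to (horizontal, vertical) angles in dva.

    Assumptions (not stated in the source): pixel y grows downward,
    angles are measured from the primary gaze position, and each axis
    is converted independently with an arctangent.
    """
    mm_per_px_x = SCREEN_MM[0] / SCREEN_PX[0]
    mm_per_px_y = SCREEN_MM[1] / SCREEN_PX[1]
    # Offsets in millimetres relative to the primary gaze position.
    dx_mm = (px_x - SCREEN_PX[0] / 2) * mm_per_px_x
    dy_mm = (SCREEN_PX[1] / 2 - px_y) * mm_per_px_y - PRIMARY_OFFSET_MM
    return (math.degrees(math.atan2(dx_mm, DISTANCE_MM)),
            math.degrees(math.atan2(dy_mm, DISTANCE_MM)))


# Right edge of the screen at the height of the screen center.
right_edge = px_to_dva(1680, 525)
```

Under these assumptions the right edge of the display sits a little over 23 dva from the primary position, so horizontal targets at ±15 dva fit comfortably on the screen.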
The lens focus was manually tuned as necessary in order to ensure the sharpness of the eye image as viewed by the proctor on the host display.

Subjects completed two sessions of recording for each round of collection. While the proctor suggested that subjects take a five-minute break between sessions if needed, subjects were free to decline the break if desired. Subjects could also request breaks at any time during the recording process as noted during the consent process. During each recording, the gaze location was monitored by the proctor to ensure compliance with the individual task protocols.

The gaze position and corresponding stimuli were natively expressed in terms of pixel display coordinates. These values were converted to degrees of the visual angle (dva) according to the geometry of the recording setup. Although iris images were also captured as part of this collection before the initiation of eye movement recordings, they are not distributed as part of GazeBase. Prior collections including both gaze traces and matching iris images may be found at the following link - http://userweb.cs.txstate.edu/~ok11/etpad_v2.html

Calibration and Validation
A calibration and validation procedure was performed before the recording of each task to ensure data quality. To initiate the calibration process, pupil and corneal reflection thresholds were established. While manual tuning was exclusively used for some initial recordings, the automatic thresholding function of the instrumentation software was ultimately employed to develop initial estimates, with manual fine-tuning performed as required.

Once threshold parameter values were tuned to ensure successful tracking of the pupil and corneal reflection, a nine-point rectangular grid calibration was performed. During this process, subjects were instructed to fixate at the center of a bull’s-eye calibration target positioned on a black background. The bull’s-eye target consisted of a larger white circle with an approximate diameter of one dva enclosing a small black dot as shown in Fig. 1. The stability of target fixations was monitored by the proctor on the host monitor using a vendor-provided software interface. If necessary, the proctor provided additional instructions to improve image capture quality (e.g., increasing eye opening). Once the software determined that the subject had successfully fixated on a target, the calibration process advanced to the next target. Calibration was terminated when a stable fixation had been captured for all nine points in the grid, thereby producing the aforementioned mapping for estimating gaze location.

A nine-point validation process was subsequently performed to ensure calibration accuracy. Validation points outside of the primary position were disjoint from those utilized in the calibration grid. Validation for each target was manually terminated by the proctor (in contrast to the calibration procedure, which used automatic termination) upon the determination of a stable target fixation. The spatial accuracy of each fixation on the corresponding validation target, hereby referred to as the validation error, was computed after completion of the validation process. Validation error was determined by computing the Euclidean distance between each target and the estimated gaze location. A maximum and average validation error of less than 1.5 and 1.0 dva, respectively, was established as a guideline accuracy criterion for accepting the calibration. However, acceptance of the calibration was ultimately determined by the proctor based upon visual inspection of the discrepancy between the estimated and true target location for each validation point, with additional discretion applied to calibrations failing to meet this quantitative accuracy goal. The calibration protocol was repeated before the recording of each task.
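The quantitative part of the acceptance guideline can be illustrated with a short Python sketch. It computes the Euclidean validation error per target and checks the maximum and mean against the 1.5 and 1.0 dva guideline values. The function names and the sample coordinates are hypothetical, positions are assumed to be already expressed in dva, and the proctor's visual inspection, which ultimately decided acceptance, is not modeled.

```python
import math

MAX_ERROR_DVA = 1.5   # guideline on the largest per-target error
MEAN_ERROR_DVA = 1.0  # guideline on the average error over all targets


def validation_errors(targets, gazes):
    """Euclidean distance (dva) between each target and its gaze estimate."""
    return [math.hypot(gx - tx, gy - ty)
            for (tx, ty), (gx, gy) in zip(targets, gazes)]


def meets_guideline(errors):
    """True when both the maximum and the mean error are under the guideline."""
    mean_error = sum(errors) / len(errors)
    return max(errors) < MAX_ERROR_DVA and mean_error < MEAN_ERROR_DVA


# Hypothetical three-point example (the actual protocol used nine points).
targets = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0)]
gazes = [(0.3, 0.4), (10.0, 0.6), (0.0, 5.0)]
errors = validation_errors(targets, gazes)
accepted = meets_guideline(errors)
```

Keeping the two thresholds separate mirrors the text: a calibration with one large outlier can fail on the maximum even when its mean error is small.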
Task Battery Overview
A battery of seven tasks was performed during each session of the recording. Tasks were performed in the numbered order described in the following subsections. Acronyms utilized to describe each task within the distributed dataset are defined within each subsection title.
Task 1: Horizontal Saccade Task (HSS)
The HSS task was designed to elicit visually-guided horizontal saccades of constant amplitude through the periodic displacement of a peripheral target. Subjects were instructed to fixate on the center of the bull’s-eye target utilized during calibration. The target was displayed on a black background and was initially placed at the primary gaze position. The target was regularly displaced between two positions located ±15 dva horizontally from the center of the screen, thereby ideally eliciting a 30 degree horizontal saccade upon each jump displacement. The target’s position was maintained for one second between displacements, with 100 transitions occurring during each recording. The proctor notified the subjects of the approximate time remaining within the 100-second recording session at 20 second intervals. An identical stimulus was used for the HSS task across subjects, sessions, and rounds.
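As a rough illustration of the stimulus timing described above, the following Python sketch builds a horizontal target trace sampled at 1,000 Hz, alternating between -15 and +15 dva with a one-second dwell. The function name, the choice of starting side, and the omission of the initial placement at the primary gaze position are assumptions made for the example, not details taken from the stimulus software.

```python
def hss_target_positions(n_steps=100, dwell_ms=1000,
                         amplitude_dva=15.0, first_side=-1):
    """Horizontal target position (dva) for every 1 ms sample of the task.

    Assumes one sample per millisecond (1,000 Hz) and a target that
    alternates sides after each dwell period; first_side is hypothetical.
    """
    positions = []
    side = first_side
    for _ in range(n_steps):
        positions.extend([side * amplitude_dva] * dwell_ms)
        side = -side
    return positions


xT = hss_target_positions()
# Size of every jump between consecutive samples where the target moved.
jumps = [abs(b - a) for a, b in zip(xT, xT[1:]) if a != b]
```

With 100 one-second steps the trace spans the 100-second recording, and every jump between the two target positions is 30 dva, matching the saccade amplitude the task is intended to elicit.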
Task 2: Video Viewing Task 1 (VD1)
The VD1 task was designed to elicit natural eye movements occurring during the free-viewing of a cinematic video. Subjects were instructed to watch the first 60 seconds of a trailer for the movie “The Hobbit: The Desolation of Smaug”. No audio was played during the video clip. The same video segment was used for the VD1 task across subjects and rounds. Due to variability in instrumentation settings, the video stimulus was only displayed for the initial 57 seconds during the second session of each recording.
Task 3: Fixation Task (FXS)
The FXS task was designed to elicit fixational eye movements through the static presentation of a central fixation target located at the primary gaze position. Subjects were instructed to fixate on the previously described bull’s-eye target, which was maintained at the center of the display for 15 seconds. Before initiating the recording, the proctor asked that subjects avoid blinking if possible for the duration of the task. An identical stimulus was used for the FXS task across subjects, sessions, and rounds.
Task 4: Random Saccade Task (RAN)
The RAN task was designed to elicit visually-guided oblique saccades of variable amplitude through the periodic displacement of a peripheral target. Similar to the HSS task, subjects were instructed to follow the bull’s-eye target by fixating at its center. The target was displaced at random locations across the display monitor, ranging from ±15 and ±

Task 5: Reading Task (TEX)
The TEX task was designed to capture subjects’ eye movements during reading. Subjects were instructed to silently read a passage from the poem “The Hunting of the Snark” by Lewis Carroll. The task was automatically terminated after 60 seconds irrespective of the subjects’ reading progress. Subjects did not receive explicit instructions of what to do if they finished reading before the end of the 60 second period. Instead, several possible actions were suggested, including rereading the passage. Because of this ambiguity in instructions, the gaze position towards the end of the recording may vary from the expected per-line pattern typically encountered during reading. Another passage of the poem was displayed as the stimulus for the second session within a given round, with the same pair of sections utilized for all subjects and rounds.

Task 6: Balura Game (BLG)
The BLG task was designed to capture subjects’ eye movements while interacting with a gaze-driven gaming environment. During the game, blue and red balls moving at a slow fixed speed were presented on a black background. Subjects were instructed to attempt to remove all red balls from the display area as quickly as possible. Red balls were eliminated when the subject fixated on them, while blue balls could not be eliminated. Visual feedback was provided to subjects by placing a highlighted border around each ball upon the detected onset of a fixation on the ball. The game was terminated when no red balls remained. Further details regarding the game may be found at the following link - https://digital.library.txstate.edu/handle/10877/4158.

In some instances, steady fixations on red balls did not produce the desired elimination behavior. Based upon this limitation, the proctor instructed subjects not to maintain prolonged fixations if the red balls were not being eliminated as intended. Instead, subjects were instructed to move their gaze away from the ball, and subsequently re-fixate on the non-eliminating ball. As the initial position and trajectory of each ball was set randomly for each recording, the stimulus varied across subjects, sessions, and rounds.
Task 7: Video Viewing Task 2 (VD2)
Subjects were instructed to watch the subsequent 60 seconds of the trailer used in the VD1 task. Similar to the VD1 task, no audio was presented to the subjects. The same video was used for the VD2 task across subjects and rounds. Similar to the VD1 task, the duration of the VD2 stimulus in Session 2 was truncated to 57 seconds due to variability in instrumentation settings.
Data Records
GazeBase is available for download on figshare. GazeBase is distributed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. GazeBase may be used without restriction for non-commercial applications, with all resulting publications providing citation to this manuscript. All data have been de-identified in accordance with the informed consent provided by subjects.

GazeBase is organized in a hierarchical directory structure by round, subject, session, and task, respectively. Data records are compressed at the subject folder level. Each task folder contains a single csv file with the following naming convention - ‘S_rxxx_Sy_tsk’, with the relevant parameters for each field summarized in Table 3. The first line of each file contains the variable identifiers for each column, which are summarized in Table 4.

Naming Parameter   Definition         Valid Values
r                  Round Number       1 - 9
xxx                Subject ID         1 - 335 (excluding incomplete subjects)
y                  Session Number     1 - 2
tsk                Task Description   ‘HSS’, ‘VD1’, ‘FXS’, ‘RAN’, ‘TEX’, ‘BLG’, ‘VD2’

Table 3: Description of file naming convention

Variable Identifier   Definition
n                     Timestamp (ms)
x                     Horizontal Gaze Position (dva)
y                     Vertical Gaze Position (dva)
val                   Sample Validity (0 implies valid sample)
xT                    Horizontal Target Position (dva) (where applicable)
yT                    Vertical Target Position (dva) (where applicable)

Table 4: Description of file variables

Missing samples within each file are denoted by a non-zero value in the val field, with the corresponding gaze position specified as NaNs. Missing samples result from the failure to extract either the pupil or corneal reflection from the captured image, which may occur under scenarios of blinks or partial occlusions of the eye. For tasks not employing a target (i.e.: VD1, TEX, BLG, VD2), target entries (i.e.: columns xT and yT) are populated with NaNs at each sample.

Technical Validation
Substantial efforts were undertaken to maintain data quality throughout the experimental design and collection process, beginning with the selection of the recording instrument. The EyeLink 1000 was selected for data collection due to its high spatial accuracy and precision characteristics, which have resulted in its widespread adoption across the research community. The EyeLink 1000 is routinely employed as a quality benchmark in the research literature when evaluating emerging camera-based eye tracking sensors. To ensure adherence to best practices throughout the data collection, all experimental proctors were trained by personnel with considerable prior experience using the device. This expertise was developed during the lab’s prior data collections using the EyeLink 1000 (e.g. https://userweb.cs.txstate.edu/~ok11/software.html).

The experimental protocol was also designed to maximize the quality of the captured data. Namely, subjects were instructed to maintain a stable head position and sufficient opening of the eyelids due to the known relationship between these factors and associated raw eye positional data quality. A dedicated calibration and validation process was also employed for each task to avoid calibration decay across the collection. Box plots of the distributions of mean and maximum validation errors for each round of recording are presented in Figs. 3 and 4, respectively. As shown, the median values of the mean validation errors in each round are less than the upper bound of the specified typical spatial accuracy (0.5 dva) of the instrument. Moreover, the substantial dispersion of the two metrics, indicating considerable variability in calibration quality across individuals, is consistent with prior observations in the literature.

Fig. 3: Distribution of mean validation error across recordings versus round. The central mark in each box corresponds to the median value, with the lower and upper edges of the box corresponding to the 25th and 75th percentiles of the distribution, respectively. The whiskers extend to the outlier boundaries for each round, which are set at 1.5 times the interquartile range of the distribution above and below the box boundaries. Outliers are marked using the + symbol.
Code availability
The distributed csv files were generated by first converting the edf output files produced by the EyeLink 1000 to a text-based asc file format. These files were subsequently converted to csv files of the specified format using a customized MATLAB script. Data may be extracted from the repository into the target computing environment using traditional csv import functions.
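As one example of such an import, the Python sketch below parses the ‘S_rxxx_Sy_tsk’ naming convention and reads a recording using only the standard library. The helper names, the sample file stem, and the two-row sample content are hypothetical; the sketch also assumes a zero-padded three-digit subject ID, a header row containing exactly the identifiers n, x, y, val, xT, and yT, and missing values written as the literal text NaN.

```python
import csv
import io
import math
import re

# Pattern for the 'S_rxxx_Sy_tsk' convention (file stem without extension).
NAME_RE = re.compile(
    r"^S_(?P<round>\d)(?P<subject>\d{3})_S(?P<session>\d)_(?P<task>[A-Z0-9]{3})$"
)


def parse_name(stem):
    """Split a file stem into round, subject, session, and task fields."""
    match = NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    return {
        "round": int(match["round"]),
        "subject": int(match["subject"]),
        "session": int(match["session"]),
        "task": match["task"],
    }


def read_recording(handle):
    """Read all samples as dicts of floats; the text NaN parses to float nan."""
    return [{key: float(value) for key, value in row.items()}
            for row in csv.DictReader(handle)]


def valid_fraction(rows):
    """Share of samples flagged valid (val == 0) with a usable gaze position."""
    valid = [r for r in rows if r["val"] == 0 and not math.isnan(r["x"])]
    return len(valid) / len(rows)


# Hypothetical two-sample excerpt standing in for a real file on disk.
sample = io.StringIO("n,x,y,val,xT,yT\n0,0.10,-0.20,0,-15,0\n1,NaN,NaN,1,-15,0\n")
rows = read_recording(sample)
info = parse_name("S_1001_S1_HSS")
```

For a file on disk, the same reader can be given an open file handle in place of the in-memory sample, for instance one obtained with `open(path, newline="")`.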
Fig. 4: Distribution of the maximum validation error across recordings versus round. See Fig. 3 for an explanation of box plot parameters.

References
Bargary, G. et al. Individual differences in human eye movements: An oculomotor signature? Vis. Res., 157–169 (2017).
Jain, A., Klare, B. & Ross, A. Guidelines for best practices in biometrics research. In , 541–545 (IEEE, 2015).
Kasprowski, P. & Ober, J. Eye movements in biometrics. In International Workshop on Biometric Authentication, 248–258 (Springer, 2004).
Katsini, C., Abdrabou, Y., Raptis, G. E., Khamis, M. & Alt, F. The role of eye gaze in security and privacy applications: Survey and future hci research directions. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–21 (2020).
Eberz, S., Rasmussen, K. B., Lenders, V. & Martinovic, I. Looks like eve: Exposing insider threats using eye movement biometrics. ACM Transactions on Priv. Secur. (TOPS), 1–31 (2016).
Komogortsev, O. V., Karpov, A. & Holland, C. D. Attack of mechanical replicas: Liveness detection with eye movements. IEEE Transactions on Inf. Forensics Secur., 716–725 (2015).
Winston, J. J. & Hemanth, D. J. A comprehensive review on iris image-based biometric system. Soft Comput., 9361–9384 (2019).
Woodard, D. L., Pundlik, S. J., Lyle, J. R. & Miller, P. E. Periocular region appearance cues for biometric identification. In , 162–169 (IEEE, 2010).
Sundararajan, K. & Woodard, D. L. Deep learning for biometrics: A survey. ACM Comput. Surv. (CSUR), 1–34 (2018).
Abdulin, E., Friedman, L. & Komogortsev, O. V. Method to detect eye position noise from video-oculography when detection of pupil or corneal reflection position fails. arXiv preprint arXiv:1709.02700 (2017).
Friedman, L., Rigas, I., Abdulin, E. & Komogortsev, O. V. A novel evaluation of two related and two independent algorithms for eye movement classification during reading. Behav. research methods, 1374–1397 (2018).
Rigas, I., Friedman, L. & Komogortsev, O. Study of an extensive set of eye movement features: Extraction methods and statistical analysis. J. Eye Mov. Res., 3 (2018).
Friedman, L. & Komogortsev, O. V. Assessment of the effectiveness of seven biometric feature normalization techniques. IEEE Transactions on Inf. Forensics Secur., 2528–2536 (2019).
Lohr, D. J., Friedman, L. & Komogortsev, O. V. Evaluating the data quality of eye tracking signals from a virtual reality system: Case study using smi’s eye-tracking htc vive. arXiv preprint arXiv:1912.02083 (2019).
Friedman, L., Stern, H. S., Price, L. R. & Komogortsev, O. V. Why temporal persistence of biometric features, as assessed by the intraclass correlation coefficient, is so valuable for classification performance. Sensors, 4555 (2020).
Friedman, L., Nixon, M. S. & Komogortsev, O. V. Method to assess the temporal persistence of potential biometric features: Application to oculomotor, gait, face and brain structure databases. PloS one, e0178501 (2017).
Griffith, H., Biswas, S. & Komogortsev, O. Towards reduced latency in saccade landing position prediction using velocity profile methods. In Proceedings of the Future Technologies Conference, 79–91 (Springer, 2018).
Griffith, H., Aziz, S. & Komogortsev, O. Prediction of oblique saccade trajectories using learned velocity profile parameter mappings. In , 0018–0024 (IEEE, 2020).
Griffith, H., Biswas, S. & Komogortsev, O. Towards improved saccade landing position estimation using velocity profile methods. In SoutheastCon 2018, 1–2 (IEEE, 2018).
Griffith, H. & Komogortsev, O. A shift-based data augmentation strategy for improving saccade landing point prediction. In ACM Symposium on Eye Tracking Research and Applications, 1–6 (2020).
Griffith, H. K. & Komogortsev, O. V. Texture feature extraction from free-viewing scan paths using gabor filters with downsampling. In ACM Symposium on Eye Tracking Research and Applications, 1–3 (2020).
Abdelwahab, A. & Landwehr, N. Deep distributional sequence embeddings based on a wasserstein loss. arXiv preprint arXiv:1912.01933 (2019).
SR Research. EyeLink 1000 user’s manual, version 1.5.2 (2010).
Griffith, H., Lohr, D. & Komogortsev, O. V. GazeBase data repository, 10.6084/m9.figshare.12912257.v2.
Nyström, M., Niehorster, D. C., Andersson, R. & Hooge, I. The tobii pro spectrum: A useful tool for studying microsaccades? Behav. Res. Methods
Ehinger, B. V., Groß, K., Ibs, I. & König, P. A new comprehensive eye-tracking test battery concurrently evaluating the pupil labs glasses and the eyelink 1000. PeerJ, e7086 (2019).
Raynowska, J. et al. Validity of low-resolution eye-tracking to assess eye movements during a rapid number naming task: performance of the eyetribe eye tracker. Brain injury, 200–208 (2018).
Nyström, M., Andersson, R., Holmqvist, K. & Van De Weijer, J. The influence of calibration method and eye physiology on eyetracking data quality. Behav. research methods, 272–288 (2013).
Hornof, A. J. & Halverson, T. Cleaning up systematic error in eye-tracking data by using required fixation locations. Behav. Res. Methods, Instruments, & Comput., 592–604 (2002).

Acknowledgements
This work was supported by the National Science Foundation (Award CNS-1250718). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. We would also like to thank Alex Karpov and Ioannis Rigas for help in designing the experiments, along with the numerous experimental proctors that contributed throughout the data collection.
Author contributions statement