App-based saccade latency and error determination across the adult age spectrum
Hsin-Yu Lai, Student Member, IEEE, Gladynel Saavedra-Peña, Charles G. Sodini, Fellow, IEEE, Thomas Heldt, Senior Member, IEEE, Vivienne Sze, Senior Member, IEEE
Abstract—Objective: We aid neurocognitive monitoring outside the hospital environment by enabling app-based measurements of visual reaction time (saccade latency) and error rate in a cohort of subjects spanning the adult age spectrum. Methods: We developed an iOS app to record subjects with the frontal camera during pro- and anti-saccade tasks. We further developed automated algorithms for measuring saccade latency and error rate that take into account the possibility that it might not always be possible to determine the eye movement from app-based recordings. Results: To measure saccade latency on a tablet, we ensured that the absolute timing error between on-screen task presentation and the camera recording is within 5 ms. We collected over 235,000 eye movements in 80 subjects ranging in age from 20 to 92 years, with 96% of recorded eye movements either declared good or directional errors. Our error detection code achieved a sensitivity of 0.97 and a specificity of 0.97. Confirming prior reports, we observed a positive correlation between saccade latency and age, while the relationship between error rate and age was not significant. Finally, we observed significant intra- and inter-subject variations in saccade latency and error rate distributions, which highlights the importance of individualized tracking of these visual digital biomarkers. Conclusion and Significance: Our system and algorithms allow ubiquitous tracking of saccade latency and error rate, opening up the possibility of quantifying patient state on a finer timescale and in a broader population than previously possible.

Index Terms—Eye tracking, mobile health monitoring, saccade latency, saccade error rate, neurodegenerative diseases
This work was supported in part by SenseTime through a grant to the MIT Quest for Intelligence and by the MIT-IBM Watson AI Lab. H.-Y. Lai is with the Department of Electrical Engineering & Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. G. Saavedra-Peña was with the Department of Electrical Engineering & Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. C.G. Sodini is with the Department of Electrical Engineering & Computer Science, the Institute for Medical Engineering & Science, and the Microsystems Technology Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. T. Heldt is with the Department of Electrical Engineering & Computer Science, the Institute for Medical Engineering & Science, and the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. V. Sze is with the Department of Electrical Engineering & Computer Science and the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA (e-mail: [email protected]).
I. INTRODUCTION
It remains challenging to track neurodegenerative disease progression objectively, accurately, and frequently. Current assessments of neurodegenerative diseases are subjective and sparse, and standard neurocognitive and neuropsychological test batteries require a trained specialist to administer and score [1], [2]. Additionally, these tests demand significant patient time and cooperation, and can therefore be influenced by a patient's level of attention and comfort with the clinical setting [3]. Quantitative, objective, and frequent assessments may mitigate the effects of an individual physician's clinical acumen and of patient fatigue when determining neurodegenerative disease progression.

Assessment of eye movement is a promising candidate for such a quantitative and objective test. First, eye movements are readily observable. Second, their neural pathways involve several brain regions, and they might hence be affected by degenerative processes in various brain centers [4]. For example, Huntington's disease and progressive supranuclear palsy directly affect oculomotor pathways. As a result, clinical eye-movement assessments are key to diagnosing and tracking these diseases.

In the context of neurodegenerative disease assessment and progression monitoring, pro- and anti-saccade visual reaction tasks are often used as challenge tests [5], [6]. In the pro-/anti-saccade tests, a subject is asked to look towards/away from a visual stimulus. An anti-saccade task, in particular, requires a person to inhibit the natural reflexive eye movement towards the stimulus and initiate an eye movement in the opposite direction. Thus, it requires more cognitive processing than a pro-saccade task [7], [8].
Because these tasks demand cognitive abilities that can be affected by neurodegenerative diseases, two saccadic eye-movement features have been observed to differ significantly between healthy subjects and patients: saccade latency (visual reaction time) and error rate (the proportion of eye movements in the wrong direction) [9]–[12]. However, these features are commonly measured with dedicated infrared cameras and chinrests, which limits the measurements to the doctor's office or the neurophysiological laboratory. In our previous work [13]–[15], we showed that we can accurately and robustly determine saccade latency from recordings obtained with a smartphone camera.

In this work, we developed an app to display the pro-/anti-saccade tasks on a tablet computer while recording a subject's eye movements with the built-in camera. We present an automated processing pipeline to determine pro-/anti-saccade latency and error rate, thus enabling ubiquitous recording of these neurological digital biomarkers. With this novel recording platform and pipeline, we collected over 6,800 videos and over 235,000 individual eye movements from 80 subjects across the adult age spectrum.

II. MATERIALS
A. Recruitment
To study the responses of subjects of different ages to pro- and anti-saccade tests, we recruited 80 self-reported healthy adult subjects, ranging in age from 20 to 92 years. Video recording of volunteers was approved by MIT's Committee on the Use of Humans as Experimental Subjects (protocol
B. App design
In our previous work [15], we displayed the visual reaction task on a laptop and recorded the subjects with an iPhone. Synchronization of the recording and the task display was achieved through a second screen that mirrored the laptop screen and was recorded alongside the subject's response [15]. Given the elaborate setup, recording was limited to our laboratory setting. Our goal here was to allow for ubiquitous recording and hence for subjects to record themselves in the comfort of their homes or offices. We therefore developed an iOS app so subjects could record themselves with the frontal (i.e., selfie) camera as the tasks were displayed on the screen. While the app can run on iPhones, our platform of choice was the iPad (Generation 2 and 3) for its larger dimensions and hence larger angular gaze amplitudes. For each session, the app also stores the camera-to-face distance measurement (when available) and the recorded ISO value at the beginning of the recording.

To acquire accurate saccade latency measurements, it is crucial to synchronize the task display on the iPad screen with the recording from the iPad camera. We detail and evaluate this synchronization in the Appendix. By requiring the ISO to be less than 1000, we showed that we can bound the absolute synchronization error to within 5 ms.

Fig. 1. Age distribution of subjects with single or multiple recording sessions.

Fig. 2. The flow of the app: enter the subject ID; show the number of pro- and anti-saccade tasks the subject has recorded today; press "Start"; choose between a pro- and an anti-saccade task; choose between a 20- and a 40-stimuli task; show the face reflection with a bounding box and check the lighting condition; display the task on the screen while recording the subject with the frontal camera; end the app. Blue arrows request input from the subject; orange arrows denote the response of the app.

Fig. 3. (a) Recording setup: the iPad (60 fps) sits on a stand approximately 40 cm from the subject; (b) before showing the task on the screen, the app displays the face of the subject with a bounding box. If the distance from the camera to the subject's face is acceptable (i.e., between 30 and 50 cm), the box turns green. If the automatically detected ISO is greater than 1000, a warning guides the subject to move to a better-illuminated place.

C. Task design
In this work, we implemented two commonly studied tasks from the literature, namely a gap-pro-saccade and a gap-anti-saccade task [10], [12], [16]. Both tasks start with a fixation period. During the fixation period (1 s), a fixation point (green square) is shown at the top center of the screen (as shown in Fig. 4). We chose to present the fixation point at the top to prevent occlusion by the eyelids. Subjects were instructed to look at the fixation point during this period. The fixation is followed by a 200-ms gap period, during which the fixation point disappears and the screen stays black. After the gap period, a stimulus (white square) is presented on either the left or the right side of the screen. In a pro-/anti-saccade task, the subject is instructed to move their eyes towards/away from the stimulus as quickly and accurately as possible. The stimulus period lasts for 1.2 s and is followed by another 200-ms gap period. This "fixation-gap-stimulus-gap" sequence repeats 20 or 40 times, with half of the stimuli presented to the right of the fixation point and half to the left in randomized order.
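To make the timing concrete, the task sequence above can be sketched in a few lines of Python (an illustrative reconstruction with names of our own choosing; the app itself is iOS code):

```python
import random

def make_task_schedule(n_stimuli=20, fixation_s=1.0, gap_s=0.2, stimulus_s=1.2):
    """Generate the 'fixation-gap-stimulus-gap' event sequence.

    Half of the stimuli appear left of the fixation point and half right,
    in randomized order, matching the task description in the text.
    """
    sides = ["left"] * (n_stimuli // 2) + ["right"] * (n_stimuli // 2)
    random.shuffle(sides)
    schedule = []
    for side in sides:
        schedule += [
            ("fixation", fixation_s, None),   # green square, top center
            ("gap", gap_s, None),             # blank screen
            ("stimulus", stimulus_s, side),   # white square, left or right
            ("gap", gap_s, None),
        ]
    return schedule
```

A 20-stimuli task thus comprises 80 display events and 52 s of task time, excluding app interaction.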
III. METHODS
A. Measurement Pipeline Requirements
In this section, we discuss our measurement pipeline, shown in Fig. 5. Building on our prior work [13]–[15], we use iTracker-face to estimate the gaze of the subject (Fig. 6). The inputs to iTracker-face are the cropped face, as determined by the Viola-Jones algorithm [17], and a face grid indicating the face position. The outputs of iTracker-face are the (x, y)-coordinates of the estimated gaze position on the screen, in centimeters. To obtain our horizontal eye-movement trace, we retain the x-coordinate of the gaze position across frames. As discussed in Section II, we have synchronized the camera recording with the task display, and we use the timestamps of the screen frames to acquire the time when each stimulus appears. With the stimulus presentation time and the eye-movement trace, we can determine saccade latency and detect eye-movement errors.

Fig. 4. (a) Pro-saccade task: look toward the stimulus. (b) Anti-saccade task: look away from the stimulus.

Fig. 5. The measurement pipeline includes the tablet-based video recording, an eye-tracking algorithm, a saccade-latency measurement algorithm, and an error-detection algorithm.

Fig. 6. Convolutional neural network architecture used by iTracker-face. "CONV" stands for convolutional layers and "FC" for fully connected layers. The details of the architecture can be found in [13].

In [15], we measured saccade latency by fitting a hyperbolic tangent (tanh) to a fixed window of the eye-movement trace, from 100 ms before to 500 ms after the stimulus presentation, and determined the saccade onset as the time when the best model fit exceeded 3% of the maximal saccade amplitude. Saccade latency was then computed as the time difference between the saccade onset and the time when the stimulus was presented. A major benefit of this model-based approach is that it provides an automated signal-quality quantification by means of the normalized root-mean-square error (NRMSE) between the model fit and the eye-position trace. We marked a trace as unusable if its NRMSE was greater than 0.1 [15]. In addition, since the output of iTracker-face is in centimeters, we normalized the trace to the expected saccade amplitude using the best-fit model. Since our saccade-onset determination is scale-invariant, the measured saccade latency is unaffected by this normalization.

In the current study, we expanded upon our initial study cohort in [15] by specifically including self-reported healthy subjects across the adult age spectrum. Consequently, we observed a larger heterogeneity in saccadic eye-movement patterns that necessitated revisions to our previously established processing pipeline. To allow for latency measurements from subjects with slower response times, we needed to increase the window of fit for the tanh model to span from 200 ms before to 800 ms after the stimulus presentation. However, we noticed that the expanded window is more likely to capture a subject's eye movements back toward the center position (Fig. 7a).
Additionally, subjects may perform a series of hypometric saccades, in which the initial saccadic movement does not reach the final position and a second saccade corrects for the undershoot (Fig. 7b). Correct identification of hypometric saccades is of relevance since an increased incidence of hypometric saccades is associated with certain neurodegenerative pathologies [6], [11]. A single tanh model cannot fit these traces well if we use a fixed window to determine latency values. To determine saccade latency, we need an adaptive window of fit that isolates the initial saccadic movement to be fitted by the tanh model. We also note that we cannot convert the unit of the eye-movement trace from centimeters to degrees using the best-fit tanh model. As a result, we needed to revise our method for normalizing the trace.
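To illustrate the model-based measurement, a minimal tanh fit over a given window could look as follows (a Python sketch; the parameterization, initial guesses, and function names are our assumptions, not the app's actual code):

```python
import numpy as np
from scipy.optimize import curve_fit

def tanh_model(t, a, b, c, d):
    # a: half-amplitude, b: steepness, c: transition time, d: offset
    return a * np.tanh(b * (t - c)) + d

def fit_saccade(t, x):
    """Fit a tanh to an eye-position trace; return (onset time, NRMSE).

    Onset is the first time the fit rises above 3% of its total
    amplitude; NRMSE > 0.1 marks the trace as unusable, as in the text.
    t is in ms relative to stimulus presentation; x is the gaze trace.
    """
    p0 = [(x.max() - x.min()) / 2, 0.05, t[np.argmax(np.diff(x))], x.mean()]
    popt, _ = curve_fit(tanh_model, t, x, p0=p0, maxfev=10000)
    fit = tanh_model(t, *popt)
    nrmse = np.sqrt(np.mean((fit - x) ** 2)) / (fit.max() - fit.min())
    amp = fit[-1] - fit[0]
    onset = t[np.argmax(fit - fit[0] >= 0.03 * amp)]
    return onset, nrmse
```

Because the onset criterion is a fraction of the fitted amplitude, the returned latency is unchanged by any scaling or shifting of the input trace, which is the scale-invariance property noted above.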
B. Saccade Normalization
We make three assumptions to convert the eye-movement traces from the centimeters produced by iTracker-face to degrees. First, we assume that subjects were looking at the fixation point during the fixation period. Second, we assume that subjects did not overshoot their gaze. Finally, we assume that during the stimulus period, subjects either (a) did not move their eyes at all, (b) gazed at the stimulus, or (c) gazed at the position opposite the stimulus.

With these assumptions, we normalize the trace as follows. First, to simplify the algorithms, we flip the trace if needed so that positive excursions correspond to eye movements in the correct direction. We then smooth the eye-movement trace with a Savitzky-Golay filter [18], [19] (of order 3 and frame length 5) to make the final normalization more robust to noise. Subsequently, we determine two reference points to scale and shift the eye-movement trace. Our first reference point is the starting gaze position of a trace, that is, 200 ms before the stimulus presentation. By the second assumption, our second reference point is either the maximum or the minimum value of the smoothed trace, depending on whether the subject makes a correct saccade, a corrected error, or an uncorrected error. The scaling and shifting coefficients are found by shifting the first reference point to zero degrees and scaling the second to either the expected final amplitude (12.7 degrees) or its negative. If the difference between the maximum value and the starting gaze position is sufficiently large, we assume that the subject has made a correct saccade or a corrected error, and we scale the second reference point to the positive expected amplitude value. If that difference is small but the absolute difference between the minimum value and the starting gaze position is sufficiently large, we assume that the subject has made an uncorrected error, and we scale the second reference point to the negative expected amplitude value. If neither of these criteria is met, we assume that the subject made only subtle eye movements or that the eyes were occluded.

Fig. 7. Examples where a tanh cannot be fitted to the entire trace: (a) gaze returning, (b) hypometric saccade [4], [6]. In both cases, to find the saccade latency, the window where we fit a tanh model should be from A to D.

In the first two scenarios, we find the scaling and shifting coefficients from the smoothed trace and normalize the original trace using these coefficients. One key observation is that after normalization, traces with the same shape become identical. This characteristic ensures that if the saccade-latency measurement algorithm and the error-detection algorithm are designed on this normalized trace, the algorithms are scale-and-shift-invariant; that is, eye-movement features are measured based only on the shape of a trace. In the third scenario, we noticed on visual inspection of the video recordings that the sizes of the eye movements were often comparable to noise and subtle head movements. To account for such observations, we label such traces as "LS" (Low Signal) to acknowledge that we are uncertain whether there is an actual eye movement, even when inspecting the original videos. Traces labeled LS are excluded from the saccade-latency measurement and the error-detection algorithm.

C. Adaptive Windowing and Saccade Latency Measurement
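A minimal sketch of this normalization, assuming a hypothetical minimum-excursion cutoff for the LS decision (the paper's exact centimeter threshold is not reproduced here):

```python
import numpy as np
from scipy.signal import savgol_filter

AMP_DEG = 12.7  # expected saccade amplitude in degrees (task geometry)

def normalize_trace(x_cm, i_start, min_excursion_cm=0.5):
    """Scale-and-shift a direction-corrected iTracker-face trace to degrees.

    x_cm must already be flipped so positive excursions are in the correct
    direction; i_start indexes the sample 200 ms before stimulus onset.
    min_excursion_cm is the cutoff below which a trace is labeled "LS";
    its value here is an illustrative assumption.
    """
    x_cm = np.asarray(x_cm, dtype=float)
    smooth = savgol_filter(x_cm, window_length=5, polyorder=3)
    ref0 = smooth[i_start]                  # first reference: starting gaze
    up = smooth.max() - ref0                # positive excursion
    down = ref0 - smooth.min()              # negative excursion
    if up >= min_excursion_cm:              # correct saccade / corrected error
        scale = AMP_DEG / up                #   second reference -> +12.7 deg
    elif down >= min_excursion_cm:          # uncorrected error
        scale = AMP_DEG / down              #   second reference -> -12.7 deg
    else:
        return None, "LS"                   # indistinguishable from noise
    return (x_cm - ref0) * scale, "OK"
```

The coefficients are computed from the smoothed trace but applied to the original one, so two traces with the same shape map to the same normalized trace regardless of their original scale and offset.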
With the normalization in place, we next describe how we updated the window of fit for the saccade latency measurement. Returning to the examples in Fig. 7, during the periods from A to B and from C to D, the subject's eyes are fixated. During the period from B to C, the subject performed a correct saccade, in the sense that the eyes moved in the correct direction. As a result, the proper window of fit is the first sequence of fixation, directionally correct eye movement, and fixation. This period can be identified using the velocity of the gaze. We estimate the velocity of the gaze by computing the first-order derivative of the Savitzky-Golay-filtered trace to avoid amplifying high-frequency noise.

We then classify a sequence of time instances as a correct saccade period if the velocity values cross 30 degrees/s and as an incorrect saccade period if the velocity values cross −30 degrees/s. This allows us to fit the tanh model to traces with multiple transitions and measure their saccade latencies. We compared the previously described fixed-window approach with the adaptive-window approach and observed that the proportion of saccades with an NRMSE > 0.1 decreased; by moving to the adaptive-window approach, we were able to compute significantly more latencies with this improved saccade-latency measurement algorithm.

Fig. 8. Tanh fitting examples: (a) gaze returning, (b) hypometric saccade. The top panels show the eye-movement traces obtained from iTracker-face after normalization; the dark lines show the fitted hyperbolic tangent models. The bottom panels show the velocity of the eye movements and the velocity threshold (dashed lines). With such a threshold, we label different parts of the trace as fixation (F), correct saccade (C), or error saccade (E). The window of fit is chosen as the first "fixation (F)-correct saccade (C)-fixation (F)" period that crosses a third of the amplitude.
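The velocity-threshold labeling that drives the adaptive window can be sketched as follows (Python; the frame rate and the run-scanning helper are illustrative assumptions):

```python
import numpy as np
from scipy.signal import savgol_filter

def label_frames(x_deg, fps=60.0, v_thresh=30.0):
    """Label each frame F (fixation), C (correct), or E (error saccade).

    Velocity is the first-order derivative of the Savitzky-Golay-smoothed
    trace, scaled to degrees/s, with a +/-30 deg/s threshold as in the text.
    """
    smooth = savgol_filter(np.asarray(x_deg, dtype=float), 5, 3)
    v = np.gradient(smooth) * fps
    return np.where(v > v_thresh, "C", np.where(v < -v_thresh, "E", "F"))

def first_fcf_window(labels):
    """Return (start, end) of the first fixation-correct-fixation run."""
    runs, s = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[s]:
            runs.append((labels[s], s, i))
            s = i
    for k in range(1, len(runs) - 1):
        if (runs[k][0], runs[k - 1][0], runs[k + 1][0]) == ("C", "F", "F"):
            return runs[k - 1][1], runs[k + 1][2]
    return None
```

The returned (start, end) indices delimit the adaptive window over which the tanh model is then fitted.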
D. Error Detection
In the clinical literature, a directional error is defined as an initial eye movement in the wrong direction [16]. Manual annotation is often involved in the determination of these errors [10], [11]. Because such clinical studies have traditionally relied on specialized environments and eye-tracking equipment, including chinrests, infrared illumination, and research-grade cameras, there were usually comparatively few traces collected per subject, and the traces tended to be clean. As a result, manual annotation of traces is possible in these cases. In contrast, to enable collection of large amounts of data, we use consumer-grade cameras and do not use a chinrest. As a result, we obtained significantly more traces, though some were affected by glare or head movements. Our goal is thus to reject poor recordings and develop an accurate and robust error-detection algorithm.

Fig. 9. Error detection example. The top panel shows the x coordinates of the iTracker-face output over time (x_t). The middle and the bottom panels show gp_t and gn_t. The dashed line indicates the threshold T. When gp_t and gn_t cross the threshold T, t_correct and t_error are detected, respectively. In this case, since 0 < t_error < t_correct, an error is detected.

As mentioned in Section III-C, we exclude the traces labeled LS, since we cannot distinguish between saccadic eye movements and noise/head movements. Among the remaining traces, we noticed that a typical error trace shows a period of fixation followed by a directionally incorrect eye movement (as shown in the top panel of Fig. 9). Since our goal is to detect such a change, we developed our algorithm based on the change-detection literature [20]. In particular, we extended the cumulative sum (CUSUM) algorithm [21] for our purposes. We first assume that our measured eye-movement trace x_t at time t is composed of an eye movement θ_t and an additive measurement noise ε_t.
We then use a recursive least-squares filter to estimate the eye movement θ̂_t according to

θ̂_t = λ θ̂_{t−1} + (1 − λ) x_t.    (1)

The residual error then becomes ε̂_t = x_t − θ̂_t. If there is neither a positive nor a negative trend in x_t, ε̂_t will be centered around zero. As a result, when we consider the cumulative sum of the residual error, s_t = s_{t−1} + ε̂_t, s_t will be centered around zero as well. However, if there is a negative trend in x_t, as shown in Fig. 9, s_t will become progressively more negative. We can then use a threshold to determine whether s_t is sufficiently negative that ε̂_t is unlikely to represent only additive measurement noise.

To distinguish between correct and incorrect saccades, we define two separate variables in place of s_t: gn_t = max{gn_{t−1} − ε̂_t, 0} and gp_t = max{gp_{t−1} + ε̂_t, 0}. That is, gn_t accumulates negative trends and gp_t accumulates positive trends. When gn_t or gp_t crosses the pre-determined threshold, we detect an incorrect or a correct saccade, respectively. To apply the definition of a directional error as an initial eye movement towards the wrong direction, we detect an error if gn_t crosses the pre-determined threshold, after a minimum delay following the stimulus presentation, and before gp_t crosses the pre-determined threshold. Here, we chose to scale the threshold with respect to the estimated (corrected) saccade amplitude. We notice that if there is no error in a trace, gn_t will remain around zero while gp_t will approximate the amplitude of the saccade (Fig. 9). When there is a corrected error, gp_t will approximate the amplitude of the corrected saccade. On the other hand, when there is an uncorrected directional error, gn_t will approximate the amplitude of the saccade. As a result, we approximate the (corrected) saccade amplitude by max_t{gp_t, gn_t}.
We further observe that if the saccade amplitude before normalization is sufficiently large, the saccade will be less affected by head movement and noise, and we can lower the threshold to detect smaller errors. On the other hand, if the original saccade amplitude is comparable to the size of the head movement and noise, the threshold needs to be sufficiently large to avoid detecting artifacts. Recall that in Section III-C we scale and shift the trace to normalize it from centimeters to degrees. We can use the scaling coefficient (denoted B in the Algorithm) as a measure of the original saccade amplitude: if B is small, the original amplitude is large and the threshold can be lowered; if B is large, we use a fixed threshold. The cap on B, denoted c here, can be considered a hyperparameter that we can tune. The final threshold is max_t{gp_t, gn_t} · min{B, c} · T. The complete algorithm is shown in the Algorithm.

To determine the threshold T, we asked four subjects to perform six anti-saccade tasks of 40 stimuli each. Two expert annotators reviewed the videos and annotated the directional errors. Out of the 4 · 6 · 40 = 960 saccadic eye movements, there were only two disagreements between the annotators, which were resolved after the two cases were reviewed jointly. With the annotated data set at hand, we swept the threshold T and determined the true positive and false positive rates for detecting a directional error (Fig. 10). When the threshold is lower than the noise level, gp_t and gn_t may cross the threshold due to noise rather than a saccadic eye movement; that is, gp_t may be equally likely to cross the threshold as gn_t. Recall that we only declare a trace an error if gn_t crosses the threshold before gp_t; hence, as T goes to zero, both the true positive rate and the false positive rate go to 0.5. On the other hand, if the threshold is too large, the amplitude of an incorrect saccade may be smaller than the threshold and the error may go undetected. When T is larger than the noise level but smaller than the amplitude of an error, we obtain high sensitivity and specificity. By choosing T = 0.03, we achieve a sensitivity of 0.97 and a specificity of 0.97 for detecting a directional error.

E. Error Rate Definition
In the clinical literature, error rate is often defined as the proportion of errors, though it is not usually discussed whether noisy traces are excluded from this calculation. Given the use of special-purpose equipment and optimized environmental conditions in clinical research studies, such recordings may have very few noisy traces. Without a chinrest and a controlled
Algorithm: Error Detection
input: x = [x_1, ..., x_N], B (x_1 is the first sample after the stimulus presentation; B is the scaling coefficient from the saccade normalization)
output: t_error, t_correct (an error is detected only if the first element of t_error is smaller than the first element of t_correct)
parameters: λ, T, c (c is the cap on B, a tunable hyperparameter)

for round = 0, 1 do
    θ̂ = x_1; t_error = []; t_correct = []; gn = [0]; gp = [0]
    for t = 2, ..., N do
        θ̂ = λ θ̂ + (1 − λ) x[t]
        ε̂ = x[t] − θ̂
        gn.append(max{gn[t−1] − ε̂, 0})
        gp.append(max{gp[t−1] + ε̂, 0})
        if round == 1 then
            if gn[t] > A · T then
                t_error.append(t); gn[t] = 0; θ̂ = x[t]
            end
            if gp[t] > A · T then
                t_correct.append(t); gp[t] = 0; θ̂ = x[t]
            end
        end
    end
    A = min{B, c} · max_t{gp[t], gn[t]}
end
Fig. 10. The true positive rate and the false positive rate as we increased the error detection threshold T from 0 to 0.1. We chose T = 0.03 as our final threshold to achieve a sensitivity of 0.97 and a specificity of 0.97.

laboratory setup, we obtained more noisy traces. We carefully identified the causes of these noisy traces: glare, head movements, and drooping eyelids. Many of these causes could be reduced with more careful instruction. However, even with careful instruction, it is hard to eliminate all of them, due to the nature of the much more relaxed and varying recording environment and the large number of recordings. As a result, it is important to define an error rate that takes these noisy traces into consideration.

An eye movement was either declared a correct saccade (dC), declared an error (dE), or labeled low signal (LS). If we define the error rate as the proportion of errors out of all the traces, we might significantly underestimate the error rate in recordings with many eye movements in the LS category. A better approach is to define the error rate in a recording as P(dE)/(1 − P(LS)). Under the assumptions

• P(dE|C) ≈ 0, P(dC|E) ≈ 0,
• P(LS|E) ≈ P(LS|C),

where E denotes errors and C denotes correct saccades, we can express the error rate as

P(dE)/(1 − P(LS))
= [P(dE|E)P(E) + P(dE|C)P(C)] / [1 − P(LS|E)P(E) − P(LS|C)P(C)]
≈ P(dE|E)P(E) / {P(E)[1 − P(LS|E)] + P(C)[1 − P(LS|C)]}
≈ [1 − P(LS|E)]P(E) / {[1 − P(LS|E)]P(E) + [1 − P(LS|C)]P(C)}
≈ P(E) / [P(E) + P(C)] = P(E),    (2)

where we made use of the fact that a trace is either an error or a correct saccade, i.e., P(E) + P(C) = 1. The first assumption states that the false positive and false negative rates are essentially zero. As discussed in Section III-D, our error detection algorithm achieved a sensitivity of 0.97 and a specificity of 0.97, so these conditions are indeed met. The second assumption states that a correct saccade is equally likely to be declared LS as an error saccade. Since our determination of LS is based only on the size of the trace, this condition is met as well. Therefore, it is reasonable to define the error rate as P(dE)/(1 − P(LS)).

IV. DATA ANALYSIS
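Before turning to the data, note that the error-rate definition of Section III-E reduces to a simple per-recording computation; a Python sketch (function name ours):

```python
def error_rate(n_dE, n_dC, n_LS):
    """Per-recording error rate P(dE) / (1 - P(LS)).

    Equivalent to the fraction of declared errors among non-LS traces,
    so LS traces drop out of both numerator and denominator.
    """
    usable = n_dE + n_dC
    return n_dE / usable if usable else float("nan")
```

Recordings whose traces are all LS yield no estimate rather than a misleading zero.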
With our system, we have collected 6,823 videos and 236,900 eye movements from 80 subjects across the adult age spectrum. With the saccade-latency and error determinations, we labeled the traces as in Fig. 11. We observe that in videos with a substantial number of LS traces, subjects' eyes were often partially occluded due to eyelid droop, while videos with a large number of bad saccades tend to contain more head movement.
Fig. 11. Breakdown of the eye movements in a video into saccades declared an error (dE: 16%), good saccades (declared correct with NRMSE below the rejection threshold), bad saccades (NRMSE at or above the threshold), and "LS" (low signal: 1%).
As a result, the number of LS traces and bad saccades indicates whether a subject recorded themselves properly. We therefore discard a video if more than half of its saccades are LS or bad saccades. After discarding such videos, we retained 6,787 videos and 235,520 eye movements from 80 subjects. For the remaining videos, we calculated the mean (standard deviation) of the proportions of each label per video: 1% (4%) of the saccades are LS and 3% (5%) are bad saccades. That is, on average, 96% of the saccades are good saccades or declared errors.

With these data, we can analyze eye-movement features across age groups (Fig. 1). This is important because it provides a baseline for later comparison with data from patients. We calculate the mean saccade latency and error rate for each individual and then compute the mean and standard error of the individual mean saccade latencies and error rates per age group, so that the mean of an age group is not biased towards subjects who provided more recordings. To evaluate the correlation between age and eye-movement features, we compared our results with [22], [23], where data were collected with specialized equipment (DC electrooculography with a head rest) in a controlled environment. We notice that [22] defined an anticipatory saccade as any saccade (including errors) with latency below a fixed cutoff;

Fig. 12. Eye movement features as a function of age after excluding short-latency saccades at two different latency cutoffs: (a), (c) mean saccade latency and (b), (d) mean error rate. The bars show one standard error.
While there is no consistent definition of anticipatory saccades in the literature, our observation highlights that they should be carefully defined. In addition, with access to sizable data, we can study individual distributions instead of only reporting the population mean, as in most of the clinical literature. We analyzed the mean pro-saccade latency of each subject in seven age groups and chose from each age group the subject with the median mean pro-saccade latency as the representative subject. In Fig. 13, we show example saccade-latency distributions of these representative subjects. We observe significant intra- and inter-subject variations in saccade latency across our study cohort, which suggests that aggregated results may lose the information encoded in individual distributions.

V. DISCUSSION
The neural circuits involved in generating eye movements can be affected by neurodegenerative diseases. In particular, pro-/anti-saccade latencies and error rates have been shown in the clinical literature to differ significantly between healthy subjects and patients with certain neurodegenerative conditions, such as Alzheimer's disease and Parkinson's disease [9]–[11]. Thus, such eye-movement features may be promising candidates for tracking disease progression. However, these features are commonly measured in special, somewhat artificial environments and with special-purpose infrared-illuminated cameras, which limits broad accessibility and the repeat measurements needed to track neurodegenerative disease progression longitudinally. In this work, we present, validate, and use an iOS application to enable such data collection. Additionally, we present algorithms for measuring saccade latency, determining directional errors, and calculating error rate that take into account the possibility that it might not always be possible to determine the eye movement from home-based recordings.
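Our measurement pipeline fits a hyperbolic-tangent (tanh) step model to the gaze trace to locate the saccade. The sketch below illustrates the idea of such a fit with a coarse grid search over the transition midpoint and width; the function name, the candidate widths, and the grid-search strategy are illustrative assumptions, not the actual implementation, which additionally performs window selection, signal-quality quantification, and artifact rejection.

```python
import numpy as np

def fit_tanh_onset(t, g, widths=(5.0, 10.0, 20.0, 40.0)):
    """Fit g(t) ~ a*tanh((t - t0)/b) + c by grid search over (t0, b).

    For each candidate midpoint t0 (in ms) and width b, the amplitude a
    and offset c follow from linear least squares.  Returns the best
    (t0, nrmse); nrmse is the residual RMSE normalized by trace range
    and can serve as a fit-quality measure.  Note t0 is the transition
    midpoint; the actual pipeline derives the onset from the fitted curve.
    """
    best = (None, np.inf)
    for b in widths:
        for t0 in t:
            x = np.tanh((t - t0) / b)
            A = np.column_stack([x, np.ones_like(x)])
            (a, c), *_ = np.linalg.lstsq(A, g, rcond=None)
            rmse = np.sqrt(np.mean((g - (a * x + c)) ** 2))
            if rmse < best[1]:
                best = (t0, rmse)
    t0, rmse = best
    nrmse = rmse / (g.max() - g.min())  # normalize by trace range
    return t0, nrmse
```

On a clean synthetic step sampled every 5 ms, the grid search recovers the true transition midpoint; on real traces, the choice of fitting window matters, which is what the extension described later in this section addresses.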
Recording setup
In our previous work, we showed that we can measure saccade latency with a smartphone camera instead of a special-purpose camera. The recording setup, nevertheless, required a laptop to display the task, a screen synchronized with the laptop to be placed behind the subject, and a researcher to record both the subject's eye movement and the synchronized screen using the back camera of an iPhone. Due to these requirements, the recording setup was not sufficiently flexible for subjects to take recordings on their own in their homes or offices, which limits the possibility of using such a system to flexibly and ubiquitously monitor neurocognitive decline or disease progression. In this work, we designed an iOS app to record a subject with the frontal camera of an iPad while the subject follows a task shown on the screen. There are two challenges to achieving this goal.

First, unlike in the clinical setup and in our previous work, where an expert researcher records the subject, our app needs to guide subjects to record themselves at a proper distance from the camera and in a well-lit environment. To resolve this first challenge, before recording, the app displays the subject on the screen and guides them to align their face with a bounding box shown on the screen. With such guidance, most subjects were recorded at an appropriate distance. To ensure the environment is well lit, the app also asks the subject to move to a better-illuminated environment if the measured ISO is greater than 1000.

Second, the camera recording and the task displayed on the screen need to be well synchronized to obtain accurate saccade latencies. This can be challenging, as most applications (e.g., video chatting) only require the synchronization error to be unnoticeable by a human (i.e., < 80 ms). With careful app design and evaluation of the synchronization error, we show that we can restrict the absolute timing error to within 5 ms, which is well within the standard deviation of a subject's saccade latency distribution.
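A back-of-the-envelope calculation makes the last point concrete. If we model the residual timing error as an independent offset bounded by ±5 ms (a uniform error model assumed here purely for illustration; the paper only bounds the absolute error), its variance is (2·5)²/12 ≈ 8.3 ms², which is negligible against a typical per-subject latency standard deviation on the order of 50 ms:

```python
import math

def sd_with_jitter(sd_ms, jitter_bound_ms):
    """SD of latency plus an independent timing error uniform in
    [-jitter_bound_ms, +jitter_bound_ms]; variances add under independence."""
    jitter_var = (2 * jitter_bound_ms) ** 2 / 12.0
    return math.sqrt(sd_ms ** 2 + jitter_var)

print(sd_with_jitter(50.0, 5.0))  # ~50.08 ms: under 0.2% inflation of the SD
```

In other words, a 5 ms bound on the synchronization error inflates the measured latency spread by well under one percent.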
Algorithm design
In our previous work, we measured pro-saccade latency by combining a deep convolutional neural network for gaze estimation with a model-based approach for saccade-onset determination that also provides automated signal-quality quantification and artifact rejection. Here, we also include an anti-saccade task and extend our measurement pipeline to measure the associated saccade latencies and error rates.

Since eye movements are now recorded outside of a clinical environment, our first observation is that in cases where eye movements are too small in amplitude or when the eyes are occluded, the eye-movement signals can be smaller than the noise. In these cases, we cannot tell the direction of the eye movement either from the trace or from the original video. As a result, we cannot classify these traces into correct or erroneous eye movements and cannot determine the saccade onset. We show that we can identify these traces using the raw output of iTracker-face, label them as "LS" (low signal), and exclude them from saccade latency measurement and error detection.

Our second observation is that, since we now implement both pro- and anti-saccade tasks and that anti-saccade latencies
are usually larger than pro-saccade latencies, we need to increase the size of the window in which we fit our tanh model. However, by doing so, we also increase the potential of including more than one saccadic movement in the window. For example, subjects may make a hypometric saccade or return their gaze towards the center of the screen. Being able to measure saccade latency from these traces is crucial, especially when these eye movements indicate a certain phenotype. For instance, patients with Parkinson's disease may make more hypometric saccades [6], [11] than age-matched controls. Our previous saccade latency measurement algorithm cannot measure latencies from these traces, since a tanh model with a fixed window cannot fit them well. In this work, we show how to find the appropriate windows of fit for these traces and thus enable saccade latency measurement. By doing so, we keep 96% of the traces as either a good saccade (a saccade whose fit has an NRMSE below our quality threshold) or a declared error.

Fig. 13. Representative normalized distributions, shown as probability density functions (PDFs), of pro-saccade (blue) and anti-saccade (red) latencies for each decade in age of the study population. Subjects whose mean pro-saccade latency is the median of the corresponding age group were chosen to represent each group. No censoring was applied to eliminate anticipatory saccades. AVG: average latency; SD: standard deviation; N: number of eye movements.

Age and eye movement features
With the improvements in our measurement pipeline, we took 6823 recordings from 80 subjects ranging in age from 20 to 92 years, a significantly larger number compared to our previous work (around 500 recordings from 29 subjects, mostly in their 20s and 30s); most other work collected just a few recordings from each subject [11], [16], [24]. Moreover, we have 43 subjects with multiple recording sessions, compared to 11 subjects in our previous work. Even after discarding undesirable recordings, we retained 6787 recordings and 235,520 eye movements from 80 subjects.

As in the literature, we observe that anti-saccade latency and error rate tend to be larger than pro-saccade latency and error rate, respectively. Across the age range, we also observe that saccade latency is positively correlated with age, while a strong relationship between error rate and age is not apparent. This matches the observations in prior work [22], [23]. Although our saccade latency values are smaller than the values reported in [22], [23], they are within the range of latency values reported in the clinical literature [10], [16], [25], [26]. Several hypotheses can be made to explain why our values may be smaller. First, our recording setup is less constrained. As mentioned in [3], recording subjects in dedicated environments may affect a subject's cognitive awareness. Second, our subjects are mostly graduate students or professors, and education level may affect reaction time. We also have fewer subjects in their 70s and 80s than in other age brackets. While one of the three subjects in their 70s has latency values much closer to those reported in the literature, the two other subjects have smaller latency values.

We also observe that the definition of an anticipatory saccade may significantly affect the measured pro-saccade latency and error rate.
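The sensitivity of the reported mean to the anticipatory-saccade definition can be made concrete: re-computing the mean pro-saccade latency under different latency cutoffs (saccades faster than the cutoff are censored as anticipatory) shifts the summary statistic. The cutoffs and latencies below are illustrative values, not the thresholds used in [22], [23]:

```python
def mean_latency_after_censoring(latencies_ms, cutoff_ms):
    """Mean latency after discarding saccades faster than cutoff_ms,
    which are treated as anticipatory rather than visually guided."""
    kept = [x for x in latencies_ms if x >= cutoff_ms]
    return sum(kept) / len(kept) if kept else float("nan")

latencies = [60, 75, 110, 130, 150, 170, 210]  # illustrative latencies (ms)
for cutoff in (0, 70, 90):
    print(cutoff, round(mean_latency_after_censoring(latencies, cutoff), 1))
```

Even in this toy example, moving the cutoff from 0 to 90 ms shifts the mean latency by roughly 25 ms, which is why the threshold choice must be reported alongside the statistic.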
While the definition is not consistent across the clinical literature, our observation suggests that a more careful investigation into the effect of the latency threshold for anticipatory saccades on mean saccade latency is warranted. Some investigations designed tasks to avoid anticipatory saccades [27], [28], for example, by randomizing the length of the fixation period or by including more positions where a stimulus can be presented. However, we suspect that these modifications may result in an increased error rate. Since we aimed to design and validate our error detection algorithm in this work, we did not implement either of these modifications. Nevertheless, it is worth analyzing how these modifications may affect saccade latency and error rate.

Last but not least, we show that with multiple recordings from each subject, we can study individual saccade latency distributions, whereas most of the literature only reports population means. We observe significant intra- and inter-subject variability in these distributions. This observation suggests that such distinctive differences within and across subjects are lost if we were only to report a single summary statistic (mean or median) for each subject or across each age group. Our pipeline fundamentally enables the collection and analysis of a large number of measurements to characterize the distributional characteristics of each subject.

VI. CONCLUSION
In this work, we developed, validated, and deployed an app to allow for robust determination of pro- and anti-saccade latencies in a visual reaction task. Additionally, we extended our previously reported signal-processing pipeline to automatically detect low-signal recordings that should not be analyzed further and to identify directionally erroneous eye movements. With this platform in place, we collected over 235,000 eye movements from 80 self-reported healthy volunteers ranging in age from 20 to 92 years, an order of magnitude more measurements than presented in our previous work. We observed that pro- and anti-saccade latencies are positively correlated with age, whereas the relationship between error rate and age is not significant. Moreover, we observed notable intra- and inter-subject variability across participants, which highlights the need to track eye-movement features in a personalized manner. By enabling app-based saccade latency measurement and error rate determination, our work paves the way to using these digital biomarkers to aid in the quantification of neurocognitive decline, possibly from the comfort of the patient's home.

ACKNOWLEDGMENTS
The authors would like to thank Mr. Peter Kamm for help with the development of the app.

REFERENCES

[1] S. Hoops et al., "Validity of the MoCA and MMSE in the detection of MCI and dementia in Parkinson disease," Neurology, vol. 73, no. 21, pp. 1738–1745, 2009.
[2] A. Mitchell, "A meta-analysis of the accuracy of the mini-mental state examination in the detection of dementia and mild cognitive impairment," Journal of Psychiatric Research, vol. 43, no. 4, pp. 411–431, 2009.
[3] National Academies of Sciences, Engineering, and Medicine, Harnessing Mobile Devices for Nervous System Disorders: Proceedings of a Workshop. Washington, DC: The National Academies Press, 2018.
[4] R. Leigh and D. Zee, "The saccadic system," in The Neurology of Eye Movements. Oxford: Oxford University Press, 2015, ch. 4, pp. 169–288.
[5] S. Tabrizi et al., "Biological and clinical manifestations of Huntington's disease in the longitudinal TRACK-HD study: Cross-sectional analysis of baseline data," The Lancet Neurology, vol. 8, no. 9, pp. 791–801, 2009.
[6] T. Anderson and M. MacAskill, "Eye movements in patients with neurodegenerative disorders," Nature Reviews Neurology, vol. 9, no. 2, pp. 74–85, 2013.
[7] D. Munoz and S. Everling, "Look away: the anti-saccade task and the voluntary control of eye movement," Nature Reviews Neuroscience, vol. 5, no. 3, pp. 218–228, 2004.
[8] J. E. McDowell et al., "Neurophysiology and neuroanatomy of reflexive and volitional saccades: evidence from studies of humans," Brain and Cognition, vol. 68, no. 3, pp. 255–270, 2008.
[9] R. Shafiq-Antonacci et al., "Spectrum of saccade system function in Alzheimer's disease," Archives of Neurology, vol. 60, no. 9, pp. 1275–1278, 2003.
[10] T. Crawford et al., "Inhibitory control of saccadic eye movements and cognitive impairment in Alzheimer's disease," Biological Psychiatry, vol. 57, no. 9, pp. 1052–1060, 2005.
[11] U. Mosimann et al., "Saccadic eye movement changes in Parkinson's disease dementia and dementia with Lewy bodies," Brain, vol. 128, no. 6, pp. 1267–1276, 2005.
[12] S. Garbutt et al., "Oculomotor function in frontotemporal lobar degeneration, related disorders and Alzheimer's disease," Brain, vol. 131, no. 5, pp. 1268–1281, 2008.
[13] H.-Y. Lai et al., "Enabling saccade latency measurements with consumer-grade cameras," in Proceedings of the IEEE International Conference on Image Processing (ICIP), 2018, pp. 3169–3173.
[14] G. Saavedra-Peña et al., "Determination of saccade latency distributions using video recordings from consumer-grade devices," in Proceedings of the IEEE Engineering in Medicine and Biology Conference (EMBC), 2018, pp. 953–956.
[15] H.-Y. Lai et al., "Measuring saccade latency using smartphone cameras," IEEE Journal of Biomedical and Health Informatics, vol. 24, no. 3, pp. 885–897, 2020.
[16] S. Rivaud-Péchoux et al., "Mixing pro- and antisaccades in patients with parkinsonian syndromes," Brain, vol. 130, no. 1, pp. 256–264, 2006.
[17] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2001, pp. 511–518.
[18] M. Nyström and K. Holmqvist, "An adaptive algorithm for fixation, saccade, and glissade detection in eyetracking data," Behavior Research Methods, vol. 42, no. 1, pp. 188–204, 2010.
[19] A. Savitzky and M. Golay, "Smoothing and differentiation of data by simplified least squares procedures," Analytical Chemistry, vol. 36, no. 8, pp. 1627–1639, 1964.
[20] F. Gustafsson, "On-line approaches," in Adaptive Filtering and Change Detection. John Wiley & Sons, Ltd, 2001, ch. 3, pp. 55–87.
[21] E. Page, "Continuous inspection schemes," Biometrika, vol. 41, no. 1/2, pp. 100–115, 1954.
[22] D. Munoz et al., "Age-related performance of human subjects on saccadic eye movement tasks," Experimental Brain Research, vol. 121, no. 4, pp. 391–400, 1998.
[23] B. Fischer, M. Biscaldi, and S. Gezeck, "On the development of voluntary and reflexive components in human saccade generation," Brain Research, vol. 754, no. 1-2, pp. 285–297, 1997.
[24] Q. Yang et al., "Specific saccade deficits in patients with Alzheimer's disease at mild to moderate stage and in patients with amnestic mild cognitive impairment," Age, vol. 35, no. 4, pp. 1287–1298, 2013.
[25] J. Holden et al., "Prodromal Alzheimer's disease demonstrates increased errors at a simple and automated anti-saccade task," Journal of Alzheimer's Disease, vol. 65, no. 4, pp. 1209–1223, 2018.
[26] C. Bonnet et al., "Eye movements in ephedrone-induced parkinsonism," PLoS ONE, vol. 9, no. 8, pp. 1–8, 2014.
[27] A. Boxer et al., "Saccade abnormalities in autopsy-confirmed frontotemporal lobar degeneration and Alzheimer's disease," Archives of Neurology, vol. 69, no. 4, pp. 509–517, 2012.
[28] S. Hopf et al., "Age dependent normative data of vertical and horizontal reflexive saccades," PLoS ONE, vol. 13, no. 9, p. e0204008, 2018.

APPENDIX
In this appendix, we detail how we bound the error associated with saccade latency determination using the app. The accuracy of the saccade latency determination depends on the accuracy with which the timing of two events can be determined with the app, namely 1) the times of first presentation of each stimulus (the "stimulus timestamps" s_i), and 2) the times associated with the frame-by-frame recording from the camera (the "recording timestamps" t_j). Our typical saccade task consists of 40 individual pro-/anti-saccade stimuli. Hence, we obtain 40 stimulus timestamps (i.e., i ∈ {1, ..., 40}), whereas we record Z frames from the camera for each recording (i.e., j ∈ {1, ..., Z}). Both series of timestamps are obtained through function calls to the operating system.

If the timestamps s_i and t_j could be obtained to very high accuracy, the resulting error in the saccade onset determination would solely be due to the saccade onset determination algorithm. However, given that operating systems generally prioritize a host of housekeeping tasks, timing information obtained from the operating system tends to be affected by queued access to the processor clock.

To evaluate the synchronization error between the screen timestamps and the recording timestamps, we placed the device in front of a mirror and ran a 40-saccade task. With the mirror, we can identify the recording frame in which each of the 40 stimuli first appears. In Fig. 14, for example, the first stimulus was presented in Frame 85. With the 40 frame indices and the associated recording timestamps t_j, we can translate these indices into time instants r_i, i = 1, ..., 40. In Fig. 14, r_1 ≈ 4398.8 s. Similarly, from the screen timestamps, we can obtain the time s_i when the i-th stimulus is shown on the screen. Figure 15 shows s_1 to be approximately 4398.8324 s.
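Given the mirror-test pairs (s_i, r_i), the constraint derived below confines the unknown offset D to an interval per stimulus, and intersecting the 40 intervals bounds D. The sketch below illustrates this interval-intersection idea under the frame spacing of a 60-fps recording; function names and the synthetic data are illustrative, not the paper's implementation:

```python
import math

FRAME_MS = 1000.0 / 60.0  # frame spacing of a 60-fps recording, ~16.7 ms

def offset_bounds(pairs):
    """pairs: list of (s_i, r_i) in ms, where r_i is the timestamp of the
    frame in which stimulus i first appears.  From r_i - FRAME_MS < s_i - D
    <= r_i, each pair constrains the offset D = D_r - D_s to the interval
    [s_i - r_i, s_i - r_i + FRAME_MS); intersecting over stimuli bounds D."""
    lo = max(s - r for s, r in pairs)
    hi = min(s - r + FRAME_MS for s, r in pairs)
    if lo >= hi:
        raise ValueError("inconsistent timestamps: empty intersection")
    return lo, hi
```

With ideal (noise-free) timestamps, each additional stimulus can only tighten the interval, and the true offset always lies inside the returned bounds.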
Fig. 14. Example for determining r_i, the time at which the i-th stimulus appears in the recording. In this example, the first stimulus appears in recording frame 85, at r_1 ≈ 4398.8 s.

If the timestamps were all accurate, the stimulus appearing on the screen would be captured by the next camera frame. In this case, r_i − Δ < s_i ≤ r_i, where Δ ≈ 16.7 ms is the time difference between two frames in a 60-fps recording. If the errors in the recording timestamps and the screen timestamps are D_r and D_s, respectively, the relationship becomes r_i + D_r − Δ < s_i + D_s ≤ r_i + D_r. That is, r_i − Δ < s_i − D ≤ r_i, where D := D_r − D_s. From the recording timestamps, we can only find one time instant r̃_i(D) as a function of D that satisfies r̃_i(D) −
Fig. 15. Example for acquiring s_i, the time at which the i-th stimulus is presented on the screen. Picture 11 is a black image, and Picture 13 is the image with a left stimulus. The first stimulus shows up when Picture 13 is displayed. As a result, in this example, s_1 ≈ 4398.8324 s.

[Figure: synchronization error D (ms) as a function of (a) shutter duration (ms) and (b) ISO.]