A Visual Analytics Approach to Facilitate the Proctoring of Online Exams
Haotian Li
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, [email protected]
Min Xu
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, [email protected]
Yong Wang
School of Information Systems, Singapore Management University, [email protected]
Huan Wei
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, [email protected]
Huamin Qu
Department of Computer Science and Engineering, HKUST, Hong Kong SAR, [email protected]
ABSTRACT
Online exams have become widely used to evaluate students' performance in mastering knowledge in recent years, especially during the pandemic of COVID-19. However, it is challenging to conduct proctoring for online exams due to the lack of face-to-face interaction. Also, prior research has shown that online exams are more vulnerable to various cheating behaviors, which can damage their credibility. This paper presents a novel visual analytics approach to facilitate the proctoring of online exams by analyzing the exam video records and mouse movement data of each student. Specifically, we detect and visualize suspected head and mouse movements of students in three levels of detail, which provides course instructors and teachers with convenient, efficient and reliable proctoring for online exams. Our extensive evaluations, including usage scenarios, a carefully-designed user study and expert interviews, demonstrate the effectiveness and usability of our approach.
CCS CONCEPTS
• Human-centered computing → Human computer interaction (HCI); Visual analytics; • Applied computing → E-learning; Learning management systems.

KEYWORDS
Online proctoring, visual analytics, mouse movement, head pose estimation
ACM Reference Format:
Haotian Li, Min Xu, Yong Wang, Huan Wei, and Huamin Qu. 2021. A Visual Analytics Approach to Facilitate the Proctoring of Online Exams. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 17 pages. https://doi.org/10.1145/3411764.3445294
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CHI ’21, May 8–13, 2021, Yokohama, Japan
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8096-6/21/05...$15.00
https://doi.org/10.1145/3411764.3445294
With the rapid development of online learning in the past decade, online exams and tests are becoming increasingly popular for course instructors to assess the knowledge of students [11]. For example, Massive Open Online Courses (MOOCs) such as Coursera and EdX often require students to pass a series of online exams before they can gain a final course certificate. Meanwhile, conventional universities also continue to expand their online course programs and hold online exams for students [22]. This trend has been further accelerated since 2019 due to the COVID-19 lockdown, and most schools and universities have switched to embrace online teaching and online exams. However, one major challenge for online exams is how to proctor them in a convenient, efficient and reliable manner. Prior research [16, 30–32] has shown that online exams are vulnerable to cheating behaviors. According to the survey by King and Case [16], about 74% of students in 2013 reported that it is easy to cheat in online exams, and nearly 29% of the students indicated that they cheated in online exams. These cheating behaviors can damage the credibility of online exams, which makes online exam proctoring crucial for MOOC platforms and universities to further expand the application and usage of online exams.

Different from traditional exams with onsite proctoring, online exams lack face-to-face interactions. This complicates the proctoring of online exams, and various types of cheating behaviors may occur [38]. For example, students may type the questions into the browser and search for possible solutions on the Internet. They may also send messages to a third party (e.g., friends) to ask for help by using their mobile phones or chat apps on the computer. Without face-to-face interactions in online exams, it is not an easy task to identify such cheating behaviors.
To enable effective proctoring, existing online exams usually ask the students to use webcams to monitor and record their activities during the exams [2, 13, 20, 25, 29]. Accordingly, a set of preliminary studies on the proctoring of online exams have been conducted based on such settings.

According to our survey, the existing approaches for the proctoring of online exams can be generally categorized into three groups: manual proctoring, fully automated proctoring, and semi-automated proctoring. Manual proctoring is commonly applied in the proctoring of online exams and many online testing solutions, such as Kryterion and Loyalist Exam Services, employ such proctoring. Specifically, it requires a few proctors watching the videos of all the students during the whole online exam, which is often labour-intensive and time-consuming. Instead, fully automated proctoring aims to reduce the manual efforts of proctors by utilizing machine learning techniques to analyze the recorded video and audio data of students during the online exam [9, 33, 34]. It automatically detects suspected behaviors and classifies them into cheating or non-cheating. However, it is often difficult for the existing fully-automated proctoring methods to achieve a very high accuracy, and the validation of the results becomes an issue. To mitigate this issue, some recent online exam proctoring approaches combine detection by machine learning approaches with further manual confirmation by proctors [20, 23]. But their manual confirmation relies on manually checking the original videos backwards and forwards, which is still inconvenient and time-consuming for proctors.

In this paper, we propose a novel visual analytics approach to facilitate the proctoring of online exams.
Inspired by prior studies, our approach combines human efforts with machine learning techniques to achieve convenient, efficient and reliable proctoring for online exams. Specifically, our approach analyzes both the exam videos of each individual student recorded by a webcam and the mouse movement data. To collect the mouse movement data, we design and implement a lightweight JavaScript plugin that can be easily embedded into different web pages, including common web-based learning management systems (e.g., Canvas), and does not require students to add extra settings. With the collected videos and mouse movement data, key features indicating suspected exam cheating behaviors, including both abnormal head movements (e.g., abnormal head rotation, face disappearance from the screen) and mouse movements (e.g., copy and paste, moving the mouse out of the exam web page), are extracted. Furthermore, we designed effective visualizations to enable the interactive exploration of student cheating behaviors in three levels of detail: Student List View provides an overview of the cheating behaviors of all the students through a list of radar-chart-based glyphs;
Question List View visualizes the cheating risk distribution of all the questions finished by each student; and Behavior View, along with Playback View, enables the detailed inspection of a student's suspected cheating behavior distribution when working on a specific question and its comparison with other students and questions. Compared with prior proctoring approaches that need multiple extra devices or sensors [2, 20], students are only required to have one webcam on their computer, which is available on most laptops and makes our approach convenient to deploy in real online exams. Also, our straightforward and effective visualizations help users efficiently investigate the student cheating behaviors at different levels, and the detailed comparisons across different students and questions enable a more reliable cheating behavior judgement. We extensively evaluated the effectiveness and usability of our approach through three usage scenarios, a user study and expert interviews. The major contributions of this paper can be summarized as follows:

• We formulate the design requirements for the proctoring of online exams by working together with domain experts (i.e., university faculty and teaching staff) and surveying prior studies.
• We propose a novel visual analytics approach for the proctoring of online exams by visualizing the head and mouse movements of students during online exams in three levels of detail, enabling convenient, efficient and reliable proctoring.
• We conduct extensive evaluations, including three usage scenarios, a carefully-designed user study and expert interviews, to demonstrate the effectiveness and usability of the proposed approach.
The related work of our paper can be categorized into three parts: online proctoring methods, mouse movement visualization and head pose analysis.
Online exams are emerging nowadays with the growing popularity of online learning. The methods of online proctoring can be categorized into three types: online human proctoring, semi-automated proctoring and fully automated proctoring. Online human proctoring means that remote proctors watch students during the whole online exam. It is a very common method used by many online testing solution providers (e.g., Kryterion, Loyalist Exam Services) and some universities (e.g., University of Amsterdam) [23]. However, it is very labor-intensive and the cost will be high when a large number of students attend an online exam.

To eliminate the usage of manpower, some fully automated proctoring approaches have been proposed [2, 9], which often use machine learning techniques to identify cheating behaviors. Currently, there are some other online proctoring platforms, including ProctorU and Proctorio, using automated proctoring based on machine learning. However, all the existing fully automated proctoring approaches suffer from similar concerns as other machine learning methods in education. These concerns include the “black box” nature of the machine learning algorithms and unreliable decision making led by biased training datasets [37]. Due to these concerns, it is almost impossible to totally rely on automated methods to determine whether a student cheats in an online exam or not.

To address the problems resulting from fully automated proctoring methods, semi-automated proctoring has been introduced to involve humans in the final decision making [11, 20, 23]. One representative prior work is Massive Open Online Proctor proposed by Li et al. [20]. Specifically, their approach first detects suspected student cheating behaviors with machine learning techniques, and the detection results are further checked by teachers.
However, it does not provide teachers or instructors with a convenient way to explore and analyze suspected student cheating behaviors. Also, it requires that each student in the online exam use multiple devices (e.g., two webcams, a gaze tracker and an electroencephalogram (EEG) sensor) to record their exam process, which is not affordable for most educational institutions. Migut et al. [23] proposed a method to calculate the similarity between two successive frames in videos which record screens, and to extract video clips with dissimilar frames for manual checking. Their method also suffers from the problem that there is no convenient way to explore the students' behaviors in the extracted video clips. Furthermore, detecting cheating behaviors using local materials (e.g., paper materials, mobile phones) is not supported in their method. Costagliola et al. [11] proposed a visual analytics system to assist teachers in invigilating an exam. However, it is limited to detecting the cheating case that a student is looking at another student's screen, which is not common in online exams.

Inspired by the prior research above, we aim to propose a visual analytics system for the efficient detection and analysis of various common suspected cheating behaviors in online exams. It will utilize easily-collected data and combine the domain knowledge of users with machine computation power.
Mouse movements are commonly used to analyze user behaviors and cope with various tasks including user modeling [21, 26], cognitive load evaluation [12, 15] and student performance prediction [19, 39]. Raw mouse movement data is spatio-temporal data and hard for humans to interpret.

A few visualization approaches have been proposed to visualize the spatial and temporal information of mouse movement data, including 2D and 3D visualizations. 2D visualizations often plot the spatial information on a vertical axis and a horizontal axis and encode temporal information in a weaker visual channel (e.g., colors). Arroyo et al. [1] plotted raw mouse trajectories on web pages and used a heatmap-like design to show the time delay on each element in the website. The occlusion of mouse trajectories is severe in their method when the trajectory is complex. Burigat et al. [6] implemented a 2D visualization to draw all mouse movement trajectories on web pages and use the colors of lines to encode the sequential information of movements. This method also suffers from the problem of occlusion, and it is difficult to track the sequential order of movements. The heatmap is a frequently used technique in the 2D visualization of mouse movement data. The frequency of mouse movement data in an area is represented by the colors in the heatmap. Currently, several web analytics tools, including Hotjar and Mouseflow, apply heatmaps to present the mouse movement data. The drawback of this method is that it cannot show any detailed movement, and the temporal information is also lost. Region-of-Interest (ROI)-based visualization has also been explored by prior studies [5, 42, 43]. They visualized the transitions between ROIs to conduct visual analysis of user mouse movement behaviors. However, these methods depend highly on the appropriate definitions and choices of ROIs. 3D visualizations have also been proposed to present mouse movement data. Zgonnikov et al.
[47] proposed a landscape-like design to visualize the positions and speeds of mouse movements. Leiva and Vivó [18] plotted line charts in three dimensions to represent the mouse movement positions and temporal information, respectively. However, they share common limitations with other 3D visualization methods, including occlusion and inaccurate depth perception.

In this paper, we propose a novel visual design for showing mouse movements to support the visual analytics of students' suspected cheating behaviors during online exams.

Head pose estimation is an important topic in research on computer vision. Head poses show how the head rotates in three dimensions (i.e., yaw, pitch, roll), which are illustrated in Figure 2. Representative methods on head pose estimation include FSA-Net [44], PADACO [17] and Hopenet [35].

Head pose estimation has also been applied in the proctoring of online exams. For example, Prathish et al. [29] applied a head pose estimation method proposed by Narayanan et al. [28] to detect misconduct. Chuang et al. [9] extracted students' head poses with the method proposed by Baltrusaitis et al. [3]. They calculated several statistical metrics of head poses (e.g., average yaw angle, maximum pitch angle), which were further input into their regression models to predict whether a student had cheated or not in the exam.

In this paper, we also consider analyzing and further visualizing head pose data, which helps proctors conduct a convenient and reliable exploration of cheating behaviors in online exams.
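Head pose is commonly reported as the three Euler angles above. As an illustration only (not the internals of FSA-Net, PADACO or Hopenet, and under an assumed ZYX angle convention; the function name is our own), the angles can be recovered from a 3 × 3 rotation matrix like this:

```javascript
// Illustrative sketch: extract Euler angles (in radians) from a 3x3 rotation
// matrix R, assuming the ZYX convention R = Rz(yaw) * Ry(pitch) * Rx(roll).
// Conventions vary between head pose estimation libraries.
function eulerZYX(R) {
  const pitch = Math.asin(-R[2][0]);          // rotation about the side-to-side axis
  const yaw = Math.atan2(R[1][0], R[0][0]);   // rotation about the vertical axis
  const roll = Math.atan2(R[2][1], R[2][2]);  // rotation about the front-back axis
  return { yaw, pitch, roll };
}

// A pure 90-degree yaw, i.e., the head turned fully to one side.
const R90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]];
```

For the identity matrix all three angles are zero (the student faces the camera), while `R90` yields a yaw of π/2.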
To better understand the major challenges and design requirements in conducting proctoring for online exams, we have worked closely with five teachers (university professors or teaching staff) (P1-P5) at our university in the past six months. P1 is a professor who has taught several online courses on human computer interaction and data analytics in past years and has also been working on research projects on E-learning for more than five years. P2 is an assistant professor who has rich teaching experience and has also taught multiple online courses on programming languages. P3 and P4 are full-time teaching associates who are mainly responsible for assisting professors in proctoring exams and marking papers. P5 is a lecturer who has instructed multiple online courses on design and innovation. All experts have experience in organizing and proctoring online exams. P1 is also a co-author of this paper. We conducted a series of interviews and discussions with them through online video meetings and email communications. We collected their feedback and summarized the major design requirements for proctoring online exams. We denote the five major design requirements as R1-R5 for easy reference in the subsequent sections.

R1. Identify students at high risk of cheating.
According to the feedback of our experts, all of them agreed that it is almost impossible to manually review all the videos, since there are often multiple or even several hundred students taking the same online exam for a course. Therefore, an effective approach to facilitate the proctoring of online exams should help teachers or other proctors easily and quickly identify the students who have possibly cheated in the online exam, especially those students at high risk. This is also the fundamental goal of any approach for enabling the efficient proctoring of online exams.
R2. Locate the questions where high-risk cheating behaviors occurred.
When a student is identified as at high risk of cheating during an online exam, it is often necessary for teachers and other proctors to further explore where and when the student has cheated and check how he/she cheated. However, the current way to achieve this is to manually go through the original videos, which is often time-consuming. For example, P3 commented that “A typical online exam lasts for 2-3 hours and may involve hundreds of students. Reviewing individual people is time-consuming”. P2 also described this method as “a dull process but needs 120% concentration”, which increases the burden on teachers. To handle this problem, it is important to locate the questions a student may have cheated on and enable a fast review and check of cheating behaviors.
R3. Inspect students’ detailed cheating behaviors.
All the experts agreed that they also need to inspect detailed cheating behaviors to better understand the detected suspected cases. According to the feedback of experts, there are various cheating methods, including using unauthorized paper materials, seeking help through social media and searching for answers on the Internet. Among them, the majority of commonly-seen cheating methods are related to head and mouse movements. For example, a sign of cheating on other web pages is that the student's mouse arrives at the edge of the exam web page and stays for a while, as P3 suggested. Also, P4 pointed out that turning the head somewhere else can also indicate some cheating behaviors such as using cheat sheets. Thus, teachers need to inspect the detailed mouse and head movements during the online exam. For such kinds of inspections, a convenient and intuitive way to explore those behaviors, which can indicate cheating, is highly appreciated.
R4. Confirm suspected cheating cases through comparison across students and questions.
P1 suggested that cheating cases are always hard to confirm, since some normal behaviors also look like cheating, such as rotating the head to read the questions. Thus, our system needs to provide a convenient and effective approach to confirm that a suspected case is not caused by normal behaviors. P4 agreed that comparison with peers is an important way to avoid the systematic errors led by question design and personal habits when reviewing suspected cases. In practice, different question designs may lead to different problem-solving behaviors of students during the online exam. For example, long questions on the screen may require students to rotate their heads to read them. Also, students' habits may affect their behaviors during the online exam. Thus, a comparison with peers' behaviors on the same question and a student's own behaviors on other questions can help teachers and proctors reduce the possibility of making mistakes in judging a suspected cheating case.
R5. Explore the original video and mouse movement data in a convenient manner.
As P2 and P3 suggested, a functionality for playing back video recordings and mouse movements is essential. It can help proctors further confirm suspected cheating cases, which may be wrongly labeled by some automated detection methods. For example, drinking water could easily be recognized as abnormal behavior, since students move their heads considerably to drink water. Also, due to an unstable network or errors of webcams, one or two frames of the video may be of low quality, which can result in face detection failure. Besides, proctors may be interested in finding whether there are any other suspected cases in the video that may not be detected by the current approaches.
To enable the convenient proctoring of online exams, we propose using the video taken by the front webcam and mouse movement data during the online exams to detect cheating behaviors. Since most laptops have a webcam at the front, it is convenient to use it to record a student's behavior, such as head movements, during an exam. Mouse movement data is also collected, as mouse movements reflect where a student is focusing and are easier to collect without extra devices than other means such as eye tracking [7, 10, 11]. Prior studies have also explored other features for cheating detection, such as audio [2], eye movements and electroencephalogram (EEG) signals [20]. However, our approach does not include those, as they are not always available in real online exams. For example, as our experts P1 and P3 suggested, audio is often unavailable in online exams held through online meeting software, since students are usually muted to avoid noise. Furthermore, the collection of eye movement and EEG data requires extra devices, which bring more costs and limit their usage in practice. In this section, we introduce the details of our data collection.
There is no publicly available data for the proctoring of online exams. Therefore, we decided to hold a mock online exam to collect data after a careful discussion with P1. The major reason for collecting data from a mock exam is that we can ask participants to indicate where and how they cheated. Such information can be used as the ground truth for evaluating the effectiveness of the proposed approach. However, it is difficult to ask students to indicate where and how they have cheated in real online exams. To ensure that the mock online exam has a similar setting to real online exams, we worked closely with P1 and designed an online exam consisting of two question sets. In the mock online exam, participants' detailed exam records are collected, including their exam videos, mouse movement data, duration of the online exam, grades and the exact labeling of their cheating behaviors.
Exam.
We designed a mock online exam consisting of two question sets, with both sets focusing on evaluating students' knowledge of JavaScript. Each question set consists of 10 multiple choice questions and 4 short answer questions. For example, a multiple choice question can look like this: “Which value will not be returned by the ‘typeof’ in JavaScript? A. number; B. object; C. function; D. null”, and a short answer question asks students to list at least 3 ways to empty an Array in JavaScript. The time limit for one question set was 25 minutes, and participants were allowed to submit early after they finished all the questions.
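For reference, both sample questions can be checked directly in JavaScript: `typeof` never returns the string "null" (the long-standing quirk is that `typeof null` evaluates to "object"), and there are indeed several ways to empty an array:

```javascript
// The multiple choice question: "null" is never returned by typeof.
console.assert(typeof null === "object");        // the famous quirk
console.assert(typeof undefined === "undefined");

// The short answer question: three common ways to empty an Array.
let a = [1, 2, 3];
a.length = 0;                  // 1. truncate in place
console.assert(a.length === 0);

let b = [1, 2, 3];
b.splice(0, b.length);         // 2. remove all elements in place
console.assert(b.length === 0);

let c = [1, 2, 3];
c = [];                        // 3. reassign to a new empty array
console.assert(c.length === 0);
```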
Participants and Apparatus.
In our mock online exam, we recruited 24 participants (7 female, age mean = ., age SD = .94) via social media. They are all postgraduate students or fresh graduates who have experience in using JavaScript. They received US $2.5 if they finished the whole online exam. Also, to encourage participants to take the online exam seriously and act as if taking a real exam, US $0.375 was paid for each correctly answered question.
To mimic the environment of real online exams, as suggested by P1, the mock online exam was conducted online. We implemented a web-based online exam system to collect webcam video recordings and mouse movement data. Participants were asked to show their entire faces in the video recordings. Before the online exam, we sent each participant an exam guideline and a cheat sheet on key JavaScript knowledge points that are useful for both question sets.
Procedure.
Before the mock online exam, we first introduced the exam guideline and clarified what data would be collected in the exam. Then we asked for their permission to allow us to use the collected data for research purposes. In the mock online exam, each participant needed to finish both question sets and was asked to cheat on one of them. To eliminate the influence of the difference between the two question sets, we arranged the participants and the order of the question sets required for cheating in a counterbalanced manner. In the question set on which they were asked to cheat, participants needed to use at least 3 methods to cheat, and on each question, they could only apply one cheating method. The questions to cheat on and the detailed cheating methods were decided by the participants. Additionally, in our instruction, we emphasized that they needed to try their best to pretend they were in a real closed-book online exam. When participants finished the question set on which they were asked to cheat, a questionnaire would appear and ask them to indicate where and in which way they cheated. Participants were also allowed to have a 5-minute break between the two question sets. Since early submissions are permitted, the exact time used to answer each question set is also recorded as the duration between their entrance to the exam web page and their final submission. In the following week, we graded all the exams of each participant.
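The counterbalancing described above can be expressed as a simple assignment rule. This is our own illustration (not the authors' released code, and the function names are hypothetical): alternating which question set each participant is asked to cheat on yields an even split across the 24 participants.

```javascript
// Counterbalanced assignment sketch: even-indexed participants cheat on
// question set 1, odd-indexed participants on set 2, so both orders occur
// equally often across the group.
function assignCheatSet(participantIndex) {
  return participantIndex % 2 === 0 ? 1 : 2;
}

// Count how many participants are assigned to cheat on each set.
function countAssignments(n) {
  const counts = { 1: 0, 2: 0 };
  for (let i = 0; i < n; i++) counts[assignCheatSet(i)]++;
  return counts;
}
```

With 24 participants, each question set is cheated on by exactly 12 of them.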
Table 1 shows the basic statistics of our collected data.
Table 1: Statistics of collected data.
Number of participants: 24
Number of videos: 48
Number of mouse movement records: 286,940
Total length of videos: 9h 21m 32s
Number of cheating cases (in the local environment): 50
Number of cheating cases (on the computer): 189
Video Data.
In our mock online exam, we collected 48 videos in total, which are 30 frames per second (FPS) with a resolution of 640 ×

Mouse Movement Data.
In our online exam, students used their mice and keyboards to interact on the exam web page to answer questions. Since most of the interactions are conducted by mouse movements, we use the term mouse movement data to denote all the collected interaction data generated by mouse or keyboard interactions. We developed a JavaScript plugin to collect mouse movement data (the plugin is available at https://github.com/HKUST-VISLab/Mousetrack), which is implemented based on the DOM structure of the HTML file and is generalizable to collect mouse movements on any other web pages. Furthermore, the plugin is automatically loaded with the web page and does not require proctors or students to conduct any extra settings. Since the plugin only works on one web page and cannot collect any data after the student finishes the online exam, it will not raise the privacy concerns of collecting data in the background.

In each record of the mouse movement data, the mouse position and the DOM event type are recorded. We collected six types of DOM events:

• Blur: the web page loses focus, which is triggered when the participant leaves the current web page.
• Focus: the web page is the current focus, which is triggered when the participant enters the web page.
• Copy: content is copied from the current web page by using either mouse or keyboard.
• Paste: content is pasted to the current web page by using either mouse or keyboard.
• Mousemove: the mouse moves on the web page.
• Mousewheel: the mouse wheel rolls to scroll on the web page.

Among all the six types, “blur”, “focus”, “copy” and “paste” are considered as indicators of cheating, since their occurrence always reflects the usage of some external materials on the computer, as indicated by P3. However, our expert P4 also pointed out that it is possible that a student may copy and paste some materials merely on the exam web page. Thus, an individual “copy” or “paste” is insufficient to judge if some cheating behaviors are occurring.
To address this issue, we can check if “blur” and “focus” exist in the context of “copy” and “paste” mouse events to verify suspected cheating behaviors. If “blur” and “focus” exist around “copy” and “paste”, the student may copy some questions, use unauthorized materials and paste something to the exam web page. For example, in Scenario 3 of Section 6.1, the student copied something from the exam page and left the exam page for a while to cheat by running the copied code. If there is no “blur” or “focus” around “copy” and “paste”, the student is likely just copying and pasting on the exam web page and the possibility of cheating is low.

Cheating Types.

By analyzing our collected data, we classify students' cheating methods into two types:

• Cheating in the local environment: students use unauthorized materials locally to cheat, for example, paper materials and mobile phones, while they stay on the exam web page. The common feature of these cheating methods is that students need to turn their head away from the current screen. Thus, we propose using face disappearance and abnormal head pose as indicators of cheating in the local environment.
• Cheating on the computer: students leave the exam web page and use the computer to access unauthorized materials, for example, searching on the Internet, asking friends through social media and using electronic notes. Also, students often copy the questions to search or paste answers to the website to save time. Thus, leaving the website and “copy and paste” are our main indicators for finding suspected cheating behaviors on computers.
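To make the event handling above concrete, here is a minimal sketch of the two pieces (the authors' actual plugin is at https://github.com/HKUST-VISLab/Mousetrack; the function names, record shape and time window below are our own hypothetical choices): a recorder that buffers timestamped event records of the six DOM event types, and the blur/focus-context rule for deciding whether a copy or paste is suspicious.

```javascript
// Hypothetical sketch, not the released plugin.
// 1. Buffer timestamped records of the six DOM event types.
function createTracker() {
  const records = [];
  return {
    // In a browser this would be wired up via event listeners, e.g.:
    //   document.addEventListener("mousemove",
    //     e => tracker.record("mousemove", e.clientX, e.clientY));
    //   window.addEventListener("blur", () => tracker.record("blur"));
    record(type, x = null, y = null, t = Date.now()) {
      records.push({ type, x, y, t });
    },
    dump() { return records.slice(); },
  };
}

// 2. Blur/focus-context rule: a copy or paste is suspicious only if a blur
// or focus event occurs within `windowMs` of it, i.e. the student likely
// left the exam page around the time of the copy/paste.
function isSuspiciousCopyPaste(records, i, windowMs) {
  const e = records[i];
  if (e.type !== "copy" && e.type !== "paste") return false;
  return records.some(o =>
    (o.type === "blur" || o.type === "focus") &&
    Math.abs(o.t - e.t) <= windowMs);
}
```

Under this rule, a copy immediately followed by a blur (leaving the exam page) is flagged, while a copy and paste confined to the exam page is not.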
In this section, we introduce the technical details of our visual analytics approach for the proctoring of online exams.

As shown in Figure 1, the proposed approach consists of three major modules: data collection, the suspected case detection engine and visualization, where the latter two modules are the core parts of our approach. For data collection, we mainly collected exam videos of individual students, mouse movement data and other related information like exam score and exam duration, which has been introduced in Section 4. Then, in the suspected case detection engine, we conduct face detection and head pose estimation on videos to detect abnormal head movements. We also define two types of abnormal mouse movements and further identify them from the mouse movement data. Finally, we visualize the abnormal head and mouse movements and other related information at different levels of detail. There are four main views in our visualization module:
Student List View provides proctors with a quick overview of students at high risk of cheating during the whole online exam (Figure 3(a));
Question List View facilitates the selection of high-risk periods for each student (Figure 3(b));
Behavior View presents students' detailed head and mouse movement behaviors (Figure 3(c));
Playback View enables proctors to conduct a final confirmation using students' videos and mouse movements (Figure 3(d)).
Figure 1: Our method consists of three modules: data collection, suspected case detection engine and visualization. In the data collection module, each student's video, mouse movement data, exam score and duration are collected. These data are fed into the suspected case detection engine to extract abnormal head and mouse movements. The visualization module enables convenient and efficient analysis of students' online exam behaviors.
We design a rule-based suspected case detection engine to identify suspected cases from both video and mouse movement data, and to further estimate the risk of cheating. Specifically, we propose two rules: head poses that vary greatly from others and face disappearances in the video are abnormal head movements; copy, paste, blur and focus events are abnormal mouse movements, since they can indicate the existence of cheating, as mentioned in Section 4.2.
Abnormal Head Movement Detection.
We characterize head movements from two perspectives: head poses and head positions.
(Our system is available at https://github.com/HKUST-VISLab/Visual-analytics-approach-online-proctoring.)
Head poses indicate where a student is looking during the online exam, while head positions represent how a student moves his/her head to use different devices or materials. We use the position of a student's face in the collected video to delineate the corresponding head position, which is labeled as a rectangular bounding box that tightly encloses the student's face, as indicated by the green box in Figure 2. The size of a bounding box can help proctors check changes in the distance between a student's face and the screen, which has been considered a useful indicator of cheating in prior research [9]. Also, the head position information can benefit the detection of cases where a student is not looking at the exam web page. In our approach, we focus on detecting two types of abnormal head movements, face disappearance and abnormal head pose, which are extracted by considering head positions and head poses, respectively.
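As noted above, the size of the face bounding box serves as a rough proxy for the distance between the face and the screen. A minimal sketch of this idea, assuming boxes in [x_min, y_min, x_max, y_max] form; the example values and the ratio-based check are illustrative, not the system's code:

```python
def box_area(box):
    """Area of a face bounding box [x_min, y_min, x_max, y_max]."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)

def distance_change_ratio(box_prev, box_curr):
    """Relative change in face-box area between two frames.
    A negative value (shrinking box) suggests the face moved away
    from the screen; a positive value suggests it moved closer."""
    a_prev, a_curr = box_area(box_prev), box_area(box_curr)
    if a_prev == 0:
        return 0.0
    return (a_curr - a_prev) / a_prev

# A face box shrinking by roughly a third between two frames:
r = distance_change_ratio([100, 80, 300, 320], [120, 100, 280, 290])
print(round(r, 2))  # → -0.37
```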
Face disappearance means that the head position is unavailable during a period of time because no face is detected. This can happen under two circumstances: the face is not captured by the webcam, or it is captured but not detected by the algorithm. The former often results from the student leaving the room or covering the camera, while the latter usually happens when the student's face is partially covered by some object (for example, a cup) or the student bows his/her head deeply. According to our experts, P1 and P4, both cases often indicate a high probability of cheating. To verify whether a case of face disappearance indicates cheating, proctors can check other information such as raw videos and mouse movement data. For example, in Scenario 1 of Section 6.1, we illustrate that drinking water is mistakenly judged as a face disappearance case and that the raw video in our Playback View can help teachers verify it. An abnormal head pose indicates that the student's head pose varies greatly from a normal one, for example, when a student raises or bows his/her head, or turns his/her head away from the screen.
Figure 2: This figure illustrates the angles of the head pose and the head position. Pitch, yaw and roll describe the head rotation about the X-axis, Y-axis and Z-axis, respectively. The X-axis points to the right of the student, the Y-axis points to the floor, and the Z-axis points to the computer screen. The green solid box is the bounding box representing the head position.
For abnormal head movement detection, we first extract head positions by conducting face detection on the video. Since there is a large number of frames in each video, we surveyed prior studies [40, 45, 46] and follow Zeng et al. [45] in sampling video frames to accelerate video processing. Specifically, we process one video frame out of every five. Then, we use the pre-trained Faster
R-CNN model [36] to detect faces in the video. The extracted head position in Frame $i$ is labeled as a vector $[x^{min}_i, y^{min}_i, x^{max}_i, y^{max}_i]$, where $(x^{min}_i, y^{min}_i)$ is the coordinate of the upper-left corner and $(x^{max}_i, y^{max}_i)$ is the coordinate of the lower-right corner. The model also outputs a probability score for each detected head position to indicate whether the detection is reliable. To keep our results reliable, only head positions with a probability larger than 0.95 are kept. If Student $s$'s face is not detected in Frame $i$ of his/her video after sampling, that frame is labeled as "face disappearance". Student $s$'s total number of face disappearances is recorded as $n^s_f$.

Furthermore, we apply a state-of-the-art head pose estimation model [35] to extract the head pose in each frame of every student's exam video. The extracted head pose at Frame $i$ of Student $s$'s video is a three-dimensional vector $[pitch^s_i, yaw^s_i, roll^s_i]$, which represents the head rotation about the three axes (Figure 2). Among the three angles, pitch and yaw are crucial to suspected case detection, since the pitch angle indicates where the student looks vertically and the yaw angle indicates where the student looks horizontally. The roll angle, however, is not meaningful, since head rotation about the Z-axis barely affects where the student is looking; thus, it is not considered in our approach. Since different students may have different exam settings, which lead to different ranges of head positions and head poses, we normalize each student's head positions and poses at each frame to $(-1, 1)$ with min-max normalization. In the rest of this paper, all head poses $[pitch^s_i, yaw^s_i, roll^s_i]$ and head positions $[x^{min}_i, y^{min}_i, x^{max}_i, y^{max}_i]$ are the normalized ones.

Since students may place their webcams at various positions or angles, a standard head pose is not available. Thus, we propose to use the z-score to detect abnormal head poses, based on the assumption that a student's head pose distribution follows a normal distribution. We first calculate the average head pose of Student $s$ in his/her video, $[\overline{pitch}^s, \overline{yaw}^s]$, as well as the standard deviation of his/her head poses, $[\sigma^s_{pitch}, \sigma^s_{yaw}]$. Then the vector of z-scores of the head pose at Frame $i$ is calculated as
$$\left[\frac{pitch^s_i - \overline{pitch}^s}{\sigma^s_{pitch}}, \frac{yaw^s_i - \overline{yaw}^s}{\sigma^s_{yaw}}\right]. \quad (1)$$
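The per-student z-score computation can be sketched as follows. This is an illustrative reimplementation, not the system's code; the toy pose data are assumptions, and a lowered threshold is used because a z-score over such a tiny sample cannot reach the default of 3.

```python
import numpy as np

def abnormal_head_poses(pitch, yaw, threshold=3.0):
    """Return a boolean mask over frames whose pitch or yaw z-score
    exceeds `threshold` (the default of 3 follows the three-sigma rule)."""
    pitch = np.asarray(pitch, dtype=float)
    yaw = np.asarray(yaw, dtype=float)
    z_pitch = (pitch - pitch.mean()) / pitch.std()
    z_yaw = (yaw - yaw.mean()) / yaw.std()
    return (np.abs(z_pitch) > threshold) | (np.abs(z_yaw) > threshold)

# Mostly stable poses with one extreme head turn at frame 6:
pitch = [0.0, 0.1, -0.1, 0.05, 0.0, 0.02, -0.03, 0.01]
yaw   = [0.0, 0.05, 0.0, -0.05, 0.02, 0.0, 2.5, 0.01]
mask = abnormal_head_poses(pitch, yaw, threshold=2.0)
print(mask.nonzero()[0])  # → [6]
```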
If the absolute value of any z-score in the vector is larger than a threshold, the corresponding head pose is labeled as abnormal. The default threshold is set to 3, following the widely used three-sigma rule, and proctors can interactively adjust it through our visualization module. The total number of abnormal head poses of Student $s$ is recorded as $n^s_h$.

Abnormal Mouse Movement Identification.
We also detect a few representative suspected mouse movements, including "copy", "paste", "blur" and "focus", which can indicate cheating behaviors. For example, a "copy" can reveal that the student copies question contents to search for answers if the "copy" is recorded before leaving the exam web page. A "paste" sometimes signifies that a student pastes an answer from another source, such as another website or electronic lecture notes; in particular, if a "paste" is recorded after returning to the exam web page, the student is most likely cheating. "Blur" and "focus" indicate that a student leaves the current exam web page and returns later. Since "copy" and "paste" are highly related to each other, we count the total number of "copy"s and "paste"s of Student $s$ as $n^s_c$. For the same reason, "blur" and "focus" are counted together as $n^s_b$. Besides, following the same normalization method used for head poses and bounding boxes, we also normalize each student's mouse positions to $(-1, 1)$.
(Standard score: https://en.wikipedia.org/wiki/Standard_score; three-sigma rule: https://encyclopediaofmath.org/wiki/Three-sigma_rule)
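Counting the paired events and normalizing positions can be sketched as below. This is illustrative only; the event encoding and example values are assumptions.

```python
def count_paired_events(events):
    """Count "copy"/"paste" together (n_c) and "blur"/"focus"
    together (n_b) for one student."""
    n_c = sum(1 for _, kind in events if kind in ("copy", "paste"))
    n_b = sum(1 for _, kind in events if kind in ("blur", "focus"))
    return n_c, n_b

def normalize_minmax(values, lo=-1.0, hi=1.0):
    """Min-max normalize values into [lo, hi]; constant input maps to 0."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [0.0 for _ in values]
    return [lo + (hi - lo) * (v - v_min) / (v_max - v_min) for v in values]

events = [(1.0, "copy"), (2.0, "blur"), (9.0, "focus"), (9.5, "paste")]
print(count_paired_events(events))            # → (2, 2)
print(normalize_minmax([0, 250, 500, 1000]))  # → [-1.0, -0.5, 0.0, 1.0]
```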
Overall Risk Estimation.
With all the suspected cases, we further estimate students' risk levels on each question. First, all the occurrence counts of suspected cases on Question $q$, $[n^s_{q,f}, n^s_{q,h}, n^s_{q,c}, n^s_{q,b}]$, are normalized to $(0, 1)$ with min-max normalization: for each type of suspected cheating behavior on Question $q$, the minimum number of instances of that type is transformed to 0, while the maximum number of instances is transformed to 1. In the rest of this paper, all values $[n^s_{q,f}, n^s_{q,h}, n^s_{q,c}, n^s_{q,b}]$ are the normalized ones. Finally, we summarize the overall risk of Student $s$ on Question $q$ as follows:
$$risk^s_q = \sum_{t \in \{f, h, c, b\}} w_t \times n^s_{q,t}, \quad (2)$$
where $w_t$ is a customizable weight for each type of suspected cheating behavior. The weights are all set to 1 by default, and our approach also enables proctors to interactively adjust them.

As mentioned above, we also propose straightforward and intuitive visual designs to help proctors explore and analyze student cheating behaviors based on students' exam videos and mouse movements.
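The weighted sum of Equation 2 can be sketched as follows. This is an illustrative reimplementation; the example counts and weights are arbitrary.

```python
def question_risk(counts, weights=None):
    """Overall risk of a student on one question (Equation 2):
    a weighted sum of the normalized counts of the four suspected
    types: face disappearance (f), abnormal head pose (h),
    copy and paste (c), blur and focus (b)."""
    weights = weights or {t: 1.0 for t in ("f", "h", "c", "b")}
    return sum(weights[t] * counts[t] for t in ("f", "h", "c", "b"))

# Normalized counts for one student on one question:
counts = {"f": 0.25, "h": 0.75, "c": 0.5, "b": 0.125}
print(question_risk(counts))                                    # → 1.625
print(question_risk(counts, {"f": 1, "h": 2, "c": 1, "b": 1}))  # → 2.375
```

Doubling the weight of abnormal head poses, as a proctor might do through the control panel, raises the risk score accordingly.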
Student List View.
Student List View (Figure 3(a)) is designed to provide proctors with an overview of the risk levels of all students who have participated in an online exam (R1).

Each row of Student List View visualizes the risk of one student and is composed of two main parts: a glyph and two diverging bar charts. The glyph (Figure 4(a)) shows the overall risk of the suspected types, and the diverging bar charts (Figure 4(b)) display the differences in cheating risk and time cost between the current student and all students. The glyph has two outer radial bar charts and two radar charts. The blue radar chart with circles on its vertices encodes the current student's normalized risk level of each suspected type as a spatial position, while the grey radar chart without vertex marks indicates the average normalized risk level of all students in the online exam. The green outer radial bar chart with the smaller radius shows the percentage of time used to finish the online exam, while the orange radial bar chart with the larger radius shows the percentage of scores. This design follows the effectiveness principle described by Munzner [27] by encoding the most important information, cheating risk, with a strong visual channel (i.e., spatial position) and encoding the less important information, time used and scores, with a weaker channel (i.e., angle). Also, we apply a boxplot-like design and visualize the 1st, 2nd and 3rd quartiles (see https://en.wikipedia.org/wiki/Quartile) of the normalized risk levels on each axis to provide proctors with more detailed analysis and enable easy comparison between the current student and peer students (R4). The range of each axis is (0, 1) from the center to the edge. An alternative design for plotting these statistical metrics, which we considered during our design process, is to draw all of them using radar charts. However, this would lead to severe occlusion if five radar charts were drawn within
Figure 3: Our approach provides novel visualizations for proctors to identify cheating cases. (a) Student List View is an overview of all students' risk of cheating in an online exam. (b) Question List View shows the risk levels of all questions finished by a student. (c) Behavior View presents a student's detailed head and mouse movements while answering a question; (c1) and (c2) are the upper detailed behavior chart and the suspected case chart, respectively. (d) Playback View enables proctors to check raw videos and an animated visualization of mouse movements on the exam web page. (e) The control panel can be used to select online exams and adjust several parameters. (f) provides a function to save screenshots of the raw video; screenshots in (f) are taken at the time points indicated by vertical black dashed lines in (c). (g1)-(g4) illustrate fast location and convenient verification of cheating behaviors in
Usage Scenario 1.
the glyph. Thus, we finally adopt the boxplot-like design to show these metrics.

Two diverging bar charts show the time spent on each question in green on the left and each question's overall estimated risk in red on the right. The risk of each question is calculated with Equation 2. The right side of both bar charts shows the average normalized value over all students, and the left side presents the normalized value of the current student. The length of each bar encodes the normalized value. The design of the diverging bar charts also aims to enable convenient comparison between a student and the others (R4).

Rich interactions are supported in Student List View. First, a tooltip displays the normalized risk level when hovering over the radar chart. Second, the control panel (Figure 3(e)) enables proctors to adjust a set of configurable parameters for risk estimation, such as the threshold of abnormal head poses and the weights of different suspected cheating types, as mentioned in Section 5.2. The sorting method and the selection of online exams can also be changed in the control panel. Third, proctors can click on the "plus" icon in the upper-left corner of each row to show the current student's Question List View.

Figure 4: The overview of a student's risk of cheating is shown by two main components: (a) a glyph showing the overall risk of all suspected types; (b) two diverging bar charts comparing the current student's and all students' overall risk and time spent on each question.

Question List View.
Question List View (Figure 3(b)) is designed to help proctors quickly locate high-risk questions, where a student may have cheated, for further investigation (R2).
In this view, each question is represented by a block. The color and border style encode whether the student correctly answered a particular question: green solid rectangles represent correctly answered questions, while gray dashed rectangles indicate incorrect answers. The width of a block encodes the estimated risk level of the question, which makes high-risk questions easier for proctors to notice. Within each block, a bar chart shows the normalized risk levels of the four suspected cheating types: "blur and focus" (b), "copy and paste" (p), "abnormal head pose" (h) and "face disappearance" (f). To support convenient comparison with all other students, we apply a boxplot-like design to indicate the 1st, 2nd and 3rd quartiles of the normalized risk levels (R4). The circle on each bar encodes the average cheating risk level among all students on the same question. Proctors can click a question block to expand the Behavior View of that question for detailed inspection. A selected question is highlighted with a thicker gold solid border for easy identification.

During our design process, we initially considered using a heatmap to visualize the cheating risk of each student's questions in the Student List View. Each block in the heatmap would represent a question in the online exam, with the opacity of the block denoting the question's risk. However, this design suffers from some disadvantages. First, the detailed risk distribution over the different types of suspected cases can hardly be represented in a single heatmap block. Second, due to the limited screen space, scalability is also a concern, especially when there is a large number of questions in an online exam. Thus, inspired by EgoSlider [41], we finally decided to use diverging bar charts to show the overall risk of each question in the Student List View and to further design the Question List View to present the detailed risk of questions.

Behavior View.
Behavior View (Figure 3(c)) provides proctors with a detailed understanding of how a student's head and mouse move during his/her problem-solving process, which enables fast location of suspected cases and further inspection (R3). This view consists of three types of charts: two detailed behavior charts in the middle, a suspected case chart between the two detailed behavior charts, and periphery heatmaps on the left and right sides. The detailed behavior charts show the detailed head and mouse movements, while the periphery heatmaps [24] are used to compare the behavior shown in the detailed behavior charts across students and questions.

In Figure 3(c), the upper detailed behavior chart (Figure 3(c1)) presents the mouse positions and the ranges of bounding boxes along the X-axis, while the lower detailed behavior chart shows the corresponding information along the Y-axis. The yaw angles of head poses are shown in the upper detailed behavior chart and the pitch angles in the lower one. In both detailed behavior charts, the brown dashed line shows the normalized mouse positions on the screen, while the dark green solid line encodes the normalized angles of head poses. The light blue area represents the range of the bounding boxes of head positions on one axis: the upper bound of the area is $x^{max}$ or $y^{max}$, while the lower bound is $x^{min}$ or $y^{min}$. However, as pointed out by Blascheck et al. [4], encoding movement data axis by axis has the drawback that extra mental effort is needed to understand the movements. To mitigate this problem, we also provide animated visualizations of mouse movements and raw videos in Playback View to help proctors understand how a student behaved during the online exam (Figure 3(d)). The suspected case chart (Figure 3(c2)) between the two detailed behavior charts shows the positions of all suspected cases.
We plot a bar for each suspected case detected by our suspected case detection engine, and the glyph on the bar denotes the type of the suspected cheating case.

Figure 5: An alternative design of the detailed behavior charts. It shows the same head and mouse movements as those in Figure 3(c). The brown dashed line and the dark green solid line represent mouse movements and head poses, respectively. Blue solid boxes encode head positions.
Before adopting the current design to present head and mouse movement data, we also considered a possible alternative. Inspired by scanpath visualizations of eye tracking data [4], we designed a view showing the original X-axis (horizontal) and Y-axis (vertical) coordinates of head positions and mouse positions, as shown in Figure 5. The yaw angles of head poses are shown together with the X-axis, since they reflect where the student is looking along the X-axis; for the same reason, the pitch angles are shown together with the Y-axis. In this design, the trajectory of mouse movements and the change of head poses are shown as lines, and the head positions are represented as bounding boxes. The opacity of lines and box borders encodes temporal information, where high opacity indicates that the head or mouse movement occurs at a later stage of the problem-solving process for the question. However, this design leads to severe visual clutter and occlusion when the amount of data is large, which makes it hard to follow the sequence of movements. To reduce the visual clutter and better encode the temporal information for easy location of suspected cases, we finally decided to employ our current design, which breaks the spatial movement information down into two dimensions and encodes them individually.

On the left and right sides of the detailed behavior charts, four periphery heatmaps sharing the Y-axes of the detailed behavior charts display other students' behaviors on the current question and the current student's behaviors on other questions, respectively. In each heatmap, three columns represent the frequency distributions of the lower bounds of head positions, the head poses and the upper bounds of head positions
from left to right. The color of each column matches the corresponding line chart in the detailed behavior chart, and its opacity encodes the frequency of head poses or head positions falling into a specific interval (e.g., the frequency of head poses with a value between 0 and 0.1). The heatmaps are designed to facilitate the comparison of student behaviors across students and questions (R4). Such comparisons are necessary for reliable cheating behavior analysis, since they enable proctors to consider students' habits and the specific questions they are working on, two important factors affecting student behavior.

Playback View.
Playback View (Figure 3(d)) provides proctors with the means to review suspected cases and further confirm whether a suspected case is real cheating, especially for ambiguous cases (R5). This is important, as the underlying suspected case detection algorithm often cannot achieve 100% accuracy, and some normal behaviors, for example drinking water, can be misclassified as cheating. The view also serves as a complement to the detailed behavior charts in Behavior View. It contains two parts: an animated mouse movement visualization at the top and a raw video player at the bottom. The animated mouse movement visualization uses a heatmap to show the number of times the mouse stays in an area; in its color scale, blue denotes few visits and red denotes frequent visits, and the opacity of an area with fewer visits is lower. In Playback View, the raw video player and the animated mouse movement visualization are linked to play synchronously. Additionally, they can be controlled by clicking on the detailed behavior charts to skip to a certain time point and start playing. The vertical blue solid line in Figure 3(c) denotes the current time point of the raw video and the animated visualization. The raw video player supports multiple interactions, including play/pause, skip and full-screen playback. Furthermore, proctors can click the "camera" button at the top right corner of Behavior View (Figure 3(c)) to take a screenshot of the current video; the screenshots are listed in the area below Playback View for easy review, as shown in Figure 3(f).

We extensively evaluate our approach through three usage scenarios, a user study and expert interviews.
In this section, we describe three usage scenarios to demonstrate the usefulness and effectiveness of our visual analytics system in facilitating the proctoring of online exams.
Scenario 1: Fast Location and Convenient Verification of Cheating Behaviors
In this scenario, we report a complete workflow for finding a cheating case by observing mouse movements in a convenient and reliable manner. First, we select "Exam B" in the control panel and sort students according to their level of risk. Then, we browse the Student List View to find students at high risk. Among these students, we find one whose risks of several types are higher than the median values in Figure 3(g1) (R4). To further confirm whether he really cheated in the online exam, we expand his Question List View and locate the two most suspected questions, mc_5 and mc_6, by observing the widths of the blocks and comparing the overall risk levels with other students in the Student List View (R2, R4). In the block of mc_6, we find a large number of suspected cases of abnormal head pose and "blur and focus". We then click on the block to further investigate the detailed behavior while answering that question in Behavior View (R3). In the Behavior View of this question, we first notice multiple abnormal head poses near the end of the question-answering process (Figure 3(g2)) in the suspected case chart. The line charts of head poses are further compared with the heatmaps on both sides to confirm that these abnormal head poses are caused neither by the question nor by the student's habits (R4). We then click on the detailed behavior charts at the beginning of those abnormal head poses to check the video in Playback View (R5). In the video, we find that he was actually drinking (Figure 3(g3)), so these abnormal head poses can be considered normal behavior at low risk of cheating. We also notice several "blur and focus" cases and a "copy and paste" during his period of answering mc_6 (Figure 3(g4)). To further confirm that the "copy and paste" cases were not conducted on the current page, we observe the line charts of mouse positions. These charts suggest that he moved to the boundary of the web page and stayed there for a while.
We are then quite confident that he left the web page and copied and pasted some materials, which is considered a cheating case in our online exam setting (R3).

This scenario demonstrates that our system can help proctors quickly locate and verify suspected cases. It also shows that mouse movement data provides a new perspective for finding cheating cases.

Scenario 2: Cheating Case Identification Through Detailed Inspection of Head Movements
In this scenario, we report a cheating case found through an in-depth inspection of head movements. In the student's Behavior View (Figure 6(a)), we notice that no suspected case was detected by our detection engine during that period. Thus we need to observe the detailed behavior charts to further learn whether he presents any risk of cheating. First, we notice a sudden change of head positions after his mouse stops moving. The bounding box becomes smaller, which means the distance between his face and the screen increases. We then check his line charts of head poses and find that, almost at the same time, his head is raised, and afterwards the pitch angles of his head poses frequently fall outside his normal range, as confirmed by comparing the detailed behavior chart with the heatmap (R4). We further check his video and find that he raised his head and seemed to look at something other than the laptop screen (R5), as Figures 6(c2)-(c4) show. Thus, this case is considered a potential cheating case. After checking his reported cheating behaviors, we learn that he tried to use another computer behind the laptop to search for answers, which matches our observation.

Since the detection engine fails to find these abnormal head poses, we investigate the setting of the threshold. We use the control panel in Figure 3(e) to lower the threshold from the default value (i.e., 3) to 2, which means that head poses with smaller variation than the default will also be considered suspected cases and the proctoring will be stricter. The suspected case chart after the adjustment is shown in Figure 6(d), where some suspected cases now appear. According to our observation, the positions of the glyphs match the moments when the student raised his head and looked somewhere other than the screen. Thus, we may conclude that an inappropriate threshold setting for this student led to no suspected cases being detected.
Figure 6: This figure illustrates how to identify cheating cases through detailed inspection of head movements in Usage Scenario 2. (a) shows part of the detailed behavior chart on the Y-axis and pitch angles, together with the suspected case chart in Behavior View (threshold of abnormal head poses = 3). (b) shows the periphery heatmap of the current student on other questions. (c1)-(c4) are screenshots taken while answering the question; vertical black dashed lines indicate the time points of the screenshots. (d) shows the same part of the suspected case chart after threshold adjustment (threshold of abnormal head poses = 2).

However, since different online exams may have different proctoring requirements, it is hard to define a unified threshold for abnormal head poses. We leave this as an option to provide proctors with sufficient flexibility in detecting suspected behaviors. Also, our suspected case detection only aims to provide references to proctors rather than directly deciding whether a student cheated in the online exam; proctors need to further observe the detailed behaviors to make final decisions.
Scenario 3: Cheating Case Identification through the Inconsistency between Mouse and Head Movements
In this scenario, a cheating case is identified through analysis of the inconsistency between head and mouse movements (R3) using our approach. We first check the detailed behavior chart on the X-axis and yaw angles in a student's Behavior View (Figure 7(a)) and quickly notice that the student's head and mouse movements are consistent initially (Figure 7(c1)) but diverge considerably in the later stage (Figure 7(c2)). From the consistency of his head and mouse movements in Figure 7(c1), we can learn that he kept looking at his cursor. Thus, the phenomenon in Figure 7(c2) can be abnormal: it indicates that either the student is not looking at the cursor or the student has left the current web page, since collecting mouse movement data on other web pages or applications is unavailable. We further investigate the suspected case chart, where two suspected cases of "blur and focus" draw our attention. These two cases happen at the beginning and the end of the period of inconsistency, which confirms that the student left the exam web page. By further considering the suspected case of "copy and paste" at the beginning of the period in Figure 7(c1), the whole cheating process can be inferred: the student copied the question content and searched for it or ran it in an IDE (Integrated Development Environment). By only viewing his video, his cheating behavior can hardly be detected, since there are almost no suspected head movements in the video, as Figures 7(b1)-(b4) show.

Figure 7: This figure shows the identification of a cheating case through the inconsistency between mouse and head movements in Usage Scenario 3. (a) is the detailed behavior chart on the X-axis and yaw angles in the Behavior View. (b1)-(b4) show the video screenshots taken while answering the question; vertical black dashed lines indicate the time points of the screenshots. (c1) shows the consistency of head and mouse movements. (c2) shows the inconsistency of head and mouse movements.
This scenario shows that our visualizations enable easy identification of cheating behaviors by exploring the inconsistency between head and mouse movements, which cannot be revealed by videos alone. It also demonstrates the effectiveness of introducing mouse movements into our proctoring system.
We also conducted a user study to quantitatively assess the effectiveness of our approach in facilitating the proctoring of online exams. Specifically, we evaluate the time cost and accuracy of finding cheating cases with our approach and compare it with a baseline approach (i.e., manually going through the exam videos).
Datasets and Tasks.
The two datasets in our user study were collected in our mock online exam, as described in Section 4. For each question set in the mock online exam, we picked 20 video clips from 6 students who cheated on that question set as one dataset. The total length of the video clips in both datasets was around 15 minutes. In each dataset, according to the reported cheating behaviors, we chose 10 video clips with cheating behaviors and another 10 without. Besides, we selected 4 more video clips as our demo dataset. Each dataset contained instances of both cheating types defined in Section 4.2.

In our user study, each participant was asked to perform two tasks sequentially.
CHI ’21, May 8–13, 2021, Yokohama, Japan. Li et al.

Task 1 is designed to compare the effectiveness and efficiency of our approach with the baseline approach for proctoring online exams, i.e., viewing the original exam videos of students. Each participant reviewed two datasets by viewing raw videos (i.e., the baseline method) or using our system to label cheating cases, respectively. In Task 1, to ensure a fair comparison, we showed each participant a simplified system with only the Question List View, Behavior View and Playback View. The reason why we removed the Student List View is that this view is designed to identify high-risk students, while the baseline method does not provide such functionality. A screenshot of the simplified system is shown in Figure 8. The goal of Task 2 is to let participants try the complete workflow of our system and evaluate the usability and visual design of our entire system and its individual views. Thus, we asked participants to freely explore our complete system and finish a questionnaire at the end.
Figure 8: The screenshot of the simplified system used in Task 1. Compared with the complete system, it removes the Student List View and adds some tools to record results, e.g., the toolbar at the top and the buttons above the questions. The toolbar is used to record the start and submission timestamps, and the buttons allow participants to select cheating cases.

Participants.
We recruited 16 postgraduate students (5 female, age mean = ., age sd = .10) from various departments, including Computer Science, Electronic Engineering, Economics and Environmental Science, in a local university through word-of-mouth and social media. All participants are or have been teaching assistants (TAs) in the university. Due to the pandemic of COVID-19, the study was conducted in a blended mode: some participants took our study face-to-face and others took it through online meetings. After the completion of the study, each participant was compensated with US $7.
Procedure.
The whole study lasted about 1 hour. At the beginning, we briefly introduced the purpose of the user study and what data would be collected during the procedure. We asked for participants' permission to use the collected data anonymously for research purposes. Then we introduced the whole procedure and held a tutorial session. In the tutorial session, we demonstrated the usage of our system, explained all suspected case types and then asked participants to try our simplified system with the demo dataset. After the tutorial session, they conducted Task 1 on different datasets by using the baseline method or our simplified system. The order of reviewing methods and datasets was counterbalanced to eliminate the effect brought by the differences in the datasets. The time limit for each dataset was 10 minutes, but participants were allowed to submit their results early once they had finished reviewing all the videos in the dataset and felt confident about their selections. The reason why our time limit was shorter than the total length of the video clips is that we would like to mimic the real procedure of proctoring, in which proctors do not have time to review all videos and sometimes skip some details to save time. We recorded the time used as the time between clicking a "Start" button and submitting the results. After submitting all reviewing results for Task 1, participants could take a short break before starting Task 2. In Task 2, we first gave a demo of the entire workflow and emphasized the design of the Student List View, since it was not shown in Task 1. Then the participants spent around 20 minutes freely exploring our complete system with all the collected data. After the exploration, they were asked to finish a questionnaire to evaluate our system. In our questionnaire, we adopted a bipolar survey design with negative statements at the left end of 5 scale points (1-5, with 1 as the most negative and 5 as the most positive) and positive statements at the right end.
Also, there were two text questions for suggestions and comparisons with the baseline approach. All questions are listed in Table 2.
Table 2: The first section of our questionnaire is designed to evaluate the usability (Q1-Q5) and the visual design (Q6-Q8) of the whole visual analytics system. The second section is designed to evaluate the usefulness and usability of the Student List View (Q9-Q12), Question List View (Q13-Q14) and Behavior View (Q15-Q19). The third section asks for personal opinions on our system (Q20-Q21). The original sentences without the words in brackets are the positive statements at the right end of the scale points, while the sentences with the words in brackets are the negative statements at the left end.

Q1 It is very easy (difficult) to use.
Q2 It is very easy (difficult) to learn.
Q3 I am very willing (unwilling) to use the system in the proctoring tasks.
Q4 I am very (not) confident in my selections using the system.
Q5 I will (will not) recommend the system to other TAs.
Q6 The visual design is easy (difficult) to understand.
Q7 The visual design provides enough (too little) information for me to find students who cheated.
Q8 The visual design and interactions can (cannot) help me find students who cheated.
Q9 It is very easy (difficult) to find students of high risk in the Student List View.
Q10 It is very easy (difficult) to know the distribution of cheating types of a student.
Q11 It is very easy (difficult) to know the distribution of risk of different questions of a student.
Q12 It is very easy (difficult) to know the overall time used and the time used for each question.
Q13 It is very easy (difficult) to select questions with the most suspected cases.
Q14 It is very easy (difficult) to know the distribution of cheating types of a student for each question.
Q15 This view can (cannot) help me better understand the suspected cases.
Q16 It is very easy (difficult) to know when there are suspected cases.
Q17 It is very easy (difficult) to know how a student moves his/her head and mouse in this view.
Q18 It is very easy (difficult) to compare the student's behavior with peers using the heatmap on the left side.
Q19 It is very easy (difficult) to compare the student's behavior with his/her behaviors on other questions using the heatmap on the right side.
Q20 Do you have any suggestions for our system?
Q21 What do you think the advantages and disadvantages of our system are over the baseline method?
Results of Task 1.
Quantitative results on the accuracy of labeling cheating cases and the total time for Task 1 are presented in Figure 9. Our method outperforms the baseline method in both the accuracy of labeling cheating cases and the total time cost of finishing the task (our method: 374.375 seconds, baseline: 463.938 seconds). However, the range of time used with our method is larger in Figure 9(b). According to our observation in the user study, the reason for this phenomenon is that some participants would like to view all videos to make sure our system is reliable enough, while others preferred to rely more on our suspected case detection results and skip some videos without suspected cases. To further demonstrate the efficiency and accuracy of our method, we conducted paired t-tests (df = 15) on participants' accuracy and time cost with both methods. The results of the t-tests suggest that the differences between our method and the baseline approach are statistically significant in terms of both accuracy and time cost.

Figure 9: Comparison of using our method and the baseline method in terms of (a) the accuracy of labeling cheating cases and (b) the total time cost for Task 1.
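The paired t-test used above can be reproduced with a few lines of standard-library Python. This is a generic sketch of the statistic, not our analysis code, and the sample values in the usage note are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t-statistic and degrees of freedom for two matched samples,
    e.g. per-participant accuracies under two proctoring methods."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean(d) / (s_d / sqrt(n)), with s_d the sample SD of the differences.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

For example, with hypothetical matched accuracies `paired_t([0.9, 0.8, 0.95, 0.85], [0.7, 0.75, 0.8, 0.6])` returns a positive t with 3 degrees of freedom, indicating the first method scored higher across participants.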
Since some experts mentioned that the false positive rate (a non-cheating case labeled as a cheating one) and the false negative rate (a cheating case labeled as a non-cheating one) are important in evaluating cheating detection, we also report them in Table 3. From the table, we can learn that both the average false positive rate and the average false negative rate of our method are lower than those of the baseline. Furthermore, the standard deviations of both rates of our method are also lower than those of the baseline.
Table 3: Comparison of our method and the baseline method in terms of the false positive rate and the false negative rate in Task 1. SD means standard deviation. The lower values are shown in bold.
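The false positive and false negative rates reported in Table 3 can be computed from participants' labels as follows. This is a generic sketch with hypothetical label vectors, not the actual study data.

```python
def error_rates(predicted, actual):
    """False positive and false negative rates for binary cheating labels.
    predicted/actual: sequences of booleans (True = labeled as cheating)."""
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    negatives = sum(not a for a in actual)
    positives = sum(a for a in actual)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr
```

Averaging these per-participant rates, and taking their standard deviations, yields the aggregate figures of the kind shown in Table 3.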
Results of Task 2.
The results of our questionnaire are presented in Figure 10. Overall, our system was highly rated by participants. They agreed that our system is quite convenient and efficient for proctoring. A participant commented that "Combining automated methods with visual analytics to facilitate the detection of abnormal cases vastly improves the efficiency and efficacy of proctoring online exams than simply watching the students' videos alone". Moreover, they found that the Student List View is quite useful and that it is easy to learn the information they needed, which complements our results on accuracy and time cost that mainly involve the other views. However, some participants worried that our system is not easy to learn, and their main concerns are about the Behavior View. Several participants commented that the detailed behavior charts are not easy to understand at first glance, since the movements on the X-axis and Y-axis are encoded individually. However, after comparing our visual design with the alternative design in Figure 5, they agreed that our design can present information more clearly by reducing occlusion, and they understood that we need to strike a balance between intuitiveness and the clearness of information. A participant suggested refining the legend and description of each column in the heatmaps to make them more understandable, which has already been done in the final version of our system.
Figure 10: The results of Q1-Q19 in our questionnaire. 1-5 represents "the most negative" to "the most positive". The number on each section shows the corresponding score.

We conducted in-depth interviews with four experts (P1, P2, P3, P5), who had been involved in our task analysis (Section 3), through online or face-to-face meetings. Each interview started with a brief introduction to the visual encoding and interactions in our system. Then we presented some cases to further illustrate the usage of our system. After that, the experts were invited to freely explore our system. They were encouraged to ask questions and comment on our system during the exploration. At the end of each interview, we asked the experts about their overall opinions of our system. Each interview lasted about 30-40 minutes, and all the interviews were recorded with the experts' permission. Due to personal reasons, P4 was not able to take an interview. Instead, we collected his feedback on our system through emails. Specifically, we sent him the link and the user guide to our system and invited him to freely explore the system. Then we asked about his opinion of our system's workflow and visual design. Overall, our system is highly appreciated by all the experts. In this section, we summarize their feedback in two categories: suspected case detection and visualization system.
Suspected Case Detection.
The performance of our suspected case detection engine is considered quite satisfactory by the experts. All of them agreed that it is quite innovative and useful to introduce mouse movements into online exam proctoring. They confirmed that mouse movements reveal rich information for cheating detection, such as leaving the current website. P2 and P5 thought our suspected case detection based on mouse movement data and the video recorded by a single webcam can help improve current online exam environment settings. They pointed out that university students were required to set up multiple webcams from different angles in online exams by themselves, which led to non-standard online exam settings and made it hard for proctors to find cheating cases. In their opinion, our method provides a simple and unified online exam setting that is more convenient for both students and proctors. P2 also commented that he would like to work with us to apply it in real online exams, since our mouse movement data collection module can be easily integrated into his learning management system. Considering the abnormal head movement detection, detection failure was a common concern of P1, P3 and P5. However, after learning that the model we use can accurately estimate the head poses of a student wearing a face mask, they believed that our detection is quite reliable and helpful for finding abnormal head movements.

The experts also provided some valuable suggestions for our suspected case detection. P1 and P2 suggested that our detection engine should be able to handle more online exam settings, for example, setting up an extra webcam to record videos of body movement. We believe that this could be handled by adding extra detection modules to our engine. However, due to the limitations of our dataset, this part is left as future work. P4 mentioned that we could try to extend the mouse movement collection plugin to the whole operating system and try to learn what the student did after leaving the web page.
Visualization System.
Overall, our visualization system is appreciated by all the experts for its usefulness and usability. All the experts believed that our visualization system is quite intuitive and helpful in finding cheating cases by providing different levels of views. P1 liked the design of our system, and he said that the workflow is "well-designed and intuitive". P2 commented that the overall UI design of our system is quite clear and professional. He quickly learned how to use our system and appreciated our interactions, such as starting the video from a certain point by clicking on the detailed behavior charts (R5). He also thought that our system, especially the summary views (Student List View and Question List View), is able to help proctors locate cheating cases very fast and greatly reduce the proctors' workload of viewing the long and dull videos with great concentration (R1, R2). P3 also commented, "Currently we use a brute-force method to reviewing videos, the recommendation function provided by your system is very helpful". By "recommendation", he referred to the display of the risk levels of students and questions in our Student List View and Question List View (R1, R2). Besides, P3 confirmed that teachers can easily get enough information and find cheating cases from the detailed behavior charts and heatmaps in our Behavior View (R3, R4), but it might be overwhelming for student TAs. He suggested providing a simplified version for student TAs, which is left as future work.
Privacy concerns in live video streaming-based education have been recognized and discussed [8], such as the unauthorized usage of video data by third parties and the unexpected exposure of the living environment. Almost all online proctoring methods, including ours, face similar concerns. For example, proctors need to record and check videos during and after online exams to identify possible cheating cases and guarantee the justice of online exams. However, to ensure a fair evaluation of students' performance, keeping the integrity of exams is crucial and is actually the responsibility of teachers [32]. Meanwhile, due to the lack of face-to-face interactions, it is much more challenging to maintain the integrity of online exams than that of traditional classroom-based exams [16, 30-32]. Thus, proctors need to balance academic integrity and privacy concerns in online exams. We believe that the method proposed in this paper can be regarded as an initial effort in striking a balance between academic integrity and privacy concerns: it provides a more effective and efficient way for online exam proctoring. Meanwhile, strict measures are also needed to further address privacy concerns when it is used in real online exams. For example, to avoid unauthorized data usage, proctors need to set up a secure infrastructure and comprehensive regulations to store, use and delete the collected data appropriately. Before the online exam, the detailed methods of data collection, processing, analysis and destruction should be revealed to students. Then students' consent to recording video and mouse movement data needs to be obtained. During the online exam, the usage of a virtual background can be permitted to hide bystanders and the living environment in videos. After the online exam, once all the cheating reviews are done, the data should be destroyed permanently.
As stated in Section 4.1, we conducted a mock online exam to collect the mouse movement data, video data and cheating behavior labels, due to the lack of existing datasets and the difficulty of collecting cheating labels in real online exams. Though we tried to mimic a real online exam environment and used compensation to encourage students to take the mock exam seriously, our dataset may not incorporate all possible cheating cases and all online exam settings. First, the types of cheating cases in our dataset are limited. In our mock online exam, since the benefit and risk of cheating were not as large as those in real online exams, participants mostly adopted common means of cheating. In real online exams, there are more advanced methods of cheating, and some of them may be even harder to detect from head and mouse movements, for example, using an earphone to listen to answers. Our approach shows satisfactory performance in detecting common cheating behaviors in our dataset, like searching for answers through the Internet and using paper materials, but its ability to deal with other cheating behaviors needs further evaluation. Second, our mock online exam for cheating behavior data collection was conducted in the strictest closed-book setting, while real online exams may be conducted in other manners. In different online exam settings, cheating behaviors can be different. For example, some open-book online exams may allow the usage of paper materials but prohibit the usage of search engines. To accommodate different settings, we enable customized risk calculation to support adaptively filtering cheating behaviors, as mentioned in Section 5.2. As an initial exploration of using visual analytics techniques to facilitate the proctoring of online exams, we focus on the common cheating behaviors in closed-book online exams and leave further research on other cheating behaviors as future work. To better address the limitations of the data, more real-world datasets can be collected to extensively evaluate our system in the future.
In our interviews, P1 and P2 commented that they would like to use our system for real-time proctoring, since teachers' intervention in cheating behaviors during the online exam is quite important. Currently, our method is used by proctors to review videos after the online exam. However, it has the potential to be used for real-time proctoring if some issues can be addressed. First, sufficient computational resources are required for real-time head pose estimation in our proctoring method. In our approach, most of the computational resources are consumed by extracting head poses from videos. According to our experiment, an Nvidia Titan Xp GPU is needed to extract one student's head poses in a real-time manner using the current model. Since computational resources like GPUs are expensive, it may not always be easy to provide enough resources to estimate a large number of students' head poses in online exams. A possible method to mitigate the high demand for computational resources is to apply lightweight deep learning models like MobileNetV3 [14] in our head pose estimation. However, compared with more complicated deep learning models like the ResNet-50 in Hopenet [35], lightweight deep learning models may result in a performance drop. Thus, users need to strike a balance between efficiency and accuracy when selecting models for real-time proctoring. Second, streaming mouse movement and video data need to be handled. In our method, we compute several statistical metrics (e.g., the average yaw angle) using the complete videos and mouse movement data. When switching to a real-time mode, these values can be computed using sliding windows on the streaming data instead.
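The sliding-window computation mentioned for the real-time mode can be sketched as follows. This is an illustrative fragment, not part of our system; the window size is an arbitrary assumption, and the same pattern applies to other per-window statistics.

```python
from collections import deque

class SlidingStats:
    """Rolling mean over the last `window` samples of a streaming signal,
    e.g. yaw angles arriving frame by frame during a live exam."""

    def __init__(self, window):
        self.buf = deque(maxlen=window)
        self.total = 0.0

    def push(self, value):
        # When the buffer is full, the oldest sample is about to be evicted,
        # so remove its contribution from the running total first.
        if len(self.buf) == self.buf.maxlen:
            self.total -= self.buf[0]
        self.buf.append(value)
        self.total += value
        return self.total / len(self.buf)
```

Each incoming frame's yaw angle would be pushed into such a buffer, and the returned rolling mean compared against a threshold to raise suspected cases on the fly.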
Our approach is designed for the proctoring of online exams. However, it is not limited to this application and can be extended to others. For example, the coach of an E-sports team can analyze a player's mouse movements to evaluate the effectiveness of each action and his/her response time. Also, the design of our Behavior View can facilitate the presentation of other types of spatial-temporal data for in-depth analysis, for example, eye tracking data.

Also, the scalability of our system needs further discussion. From the perspective of processing speed, abnormal head pose detection may take a long time or require much computational power when the number of students in the online exam is large. As we discussed in Section 7.3, a possible method to mitigate this issue is to apply lightweight deep learning models. From the perspective of visual design, when an online exam is taken by many students or the number of suspected cases is too large, our system also suffers from scalability issues. Though proctors can sort the Student List View to easily locate suspected students, it is still hard when the number of students is too large. A possible solution is to apply hierarchical visualization methods to first group students and then expand groups for individual inspection on demand. Also, if the number of suspected cases is too large, the glyphs between the two line charts in our Behavior View may overlap severely due to the limited screen size. A possible solution is to aggregate all suspected cases during a particular period (e.g., 30 seconds) into a pie chart.
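The 30-second aggregation suggested above amounts to a simple binning step. The sketch below is a hypothetical illustration; the data layout (a list of (timestamp, case type) pairs) is an assumption for this example.

```python
from collections import Counter, defaultdict

def aggregate_cases(cases, bin_seconds=30):
    """Group (timestamp_seconds, case_type) pairs into fixed-width time bins,
    yielding per-bin type counts suitable for rendering as small pie charts."""
    bins = defaultdict(Counter)
    for t, case_type in cases:
        bins[int(t // bin_seconds)][case_type] += 1
    return dict(bins)
```

Each resulting bin maps a 30-second interval to the counts of suspected case types occurring in it, so one pie glyph can replace many overlapping individual glyphs.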
Online exams have become increasingly popular for course instructors to assess the knowledge of students and other test takers. However, it remains unclear how to conveniently and effectively proctor online exams. In this paper, we propose a visual analytics approach to achieve convenient, efficient and reliable online proctoring. It consists of two major modules, a suspected case detection engine and a visualization module, which first process students' videos and mouse movement data during the online exam and further visualize them in three levels of detail. We extensively evaluate our approach through three usage scenarios, a user study and in-depth interviews with experts. The results confirm the usefulness and effectiveness of our approach in enabling convenient, efficient and reliable proctoring for online exams.

In future work, we plan to improve our visual analytics approach for real-time proctoring and further evaluate our system in real-world online exams. Also, it will be interesting to explore how to apply visualization techniques to reduce cheating behaviors in online exams.
ACKNOWLEDGMENTS
This work is partially sponsored by the Innovation and Technology Fund (ITF) with No. ITS/388/17FP. Yong Wang is the corresponding author. We would like to thank all participants in our mock online exam and the user study, all experts for their valuable opinions, all anonymous reviewers for their feedback, and Zezheng Feng and Jiakai Wang for their proofreading.
REFERENCES
[1] Ernesto Arroyo, Ted Selker, and Willy Wei. 2006. Usability tool for analysis of web designs using mouse tracks. In CHI '06 Extended Abstracts on Human Factors in Computing Systems, Montréal, Québec, Canada, April 22-27, 2006. ACM, New York, NY, USA, 484–489. https://doi.org/10.1145/1125451.1125557
[2] Yousef Atoum, Liping Chen, Alex X. Liu, Stephen D. H. Hsu, and Xiaoming Liu. 2017. Automated online exam proctoring. IEEE Transactions on Multimedia 19, 7 (2017), 1609–1624. https://doi.org/10.1109/TMM.2017.2656064
[3] Tadas Baltrusaitis, Peter Robinson, and Louis-Philippe Morency. 2012. 3D constrained local model for rigid and non-rigid facial tracking. IEEE, New York, NY, USA, 2610–2617. https://doi.org/10.1109/CVPR.2012.6247980
[4] Tanja Blascheck, Kuno Kurzhals, Michael Raschke, Michael Burch, Daniel Weiskopf, and Thomas Ertl. 2017. Visualization of eye tracking data: a taxonomy and survey. Computer Graphics Forum 36, 8 (2017), 260–284. https://doi.org/10.1111/cgf.13079
[5] Eli T. Brown, Alvitta Ottley, Helen Zhao, Quan Lin, Richard Souvenir, Alex Endert, and Remco Chang. 2014. Finding Waldo: Learning about users from their interactions. IEEE Transactions on Visualization and Computer Graphics 20, 12 (2014), 1663–1672. https://doi.org/10.1109/TVCG.2014.2346575
[6] Stefano Burigat, Luca Chittaro, and Lucio Ieronutti. 2008. Mobrex: Visualizing users' mobile browsing behaviors. IEEE Computer Graphics and Applications.
[7] In CHI '01 Extended Abstracts on Human Factors in Computing Systems, Seattle, Washington, USA, March 31 - April 5, 2001. ACM, New York, NY, USA, 281–282. https://doi.org/10.1145/634067.634234
[8] Xinyue Chen, Si Chen, Xu Wang, and Yun Huang. 2020. 'I was afraid, but now I enjoy being a streamer!': Understanding the challenges and prospects of using live video streaming for online education. In the 23rd ACM Conference on Computer-Supported Cooperative Work and Social Computing, CSCW 2020, Virtual Event. ACM, New York, NY, USA, 1–32.
[9] Chia Yuan Chuang, Scotty D. Craig, and John Femiani. 2017. Detecting probable cheating during online assessments based on time delay and head pose. Higher Education Research & Development 36, 6 (2017), 1123–1137. https://doi.org/10.1080/07294360.2017.1303456
[10] Lynne Cooke. 2006. Is the mouse a "poor man's eye tracker"?. In the 53rd International STC Conference, Las Vegas, Nevada, Vol. 53. STC, Fairfax, VA, USA, 252.
[11] Gennaro Costagliola, Vittorio Fuccella, Massimiliano Giordano, and Giuseppe Polese. 2009. Monitoring online tests through data visualization. IEEE Transactions on Knowledge and Data Engineering 21, 6 (2009), 773–784. https://doi.org/10.1109/TKDE.2008.133
[12] Mark Grimes and Joseph Valacich. 2015. Mind over mouse: the effect of cognitive load on mouse movement behavior. AIS, Atlanta, GA, USA, 1–13. http://aisel.aisnet.org/icis2015/proceedings/HCI/9
[13] Olivia Holden, Valerie A. Kuhlmeier, and Meghan Norris. 2020. Academic integrity in online testing: A research review. https://doi.org/10.31234/osf.io/rjk7g
[14] Andrew Howard, Ruoming Pang, Hartwig Adam, Quoc V. Le, Mark Sandler, Bo Chen, Weijun Wang, Liang-Chieh Chen, Mingxing Tan, Grace Chu, Vijay Vasudevan, and Yukun Zhu. 2019. Searching for MobileNetV3. IEEE, New York, NY, USA, 1314–1324. https://doi.org/10.1109/ICCV.2019.00140
[15] Ahmad Khawaji, Fang Chen, Jianlong Zhou, and Nadine Marcus. 2014. Trust and cognitive load in the text-chat environment: the role of mouse movement. In the 26th Australian Computer-Human Interaction Conference on Designing Futures, OZCHI '14, Sydney, New South Wales, Australia, December 2-5, 2014. ACM, New York, NY, USA, 324–327. https://doi.org/10.1145/2686612.2686661
[16] Darwin L. King and Carl J. Case. 2014. E-cheating: Incidence and trends among college students. Issues in Information Systems 15, 1 (2014), 20–27.
[17] Felix Kuhnke and Jorn Ostermann. 2019. Deep head pose estimation using synthetic images and partial adversarial domain adaption for continuous label spaces. IEEE, New York, NY, USA, 10163–10172. https://doi.org/10.1109/ICCV.2019.01026
[18] Luis A. Leiva and Roberto Vivó. 2012. Interactive hypervideo visualization for browsing behavior analysis. In the 21st World Wide Web Conference, WWW 2012, Lyon, France, April 16-20, 2012 (Companion Volume). ACM, New York, NY, USA, 381–384. https://doi.org/10.1145/2187980.2188054
[19] Haotian Li, Huan Wei, Yong Wang, Yangqiu Song, and Huamin Qu. 2020. Peer-inspired student performance prediction in interactive online question pools with graph neural network. In the 29th ACM International Conference on Information & Knowledge Management, CIKM 2020, Virtual Event, Ireland. ACM, New York, NY, USA, 2589–2596. https://doi.org/10.1145/3340531.3412733
[20] Xuanchong Li, Kai-min Chang, Yueran Yuan, and Alexander Hauptmann. 2015. Massive open online proctor: Protecting the credibility of MOOCs certificates. In the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, CSCW 2015, Vancouver, BC, Canada. ACM, New York, NY, USA, 1129–1137. https://doi.org/10.1145/2675133.2675245
[21] Yiqun Liu, Ye Chen, Jinhui Tang, Jiashen Sun, Min Zhang, Shaoping Ma, and Xuan Zhu. 2015. Different users, different opinions: Predicting search satisfaction with mouse movement information. In the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, Santiago, Chile, August 9-13, 2015. ACM, New York, NY, USA, 493–502. https://doi.org/10.1145/2766462.2767721
[22] Timothy B. Michael and Melissa A. Williams. 2013. Student equity: Discouraging cheating in online courses. Administrative Issues Journal 3, 2 (2013), 6.
[23] Gosia Migut, Dennis Koelma, Cees G. M. Snoek, and Natasa Brouwer. 2018. Cheat me not: Automated proctoring of digital exams on bring-your-own-device. In the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education, ITiCSE 2018, Larnaca, Cyprus. ACM, New York, NY, USA, 388. https://doi.org/10.1145/3197091.3205813
[24] Bryce Morrow, Trevor Manz, Arlene E. Chung, Nils Gehlenborg, and David Gotz. 2019. Periphery plots for contextualizing heterogeneous time-based charts. In the 30th IEEE Visualization Conference, IEEE VIS 2019 - Short Papers, Vancouver, BC, Canada, October 20-25, 2019. IEEE, New York, NY, USA, 1–5. https://doi.org/10.1109/VISUAL.2019.8933582
[25] James Moten Jr, Alex Fitterer, Elise Brazier, Jonathan Leonard, and Avis Brown. 2013. Examining online college cyber cheating methods and prevention measures. Electronic Journal of E-learning 11, 2 (2013), 139–146.
[26] Florian Mueller and Andrea Lockerd. 2001. Cheese: Tracking mouse movement activity on websites, a tool for user modeling. In CHI '01 Extended Abstracts on Human Factors in Computing Systems, Seattle, Washington, USA, March 31 - April 5, 2001. ACM, New York, NY, USA, 279–280. https://doi.org/10.1145/634067.634233
[27] Tamara Munzner. 2014. Visualization Analysis and Design. CRC Press, Boca Raton, FL, USA.
[28] Athi Narayanan, Ramachandra Mathava Kaimal, and Kamal Bijlani. 2014. Yaw estimation using cylindrical and ellipsoidal face models. IEEE Transactions on Intelligent Transportation Systems 15, 5 (2014), 2308–2320. https://doi.org/10.1109/TITS.2014.2313371
[29] Swathi Prathish, Athi Narayanan S, and Kamal Bijlani. 2016. An intelligent system for online exam monitoring. AIS, Atlanta, GA, USA, 138–143.
[30] Diane J. Prince, Richard A. Fulton, and Thomas W. Garsombke. 2009. Comparisons of proctored versus non-proctored testing strategies in graduate distance education curriculum. Journal of College Teaching & Learning (TLC) 6, 7 (2009), 51–62.
[31] Ronny Richardson and Max North. 2013. Strengthening the trust in online courses: a common sense approach. Journal of Computing Sciences in Colleges.
[32] Journal of Computing Sciences in Colleges 22, 2 (2006), 206–212.
[33] Joseph Roth, Xiaoming Liu, and Dimitris N. Metaxas. 2014. On continuous user authentication via typing behavior. IEEE Transactions on Image Processing.
[34] IEEE Transactions on Information Forensics and Security 10, 2 (2015), 333–345. https://doi.org/10.1109/TIFS.2014.2374424
[35] Nataniel Ruiz, Eunji Chong, and James M. Rehg. 2018. Fine-grained head pose estimation without keypoints. IEEE, New York, NY, USA, 2074–2083. https://doi.org/10.1109/CVPRW.2018.00281
[36] Nataniel Ruiz and James M. Rehg. 2017. Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container. CoRR abs/1708.04370 (2017), 1–5. arXiv:1708.04370 http://arxiv.org/abs/1708.04370
[37] Erica Southgate, Karen Blackmore, Stephanie Pieschl, Susan Grimes, Jessey McGuire, and Kate Smithers. 2019. Artificial intelligence and emerging technologies in schools. https://docs.education.gov.au/system/files/doc/other/aiet_final_report_august_2019.pdf
[38] Abrar Ullah, Hannan Xiao, and Trevor Barker. 2016. A classification of threats to remote online examinations. IEEE, New York, NY, USA, 1–7.
[39] Huan Wei, Haotian Li, Meng Xia, Yong Wang, and Huamin Qu. 2020. Predicting student performance in interactive online question pools using mouse interaction features. In the 10th International Conference on Learning Analytics and Knowledge, Frankfurt, Germany, March 23-27, 2020. ACM, New York, NY, USA, 645–654. https://doi.org/10.1145/3375462.3375521
[40] Aoyu Wu and Huamin Qu. 2020. Multimodal analysis of video collections: Visual exploration of presentation techniques in TED talks. IEEE Transactions on Visualization and Computer Graphics 26, 7 (2020), 2429–2442. https://doi.org/10.1109/TVCG.2018.2889081
[41] Yanhong Wu, Naveen Pitipornvivat, Jian Zhao, Sixiao Yang, Guowei Huang, and Huamin Qu. 2016. egoSlider: Visual analysis of egocentric network evolution. IEEE Transactions on Visualization and Computer Graphics 22, 1 (2016), 260–269. https://doi.org/10.1109/TVCG.2015.2468151
[42] Meng Xia, Reshika Palaniyappan Velumani, Yong Wang, Huamin Qu, and Xiaojuan Ma. 2020. QLens: Visual analytics of multi-step problem-solving behaviors for improving question design. CoRR abs/2009.12833 (2020), 1–11. arXiv:2009.12833 https://arxiv.org/abs/2009.12833
[43] Meng Xia, Huan Wei, Min Xu, Leo Yu Ho Lo, Yong Wang, Rong Zhang, and Huamin Qu. 2019. Visual analytics of student learning behaviors on K-12 mathematics E-learning platforms. CoRR abs/1909.04749 (2019), 1–2. arXiv:1909.04749 http://arxiv.org/abs/1909.04749
[44] Tsun-Yi Yang, Yi-Ting Chen, Yen-Yu Lin, and Yung-Yu Chuang. 2019. FSA-Net: Learning fine-grained structure aggregation for head pose estimation from a single image. IEEE, New York, NY, USA, 1087–1096. https://doi.org/10.1109/CVPR.2019.00118
[45] Haipeng Zeng, Xinhuan Shu, Yanbang Wang, Yong Wang, Liguo Zhang, Ting-Chuen Pong, and Huamin Qu. 2020. EmotionCues: Emotion-oriented visual summarization of classroom videos. https://doi.org/10.1109/TVCG.2019.2963659
[46] Haipeng Zeng, Xingbo Wang, Aoyu Wu, Yong Wang, Quan Li, Alex Endert, and Huamin Qu. 2020. EmoCo: Visual analysis of emotion coherence in presentation videos.
IEEE Transactions on Visualization and Computer Graphics
26, 1 (2020),927–937. https://doi.org/10.1109/TVCG.2019.2934656[47] Arkady Zgonnikov, Andrea Aleni, Petri T. Piiroinen, Denis O’Hora, and Mario diBernardo. 2017. Decision landscapes: Visualizing mouse-tracking data.