DemSelf, a Mobile App for Self-Administered Touch-Based Cognitive Screening: Participatory Design With Stakeholders

Abstract

Early detection of mild cognitive impairment and dementia is vital as many therapeutic interventions are particularly effective at an early stage. A self-administered touch-based cognitive screening instrument, called DemSelf, was developed by adapting an examiner-administered paper-based instrument, the Quick Mild Cognitive Impairment (Qmci) screen. We conducted five semi-structured expert interviews including a think-aloud phase to evaluate usability problems. The extent to which the characteristics of the original subtests change by the adaption, as well as the conditions and appropriate context for practical application, were also in question. The participants had expertise in the domain of usability and human-machine interaction and/or in the domain of dementia and neuropsychological assessment. Participants identified usability issues in all components of the DemSelf prototype. For example, confirmation of answers was not consistent across subtests. Answers were sometimes logged directly when a button is tapped and cannot be corrected. This can lead to frustration and bias in test results, especially for people with vision or motor impairments. The direct adoption of time limits from the original paper-based instrument or the simultaneous verbal and textual item presentation also caused usability problems. DemSelf is a different test than Qmci and needs to be re-validated. Visual recognition instead of a free verbal recall is one of the main differences. Reading skill level seems to be an important confounding variable. Participants would generally prefer if the test is conducted in a medical office rather than at a patient's home so that someone is present for support and the result can be discussed directly.


Martin Burghart, Julie L. O'Sullivan, Robert Spang, and Jan-Niklas Voigt-Antons

Quality and Usability Lab, Technische Universität Berlin, Berlin, Germany
Institut für Medizinische Soziologie und Rehabilitationswissenschaft, Charité - Universitätsmedizin Berlin, Berlin, Germany
German Research Center for Artificial Intelligence (DFKI), Berlin, Germany

This paper has been accepted for publication in the Human-Computer Interaction International conference 2021. The final authenticated version is available online at https://doi.org/[added after release of paper].

Keywords:

Mild cognitive impairment · Dementia · Computerized cognitive screening · Usability · Self-assessment

Currently there are no pharmacological treatments available for mild cognitive impairment (MCI) and most causes of dementia [26,17]. The diagnosis of potentially progressive cognitive impairment can cause fears, and affected people often feel stigmatized and marginalized by society. These arguments can be put forward against an early diagnosis of MCI and dementia. Nevertheless, in the view of most experts, the arguments for a diagnosis outweigh them [13]: every patient has the right to be honestly informed about his or her health status. A diagnosis along with professional consultation can help family members better understand the behaviors they are experiencing. Early detection of MCI and dementia is vital as many therapeutic and preventive approaches, such as consultation, cognitive training, and physical exercise, are particularly effective at an early stage [26]. Such approaches can reduce the need for care and the burden on the patient and their caregivers.

Hence, there is a need for fast, reliable, and affordable screening instruments. The use of mobile devices such as tablets promises to better meet these requirements. Possible advantages include better standardization, automatic scoring, a digital test result, dynamic adaptation to the test person, or more accurate measurement of time and other factors. A person can also perform some tests without a trained medical professional, making cognitive testing less expensive and more accessible [4,32]. DemSelf is an attempt to adapt a validated paper-based instrument so that it can be performed independently on a touch device. An expert evaluation provides insights into usability issues, differences from the original test, and the appropriate context of use.

For an average person, cognitive decline begins in the third decade and continues throughout life [18]. Cognitive aging affects domains such as attention, memory, executive functions, language, and visuospatial abilities [20]. Dementia is a brain syndrome associated with a decline of cognitive functioning that is "not entirely attributable to normal aging and significantly interferes with independence in the person's performance of activities of daily living" [35]. The most common form of dementia is due to Alzheimer's disease (AD), but dementia can occur in a number of different circumstances and diseases, such as vascular diseases, Parkinson's disease, or Lewy body dementia [17]. Cognitive decline is the clinical hallmark in dementia and memory impairment is the most prominent symptom in most patients. A key feature of dementia is that everyday skills such as using public transport or handling money are affected. AD in particular has a progressive course. In severe dementia, patients are almost completely dependent on help from others.

In the course of research on cognitive aging, some people have been found to be in a "gray area between normal cognitive aging and dementia" [33, p. 370]. A person shows cognitive decline that is beyond normal aging, but functional activities of daily living are not affected. MCI is one of the most widely used and empirically studied terms to describe this state. A precise and universal definition of MCI has not yet been established [17]; see [33] for a discussion on similar concepts such as Mild Neurocognitive Disorder. In some cases, MCI can be a pre-clinical stage of dementia, particularly AD [17]. Early identification of MCI would allow interventions at an early stage which may influence the course of the disease. However, people with MCI often remain undiagnosed [7].

An early step in diagnosing cognitive impairment is often a brief cognitive screening instrument. The administration usually takes only a few minutes and allows assessment of different cognitive domains such as attention, memory, orientation, language, executive functions, or visuospatial abilities. A cognitive screening instrument provides information on the presence and severity of cognitive impairment in patients with suspected dementia or MCI. There is a multitude of different cognitive screening instruments. Recent systematic reviews and meta-analyses are available [34,2,27]. The right choice depends on numerous factors: "Clinicians and researchers should abandon the idea that one screening instrument ... can be used in every setting, for all different neurodegenerative diseases and for each population." [27, p. 11].

Most available cognitive screening instruments are primarily paper-based and administered by healthcare professionals. However, computers, tablets, or similar devices can also be used to administer, score, or interpret a cognitive test. Computerized cognitive screening instruments offer several potential advantages over paper-based instruments, such as increased standardization of scoring, ease of administering in different languages, reduced costs, remote testing, adaption, and more precise measurement of time- and location-sensitive tasks [4,32]. The self-administered web-based test Brain on Track, for example, uses random elements to minimize learning effects in longitudinal tests [28]. Other computerized tests are designed as mini-games to achieve a more relaxed testing environment and to reduce the drop-out rates in longitudinal testing [37], or use scoring algorithms similar to principal components analysis to improve sensitivity [29]. Individual factors like technical experience, attitudes towards technology, or non-cognitive impairments such as motor or sensory disabilities may affect interaction with the computer interface and bias the test result. A computerized screening instrument might therefore not be suitable for certain people. Test developers should consider such factors during validation and report their influences on the test [4].

Computerized cognitive assessment already has a long tradition and a variety of instruments is available [38,2,27]. Nevertheless, they are still used much less frequently in practice than paper-based tests [32]. The lack of normative population data is one problem, making it difficult to choose the right instrument. Usability is another crucial aspect in self-assessment by elderly users with possible cognitive and other impairments, posing a potential barrier to the practical use of self-assessment instruments. Mobile devices with touch displays are often used for computer-based testing as they allow direct manipulation of information, which can feel very natural to the user [8]. However, most applications do not take into account the needs of the elderly. Several age-related factors affect interaction with touch interfaces in mobile applications [3,10]. Older people can have trouble identifying thin lines, reading small and low-contrast text, or distinguishing between visually similar icons. Fine movements, time-critical gestures (e.g. double taps), or complex multi-touch gestures can also be problematic.

Other studies have shown that mobile devices such as tablets can be successfully integrated into the lives of people with dementia, for example, to assess quality of life [12], to improve quality of life via cognitive training games or communication with staff and family members [6,1], or to improve outpatient dementia care by fostering guideline-based treatment [15,14]. Usability aspects should be considered early on in the development of a new instrument. Some authors have chosen cognitive training exercises as subtests where high usability has already been confirmed [28]. In general, however, the usability must be evaluated specifically for the new instrument. Various methods of expert-based evaluation and tests with users are available for this purpose. For example, some authors organized focus groups with doctors and healthy older people and asked them to rate the usability of their instrument with a list of 12 statements based on the System Usability Scale [37]. Usually, several cycles of re-design and evaluation are necessary to achieve high usability, since not all problems are discovered in one pass and a new design may also reveal or create new problems [21].

We conducted five semi-structured interviews with usability and domain experts to determine which usability problems exist in the DemSelf app and how they could be solved. The extent to which the characteristics of the original subtests change by the adaption, as well as the conditions and appropriate context for practical application, were also in question.

The participants had expertise in the domain of usability and human-machine interaction and/or in the domain of dementia and neuropsychological assessment. The average age was 29.6 years (SD = 1.34). All participants were female and had a university degree as their highest completed level of education. Three participants worked as research assistants in an MCI-research domain, while two participants were professionals in usability and user experience. Participants had an average of 7.2 years (SD = 2.5) of professional experience in human-machine interaction or usability. Three participants reported practical experience in the diagnosis of MCI and dementia. Only one participant reported practical experience with computerized digital screening instruments. All participants reported practical experience in assessing the usability of touch-based software. Two participants reported practical experience in assessing the usability of touch-based software specifically for elderly people.

Four interviews were conducted in person. Participants operated the DemSelf app on an iPad (6th Generation). One interview was conducted remotely via telephone call. Here, the app was simulated in Xcode on an iPad (6th Generation) and operated remotely via TeamViewer. The average interview duration was 71 minutes.

Participants were informed about dementia, MCI, and the purpose of cognitive screening instruments. Each participant then completed the DemSelf testing process twice. We first presented a scenario in which an elderly patient is asked by a physician to perform the test alone. Participants were asked to put themselves in that person's position and to solve the test as correctly as possible. We encouraged participants to think aloud during the first round. This procedure is a form of cognitive walkthrough [16] in the sense that a usage context, a user, and a task were specified. The goal was to uncover usability problems when the test is performed for the first time as this is the most common use of a screening instrument.

In the second round, participants could try alternative inputs and make typical mistakes. We asked participants to clarify comments made in the first round of thinking aloud and asked further questions regarding usability and differences between the original and the adapted subtests. After completing the second round, participants could comment on the adequate context of use and conditions for using the app as a cognitive screening instrument.

We developed a self-administered touch-based cognitive screening instrument, called DemSelf, by adapting an examiner-administered paper-based instrument, the Quick Mild Cognitive Impairment (Qmci) screen. The goal of DemSelf is to classify whether a person has normal cognition, MCI, or dementia. The test is to be performed independently on a mobile device such as a tablet.

The Qmci was selected based on the following criteria: high accuracy in detecting MCI, short administration time (under 10 minutes), and detailed instructions for administration and scoring. The Qmci was originally published in 2012 and is specifically designed to differentiate MCI from normal cognition [22]. Two recent systematic reviews show reliable results for detecting MCI [11,27]. It is a short (3–5 minute) instrument composed of six subtests: Orientation, Word Registration, Clock Drawing, Delayed Recall, Word Fluency, and Logical Memory. There are cut-off scores for MCI and dementia adjusted for age and education. The Qmci covers the cognitive domains orientation, working memory, visuospatial/construction, episodic memory, and semantic memory/language [23].

We translated the items from the Qmci into German and based the instructions and scoring system of DemSelf on the Qmci guide [19]. In the Qmci there is a verbal interaction between the test administrator and the subject. In a self-assessment, speech recognition would come closest to this interaction, but was considered too error-prone. The Qmci subtest Word Fluency, in which as many animals as possible are to be named in one minute, is therefore not included in DemSelf. Keyboard input was identified as a major difficulty for older people and a frequent source of errors in another self-assessment screening instrument [28]. Therefore, most user input in DemSelf is done by tapping large, labeled buttons that represent either correct answers or distractors. Thus, the answer choices are limited in DemSelf; this is one of the key differences from Qmci. Answer choices are randomly distributed for repeated testing.

Consent

DemSelf first requires informed consent from the patient. To limit the demands on working memory and attention, each screen contains only a few short sentences with information about the risk of a false result, the data collected, the duration of the test, and the option to stop the test at any time. Subjects are encouraged to talk to a physician if they have any doubts about the test. Any isolated cognitive screening instrument is insufficient for a diagnosis of MCI or dementia [2]. The Qmci guide recommends verbally clarifying the purpose and indicating that the subject can stop at any time [19]. In DemSelf, this comprehension is tested with the questions: Does the test provide a reliable diagnosis of dementia (No)? Can you stop the test after each task (Yes)? Subjects can continue only if both questions are answered correctly and if they agree to take the test (see Figure 1).
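The gating rule for the consent screens can be written down compactly. The following is a minimal Python sketch, not the app's actual code; the names `ConsentAnswers` and `may_start_test` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConsentAnswers:
    """Answers given on the consent screens (hypothetical structure)."""
    test_gives_reliable_diagnosis: bool  # correct answer: False ("No")
    can_stop_after_each_task: bool       # correct answer: True ("Yes")
    agrees_to_take_test: bool


def may_start_test(answers: ConsentAnswers) -> bool:
    """Allow the test only if both comprehension questions are answered
    correctly and the subject agrees to take the test."""
    understood = (
        not answers.test_gives_reliable_diagnosis
        and answers.can_stop_after_each_task
    )
    return understood and answers.agrees_to_take_test
```

A check of this kind only decides whether the subject may proceed; how a wrong answer is communicated to the subject is a separate design question.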

Test Environment

The aim of cognitive testing is to achieve the best possible performance to ensure that deficits are not caused by internal or external confounding factors [25]. The subject is therefore instructed to perform the test in a quiet environment without distractions and to use a hearing aid and wear glasses if necessary. The volume setting and the reading aloud function can be tested in advance. The physiological and emotional state of the person being tested can cause the test result to be biased [24]. A subject is therefore asked whether he or she feels well, is not in pain, is not emotionally upset, and is not tired (see Figure 1). The answers are reported in the test result for the healthcare professional.

Subtest Orientation

DemSelf asks five questions about the country, year, month, date (for the day), and day of the week. There are 12 answer choices for country and year including the correct answer and 11 distractors: 7 countries are randomly selected European countries (testing was assumed to take place in Germany). The remaining 4 countries are randomly selected from a list of all countries on earth. The distractors for the current year are randomly selected from a span of +/− 20 years. For month, date, and day of the week all available options are given as possible answers. There is a 10 second time limit for answering before the next question appears.

Fig. 1: Selection of screens before the test begins. Panels: (a) Check understanding of test limitations. (b) Test speech output. (c) Check emotional and physiological state. See Section 3.3 for a detailed description. Translation of instructions: (a) Now we would like to ask you some questions. Tap on the correct answers so that you can continue. Does the test provide a reliable diagnosis of dementia? Can you stop the test after each task? Do you agree to take the test now? (b) In the test, words are read aloud. Use the volume buttons on the side of the iPad to adjust the volume. If you need a hearing aid, please insert it. (c) Do you feel rested and awake? Are you in pain? Do you feel relaxed and comfortable?
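The distractor selection described for the country and year questions of the Orientation subtest could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the app's implementation: the function names, the country lists passed in by the caller, and the decision to exclude already selected countries from the world-wide draw are all assumptions.

```python
import random


def country_choices(correct, european, world, rng):
    """Return 12 shuffled choices: the correct country, 7 European
    distractors, and 4 distractors drawn from a world-wide list."""
    european_pool = [c for c in european if c != correct]
    european_picks = rng.sample(european_pool, 7)
    # Assumption: countries already chosen are not drawn a second time.
    world_pool = [c for c in world if c != correct and c not in european_picks]
    world_picks = rng.sample(world_pool, 4)
    choices = [correct] + european_picks + world_picks
    rng.shuffle(choices)
    return choices


def year_choices(current_year, rng, span=20, n_distractors=11):
    """Return 12 shuffled choices: the current year plus 11 distinct
    distractors from the range current_year +/- span."""
    candidates = [
        y for y in range(current_year - span, current_year + span + 1)
        if y != current_year
    ]
    choices = [current_year] + rng.sample(candidates, n_distractors)
    rng.shuffle(choices)
    return choices
```

Passing a seeded `random.Random` instance as `rng` makes a particular arrangement reproducible, which may be useful when a test run has to be reviewed later.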

Subtest Word Registration

In Word Registration, DemSelf displays 5 words on the screen and consecutively reads each word aloud. The Swift class AVSpeechSynthesizer was used for speech synthesis with a German female voice. The speech rate was set to 0.40 throughout the app with a pause of 1 / [...]

Fig. 2: Word Registration: item presentation and selection from different answer choices. Panels: (a) Introductory screen. (b) Items are displayed and read aloud. (c) Answers are selected by tapping. See Section 3.3 for a detailed description. Translation of instructions: (a) In this task you are supposed to remember and then repeat 5 words. (b) Remember these 5 words. (c) Tap on the 5 words you were just told. The order does not matter.

Subtest Clock Drawing

Clock Drawing is a common subtest in cognitive screening instruments with variations in administration and scoring systems [9]. In DemSelf, all input in Clock Drawing is made by tapping and drawing with a finger. An empty circle is provided as in the Qmci. Numbers and hands can be entered inside the circle and in a square area surrounding it. The subject is first asked to put in all the numbers and then draw the hands into this clock face in a subsequent step. Drawing is done with one finger and creates a straight line between the start and end points.

Numbers are added by (1) tapping on a location in the rectangular area, (2) entering a number in the appearing number pad, and (3) confirming the number, which closes the number pad (see Figure 3). A number can be modified by tapping on it; the number pad reappears, and the number can be deleted or changed. When a number is selected this way, it can be relocated by either tapping on a new location or by dragging it to a new location.

The target areas for numbers and hands are shown in Figure 4. The numbers 12, 3, 6 and 9 must be located inside a section of 30 degrees and the numbers 1, 2, 4, 5, 7, 8, 10 and 11 must be located inside the corresponding quadrant to be correct. The numbers are also accepted in a limited range outside the circle. Numbers are scored as in the Qmci [19]: 1 point for each correct number and minus 1 point for each number duplicated or if greater than 12. Incorrectly placed numbers which are not duplicates or greater than 12 are ignored.

Hands need to be drawn at 10 minutes past 11. Drawing with the finger creates a straight line between start and end point. A hand is considered correct if its end point lies within one of the hand sections shown in Figure 4. An additional point is given if both inner end points of the hands lie within the inner circle.

A subject could rotate the device while entering the numbers and hands. The Qmci guide asks the human evaluator to maximize the score by lining up the scoring template at "12 o'clock" [19].
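The number-scoring rule for Clock Drawing given above (one point per correctly placed number, minus one point per duplicate or number greater than 12, other misplaced numbers ignored) can be illustrated with a short Python sketch. Several details are assumptions made for this illustration and are not taken from the paper: positions are given relative to the clock centre with the y-axis pointing up, the 30-degree section is centred on the ideal position (plus or minus 15 degrees), every repeated copy of a number costs one point while the first copy is scored by position, and the limit on how far outside the circle a number may lie is not modelled. Hand scoring and template alignment are left out.

```python
import math

# Ideal angles (degrees clockwise from 12 o'clock) for the cardinal numbers.
CARDINAL_ANGLES = {12: 0.0, 3: 90.0, 6: 180.0, 9: 270.0}
# Quadrant (start, end) in degrees for the remaining numbers.
QUADRANTS = {
    1: (0, 90), 2: (0, 90),
    4: (90, 180), 5: (90, 180),
    7: (180, 270), 8: (180, 270),
    10: (270, 360), 11: (270, 360),
}


def clock_angle(x, y):
    """Angle in degrees, clockwise from 12 o'clock, of the point (x, y)
    relative to the clock centre (assumption: y-axis points up)."""
    return math.degrees(math.atan2(x, y)) % 360.0


def is_correctly_placed(number, angle):
    """Check a number against its 30-degree section or its quadrant."""
    if number in CARDINAL_ANGLES:
        # Smallest absolute difference between the two angles.
        diff = abs((angle - CARDINAL_ANGLES[number] + 180.0) % 360.0 - 180.0)
        return diff <= 15.0
    if number in QUADRANTS:
        start, end = QUADRANTS[number]
        return start <= angle < end
    return False


def score_numbers(placements):
    """Score a list of (number, x, y) placements."""
    score = 0
    seen = set()
    for number, x, y in placements:
        if number > 12 or number in seen:
            score -= 1  # duplicate or number greater than 12
            continue
        seen.add(number)
        if is_correctly_placed(number, clock_angle(x, y)):
            score += 1  # misplaced numbers are simply ignored
    return score
```

On a touch screen the y-axis usually points down, so real coordinates would have to be flipped before calling `clock_angle`.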
In DemSelf, the target areas are rotated and lined up at the location of the number 12 (if 12 is missing, 3, 6, or 9 are used respectively). The maximum of the unaligned or aligned score is selected.

Fig. 3: Clock Drawing: steps to add a number to the clock face. Panels: (a) Tap on a location in or around the circle. (b) Enter a number and confirm it. (c) The number is added to the clock face. See Section 3.3 for a detailed description. Translation of instructions: (a) Tap where a number would be on the face of a clock. (b) Enter the number at this point.

Fig. 4: Clock Drawing: target areas for scoring numbers and hands. Panels: (a) Sections for the numbers 12, 3, 6 and 9. (b) Quadrants for the numbers 1, 2, 4, 5, 7, 8, 10 and 11. (c) Areas for start and end points of the hands.

Subtest Delayed Recall

In Delayed Recall, test subjects are asked to repeat the words that have been presented in the subtest Word Registration. Again, there are 16 answer choices with 11 randomly selected distractors.

Subtest Logical Memory

Logical Memory tests the recall of a short story. Instead of free verbal recall, repeating the story is divided into 9 steps. For each step, there are 6 answer choices with the correct story component and 5 distractors. The distractors were chosen to (1) have the same syntactic structure, (2) be semantically related, and (3) be used with a similar frequency in German as the original item. A new story component is added to the answer by tapping on the corresponding button. The current answer is displayed on the screen.

Fig. 5: Logical Memory: item presentation and repeating a story in several steps. Panels: (a) The story is displayed and read aloud. (b) Subject taps the beginning of the story. (c) The story is completed step by step. See Section 3.3 for a detailed description. Translation of instructions: (a) Remember this story. (b) Repeat the story. Tap the beginning of the story. (c) Repeat the story. Tap the next component of the story.

Test Result

It is reported to the subject whether there is evidence of normal cognition or risk for cognitive impairment. Subjects are once again encouraged to speak with a physician if they have any concerns. Test results can be sent to the physician who initiated the testing or saved on the tablet.

A more detailed test result is intended for medical professionals. The test score and the cut-off scores are displayed on a bar plot. We used the original cut-off scores from Qmci, but scaled down because the Verbal Fluency subtest was missing. Additional information such as the test date, start time and duration, as well as the subject's age and education are displayed. The subject's answers about the test environment and current physiological and emotional status are also reported. Further details can be displayed for each subtest, including the subject's responses, an explanation for the subtest's scoring system, and additional information such as the time taken to complete the subtest.
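The paper does not state how the cut-off scores were scaled down. One plausible reading is a proportional rescaling to the reduced maximum score, which the following Python sketch illustrates. The function name `scale_cutoff` and the proportional rule itself are assumptions, and the numbers in the usage line are placeholders rather than actual Qmci values.

```python
def scale_cutoff(original_cutoff, original_max, removed_points):
    """Rescale a cut-off proportionally after removing a subtest.

    original_cutoff: cut-off score on the full instrument
    original_max:    maximum total score of the full instrument
    removed_points:  maximum score of the removed subtest
    """
    if not 0 <= removed_points < original_max:
        raise ValueError("removed_points must be in [0, original_max)")
    return original_cutoff * (original_max - removed_points) / original_max


# Placeholder numbers for illustration only.
print(scale_cutoff(50, 100, 20))  # prints 40.0
```

A proportional rule implicitly assumes that the removed subtest contributes to the total score in the same way as the remaining ones, which is one reason a rescaled cut-off still needs empirical validation.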

Below we present some of the feedback on the DemSelf instrument that was mentioned by two or more participants or was considered particularly important.

Participants identified usability issues in all components of the DemSelf app. In this section, we present a selection of important usability issues noted by two or more participants.

Participants generally supported the idea of testing a subject's understanding that the test result alone does not provide a diagnosis and can be inaccurate. However, hiding the button to proceed to the next screen until the user has answered correctly, without providing any feedback, was considered bad practice (see Figure 1). Leaving the application to make adjustments in the device settings or to get more information about cognitive impairments in the browser may cause problems, as subjects may not be able to find their way back. In addition, reading about dementia and commonly used cognitive tests could lead to priming effects that influence behavior on the test. The buttons for changing volume on the iPad might be difficult to find for subjects unfamiliar with tablets. Participants suggested displaying arrows on the screen pointing to the volume buttons. Participants also missed an indicator of how many screens are displayed before the actual test begins.

The confirmation of answers in the subtests was not consistent. Responses are sometimes logged directly when tapping a button and can no longer be changed, such as in the Orientation and Logical Memory subtests. This can be particularly frustrating for people with vision or motor impairments who accidentally touch the wrong button. Such operating errors also lead to bias within the test result. In the Word Registration and Delayed Recall subtests, the screen changes automatically when five words have been logged in, so that the fifth word cannot be changed. Previous answers can be deselected by tapping on them. This behavior was irritating and inconsistent. Participants suggested adding a global confirmation button to all subtests that logs in the selected answers and switches to the next screen.

In the Orientation subtest, the next question appears after a time limit of 10 seconds, which was considered a serious usability problem. Subjects may feel they are losing control. As a result, some participants also falsely assumed a time limit in the following subtests. Scores may still depend on reaction time, but the end of a time limit should not result in an automatic screen change.

When items are presented both in text form and verbally, new text passages should be displayed step-by-step and synchronously with the spoken word, as the speed of reading and following the verbal presentation may differ.

In Clock Drawing, almost all participants initially overlooked the buttons to confirm a number or hand and add it to the clock. Instead, participants tapped a new location in the clock circle or began drawing a new hand. The confirmation process intended by the system was unexpected, since no input had to be explicitly confirmed in the previous subtests. Numbers can be relocated on the clock by drag-and-drop, which was perceived as intuitive for younger people. However, dragging a number across the screen requires fine motor skills that many elderly people do not have. The mechanism also was not explicitly mentioned, which could influence the test result. A subject's arm and hand will partially cover the clock while using the number pad (see Figure 3). A position below the clock circle would prevent this and correspond to the common conventions for mobile devices.
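The global confirmation button that participants suggested implies a simple interaction model: answers can be selected and deselected freely, and nothing is logged until the subject confirms. The Python sketch below illustrates that model only; the class name `AnswerSelection` and its methods are hypothetical and do not describe the DemSelf code.

```python
class AnswerSelection:
    """Selection state for one subtest screen with explicit confirmation."""

    def __init__(self, max_selected):
        self.max_selected = max_selected  # e.g. 5 words in Word Registration
        self.selected = []                # current, still editable selection
        self.confirmed = None             # set once the subject confirms

    def toggle(self, choice):
        """Select a choice, or deselect it if it is already selected."""
        if self.confirmed is not None:
            return  # answers are locked after confirmation
        if choice in self.selected:
            self.selected.remove(choice)
        elif len(self.selected) < self.max_selected:
            self.selected.append(choice)

    def confirm(self):
        """Log the current selection; only now may the screen change."""
        if self.confirmed is None:
            self.confirmed = tuple(self.selected)
        return self.confirmed
```

With this model, reaching the maximum number of selections does not change the screen by itself, so the last answer can still be corrected before it is logged.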

DemSelf is a different test than Qmci and needs to be re-validated. This section presents important changes that were introduced by the adaption as a self-administered touch-based instrument. Visual recognition and selection between several answer options instead of a free verbal recall is one of the main differences between the Qmci and DemSelf, with implications for the involved cognitive processes and the subtest difficulty. In all subtests besides Clock Drawing, users provide answers by tapping on labeled buttons and no longer have to verbally recall the items. This changes the nature of the task. Overall, there was no clear consensus on how many or what type of distractors are appropriate.

Reading skills, visual memory abilities, or visual impairments are new confounding variables in DemSelf. Items are presented in text form for a certain period, and a fast reader can scan the words several times. The same applies to labeled answer buttons when there is a time limit, such as in the Orientation subtest. Different questions in Orientation will likely require different time limits as different amounts of text are displayed. We reported in the previous section that people with non-cognitive impairments, such as tremor or visual impairments, could unintentionally give wrong answers by tapping on wrong buttons. Participants therefore raised doubts about whether Orientation is valid and really tests whether subjects know the answer.

For the Clock Drawing subtest, participants reported that important aspects from the paper-based version such as spatial orientation and understanding the clock are also tested in the adapted version. Many typical errors also seemed possible in the adapted version. Experience with touchpads and mobile devices could be an important factor for this subtest, as some interaction options are not explicitly mentioned, such as dragging a number to change the position. In the paper-based version of Clock Drawing, it is common to draw auxiliary lines or a dot in the middle of the clock before filling in the numbers. This is not possible in DemSelf, nor is drawing curved hands.

Scoring in Logical Memory is based on the number of target words recalled, which makes sense for free verbal recall. However, in the adapted version, a correct story component with two target words such as "Der rote Fuchs" [The red fox] may actually be easier to remember than a story part like "im Mai." [in May] with only one target word, and thus should not score higher.
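The concern about Logical Memory scoring can be made concrete with a small comparison. The Python sketch below contrasts scoring by target words, as described above, with an equal weight per story component. The second rule is only an illustrative alternative and is not proposed in the paper; the data structure and function names are assumptions, and the two components are the examples quoted in the text.

```python
# Two story components from the text, with their number of target words.
COMPONENTS = [
    {"text": "Der rote Fuchs", "target_words": 2},
    {"text": "im Mai.", "target_words": 1},
]


def score_by_target_words(components, chosen):
    """Sum the target words of every correctly chosen component."""
    return sum(
        c["target_words"]
        for c, pick in zip(components, chosen)
        if pick == c["text"]
    )


def score_by_component(components, chosen):
    """Illustrative alternative: one point per correctly chosen component."""
    return sum(1 for c, pick in zip(components, chosen) if pick == c["text"])
```

If a subject selects only "Der rote Fuchs" correctly, the first rule gives 2 points and the second gives 1; selecting only "im Mai." correctly gives 1 point under both rules.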

Participants were generally cautious about self-assessment in practical application. They assumed that many older people would not be able to self-administer the test due to a lack of technical experience. However, as mobile devices are increasingly used by older people, this problem will decrease in the future. For now, an assisting person should be present at least to start the app and to adjust volume and brightness. Participants would generally prefer the test to be conducted in a medical office rather than at a patient's home so that someone is present for support and the result can be discussed directly. Healthcare professionals may also hesitate to use DemSelf because they lack information from observation and interaction with the patient. It was therefore positively noted that the state of the subject is reported with the test result, as fatigue and pain can affect performance and may mean that the subject should be retested. Reporting the test result directly to the subject in the app was viewed very negatively. It would not be ethical to leave individuals alone with an unexpected test result about MCI or dementia that may also be incorrect. DemSelf could also be used in a traditional context with a human administrator to automatically score a test and save the results. Given the current trend towards digitizing patient data, automatic scoring and digital test results are beneficial for clinical practice. Because DemSelf is a different test than Qmci, participants agreed that DemSelf needs validation to be used in practice. Validation must also clarify which requirements a test person must fulfil in order to use DemSelf. Subjects must at least be able to read and must not have a severe visual or motor impairment. For people who can perform the test, DemSelf is a promising direction for application and research.

Developing a self-assessed cognitive screening instrument is challenging. Psychometric properties must be ensured, taking into account a variety of technical and human factors [4]. This study provides some insights for the development, validation and practical application of self-administered cognitive screening instruments. DemSelf is still a rather traditional instrument as it is directly based on a validated paper-based instrument. Some potential benefits of computerized tests, such as adapting to individual performance or the use of machine-learning algorithms to interpret test performance [36,32,6], are not part of the instrument. A high level of usability is a prerequisite for successful practical use. Test subjects should immediately feel confident in using the device, as the test takes only a few minutes and is not repeated. However, it is in the nature of a standardized cognitive test that certain usability heuristics are hard to fulfill. Items can, for example, be presented only once because otherwise the test results would no longer be comparable. This rigidity can also be a problem for computerized tests. The confirmation of inputs should be handled consistently throughout the subtests. Subjects must be able to change an answer, just as in a human-administered test, where a subject can revoke a statement or cross out a number or hand. Errors that are not due to cognitive impairment, such as manually missing the right button, should be avoided. It remains an open question to what extent time limits can be applied as in the Qmci. The automatic change of screens after 10 seconds during the Orientation questions was one of the most criticized usability issues. Even if the screens do not change automatically, differing reading abilities and visual or motor impairments add a lot of variance to the time it takes to answer a question, and this variance is unrelated to dementia or MCI.

One of the main differences between the original Qmci and DemSelf was recall vs. recognition. A comparison of free recall and either yes/no recognition or three-alternative forced-choice recognition in cognitively impaired subjects found that yes/no recognition was the best predictor of MCI and early AD [5]. Yes/no recognition could therefore be used in an updated version of DemSelf. A digital pen was suggested for the Clock Drawing subtest to make it more similar to the original paper-based version. Souillard-Mandar et al. [30] successfully used such a digital pen as well as a machine learning approach for the evaluation of the Clock Drawing test. In the DemSelf implementation, users only need to tap a certain location to enter numbers and can only draw straight lines. The reduced vulnerability to motor impairments in the DemSelf implementation could be a strength of the instrument. User testing is required to determine whether DemSelf is appropriate for certain subpopulations, for example those who are uncomfortable using a tablet or have certain limitations such as aphasia, poor reading skills, or hemiparesis [4]. Some factors can affect a self-assessed computerized screening instrument differently than an examiner-administered paper-based instrument. For example, technology use and commitment towards technology have been shown to affect test scores when the same test is administered on paper or on a tablet [31]. Technical experience, reading skills, and motor impairments were mentioned as possible confounding factors for DemSelf.

The expert interviews revealed possibilities for improving the prototype. By completing the test twice, participants could explore different ways of interaction and discover usability problems that would otherwise have gone unnoticed. The involvement of stakeholders from dementia and MCI research allowed us to evaluate the acceptability of the instrument for practical use and to identify usability issues that are specific to elderly and cognitively impaired users. In a future study, screen capture or video recordings can help to better analyze the interaction with complex tasks like the Clock Drawing test. One limitation of this study is that no users from the target group were involved. Until such users are involved, the presented comments on usability remain partly speculative. Similar to [37], the next step in the evaluation of DemSelf could be focus group interviews with healthy and cognitively impaired elderly users. A study comparing test performance in Qmci and DemSelf would shed more light on the discussed differences between the instruments. In the future, we hope to obtain more funding to further our research on the usability and validity of computerized cognitive screening instruments.

Early detection of MCI and dementia is vital as many therapeutic and preventive approaches are particularly effective at an early stage. Cognitive screening instruments are primarily paper-based and administered by healthcare professionals. Computerized cognitive screening instruments offer a number of potential advantages over paper-based instruments, such as increased standardization of scoring, reduced costs, remote testing and a more precise measurement of time- and location-sensitive tasks [4].

We presented a touch-based cognitive screening instrument, called DemSelf. It was developed by adapting an examiner-administered paper-based instrument, the Quick Mild Cognitive Impairment (Qmci) screen. Usability is a key criterion for self-assessment by elderly users with potential cognitive and other impairments. We conducted interviews with experts in the domain of usability and human-machine interaction and/or in the domain of dementia and neuropsychological assessment to evaluate DemSelf. The expert interviews revealed possibilities for improving the prototype. Developers should consider barriers for elderly and inexperienced users, such as not being able to reverse an answer or leaving the application without finding their way back. Time limits cannot be taken directly from a paper-based test version and should not lead to automatic screen changes. Visual recognition instead of free verbal recall is one of the main differences between the Qmci and DemSelf. Reading skills, technical experience, and motor impairments seem to be important confounding variables. Further research is needed to determine the validity and reliability of computerized versions of conventional paper-based tests. Healthcare professionals may hesitate to use DemSelf because they lack information from observation and interaction with the patient, although such information could also be evaluated electronically in the long term. Participants would generally prefer the test to be conducted in a medical office rather than at a patient's home so that someone is present for support and the result can be discussed directly. In view of the current trend towards digitization of patient data, automatic scoring and digital test results are beneficial for clinical practice. Computerized instruments like DemSelf also have the potential to reduce costs and provide early support for people with MCI and dementia.

References

1. Antons, J.N., O'Sullivan, J., Arndt, S., Gellert, P., Nordheim, J., Möller, S., Kuhlmey, A.: Pflegetab: Enhancing quality of life using a psychosocial internet-based intervention for residential dementia care. In: ISRII 8th Scientific Meeting - Technologies for a digital world: Improving health across the lifespan. International Society for Research on Internet Interventions (ISRII). pp. 1–1 (2016)
2. Aslam, R.W., Bates, V., Dundar, Y., Hounsome, J., Richardson, M., Krishan, A., Dickson, R., Boland, A., Fisher, J., Robinson, L., Sikdar, S.: A systematic review of the diagnostic accuracy of automated tests for cognitive impairment. International Journal of Geriatric Psychiatry (4), 561–575 (2018). https://doi.org/10.1002/gps.4852
3. Balata, J., Mikovec, Z., Slavicek, T.: KoalaPhone: touchscreen mobile phone UI for active seniors. Journal on Multimodal User Interfaces (4), 263–273 (2015). https://doi.org/10.1007/s12193-015-0188-1
4. Bauer, R.M., Iverson, G.L., Cernich, A.N., Binder, L.M., Ruff, R.M., Naugle, R.I.: Computerized neuropsychological assessment devices: Joint position paper of the american academy of clinical neuropsychology and the national academy of neuropsychology. Archives of Clinical Neuropsychology (3), 362–373 (2012). https://doi.org/10.1093/arclin/acs027
5. Bennett, I.J., Golob, E.J., Parker, E.S., Starr, A.: Memory evaluation in mild cognitive impairment using recall and recognition tests. Journal of Clinical and Experimental Neuropsychology (8), 1408–1422 (2006). https://doi.org/10.1080/13803390500409583
6. Cha, J., Voigt-Antons, J.N., Trahms, C., O'Sullivan, J.L., Gellert, P., Kuhlmey, A., Möller, S., Nordheim, J.: Finding critical features for predicting quality of life in tablet-based serious games for dementia. Quality and User Experience (1) (2019). https://doi.org/10.1007/s41233-019-0028-2
7. Cordell, C.B., Borson, S., Boustani, M., Chodosh, J., Reuben, D., Verghese, J., Thies, W., Fried, L.B.: Alzheimer's association recommendations for operationalizing the detection of cognitive impairment during the medicare annual wellness visit in a primary care setting. Alzheimer's & Dementia (2), 141–150 (2013). https://doi.org/10.1016/j.jalz.2012.09.011
8. Wigdor, D., Wixon, D.: Brave NUI World: Designing Natural User Interfaces for Touch and Gesture. Morgan Kaufmann Publishers (2011)
9. Ehreke, L., Luppa, M., König, H.H., Riedel-Heller, S.G.: Is the clock drawing test a screening tool for the diagnosis of mild cognitive impairment? a systematic review. International Psychogeriatrics (1), 56–63 (2009). https://doi.org/10.1017/s1041610209990676
10. Fisk, A.D., Rogers, W.A., Charness, N., Czaja, S.J., Sharit, J.: Designing for Older Adults. CRC Press (2004). https://doi.org/10.1201/9781420023862
11. Glynn, K., Coen, R., Lawlor, B.A.: Is the Quick Mild Cognitive Impairment Screen (QMCI) more accurate at detecting mild cognitive impairment than existing short cognitive screening tests? A systematic review of the current literature. International Journal of Geriatric Psychiatry (2019). https://doi.org/10.1002/gps.5201
12. Junge, S., Gellert, P., O'Sullivan, J.L., Möller, S., Voigt-Antons, J.N., Kuhlmey, A., Nordheim, J.: Quality of life in people with dementia living in nursing homes: validation of an eight-item version of the QUALIDEM for intensive longitudinal assessment. Quality of Life Research (6), 1721–1730 (2020). https://doi.org/10.1007/s11136-020-02418-4
13. Knopman, D.S., Petersen, R.C.: Mild cognitive impairment and mild dementia: A clinical perspective. Mayo Clin Proc. p. 1452–1459 (2014)
14. Lech, S., O'Sullivan, J., Voigt-Antons, J.N., Gellert, P., Nordheim, J.: Tablet-based outpatient care for people with dementia: Guideline-based treatment planning, personalized disease management and network-based care. In: Abstracts of the 4th International Congress of the European Geriatric Medicine Society. pp. 249–250. Springer International Publishing (2018)
15. Lech, S., O'Sullivan, J.L., Gellert, P., Voigt-Antons, J.N., Greinacher, R., Nordheim, J.: Tablet-based outpatient care for people with dementia. GeroPsych (3), 135–144 (sep 2019). https://doi.org/10.1024/1662-9647/a000210
16. Mahatody, T., Sagar, M., Kolski, C.: State of the art on the cognitive walkthrough method, its variants and evolutions. International Journal of Human-Computer Interaction (8), 741–785 (2010). https://doi.org/10.1080/10447311003781409
17. Maier, W., Deuschl, G.: S3-Leitlinie Demenzen [S3 guideline dementia]. In: Leitlinien für Diagnostik und Therapie in der Neurologie. Deutsche Gesellschaft für Neurologie (2016),
18. Mayr, U.: Normales kognitives Altern [normal cognitive aging]. In: Karnath, H.O., Thier, P. (eds.) Kognitive Neurowissenschaften, pp. 777–788. Springer Berlin Heidelberg (2012). https://doi.org/10.1007/978-3-642-25527-4_72
19. Molloy, W., O'Caoimh, R.: Qmci The Quick Guide. Newgrange Press (2017)
20. Murman, D.: The impact of age on cognition. Seminars in Hearing (03), 111–121 (2015). https://doi.org/10.1055/s-0035-1555115
21. Möller, S.: Usability engineering. In: Quality Engineering, pp. 59–76. Springer Berlin Heidelberg (2017). https://doi.org/10.1007/978-3-662-56046-4_4
22. O'Caoimh, R., Gao, Y., McGlade, C., Healy, L., Gallagher, P., Timmons, S., Molloy, D.W.: Comparison of the quick mild cognitive impairment (qmci) screen and the SMMSE in screening for mild cognitive impairment. Age and Ageing (5), 624–629 (2012). https://doi.org/10.1093/ageing/afs059
23. O'Caoimh, R., Molloy, W.: The quick mild cognitive impairment screen (qmci). In: Larner, A.J. (ed.) Cognitive Screening Instruments, pp. 255–272. Springer (2017). https://doi.org/10.1007/978-3-319-44775-9_12
24. Overton, M., Pihlsgård, M., Elmståhl, S.: Test administrator effects on cognitive performance in a longitudinal study of ageing. Cogent Psychology (1) (2016). https://doi.org/10.1080/23311908.2016.1260237
25. Pentzek, M., Dyllong, A., Grass-Kapanke, B.: Praktische Voraussetzungen und Hinweise für die Durchführung psychometrischer Tests – was jeder Testleiter wissen sollte [practical requirements and tips for conducting psychometric tests - what every test administrator should know]. NeuroGeriatrie (1), 20–25 (2010)
26. Petersen, R.C., Caracciolo, B., Brayne, C., Gauthier, S., Jelic, V., Fratiglioni, L.: Mild cognitive impairment: a concept in evolution. Journal of Internal Medicine (3), 214–228 (2014). https://doi.org/10.1111/joim.12190
27. Roeck, E.E.D., Deyn, P.P.D., Dierckx, E., Engelborghs, S.: Brief cognitive screening instruments for early detection of alzheimer's disease: a systematic review. Alzheimer's Research & Therapy (1) (2 2019). https://doi.org/10.1186/s13195-019-0474-3
28. Ruano, L., Sousa, A., Severo, M., Alves, I., Colunas, M., Barreto, R., Mateus, C., Moreira, S., Conde, E., Bento, V., Lunet, N., Pais, J., Cruz, V.T.: Development of a self-administered web-based test for longitudinal cognitive assessment. Scientific Reports (1) (1 2016). https://doi.org/10.1038/srep19114
29. Shankle, W.R., Romney, A.K., Hara, J., Fortier, D., Dick, M.B., Chen, J.M., Chan, T., Sun, X.: Methods to improve the detection of mild cognitive impairment. Proceedings of the National Academy of Sciences (13), 4919–4924 (2005). https://doi.org/10.1073/pnas.0501157102
30. Souillard-Mandar, W., Davis, R., Rudin, C., Au, R., Penney, D.L.: Interpretable machine learning models for the digital clock drawing test. In: 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016) (2016)
31. Steinert, A., Latendorf, A., Salminen, T., Müller-Werdan, U.: Evaluation of technology-based neuropsychological assessments in older adults. Innovation in Aging (suppl 1), 504–504 (2018). https://doi.org/10.1093/geroni/igy023.1874
32. Sternin, A., Burns, A., Owen, A.M.: Thirty-five years of computerized cognitive assessment of aging – where are we now? Diagnostics (3), 114 (9 2019). https://doi.org/10.3390/diagnostics9030114
33. Stokin, G.B., Krell-Roesch, J., Petersen, R.C., Geda, Y.E.: Mild neurocognitive disorder: An old wine in a new bottle. Harvard Review of Psychiatry (5), 368–376 (2015). https://doi.org/10.1097/hrp.0000000000000084
34. Tsoi, K.K.F., Chan, J.Y.C., Hirai, H.W., Wong, S.Y.S., Kwok, T.C.Y.: Cognitive tests to detect dementia. JAMA Internal Medicine (9), 1450 (2015). https://doi.org/10.1001/jamainternmed.2015.2152
35. World Health Organization: ICD-11 for mortality and morbidity statistics (2020), https://icd.who.int/browse11/l-m/en
36. Yim, D., Yeo, T.Y., Park, M.H.: Mild cognitive impairment, dementia, and cognitive dysfunction screening using machine learning. Journal of International Medical Research (7), 030006052093688 (2020). https://doi.org/10.1177/0300060520936881
37. Zeng, Z., Fauvel, S., Hsiang, B.T.T., Wang, D., Qiu, Y., Khuan, P.C.O., Leung, C., Shen, Z., Chin, J.J.: Towards long-term tracking and detection of early dementia: A computerized cognitive test battery with gamification. In: Proceedings of the 3rd International Conference on Crowd Science and Engineering - ICCSE'18. ACM Press (2018). https://doi.org/10.1145/3265689.3265719
38. Zygouris, S., Tsolaki, M.: Computerized cognitive testing for older adults. American Journal of Alzheimer's Disease & Other Dementias 30