Conversational User Interfaces for Blind Knowledge Workers: A Case Study
A Multifunction Printer CUI for the Blind
Kyle [email protected]
Palo Alto Research Center

Kalai [email protected]
Palo Alto Research Center

June 16, 2020
Abstract
Advances in interface design using touch surfaces create greater obstacles for blind and visually impaired users of technology. Conversational user interfaces offer a reasonable alternative for interactions and enable greater access and, most importantly, greater independence for the blind. This paper presents a case study of our work to develop a conversational user interface for accessibility for multifunction printers (MFPs). It describes our approach to conversational interfaces in general and the specifics of the solution we created for MFPs. It also presents a user study we performed to assess the solution and guide our future efforts.
Introduction

Manufacturers of multifunction printers (MFPs) tout their latest user interfaces with touchscreen displays that minimize the interface controls, provide customization, and let users create complete workflows that are available with a few taps or swipes on built-in surfaces, much like personal tablet devices. The machines look sleeker and permit more complicated interactions than were easily provided with physical push buttons. Unfortunately, blind and visually impaired office workers can no longer use machines with surface-only interfaces. In the past, blind workers could learn to feel for the physical buttons and accomplish many of the tasks required in their jobs. Learning a single model of machine, its functions, and where its buttons were located was not an ideal option, but it at least allowed some degree of independence. With the newest interfaces on office equipment, blind users have, for the most part, been left out. To address this issue, our research group at the Palo Alto Research Center (formerly Xerox PARC) developed a conversational user interface that provides access to most of the important features that simply aren't available to these users without assistance.

The MFPs we worked with offer five high-level functions: print, scan, copy, fax, and email. Our conversational agent does not support printing since that is mostly done from a computer or other device that already offers assistive features. The targeted use case is blind and visually impaired individuals standing at a copier who want to make a copy, scan a document, or send a fax or email. Many choices, like single- or double-sided copying, number of copies, binding options, and image variations like lightening or darkening, are available. The combination of choices and options available across all the functions amounts to nearly a hundred possible selections. Our solution covers about half of these and prioritizes the most frequently used options. While there might have been good multi-modal interface designs using the screen display to present large icons with high contrast to help the visually impaired, our goal was to create a solution that would serve people with total blindness, requiring everything to be available conversationally.
Related Work

Conversational and voice interactions have become much more common recently. Porcheron et al. [2018] collected data from family interactions with standalone, screenless smart speakers in the home. They point out that while these devices are marketed to help people get things done, little is known about what people are actually able to accomplish with their devices. From an ethnomethodological perspective, using Conversation Analysis, they are trying to understand how conversational devices fit into human interactions and what is being achieved within dialogs. They identified very little actual collaboration between people and their devices. The collaboration they did find was around the mechanics of the conversation rather than in accomplishing an external goal.

Indeed, we have noticed that task-based, collaborative, conversational interfaces, as opposed to question answering or transaction processing, continue to be rare. Our interest is in developing interactions between people and technology with a shared goal. Moreover, we want to distinguish our interface from one that simply maps voice commands to individual functions on a machine. The trend for office equipment is to put controls on touch surfaces, presenting users with a GUI that sighted users are very familiar with. GUI interfaces differ in many ways, most significantly in how they model tasks for users. They dictate the way that tasks are done, and people learn to adapt to GUIs. Conversational interfaces must allow many different ways of saying the same thing, coming from people with differing mental models of the task to be performed. Conversational interfaces have to be able to accept all of these speech inputs in addition to handling inconsistencies of language and general disfluencies. We have drawn from the work presented in Grosz and Sidner [1986] for our understanding of dialog, especially with regard to task-based, collaborative interactions such as the expert/apprentice flywheel example first discussed in Grosz [1978]. More recent research focuses on voice as an interface modality both for accessibility and for hands-free/eyes-free use cases. Across the research it is apparent that there continue to be many challenges.

Corbett and Weber [2016] describe the problems of discoverability, constraint, and affordances in their voice user interface for controlling Android devices. Their interest was an interface for individuals with limited hand dexterity, so visual clues and presentation were still possible. Yankelovich [1996] points out that speech interfaces are like command-line user interfaces in that functions are hidden. GUIs were invented in large part to bring hidden functionality to the surface. In our case, for example, many users are not aware that MFPs may have a built-in stapler or other binding options. Even double-sided versus single-sided printing may not be apparent to a new user.

Yankelovich goes on to point out the additional problem of users who, once they start speaking, assume the system has capabilities that it does not. Her approach is to provide carefully formed voice prompts. We have tried to adopt this principle, but only to the extent that it won't conflict with our goal of keeping the system conversational. For example, we make use of her suggested implicit confirmations where possible, so that the interaction can flow more naturally without turns dedicated specifically to confirming information.
While there has been work showing that personal assistants have been helpful in providing access to technology that might otherwise be inaccessible, there is also research indicating particular problems for blind users. Among other problems, Abdolrahmani et al. [2018] identified the problem of appropriate feedback from voice systems. Many conversational devices rely on visual cues to report certain events or states. For example, a light of some kind is often used to indicate that a microphone is listening. Amazon's Echo device relies on a light ring to indicate different kinds of activity on the device. Similarly, small buttons for muting or initiating a conversation are not easily found without vision. The authors also describe the lack of control over voice output. Some individuals may require a slower delivery, while many others who are accustomed to screen readers set to very fast rates may find listening to the responses frustrating.

Götzelmann et al. [2017] discuss issues of accessibility with 3D printers. They deal with the challenges of physically interacting with equipment without vision, which aligns closely with our own problem of using an MFP. They break down the workflow for 3D printing into discrete steps. We also face the problem they describe of a large number of parameters and alternatives that are available for each step of the process. They reduced the number of options to those that are most essential in order to reduce the scope and complexity of the problem. Our ambition, on the other hand, was to provide access to all of the options that are available on MFPs to blind users. Unfortunately, there were technical limitations in how our agent communicates with the device, making our ideal impossible using our current architecture.
Focus Groups and Usability
To understand more about the experiences of our target users, we engaged with two different user groups. The first was a visit to the Association for the Blind and Visually Impaired (ABVI) in Rochester, New York. ABVI provides services to people with significant vision loss, including training for skills needed for working and living independent lives. Our informal visit with them helped us to gain an initial understanding of the current situation for office workers with visual impairment, seeing some of the assistive devices available, and learning about challenges they face day-to-day in their work.

Subsequently, we partnered with the Vista Center for the Blind and Visually Impaired in Palo Alto, California. The Vista Center put together a user group of eleven people who were current or past office workers with significant experience using multifunction printers. Their ages ranged from 47 to 73 years old with a median of 62.5. All of them live with moderate to severe visual impairment, with most having no functional sight.

We conducted a focus group to learn more about specific difficulties. Users expressed, in addition to the problems of touchscreen interfaces, a lack of accessible technology in their work environments, resulting in reduced independence and often exclusion from work teams. They complained about complex choices across many options with poor navigational directions causing confusion, inefficiency, and frustration.

Following are some examples from the comments we heard.

. . . if you want it collated, double-sided, stapled, whatever and you have to push the right buttons and all of those things, or otherwise you have to have a sighted aide or another oral aide, or even a student aide. . . and a lot of the districts just don't have funding for a lot of that. Really frustrating.
When errors happen, there is little or no explanation of the error or the cause.
I can't just look at the page to make sure it's printing properly. I'm relying on just, knowing it is, and sometimes the first inkling that I know something was wrong, is that I hand the page to somebody, and they say, "Um, this only has like two-thirds of the printing on it." Or, "This is really light in color." And then I know that our toner cartridge wasn't working right or two pages got stuck together when the paper was feeding through, and it didn't feed right.
They also mentioned the difficulty of inconsistent designs across different brands of devices, all of which lack verbal introductions or tutorials.
I have to ask somebody to make sure that the page is turned, so I know which way. . .
We heard about workarounds they employ to carry out many tasks. For example, distinguishing which job is the right one in the output tray is a frequent problem.

. . . [I] always check the output tray before I print anything, and I make sure that it is empty.
From the challenges and suggestions we heard, we developed a set of design recommendations for our conversational agent, shown in Table 1.

Table 1: Focus Group Interface Design Recommendations. Each row gives a feature ("what") and the problem it reduces or solves ("why").
• Invitation to set defaults: repetitive "starting over" for each job; UI disorientation; cognitive overload from complex workflows; inefficiency.
• Invitation to choose a basic function, then add options: cognitive overload from complex feature lists; frustration.
• "Auditory" buttons and a lexicon of sound cues: undesired output from mistaken input.
• Voice-over walkthrough for frequently used options: eliminates the need to scroll through lengthy menus.
• Invitation to ask for help at any time: avoids user "bail-outs" and job abandonment.
• Invitation for a new user to take a tour of the device and interface: facilitates new-user orientation.
• User-prompted description of features, options, and the input process: need for workarounds; enables use of the full feature set; supports correct paper loading.
• Verbalization of error messages/codes, their meaning, and troubleshooting support: frustration and confusion; minimizes device damage from inappropriate troubleshooting.
• Verbal preview of output: need for "another pair of eyes"; enables independence.
• Query and confirmation for "unusual" or non-routine requests: frustration from undesired device outputs; minimizes resource waste (e.g., unnecessary copies).
• Verbal update of job status/progress and output specification: avoids restarting jobs, resource waste, and output collection errors.
Design Principles

Based on the principles we derived from the focus group sessions and our own prior work on conversational agents Dent et al. [2018], we formulated several goals and principles to apply in the solution.

Creating a good conversational interaction that allows individuals to be successful in task-based goals was a high priority. We wanted our agent to be collaborative. Keeping task objectives at the center of the design, we wanted the interactions to feel as normal as possible. We are aware of the risks of trying to copy human-like behaviors in agents Chefitz et al. [2018], which we avoid. We are interested, however, in modeling aspects of human-to-human conversation to improve our interactions but without setting up unmeetable expectations. For example, human conversation is strongly characterized by mixed initiative, where individuals can seamlessly introduce new information or take the lead during a conversation. Our agent allows for similar flexibility. Users can provide information independent of system prompts to provide it. The agent must be prepared to recognize information that arrives outside of a pre-planned sequence.

We defined the following principles in developing our conversational design.
Be Accommodating
As explained, users should be able to provide information whenever they want, and they should be able to take the initiative in the conversation. They should be able to answer questions naturally with either fragments or complete sentences. They should be able to introduce and ask for multiple things within a single turn. The agent should allow for lulls in the conversation. When the agent has asked the user to perform an action, it might take some time before the action is complete. The agent should wait until users resume the conversation.
Be Brief
Long utterances can be hard to follow and retain. When listing options, for example, provide them in manageable chunks, allowing users to ask for more as they are ready.
Be Helpful
If users have not provided the information required to complete a task, try to give other relevant information or suggest alternatives that haven't been mentioned before. If people ask for help, include the option of giving more information about related topics. Provide procedural help for new users. When the agent speaks, expect conversational mirroring. The agent should use language and structures that it can understand to give users clues about the best way to say things, maximizing the chances of recognizing future utterances.
Be Transparent
As much as possible, convey to users what the agent is thinking and doing. If users' utterances are ambiguous or unclear, confirm the most likely interpretation. Give cues about where the collaboration is in a process (e.g., "First. . . ", "Then. . . ", "Finally. . . "). Use acknowledgments whenever users supply new information (e.g., "OK", "Got it"). Perform a final confirmation before invoking the printer, especially for jobs that might use a lot of resources (e.g., "I need 500 copies of this document").

Figure 1: An integrated conversational system to augment and support MFP functions, comprising custom-built hardware and software to manage both conversational dialog strategy and interaction with the device to operate its functions.
The Solution

The solution consists of a device that includes a controller, a microphone, and a speaker. We used a Raspberry Pi for the controller and, after trying out several different microphones, settled on one designed for conference room phones. In the range of relatively inexpensive microphones, this option performed best at capturing voice signals from anywhere around the printer.
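To make the capture path concrete, the following is a minimal sketch of recording one user turn on the controller. It uses the sounddevice library as a stand-in; the sample rate, the fixed five-second window, and the record_turn name are assumptions for illustration rather than the exact code running on our device, which keeps the channel open and detects the end of a turn from the voice signal.

```python
# Sketch only: capturing one chunk of audio from the conference microphone
# attached to the Raspberry Pi controller. The sample rate, mono channel, and
# fixed-length window are illustrative assumptions.
import sounddevice as sd

SAMPLE_RATE = 16000  # 16 kHz, 16-bit mono is a common input format for ASR


def record_turn(seconds: float = 5.0) -> bytes:
    """Record a fixed-length chunk of audio and return raw 16-bit PCM bytes."""
    frames = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                    channels=1, dtype="int16")
    sd.wait()  # block until the recording completes
    return frames.tobytes()
```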
Spoken language is captured at the device and sent to Google Cloud Speech-to-Text for automatic speech recognition. From Google's service we obtain the textual representation, which is fed into our dialog system for natural language understanding and processing by the dialog manager. After making a decision about what to do next, the dialog manager communicates with the multifunction device and sends its generated response to Amazon Polly for speech synthesis. We also have a version that uses local libraries for speech recognition and synthesis and runs without depending on any cloud services. Figure 1 displays a schematic view of the architecture.

Before building our own hardware device, we experimented with smart speakers. We found that for interactions lasting more than a couple of turns, interactions with these devices are awkward. The services supplied by the manufacturers to support them limit the amount of control throughout the interaction. The devices also require a wake word, which we did not prefer, and provide buttons that are too difficult to find given limited or no vision.

On our custom-made device, conversations with the system are initiated by pressing a large button. We explored using a wake word on our own device as well, but preferred the button to eliminate issues with wake word recognition. Copiers tend to be in rather noisy environments, which makes the wake word option even less reliable. Once the conversation is initiated, we open a listening channel that stays open until the task is completed. This eliminates the need to press the button on every turn, allowing more natural conversational interactions.

The open channel introduces other complications, however. The voice signal processor has to determine when a user's turn ends, and side conversations that are not directed at the agent have to be distinguished from questions and commands. The noise of the machine itself can also create difficulties in voice detection and recognition when the device is in operation during the conversation.
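A minimal sketch of one turn through the pipeline in Figure 1 is shown below. It assumes a configured Google Cloud Speech-to-Text client and AWS credentials for Amazon Polly; the handle_utterance call is a hypothetical stand-in for our dialog manager, and record_turn from the earlier sketch could supply the audio bytes.

```python
# Sketch only: one conversational turn through the cloud-backed pipeline.
# handle_utterance() stands in for our dialog manager; credentials and
# client configuration are assumed to be set up already.
import boto3
from google.cloud import speech

stt_client = speech.SpeechClient()
polly = boto3.client("polly")


def transcribe(pcm_bytes: bytes) -> str:
    """Send raw 16 kHz PCM audio to Google Cloud Speech-to-Text."""
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    response = stt_client.recognize(
        config=config, audio=speech.RecognitionAudio(content=pcm_bytes)
    )
    return response.results[0].alternatives[0].transcript if response.results else ""


def speak(text: str) -> bytes:
    """Synthesize the agent's reply with Amazon Polly and return MP3 bytes."""
    result = polly.synthesize_speech(Text=text, OutputFormat="mp3", VoiceId="Joanna")
    return result["AudioStream"].read()


def one_turn(dialog_manager, pcm_bytes: bytes) -> bytes:
    """ASR, then dialog management, then TTS for a single user turn."""
    user_text = transcribe(pcm_bytes)
    agent_reply = dialog_manager.handle_utterance(user_text)  # hypothetical API
    return speak(agent_reply)
```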
To achieve natural-seeming interactions, we let users interrupt the agent, and we also let the agent interrupt users. The agent might interrupt a user mid-question, for example, to announce that a copy job has finished or that a problem has been detected (e.g., out of paper). In order to support delivery of unprompted information, the system triggers production rules in response to the current context and the current utterance. Initiation of a new task, which can happen even while another one is in progress, pushes the new task onto a stack.

There are several strategies within the dialog manager designed to maximize the chance of successfully accomplishing tasks Kamm [1995]. To address the problem of discoverability (see Section 2), and to assist users dealing with unexpected issues Myers et al. [2018], the agent has the ability to help in understanding the machine and how to interact with the conversational interface itself. Users can ask for descriptions of printer concepts and functions, including the options for each. A help system is available during the course of a task, and a how-to system can guide people through an unfamiliar task. The system recognizes when users encounter obstacles and provides a walkthrough skill where the agent can work through a process one step at a time. The agent may also employ fallback strategies that provide more context or information to guide users back on track. It also has the ability to direct users' physical interactions with the machine. It can answer questions like, "Where do I find the feeder?" and "Where can I find my copied document?"
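The production-rule and task-stack machinery might be sketched roughly as follows. The two rules shown (a paper-out announcement and a "where is" helper) and every class and function name here are illustrative assumptions, not our actual rule base.

```python
# Sketch only: context-triggered production rules plus a task stack.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DialogContext:
    utterance: str = ""                                       # latest user utterance
    printer_events: List[str] = field(default_factory=list)   # e.g. "out_of_paper"
    task_stack: List[str] = field(default_factory=list)       # active tasks, newest last


@dataclass
class ProductionRule:
    condition: Callable[[DialogContext], bool]
    action: Callable[[DialogContext], str]   # returns the agent's utterance


RULES = [
    # Unprompted information: interrupt to report a device problem.
    ProductionRule(
        condition=lambda ctx: "out_of_paper" in ctx.printer_events,
        action=lambda ctx: "I had to pause the job. Tray 1 is out of paper.",
    ),
    # Physical-orientation help can start a new (stacked) task at any time.
    ProductionRule(
        condition=lambda ctx: "where" in ctx.utterance.lower(),
        action=lambda ctx: ctx.task_stack.append("locate_feature")
        or "The document feeder is on top of the machine, toward the back.",
    ),
]


def respond(ctx: DialogContext) -> List[str]:
    """Fire every rule whose condition matches the current context."""
    return [rule.action(ctx) for rule in RULES if rule.condition(ctx)]
```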
The solution understands how to perform procedures. Accomplishing tasks in collaboration with a user requires maintaining a dialog state that represents the knowledge relevant to the current conversation. The state includes a list of all turns that have occurred up to the current point. Each turn includes the utterance and the list of dialog acts that have been parsed from it. Given a new utterance and its dialog acts, the agent updates its state with the new information. Only when utterances are unclear or ambiguous will the agent ask for confirmation before moving on. Once the state updates are complete, the agent processes the top task from its stack of goals. Given the current goal, the agent decides the next thing to ask or say. The agent will ask for any missing information if it is needed to proceed. Once all of the necessary information has been collected for a particular task, it will get a final confirmation that its understanding of the full job is correct. Once confirmed, it initiates the action on the printer and then reports the status results provided by the printer.
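To make the state-update loop concrete, here is a simplified sketch of how a copy task might track its slots, ask for missing information, and end with a final confirmation. The slot names, prompts, and two-slot task definition are illustrative assumptions; the real agent parses dialog acts from each utterance and supports many more options.

```python
# Sketch only: dialog state for a copy task with slot filling and a final
# confirmation before the printer is invoked.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CopyTask:
    required_slots: List[str] = field(default_factory=lambda: ["copies", "sides"])
    slots: Dict[str, str] = field(default_factory=dict)
    confirmed: bool = False

    def update(self, dialog_acts: Dict[str, str]) -> None:
        """Fold newly parsed dialog acts into the task state, whenever they arrive."""
        self.slots.update(dialog_acts)

    def next_prompt(self) -> Optional[str]:
        """Ask for the first missing slot, then ask for a final confirmation."""
        for slot in self.required_slots:
            if slot not in self.slots:
                return {
                    "copies": "How many copies would you like?",
                    "sides": "Should that be single-sided or double-sided?",
                }[slot]
        if not self.confirmed:
            return (f"Just to confirm: {self.slots['copies']} copies, "
                    f"{self.slots['sides']}. Shall I go ahead?")
        return None  # nothing left to ask; ready to invoke the printer
```

Once next_prompt returns None, the agent would submit the job to the MFP and then report back the status the printer returns.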
Diagnosis is necessary when users encounter problems during the course of a procedure. The dialog agent consults a diagnostic engine to get a list of recommendations to solve a problem. The diagnostic engine might offer a recommended next step or suggest a condition that should be checked. The conversational agent determines which recommendation to follow based on the current state of the dialog. For example, if a solution has already been tried, it can be eliminated from consideration. The agent continues to iterate until all the recommendations have been tried, the problem is fixed, or the user decides to stop the process.
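The diagnostic loop can be summarized roughly as below; the DiagnosticEngine interface and the ask/confirm helpers are assumptions made for the sketch, not our actual engine API.

```python
# Sketch only: walking the user through diagnostic recommendations until the
# problem is fixed, the suggestions run out, or the user stops.
from typing import Callable, Iterable, Protocol


class DiagnosticEngine(Protocol):
    def recommendations(self, problem: str) -> Iterable[str]: ...


def run_diagnosis(engine: DiagnosticEngine, problem: str,
                  ask: Callable[[str], None],
                  confirm: Callable[[str], bool]) -> bool:
    """Offer remedies one at a time; return True if the problem is resolved."""
    tried = set()
    for step in engine.recommendations(problem):
        if step in tried:
            continue                       # already attempted; eliminate it
        tried.add(step)
        ask(f"Please try this: {step}")    # spoken instruction to the user
        if confirm("Did that clear the problem?"):
            return True
        if confirm("Would you like to stop troubleshooting?"):
            return False
    ask("I'm out of suggestions. You may need to call for service.")
    return False
```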
User Study

We conducted a user study to assess the success of the solution for our target user population. We also wanted to know how users felt about the solution and how effective they thought it would be in their work. Nine of the eleven individuals from our focus group participated in the usability study. The specific goals of the study were to

1. Assess participants' ability to successfully complete a series of tasks on an MFP through verbal commands only,
2. Quantify the quality of experience for ease of use and a perceived sense of naturalness of the interaction, and
3. Analyze conversational task completion against that of a conventional tap/touch MFP.

We did not have a baseline for conventional tap/touch interfaces for comparison, so data regarding the last item was collected as part of a pre-test survey, which was administered before the usability study. The survey collected data regarding frequency of copier use by task, friction points, workarounds, task confidence, and task avoidance.

For the study, each participant performed tasks related to seven different scenarios. A test administrator read each scenario aloud. Once the scenario was understood, participants used an MFP equipped with our conversational agent to accomplish the goals suggested by the scenario. They did not have to operate any other controls on the printer, accomplishing the task through conversation only. So as not to prime users with the words known to be understood by the system, scenarios were intentionally constructed to omit a task's identifying "keywords," requiring participants to articulate their own way of performing tasks.
Figure 2: Distribution of scores for the first set of survey questions ("The agent understood me"; "I was able to easily follow the agent's prompts and responses"; "Mistakes I might have made were easy to correct"; "Mistakes made by the agent were easily gotten back on track"; "The agent was likable"; "The agent allowed me enough time to respond"), with responses ranging from Strongly Agree to Strongly Disagree.
As an example, the following scenario is designed to prompt a participant to ask the MFP for three copies of a given document without priming the words 'copy', 'document', etc.:

You are organizing a meeting for three coworkers. You want to hand out the meeting agenda at the meeting. Use the copier to accomplish your goal.

At the end of the scenario tests, participants answered questions from a post-test survey, which collected data regarding their perception of ease of use, ability to mitigate error, clarity, precision, benefits of voice over tactile interaction, frustrations, successes, efficiency, and quality of the overall experience. The survey included both open-ended questions for qualitative results and statements for which they ranked their level of agreement for quantitative results. Two examples of open-ended questions were "How would you describe the agent's personality?" and "What did you like least about using the conversational copier?"

For the quantitative statements, participants selected values from a Likert scale of 1 to 5 (corresponding to Strongly Agree, Agree, Neutral, Disagree, and Strongly Disagree, respectively) for the following statements:

• The agent understood me.
• I was able to easily follow the agent's prompts and responses.
• Mistakes I might have made were easy to correct.
• Mistakes made by the agent were easily gotten back on track.
• The agent spoke too fast.
• The agent spoke too slowly.
• The agent was able to summarize and confirm my request.
• The agent was repetitive.
• The agent was too wordy.
• The language the agent used was precise.
• The agent was likable.
• I knew how to ask for help if I needed it.
• If I needed help, the agent was helpful.
• At times, I was frustrated or impatient with the agent.
• I knew what to say to initiate a task.
• I knew when a task was successfully completed.
• I knew when the conversation for a task was over.
• The agent allowed me enough time to respond.

The overall success was apparent in the notable contrast between pre- and post-test survey results. In particular, copier-related tasks identified in the pre-test survey as difficult, time-consuming, or avoided (multiple-page documents, emailing, scanning, stapling, anything more than a simple one-page copy) were each successfully completed with the use of the conversational assistant. We present a brief overview of our findings here. Full test results can be made available.

In the first set of statements, participants were asked about the agent's behavior. The overwhelming majority of the participants found the agent likable (91%) and felt that it understood them (82%). A significant majority also thought that the agent allowed them enough time to respond (82%), and all the participants agreed that they were able to easily follow the prompts. Also, for most of the participants, the mistakes made by them or the agent were easy to correct, while about 10% of the participants experienced some difficulties getting back on track. See Figure 2 for the distribution of Likert responses for these statements.

Participants were asked to rate specific aspects of the agent's language. While a majority of the participants thought the language used by the agent was precise, about 40% of them also felt that the agent was repetitive. However, they insisted that the repetition or confirmation may be a positive attribute in this case, and agreed that the agent was able to summarize and confirm their request in all tasks. Most of the participants felt the speed of the delivery was right, while a small portion (about 10%) felt the agent spoke too slowly. Also, only a few participants (about 10%) felt the agent was too wordy.

Participants were then asked if they knew how to ask for help, and if the agent was helpful when they needed it. About 45% of the participants responded that they did not know how to ask for help, while only 55% of the participants felt the agent was helpful when they needed it. Also, about 60% of the participants felt frustrated or impatient with the agent at some point during the test. Moreover, about 75% of the participants agreed they knew what to say to initiate a task. While only 73% of the participants knew when the conversation for a task was over, all participants understood when a task itself was successfully completed.
Conclusion

We are encouraged by the results of the study and believe that high quality, task-based conversational interfaces are possible and can be a great benefit to the blind and others who might otherwise have limited access to different types of technology. There are several technical items that we were not able to accomplish because of the time we had available for this project as well as the limitations created by our architectural design; however, these are solvable given more time and different kinds of technical access to the MFP devices. The bigger problem we hope to address with future research is reducing the amount of time and expertise required to create these interactions. Modeling the equipment, the tasks, the users, and the conversations is time-consuming and requires people from a variety of disciplines. We are hoping to continue the cross-disciplinary research needed for rich, truly collaborative interactions between people and technology.
References
Martin Porcheron, Joel E. Fischer, Stuart Reeves, and Sarah Sharples. Voice interfaces in everyday life. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI '18, pages 640:1–640:12, New York, NY, USA, 2018. ACM. ISBN 978-1-4503-5620-6. doi: 10.1145/3173574.3174214. URL http://doi.acm.org/10.1145/3173574.3174214.

Barbara J. Grosz and Candace L. Sidner. Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175–204, July 1986. ISSN 0891-2017. URL http://dl.acm.org/citation.cfm?id=12457.12458.

Barbara J. Grosz. Focusing in dialog. In Proceedings of the 1978 Workshop on Theoretical Issues in Natural Language Processing, TINLAP '78, pages 96–103, Stroudsburg, PA, USA, 1978. Association for Computational Linguistics. doi: 10.3115/980262.980278. URL https://doi.org/10.3115/980262.980278.

Eric Corbett and Astrid Weber. What can I say?: Addressing user experience challenges of a mobile voice user interface for accessibility. In Proceedings of the 18th International Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '16, pages 72–82, New York, NY, USA, 2016. ACM. ISBN 978-1-4503-4408-1. doi: 10.1145/2935334.2935386. URL http://doi.acm.org/10.1145/2935334.2935386.

Nicole Yankelovich. How do users know what to say? interactions, 3(6):32–43, December 1996. ISSN 1072-5520. doi: 10.1145/242485.242500. URL http://doi.acm.org/10.1145/242485.242500.

Ali Abdolrahmani, Ravi Kuber, and Stacy M. Branham. "Siri talks at you": An empirical investigation of voice-activated personal assistant (VAPA) usage by individuals who are blind. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS '18, pages 249–258, New York, NY, USA, 2018. ACM. ISBN 978-1-4503-5650-3. doi: 10.1145/3234695.3236344. URL http://doi.acm.org/10.1145/3234695.3236344.

Timo Götzelmann, Lisa Branz, Claudia Heidenreich, and Markus Otto. A personal computer-based approach for 3D printing accessible to blind people. In Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments, PETRA '17, pages 1–4, New York, NY, USA, 2017. ACM. ISBN 978-1-4503-5227-7. doi: 10.1145/3056540.3064954. URL http://doi.acm.org/10.1145/3056540.3064954.

Kyle Dent, Luke Plurkowski, and John Maxwell. Collaborative human-machine interaction in mobile phone support centers: A case study. In Waldemar Karwowski and Tareq Ahram, editors, Intelligent Human Systems Integration, pages 557–563, Cham, 2018. Springer International Publishing. ISBN 978-3-319-73888-8.

Meira Chefitz, Jesse Austin-Breneman, and Nigel Melville. Designing conversational interfaces to reduce dissonance. In Proceedings of the 2018 ACM Conference Companion Publication on Designing Interactive Systems, DIS '18 Companion, pages 219–223, New York, NY, USA, 2018. ACM. ISBN 978-1-4503-5631-2. doi: 10.1145/3197391.3205439. URL http://doi.acm.org/10.1145/3197391.3205439.

C. Kamm. User interfaces for voice applications. Proceedings of the National Academy of Sciences, 92(22):10031–10037, 1995. ISSN 0027-8424. doi: 10.1073/pnas.92.22.10031.

Chelsea Myers, Anushay Furqan, Jessica Nebolsky, Karina Caro, and Jichen Zhu. Patterns for how users overcome obstacles in voice user interfaces. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI '18, pages 6:1–6:7, New York, NY, USA, 2018. ACM. ISBN 978-1-4503-5620-6. doi: 10.1145/3173574.3173580. URL http://doi.acm.org/10.1145/3173574.3173580.