"Hey Model!" -- Natural User Interactions and Agency in Accessible Interactive 3D Models
Samuel Reinders, Monash University, Melbourne
Matthew Butler, Monash University, Melbourne
Kim Marriott, Monash University, Melbourne
ABSTRACT
While developments in 3D printing have opened up opportunities for improved access to graphical information for people who are blind or have low vision (BLV), they can provide only limited detailed and contextual information. Interactive 3D printed models (I3Ms) that provide audio labels and/or a conversational agent interface potentially overcome this limitation. We conducted a Wizard-of-Oz exploratory study to uncover the multi-modal interaction techniques that BLV people would like to use when exploring I3Ms, and investigated their attitudes towards different levels of model agency. These findings informed the creation of an I3M prototype of the solar system. A second user study with this model revealed a hierarchy of interaction, with BLV users preferring tactile exploration, followed by touch gestures to trigger audio labels, and then natural language to fill in knowledge gaps and confirm understanding.
Author Keywords
3D printing; Accessibility; Multi-Modal Interaction; Agency
CCS Concepts • Human-centered computing → Accessibility;
INTRODUCTION
In the last decade there has been widespread interest in the use of 3D printed models to provide blind and low vision (BLV) people access to educational materials [50], maps and floor plans [20, 21], and for cultural sites [39]. While these models can contain braille labels, this is problematic because of the difficulty of 3D printing braille on a model, the need to introduce braille keys and legends if the labels are too long, and the fact that the majority of BLV people are not fluent braille readers. For this reason, many researchers have investigated interactive 3D printed models (I3Ms) with audio labels [18, 38, 16, 45, 21]. However, to date almost all research has focused on technologies for interaction, not on ascertaining the needs and desires of the BLV end-user and their preferred interaction strategies. The only research we are aware of
that directly addresses this question is that of Shi et al. [44]. They conducted a Wizard-of-Oz (WoZ) study with 12 BLV participants to elicit preferred user interaction for a pre-defined set of low-level tasks using three simple 3D models.

Here we describe two user studies that complement and extend this work. We investigate: (1) a wider range of interaction modalities, including the use of embodied conversational agents; (2) the desired level of model agency, that is, whether the model should only respond to explicit user interaction or proactively help the user; and (3) interaction with more complex models containing removable parts. Such models are common in STEM, e.g. anatomical models in which the organs are removable. We were interested in the interactions and level of model intervention desired by participants, particularly when reassembling the model.

Our first study, Study 1, was an open-ended WoZ study of I3Ms with eight BLV participants. It extends that of [44] in three significant ways. First, it uses a wider variety of models, some of which contain multiple components. Second, we asked the participants to use the model in any way they wished. This allowed us to elicit a broader range of input and output modalities and interactions. Third, each participant was presented with a low- and a high-agency model, allowing us to explore the impact of agency.

Study 2 was a follow-up study with six BLV participants which involved exploring a prototype I3M of the solar system, the design of which was informed by Study 1. It supported tactile exploration, tap-controlled audio labels, and a conversational interface supporting natural language questions. This study allowed us to confirm the findings of Study 1, whilst addressing a limitation of that study, where participant behaviour may have been biased due to a human providing audio feedback on behalf of the model. While synthesised audio output was actually controlled by the experimenter in Study 2, participants were unaware of this and believed the model was behaving autonomously.
Contributions: Our findings contribute to the understanding of the design space for I3Ms. In particular, we found that:

• Interaction modalities: Participants wished to use a mix of tactile exploration, touch-triggered passive audio labels, and natural language questions to obtain information from the model. They mainly wanted audio output, but vibration was also suggested.

• Independence and model-agency: Participants wished to be as independent as possible and establish their own interpretations. They wanted to initiate interactions with the model and generally preferred lower model agency; however, they did want the model to intervene if they did something wrong, such as placing a component in the wrong place.

• Conversational agent: Participants preferred more intelligent models that support natural language questions and which, when appropriate, could provide guidance to the user.

• Interaction strategy: We found a hierarchy of interaction modality. Most participants preferred to glean information and answer questions using tactile exploration, to then use touch-triggered audio labels for specific details, and finally to use natural language questions to obtain information not in the label or to confirm their understanding.

• Prior experience: Interaction choices were driven by participants' prior tactile and technological experiences.

• Multi-component models: Participants found models with multiple parts engaging and would remove parts to more readily compare them.
RELATED WORK
Accessible Graphics & 3D Models:
The prevalent methods to produce tactile accessible graphics include using braille embossers, printing onto micro-capsule swell paper, or using thermoform moulds [40]. Their main limitation is that they cannot appropriately convey height or depth [21], restricting the types of graphics that can be produced to those that are largely flat and two-dimensional in nature.

As a consequence, handmade models are sometimes used in STEM education and other disciplines that rely on concepts and information that is more three-dimensional in nature. However, while such models are uncommon due to difficulties in production and the costs involved [21], commodity 3D printing has seen the cost and effort required to produce 3D models fall in line with tactile graphics. 3D printing has been used to create accessible models in many contexts: resources to support special education [12]; tangible aids illustrating graphic design theory for BLV students [29]; graphs to teach mathematical concepts [10, 22]; programming course curriculum [24]; and 3D printed children's books [26, 46].

While 3D printing allows a broader range of accessible graphics to be produced, the low fidelity of 3D printed braille [47, 43] limits the amount of contextual information that can be conveyed on these models, and the updating of braille labels requires model reprinting. Thus, as for tactile graphics, the use of braille labels is problematic, especially considering that the majority of BLV people cannot read braille [32].
Interactive 3D Printed Models (I3Ms):
In order to overcome labelling limitations and to make more engaging accessible graphics, 3D printed models have been paired with low-cost electronics and smart devices to create interactive 3D printed models (I3Ms). The majority of studies, however, have focused on technological feasibility rather than usability. Götzelmann [19] created a smartphone application capable of detecting when a user pointed their finger at areas of 3D printed maps during tactile exploration, triggering auditory labels. This method only allowed the use of one hand to tactually explore the 3D prints, as it required the user to hold and point their smartphone camera at the print. Shi et al. [43] created Tickers, small percussion instruments that, when added to 3D prints, can be strummed and detected by a smartphone that triggers auditory descriptions. Testing, however, found that because the strummers distorted the model's appearance, they interfered with tactile exploration. Further work by Shi et al. [45] investigated how computer vision can be used to allow BLV people to freely explore and extract auditory labels from 3D prints, but this required affixed 3D markers to support tracking.

Very little research has investigated how BLV users would like to interact with I3Ms, with most studies offering only basic touch interaction [47, 18, 21]. A notable exception is Shi et al. [44], who conducted a WoZ study to examine input preferences and identified distinct techniques across three modalities: gestures, speech, and buttons. These findings were of considerable value; however, the study considered only three simple models, none of which featured detachable components, and focused on a pre-defined set of six generic low-level tasks involving information retrieval and audio note recording. Our research extends this by considering more complex, multi-component models, conversational agents, and the impact of model agency.

The role that auditory output can serve in I3Ms is also under-explored, with the majority of I3M research considering only passive auditory labels such as descriptions [18, 38, 16, 45, 21] and soundscape cues to add contextual detail [1, 11, 42]. Holloway et al. [21] gave preliminary guidelines to inform how auditory labels should be used in I3Ms, identifying that: (a) trigger points should not distort model appearance; (b) triggering should be a deliberate action; and (c) different gestures should be used to provide different levels of information. Co-designing I3Ms with teachers of BLV students, participants in Shi et al. [42] suggested that in addition to providing passive auditory labelling, I3Ms should allow users to ask the model questions about what it represents. However, this was not explored further in that study; doing so is a major contribution of our work.
Conversational and Intelligent Agents:
Allowing the user to ask questions in natural language is a different kind of interface to that usually considered for I3Ms. Advances in voice recognition and natural language processing have resulted in the widespread use of intelligent agents. In particular, intelligent conversational interfaces such as Siri (Apple), Alexa (Amazon) and Google Assistant are increasingly used in everyday life. Conversational interfaces have been studied in a variety of contexts, including health care [27], aging [31], physical activity [49, 8], and use by people with an intellectual disability [6].

With research finding that BLV people find voice interaction convenient [5], it comes as no surprise that adoption rates of devices that contain conversational agents, most of which support voice control and text input, are high amongst BLV people. Exploring the use of such devices with 16 BLV participants, Pradhan et al. [35] identified that 25% owned at least one device that included a conversational interface, and more broadly found that 15% of submitted reviews for Amazon Alexa-based devices described use by a person with a disability.

The functions offered through conversational agents are largely passive in nature, requiring the user to activate the agent and to request that a task be performed. However, advances in 'intelligent' agents mean that a device may be able to take on a more proactive role. Some agents support pedagogic functions to assist the learning experience by emulating human-human social interaction [30, 41]. Social agency theory suggests that when learners are presented with a human-human social interaction they become more engaged in the learning environment [30, 28]. An I3M with these characteristics could facilitate deeper engagement with the model and its wider context. To our knowledge, the integration of agents with higher levels of agency, including the capacity to act and intervene, into accessible 3D printed models has not been previously considered. It is also important to understand how increased agency in an I3M would be accepted, and whether it may even lead to a feeling of decreased agency for the end user.
Multi-Modal Interfaces:
The integration of multiple modalities, such as those mentioned above, can result in interfaces that are capable of communicating richer information. When designing interfaces for BLV people this is necessary, as other senses must be able to compensate for the absence of vision. Edwards et al. [15] described a 'bandwidth problem', wherein other senses are unable to match the capacity of vision unless multiple non-visual senses are utilised. In this context, multi-modal approaches have been used in the creation of assistive aids and tools that use combinations of tactile models with auditory output [34, 1], haptic feedback [33, 17], visual feedback for those with residual vision [18], and olfactory and gustatory perception [11].

Utilising multiple modalities also increases the adaptability of an interface [37], allowing a user to choose which modalities they want to interact with based upon context or ability. A user may be uncomfortable using an interface capable of speech input and output in a noisy environment due to detection problems, or because of privacy concerns [2], and may instead prefer another modality such as text input/output, while another user may not have the motor skills to perform gestural input and would instead choose speech input. Our study aims to create I3Ms that allow BLV people to choose their preferred modality where possible, and to uncover any variables that might impact the choice of modalities offered.
Wizard-of-Oz Experiment:
When creating new user interfaces, testing can take place before an interface is fully developed. One such method, Wizard-of-Oz (WoZ), used in our two studies, involves an end user interacting with an interface which to some degree is being operated by a 'wizard' providing functionality that is yet to be fully implemented [25]. By design, WoZ typically involves some level of deception, often by omission, where end users may not be aware that they are interacting with an incomplete interface. WoZ methods have been used extensively within HCI research, including the development of conversational agents [48], display interfaces [3] and human-robot interaction [23].

Within the context of interfaces designed for BLV people, in addition to the work of Shi et al. [44], WoZ has been used to explore non-visual web access [4], smartwatch interfaces [9] and social assistive aids [36]. While this participant group may be seen as more vulnerable to the illusion that a WoZ interface is fully implemented, many studies with BLV participants explicitly state that the involvement of some deception is integral to their WoZ experiment [4, 9, 36]. Our studies stay within acknowledged WoZ methods, with the true nature of the WoZ deception revealed to participants either at the start (Study 1) or conclusion (Study 2) to ensure transparency.
STUDY 1: INITIAL EXPLORATION
Our first study aimed to better understand which interaction strategies are most natural for BLV users of I3Ms and their preferred level of model agency. We employed a WoZ methodology with a 'wizard' providing auditory output on behalf of the model. WoZ allowed participants to interact with the 3D models in any way that felt natural, free of technological constraints, and allowed the researchers to readily manipulate the model's degree of agency.
Participants
Study 1 was undertaken with eight participants. They were recruited through the disability support services office of the researchers' home university campus, and through mailing lists of BLV support and advocacy groups. Demographic information is summarised in Table 1.
Table 1. Participant demographic information
Level of vision: legally blind (2 participants), totally blind (6).
Formats used: braille (7), audio (8), raised line (6), 3D models (4).
Familiarity with tactile graphics (out of 4): P1: 4, P2: 4, P3: 3, P4: 4, P5: 4, P6: 1, P7: 3, P8: 2.
Participation: Study 1 (all 8), Study 2 (6).
Participant ages were evenly spread, from early 20s to early 70s. All participants had smartphones and described using inbuilt conversational interfaces, such as Siri and Google Assistant, to fulfil a variety of tasks including reading and responding to messages, making phone calls, looking up public transit routes, and even asking to be told jokes. The experiments took place at a location convenient for the participant. Each experiment lasted approximately 1.5 to 2 hours.
Materials
Six 3D printed models representative of materials employed in the classroom or used in Orientation and Mobility (O&M) training were selected for use in the study. Potential models were chosen from a shortlist by the DIAGRAM 3D working group or designed with advice sought from the Australia & New Zealand Accessible Graphics Group (ANZAGG). The list of potential models was also informed by the researchers' previous work with the BLV community and BLV educators, with consideration given to which models are often requested by these stakeholders.

The six models used were chosen to vary in their application domain, the kinds of tasks they might support, and their complexity. Three contained removable components. These were intended to elicit desired interaction behaviours when components were removed or reassembled, as well as to determine whether participants removed components in order to compare them. The models are shown in Figure 1. They were: two bar charts with removable bars, representing the average temperature for each season in two cities; a model of an animal cell; a map of a popular Melbourne public park; a thematic map of Australia, with the height of each state corresponding to population; a model of a dissected frog, containing removable organs; and a model of the solar system, including removable planets.

Figure 1. 3D models used in Study 1 (left-to-right, top-to-bottom): a) Two bar charts of average city temperatures with removable bars; b) Animal cell; c) Park map; d) Thematic map of the population of Australia; e) Dissected frog with removable organs; f) Solar system with removable planets

Procedure
Each model was explored by at least two different participants, and most models were shown in both a low-agency and a high-agency mode.

Two researchers were present during the experiment: the facilitator and the wizard. At the beginning of the experiment the facilitator explained to participants that they would be given two magical models, one at a time, capable of hearing, seeing, talking, vibrating and sensing touch, and that the models would react accordingly to any way in which they chose to interact with them. It was explained to participants that one of the researchers would take on the role of the wizard. Participants were asked to verbalise any deliberate actions they made using the 'think-aloud' protocol [51], allowing the wizard to act on behalf of the model and provide auditory output through speech, and to verbalise other modalities when relevant.

When given a model, participants were asked to first explore and identify it in order to familiarise themselves with its features, and then to use the model for any purpose they wished. They were then asked to interact with the model in order to accomplish a predefined task. Tasks were model-specific and included navigating between a southern bench and a building found in the public park map, determining which state or territory has the largest area on the thematic map, and identifying the planet with the lowest density in the solar system model. After completing the model-specific tasks, participants were asked to design interaction techniques for a number of generic low-level tasks similar to those of [44]: accessing a description of the model; extracting information about a specific component; comparing two components; recording an audio note related to part of the model; and asking the model questions. Throughout these exercises the facilitator observed how the model was being handled, and the wizard acted appropriately to fulfil the interactions. If participants devised any task of their own, they were also encouraged to design associated interaction techniques.

Each participant completed the experiment with two models: a low-agency model followed by a high-agency model. When interacting with the low-agency model, feedback was provided by the wizard only when deliberately initiated by the participant, e.g. a participant indicates that when they tap the model they expect an auditory description or haptic vibration. With high-agency models, the wizard played a more proactive role, providing richer assistance including introducing the model when first touched, and intervening if the participant experienced difficulty during the experiment.

Once both models had been presented, the participant was taken through a semi-structured interview, allowing the facilitator to ask questions about the specific interactions the participant used, whether they felt comfortable interacting with the model, and whether the differing agency capabilities were useful. All interactions were captured on video and dialogue was audio recorded. All footage was transcribed, as were tactual interactions. The findings below are derived from both observation of the participants' interaction with the models and responses to the questions probing their behaviour.

Analysis and Results
An initial set of themes was derived from the experiment protocol. Two researchers independently conducted the initial coding of a data subset to arrive at the main set of themes, and then independently coded all data based on the agreed themes. After the data was coded by the researchers, they met to confirm the coding and reconcile any data that was coded differently. Formal quantitative measures were not undertaken as the coding focus was to extract themes in order to inform design principles. Themes were then consolidated into the final set presented below.
Preferred Interaction Techniques
All participants explored the models extensively with touch, at times for long stretches without any questioning or gestures. This was especially common in the initial exploration and was the dominant technique for identifying what the model was. As participants became either more comfortable or more confident with the model, other interaction techniques emerged.

Specific gestures were observed being used by six participants, often without the participant realising they were doing them. In particular, the gesture of double tapping on an element of the model was carried out either to explicitly expect a response, or in conjunction with a question to the model. One participant chose not to perform gestures, indicating that while they were capable of performing them, others may not be, depending on physical ability.

Voice commands were used by all participants. After initial tactile exploration was conducted, these were typically used to fill in gaps or clarify their understanding of specific features of the model. Dialogue was usually directed at the model in the form of a specific question.

Two participants, however, began to interact with the model in a more complex way in which multiple modalities were combined seamlessly, such as combining touch with conversationally provided detail. This appeared to be task-related, as both participants completed the route-finding task on the park map. P1 expressed this: "Okay so I am going to start at the southernmost edge [participant taps the south entry to the gardens], and being a magical model um... can you give me directions to the southernmost bench?" [P1].

Preferred Output Modalities
The overwhelming preference of participants was to extract information using only touch. This included understanding what the model was, its sub-components, and comparing sub-components. Verbal output was expected in response to the vast majority of questions asked of the model. Interactions such as those with a conversational agent were dominant, with the user gesturing or speaking to the model and expecting a verbal response in return.

While touch input and verbal output were most common, three participants also wished for haptic output. This was used as a way to quickly identify or find a specific component in the model, e.g. a location on the map or a particular planet in the solar system. The use of haptics was also raised as a way for the model to provide confirmation, for example to indicate that they were holding the correct sub-component, or had found the destination on a map. One participant asked: "Can you vibrate when I get to Earth?" [P4].

Exploration and Task Completion Strategies
All participants exhibited a well-defined strategy for their initial exploration of the model. While participants had varying degrees of experience with 3D prints and even tactile diagrams, the strategy of scanning model boundaries and then systematically exploring the different parts or areas was common amongst all participants. When questioned on this, some articulated their strategy clearly while others indicated that they did not have one, even when their behaviours implied that they did.

Strategies to complete tasks varied between participants and tasks; however, it was clear that participants preferred to use only their sense of touch where possible. For example, some tasks required the discovery of the largest model sub-part (e.g. planet, or bar within the chart), and participants typically would find tactual ways to compare, such as positioning their fingers to feel two bars at once. For this type of task, gestures or verbal questions were typically only used for confirmation. The presence of removable components in some models also prompted participants to remove a part and ask questions of it specifically. In comparison tasks, three participants removed parts and held one in each hand to compare.
Influence of Prior Experience
Six participants had significant experience with tactile diagrams, with many also having some experience with 3D models, albeit not as a primary presentation of information for them. This experience appeared to influence both their desire to focus on tactual information gathering early on, as well as their strategies for initial exploration.

Those who were confident smartphone users tended to use more directed gestures. These included typical gestures such as single, double and triple tap. No gestures were observed that deviated from established smartphone use. Two participants even used the gestures that smartphones use to control voice-over functions (e.g. two-finger tap), expecting similar behaviours to occur. P2 explained: "I think that is a good way to go because that knowledge that is out there in iPhone... mainstreaming that kind of technology is a great service to society" [P2].

Participants' experience with conversational agents did not appear to influence verbal interactions, in that it did not make them more or less likely to engage in dialogue with the model. When asked about comfort levels speaking with a model, there did not appear to be a link between experience and comfort. This appeared to be influenced more by where the model exploration was taking place (such as noisy or public environments). A number of the braille readers indicated that they would still like to have either braille on the model, a braille key, or braille instructions for use.
Design Choices
The choice of model content was a key aspect in stimulating questioning of the model. Questions would relate to things present (e.g. defined aspects that needed explaining) or elements that were absent (e.g. geographical landmarks on the thematic map). These findings support the guidelines presented by [44] regarding "Improving Tactile Information" and the importance of considering the tactual elements of the model itself, especially to promote inquiry.

The use of removable parts where appropriate proved to be a key design choice. The parts supported comparison (such as the size of planets), and also made for more compelling experiences: "Being able to pick them up? Yeah, I liked it... I like to be able to hold them in terms of the density and size, it is a bit hard to tell when you don't pick them up" [P2].

Tactile sensations mattered to many of the participants. Slight imperfections from the 3D printing process were often identified, and either the model or the researcher was questioned about them. Conversely, explicit differences in the materials used (such as softer filament for the planets or frog internals) were not commented on by seven participants. One participant, P8, was able to readily detect material differences between planets, and this prompted contextual questions exploring why this was the case. The use of different materials was somewhat subtle in the presented models, so differences may need to be more extreme to promote inquiry.

Some design choices caused confusion. Most notable was the lack of an equator on the Earth in the solar system model, and the inclusion of Saturn's ring, which was confused for the Earth's equator.
Agency and Independence
In general, independence was highly valued by participants. Independence emerged in two forms: independence of control, and independence of interpretation.

Regarding control, typically a participant wanted only to know how to interact with the model and then to explore independently, without model intervention. This was especially true for initial model exploration: "Look if it were a choice between it being over-helpful and having to ask for help, I would prefer to have to ask it for help than to have it just shouting at me that I am wrong" [P1]. Independence of interpretation was evident in the reluctance of participants to simply ask the model for information or the answer to a task, and a preference to use touch to build their own understanding: "I do like to sort of work through it myself and then sort of reach out if I need the help" [P3].

Opinion was divided on proactive model identification. Half of the participants seemed to like the chance to explore for themselves before being told by or asking the model what it represented, although when prompted some participants indicated a simple introduction by the model would be of benefit: "I would like to look at it first, and get an idea of what it is, which is kind of what I did, before I found out what it is meant to be." [P4]. Participants indicated that if they did something 'wrong', such as place a part in the wrong spot, or have the model oriented in an inappropriate direction, then proactive model intervention would be appreciated.

Participants appeared to prefer formal dialogue for the delivery of basic facts, and conversational dialogue for task solving and more exploratory activity. Some participants felt this increased the sense of engagement with the model: "I don't know, 'I am Earth', 'I am this, I am that', I quite like that, because it means you're interacting with it more..." [P2]. Some participants also embodied a character or personality into the model. The expression 'Mr Model' was used consistently by one participant when asking questions, and others used variations of that. Similarly, some embodied physical features onto some elements of the model. For example, with the solar system model two participants commented on their hesitance to handle the sun, with P4 speaking: "So, 'Mr Model, is this very very hot?' Ah I thought it was hot, I thought it felt a bit warm!" [P4].

Discussion
The results presented above have direct implications for the design of I3Ms.
Interaction modalities:
There are three clear modalities and interaction types: tactile with no model output; touch gesture with model output; and verbal with model output. The model was expected to provide verbal or haptic output. The greatest reliance was on touch. Information that was of a more defined nature (e.g. what an element is, or a basic trivial fact) was typically obtained through a gesture (such as a double tap), whereas information that was more complex or was not about a specific element of the model tended to be obtained through a question directly aimed at the model. At times this question was preceded by the tapping of a model sub-component (typically if it related to a particular part of the model), and at other times it was simply a question directed generally at the model. Thus, the model should provide true multi-modal interaction, allowing the user to indicate objects of interest with touch and to query them with natural language.
Conversational agent:
When interacting verbally with the model, participants treated the model as a conversational agent. As suggested above, users shifted to verbal interaction when seeking more detailed or complex information. The expression 'Mr Model' that was used by one participant is emblematic of the nature of dialogue that took place with the model. This is very consistent with formal greetings used to initiate dialogue with conversational agents and should influence the design of I3Ms.
Independence and model-agency:
Participants strongly desired both independence of control and independence of interpretation. Participants did not go for the 'easy option' of simply asking the model for the answer to every question; rather, they sought to discover the information through tactual means and deliberate triggering of information held by the model. Model intervention to correct their understanding or the incorrect placement of a sub-component were exceptions. Thus, I3Ms should employ a low level of model agency except when aiding the user if they are having trouble, typically with the removable parts.

Figure 2. Interactive solar system model used in Study 2
Prior experience:
Participants' prior experience appears to influence their choice of interactions. This is not only the influence of previous experiences with technologies such as smartphones and conversational interfaces, but also their tactile information gathering experiences more broadly. As such, any gestures implemented must align with touch interface standards. Verbal interactions that take place with conversational agents are often more open and as such have less influence on the design of I3Ms.
Multi-component models:
A novel aspect of this study was the inclusion of 3D models with removable parts. While some participants had trouble with reassembly of the more complex frog model, all participants valued their inclusion and none exhibited or mentioned any discomfort during reassembly. These proved to be a valuable design choice, as they promoted greater levels of interaction, engagement, and also inquiry of the model.
STUDY 2: VALIDATION
A second study focused on evaluating a more fully functional I3M that incorporated and validated the interaction techniques and agent functionality identified in Study 1.
Prototype I3M of the solar system
The solar system model was selected to undergo further prototyping in the creation of the I3M instantiation. This model was chosen for a number of reasons: it allowed the widest variety of interaction strategies and modalities to be implemented due to its complexity and number of removable components; the model was enjoyed immensely by all participants who were exposed to it in Study 1; and previous research has expressed a desire for higher engagement with accessible STEM materials [13]. One participant made explicit reference to the fact that they had not been able to engage with this content previously due to the inaccessibility of materials at school. Indeed, there seemed to be a knowledge gap regarding the solar system for most of the participants who used this model in Study 1. Based on the results of Study 1, it was determined that the prototype would support the following functionality and behaviours:

• Tapping gestures to extract auditory information: deliberate gestures such as a single tap to trigger the component name and a double tap to trigger a pre-recorded component description audio label (aligning with standard gesture interactions).
• Optional overview: ability to obtain an overview of the available model interactions, as well as a formal model introduction, by tapping dedicated model touch points.
• On/Off functionality: capability to turn auditory output of the model off and on using voice commands to suit the exploratory needs of the user (supporting independence).
• Braille labelling: labels identifying how to access an auditory overview of the prototype and instructions on how to interact with the model (supporting prior knowledge and information gathering techniques).
• Conversational agent interface: ability to ask questions by performing a long tap on one or more components or by using an activation phrase ('Hey Model!'), with the model responding accordingly to user questions (aligning with standard voice interactions).
• Model intervention: the model would assist the user if an incorrect action was performed during interactions (supporting preferred model intervention).

The model did not support vibratory feedback due to implementation considerations.

In order to support these functions, a single-board computer was paired with a capacitive touch board containing 12 touch sensors. Each touch sensor was wired to a screw which acted as a touch point. Nine of the touch points were embedded in 3D printed models that represented the Sun and the eight planets of the solar system. The electronics were mounted inside an enclosure constructed out of laser-cut acrylic sheets, with stands 3D printed and attached to the top of the enclosure to seat the planets in their correct order. Each planet was tethered using an insulated cable, allowing planets to be removed from their stand while still functioning. A software library was modified to detect when a single, double or long tap was performed on each touch point, and to trigger the playback of an associated audio label through a connected speaker. A single tap would trigger a recording of the component's name, and a double tap a recording of descriptive information, which for the planets included the following: planet name, order from the Sun, radius, type of planet (terrestrial/gas) and composition.

Two additional touch points were placed next to braille labels at the front of the model that read 'overview' and 'instruction'. The twelfth touch point was embedded in an additional 3D printed cube that was to be used as a training device.
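To make the tap-handling behaviour concrete, the sketch below shows one way the single/double/long tap detection and audio label playback described above could be implemented. It is illustrative only: the paper does not name the single-board computer, touch library or audio tooling used, so the polling helper, the `aplay` call, the timing thresholds and the label file names are all assumptions rather than details of the actual prototype.

```python
import subprocess
import time

NUM_TOUCH_POINTS = 12      # the prototype's capacitive touch board had 12 sensors
DOUBLE_TAP_WINDOW = 0.35   # assumed max gap (s) between taps to count as a double tap
LONG_TAP_THRESHOLD = 0.8   # assumed hold time (s) to count as a long tap

# Hypothetical mapping from touch-point index to pre-recorded audio labels
# (single tap -> component name, double tap -> descriptive information).
LABELS = {
    0: {"name": "audio/sun_name.wav", "description": "audio/sun_description.wav"},
    3: {"name": "audio/earth_name.wav", "description": "audio/earth_description.wav"},
    # ... one entry per planet and for the 'overview'/'instruction' touch points
}

def read_touch_states():
    """Stub: return a list of NUM_TOUCH_POINTS booleans (True = currently touched).
    The real prototype would poll its capacitive touch board here."""
    raise NotImplementedError

def play(label_file):
    """Play a pre-recorded audio label through the connected speaker (ALSA assumed)."""
    subprocess.run(["aplay", label_file], check=False)

def on_long_tap(point):
    """Hand off to the conversational interface for the touched component."""
    print(f"long tap on touch point {point}: listen for a question")

def run():
    press_time = {}      # touch point -> time the current contact began
    pending_single = {}  # touch point -> release time, awaiting the double-tap window
    prev = [False] * NUM_TOUCH_POINTS

    while True:
        states = read_touch_states()
        now = time.monotonic()
        for point, touched in enumerate(states):
            if point not in LABELS:
                continue
            if touched and not prev[point]:        # contact began
                press_time[point] = now
            elif not touched and prev[point]:      # contact ended: classify the tap
                held = now - press_time.get(point, now)
                if held >= LONG_TAP_THRESHOLD:
                    on_long_tap(point)
                elif point in pending_single and now - pending_single.pop(point) <= DOUBLE_TAP_WINDOW:
                    play(LABELS[point]["description"])   # double tap: descriptive label
                else:
                    pending_single[point] = now          # may become a single tap
            prev[point] = touched

        # Any pending tap whose double-tap window has lapsed is a single tap.
        for point, released in list(pending_single.items()):
            if now - released > DOUBLE_TAP_WINDOW:
                play(LABELS[point]["name"])              # single tap: component name
                del pending_single[point]

        time.sleep(0.01)
```

The key design point the sketch captures is that single taps are only confirmed once the double-tap window has lapsed, so that a quick second tap upgrades the gesture to the longer descriptive label rather than playing both recordings.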
Procedure
Six of the eight participants from Study 1 were available to take part in Study 2. Using the same participants across both studies was a deliberate methodological choice, as Study 2 focused on validating the findings of Study 1 by having participants interact with an instantiation that supported the preferences and behaviours observed in Study 1. Study 2 was a partial WoZ study, in which the prototype model provided most of the core functionality with the exception of the conversational interface. This was done to validate this functionality before fully implementing it, and it was provided using text-to-speech generated on the fly by the researcher. Responses to questions were generated using the same synthesised speech engine as the audio labels embedded into the model (a minimal sketch of this wizard setup is given at the end of this subsection). Participants believed the model was behaving autonomously; however, for transparency, the true nature of the WoZ implementation was revealed to participants at the conclusion of the study.

Participants were asked to explore the prototype model and undertake a number of researcher-directed tasks. For consistency between studies, the participant was afforded considerable time to explore the model before a series of guided tasks was presented. In contrast to Study 1, participants were trained in the use of the model, as the purpose of this study was not to identify natural interactions, but rather to validate those previously identified. Furthermore, the user was given more directed tasks to complete with the model. This was to give the participant the opportunity to utilise the full suite of features. The tasks that each user was required to complete were:

• Simple information gathering: exploring and identifying the model (T1), the order of the planets from the Sun (T2), and which of the planets were gas giants (T3).
• Comparison tasks: finding out the radius of two planets (T4), identifying the largest planet (T5), and which planet has the longest orbit (T6).
• Complex question answering: using information from T6 to establish the relationship between planet distance and orbital period (T7).
• Model reassembly: placing four planets back into their stands while the model confirms correct or incorrect placement (T8).

In order to force the use of different interaction modalities, T3 and T4 could not be answered using only touch, and T6 required the user to ask the model. The study concluded with questions regarding the participants' experiences with the model:

• General questions regarding engagement with the model, including: whether they enjoyed using the model, whether they learnt anything, and whether they thought similar models would have been useful during their education.
• Preference of interaction techniques and how they aligned with Study 1, including: why they gravitated towards a particular method of interaction, and whether there were any additional interactions they would like the model to support.
• Questions regarding the level of agency of the model. Elements were derived from the Godspeed questionnaire [7], used to measure users' perception of AI and robot interaction. This included: perceived competence and intelligence of the model, and whether participants found the level of intervening assistance provided useful.
• Satisfaction with the level of independence afforded, whether they felt in control during interactions, and how this aligned with their interaction technique preferences.
• Inquiry into participant comfort when undertaking dialogue and conversing with the model, and self-identified safety and emotions during these interactions.
• Overall satisfaction with the design, and preference of the model over more traditional graphical representations of information.

Where relevant, participants' preferences in Study 1 were raised for direct comparison.
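The wizard side of the conversational interface described above might look something like the minimal console sketched below: the researcher types a reply and it is rendered in a synthesised voice, so answers appear to come from the model itself. The espeak command-line synthesiser and its voice and speed parameters are stand-in assumptions; the paper does not specify which text-to-speech engine was used.

```python
import subprocess

def speak(text, voice="en", speed_wpm=160):
    """Render the wizard's typed reply with a synthesised voice.
    espeak is an assumed stand-in for whatever engine produced the model's labels."""
    subprocess.run(["espeak", "-v", voice, "-s", str(speed_wpm), text], check=False)

if __name__ == "__main__":
    # The researcher listens to the participant's question, types a reply,
    # and the reply is spoken aloud as if generated autonomously by the model.
    print("Wizard console: type the model's reply and press Enter (Ctrl+C to quit).")
    while True:
        reply = input("model> ").strip()
        if reply:
            speak(reply)
```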
Experiment Conditions
Each of the participants was taught how to use the model with a simple training interface. The interface taught the range of gesture interactions, as well as the two natural language interfaces (long tap and "Hey Model!"). The order of introduction to these techniques was counterbalanced in order to remove bias. Similarly, some tasks were counterbalanced, namely Tasks 4 and 5 ('Tell me the radius' and 'Which is the biggest?'). The experiments ranged in time from 1 to 2 hours, primarily determined by the level of engagement of the participant.
Results
Video and audio were recorded, and all video was transcribed. All transcriptions were coded by two researchers, using the same themes as identified in Study 1 for direct comparison and validation of the interaction techniques from that stage. In reporting the results, focus is placed on the alignment of preferred input and output modalities, and model agency.
Task Completion
Interaction modalities for the tasks are summarised in Table 2.

• T1: All participants began by tactually exploring the model before performing tap gestures on the various components of the model that revealed what the model represented. Participants continued interacting with the model, with three activating natural language and asking for further information: "... it doesn't mention the rings... Hey Model, is there more information on Saturn?" [P5]
• T2: Five participants tactually explored each planet before using single tap gestures to identify each of them. One participant, P5, chose not to use tap gestures as they felt confident in their knowledge of the planets and instead named each while touching the corresponding part of the model.
• T3: Five participants used tap gestures to determine whether each planet was a gas giant, performing a double tap to trigger a planet's audio label. One participant instead simply asked "Hey Model, can you identify the gas planets?" [P6]
• T4: Five participants used tap gestures to determine the radius of Uranus and Neptune, performing a double tap to trigger each planet's audio label. P6 again chose to rely upon natural language questioning and asked the model.

Table 2. Applicable modalities for each task and the number of participants (of six) who used each

Task | Applicable modalities | Used by
T1 | Touch / Tap / Natural language | 6 / 6 / 4
T2 | Touch / Tap / Natural language | 6 / 5 / 1
T3 | Tap / Natural language | 5 / 1
T4 | Tap / Natural language | 5 / 1
T5 | Touch / Tap / Natural language | 6 / 3 / 0
T6 | Natural language | 6
T7 | Natural language | 2
T8 | Touch / Tap / Natural language | 6 / 6 / 1

• T5: All participants were able to correctly identify that Jupiter was the largest planet using tactual exploration, comparing the size of each planet. Three participants performed double tap gestures to confirm their answers by listening to a planet's audio label: "I will just double check Saturn which is next... Yep Jupiter is the largest..." [P3]
• T6: Five participants engaged in natural language questioning in order to determine which planet had the longest orbit time, with one participant narrating "Now it hasn't yet given me that so... let's ask... Hey Model, which planet takes the longest to... orbit the Sun?" [P4]. P1 was confident they knew that Neptune had the longest orbit, but was unsure of the exact duration and engaged natural language through a long tap and asked "What is the orbit time of Neptune?"
• T7: Using information uncovered in T6, four participants were able to establish the relationship between the distance of a planet from the Sun and its time of orbit without interacting with the model. The others queried the model.
• T8: All participants started by tactually scanning the empty planet stands to determine sizes. Three participants then searched for the largest unseated planet. All participants performed tap gestures to identify each planet they picked up before inserting it into a stand. Three participants were able to correctly place all four unseated planets without any mistakes, with the remaining three participants requiring model intervention. P6 found the assistance provided by the model useful when placing a planet in the correct stand, stating "That is what I wanted to know, fabulous".

Post-Task Questions
Engagement with the model: All six participants indicated that they enjoyed interacting with the model, citing how it gave them "A better idea of how it all works and everything" [P4] and how it was "Sort of fun because you could move them around and feel the different sizes" [P3]. Five participants felt they learnt new knowledge, ranging from the general size of the planets, to their radii and which were gas giants. All participants agreed that it would have been useful to have access to similar models during their education, with P2 highlighting that "It would have been very engaging, especially in Year 7" and other participants suggesting possible uses when teaching anatomy, physics and chemistry.
Interaction techniques: Preference of interaction techniques largely aligned with the interaction strategies used in Study 1. All six participants outlined that touch was important, with P2 suggesting that they gravitated towards touch because "I know it is built to be touched ... touch is very important to me...". When extracting information, four participants spoke of how they would move to natural language when they were unable to extract the information they desired using tap gestures: "When the information isn't available through tap I like being able to extend that information by being able to ask questions" [P4]. Additionally, despite the text-to-speech interface involving a slight delay of a few seconds, all participants believed that the model was behaving autonomously.
Agency of the model: When placing the planets back into their stands, all six participants found the level of intervening assistance provided useful. Two participants said that the model having this level of agency allowed them to complete the task faster. P3 suggested that a model's level of agency should be controllable: "Maybe if you could turn it on or off, because you might not always want it?". When asked if they would like the model to have an embodied role, participants were divided, with three suggesting a teaching role may be useful with school children.
Level of independence: All six participants indicated that they largely felt in control of their interactions during their time with the model. This aligned with the desire for an independent experience found in Study 1. P4 spoke of how the conversational interface supported their desire to explore independently: "[the model has] given to me what a sighted person would give me if they were helping me to look at something, so it makes the whole experience a lot more independent". One participant even connected independence with knowing how to interact with the model, suggesting that "... showing me the different gestures to start with is very important" [P2].

Participant comfort: Five participants spoke of how they felt comfortable asking the model questions, with two suggesting this may be due to prior experience with voice activation: "Just as intuitive as asking any other sort of voice assistant questions" [P1]. P4 said that they "Felt confident asking a question, [but weren't] confident in how to phrase questions" [P4], which was echoed by P5. When asked to place how they felt on a scale of agitated to neutral to calm during interactions, all six participants rated their experience as "calm".
Overall satisfaction: Participants were asked to choose their preference from a) models that only contain braille, b) those that also output speech through tap gestures, c) models that further understand speech and answer questions when asked or tapped, or d) a conversational interface that answers questions with no attached physical model. There was strong agreement amongst all six participants towards c), which aligned with the solar system prototype. P1 spoke of how "With a model like this you are able to have more information than with a dumb model", while P2 suggested it was "Catering to all sorts of abilities". All participants suggested a physical model was necessary and that the experience wouldn't be as satisfying without one, because "It is not multi-modal and everyone learns better multi-modal" [P5] and because physical models make things "More playful, more fun" [P2].
Discussion
The findings from Study 2 aligned strongly with those in Study 1.
Interaction modalities:
Table 2 confirms the findings of Study 1, in that multiple modalities are used throughout exploration and task completion. While there is a tendency towards simple touch interaction, all participants used tap-initiated auditory information, as well as natural language.
Conversational agent:
Participants felt that the model exhibited some level of intelligence, and were comfortable engaging in natural language dialogue, even though it was the least preferred modality of interaction. The expression "Hey Model!" was also seen as feeling natural when initiating dialogue with the model.
Independence and model-agency:
Study 2 further confirmed participants' wishes to be in control of their experience and to assemble their own understandings.
Prior experience:
Prior tactile experience emerged as a strong factor in Study 2. Of the six participants, P6 had the least tactile experience, in part due to the age of onset of vision loss as well as opportunity. P6 exhibited a different interaction strategy, using natural language earlier and more often than the others. P6 also engaged in less complex touch exploration. Regarding experience with technology, the use of tap gestures in the prototype was well understood, with only minor sensitivity issues. This also extended to the natural language interface, with multiple participants noting that the latency involved when asking questions was similar to that experienced when using other conversational agents.
Multi-component models:
As with Study 1, having multiple components in the model promoted greater engagement with it, as well as encouraging deeper levels of inquiry.
Interaction strategy:
A key finding of Study 2 was that a clear hierarchy of interaction emerged when BLV people engaged with I3Ms:

1. Tactile Exploration – This appears to be seen by participants as supporting the greatest level of independence in both control and interpretation. It is also likely the most familiar information gathering technique for this cohort (i.e. experience with tactile diagrams, etc.) and allows the user to come to their own conclusions.

2. Gesture-Driven Enquiry – This is the extraction of information about the model or a component of the model through tapping and other deliberate gestures. It is used to elicit low-level 'factual' information, and is used when it is not possible to obtain this through touch.

3. Natural Language Interrogation – While participants are reasonably comfortable asking questions of the model, conversation with the model is largely used to fill gaps in knowledge, or to confirm understanding.
CONCLUSION
In this paper we have presented two user studies investigating blind and low vision (BLV) people's preferred interaction techniques and modalities for interactive 3D printed models (I3Ms). Study 1 utilised a Wizard-of-Oz methodology, and Study 2 confirmed the results by evaluating the use of a prototype I3M of the solar system, the design of which was informed by the findings of Study 1.

We found that participants wished to use a mix of tactile exploration, touch-triggered passive audio labels, and natural language questioning to obtain information from the model, with a mix of audio and haptic output. They enjoyed engaging with models that had multiple parts, and would remove parts to further explore and compare them.

When talking to the model, participants treated it as a conversational agent and indicated that they preferred more intelligent models that support natural language and which, when appropriate, could provide guidance to the user. Participants wished to be as independent as possible and to establish their own interpretations. They wanted to initiate interactions with the model and generally preferred lower model agency. However, they did want the model to intervene if they did something wrong, such as placing a component in the wrong place.

The desire for independent exploration led to a hierarchy of interaction modalities: most participants preferred to glean information and answer questions using tactile exploration, then to use touch-triggered audio labels for specific details, and finally to use natural language questions to obtain information not in the label or to confirm their understanding. However, interaction choices were driven by participants' prior tactile and technological experiences.

Not only are these findings of significance to the assistive technology community, they also have wider implications for HCI. In particular, the combination of I3Ms with conversational agents suggests a radically new kind of embodied conversational agent, one that is physically embodied and that can be perceived tactually by a BLV person, rather than the traditional embodied conversational agent that is perceived visually and has a human-like appearance [14]. Such physically embodied conversational agents raise many interesting research questions, including their perceived agency, autonomy and acceptance by the end user. There are also many questions to be answered on how such agents can be implemented. A major focus of our future research will be to design and construct a fully functional prototype, conduct more extensive user evaluations with a variety of models, including maps, and to explore whether model agency preferences differ with age and environment.
ACKNOWLEDGMENTS
This research was supported by an Australian Government Research Training Program (RTP) Scholarship. Additionally, we thank all the research participants for their time and expertise.
REFERENCES
[1] Nazatul Naquiah Abd Hamid and Alistair D.N. Edwards. 2013. Facilitating Route Learning Using Interactive Audio-tactile Maps for Blind and Visually Impaired People. In CHI '13 Extended Abstracts on Human Factors in Computing Systems (CHI EA '13). ACM, New York, NY, USA, 37–42. DOI: http://dx.doi.org/10.1145/2468356.2468364
[2] Ali Abdolrahmani, Ravi Kuber, and Stacy M. Branham. 2018. "Siri Talks at You": An Empirical Investigation of Voice-Activated Personal Assistant (VAPA) Usage by Individuals Who Are Blind. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '18). ACM, New York, NY, USA, 249–258. DOI: http://dx.doi.org/10.1145/3234695.3236344
[3] David Akers. 2006. Wizard of Oz for Participatory Design: Inventing a Gestural Interface for 3D Selection of Neural Pathway Estimates. In CHI '06 Extended Abstracts on Human Factors in Computing Systems (CHI EA '06). ACM, New York, NY, USA, 454–459. DOI: http://dx.doi.org/10.1145/1125451.1125552
[4] Vikas Ashok, Yevgen Borodin, Svetlana Stoyanchev, Yuri Puzis, and I. V. Ramakrishnan. 2014. Wizard-of-Oz Evaluation of Speech-driven Web Browsing Interface for People with Vision Impairments. In Proceedings of the 11th Web for All Conference (W4A '14). ACM, New York, NY, USA, Article 12, 9 pages. DOI: http://dx.doi.org/10.1145/2596695.2596699
[5] Shiri Azenkot and Nicole B. Lee. 2013. Exploring the Use of Speech Input by Blind People on Mobile Devices. In Proceedings of the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '13). ACM, New York, NY, USA, Article 11, 8 pages. DOI: http://dx.doi.org/10.1145/2513383.2513440
[6] Saminda Sundeepa Balasuriya, Laurianne Sitbon, Andrew A. Bayor, Maria Hoogstrate, and Margot Brereton. 2018. Use of Voice Activated Interfaces by People with Intellectual Disability. In Proceedings of the 30th Australian Conference on Computer-Human Interaction (OzCHI '18). ACM, New York, NY, USA, 102–112. DOI: http://dx.doi.org/10.1145/3292147.3292161
[7] Christoph Bartneck, Dana Kulić, Elizabeth Croft, and Susana Zoghbi. 2009. Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots. International Journal of Social Robotics 1, 1 (01 Jan 2009), 71–81. DOI: http://dx.doi.org/10.1007/s12369-008-0001-3
[8] Timothy W. Bickmore, Daniel Schulman, and Candace Sidner. 2013. Automated interventions for multiple health behaviors using conversational agents. Patient Education and Counseling 92, 2 (2013), 142–148. DOI: http://dx.doi.org/10.1016/j.pec.2013.05.011
[9] Syed Masum Billah, Vikas Ashok, and IV Ramakrishnan. 2018. Write-it-Yourself with the Aid of Smartwatches: A Wizard-of-Oz Experiment with Blind People. ACM, New York, NY, USA, 427–431. DOI: http://dx.doi.org/10.1145/3172944.3173005
[10] Craig Brown and Amy Hurst. 2012. VizTouch: Automatically Generated Tactile Visualizations of Coordinate Spaces. In Proceedings of the Sixth International Conference on Tangible, Embedded and Embodied Interaction (TEI '12). ACM, New York, NY, USA, 131–138. DOI: http://dx.doi.org/10.1145/2148131.2148160
[11] Emeline Brule, Gilles Bailly, Anke Brock, Frederic Valentin, Grégoire Denis, and Christophe Jouffrais. 2016. MapSense: Multi-Sensory Interactive Maps for Children Living with Visual Impairments. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16). ACM, New York, NY, USA, 445–457. DOI: http://dx.doi.org/10.1145/2858036.2858375
[12] Erin Buehler, Niara Comrie, Megan Hofmann, Samantha McDonald, and Amy Hurst. 2016. Investigating the Implications of 3D Printing in Special Education. ACM Trans. Access. Comput. 8, 3, Article 11 (March 2016), 28 pages. DOI: http://dx.doi.org/10.1145/2870640
[13] Matthew Butler, Leona Holloway, Kim Marriott, and Cagatay Goncu. 2017. Understanding the graphical challenges faced by vision-impaired students in Australian universities. Higher Education Research & Development 36, 1 (2017), 59–72. DOI: http://dx.doi.org/10.1080/07294360.2016.1177001
[14] Justine Cassell. 2000. More than just another pretty face: Embodied conversational interface agents. Commun. ACM 43, 4 (2000), 70–78.
[15] Alistair D. N. Edwards, Nazatul Naquiah Abd Hamid, and Helen Petrie. 2015. Exploring Map Orientation with Interactive Audio-Tactile Maps. In Human-Computer Interaction – INTERACT 2015, Julio Abascal, Simone Barbosa, Mirko Fetter, Tom Gross, Philippe Palanque, and Marco Winckler (Eds.). Springer International Publishing, Cham, 72–79.
[16] Stéphanie Giraud, Anke M Brock, Marc J-M Macé, and Christophe Jouffrais. 2017. Map learning with a 3D printed interactive small-scale model: Improvement of space and text memorization in visually impaired students. Frontiers in Psychology 8, 930 (2017). DOI: http://dx.doi.org/10.3389/fpsyg.2017.00930
[17] Cagatay Goncu and Kim Marriott. 2011. GraVVITAS: Generic Multi-touch Presentation of Accessible Graphics. In Human-Computer Interaction – INTERACT 2011, Pedro Campos, Nicholas Graham, Joaquim Jorge, Nuno Nunes, Philippe Palanque, and Marco Winckler (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 30–48.
[18] Timo Götzelmann. 2016. LucentMaps: 3D Printed Audiovisual Tactile Maps for Blind and Visually Impaired People. In Proc. ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '16). ACM, 81–90. DOI: http://dx.doi.org/10.1145/2982142.2982163
[19] Timo Götzelmann and Aleksander Pavkovic. 2014. Towards Automatically Generated Tactile Detail Maps by 3D Printers for Blind Persons. In Computers Helping People with Special Needs, Klaus Miesenberger, Deborah Fels, Dominique Archambault, Petr Peňáz, and Wolfgang Zagler (Eds.). Springer International Publishing, 1–7. DOI: http://dx.doi.org/10.1007/978-3-319-08599-9_1
[20] Jaume Gual, Marina Puyuelo, Joaquim Lloverás, and Lola Merino. 2012. Visual Impairment and urban orientation. Pilot study with tactile maps produced through 3D Printing. Psyecology 3, 2 (2012), 239–250.
[21] Leona Holloway, Matthew Butler, and Kim Marriott. 2018. Accessible maps for the blind: Comparing 3D printed models with tactile graphics. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '18). ACM. DOI: http://dx.doi.org/10.1145/3173574.3173772
[22] Michele Hu. 2015. Exploring New Paradigms for Accessible 3D Printed Graphs. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '15). ACM, New York, NY, USA, 365–366. DOI: http://dx.doi.org/10.1145/2700648.2811330
[23] Peter H. Kahn, Nathan G. Freier, Takayuki Kanda, Hiroshi Ishiguro, Jolina H. Ruckert, Rachel L. Severson, and Shaun K. Kane. 2008. Design Patterns for Sociality in Human-robot Interaction. In Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction (HRI '08). ACM, New York, NY, USA, 97–104. DOI: http://dx.doi.org/10.1145/1349822.1349836
[24] Shaun K. Kane and Jeffrey P. Bigham. 2014. Tracking @stemxcomet: teaching programming to blind students via 3D printing, crisis management, and twitter. In Proceedings of the 45th ACM technical symposium on Computer science education. ACM, 247–252.
[25] J. F. Kelley. 1984. An Iterative Design Methodology for User-friendly Natural Language Office Information Applications. ACM Trans. Inf. Syst. 2, 1 (Jan. 1984), 26–41. DOI: http://dx.doi.org/10.1145/357417.357420
[26] Jeeeun Kim and Tom Yeh. 2015. Toward 3D-printed movable tactile pictures for children with visual impairments. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. ACM, 2815–2824.
[27] Liliana Laranjo, Adam G Dunn, Huong Ly Tong, Ahmet Baki Kocaballi, Jessica Chen, Rabia Bashir, Didi Surian, Blanca Gallego, Farah Magrabi, Annie Y S Lau, and Enrico Coiera. 2018. Conversational agents in healthcare: a systematic review. Journal of the American Medical Informatics Association 25, 9 (07 2018), 1248–1258. DOI: http://dx.doi.org/10.1093/jamia/ocy072
[28] Richard Mayer, Kristina Sobko, and Patricia D. Mautone. 2003. Social Cues in Multimedia Learning: Role of Speaker's Voice. Journal of Educational Psychology 95 (06 2003), 419–425. DOI: http://dx.doi.org/10.1037/0022-0663.95.2.419
[29] Samantha McDonald, Joshua Dutterer, Ali Abdolrahmani, Shaun K. Kane, and Amy Hurst. 2014. Tactile Aids for Visually Impaired Graphical Design Education. In Proceedings of the 16th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '14). ACM, New York, NY, USA, 275–276. DOI: http://dx.doi.org/10.1145/2661334.2661392
[30] Roxana Moreno, Richard E. Mayer, Hiller A. Spires, and James C. Lester. 2001. The Case for Social Agency in Computer-Based Teaching: Do Students Learn More Deeply When They Interact With Animated Pedagogical Agents? Cognition and Instruction 19, 2 (2001), 177–213. DOI: http://dx.doi.org/10.1207/S1532690XCI1902_02
[31] Svetlana Nikitina, Sara Callaioli, and Marcos Baez. 2018. Smart Conversational Agents for Reminiscence. In Proceedings of the 1st International Workshop on Software Engineering for Cognitive Services (SE4COG '18). ACM, New York, NY, USA, 52–57. DOI: http://dx.doi.org/10.1145/3195555.3195567
[32] National Federation of the Blind. 2009. The Braille Literacy Crisis in America: Facing the Truth, Reversing the Trend, Empowering the Blind. (2009). Available from https://nfb.org/images/nfb/documents/pdf/braille_literacy_report_web.pdf.
[33] Helen Petrie, Christoph Schlieder, Paul Blenkhorn, Gareth Evans, Alasdair King, Anne-Marie O'Neill, George T. Ioannidis, Blaithin Gallagher, David Crombie, Rolf Mager, and Maurizio Alafaci. 2002. TeDUB: A System for Presenting and Exploring Technical Drawings for Blind People. In Computers Helping People with Special Needs, Klaus Miesenberger, Joachim Klaus, and Wolfgang Zagler (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 537–539.
[34] Benjamin Poppinga, Charlotte Magnusson, Martin Pielot, and Kirsten Rassmus-Gröhn. 2011. TouchOver Map: Audio-tactile Exploration of Interactive Maps. In Proceedings of the 13th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI '11). ACM, New York, NY, USA, 545–550. DOI: http://dx.doi.org/10.1145/2037373.2037458
[35] Alisha Pradhan, Kanika Mehta, and Leah Findlater. 2018. "Accessibility Came by Accident": Use of Voice-Controlled Intelligent Personal Assistants by People with Disabilities. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18). ACM, New York, NY, USA, Article 459, 13 pages. DOI: http://dx.doi.org/10.1145/3173574.3174033
[36] Joshua Rader, Troy McDaniel, Artemio Ramirez, Shantanu Bala, and Sethuraman Panchanathan. 2014. A Wizard of Oz Study Exploring How Agreement/Disagreement Nonverbal Cues Enhance Social Interactions for Individuals Who Are Blind. In HCI International 2014 – Posters' Extended Abstracts, Constantine Stephanidis (Ed.). Springer International Publishing, Cham, 243–248.
[37] Leah M. Reeves, Jennifer Lai, James A. Larson, Sharon Oviatt, T. S. Balaji, Stéphanie Buisine, Penny Collings, Phil Cohen, Ben Kraal, Jean-Claude Martin, Michael McTear, TV Raman, Kay M. Stanney, Hui Su, and Qian Ying Wang. 2004. Guidelines for Multimodal User Interface Design. Commun. ACM 47, 1 (Jan. 2004), 57–59. DOI: http://dx.doi.org/10.1145/962081.962106
[38] Andreas Reichinger, Anton Fuhrmann, Stefan Maierhofer, and Werner Purgathofer. 2016. Gesture-Based Interactive Audio Guide on Tactile Reliefs. In Proceedings of the 18th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '16). ACM, New York, NY, USA, 91–100. DOI: http://dx.doi.org/10.1145/2982142.2982176
[39] V. Rossetti, F. Furfari, B. Leporini, S. Pelagatti, and A. Quarta. 2018. Enabling Access to Cultural Heritage for the visually impaired: an Interactive 3D model of a Cultural Site. Procedia Computer Science 130 (2018), 383–391.
[40] Jonathan Rowell and Simon Ungar. 2003. The world of touch: an international survey of tactile maps. Part 1: production. British Journal of Visual Impairment 21, 3 (2003), 98–104. DOI: http://dx.doi.org/10.1177/026461960302100303
[41] Noah L. Schroeder, Olusola O. Adesope, and Rachel Barouch Gilbert. 2013. How Effective are Pedagogical Agents for Learning? A Meta-Analytic Review. Journal of Educational Computing Research. DOI: http://dx.doi.org/10.2190/EC.49.1.a
[42] Lei Shi, Holly Lawson, Zhuohao Zhang, and Shiri Azenkot. 2019. Designing Interactive 3D Printed Models with Teachers of the Visually Impaired. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Article 197, 14 pages. DOI: http://dx.doi.org/10.1145/3290605.3300427
[43] Lei Shi, Idan Zelzer, Catherine Feng, and Shiri Azenkot. 2016. Tickers and Talker: An accessible labeling toolkit for 3D printed models. In Proceedings of the 34th Annual ACM Conference on Human Factors in Computing Systems (CHI '16). DOI: http://dx.doi.org/10.1145/2858036.2858507
[44] Lei Shi, Yuhang Zhao, and Shiri Azenkot. 2017a. Designing Interactions for 3D Printed Models with Blind People. In Proceedings of the 19th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '17). ACM, New York, NY, USA, 200–209. DOI: http://dx.doi.org/10.1145/3132525.3132549
[45] Lei Shi, Yuhang Zhao, and Shiri Azenkot. 2017b. Markit and Talkit: A Low-Barrier Toolkit to Augment 3D Printed Models with Audio Annotations. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology (UIST '17). ACM, New York, NY, USA, 493–506. DOI: http://dx.doi.org/10.1145/3126594.3126650
[46] Abigale Stangl, Chia-Lo Hsu, and Tom Yeh. 2015. Transcribing Across the Senses: Community Efforts to Create 3D Printable Accessible Tactile Pictures for Young Children with Visual Impairments. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility (ASSETS '15). ACM, New York, NY, USA, 127–137. DOI: http://dx.doi.org/10.1145/2700648.2809854
[47] Brandon T. Taylor, Anind K. Dey, Dan P. Siewiorek, and Asim Smailagic. 2015. TactileMaps.net: A web interface for generating customized 3D-printable tactile maps. In Proc. ACM SIGACCESS Conference on Computers & Accessibility. ACM, 427–428. DOI: http://dx.doi.org/10.1145/2700648.2811336
[48] Alexandra Vtyurina and Adam Fourney. 2018. Exploring the Role of Conversational Cues in Guided Task Support with Virtual Assistants. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18). ACM, New York, NY, USA, Article 208, 7 pages. DOI: http://dx.doi.org/10.1145/3173574.3173782
[49] Alice Watson, Timothy Bickmore, Abby Cange, Ambar Kulshreshtha, and Joseph Kvedar. 2012. An Internet-Based Virtual Coach to Promote Physical Activity Adherence in Overweight Adults: Randomized Controlled Trial. J Med Internet Res 14, 1 (26 Jan 2012), e1. DOI: http://dx.doi.org/10.2196/jmir.1629
[50] Henry B Wedler, Sarah R Cohen, Rebecca L Davis, Jason G Harrison, Matthew R Siebert, Dan Willenbring, Christian S Hamann, Jared T Shaw, and Dean J Tantillo. 2012. Applied computational chemistry for the blind and visually impaired. Journal of Chemical Education 89, 11 (2012), 1400–1404.
[51] Jacob O. Wobbrock, Meredith Ringel Morris, and Andrew D. Wilson. 2009. User-defined Gestures for Surface Computing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '09). ACM, New York, NY, USA, 1083–1092.