Building Trust in Autonomous Vehicles: Role of Virtual Reality Driving Simulators in HMI Design
Lia Morra, Senior Member, IEEE, Fabrizio Lamberti, Senior Member, IEEE, F. Gabriele Pratticò, Salvatore La Rosa, Paolo Montuschi, Fellow, IEEE

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. XX, NO. XX, XXXX XXXX
Abstract—The investigation of factors contributing to making humans trust Autonomous Vehicles (AVs) will play a fundamental role in the adoption of such technology. The user's ability to form a mental model of the AV, which is crucial to establish trust, depends on effective user-vehicle communication; thus, the importance of Human-Machine Interaction (HMI) is poised to increase. In this work, we propose a methodology to validate the user experience in AVs based on continuous, objective information gathered from physiological signals, while the user is immersed in a Virtual Reality-based driving simulation. We applied this methodology to the design of a head-up display interface delivering visual cues about the vehicle's sensory and planning systems. Through this approach, we obtained qualitative and quantitative evidence that a complete picture of the vehicle's surroundings, despite the higher cognitive load, is conducive to a less stressful experience. Moreover, after having been exposed to a more informative interface, users involved in the study were also more willing to test a real AV. The proposed methodology could be extended by adjusting the simulation environment, the HMI and/or the vehicle's Artificial Intelligence modules to dig into other aspects of the user experience.
Index Terms—autonomous vehicles, human-machine interaction, driving simulator, user experience, virtual reality.
I. INTRODUCTION

Most research efforts in the context of intelligent vehicles (IVs) have been directed to improving the safety and effectiveness of vehicle control (autonomy) and vehicle-to-vehicle coordination (connected vehicles) [1]. To fully reap the benefits of autonomous driving (AD) systems, humans, both drivers/passengers and pedestrians alike, will need to trust their safety and reliability. Hence, there is an emerging need to support effective and reassuring communication between humans and IVs. Passengers need to feel confident, at all times, that they have sufficient information about the state of the vehicle, its environment and perceptions as well as its planned and current behavior; even more, that they possess all the appropriate information and means to take over all the aspects regarding the operation of the vehicle in due time, when needed, in a safe and appropriate manner. Despite playing a crucial role in the uptake of any system based on autonomous agents, including autonomous vehicles (AVs), trust between humans and machines is generally hard
to establish. According to a 2017 survey by the Pew Research Center on "Automation in everyday life", over half (56%) of the Americans who were interviewed said they would not want to ride in a driverless vehicle if given the opportunity [2]. However, preliminary experiments in the literature on partially autonomous driving scenarios show that these negative emotions can be reduced by adopting Human-Machine Interaction (HMI) designs that provide feedback about how the car is acting (what automated activity it is undertaking) and the reasons why the car is acting that way [3].

The role of HMI in IVs is thus profound and, for this reason, user experience (UX) should be taken into large account at any stage of the development process. By establishing a collaborative relationship between drivers/passengers and vehicles, HMI can positively affect the acceptance as well as the technological advancement of AD solutions.

Unfortunately, the application of consolidated approaches for UX design and evaluation to AD systems is not straightforward. For instance, focusing on the quantitative assessment of a particular user interface design, techniques that measure the driver's performance in specific driving tasks could not be easily reused when, due to the specific level of automation, there are no more drivers but passengers.

Copyright (c) 2019 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]. The authors are with the GRAINS – GRAphics And INtelligent Systems group at the Dipartimento di Automatica e Informatica of Politecnico di Torino, 10129 Torino, Italy. e-mail: (see http://grains.polito.it/people.php). Manuscript received XXXX XX, XXXX; revised XXXX XX, XXXX.
Similarly, post-experience questionnaires (alone) could be no more appropriate when the feedback to be collected concerns the huge amount of aspects that may contribute to the perceived level of trust. Even the driving simulators that are used today for developing vehicles' intelligent behaviors may not be directly applied to UX studies, as the focus would have to be shifted, e.g., onto the vehicle's interior and the interaction with it, rather than onto the fidelity of external factors affecting its decisions (traffic, presence of pedestrians, etc.).

Moving from the above considerations, in this paper we present a methodology that is meant to support the study of HMI with IVs, and we show its helpfulness in the evaluation of the passengers' level of trust by considering the design of a possible interface for AD systems. The devised methodology relies on a simulation platform based on immersive Virtual Reality (VR), which was developed by grounding on an existing driving simulator. Although, in principle, the technology is applicable to many scenarios, from unassisted to fully autonomous systems, we focused on L4 and L5 automation levels, as they represent the configurations for which characterizing the passenger's experience from the point of view of comfort and trust is more challenging. We therefore created a virtual AD system that allows users to experience a simulated ride in a virtual urban environment, facing a number of different situations.

For the assessment of the UX, we consider both cognitive and affective factors, by integrating feedback based on subjective post-experience questionnaires with continuous, objective information gathered from physiological signals. In particular, in this paper we focused on stress level measurements to investigate the perceived degree of safety and "connection" with the vehicle.
Notwithstanding, the proposed methodology has been designed in a way to support later extensions for the detection of other emotional states. It is worth observing that, thanks to its immersive nature, VR allows measuring such states much more realistically than other, traditional simulation scenarios [4].

With the aim to evaluate the suitability of the proposed approach, the methodology was applied to the design of a head-up display (HUD)-based interface for AVs that provides visual cues about the vehicle's sensory and planning systems. As said, providing information about how and why the car is acting is crucial to elicit trust in AVs, but little experimental evidence is available to determine how such information is best presented to the passengers [5], [6]. By applying our approach to the above scenario, we obtained qualitative and quantitative evidence that a complete picture of the vehicle's surroundings, despite the higher cognitive load, is conducive to a less stressful experience. Moreover, after having been exposed to an interface delivering a higher information content, users involved in the study were also more willing to test a real AD system.

Besides offering interesting insights that may drive future HMI designs, the results confirm the effectiveness of the proposed methodology in digging into a use case that well represents possible facets of the UX which could be investigated through the experimented techniques.

II. BACKGROUND AND RELATED WORK
A. HMI in Partially and Fully Automated Vehicles
Establishing trust is important in order for users to accept, and even rely on, automated systems. McKnight & Chervany [7] identified three constructs necessary to increase trust: ability, benevolence, and integrity. When the trustee is an autonomous system, these factors translate into the system's performance and skillful execution, into the sharing of a common purpose with the user, and into the implementation of a reliable and consistent process. Trust is thus established through direct observation of the system's behavior and its underlying mechanisms. Lee & See observed that "Trust that is based on an understanding of the motives of the agent will be less fragile than trust that is based on the principle of reliability of the agent" [8]. In the context of AD systems, HMI plays a fundamental role in this respect, by providing information about the vehicle's performance. In fact, partially automated vehicles on the market allow the driver to monitor the status of the car's components. User interfaces are designed to increase the perceived ability of the system and to support predictability, thus inducing trust.

In recent years, a study by Ekman and colleagues provided a systematic review of HMI design principles that promote trust in AD systems [9]. The authors distinguish a learning phase, which starts with the first interaction and lasts until the user is familiar with the AD system, from a performance phase, which takes into account a long-term use perspective. During a testing simulation, it can be argued that the learning phase is most important, although its specific duration differs on an individual basis.
In the performance phase, trust is mainly based on the performance and dependability of the system, and is fairly stable unless an error or unexpected event occurs; in the learning phase, it is the user's ability to form a mental model of the AD system that is crucial to form a trust bond. Hence, in this work we focused our attention specifically on the four factors that, according to [9], are most relevant for the learning phase: the mental model, i.e., the ability to form an approximate representation of the AD system's skills and functions; the system's proneness to be perceived as an expert/reputable agent; the possibility to provide continuous feedback to the user, ideally addressing two or more senses; finally, the provision of how and why information regarding upcoming actions. In this context, a "how" message describes how the system solves a given task, whereas a "why" message pertains to the motivations that lead to the task itself.

A limited number of experimental studies have, so far, established that providing information to the driver/user usually increases driving performance and acceptability in partially [3], [10], [11] and fully automated driving systems [6]. For instance, Verberne et al. [10] found that Adaptive Cruise Control (ACC) systems that share the drivers' objectives, like the adoption of a relaxed and safe driving style without sudden braking and accelerations, while at the same time providing information to the user, are considered more reliable and acceptable. Koo et al. [3] explored the effect of providing "how" and "why" information in the context of an auto-braking system. Providing both information types resulted in the safest driving behavior, at the expense, however, of a high cognitive load and decreased acceptability. Drivers preferred receiving only "why" information, whereas the "how" information was often perceived as redundant.
The interfaces considered in these studies were very simple compared to the technical possibilities of current user interfaces: they consisted of brief verbal messages, with no visual cues [3], or included only information on the position of obstacles [6].

It is important to consider not only which information is provided to the users, but also how it is conveyed. The visual mode is the primary and most widely used among vehicle interfaces, and represents the most consistent communication channel. In-vehicle display devices can be grouped in three categories: head-down displays (HDDs), head-up displays (HUDs), and head-mounted displays (HMDs). HDDs offer the advantage of not blocking the users' view of the real world; the users, however, find themselves distracted from the road. HUDs make it possible to take advantage of the necessary information while keeping an eye on the external environment, but pose significant construction challenges. HMDs share the advantages of HUDs, but only a few devices are available on the market, and they suffer from some usability issues (especially for in-vehicle applications).

Studies have consistently shown that HUDs result in a better
TABLE I
INFORMATION DISPLAYED BY COMMERCIAL HUD CONCEPTS AND DEMONSTRATION VIDEOS

AR-HUD                                | Displayed information
Continental AR-HUD Concept            | Lane Departure Warning System (LDWS), Assisted Navigation, Adaptive Cruise Control (ACC)
Hyundai AR-HUD Concept                | Traffic lights, Assisted Navigation, LKA, ACC
PSA group AR-HUD concept              | Assisted Navigation, LKA, ACC, Pedestrians, Approaching obstacle warning
Daqri AR-HUD Concept                  | Assisted Navigation, Lane Keep Assistance (LKA), Lane Control, ACC, Approaching obstacle warning, Children crossing, Pedestrians
WayRay Holographic AR Display concept | Assisted Navigation
Waymo Demo video                      | Traffic signs, Cars, Pedestrians, Cyclists (bounding boxes and colored overlays with distance and speed information), Motion Prediction, Assisted Navigation
NVIDIA Drive AGX                      | Traffic signs, Cars, Traffic lights, Lanes, Pedestrians, Cyclists (bounding boxes with distance information), Motion Prediction, Lane separation lines, Route planning data

driving experience and performance than HDDs, leading to shorter reaction times [12], decreased cognitive load [13], and fewer driving errors [5], [14]; HUDs are also preferred by users against both HDDs and HMDs [5]. Augmented Reality HUDs (AR-HUDs) have been found especially effective in increasing the driver's intuitive cognition [15] and promoting a safer and more effective driving behavior, particularly in demanding driving situations [16], [17].

Given the technical difficulties in realizing AR-HUDs, current displays often come in the form of prototypes, concepts or demonstration videos. Examples in the literature often focus on specific aspects of the driving experience, such as driver assistance (DA) [5] and obstacle detection [6]. Many commercial prototypes focus on partially automated systems that extend current DA solutions, whereas Waymo and NVIDIA are more directly focused on L4 and L5 automation. The information displayed by the main commercial solutions is reported in Table I.
A tendency to adopt a common set of symbols and metaphors can be observed among vendors. For instance, information related to ACC and Lane Keep Assistance functionalities, such as the current lane, speed, and the position and speed of preceding cars, is displayed by Continental, Hyundai, PSA, and Daqri. Waymo and NVIDIA include richer information on both the path planning and the sensory capabilities of the vehicle. Through bounding boxes (i.e., parallelepipeds enclosing detected objects), colored overlays and other elements, all the factors involved in driving are highlighted. In addition, navigation information is added not only for the user's vehicle, but also for other cars, pedestrians or cyclists through motion prediction.
B. Measuring User Experience in Driving Simulators
Researchers have for a long time relied on driving simulators to cope with the difficulties and risks associated with field testing [18]. In recent years, VR simulators have elicited a lot of interest thanks to their immersive nature [5], [13], [19], [20]. Most studies investigating different aspects of driving in simulated scenarios, including HMI design [5], [21], rely on drivers' behavior and performance as a proxy for their emotional and cognitive status [3], [12], [22]. Experimental measures include standardized questionnaires as well as indicators such as driving speed, lane keeping, braking patterns, etc., for which absolute or relative validity has been generally established [22]. However, in AD systems, humans are expected to take progressively less part in driving, which makes behavioral assessment less relevant.

Physiological signals are increasingly used to measure users' affective and cognitive states in engineering in general [23]. The activity of the autonomic nervous system, which regulates affective states, can be captured non-invasively through signals such as Heart Rate (HR) and Electrocardiography (ECG), Electromyography (EMG), Respiratory Rate, and Galvanic Skin Response (GSR). In the last years, researchers have also investigated their use combined with traditional or immersive driving simulators [4], [22], [24].

In particular, the relative validity of physiological signals for traditional driving simulators is supported by several studies, albeit the available data is less abundant than for driving performance [4], [22], [25]. For instance, risk perception was found to be highly correlated with changes in GSR [22]. A comparison between on-road and simulated driving conditions established the relative validity of mean HR and mean oxygen consumption, although the HR values observed in real driving conditions were higher, probably due to the increased stress associated with driving on a real road [25].
In a pilot study, Eudave and colleagues found that the physiological response in an immersive VR environment is stronger than in a traditional driving simulator [4]. Recordings of physiological signals have also been exploited in real-life driving conditions to characterize drivers' performance and experience, from measuring stress levels to detecting drivers' drowsiness [26], [27]. Of particular interest is the study by Healey and colleagues on driving-related stress [26]. ECG, EMG and GSR were recorded while drivers followed a set route; driving sessions were videotaped and visually inspected for observable stress-induced actions, such as head turning, to be used as a reference standard. The collected signals allowed the authors to distinguish different levels of stress with high accuracy (over 97% across multiple drivers); GSR and HR metrics were most closely correlated with the drivers' stress level. Again, studies have been conducted, so far, from the point of view of an active driver, leaving the question open on whether stress-induced changes can be equally and as effectively observed in passengers.

III. PROPOSED METHODOLOGY
A. Overview
As discussed in Section II-A, trust in automated systems can be achieved from direct observation of the system's behavior, coupled with an understanding of the underlying mechanisms. To this aim, as depicted in Fig. 1, the devised methodology relies on a VR-based AV simulator. Simulation allows the user to get immersed in repeatable scenarios including a variety of both ordinary and emotionally intensive events. The user is provided with insights on the autonomous system's behavior by means of a virtual AR-HUD combined with additional audio cues. In this way, we postulate that the user can form an adequate mental model of the AD system. Assessment is performed by collecting feedback from the user in the form of subjective (questionnaire-based) ratings and objective (physiological signal-based) measurements.

Fig. 1. Proposed methodology: exemplification on the task of HMI design. A defined AV driving scenario is simulated in immersive VR. Vehicle simulation is based on the open source GENIVI platform. The simulator is integrated with a motion platform to further foster immersion. User's feedback is collected through both subjective, offline questionnaires, and objective, real-time physiological measurements reflecting cognitive and affective states.
B. Technology and Setup

The VR setup is built around an HTC Vive headset, offering a wide field of view (FOV) at a 90 Hz refresh rate. The native positional tracking leverages the IR lasers emitted by the Vive base stations (built upon Valve's Lighthouse technology) which, combined with the headset's built-in sensors, enable 6-DOF outside-in tracking of the user's head.

With the aim to foster immersion through the simulation of the motion stimuli that a driver or passenger would experience on a real vehicle, an inertial motion platform is used. The platform exploited in this work is the Atomic A3 Racing, designed by Atomic Motion Systems, which supports 2-DOF (yaw and pitch) motion simulation. To simulate the user's perceived accelerations, the so-called tilt coordination motion simulation strategy [7] was implemented. In short, this technique imitates the perceived acceleration via decomposition of the gravity acceleration vector, obtained through a coherent rotation of the platform. A motion compensation needs to be applied to the VR coordinate system (which is centered in the headset, i.e., in the user's viewpoint), based on the current platform rotation. To this purpose, a Vive Tracker was mounted on the seat and tracked together with the headset. Finally, since it has been proved that letting users see their hands in the virtual environment increases the sense of presence [28], a virtual replica of the user's hands including articulated fingers is created by tracking them using a Leap Motion Controller device attached to the headset.
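The tilt-coordination and motion-compensation steps described above can be sketched as follows. This is a minimal Python illustration, not the actual implementation (which runs inside Unity and the platform driver); the function names, the 15-degree mechanical limit, and the pitch-only compensation are assumptions made for the sake of the example.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def tilt_angle(linear_accel, max_tilt_deg=15.0):
    """Tilt coordination: map a sustained linear acceleration (m/s^2) to a
    platform tilt angle, so that the gravity component along the seat plane,
    g * sin(theta), imitates the perceived acceleration."""
    ratio = max(-1.0, min(1.0, linear_accel / G))
    theta = math.degrees(math.asin(ratio))
    # clamp to the platform's mechanical range (the limit here is illustrative)
    return max(-max_tilt_deg, min(max_tilt_deg, theta))

def compensate_headset_pitch(headset_pitch_deg, platform_pitch_deg):
    """Motion compensation: subtract the platform's rotation (as read from the
    seat-mounted tracker) so the virtual camera does not tilt with the seat."""
    return headset_pitch_deg - platform_pitch_deg
```

In practice the compensation is applied to the full 6-DOF pose, not just pitch; the sketch keeps one axis to show the principle.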
C. AV Driving Simulator
The vehicle simulator implemented is based on the open source Vehicle Simulator project by the GENIVI Alliance [29] (in the following simply referred to as GENIVI, for brevity). GENIVI was selected among several possible alternatives for multiple reasons: it was originally created to support HMI design; it allows, by design, the addition of new features; it already provides modules for intelligent traffic simulation; it includes a basic auto-drive functionality for the user's vehicle; and it provides a few driving scenes and vehicles with their own rigid body physics-based controllers.

The main activities carried out to adapt GENIVI to the purposes of this work involved: porting the available features to VR; integrating the motion platform; and implementing a custom AD controller. The latter activity was considered necessary since, in a preliminary study, the built-in controller was judged not realistic enough, especially when dealing with complex, unpredictable events (e.g., sudden pedestrian crossing, etc.).
1) VR porting:
Implementing the support for VR was facilitated by the fact that GENIVI is based on the Unity game engine, which natively allows for the creation of VR applications for the HTC Vive. Our implementation allows the user to be virtually accommodated in any seat of the virtual vehicle. The built-in vehicles, namely a Land Rover L405 and a Jaguar XJ, are designed for non-immersive simulation. Hence, a new vehicle was created with VR-based interaction in mind, i.e., by focusing on the visual fidelity of the vehicle's interior. Finally, support for the user's virtual hands was added.
2) Motion platform integration: to integrate the motion platform, an additional software module was developed. The module receives in input the acceleration values calculated at the seat's tracked point by the physics simulation engine and outputs them to the proprietary platform driver (AMS Symphinity), which remaps them to coherent tilt and pitch angles and consequently applies them to the platform. Other motion platforms may be integrated in a similar way.
3) AV simulation: within this work, our aim was to provide a methodology to study the considered domain using simulated VR-based scenarios, accompanied by suitable measurement tools, rather than to contribute to the advancement of the state of the art of AVs' control sub-systems. The basic AD functionality available in GENIVI was therefore extended to make it cope with the situations of interest. Attention was focused on reproducibility, while preserving simplicity. More sophisticated implementations, leveraging, e.g., data provided by the vehicle's virtual sensors, could nonetheless be integrated in the future.

Our implementation takes advantage of the native trajectory system, which is used in GENIVI to manage the traffic. Paths to follow are embedded in the scene description using a complex network of waypoints. The developed AD system relies on it to feed a PID-based controller, which is in charge of driving the vehicle by making it accelerate, brake, and steer. Differently than the other cars in the traffic, the AD system is affected by the full set of accurate, rigid body physics simulation variables. The PID was fine-tuned in closed loop using manual parameter adjustment targeting a maximum overshoot of 5% at step response, in order to achieve a comfortable and realistic behavior. To this aim, control command shaping and auxiliary waypoints were also used. Although different and far more sophisticated approaches could be investigated in the future (e.g., [30]), the selected control system proved to meet the simplicity-effectiveness trade-off required to cope with the issues tackled in this work. The appropriateness of the pursued approach was also confirmed by subjective observations concerning simulation quality (Section IV-E and Supplemental material).

The same approach was pursued to bind specific vehicle reactions to the pre-programmed events. Obstacle avoidance is handled by dedicated logic, which also takes into account trajectory replanning when the obstacle cannot be avoided by simply adjusting the vehicle's speed. For moving obstacles, replanning takes into account predicted motion. Further details on the implementation can be found in [31].
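The waypoint-fed PID control loop can be illustrated with a minimal sketch. This is Python rather than the simulator's Unity code, the gains and function names are hypothetical, and the actual controller also shapes commands and handles throttle and brake; only the steering principle is shown.

```python
import math

class PID:
    """Textbook PID controller; the gains are illustrative, not the tuned values."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

def steering_command(vehicle_pos, vehicle_heading, waypoint, pid, dt):
    """Heading error toward the next waypoint fed through a PID,
    as a minimal stand-in for the simulator's steering controller."""
    dx = waypoint[0] - vehicle_pos[0]
    dy = waypoint[1] - vehicle_pos[1]
    desired = math.atan2(dy, dx)
    # wrap the heading error to [-pi, pi] before feeding the PID
    error = math.atan2(math.sin(desired - vehicle_heading),
                       math.cos(desired - vehicle_heading))
    return pid.step(error, dt)
```

A pure-pursuit or model-predictive scheme (cf. [30]) would replace `steering_command` while leaving the waypoint network untouched.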
D. HUD Design
Based on the principles discussed in Section II-A, an in-vehicle user interface should continuously provide feedback addressing, whenever possible, multiple senses, highlighting "why" information that explains the vehicle's choices, and adopting a pleasant and effective communication style that presents the system as a skilled/reputable driver. These elements, while important in general, are particularly relevant in the initial learning phase, where the user is still unfamiliar with the AD system and needs to form an appropriate mental model of its inner mechanisms [9]. As will be illustrated in more detail in Section IV, subjects who participated in our study had never been exposed to a real AD system. An AR-HUD was therefore designed, as it was found in the literature to be the most effective interface under the considered conditions.

It was deemed important to ensure that the visual cues displayed by the AR-HUD are consistent with the information conveyed by commercial DA products, as users are mostly familiar with it. However, it was regarded as crucial to also provide information that illustrates the vehicle's sensory capabilities and, hence, improves the user's situational awareness. Finally, given this work's focus on L4 and L5 AD systems, information about the vehicle's planning functionalities needed to be delivered as well.

The design was based on the features reported in Table I. The HUD is capable of displaying information about all the relevant elements in the surrounding environment, including both static objects (trees, lighting poles, parked cars, traffic lights, road signs, etc.) and dynamic objects (pedestrians, animals or other cars). These elements are provided together with distance information in meters from the vehicle, absolute speed when available, and a visual warning status indicator. Lane keeping and navigation cues for the user's vehicle and other cars (assuming that they are connected) are also considered.
The color of each car is randomly assigned by GENIVI. Objects of interest are identified by means of a bounding box. This metaphor, previously validated in the literature [17], is adopted by commercial players such as Waymo and NVIDIA. In our implementation, each bounding box has a white outline and is associated with a label and an icon identifying the detected object, thus satisfying the usability principle which suggests that the adopted representation must be simple and intuitively understandable by the user [32]; the use of familiar cues, such as icons, also reduces the cognitive load in the presence of a large amount of information [33]. Bounding boxes are automatically generated in VR knowing the position, size and pose of all objects in the scene. Technically, this was implemented in Unity by associating a visible colored material with all objects having a Collider component; Colliders are not visible by default in the rendering step, because they just define the bounding volume of an object for the purposes of identifying physical collisions through the physics engine. Labels always face the vehicle and are, therefore, readable by the user.

In order to determine which situations constitute a potential danger, we relied on the definition from the ISO 15623 standard on "Forward vehicle collision warning systems" [34], counting on a previous study by Sebastian et al. [35]. A mathematical model is used to determine potential collisions based on the trajectory, speed and acceleration of the vehicle, as well as those of potential obstacles (e.g., the preceding car). Once the possibility of a collision has been established, a safety distance is calculated, which depends on the speed of the vehicle and on the reaction time of the driver, the latter estimated based on the study in [36]. The distance between the vehicle and the estimated collision point is then measured: if this distance is less than the safety distance, the passenger needs to be warned of the potential danger. We therefore defined a hazard index, ranging from 0 to 1 and calculated as the ratio between the distance from the obstacle and the warning distance defined in [35].

The objects' warning status is presented through both visual and auditory cues. In the literature, the AR-based DA system in [37] adopted an intuitive color code in which the severity of the danger posed by an obstacle detected on the road is shown by means of a color that starts from green (safety) and extends up to red (maximum state of danger). This color coding is consistent with the systems reviewed in Section II-A. Therefore, we decided to color-code the hazard index with a green-to-red gradient and use it to visually represent the warning status of the detected object by controlling the transparency and color of the bounding box. The color-code value is computed using a
perception-based equation, where the hazard index is used as an exponential factor. To signal potential dangers, the label associated with the bounding box flashes as well, so as to direct the user's attention towards the obstacle. Flashing is used in DA systems by various vendors [38], [39]. It is important to underline that, through the flashing information, the vehicle communicates to the user why it is about to perform a specific action [9]. The flashing visual cue, at a lower frequency, is also used to notify the user of a road sign or traffic light. Flashing occurs when a traffic light changes or when a new road sign is recognized. The lower frequency reduces the sense of alarm and, hence, allows the user to distinguish normal driving operations from high-risk situations.

An immediate danger is also marked by a sound alert [34], [40]. This is consistent with current DA systems, which produce audible warnings, e.g., in emergency braking conditions [39]. A more pleasant sound is played when road signs are detected, to capture the user's attention in an unalarming way.

Two variants of the HUD were designed, which in the following are referred to as omni-comprehensive (OMN) and selective (SEL).
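The hazard index and its green-to-red coding described above can be sketched as follows. This is an illustrative Python reading of the text, not the actual implementation: the exact perception-based equation and its constants are not reported here, so the gamma-style exponent below is an assumption, as are the function names.

```python
def hazard_index(distance_to_obstacle, warning_distance):
    """Hazard index in [0, 1]: the ratio between the current distance to the
    obstacle and the warning distance, clamped. A value of 1 means the obstacle
    is at or beyond the warning distance; values near 0 mean imminent danger."""
    if warning_distance <= 0:
        return 1.0
    return max(0.0, min(1.0, distance_to_obstacle / warning_distance))

def warning_color(index):
    """Green-to-red gradient driven by the hazard index. The gamma exponent is
    a stand-in for the paper's perception-based equation, not its tuned value."""
    gamma = 2.2
    danger = (1.0 - index) ** gamma  # 0 = safe (green), 1 = danger (red)
    red = int(255 * danger)
    green = int(255 * (1.0 - danger))
    return (red, green, 0)
```

The same scalar can drive the bounding box transparency, so that distant, harmless objects fade out while imminent hazards stand out.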
1) Omni-comprehensive HUD: in the OMN variant, we show information about all dynamic elements (cars and pedestrians) within a "detection" diameter, which is set to 150 meters. This threshold was firstly motivated by practical reasons: virtual objects beyond this distance would be too small to be appreciated considering the resolution of the display in the VR headset. This distance is also compatible with the equipment of current AD prototypes and the detection range of LiDAR systems. Road signs and traffic lights are always shown in the interface, except for those that regulate road sections different from the one the vehicle is currently on. Furthermore, it was decided to exclude from the display the information about static objects such as trees, parked cars, lighting poles, etc., unless they become dangerous. This exclusion is motivated by the principle of cognitive load, according to which an interface should be easily understandable by the user, simple and intuitive, avoiding excessive cluttering [32].
2) Selective HUD: in the SEL variant, only information that is deemed of specific interest to the user is displayed. The guiding principle was to select information that pertains to those elements of the environment that, at any given point, affect the behavior of the AD system. Let us consider road signs: the vehicle detects all road signs within the diameter of interest, but not all of them are necessarily useful at the time. For example, in the presence of a pedestrian crossing sign and a speed limit sign, the vehicle may decide not to show any information on the former sign, based on the fact that, at the moment, there is no pedestrian intending to cross; the latter sign may force the vehicle to slow down and, thus, it would be highlighted in the interface. More specifically, in the SEL variant only cars that precede the current vehicle or, more generally, that intersect its current trajectory, are highlighted with a bounding box. Pedestrians and other static or moving objects are identified only if and when they become dangerous, i.e., when a collision becomes possible. Navigation lines for other cars are only displayed when assessed by the vehicle (e.g., at intersections to determine priority). Traffic lights information, as well as tracing of the vehicle's navigation line and the road center line, are unchanged in this variant. A comparison of the two interfaces is provided in Fig. 2.

Fig. 2. Comparison between the OMN (a) and SEL (b) AR-HUD interfaces.
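The selection heuristics of the two variants can be summarized in code. This is a hypothetical sketch: the class, field and function names are ours, and the 150 m detection threshold is treated here as a simple range check for cars and pedestrians.

```python
from dataclasses import dataclass

DETECTION_RANGE_M = 150.0  # the "detection" threshold discussed above

@dataclass
class SceneObject:
    kind: str                       # "car", "pedestrian", "sign", "traffic_light", "static"
    distance_m: float               # distance from the ego vehicle
    intersects_path: bool = False   # precedes us or crosses our trajectory
    collision_possible: bool = False  # flagged as dangerous by the AD system
    affects_behavior: bool = False    # currently forces an AD decision (e.g., slow down)
    regulates_current_road: bool = True

def show_in_hud(obj: SceneObject, variant: str) -> bool:
    """Decide whether an element gets an overlay, for variant "OMN" or "SEL"."""
    if obj.kind in ("sign", "traffic_light"):
        if not obj.regulates_current_road:
            return False  # signs for other road sections are never shown
        # SEL shows a sign only when it affects the AD behavior;
        # traffic-light information is unchanged across variants.
        return variant == "OMN" or obj.affects_behavior or obj.kind == "traffic_light"
    if obj.kind == "static":
        return obj.collision_possible  # both variants hide static objects
    if obj.distance_m > DETECTION_RANGE_M:
        return False
    if variant == "OMN":
        return True  # all dynamic elements within range
    if obj.kind == "car":
        return obj.intersects_path or obj.collision_possible
    return obj.collision_possible  # pedestrians and other moving objects
```

For instance, `show_in_hud(SceneObject("car", 40.0), "SEL")` is `False` unless the car intersects the planned trajectory, whereas the OMN variant would display it.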
E. Simulated Scenario
In order to create a relationship of trust between a user and an AD system, the latter must show its ability in dealing with different driving scenarios [8]. Our simulated scenario is constructed to include a variety of different situations, both ordinary and challenging, that may occur in an urban setting. The urban setting is, in general, considered the most difficult to manage by AVs [41]. In fact, current L2 and L3 automated systems are mostly restricted to motorways and extra-urban routes, and the biggest challenge in the development of L4 and L5 systems is represented precisely by urban areas, where significantly more factors are at play, driving conditions are far less predictable, and the presence of pedestrians amplifies the perceived risk.

Compared to real-life driving, simulators offer the distinctive advantage of creating repeatable scenarios, where most experimental factors can be easily controlled. Therefore, it is possible to study and compare subjects' reactions to individual events, whereas in real-life driving experiences one would be mostly restricted to considering overall measures [25], [26].

The simulation was created starting from one of the scenes included in the GENIVI platform, representing a miniature version of the city of San Francisco. As said, despite the basic auto-drive functionality, GENIVI is natively meant to support mostly first-person driving experiences, with random traffic patterns, no pedestrians and no intentionally hazardous situations. By leveraging the developed AD capabilities and
the integrated waypoint system, different situations were embedded in the simulated scenario in order to showcase different AD abilities and elicit changes in the subjects' affective state. Considered abilities include: interacting with traffic and especially with other cars, e.g., maintaining safety distance, overtaking, etc.; handling road signs and traffic lights; avoiding obstacles and dealing with other potentially dangerous situations, including those where other cars or pedestrians do not behave correctly.

The subject is seated in the passenger's position (front right). The experience begins in an area with a relatively simple environment and little traffic, to allow the user to familiarize with the AD system. Then, the environment becomes populated by cars and pedestrians: the subject becomes familiar with the HUD and the way information is conveyed by the vehicle. Afterwards, riskier situations occur in which the vehicle can show its decision-making skills. In [8], it was observed that "If trust is primarily based on rules that characterize performance during normal situations, then abnormal situations might lead to the collapse of trust". This strengthens the importance of including driving situations that, while less likely, may pose significant challenges for an AD system. To simulate a typical urban context, such situations were spaced throughout the simulation and alternated with ordinary ones, as illustrated in Fig. 3. After every risky situation, the car stops for a few seconds, to ensure that the subject has enough time to understand what happened and to reflect on how the car handled that situation.
Considering the time for letting subjects get acquainted with the system, as well as the time required to achieve a suitable distribution of situations, the duration of the simulated scenario was set to 12 minutes.

Simulated events include the sudden crossing of a dog (Dog), a child on the sidewalk throwing a ball on the street (Ball), scooters and cars that split lanes while driving (Scooter, Car1 and Car2), as well as pedestrians crossing the street (Man1 and Man2). Illustrative frames are reported in Fig. 3. The Dog event corresponds to a highly hazardous situation, in which the vehicle is forced not only to slow down, but also to steer in the opposite direction to avoid a collision. The same happens in the Ball and in the Man2 events (in the latter case, a pedestrian crosses outside of a designated crosswalk while the car is at full speed). Man1 is a less risky situation, as the car is approaching a red light and is already braking when the pedestrian starts crossing. In the Scooter event, the vehicle slows down as the preceding car turns right; in the meanwhile, a scooter enters the lane from the left. The situation is not particularly dangerous, as the vehicle was already reducing its speed to deal with a traffic jam; however, from the viewpoint of vehicle-to-human communication, this interaction is complicated as it involves several vehicles. In the Car1 event, a car suddenly changes its lane when approaching road construction (which is poorly visible), forcing the vehicle to quickly reduce its speed to avoid a collision. The Car2 event is even riskier, as another car driving on the intersecting road does not stop at a red traffic light and instead passes at full speed, forcing the vehicle to brake very abruptly.

Two videos showing the simulated scenario with the OMN and the SEL interfaces are available at http://tiny.cc/p4v16y.
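A scenario of this kind can be driven by a simple trigger list attached to the route. The sketch below is ours: the trigger positions are assumptions for illustration only (the real scenario uses the GENIVI waypoint system), while the event names and order follow the text.

```python
# (name, risky, trigger position along the route in metres) -- positions are
# illustrative assumptions; the event order follows the text (cf. Fig. 3).
EVENTS = [
    ("Dog", True, 800.0), ("Ball", True, 1400.0), ("Car1", True, 2100.0),
    ("Scooter", False, 2700.0), ("Car2", True, 3400.0),
    ("Man1", False, 4000.0), ("Man2", True, 4600.0),
]
PAUSE_AFTER_RISKY_S = 4.0  # the car stops for a few seconds after risky events

def step(route_position_m: float, fired: set):
    """Fire the first pending event reached; return (event name, pause seconds)."""
    for name, risky, trigger in EVENTS:
        if name not in fired and route_position_m >= trigger:
            fired.add(name)
            return name, (PAUSE_AFTER_RISKY_S if risky else 0.0)
    return None, 0.0
```

Keeping the trigger logic in a declarative table makes it easy to re-space risky and ordinary events when tuning the 12-minute timeline.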
Fig. 3. Timeline of the test scenario with simulated events.
F. Galvanic Skin Response
As discussed in Section II-B, physiological signals related to the activity of the autonomic nervous system can provide non-invasive information about the user's affective state. While, in principle, a combination of different signals can be used, for the sake of simplicity in this work we focus on the GSR signal, which was found to effectively detect stress in both simulated and real-life driving [22], [26]. Furthermore, it is easily measured with a simple sensor placed on the fingers [23]. GSR is mostly sensitive to the dimension of arousal, going from sleepiness to excitement or stress [42]; it leaves open whether the arousal change is of a positive or negative nature (the valence dimension), which nonetheless in our specific case can be derived from the context.

The GSR can be decomposed into a slowly changing tonic component, the Skin Conductance Level (SCL), and an impulsive phasic component, the Skin Conductance Response (SCR) [42]. While the SCL reflects the overall emotional state as well as habituation to the environment, the SCR measures activation in response to a stimulus, e.g., a potentially stressful event occurring in the simulated scenario. The magnitude of the response should correlate with the perceived threat. This phenomenon was previously validated in other types of VR environments [43], with an observable effect on the GSR signal even after multiple exposures.
1) Signal processing: the SCR data was extracted using a 3rd-order Butterworth band-pass filter ranging from 0.16 Hz to 2.1 Hz [27], [42]. Normalization is required to account for the intrinsic inter-individual differences in skin conductance [23]. The most common choices are z-score standardization, in which the signal is divided by the standard deviation after subtracting the mean, and min-max normalization, in which the signal is rescaled between 0 and 1. We found that the min-max scaled signal was most useful for visualization purposes and trend analysis, whereas z-score normalization could be used for the final feature extraction, as we observed less inter-subject variability. Considering the fact that, typically, SCR peaks appear between 1 and 5 seconds from the stimulus's onset and last for about 10 seconds, we extracted the SCR waveform for a time window of ±10 s centered on each event [44], [45]. Within each window, all samples are divided by the initial signal value, to focus on relative changes.
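The preprocessing above, together with the per-window features defined next (mean, accumulated and max GSR, peak-to-peak SCR), can be sketched as follows. This is a sketch assuming the signal is available as a NumPy array; the function names are ours.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # sampling frequency in Hz, as in the acquisition setup

def extract_scr(gsr: np.ndarray) -> np.ndarray:
    """Phasic component via a 3rd-order Butterworth band-pass (0.16-2.1 Hz)."""
    b, a = butter(3, [0.16, 2.1], btype="bandpass", fs=FS)
    return filtfilt(b, a, gsr)  # zero-phase filtering

def zscore(x: np.ndarray) -> np.ndarray:
    """Z-score standardization: subtract the mean, divide by the std."""
    return (x - np.mean(x)) / np.std(x)

def event_window(signal: np.ndarray, event_idx: int, half_s: int = 10) -> np.ndarray:
    """±10 s window centered on an event, scaled by its initial value."""
    half = half_s * FS
    w = signal[event_idx - half:event_idx + half]
    return w / w[0]

def window_features(gsr_z: np.ndarray, scr: np.ndarray) -> dict:
    """Per-window features; the Pre/Post differences (Δ) are taken downstream."""
    return {
        "mean": float(np.mean(gsr_z)),
        "acc": float(np.sum(gsr_z)),
        "max": float(np.max(gsr_z)),
        "pp": float(np.max(scr) - np.min(scr)),
    }
```

In the analysis, each feature would be computed once on the 10 s before and once on the 10 s after the event, and the difference used as the final measure.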
2) Feature extraction: for each event time window, a set of features is extracted [44]. Let $\widehat{GSR}(k, j, i)$ be the z-score standardized data value for subject $k$, event $j$ and sample at time $i$, $SCR(k, j, i)$ the corresponding filtered signal representing the skin conductance response, and $L$ the total number of samples per time window. Features extracted include the mean GSR (Eq. 1), the accumulated GSR (Eq. 2), the max GSR (Eq. 3), and the peak-to-peak distance in SCR (Eq. 4). Each feature is calculated on the 10 s before (Pre) and the 10 s after (Post) the test event, and the difference ($\Delta$) is used as the final measure.

$$GSR_{Mean}(k, j) = \frac{\sum_{i=0}^{L} \widehat{GSR}(k, j, i)}{L} \quad (1)$$

$$GSR_{Acc}(k, j) = \sum_{i=0}^{L} \widehat{GSR}(k, j, i) \quad (2)$$

$$Max(k, j) = \max_i \big( \widehat{GSR}(k, j, i) \big) \quad (3)$$

$$PP(k, j) = \max_i \big( SCR(k, j, i) \big) - \min_i \big( SCR(k, j, i) \big) \quad (4)$$

G. Questionnaire
Subjective data about the experience can be collected through questionnaires. The questionnaire that we designed tackles factors affecting trust and, in general, HMI effectiveness [9]. Specific sections were included to test each of these factors. The questionnaire includes both general questions, which could be re-used across different driving scenarios, as well as questions that are more specific to the HMI and to the simulated scenario. We focused our attention on those aspects that are more relevant for establishing trust in an initial learning phase, where the user gets acquainted with the system. When possible, questions were borrowed from validated tools such as the Simulator Sickness Questionnaire (SSQ) [46], the Situation Awareness Rating Technique (SART) [47] and the NASA Task Load Index (NASA-TLX) [48]. Questions were organized in the following sections.
1) Health status: VR systems may induce motion sickness and other side effects; to avoid biases, health status is collected before and after the experience using the SSQ tool.
2) System competence: Inspired by standard questions for the evaluation of trust in human-robot interaction (HRI), this section evaluates the perceived system's competence across the range of driving situations explored in the simulation [49].
3) Reaction to test events: For each test event, the user is asked to rate four statements: 1) The situation was dangerous; 2) The event took me by surprise; 3) I was able to see the potential danger before it affected the vehicle's performance; and 4) The interface provided me useful information to foresee the event. These questions provide complementary information to the physiological signals and disentangle the effect of the specific event from that of the HMI.
4) Situational awareness: This section was inspired by the SART tool, focusing on the dimensions (quality, quantity and familiarity) that pertain to comprehensibility. Here, quality refers to the usefulness with respect to clarifying the system's intentions. Quality and quantity were evaluated for each element of the HMI, e.g., bounding boxes, navigation lines, etc.
5) Cognitive load: This section was adapted from the NASA-TLX evaluation tool.
6) Overall user experience: This section investigates general aspects regarding the mental model, and is concluded with a direct question about trust. Predisposition towards participating in an AD experience was also assessed before and after the simulation.
7) Immersion and presence: Immersion, presence and simulation fidelity were evaluated by adapting the relevant sections from the VRUSE questionnaire [50], an established technique to measure the usability of VR applications.

All questions were in Italian and had to be rated on a 1–5 Likert scale. Sections Reaction to test events and Situational awareness included snapshots of the test events and the HMI elements, respectively; the questionnaire was adapted for each test group with snapshots from the specific HUD version. The complete questionnaire (SEL version) is available at https://forms.gle/CpSYZc729fho7gy86.

IV. EXPERIMENTS
A. Data Acquisition
Healthy individuals (i.e., with no impairing chronic or acute illnesses at the time of the acquisition) with a valid driving license were recruited to participate in the virtual driving experiment. Participation was voluntary and no monetary compensation was provided. Study participants were randomly assigned to either the OMN or the SEL HUD group. All acquisitions were performed within one week.

The test phase began for each subject with a brief explanation of the test session. Health status, demographic information and general disposition towards AD systems were collected before starting the simulation. Two baseline signals were also acquired: one minute at rest, and one minute after placing the VR headset. After the simulation, the final questionnaire was administered and the experience debriefed.

The GSR was recorded through an ad-hoc device based on the Grove GSR Sensor [51] and a Raspberry Pi 3 board. The acquisition module was implemented in Python. An external analog-to-digital converter (MCP3008 [52]) was used to connect the output of the sensor to the board via the Raspberry Pi's Serial Peripheral Interface (SPI). The sampling frequency was set to 256 Hz in order to separate the two components of the GSR signal [44]. Due to inter-subject variability, the GSR may saturate during the analog-to-digital conversion: therefore, during the initial baseline acquisition, the converter was manually calibrated by adjusting the resistor until the output fell in the 200–512 a.u. range. The sensors were applied on the fingers of the non-dominant hand, after washing the hands. Postprocessing and feature extraction were
implemented in Python 3.6.5, using the SciPy library for filtering; all calculations were performed on an HP Pavilion laptop with an Intel Core i5-3230M CPU.
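As a sketch of the acquisition path, the MCP3008 returns its 10-bit reading in the last two bytes of a three-byte SPI transfer. The helper names below are ours, and the commented `spidev` usage is an untested assumption about the wiring.

```python
def decode_mcp3008(reply) -> int:
    """Convert the 3-byte SPI reply of an MCP3008 into a 10-bit value (0-1023).

    The request for a single-ended read is [0x01, (0x08 | channel) << 4, 0x00];
    in the reply, the two least-significant bits of byte 1 and all of byte 2
    carry the conversion result.
    """
    return ((reply[1] & 0x03) << 8) | reply[2]

def in_calibration_range(value: int) -> bool:
    """Check against the 200-512 a.u. window used for manual calibration."""
    return 200 <= value <= 512

# On the Raspberry Pi, one sample could be read roughly as follows
# (requires the spidev package and SPI enabled; not executed here):
#
#   import spidev
#   spi = spidev.SpiDev()
#   spi.open(0, 0)  # bus 0, chip-select 0
#   raw = decode_mcp3008(spi.xfer2([0x01, 0x80, 0x00]))  # channel 0
```

Sampling this at 256 Hz yields the raw trace that the filtering and normalization steps of Section III-F operate on.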
B. Statistical Analysis
A two-way factorial Analysis of Variance (ANOVA) was conducted to examine the main effect of HUD, as well as the interaction effect between event and HUD type, on each GSR feature. A mixed design was employed, with the HUD type as the between-groups factor and the event as the within-subjects factor. Post-hoc comparison between the different events and HUD types was performed applying Bonferroni correction.

Questionnaire data was analyzed separately for each group of questions. Event-related questions were analyzed using a two-way factorial ANOVA, with the same design used for the GSR features. Outcomes of the other questions were compared between the OMN and SEL groups using the Mann-Whitney U-test for categorical data. A p-value of .05 or lower was considered to indicate a statistically significant difference. Statistical analysis was performed using SPSS v20, whereas signal analysis and feature extraction were coded in Python.

C. Participants' Characteristics
Thirty-nine subjects volunteered to participate in the study. One subject with excessive motion sickness was excluded from the dataset, as symptoms would bias the physiological response [53]. A total of 38 subjects (25 male, 13 female, mean age 23.9) were included in the analysis. GSR data was not available for 8 subjects due to failures in the recording equipment. Most of the subjects reported using VR or driving simulators "never" or "rarely" (30/38 and 34/38, respectively).
D. Quantitative Measurements
The normalized GSR signals averaged over all study subjects within each group are reported in Fig. 4(a)–(b). All subjects showed an increase in baseline GSR in VR. Moreover, a noticeable peak in the GSR occurred for most events in the test scenario. Fig. 4(c)–(d) show the mean SCR curve for each event. Each curve is extracted for a time window of ±10 s centered at each event; within each window, all samples are divided by the first value to highlight changes.

Fig. 4. Normalized raw GSR signal for OMN (a) and SEL (b) interfaces; the baseline is collected prior to the experience, and red lines represent test events in the simulation. (c)–(d): average SCR curves over all subjects in the 10 s before and after each test event, for the OMN (c) and SEL (d) interfaces.

From the SCR and GSR curves, different features have been extracted, as defined in Section III-F2. We here report in detail the two-way ANOVA results for the ΔPP feature. The main effect of HUD was significant, F(1,28)=4.72, p=.039, indicating a statistically significant difference between the OMN and SEL interfaces. The main effect of event was also statistically significant, F(6,168)=13.9, p<.001. We did not find a significant interaction between HUD and event, F(6,168)=1.74, p=.115; hence, post-hoc analyses were conducted on each main effect separately.

The mean and standard error of the SCR feature for each event and for each HUD are reported in Fig. 5. At post-hoc analysis, the SEL HUD consistently showed higher emotional arousal for the Car1 (p=.022), Car2 (p=.042) and Man2 (p=.041) events. For the first two events in the timeline, a positive trend
could be observed (p=.181 and p=.409). For the Scooter and Man1 events, which elicited no emotional arousal, differences were not statistically significant (p=.759 and p=.990).

All GSR features showed a significant main effect of HUD: ΔMax, F(1,28)=8.53, p=.007; ΔGSR_Mean, F(1,28)=9.36, p=.005; and ΔGSR_Acc, F(1,28)=9.02, p=.006. Likewise, a significant main effect of event was always found (p<.001), with no significant interaction between HUD and event. At post-hoc analysis, results for the ΔMax feature were comparable to ΔPP, whereas the ΔGSR_Mean and ΔGSR_Acc features reported significant differences (p<.05) for the Ball, Car1 and Car2 events, instead of the Car1, Car2 and Man2 events.

Fig. 5. ΔPP feature, for all events, with OMN and SEL. * p-value < .05.
For each HUD and each test event, the PP values in the 10 s before (Pre) and the 10 s after (Post) the event were tested for differences using two-tailed t-tests. With the SEL HUD, all events showed a significant increase in SCR (p<.001), except for Scooter (p=.927) and Man1 (p=.920). Very similar results were obtained with the OMN HUD: the Dog (p=.014), Ball (p=.046), Car1 (p=.007), Car2 (p=.001) and Man2 (p=.003) events showed a significant effect on SCR, whereas Scooter (p=.142) and Man1 (p=.422) did not. Results for the other GSR features, here omitted for brevity, were also consistent.

E. Questionnaire Results
Only one subject was excluded from this analysis due to high motion sickness; the other subjects did not report excessive symptoms (nausea rating M=1.26, SD=.54).

Subjective ratings for the test events are reported in Fig. 6. Four statements were included for each test event, as detailed in Section III-G3; for the sake of clarity, only question 1 (which evaluates the risk) and question 3 (which evaluates the ability to detect the potential danger in advance) are included in the plots, as answers to questions 2 and 4 were very similar. At two-way ANOVA, the main effects of both HUD, F(1,36)=15.91, p<.001, and event, F(6,216)=54.05, p<.001, on the perceived risk (question 1) were statistically significant. The interaction between the two factors failed to reach statistical significance, F(6,216)=2.05, p=.060. Regarding the ability to identify dangerous situations in advance (question 3), the main effects of both HUD, F(1,36)=28.08, p<.001, and event, F(6,216)=14.78, p<.001, were statistically significant, without a significant interaction, F(6,216)=1.75.

Since the events are the same in both groups, we attribute the difference in perceived risk to the greater ability of the OMN interface to convey information about the vehicle's surroundings before critical situations occur. At post-hoc analysis, differences were statistically significant for the Car1 (p=.003), Car2 (p=.017) and Man2 (p=.008) events, and a positive trend was observed for the Dog (p=.134) and Ball (p=.872) events.

For each event, questionnaire ratings and GSR feature values were compared using multiple linear regression; by attempting to predict the average GSR outcome (ΔPP) from the average questionnaire ratings, we can infer the degree of similarity between the two measurements. A per-subject analysis was not attempted, given the limited sample size. A statistically significant regression equation was found, F(4,9)=14.34, p=.0007, with an adjusted R² of 0.804, which indicates that roughly 80% of the variance of the GSR can be explained by the questionnaires. Individual factors failed to reach statistical significance, but the strongest trends were observed for the perceived level of risk (coefficient 0.111, p=.29) and the element of surprise (coefficient 0.112, p=.29), which are presented in the scatter plots of Fig. 7.

Subjects generally found the vehicle's driving skills adequate (SEL M=4.53, SD=0.61; OMN M=4.68, SD=0.48; p=.556). In the SEL group, subjects reported more often that the vehicle faced difficulties with unexpected changes in the environment (SEL M=1.68, SD=0.75; OMN M=1.21, SD=0.42; p=.41); such differences can only be attributed to the HUD, considering that the vehicle's behavior was exactly the same in both experiences.

Displaying more information may result in an excessive cognitive load. Indeed, subjects in the OMN group more often rated the amount of information provided by the interface as excessive (OMN M=2.1, SD=0.229; SEL M=1.05, SD=0.809; p<.001), whereas comprehensibility was rated adequate for both interfaces (p=.908). On average, the UX was satisfactory for both interfaces, and the information provided by the HUD was considered useful (SEL M=4.16, SD=0.69; OMN M=4.84, SD=0.38; p=.001). Participants in the OMN group reported that the information was more useful to understand why the vehicle made a decision (SEL M=4.26, SD=1.05; OMN M=4.84, SD=0.38; p=.055) and to feel at ease in general (SEL M=3.79, SD=1.08; OMN M=4.68, SD=.59; p=.003), as well as that the vehicle seemed to have greater control over the external environment (SEL M=3.79, SD=1.08; OMN M=4.84, SD=0.38; p<.001). Overall, the OMN HUD was more helpful in anticipating potential dangers (SEL M=2.42, SD=0.61; OMN M=4.10, SD=0.57; p<.001). Subjects reported a high sense of immersion (M=4.50, SD=0.73) and presence (M=4.37, SD=0.59), with no significant difference between the two groups.

Fig. 6. Subjective measurements for questionnaire section Reaction to test events. Label 1 refers to the question which evaluates risk perception, on a scale from 1 (low risk) to 5 (high risk); label 3 refers to the question that evaluates if and how the individual noticed the dangerous situation in advance, on a scale from 1 (not previously noticed) to 5 (previously noticed). Each test event is considered separately. * p-value < .05, ** p-value < .01.

Fig. 7. Comparison of subjective vs. objective ratings. For subjective measurements, the average perceived risk (a) and the average surprise rating (b) are reported (where the latter refers to the extent to which the user was taken by surprise by the event). The mean ΔPP feature is reported as the objective rating. Each data point corresponds to a specific test event.

Finally, users in the OMN group were better disposed
towards participating in a real AD experience (OMN M=4.68, SD=0.58; SEL M=4.05, SD=0.85; p=.012). As shown in Fig. 8, prior to the experiment all participants were mildly optimistic, but after experiencing the OMN HUD, the attitude towards the technology markedly improved (M=4.0 vs. M=4.68, p=.002). Complete data is provided in the Supplemental material.

Fig. 8. Disposition towards participating in a real AD experience. On the left, the pre-test answers for the SEL and OMN interfaces; on the right, the post-test answers. Scale from 1 (absolutely negative) to 5 (absolutely positive). Mann-Whitney U-tests between pre- and post-test answers are shown; * p-value < .05, ** p-value < .01.

V. DISCUSSION
We here proposed a methodology to validate the UX in AD systems based on continuous, quantitative information gathered from physiological signals while the user is immersed in a VR driving simulation. Our methodology is exemplified by the comparison of two AR-HUD-based interfaces which differ in the amount of information displayed to the users. By controlling all aspects of the simulated environment, we were able to disentangle the effect of very specific design choices and measure their impact on the overall UX. It must be stressed that the only difference between the two groups was the information displayed by the HUD, as the simulation was otherwise identical; the study groups were also homogeneous in terms of age, sex and ethnicity.

Our results confirmed that providing "why" information is important to reassure the user of the system's competence and to promote trust and situational awareness [3], [9]. To the best of our knowledge, ours is the first contribution to evaluate a realistic HUD displaying a wide range of visual and auditory cues about the vehicle and its surroundings, as is expected in future AVs. Given the number of objects involved in realistic scenarios, an omni-comprehensive (OMN) display could lead to an excessive cognitive load. A possible way to reduce the information load, which we denoted as selective (SEL), is to display only the most relevant visual cues in the current context. Indeed, our results indicated that users found the information displayed by the OMN HUD slightly excessive, although acceptable in both cases, but this was compensated by a less stressful driving experience, as confirmed both by subjective and objective measures.

This difference is especially evident when potentially dangerous events occur, such as a pedestrian crossing the street at the last minute.
It is worth noting how the HMI influenced the perception of external events, on one hand, and of the vehicle's performance, on the other hand, despite the fact that the simulated scenario was identical in those respects. For instance, users in the OMN group perceived the vehicle as better equipped to deal with unexpected changes in the environment. We argue that this difference arose as a consequence of the mental model that users formed: as the information provided by the HUD allowed users to better anticipate dangerous situations, they projected this feeling onto the AV as well.

Our results have important implications for AI research in AD, and specifically for the sensory sub-systems, as HMI constraints need to be considered in their design. For instance, end-to-end training from sensory input to planning does not explicitly extract all the information that was included in this simulated HMI [54]. In our simulation, the information displayed by the SEL HUD was chosen based on a set of heuristics that could be further improved by exploiting a more advanced AI, such as the ability to predict the motion of objects and pedestrians to foresee potentially dangerous situations before they actually affect the vehicle's trajectory.

In this study, we have sought to be as independent as possible from specific AD systems, e.g., by simulating perfect vehicle sensing capabilities. Our conclusions are thus unaffected by potential errors or misses in the AD object detection system. The proposed methodology could certainly be employed to test other types of autonomous vehicles and their underlying AI systems, by changing the modeled interior and/or behavior. It would also be possible to investigate how possible errors may affect the UX and trust.

The proposed scenario is certainly representative of the learning phase as defined in [9]. Information display by the HUD is particularly relevant in this initial phase, when the user is still forming a mental model of how the AD system works.
Our results may not apply entirely to the performance phase, in which the user has observed the AD system for a prolonged period of time. However, the unexpected events or accidents which we simulate, while rare, can have a profound effect on trust, both at the individual and collective level. It should be noticed that trust begins to form even before the first interaction with the system, e.g., based on information from the media, or personal preferences [8], [9]. This was evident in our study where, initially, many subjects were not willing to participate in a real AD experience. However, participating in the VR experience, and being exposed to an informative interface, significantly improved their acceptance of AD systems. In a simulated setting, all AD technologies, as well as all types of events, can be recreated, opening interesting opportunities for "training" future users of AV technology.

GSR proved capable of detecting the user's stress in response to potentially dangerous events, in line with previous literature results which, however, were obtained in the context of manual or partially automated driving [4], [26]. Notably, differences in HMI design were reflected in observable changes in GSR levels, even when using consumer electronics sensors. The GSR response was correlated to the perceived risk as measured by subjective questionnaires, as well as to the "surprise" factor, which depends on the HMI. We here focused on the response to specific events, but the methodology could be extended to extract features that characterize the entire experience [26].
VI. CONCLUSIONS AND FUTURE WORK
In this work we proposed a methodology to validate the UX in AD systems based on continuous, quantitative information gathered from physiological signals while the user is immersed in a VR driving simulation. Its effectiveness was shown in the context of HMI design, and specifically applied to the comparison of HUD-based interfaces for AVs that provide visual cues about the vehicle's sensory and planning systems. We explored in this exemplification the role of HMI in eliciting a sense of trust and safety in AD systems, as this will be key for humans to relinquish control of the vehicle.

The proposed methodology relies on physiological signals (GSR in this specific embodiment) to provide continuous, quantitative and objective feedback. This is particularly relevant for the simulation of AD systems, as objective measures in driving research are traditionally based on the driver's performance and behavior. A limitation of GSR is that it measures arousal, but is a poor indicator of valence. In our specific case, the experience was engineered to elicit a sense of distress and, hence, a positive valence was excluded. In the future, this lack could be overcome by including other sensors, e.g., to measure the HR, other types of features that reflect different characteristics of the UX, as well as machine learning models to more accurately detect the passengers' affective state.

It should be noticed that the increasing adoption of wearable devices like smartwatches incorporating a growing set of health sensors will open additional opportunities for AVs' personalization; anthropomorphism, customization and adaptivity are also important factors for trustworthy HMI [9].
While the physiological response (1–5 s in the case of GSR) is too slow to be exploited for actual driving, it could be used to customize various aspects of the HMI, like the quantity and quality of information displayed, and of the overall driving experience. The proposed testing methodology could be extended to cover the above scenarios as well as other aspects of the UX (e.g., considering not just in-vehicle scenarios, but also vehicle-to-pedestrian interactions [55], long-term performance [9], etc.), by adjusting the simulation, the HMI and/or the vehicle's AI as needed.
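One way such arousal-driven HMI customization could look in practice is sketched below. This is purely hypothetical: the thresholds, mode names, and hysteresis rule are invented for illustration, not part of the study, though the two information levels mirror the Selective and Omni-comprehensive interfaces compared in the paper.

```python
import numpy as np

# Hypothetical adaptation rule: smooth the normalized arousal signal over
# several seconds (GSR responds with a 1-5 s lag, too slow for driving
# control but fast enough to tune the HMI), then switch between a
# selective (SEL-like) and an omni-comprehensive (OMN-like) display mode.
def hmi_mode(arousal_window, low=0.3, high=0.7):
    """Pick an HMI information level from recent normalized arousal samples."""
    level = float(np.mean(arousal_window))  # smooth over the recent window
    if level > high:
        return "omni"       # stressed: show the full picture of vehicle perception
    if level < low:
        return "selective"  # relaxed: show only imminent hazards
    return "keep-current"   # hysteresis band: avoid flickering between modes

print(hmi_mode([0.8, 0.9, 0.85]))  # → omni
print(hmi_mode([0.1, 0.2, 0.15]))  # → selective
```

The dead band between the two thresholds is a deliberate design choice: an interface that constantly switches detail levels would itself become a source of distraction.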
ACKNOWLEDGMENT

The authors want to thank Dario Doronzo and Antonello Laurino for their contributions to the system implementation. This research was partly supported by the VR@Polito lab.
REFERENCES

[1] W. Jiadai, L. Jiajia, and K. Nei, "Networking and communications in autonomous driving: A survey," IEEE Comm. Surveys & Tutor.
International Journal on Interactive Design and Manufacturing, vol. 9, no. 4, pp. 269–275, 2015.
[4] L. Eudave and M. Valencia, "Physiological response while driving in an immersive virtual environment," in 2017 IEEE 14th International Conference on Wearable and Implantable Body Sensor Networks (BSN). IEEE, 2017, pp. 145–148.
[5] R. Jose, G. A. Lee, and M. Billinghurst, "A comparative study of simulated augmented reality displays for vehicle navigation," in Proceedings of the 28th Australian Conference on Computer-Human Interaction, ser. OzCHI '16. New York, NY, USA: ACM, 2016, pp. 40–48.
[6] P. Lungaro, K. Tollmar, and T. Beelen, "Human-to-AI interfaces for enabling future onboard experiences," in Proceedings of the 9th International Conference on Automotive User Interfaces and Interactive Vehicular Applications Adjunct, ser. AutomotiveUI '17. New York, NY, USA, 2017, pp. 94–98.
[7] D. Harrison McKnight and N. L. Chervany, "Trust and distrust definitions: One bite at a time," in Trust in Cyber-societies, R. Falcone, M. Singh, and Y.-H. Tan, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001, pp. 27–54.
[8] J. D. Lee and K. A. See, "Trust in automation: Designing for appropriate reliance," Hum. Factors, vol. 46, no. 1, pp. 50–80, 2004.
[9] F. Ekman, M. Johansson, and J. Sochor, "Creating appropriate trust in automated vehicle systems: A framework for HMI design," IEEE Trans. Hum.-Mach. Syst., vol. 48, no. 1, pp. 95–101, 2018.
[10] F. M. F. Verberne, J. Ham, and C. J. H. Midden, "Trust in smart systems: Sharing driving goals and giving information to increase trustworthiness and acceptability of smart systems in cars," Hum. Factors, vol. 54, no. 5, pp. 799–810, 2012.
[11] R. Häuslschmid, M. von Bülow, B. Pfleging, and A. Butz, "Supporting trust in autonomous driving," in Proceedings of the 22nd International Conference on Intelligent User Interfaces. New York, NY, USA: ACM, 2017, pp. 319–329.
[12] A. Doshi, S. Y. Cheng, and M. M. Trivedi, "A novel active heads-up display for driver assistance," IEEE Trans. Syst., Man, Cybern. B, vol. 39, no. 1, pp. 85–93, 2009.
[13] Z. Medenica, A. L. Kun, T. Paek, and O. Palinko, "Augmented reality vs. street views: A driving simulator study comparing two emerging navigation aids," in Proc. 13th Int. Conf. on Human Computer Int. with Mobile Dev. and Serv. ACM, 2011, pp. 265–274.
[14] S. Kim and A. K. Dey, "Simulated augmented reality windshield display as a cognitive mapping aid for elder driver navigation," in Proceedings of the SIGCHI Conference on Hum. Factors in Computing Systems, ser. CHI '09. New York, NY, USA: ACM, 2009, pp. 133–142.
[15] B.-J. Park, J.-W. Lee, C. Yoon, and K.-H. Kim, "Augmented reality for collision warning and path guide in a vehicle," in Proceedings of the 21st ACM Symposium on Virtual Reality Software and Technology, ser. VRST '15. New York, NY, USA: ACM, 2015, pp. 195–195.
[16] R. Haeuslschmid, L. Schnurr, J. Wagner, and A. Butz, "Contact-analog warnings on windshield displays promote monitoring the road scene," in Proceedings of the 7th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, ser. AutomotiveUI '15. New York, NY, USA: ACM, 2015, pp. 64–71.
[17] M. T. Phan, I. Thouvenin, and V. Frémont, "Enhancing the driver awareness of pedestrian using augmented reality cues," in , Nov 2016, pp. 1298–1304.
[18] L. Guo, S. Manglani, Y. Liu, and Y. Jia, "Automatic sensor correction of autonomous vehicles by human-vehicle teaching-and-learning," IEEE Trans. on Vehicular Technology, vol. 67, no. 9, pp. 8085–8099, 2018.
[19] F. Bazzano, F. Gentilini, F. Lamberti, A. Sanna, G. Paravati, V. Gatteschi, and M. Gaspardone, "Immersive virtual reality-based simulation to support the design of natural human-robot interfaces for service robotic applications," in Augmented Reality, Virtual Reality, and Computer Graphics – Third International Conference, AVR 2016, Lecce, Italy, June 15-18, 2016, Proceedings, Part I, 2016, pp. 33–51.
[20] Y. Chen, C. Stout, A. Joshi, M. L. Kuang, and J. Wang, "Driver-assistance lateral motion control for in-wheel-motor-driven electric ground vehicles subject to small torque variation," IEEE Trans. on Vehicular Technology, vol. 67, no. 8, pp. 6838–6850, 2018.
[21] Y. Wang, B. Mehler, B. Reimer, V. Lammers, L. A. D'Ambrosio, and J. F. Coughlin, "The validity of driving simulation for assessing differences between in-vehicle informational interfaces: A comparison with field testing," Ergonomics, vol. 53, no. 3, pp. 404–420, 2010.
[22] N. Mullen, J. Charlton, A. Devlin, and M. Bedard, Simulator Validity: Behaviours Observed on the Simulator and on the Road, 1st ed. Australia: CRC Press, 2011, pp. 1–18.
[23] S. Balters and M. Steinert, "Capturing emotion reactivity through physiology measurement as a foundation for affective engineering in engineering design science and engineering practices," J. of Intell. Manuf., vol. 28, no. 7, pp. 1585–1607, 2017.
[24] D. Ruscio, L. Bascetta, A. Gabrielli, M. Matteucci, D. Ariansyah, M. Bordegoni et al., "Collection and comparison of driver/passenger physiologic and behavioural data in simulation and on-road driving," in 2017 5th IEEE International Conference on Models and Technologies for Intelligent Transportation Systems (MT-ITS), 2017, pp. 403–408.
[25] M. J. Johnson, T. Chahal, A. Stinchcombe, N. Mullen, B. Weaver, and M. Bedard, "Physiological responses to simulated and on-road driving," Int. J. Psychophysiol., vol. 81, no. 3, pp. 203–208, 2011.
[26] J. Healey, R. W. Picard et al., "Detecting stress during real-world driving tasks using physiological sensors," IEEE Trans. Intell. Transp. Syst., vol. 6, no. 2, pp. 156–166, 2005.
[27] R. R. Singh, S. Conjeti, and R. Banerjee, "Assessment of driver stress from physiological signals collected under real-time semi-urban driving scenarios," Int. J. of Comput. Int. Sys., vol. 7, no. 5, pp. 909–923, 2014.
[28] B. Dalgarno and M. J. Lee, "What are the learning affordances of 3-D virtual environments?" Brit. J. Ed. Tech., vol. 41, no. 1, pp. 10–32, 2010.
[29] GENIVI Vehicle Simulator. [Online]. Available: https://at.projects.genivi.org/wiki/display/PROJ/GENIVI+Vehicle+Simulator
[30] R. Marino, S. Scalzi, and M. Netto, "Nested PID steering control for lane keeping in autonomous vehicles," Control Engineering Practice, vol. 19, no. 12, pp. 1459–1467, 2011.
[31] A. Laurino, "Virtual reality-based simulation tools for evaluating user experience in autonomous vehicles," Master's thesis, 2018.
[32] P. A. Hancock, R. J. Jagacinski, R. Parasuraman, C. D. Wickens, G. F. Wilson, and D. B. Kaber, "Human-automation interaction research: Past, present, and future," Ergon. in Design, vol. 21, no. 2, pp. 9–14, 2013.
[33] S. W. A. Dekker and D. D. Woods, "MABA-MABA or abracadabra? Progress on human-automation co-ordination," Cogn. Technol. Work, vol. 4, pp. 240–244, 2002.
[34] ISO, "Intelligent transport systems – forward vehicle collision warning systems – performance requirements and test procedures," ISO 22324:2015, 2013.
[35] A. Sebastian, M. Tang, Y. Feng, and M. Looi, "Multi-vehicles interaction graph model for cooperative collision warning system," in , June 2009, pp. 929–934.
[36] G. Johansson and K. Rumar, "Drivers' brake reaction times," Hum. Factors, vol. 13, no. 1, pp. 23–27, 1971.
[37] P. George, I. Thouvenin, V. Frémont, and V. Cherfaoui, "DAARIA: Driver assistance by augmented reality for intelligent automobile," in
International Conference on Applied Hum. Factors and Ergonomics, 2017, pp. 220–228.
[41] B. Kim, D. Kim, K. Kim, and K. Yi, "High-level automated driving on complex urban roads with enhanced environment representation," in , Oct 2015, pp. 516–521.
[42] G. Valenza, A. Lanata, and E. P. Scilingo, "The role of nonlinear dyn. in affective valence and arousal recognition," IEEE Transactions on Affective Computing, vol. 3, no. 2, pp. 237–249, 2012.
[43] M. Meehan, B. Insko, M. Whitton, and F. P. Brooks Jr, "Physiological measures of presence in stressful virtual environments," in ACM Trans. Graphics (TOG), vol. 21, no. 3, 2002, pp. 645–652.
[44] B. Figner, R. O. Murphy et al., "Using skin conductance in judgment and decision making research," A Handbook of Process Tracing Methods for Decision Research, pp. 163–184, 2011.
[45] M. Slater, C. Guger, G. Edlinger, R. Leeb, G. Pfurtscheller, A. Antley et al., "Analysis of physiological responses to a social situation in an immersive virtual environment," Presence: Teleoperators and Virtual Environments, vol. 15, no. 5, pp. 553–569, 2006.
[46] R. S. Kennedy, N. E. Lane, K. S. Berbaum, and M. G. Lilienthal, "Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness," Int. J. Aviat. Psychol., vol. 3, pp. 203–220, 1993.
[47] R. Taylor, "Situational awareness rating technique: The development of a tool for aircrew systems design," in Sit. Awar., 2017, pp. 111–128.
[48] NASA. [Online]. Available: https://humansystems.arc.nasa.gov/groups/TLX/
[49] K. E. Schaefer, "Measuring trust in human robot interactions: Development of the trust perception scale-HRI," in Robust Intelligence and Trust in Autonomous Systems. Springer, 2016, pp. 191–218.
[50] R. S. Kalawsky, "VRUSE – a computerised diagnostic tool for usability evaluation of virtual/synthetic environment systems," Appl. Ergon., vol. 30, no. 1, pp. 11–25, 1999.
[51] "Grove GSR sensor Seeed Wiki," 2017. [Online]. Available: http://wiki.seeedstudio.com/Grove-GSR Sensor/
[52] "Raspberry ADC," 2015. [Online]. Available: https://learn.adafruit.com/raspberry-pi-analog-to-digital-converters/mcp3008
[53] B. Patrao, S. Pedro, and P. Menezes, "How to deal with motion sickness in virtual reality," Sciences and Tech. of Int., 2015 22nd, pp. 40–46, 2015.
[54] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner et al., "End to end learning for self-driving cars," arXiv:1604.07316, 2016.
[55] R. Amir and K. T. John, "Autonomous vehicles that interact with pedestrians: A survey of theory and practice," IEEE Transactions on Intelligent Transportation Systems, in press.
Lia Morra received the M.Sc. and the Ph.D. degrees in computer engineering from Politecnico di Torino, Italy, in 2002 and 2006. Currently, she is a senior postdoctoral fellow at the Dip. di Automatica e Informatica of Politecnico di Torino. Her research interests include computer vision, pattern recognition, and machine learning.
Fabrizio Lamberti is an associate professor at the Dip. di Automatica e Informatica of Politecnico di Torino. His research interests are mainly in the areas of computer graphics, HMI and intelligent computing. He is serving as Associate Editor for the IEEE Transactions on Computers, the IEEE Transactions on Emerging Topics in Computing, the IEEE Transactions on Learning Technologies and the IEEE Transactions on Consumer Electronics. He is a Senior Member of the IEEE.
F. Gabriele Pratticò received his M.Sc. degree in computer engineering from Politecnico di Torino, Italy, in 2017. Currently, he is a Ph.D. student at Politecnico di Torino, where he carries out research in the areas of mixed reality, HMI, serious games and user experience design.
Salvatore La Rosa received the M.Sc. degree in biomedical engineering from Politecnico di Torino, Italy, in 2019. His major interests regard biosignal analysis, pattern recognition, machine learning and embedded systems.
Paolo Montuschi is a full professor at the Dip. di Automatica e Informatica and a Member of the Board of Governors of Politecnico di Torino, Italy. His research interests include computer arithmetic, computer graphics, and intelligent systems. He is serving as 2019 Acting (interim) Editor-in-Chief of the IEEE Transactions on Emerging Topics in Computing and as the 2017-19 IEEE Computer Society Awards Chair. He is an IEEE Fellow, and a life member of the International Academy of Sciences of Turin and of IEEE Eta Kappa Nu.
APPENDIX
In this Appendix, complete questionnaire results are provided. In particular, in Table II, characteristics of subjects involved in the user study are given (per group). Results concerning motion sickness are reported in Fig. 9. Feedback collected through the other sections of the questionnaire are provided in Tables III–VI. Specifically, subjective evaluations pertaining to the questionnaire sections System Competence, Cognitive Load and Overall User Experience are reported in Table III. Subjective measurements for the questionnaire section Reaction to Test Events are reported in Table IV, and complete the results presented in Fig. 6 of the main text. Subjective evaluations pertaining to the questionnaire section on Situational Awareness are reported in Table V. Quality and quantity were evaluated for each element of the HMI, e.g., bounding boxes, navigation lines, etc., where quality in this context refers to how useful the specific element was in understanding the vehicle's behaviour as well as the surrounding environment. Finally, questions related to the VR experience (questionnaire section Immersion and Presence) are reported in Table VI. All questions were in Italian and had to be rated on a 1–5 Likert scale. The Selective (SEL) and Omni-comprehensive (OMN) AR-HUD interfaces are compared using the Mann-Whitney U-test. A p-value of .05 or lower was considered to indicate a statistically significant difference; statistically significant differences are highlighted in bold in the tables.
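The per-statement SEL/OMN comparisons can be reproduced with any standard Mann-Whitney U-test implementation. A sketch using scipy follows; the Likert ratings below are made-up illustrative values, not the study's raw responses.

```python
from scipy.stats import mannwhitneyu

# Two independent groups of 1-5 Likert ratings (illustrative values only).
sel = [4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4]
omn = [5, 5, 4, 5, 5, 4, 5, 5, 5, 4, 5, 5, 5, 4, 5, 5, 5, 5, 4]

# Likert responses are ordinal and heavily tied, so the rank-based
# Mann-Whitney U-test is appropriate where a t-test's normality
# assumption is not.
stat, p = mannwhitneyu(sel, omn, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.4f}")
if p <= 0.05:  # the paper's significance threshold
    print("SEL and OMN ratings differ significantly")
```

Running the test once per questionnaire statement, as done in Tables III–VI, inflates the family-wise error rate; the very small p-values reported for most significant rows (.001 or below) are robust to this, but borderline rows near .05 should be read with that caveat in mind.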
TABLE II
SUBJECT CHARACTERISTICS BY STUDY GROUP

Test group | Gender | Age
SEL | Male / Female | 22.53 ± …
TABLE III
SUBJECTIVE RESULTS ON SYSTEM COMPETENCE, COGNITIVE LOAD, AND USER EXPERIENCE COLLECTED THROUGH QUESTIONNAIRES ON A 1-TO-5 (DISAGREE TO AGREE) SCALE. INSPIRED BY STANDARD QUESTIONS FOR THE EVALUATION OF TRUST IN HUMAN-ROBOT INTERACTION (HRI), THIS SECTION EVALUATES THE PERCEIVED AV'S COMPETENCE ACROSS THE RANGE OF DRIVING SITUATIONS EXPLORED IN THE SIMULATION. OVERALL USER EXPERIENCE INVESTIGATES GENERAL ASPECTS REGARDING THE MENTAL MODEL, AND THE TRUST POSED IN THE SYSTEM.

Statement | SEL µ | SEL σ | OMN µ | OMN σ | U-test p-value

System Competence
The autonomous vehicle showed adequate decision-making skills | 4.526 | .612 | 4.684 | .478 | .556
The autonomous vehicle faced difficulties during unexpected changes in the environment | 1.684 | .749 | 1.211 | .419 | .041
The autonomous vehicle was smart | 4.579 | .692 | 4.368 | .761 | .415
I appreciated the driving skills of the autonomous vehicle | 4.158 | .602 | 4.105 | .567 | .863

Cognitive Load
It was demanding to find information within the HUD | 1.000 | .000 | 1.053 | .229 | 1.000
It was stressful to find information within the HUD | 1.000 | .000 | 1.105 | .315 | .486
Generally the amount of information provided by the HUD was excessive | 1.053 | .809 | 2.105 | .229 | .000
Generally the comprehensibility of information provided by the HUD was adequate | 4.737 | .562 | 4.684 | .582 | .908

Overall User Experience
HUD information was useful to build trust in the vehicle | 4.158 | .688 | 4.842 | .375 | .001
HUD information was helpful to understand why the vehicle made a decision | 4.263 | 1.046 | 4.842 | .375 | .055
HUD information helped me feel comfortable and at ease | 3.789 | 1.084 | 4.684 | .582 | .003
Thanks to HUD information, I had the perception that the vehicle was in full control of the situation | 3.789 | 1.084 | 4.842 | .375 | .000
The HUD user interface was able to inform me before the potential danger affected driving | 2.421 | .607 | 4.105 | .567 | .000
Generally, I have found that HUD information was helpful in anticipating the dangerous situation | 2.474 | .612 | 4.421 | .607 | .000
TABLE IV
SUBJECTIVE ASSESSMENT OF TEST EVENTS COLLECTED THROUGH QUESTIONNAIRES ON A 1-TO-5 (DISAGREE TO AGREE) SCALE. THESE QUESTIONS PROVIDE COMPLEMENTARY INFORMATION TO THE PHYSIOLOGICAL SIGNALS BY INVESTIGATING THE PERCEPTION OF THE EVENT ITSELF (SENSE OF DANGER) AS WELL AS THE QUALITY OF THE HMI IN PRESENTING THE EVENT TO THE USER (SENSE OF SURPRISE).

Statement | SEL µ | SEL σ | OMN µ | OMN σ | U-test p-value

Test Event - Dog
How dangerous would you rate this situation? (not at all - very dangerous) | 3.368 | .831 | 3.053 | .911 | .269
I was surprised by this situation | 4.421 | .769 | 3.737 | .991 | .016
I detected the potential danger before it affected driving | 1.632 | 1.065 | 2.737 | 1.046 | .001
The information displayed was useful to anticipate the potential danger | 1.579 | .692 | 3.000 | .816 | .001

Test Event - Ball
How dangerous would you rate this situation? (not at all - very dangerous) | 3.053 | 1.079 | 3.000 | .943 | .872
I was surprised by this situation | 3.632 | 1.012 | 3.053 | .970 | .102
I detected the potential danger before it affected driving | 2.053 | 1.224 | 2.789 | 1.228 | .064
The information displayed was useful to anticipate the potential danger | 1.579 | .692 | 3.263 | 1.368 | .000

Test Event - Car1
How dangerous would you rate this situation? (not at all - very dangerous) | 3.579 | .961 | 2.421 | 1.170 | .003
I was surprised by this situation | 3.368 | 1.300 | 1.947 | 1.129 | .001
I detected the potential danger before it affected driving | 3.000 | 1.528 | 4.158 | 1.259 | .007
The information displayed was useful to anticipate the potential danger | 2.632 | 1.012 | 3.947 | 1.129 | .001

Test Event - Scooter
How dangerous would you rate this situation? (not at all - very dangerous) | 1.895 | 1.243 | 1.368 | .684 | .208
I was surprised by this situation | 1.842 | 1.068 | 1.684 | 1.003 | .628
I detected the potential danger before it affected driving | 2.105 | 1.286 | 2.000 | 1.291 | .773
The information displayed was useful to anticipate the potential danger | 1.211 | .713 | 1.053 | .229 | .743

Test Event - Car2
How dangerous would you rate this situation? (not at all - very dangerous) | 4.211 | .787 | 3.421 | 1.017 | .017
I was surprised by this situation | 4.105 | .994 | 3.053 | .970 | .002
I detected the potential danger before it affected driving | 1.895 | .994 | 3.316 | 1.204 | .000
The information displayed was useful to anticipate the potential danger | 2.053 | .848 | 3.158 | .958 | .000

Test Event - Man1
How dangerous would you rate this situation? (not at all - very dangerous) | 1.316 | .671 | 1.263 | .452 | 1.000
I was surprised by this situation | 1.316 | .582 | 1.158 | .501 | .390
I detected the potential danger before it affected driving | 3.474 | 1.712 | 4.474 | .772 | .093
The information displayed was useful to anticipate the potential danger | 3.421 | 1.170 | 4.316 | .885 | .018

Test Event - Man2
How dangerous would you rate this situation? (not at all - very dangerous) | 4.474 | .697 | 3.737 | .933 | .008
I was surprised by this situation | 4.526 | .841 | 3.368 | 1.012 | .000
I detected the potential danger before it affected driving | 1.842 | 1.015 | 3.053 | 1.079 | .001
The information displayed was useful to anticipate the potential danger | 1.474 | .697 | 3.158 | .958 | .001
TABLE V
SUBJECTIVE ASSESSMENT OF HMI ELEMENTS COLLECTED THROUGH QUESTIONNAIRES ON A 1-TO-5 (DISAGREE TO AGREE) SCALE. SELECTIVE (SEL) AND OMNI-COMPREHENSIVE (OMN) AR-HUD INTERFACES ARE COMPARED USING THE MANN-WHITNEY U-TEST.

Statement | SEL µ | SEL σ | OMN µ | OMN σ | U-test p-value

Situational Awareness
Bounding boxes (BB) helped me understand that the car had taken charge of the traffic lights and handled them appropriately | 4.824 | .393 | 4.765 | .437 | 1.000
Labels helped me understand that the car had taken charge of the traffic lights and handled them appropriately | 4.842 | .501 | 4.947 | .229 | .743
BB helped me understand that the car had taken charge of the road sign and handled it appropriately | 4.556 | .784 | 4.500 | .730 | .772
Labels helped me understand that the car had taken charge of the road sign and handled it appropriately | 4.684 | .478 | 4.684 | .671 | .714
BB helped me understand that the car had taken charge of the potential obstacle (pedestrian, animal) and handled it appropriately | 4.316 | 1.157 | 4.737 | .452 | .492
Labels helped me understand that the car had taken charge of the potential obstacle (pedestrian, animal) and handled it appropriately | 4.263 | 1.195 | 4.579 | .838 | .558
BB helped me to understand that the car had taken charge of the other cars likely affecting the driving and figured out how to handle them | 4.895 | .315 | 4.895 | .315 | 1.000
Labels helped me to understand that the car had taken charge of the other cars likely affecting the driving and figured out how to handle them | 4.842 | .501 | 4.684 | .749 | .500
Navigation line has been helpful in understanding the vehicle's intentions | 4.944 | .236 | 4.947 | .229 | 1.000
Other vehicles' navigation lines helped me to understand that the car had taken charge of their proximity and figured out how to handle them | 5.000 | .000 | 4.579 | .961 | .046
BB colour linked to the level of risk helped me to understand that the car had taken charge of obstacles and handled them appropriately | 4.800 | .561 | 4.600 | .737 | .555
The warning sound of traffic light/road sign helped me to understand that the car had taken charge of the situation | 4.688 | .602 | 4.250 | .775 | .106
The warning sound in case of danger helped me to understand that the car had taken charge of the situation and handled it appropriately | 4.500 | .857 | 4.588 | .618 | 1.000

Quantity/Mental Workload: how would you rate the quantity of information? (1=poor, 3=adequate, 5=excessive)
Number of bounding boxes and labels for traffic lights | 3.053 | .229 | 3.105 | .459 | .604
Number of bounding boxes and labels for road signs | 2.579 | .607 | 3.263 | .562 | .001
Number of bounding boxes and labels for potential obstacles (pedestrian, animal, etc.) | 2.474 | .697 | 3.105 | .315 | .001
Number of bounding boxes and labels for traffic cars | 2.789 | .419 | 3.474 | .697 | .001
Number of navigation lines for the traffic cars | 2.842 | .375 | 3.474 | .905 | .014
The warning sound of traffic light/road sign | 2.368 | .684 | 3.316 | .582 | .000
The warning sound in case of danger | 2.526 | .697 | 3.000 | .000 | .008
TABLE VI
SUBJECTIVE ASSESSMENT OF THE VR SIMULATOR COLLECTED THROUGH QUESTIONNAIRES ON A 1-TO-5 (DISAGREE TO AGREE) SCALE. RELEVANT SECTIONS FROM THE VRUSE QUESTIONNAIRE SELECTED TO EVALUATE THE SIMULATED ENVIRONMENT WITH RESPECT TO IMMERSION, PRESENCE AND FIDELITY.

Statement | SEL µ | SEL σ | OMN µ | OMN σ | U-test p-value

Immersion and Presence
I felt a sense of being immersed in the virtual environment | 4.526 | .697 | 4.474 | .772 | .999
The quality of the image reduced my feeling of presence | 2.105 | 1.150 | 2.474 | 1.020 | .239
I had a good sense of scale in the virtual environment | 4.947 | .229 | 4.842 | .375 | .604
The presence of my hands and legs within the VR enhanced my sense of presence | 4.632 | .684 | 4.579 | .769 | 1.000
The motion platform enhanced my sense of presence | 4.474 | 1.073 | 4.579 | .838 | .987
Overall I would rate my sense of presence as: (1) very unsatisfactory - (5) very satisfactory | 4.368 | .597 | 4.421 | .692 | .769