The Virtual Emotion Loop: Towards Emotion-Driven Services via Virtual Reality
Davide Andreoletti, Luca Luceri, Tiziano Leidi, Achille Peternier, Silvia Giordano
University of Applied Sciences and Arts of Southern Switzerland (SUPSI) [email protected]
Abstract
The importance of emotions in service and product design is well known. Despite this, it is still not well understood how users' emotions can be incorporated into a product or service lifecycle. In this paper, we argue that this gap is due to the lack of a methodological framework for effectively investigating the emotional response of persons using products and services. Indeed, the emotional response of users is generally investigated by means of methods (e.g., surveys) that are not effective for this purpose. In our view, Virtual Reality (VR) technologies represent the perfect medium to evoke and recognize users' emotional responses, as well as to prototype products and services (and, for the latter, even deliver them). In this paper, we first provide our definition of emotion-driven services, and then we propose a novel methodological framework, referred to as the Virtual-Reality-Based Emotion-Elicitation-Emotion-Recognition loop (VEE-loop), that can be exploited to realize it. Specifically, the VEE-loop consists of a continuous monitoring of users' emotions, which are then provided to service designers as implicit user feedback. This information is used to dynamically change the content of the VR environment (VE) until the desired affective state is solicited. Finally, we discuss issues and opportunities of the VEE-loop, and we present potential applications of the VEE-loop in research and in various application areas.
Traditionally, products and services are developed considering their functional requirements as the primary design objective. However, the success of a product is also (or, according to many, perhaps even mainly) determined by its ability to engage its users at the emotional level. In this respect, the authors of Ref. [1] claim that up to 95% of our buying decisions are unconscious. Despite this, the role of emotions is often underestimated, if not totally disregarded, when designing products and services [2], and this contradiction is generally interpreted as a phenomenon of cultural inertia [2]. While also supporting this idea, in this paper we argue that another crucial reason is the lack of a methodological framework that enables designers to effectively consider the emotional response of users of services and products. In this paper, we give our definition of emotion-driven service/product and we propose a methodological framework, enabled by Virtual Reality (VR) technologies, that can be used towards the realization of this idea.

We say that a service (or product) is emotion-driven if the emotions of the persons that use it are taken into consideration in as many phases of its lifecycle as possible (e.g., from design to delivery). Commonly, services undergo a validation process aimed at assessing the fulfillment of their technical and utility requirements. In our view, emotion-driven services should be produced by performing a similar validation of their emotional requirements as well. To this end, we propose a methodological framework that allows services to be developed following an emotion-by-design approach, i.e., by integrating users' emotions from the very early stages of service development, and not, as commonly done, by leaving them as an afterthought.

To realize this idea, we envision the implementation of a loop in which users' emotions are monitored and the characteristics of the service are dynamically changed to induce the emotions intended by its designers. This dynamic change can be implemented either in the design phase only (e.g., to validate the hypothesis that some feature triggers a specific emotional reaction) or, whenever possible, in the delivery phase as well (e.g., by dynamically modifying the characteristics of a service in real time). In this paper, we provide arguments supporting the claim that VR is the ideal instrument to realize this loop, which we refer to as the Virtual-Reality-Based Emotion-Elicitation-Emotion-Recognition loop or, in short, the VEE-loop. The reasons why we consider VR the most suitable technology to realize this scheme are manifold. First, among all existing digital technologies, VR is the one that guarantees the most tangible experience across the most varied domains. Second, VR allows the experience of its users to be flexibly modified. In addition, Head Mounted Displays (HMDs) allow the collection of a significant number of valuable users' bio-feedback signals (e.g., movements and postures) that can be exploited to infer their emotional status [3], as well as their involvement, fatigue, and stress [4].

The VEE-loop has a potentially high number of application areas. For example, the VEE-loop can be used as a tool to validate the capability of a product to trigger specific emotions before its actual production. Indeed, designers could obtain emotional feedback from potential customers and tune the design accordingly.
In this respect, the tangibility of VR guarantees a higher fidelity of this emotional reaction with respect to other methods, while its flexibility allows a high number of product characteristics to be tested. The VEE-loop can also improve services in which information on users' affective states would be highly beneficial but is unavailable for some reason (e.g., due to physical distancing measures imposed to handle the Covid-19 pandemic). In remote learning, for example, the emotional status of students can be monitored, and the virtual lecture dynamically changed (e.g., to induce calm in students or to draw their attention).

This paper is structured as follows. In Section 2 we elaborate on the importance of designing products and services based on users' emotions, we give our definition of emotion-driven services and products, and we motivate our proposal of the VEE-loop as an enabling framework to realize them. Then, in Section 3 we provide more details on the main components of the VEE-loop and show its novelty with respect to previous works. Section 4 discusses potential application areas of the proposed VEE-loop. Finally, Section 5 elaborates on the challenges and opportunities of our framework and adds some concluding remarks.

Figure 1: High-Level Representation of the VEE loop. The Emotion Recognition block detects the user's emotion based on the implicit feedback collected from VR; the Emotion Elicitation block dynamically adapts the VR according to i) the currently detected emotion and ii) the target emotion.
This Section starts by elaborating on the importance of users' emotions when using products and services. Our aim is to clarify the rationale of designing emotion-driven products and services, which are also defined in this Section. Finally, we motivate the use of VR as an enabling technology towards the realization of this vision.
The authors of Ref. [2] make a series of claims on the importance of the emotions of persons that use products and services. First, emotions convey valuable information on the perceptions that users have when using a product. Emotions are also related to users' desires, which are often latent and not verbalized. Therefore, traditional marketing strategies fail to effectively capture the emotions that a given product or service elicits in its users. Indeed, traditional marketing strategies, such as surveys, allow designers to identify new functional requirements that only help make slight and superficial product modifications. Instead, by also capturing the emotional reaction of users, designers could understand their customers more deeply, and so create more radical innovations. In this respect, our proposed VEE-loop would enable designers to qualitatively measure their users' emotions, and to test a high number of combinations of a product's sensory qualities (e.g., form and color).

Then, co-design (i.e., performed jointly by designers and customers) allows the creation of more personalized services and products, which are generally regarded as more attractive than those created by the designers only, or by the customers only [2]. As users are often incapable of verbalizing their emotions, the emotional feedback obtained through the use of the VEE-loop is indeed valuable, as it could help both designers and customers understand what customers really like about a product.

Finally, emotions are extremely important to consolidate brand identification. Indeed, a clear and tight connection between a brand and a specific emotion ensures the loyalty of customers in the long term. In this respect, it is crucial that all the phases of a product or service lifecycle evoke the very same emotion. For instance, both the use and the after-sales assistance should, ideally, evoke the same emotions in customers. The VEE-loop can be used to check whether there is consistency between intended and perceived emotions and, in case there is not, adapt the content to ensure this consistency.

The aforementioned facts justify the idea of implementing the proposed loop involving emotion recognition and elicitation. In the following subsection, we elaborate more on the characteristics of emotion-driven services and products.
Ideally, an emotion-driven service or product considers users' emotions in all the phases of its lifecycle. Our proposal for taking users' emotions into consideration is the VEE-loop. As an example, the design phase can exploit the VEE-loop to validate the fulfillment of users' emotional requirements before the realization of the actual tangible product. This implies, for instance, that a product undergoes large-scale tests aimed at validating the assumption that it can evoke the intended emotions. In this phase, users' emotions are tracked and used as input for the designers, who can better understand which specific characteristic has caused the detected emotions. Then, various combinations of features are tried (exploiting the flexibility of VR), and the operation is repeated, in a loop. Similarly to the design phase, the VEE-loop can also be used during the delivery phase to dynamically change the service until it elicits the emotions intended by the designers.

We are aware that this paradigm might not be applicable to all the phases of a product/service lifecycle. Let us clarify this issue with a couple of examples. In an idealistic emotion-driven schooling, for instance, a teacher would prepare her lecture (i.e., the design phase) also trying to induce enthusiasm in her students, and she would eventually modify the style (or even the content) of the lecture to achieve this goal (i.e., the delivery phase). On the other hand, the designer of a commercial product might make particular stylistic choices with the aim of eliciting a specific emotion that reinforces brand identification [5] (e.g., a sense of comfort), but these choices would be limited to the design phase only (in case the product cannot change after it has been made).

In the following subsection, we provide arguments that support our choice of VR as the most suitable candidate technology to implement our idea of a loop of emotion recognition and elicitation. Indeed, VR can be employed both during design (exploiting its flexibility and tangibility) and during delivery, as an increasing number of services are also being provided via VR.
VR is the most natural, direct, and suitable technology for translating into real products and services all the ideas explained so far, for many reasons. First, pure VR (i.e., a user interacting with an entirely synthetic, computer-generated virtual environment) allows creating completely modifiable, dynamic experiences. Unlike augmented and mixed reality, which are limited by and linked to the surrounding physical elements, pure VR can be easily distributed online, experienced everywhere, replayed at will, and its content regularly updated.

The immersion provided by VR also amplifies emotional reactions [6, 7], which helps both in the emotion recognition and elicitation phases compared to other, less effective means of putting a user in a given simulated situation. In addition, the retention rate of learning and training dispensed via VR is increased when compared to more conventional media [8].

The recent revival of VR also provides a much lower entry point to the technology, which is now considered a commodity, off-the-shelf option no longer limited to research laboratories or professional contexts. Thanks to this evolution, which also significantly increased the quality of modern VR compared to the state of the art of just a few years ago, a significantly larger user base can now be targeted by VR-enabled solutions.

Finally, modern VR equipment already embeds sensors that are critical for inferring the user's emotional state (and its evolution) during the virtual experience. Since body tracking is a central requirement of VR, most recent Head Mounted Displays (HMDs) are capable of tracking the user's head and hand positions in real time and at high frequency, while some models have started including eye tracking, too. These sensors can be used not only for the proper positioning of the user within the virtual environment (e.g., to update the viewpoint and stereoscopic rendering parameters) and to precisely determine what the user is looking at at any given moment, but also to derive a series of additional metrics such as heartbeat and respiratory rate [9] (see the sketch below). Next-generation HMDs will directly embed dedicated sensors for monitoring such states (like the HP Reverb G2 Omnicept).

This constant source of information can be used to acquire data that previously required equipping the user with a cumbersome set of devices and/or preparing the environment for different levels of motion tracking (from a simple Microsoft Kinect to professional-grade systems such as the Vicon). Most of these capabilities are now integrated into one single device that provides all the ingredients for building an emotion recognition and elicitation system in a wearable and affordable package. Nevertheless, HMDs can still be coupled with additional monitoring devices to increase the amount, kind, and accuracy of user-generated signals for this task (e.g., by combining the full-body tracking provided by the Microsoft Kinect with the head and hand positions returned by the headset).
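To make the last point concrete, the following is a minimal sketch of how a slow physiological rhythm such as the respiratory rate could be approximated from the HMD's own head-tracking signal. It is only an illustrative spectral-peak heuristic, not the method used in Ref. [9]; the function name, the sampling parameters, and the 0.1-0.5 Hz breathing band are assumptions made for this example.

```python
import numpy as np

def respiratory_rate_from_head_motion(vertical_pos: np.ndarray,
                                      sample_rate_hz: float) -> float:
    """Rough respiratory-rate estimate (breaths per minute) from the slow
    periodic component of the HMD's vertical head position.

    Illustrative only: it picks the dominant spectral peak in a plausible
    breathing band (0.1-0.5 Hz); it is not the method of Ref. [9].
    """
    signal = vertical_pos - vertical_pos.mean()            # remove static offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    band = (freqs >= 0.1) & (freqs <= 0.5)                 # typical breathing band
    if not band.any():
        raise ValueError("Signal too short to resolve the breathing band")
    dominant_freq = freqs[band][np.argmax(spectrum[band])]
    return dominant_freq * 60.0                            # Hz -> breaths per minute
```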
The VEE-loop consists of the continuous monitoring of the affective states of users (performed by analysing their bio-feedback signals) and of the adaptation of the content of the VR to induce a transition from the current to the desired affective state (e.g., from fear to calm), or to keep the current emotion stable. We refer to the content of the VR as the Virtual Environment (VE). To make this possible, the VEE-loop is composed of a module for Emotion Recognition (ER) and one for Emotion Elicitation (EE). An overall representation of the VEE-loop architecture is depicted in Fig. 2. Specifically, this figure shows that user-generated bio-feedback is given as input to the ER module, which infers from it the emotion most likely perceived by the user. The detected emotion and the emotion that the service designer aims to evoke are then passed to the EE module, which dynamically changes the content of the VR. We further articulate the ER and EE components in the next subsections.
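To illustrate the control flow just described, a minimal sketch of one iteration of the loop is given below. All class, method, and field names (BioFeedback, EmotionRecognizer, EmotionElicitor, the string-valued emotions, and the dictionary of VE parameters) are hypothetical; the sketch only shows how the detected and target emotions drive the adaptation of the VE.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical data container; names and fields are illustrative only.
@dataclass
class BioFeedback:
    head_motion: List[float]   # e.g., head-tracking samples from the HMD
    hand_motion: List[float]   # e.g., controller/hand-tracking samples
    voice: List[float]         # e.g., audio samples from a microphone

class EmotionRecognizer:
    """ER module: maps a window of bio-feedback to the most likely emotion."""
    def infer(self, feedback: BioFeedback) -> str:
        # Placeholder for the feature-extraction / fusion / segmentation /
        # classification layers described in this Section.
        return "neutral"

class EmotionElicitor:
    """EE module: selects the next VE content given current and target emotion."""
    def adapt(self, detected: str, target: str, ve_params: Dict) -> Dict:
        # Placeholder rule: only change the VE when the detected emotion
        # differs from the one the designer wants to evoke.
        if detected != target:
            ve_params = {**ve_params, "variant": ve_params.get("variant", 0) + 1}
        return ve_params

def vee_loop_step(recognizer, elicitor, feedback, target_emotion, ve_params):
    """One iteration of the VEE-loop: recognize, then elicit."""
    detected = recognizer.infer(feedback)
    return detected, elicitor.adapt(detected, target_emotion, ve_params)
```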
The ER module is responsible for inferring, from a set of multi-modal data, the emotion that the user is most likely perceiving. Note that data acquisition might be performed with the HMD, as well as with other supporting tools that do not hinder the VR experience (e.g., wearable devices). The ER module consists of the following layers:

• Feature Extraction: hand-crafted and learned features can be considered for each type of gathered data (e.g., acceleration of joints for body movements, or spectrograms for voices);

• Fusion: this layer is meant to combine data, features and algorithms to maximally exploit the information contained in users' data, in order to increase the generalization of the ER module;

• Segmentation: this layer is meant to make algorithms designed to work on standalone signals also able to recognize emotions from a continuous stream of data (i.e., to segment the stream into portions of signal in which a particular emotion is carried);

• Emotion Classification: a supervised learning algorithm that is trained to perform a classification of users' emotions.

Figure 2: Architecture of the proposed VEE-Loop

To summarize, the ER module infers emotions from a continuous stream of user-generated data (and not, as commonly done in ER research, from standalone data, i.e., signals associated with a single emotion), and works both in a single-mode (e.g., on users' movements only) and in a multi-mode manner (e.g., on a combination of users' movements and voice); a minimal sketch of such a pipeline is given at the end of this subsection.

As for Emotion Elicitation (EE), this module is responsible for selecting the content of the VE based on i) the emotions detected by the ER module and ii) the emotions that designers aim to evoke. An open research question is how this selection can be performed. Due to the complexity of the task, rule-based automatic adaptation of VEs should be considered first. However, model-based adaptations might be considered as well (e.g., using machine learning).
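The sketch below instantiates the ER layers described above, assuming body-tracking data as the only modality. The fixed-length windowing, the hand-crafted speed/acceleration features, and the choice of a random forest classifier are illustrative assumptions; they stand in for the richer fusion and segmentation layers of the full module.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def extract_features(window: np.ndarray) -> np.ndarray:
    """Hand-crafted features for one window of joint positions (T x J x 3):
    per-joint mean speed and acceleration magnitude."""
    velocity = np.diff(window, axis=0)
    acceleration = np.diff(velocity, axis=0)
    return np.concatenate([
        np.linalg.norm(velocity, axis=-1).mean(axis=0),
        np.linalg.norm(acceleration, axis=-1).mean(axis=0),
    ])

def segment(stream: np.ndarray, window: int, hop: int):
    """Naive segmentation layer: cut the continuous stream into overlapping
    fixed-length windows (a stand-in for true onset/offset detection)."""
    for start in range(0, len(stream) - window + 1, hop):
        yield stream[start:start + window]

# Emotion classification layer: any supervised classifier can be plugged in;
# a random forest is used here purely as an illustrative choice.
clf = RandomForestClassifier(n_estimators=100)

def train(labelled_windows, labels):
    X = np.stack([extract_features(w) for w in labelled_windows])
    clf.fit(X, labels)

def recognize_stream(stream: np.ndarray, window: int = 120, hop: int = 30):
    """Classify each window of the continuous stream of body-tracking data."""
    X = np.stack([extract_features(w) for w in segment(stream, window, hop)])
    return clf.predict(X)
```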
Ref. [10] proposes an architecture to perform users' emotion-driven generation of a VE and is, among previous works, the one closest to the aim of the proposed framework. The authors of [10] also validate the effectiveness of the architecture in the context of mental health treatment. Such an architecture is designed to detect users' emotions from multi-mode data (similarly to our ER module) and, accordingly, to generate a VE to stabilize them, e.g., to induce calm (similarly to our EE module). However, this existing architecture makes use of a very simple algorithm (i.e., a linear regression), while our proposed framework aims at developing a richer module that includes the most efficient existing ER algorithms. Then, in Ref. [10] the generated VE is a simple maze, while our goal is to develop several more complex VEs in various applicative scenarios.

The second relevant existing work that investigates the use of VR as a tool to perform ER and EE can be found in Ref. [11], whose authors propose, as we do, an integrated system to perform both ER and EE in VR, and make the following important claim: they are the first to apply machine learning strategies to perform ER in the context of VR. Given that Ref. [11] is a very recent PhD thesis (dated April 2020), we argue that the proposed system can provide a significant contribution to the current state of the art. Indeed, Ref. [11] presents three main drawbacks that we aim to address: i) the VE is static, while our framework aims to be dynamic and able to automatically adapt to users' current affective states; ii) only the user's electroencephalogram is used as input to the ER module, while we envision the gathering of a larger set of users' bio-feedback signals; iii) a very simple machine learning algorithm is employed, while our framework is expected to consider a vast array of advanced machine learning algorithms. Most of the research on ER is done on single-mode and standalone data (see the recent survey [12]), which carry acted and exaggerated emotions. Instead, the proposed framework makes it possible to consider streams of multi-mode data (which introduce the challenge of identifying the onset and end of emotions) and to exploit the immersiveness of the VR experience to induce (and then recognize) more spontaneous emotions. Ref. [13] shows evidence that VR is more effective than traditional media for performing EE, and studies its influence on decision-making processes.
In this Section, we describe the application areas where the VEE-loop can be employed, as well as the potential impact that it can have across several dimensions of society.
By enabling emotion-driven services and products, the VEE-loop opens the way to a wide spectrum of applications. We envision these potential applications to fall into three main areas, which we refer to as: 1) service delivery, such as education [14] and human-machine interaction and familiarization (https://v-machina.supsi.ch/), 2) customer experience, e.g., in marketing to understand what creates favourable and unique consumer-brand relationships [15, 5], 3) research and development, both in academia and in industry.

First, the VEE-loop can benefit the delivery of services where the information on users' emotional states is unavailable for some reason. Examples of applications in this category are remote schooling (e.g., due to Covid-19 restrictions) and virtual training and practice (e.g., due to expensive and dangerous machinery). In such scenarios, the emotional response of students (or trainees) can be monitored in order to dynamically adapt the virtual environment (or the required tasks) and enhance users' learning experience.

Second, the VEE-loop can be used to enhance customer experiences by providing VR scenarios where users can test products, while allowing companies to validate the capability of products to trigger specific emotions, even before their production. In such a way, designers could obtain an emotional response from potential customers and tune the design accordingly in order to better tell their "brand stories". For instance, in interior design, the inferred emotions can be used to understand which factors reinforce people's well-being [16, 17].

Finally, the VEE-loop has the objective of stimulating research to create innovative emotion-driven services capable of smoothing the transition towards an increasingly digitalized society, as well as advancing the state of the art in the growing fields of emotion recognition and elicitation. We envision the development of research and applications in disparate areas, ranging from user experience (UX) design (e.g., to optimize users' spatial perceptions) to fine arts (e.g., in theatrical performances, to understand the relation between emotions and the mechanisms of embodied acting).
In light of the numerous potential applications described above, our pioneering solution can impact various dimensions of our society. We distinguish five dimensions where the VEE-loop can provide benefits, which we detail as follows.
Economic Impact
The VEE-loop finds applications in countless industrial sectors, while providing potential economic advantages both in the production and in the marketing phases. For example, it can be used by advertising experts to understand what reinforces unique brand associations, or by designers to evaluate users' emotional response to the characteristics of a product before its tangible development. This allows designers to make more informed decisions, therefore reducing the risks (and associated costs) of creating unsuccessful products and services.
Social Impact
The VEE-loop can help deliver more empathetic services via VR, therefore having a high social impact across many different areas (e.g., remote schooling). Potential applications can also target the treatment of pathologies characterized by disorders of the emotional sphere (e.g., autism) and collective training in emergency situations.
Environmental Impact
The VEE-loop integrates emotional aspects into services delivered remotely, therefore increasing their adoption. This has the potential of enabling remote working and practices, thus limiting unnecessary travel and, in turn, reducing the emissions produced by means of transport.
Research Impact
Our vision contributes to the research on ER and EE, and provides a tool that researchers can readily use in studies related to these fields. The VEE-loop is a novel and timely solution that can be a potential cornerstone of many different projects (from research-oriented to more applicative ones), therefore enabling transversal collaborations between academia and industry.
Cultural Impact
The VEE-loop enables avant-garde cultural events delivered via VR. For instance, the stylistic choices of a cultural event (e.g., a theatrical performance) can be modified according to the emotional response of the audience (even if attending remotely), in real time and in an economically sustainable manner. This asset can find application in several cultural scenarios, e.g., theatre, virtual city trips and museum virtual tours.
The VEE-loop inherits all the benefits of VR technologies, first and foremost the flexibility of real-time content modification. While this characteristic is typical of other digital media as well, VR also guarantees a much more tangible experience to its users, giving them the impression of dealing with real products (an experience that is not possible with other digital technologies). Moreover, VR-simulated environments significantly outperform other media in evoking emotions. In fact, the sense of presence experienced in VR leads users to perceive more spontaneous emotions that, for this reason, are a more valuable feedback for the designers. These facts allow service and product designers to experiment with a high number of stylistic (and also functional) choices, and to obtain reliable emotional feedback from their users. The fact that emotions are more spontaneous, however, also poses the challenge of correctly identifying them. Indeed, most of the research on emotion recognition is based on the analysis of emotions that are voluntarily exaggerated and that, for this reason, are also easier to recognize. On the other hand, VR makes it possible to perform emotion recognition by exploiting a high number of multi-mode data, either collected with the HMD itself (e.g., head and hand micro-movements and eye movements) or with other supporting acquisition devices (ranging from non-invasive motion capturing technologies to simple wearable devices and microphones). This multitude of heterogeneous data can significantly benefit the emotion recognition task. Another aspect to consider is that emotions must be estimated from a stream of signals and not, as generally done in previous work, from stand-alone data. Therefore, a segmentation process is required to identify the instants of transition between two different affective states, before their actual classification (a simple illustrative sketch is given below). To our knowledge, the problem of segmentation is extensively considered in the action recognition task, but remains quite unexplored in the emotion recognition one. Finally, the dynamic modification of the virtual content is currently done either manually or according to simple rule-based approaches. A significant research effort is required to explore more efficient automation strategies, such as those based on powerful machine learning algorithms.
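As an illustration of the segmentation problem mentioned above, the following is a minimal sketch of a naive change-point heuristic over a stream of per-frame feature vectors. The window size, the threshold, and the mean-shift criterion are assumptions chosen for this example; actual segmentation of affective states would call for proper change-point detection or sequence models.

```python
import numpy as np

def detect_transitions(features: np.ndarray, window: int = 60,
                       threshold: float = 2.0) -> list:
    """Very simple change-point heuristic for a stream of per-frame feature
    vectors (T x D): flag frame t as a candidate affective-state transition
    when the means of the windows before and after t differ by more than
    `threshold` (normalized by the local standard deviation)."""
    transitions = []
    for t in range(window, len(features) - window):
        before = features[t - window:t]
        after = features[t:t + window]
        pooled_std = features[t - window:t + window].std(axis=0) + 1e-8
        distance = np.abs(after.mean(axis=0) - before.mean(axis=0)) / pooled_std
        if distance.mean() > threshold:
            transitions.append(t)  # candidate onset/offset of a new emotion
    return transitions
```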
References

[1] G. Zaltman, "The subconscious mind of the consumer (and how to reach it)," Harvard Business School Working Knowledge, 2003. [Online]. Available: http://hbswk.hbs.edu/item/3246.html

[2] C. Wrigley and K. Straker, Affected: Emotionally Engaging Customers in the Digital Age. John Wiley & Sons, 2019.

[3] H. H. Ip, S. W. Wong, D. F. Chan, J. Byrne, C. Li, V. S. Yuan, K. S. Lau, and J. Y. Wong, "Enhance emotional and social adaptation skills for children with autism spectrum disorder: A virtual reality enabled approach," Computers & Education, vol. 117, pp. 1–15, 2018.

[4] C. Tremmel, C. Herff, T. Sato, K. Rechowicz, Y. Yamani, and D. J. Krusienski, "Estimating cognitive workload in an interactive virtual reality environment using EEG," Frontiers in Human Neuroscience, vol. 13, p. 401, 2019.

[5] K. L. Keller, Building Customer-Based Brand Equity: A Blueprint for Creating Strong Brands. Marketing Science Institute, Cambridge, MA, 2001.

[6] J. Diemer, G. W. Alpers, H. M. Peperkorn, Y. Shiban, and A. Mühlberger, "The impact of perception and presence on emotional reactions: a review of research in virtual reality," Frontiers in Psychology.

[7] PLOS ONE, vol. 14, no. 10, pp. 1–24, 2019. [Online]. Available: https://doi.org/10.1371/journal.pone.0223881

[8] S. K. Babu, S. Krishna, U. R., and R. R. Bhavani, "Virtual reality learning environments for vocational education: A comparison study with conventional instructional media on knowledge retention," 2018, pp. 385–389.

[9] C. Floris, S. Solbiati, F. Landreani, G. Damato, B. Lenzi, V. Megale, and E. G. Caiani, "Feasibility of heart rate and respiratory rate estimation by inertial sensors embedded in a virtual reality headset," Sensors.

[10] IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 5, pp. 1877–1887, 2018.

[11] J. Marín Morales, "Modelling human emotions using immersive virtual reality, physiological signals and behavioural responses," Ph.D. dissertation, 2020.

[12] A. Saxena, A. Khanna, and D. Gupta, "Emotion recognition and detection methods: A comprehensive survey," Journal of Artificial Intelligence and Systems, vol. 2, no. 1, pp. 53–79, 2020.

[13] S. Susindar, M. Sadeghi, L. Huntington, A. Singer, and T. K. Ferris, "The feeling is real: Emotion elicitation in virtual reality," in Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 63, no. 1. SAGE Publications, Los Angeles, CA, 2019, pp. 252–256.

[14] F. Dosseville, S. Laborde, and N. Scelles, "Music during lectures: Will students learn better?" Learning and Individual Differences, vol. 22, no. 2, pp. 258–262, 2012.

[15] R. J. Brodie, A. Ilic, B. Juric, and L. Hollebeek, "Consumer engagement in a virtual brand community: An exploratory analysis," Journal of Business Research, vol. 66, no. 1, pp. 105–114, 2013.

[16] V. De Luca, "Emotions-based interactions: Design challenges for increasing well-being," 2016.

[17] V. De Luca, "Oltre l'interfaccia: emozioni e design dell'interazione per il benessere" [Beyond the interface: Emotions and interaction design for well-being].