Emotion in Future Intelligent Machines
Authors:
Marwen Belkaid* and Luiz Pessoa

Affiliations:
Istituto Italiano di Tecnologia (IIT), Genova, Italy.
Sorbonne Université, CNRS, Institut des Systèmes Intelligents et de Robotique (ISIR), 75005 Paris, France.
Maryland Neuroimaging Center and Departments of Psychology and Electrical and Computer Engineering, University of Maryland, College Park, USA

* Corresponding author:
Address: Istituto Italiano di Tecnologia, Center for Human Technologies, Via Enrico Melen, 83, 16152 Genoa, Italy
Phone: +39 010 817 2246
Email addresses: [email protected] (Marwen Belkaid), [email protected] (Luiz Pessoa)
Abstract
Over the past decades, research in cognitive and affective neuroscience has emphasized that emotion is crucial for human intelligence and in fact inseparable from cognition. Concurrently, there has been a significantly growing interest in simulating and modeling emotion in robots and artificial agents. Yet, existing models of emotion and their integration in cognitive architectures remain quite limited and frequently disconnected from neuroscientific evidence. We argue that a stronger integration of emotion in robot models is critical for the design of intelligent machines capable of tackling real-world problems. Drawing from current neuroscientific knowledge, we provide a set of guidelines for future research in artificial emotion and intelligent machines more generally.

Introduction

Emotion is critical for the flexible, intelligent behavior of biological organisms. Accordingly, multiple attempts to model emotion in robots and artificial agents have been described in the last decades. Yet, how emotion is modeled and how it interfaces with "cognitive architectures" remains poorly developed. We argue that current shortcomings are due to the design of artificial emotion in a manner that is largely disconnected from recent neuroscientific evidence.

Several robotics and artificial intelligence proposals take inspiration from biological cognition (e.g. Mnih et al., 2015; Cully et al., 2015; Moulin-Frier et al., 2017; Doncieux et al., 2018). In many of these, reinforcement learning is used as a model of autonomous learning and decision-making, providing good examples of fruitful interactions between neuroscience and artificial intelligence (Neftci & Averbeck, 2019). However, the framework of reinforcement learning does not encompass critical components of natural emotion; it primarily addresses processes related to operant learning and simple forms of decision-making (see review by Moerland et al., 2018). More generally, and as argued in the present piece, emotion is not sufficiently incorporated into robotics.

Here, we begin by reviewing the literature on emotion modeling in robots and artificial agents. We discuss existing proposals with respect to five criteria: embodiment, behavior, architecture design, theoretical approach, and goal. Our proposed classification is aimed at bridging the gap between computational models of emotion employed in robotics and the neuroscience of emotion. Critically, we emphasize recent findings in brain science that reveal how emotion and cognition are integrated at multiple levels of the brain. We then translate such insights into guidelines for the development of future models of emotion in intelligent machines capable of tackling real-world problems.
1. Virtual and robotic models of emotion
Over the past few decades, there has been growing interest in simulating and modeling emotion in machines such as robots and virtual artificial agents across multiple research fields, including affective computing, social robotics, neurorobotics, and computer animation. Due to the specificity of each discipline in terms of its engineering perspective and research goals, existing proposals are fairly disconnected from one another. However, progress in the field requires identifying common themes while understanding the particular requirements and/or goals of the approaches adopted. We propose that the existing research landscape be understood in terms of five criteria, as follows.
A straightforward dimension of artificial emotion concerns agent embodiment: physical embodiment (robot) or virtual embodiment (e.g. an animated virtual character). Kismet (Breazeal, 2003), Berenson (Karaouzene et al., 2013), and EMYS (Correira et al., 2016) are expressive robots designed to interact with humans. They have actuators controlling eye and mouth movements with enough degrees of freedom to mimic stereotypical emotional facial expressions. While some platforms are based on more anthropomorphic robot faces (e.g. Wu et al., 2009), simpler ones display facial expressions on a screen (e.g. Masuyama et al., 2018). Another aspect, which applies to physically embodied but non-expressive robots, pertains to developing "behavioral regulation" capabilities during task performance (e.g. Avila-Garcia & Cañamero, 2004; Krichmar, 2013; Belkaid et al., 2018). However, while benefiting from real physical embodiment, robotic models suffer from limited behavioral repertoires given the inherent difficulties of movement generation in mechanical systems.

In the domain of virtual embodiment, 3-D animated avatars exhibit rich non-verbal behaviors (gestures, postures, and facial expressions), in addition to verbal utterances that convey socio-emotional cues to human users. Several computational models of emotion have been implemented in conversational virtual agents (e.g. Gratch & Marsella, 2004; Gebhard, 2005), and Greta (Pelachaud, 2009) and MARC (Courgeon & Clavel, 2013) are well-established platforms. To increase the feeling of immersion, animated virtual agents can be integrated in virtual reality (Martin et al., 2011; Ochs et al., 2016), giving users a sense of situated interaction. Nevertheless, these applications suffer from limitations due to the absence of physical interaction with the real world.
Another distinction can be made in terms of the nature of the behavior exhibited by the machine. The majority of artificial emotion models focus on social behavior for the purpose of facilitating human-machine interactions: selecting verbal utterances for customer-service chatbots (Yacoubi & Sabouret, 2018), interacting with and learning from museum visitors (Karaouzene et al., 2013), and mixing verbal and non-verbal behaviors in companion robots (Saint-Aimé et al., 2009; Correira et al., 2016) or virtual trainers (Gratch & Marsella, 2004). But there is much more to emotion than social behaviors, and some models address questions such as approach and avoidance behavior in foraging (Krichmar, 2013), as well as competitive foraging (Avila-Garcia & Cañamero, 2004). Another example is the modulation of attention by emotion-related factors in visual search (Belkaid et al., 2017).
Modularity is an important engineering design principle. Many cognitive architectures, like other designed systems, are modular (e.g. Breazeal, 2003; Courgeon & Clavel, 2013; Correira et al., 2016). Accordingly, when emotion is added to the overall system architecture, it frequently takes the form of a separate module that interacts with other components. In this context, emotion is often conceived as a simple "bias" mechanism that up- or down-regulates other system functions; for example, sensory processing might be increased, or cognitive functions may be deemphasized. In contrast, integrative approaches highlight the interdependence between emotion and cognition in the system (e.g. Avila-Garcia & Cañamero, 2004; Belkaid et al., 2018). The modularity of the overall architecture of intelligent systems is an essential design decision, and a growing literature provides evidence for the integration of emotion and cognition in the brain (Phelps and LeDoux, 2005; Pessoa, 2008; 2013; Grossberg, 2018).
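To make the contrast concrete, the following minimal Python sketch illustrates the modular "bias" pattern described above. All class names, signals, and numbers are hypothetical illustrations of ours, not reproductions of any cited architecture.

```python
# Illustrative sketch of the "emotion as a separate bias module" pattern.
# All names and values are hypothetical; no cited system is reproduced here.

class EmotionModule:
    """Computes a scalar arousal signal from a simple appraisal."""
    def appraise(self, threat_level: float) -> float:
        # Higher threat yields higher arousal, clipped to [0, 1].
        return max(0.0, min(1.0, threat_level))

class Perception:
    def process(self, stimulus: dict, gain: float = 1.0) -> float:
        # Arousal merely up-regulates sensory gain in this design.
        return stimulus["intensity"] * gain

class Agent:
    def __init__(self, use_emotion: bool = True):
        # The tell-tale sign of a modular design: emotion can be
        # switched off and the rest of the system still runs.
        self.emotion = EmotionModule() if use_emotion else None
        self.perception = Perception()

    def step(self, stimulus: dict) -> float:
        gain = 1.0
        if self.emotion is not None:
            gain += self.emotion.appraise(stimulus["threat"])
        return self.perception.process(stimulus, gain)

agent = Agent(use_emotion=False)
print(agent.step({"intensity": 0.8, "threat": 0.9}))  # still works: 0.8
```

The telltale property of this design is that `use_emotion=False` still yields a functioning agent; the integrative view discussed above holds that no such switch exists in biological systems.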
Some computational models of emotion take direct inspiration from emotion theories developed by psychologists, and explicitly instantiate theoretical principles in what can be referred to as a top-down fashion. For example, the ALMA model (Gebhard, 2005) is based on a combination of two theoretical models: the cognitive framework developed by Ortony and colleagues (Ortony et al., 1988) and the Pleasure-Arousal-Dominance scheme of Russell and Mehrabian (1977). The EMA model (Gratch & Marsella, 2004) implements the appraisal and coping theory proposed by Lazarus (1991), and TEATIME (Yacoubi & Sabouret, 2018) implements the action-tendency theory proposed by Frijda (1986). In contrast, artificial emotion can be approached in a bottom-up fashion by focusing on the implementation of specific aspects of natural emotion. For example, Avila-Garcia & Cañamero (2004) propose a hormone-like mechanism as part of homeostatic action selection to address the problem of resource competition (see also Krichmar, 2013). In another application, Boucenna and colleagues (2014) investigated how a robotic system can learn to recognize facial expressions in an unsupervised fashion (i.e. without explicit labels). We note that bottom-up approaches can complement and inform existing emotion theories by providing concrete implementations of processes that are otherwise outlined only descriptively (Belkaid et al., 2018).
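As a flavor of the bottom-up style, the sketch below implements a generic hormone-like signal that is phasically released by harmful events, decays over time, and transiently biases a homeostatic competition between behaviors. It is a minimal illustration in the spirit of the cited work, not a reproduction of any published mechanism; all names and parameter values are our own assumptions.

```python
import random

# Minimal homeostatic action selection with a hormone-like modulator.
# Inspired by, but not reproducing, the bottom-up models cited above.

class Hormone:
    def __init__(self, decay: float = 0.9):
        self.level = 0.0
        self.decay = decay

    def release(self, amount: float) -> None:
        self.level = min(1.0, self.level + amount)

    def step(self) -> None:
        self.level *= self.decay  # the signal washes out over time

def select_action(energy: float, damage: float, hormone: Hormone) -> str:
    # Deficits define drive intensities (simple homeostatic errors).
    hunger_drive = 1.0 - energy
    safety_drive = damage
    # The hormone transiently amplifies the safety drive, biasing the
    # competition between behaviors after a harmful event.
    safety_drive *= 1.0 + 2.0 * hormone.level
    return "seek_food" if hunger_drive > safety_drive else "seek_shelter"

hormone = Hormone()
energy, damage = 0.4, 0.1
for t in range(10):
    if random.random() < 0.2:      # occasional harmful event
        damage = min(1.0, damage + 0.3)
        hormone.release(0.8)       # phasic hormonal response
    print(t, select_action(energy, damage, hormone), round(hormone.level, 2))
    hormone.step()
```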
In general terms, artificial emotion systems can be distinguished based on their research goal. A subset of the literature is application-oriented, with the goal of generating human-like reactions in order to enrich interactions with a human user (e.g. Breazeal, 2003; Pelachaud, 2009). Particular applications include elderly care (Correira et al., 2016) and high-stakes decision-making (Gratch & Marsella, 2004). A complementary goal is to model mechanisms of natural emotion in order to evaluate and test existing frameworks (e.g. Krichmar, 2013; Belkaid et al., 2018). As discussed below, we believe computational and robotic models will play an increasingly important role in advancing the understanding of the neural basis of emotion.
2. Natural emotion: brain, body, and behavior
How emotion-related processes are modeled in robots and artificial agents often contrasts sharply with current knowledge about biological emotion. In the following, we summarize key findings from the neuroscientific literature that highlight the gap between natural and artificial emotion. In particular, we stress the integration between emotion and cognition in humans and animals at multiple levels: brain, body, and behavior.
Historically, the brain basis of emotion was conceptualized in an area-centric manner. For a long period, the hypothalamus was believed to be the emotion center, a role that shifted to the amygdala in the 1980s. In the last decades, not only has the number of regions of the "emotional brain" increased steadily, but how they function via complex circuits is starting to be unraveled. These regions include the medial prefrontal cortex, the orbitofrontal cortex, the cortex of the insula, the thalamus, and many more. Critically, rather than being functionally localized in specific areas, emotion-related processes are implemented by distributed neural circuits that rely on multiple structures at the same time (Pessoa, 2017; Tovote et al., 2015; Lindquist and Barrett, 2012).

More broadly, the classical separation between emotion and cognition has been gradually eroded. Behind the blurring of their boundaries is the notion that mental processes are implemented via large-scale, distributed networks (Sporns, 2010). The networks that have been uncovered in the context of cognitive processes share many nodes (i.e. regions) with those that are important for emotion (Najafi et al., 2016). Thus, neural computations underlying behavior are implemented via overlapping networks. In this manner, specific brain areas affiliate, or group, with multiple large-scale networks depending on behavioral demands.

Even more generally, the separation between mental domains such as perception, cognition, action, motivation, and emotion, while possibly suitable for textbook organization, does not reflect the organization of the brain. To understand how the brain generates complex, flexible, and adaptive behaviors, it is necessary to understand how brain circuits disrespect standard boundaries. In a very real sense, the domains cannot be separated.
Intelligence is not a mere collection of computations occurring in the central nervous system but results from the coupling of the brain, the body, and the environment (Varela et al., 1992; O'Regan & Noë, 2001). From this perspective of embodied cognition, emotion is rooted in homeostatic processes that guarantee bodily integrity and in the associated construction of bodily representations capturing the state of the body at any instant. These key functions engage both subcortical and cortical areas. Thus, neuroscientifically grounded theories of emotion attribute a central role to the body in emotion-related processes. For example, in the core affect theory, bodily states are central to emotional experience (Russell, 2003). In the somatic marker theory, associations between particular situations and patterns of elicited physiological and emotional reactions are established, and help shape behavior (Damasio et al., 1996).
Emotion expressions, such as facial expressions, gestures, and postures, are an important feature of the relationship between emotion and the body (de Gelder et al., 2015; Cowen et al., 2019). The variety and complexity of the processes involved in emotion expression and recognition underline their importance in human social behaviors.

Emotion-behavior coupling is not limited to communicative functions but is also strongly related to motivation and action generation (Frijda, 1986; Blakemore & Vuilleumier, 2017). In living organisms, motivated behaviors are represented in terms of approach and avoidance. Even ostensibly simple behaviors like escape leverage complex cognitive-emotional processes (Evans et al., 2019). More generally, survival – and autonomous function – depends on the ability to generate flexible behaviors and to adapt to dynamic environments. In sum, how an organism acts in its environment is a key problem that depends on emotion-related processes, which are therefore not confined to generating expressive behaviors for communication.
3. Toward better models of emotion
The brief review of the previous section points to several promising research directions. We propose four principles for the development of artificial emotion in the next generation of intelligent machines:

- Emotion models should account for emotion-cognition integration
- Emotion models should subscribe to principles of embodiment
- Emotion models should support both social and non-social behaviors
- Emotion models should inform research on natural emotion
3.1. Account for emotion-cognition integration

Consider a traditional architecture with standard components such as perception and decision-making (Figure 1A). Recognizing the utility of considering affective information, models have included an emotion component that interfaces with some of the processing components. We argue, however, that emotion and cognition should be integrated in the overall architecture such that emotion is involved in all cognitive processes (Figure 1B). In other words, emotion cannot be implemented as an "add-on" to an existing cognitive machine, for example, where it boosts certain perceptual and decisional components based on urgency or threat.

Although Figure 1B illustrates the need to blur the boundary between emotion and the rest of the architecture, emotional computations must be specified at a sublevel that is sufficiently granular to allow the translation of this principle into concrete implementations. Consider the example of attention, a central cognitive operation. A fruitful way to conceptualize attention is in terms of priority maps (Itti et al., 1998). In particular, the priority of a to-be-attended visual item depends on a series of factors, including stimulus salience and top-down control, which can be labeled as perceptual and cognitive factors, respectively. Critically, priority also depends on affective and motivational factors (Anderson and Phelps, 2001; Anderson et al., 2011). For example, an item paired with aversive consequences in the past will acquire negative significance and gain prioritized processing so that it can be adequately handled. Likewise, an item paired with reward in the past will acquire motivational significance. In combination, the determination of priority integrates the multiple factors needed to establish overall object relevance (Figure 1C).

As another example, consider executive control (also called "cognitive control"), which includes operations involved in maintaining and updating information, monitoring conflict and/or errors, resisting distracting information, inhibiting prepotent responses, and shifting goals. A useful way to conceptualize executive control is in terms of a set of processes, including inhibition, updating, and shifting (Miyake et al., 2000). Insofar as value, relevance, significance, and so on need to be taken into account for proper executive control, emotion/motivation participate in these processes. In other words, objects or contexts that influence cognitive control processes such that rewards (respectively, punishments) ensue become positively (respectively, negatively) relevant. Why is the architecture in Figure 1A not sufficient? After all, information about what is emotionally/motivationally relevant will be conveyed to the particular architecture components. The central reason is that influences must be bidirectional (Figure 1D). For example, dealing with an emotional stimulus or situation requires multiple adjustments, including "updating" to refresh the contents of working memory, "shifting" to switch the current task subgoal, or "inhibiting" to cancel previously planned actions. In this manner, resources are coordinated in the service of proper function.
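To make Figure 1C concrete, here is a minimal Python sketch of priority-map combination. The weighted-sum rule, the map contents, and the weights are our own illustrative assumptions; actual attention models (e.g. Itti et al., 1998) employ richer feature hierarchies and normalization schemes.

```python
import numpy as np

# Sketch of priority-map combination (cf. Figure 1C). The weighted sum
# and all values are illustrative assumptions, not a published model.

H, W = 4, 4
salience_map = np.random.rand(H, W)  # perceptual: bottom-up stimulus salience
goal_map = np.zeros((H, W))          # cognitive: top-down, task-driven
goal_map[1, 2] = 1.0                 # location relevant to the current plan
affect_map = np.zeros((H, W))        # emotional/motivational significance
affect_map[3, 0] = 0.9               # e.g. location paired with past aversive outcome

weights = {"perceptual": 0.3, "cognitive": 0.4, "affective": 0.3}
priority = (weights["perceptual"] * salience_map
            + weights["cognitive"] * goal_map
            + weights["affective"] * affect_map)

# The most prioritized location integrates all three kinds of factors.
y, x = np.unravel_index(np.argmax(priority), priority.shape)
print(f"attend to location ({y}, {x})")
```

Note that in this toy scheme the affective map is just one input among others; a bidirectional design in the spirit of Figure 1D would additionally let the outcome of attending update the affective map itself.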
Figure 1: Emotion integration in robot cognitive architectures. A) Traditional cognitive architecture in which emotion interacts with other components. While such an architecture seemingly acknowledges an important role for emotion, it is problematic because it implies that emotion is an added module that can in fact be turned off or disconnected from the others. B) Architecture in which the separation between emotion and other components is blurred. This representation emphasizes that emotion is an integral element of the system. C) Attention represented as a collection of priority maps that are combined to determine which elements in the environment the robot should attend to. Priority maps are based on multiple factors: perceptual (e.g. based on saliency), cognitive (e.g. based on the current plan), but also emotional/motivational (e.g. based on value, relevance, significance). D) Executive control is based on a set of operations such as inhibiting, updating, and shifting. These functions need to take emotional factors (e.g. value, relevance, significance) into account to support successful autonomous adaptive behavior. At the same time, the very same functions help determine factors such as value, relevance, and significance.

3.2. Subscribe to principles of embodiment

To stress the importance of embodiment for artificial intelligence, roboticists often use arguments related to morphology and physical interaction with the environment (Brooks, 1991; Pfeifer et al., 2007). As an example, consider a system that must learn the concept of a "chair". Purely vision-based approaches (e.g. using deep neural networks) would need a massive amount of data and would only be able to recognize chairs by shape. In contrast, a humanoid robot able to sit on a flat surface could learn that sitting minimizes energy loss, and thus begin to learn the functional aspects of chairs. In other words, disembodied machines cannot make sense of the world the way humans do. As far as emotion is concerned, the same reasoning applies. For instance, facial expression recognition should be embedded in a system that can produce expressive behavior and associate it with its own internal states, in order to process what is being expressed by the system itself or by others. Otherwise, it amounts to little more than a detector of stereotypical patterns labeled as 'happy' or 'afraid'.

When addressing emotion embodiment in artificial systems, the focus has been on how emotion is expressed through the body (e.g. emotion recognition in computer vision, face actuators in social robotics, synthesis of social cues in computer animation). But for models of emotion and cognition to be truly embodied, the behavior they implement must be driven by core bodily signals of pleasure, pain, satiation, energy depletion, and so on (Figure 2A; see also Froese & Ziemke, 2009, and Man & Damasio, 2019). Indeed, in the previous section we stressed how the emotional/motivational factors of value, relevance, and significance are important for proper autonomous function. This type of information is rooted in the bodily responses that a stimulus or event elicits: reward is processed through signals of pleasure; harm avoidance stems from the sensation of physical pain and the drive to preserve physical integrity; and the successful execution of higher-order goals partly depends on the association between a set of actions and the physiological responses they trigger. Therefore,
building a robot capable of autonomously and intelligently exploring an unknown environment requires mechanisms to monitor energy levels, avoid physical harm, develop a preference for safe locations, attend to objects that are relevant to goals/plans, and switch between goals and behaviors depending on current internal and external states (Figure 2B), all of which rely on embodied emotional-cognitive processes.

Figure 2: Embodiment and emotion for intelligent robots. A) As embodied intelligent machines, robots are able to acquire information about, and to act upon, the world through a variety of sensors and actuators. The notion of embodiment also includes the processing and regulation of bodily signals such as pleasure, pain, satiation, and so on, which is crucial for emotion and for autonomous behavior. B) By integrating emotion in robotic architectures, we can design machines able to generate and coordinate intelligent behaviors for survival, exploration, and high-level goals. The illustration of the iCub robot in A) was reproduced with permission from Antoni Gracia.
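A minimal sketch of such interoception-driven behavior switching (cf. Figure 2B) might look as follows; the signal names, thresholds, and behaviors are hypothetical choices for illustration only.

```python
# Sketch of interoception-driven behavior switching (cf. Figure 2B).
# Thresholds, signals, and behavior names are illustrative assumptions.

def choose_behavior(state: dict) -> str:
    # Bodily signals take precedence: integrity first, then energy.
    if state["pain"] > 0.5:
        return "retreat_to_safe_location"
    if state["energy"] < 0.2:
        return "return_to_charging_station"
    if state["salient_object_in_view"]:
        return "approach_and_inspect"  # exploration guided by relevance
    return "explore_unknown_area"

robot_state = {"pain": 0.0, "energy": 0.15, "salient_object_in_view": True}
print(choose_behavior(robot_state))  # -> "return_to_charging_station"
```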
3.3. Support both social and non-social behaviors

Models of emotion tend to focus on either social or non-social interactions: for example, facial expression recognition on the one hand and autonomous navigation on the other. Frequently, engineers are interested in simulating socio-emotional competence to make human-machine interactions more user-friendly. Although social interaction is a major domain in which emotion is involved, we believe emotion modeling should encompass both social and non-social behaviors. Indeed, the examples developed previously highlight the key role of emotion in autonomous, flexible behaviors.

Considering both social and non-social emotional processing can be beneficial, for example, in determining how to process social and non-social stimuli that are self-relevant (Figure 3A), how to switch between social and non-social goals, and how to learn which actions are more goal-conducive from both social and non-social signals (Figure 3B). From an engineering perspective, autonomous cars could be safer for humans if they had the capacity to interpret social cues (e.g. pedestrian patterns and interactions); industrial robots could be more efficient if they were able to manage both independent and
collaborative tasks; and companion robots could be more engaging and fun if they could develop a "personality" from both social and non-social experiences.

Figure 3: Social and non-social interactions with the environment. Emotion is key to intelligent behavior, both in social and non-social contexts. It is important for models of emotion to implement mechanisms that adequately handle both social and non-social stimuli (A), and to process both social and non-social rewards (B).

3.4. Inform research on natural emotion

While it is in theory possible to engineer intelligent machines differently from living organisms, we believe it is enormously beneficial to take cues from how biology gives rise to intelligent behaviors. To go a step further, we advocate that machines be conceived as models that can further our understanding of human intelligence through the process of recreating it. Can we build a machine able to process different types of stimuli and events, to safely explore an unknown environment, to self-regulate and adapt its behavior to a variety of contexts, and to develop long-term knowledge, preferences, goals, and relationships? In this endeavor, the design of intelligent machines can not only benefit from, but also contribute to, the study of natural intelligence.

Models can inform research on human emotion and cognition at four levels: 1) testing existing theories, 2) proposing new theories, 3) proposing new experiments, and 4) creating opportunities for new experiments (Figure 4). For instance, does the current understanding of how we process social and non-social stimuli (e.g. a threatening face vs. a snake) suffice to implement similar mechanisms in a robot? This type of questioning allows the assessment of the current state of knowledge, revealing ambiguities and missing pieces of the puzzle (level 1). For example, how is processing prioritized in the presence of various distractors? To what extent does the triggered response depend on learning? What series of computations leads to appropriate responses? The process of testing theories should be hypothesis-driven and based on scientific knowledge, rather than solution-oriented (i.e. engineering a functional system), in order to lead to new theories (level 2). The process can then suggest new experimental designs to test the validity of the proposed hypotheses (level 3). Furthermore, modeling intelligent behavior in machines enables innovative experimental research (level 4). An example of an emerging question is the investigation of factors that lead humans to consider machines as social agents (Wiese et al., 2017; Belkaid et al., in preparation). More generally, robots offer a unique opportunity to create embodied real-time interactions to address questions about human social cognition.

Conclusion
In this paper, we proposed a framework for designing intelligent, autonomous machines that is centered on the integration between cognition and emotion. Recent advances in neuroscience emphasize the importance of emotion in human intelligence and stress their interdependent relationship, as well as the brain's interactions with the body and the environment. Modeling emotion and fully integrating it into "cognitive architectures" is thus key to building robots able to function independently in diverse and challenging real-world situations. We hope our proposal helps in developing guidelines for future research. More generally, we encourage closer collaboration between roboticists, computer scientists, and neuroscientists for the sake of fruitful cross-fertilization between fields.

References
Anderson, A. K., & Phelps, E. A. (2001). Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature, 411(6835), 305-309.

Anderson, B. A., Laurent, P. A., & Yantis, S. (2011). Value-driven attentional capture. Proceedings of the National Academy of Sciences, 108(25), 10367-10371.