How much do you perceive this? An analysis on perceptions of geometric features, personalities and emotions in virtual humans (Extended Version)
Victor Araujo, Rodolfo Migon Favaretto, Paulo Knob, Soraia Raupp Musse
Graduate Program in Computer Science, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre - RS, Brazil

Felipe Vilanova, Angelo Brandelli Costa
Graduate Program in Psychology, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre - RS, Brazil
ABSTRACT
This work aims to evaluate people's perception of geometric features, personalities and emotions in virtual humans. As a basis, we use a dataset containing the tracking files of pedestrians captured from spontaneous videos, and we visualize them as identical virtual humans so that viewers focus on their behavior and are not distracted by other features. In addition to the tracking files containing pedestrian positions, the dataset also contains pedestrian emotions and personalities detected using Computer Vision and Pattern Recognition techniques. We proceed with our analysis in order to answer the question of whether subjects can perceive geometric features, such as distances and speeds, as well as emotions and personalities, in video sequences when pedestrians are represented by virtual humans. Regarding the participants, 73 people volunteered for the experiment. The analysis was divided into two parts: i) evaluation of the perception of geometric characteristics, such as density, angular variation, distances and speeds, and ii) evaluation of personality and emotion perception. Results indicate that, even without explaining to the participants the concepts of each personality or emotion and how they were computed (considering geometric characteristics), in most cases participants perceived the personality and emotion expressed by the virtual agents, in accordance with the available ground truth.
KEYWORDS
User perception, geometric features, personalities, emotions
ACM Reference Format:
Victor Araujo, Rodolfo Migon Favaretto, Paulo Knob, Soraia Raupp Musse, Felipe Vilanova, and Angelo Brandelli Costa. 2019. How much do you perceive this? An analysis on perceptions of geometric features, personalities and emotions in virtual humans (Extended Version). In Proceedings of Intelligent Virtual Agents (IVA '19). ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

IVA '19, July 02–05, 2019, Paris, France
© 2019 Association for Computing Machinery.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM...$15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
The study of human behavior is a subject of great scientific interest and probably an inexhaustible source of research [18]. Due to its importance in many applications, the automatic analysis of human behavior has been a popular research topic in recent decades [1]. In the literature, there are several works involving the visualization and analysis of cultural characteristics, such as analysis of the impact of groups on crowds through human perception [25], crowd simulation through behaviors based on personality and emotion traits [8], visualization of interactions between virtual agents in crowd simulations and pedestrians in real video sequences [20], visualization of personality traits through social media [15], visualization and understanding of personal emotional style [26], and visualization of personal records [23], among others. Typically, these approaches deal with Natural Language Processing (NLP) and the extraction of social media data (sentiment analysis), criminal and medical records, or any other record extracted from textual data.

Recently, studies have used geometric features to analyze cultural aspects in crowds. Favaretto et al. [10] used group behaviors to detect cultural aspects according to Hofstede [16]. In other investigations, Favaretto et al. studied cultural aspects using controlled experiment videos (related to the Fundamental Diagram [5]) and spontaneous videos from various countries, using geometric features [11], the Big-Five personality model [12] and the OCC emotion model [13]. However, there are few methods in the literature that investigate people's perception of geometric information [25]. In this sense, the objective of this work is to investigate how people perceive geometric characteristics (for example, density, distances and velocities) and non-geometric characteristics (for example, cultural characteristics such as personality traits and emotions) computed from the geometric features of pedestrians in crowd videos.
For this, we use the videos of the Cultural Crowds dataset [10] (available at: http://rmfavaretto.pro.br/vhlab/), which contains videos of crowds from different countries, with pedestrians walking in different scenarios. The dataset contains the tracking files with the pedestrian positions and also provides personality and emotion information for these pedestrians, obtained through Computer Vision and Pattern Recognition techniques.

For the experiment, we use the tracked positions in a simulated environment where agents are visualized as identical virtual humans. The goal is to have viewers focus on the agents' behavior without being distracted by other features. In our analysis, participants were asked to answer questions to identify whether they can perceive geometric features, such as distances and speeds, as well as emotions and personalities, in video sequences when pedestrians are represented by virtual humans. In particular, and very important to this work, our focus is on the perception of information always related to space and geometry: even when we talk about emotion and personality, we are interested in their purely geometric manifestations (such as distances among agents, speeds and densities). The main motivation is to evaluate the area of personality and emotion detection in video sequences, i.e., we want to know whether people qualitatively perceive what can be detected in video sequences.
2 RELATED WORK
This section discusses works related to pedestrian and crowd behavioral analysis, focusing on personality traits, emotion and perception. Knob et al. [20] presented a work related to the visualization of interactions between pedestrians in video sequences and virtual agents in crowd simulations. Interactions are given by factors based on the OCEAN of each pedestrian and agent. OCEAN [7, 19] is the personality trait model most commonly used for this type of analysis, also referred to as Big-Five: Openness to experience ("the active seeking and appreciation of new experiences"); Conscientiousness ("degree of organization, persistence, control and motivation in goal-directed behavior"); Extraversion ("quantity and intensity of energy directed outwards in the social world"); Agreeableness ("the kinds of interaction an individual prefers, from compassion to tough mindedness"); and Neuroticism ("how much prone to psychological distress the individual is") [22]. Durupinar et al. [8] also used OCEAN to visually represent personality traits. The agents' visual representation is given in several ways; for example, agent animations are based on these two cultural characteristics (OCEAN and emotion): if an agent is sad, his/her animation represents that emotion. Yang et al. [25] conducted a perception study to determine the impact of groups at various densities, using two points of view: top and first-person. In addition to this perception, they examined which type of camera position (top view or first-person view) could be better for the perception of density.
The work of Ardeshir and Borji [3] presents experiments and comparisons between two points of view (first-person and top camera view), informing the integration and use of the camera types employed in the present work. Regarding the detection of personalities, emotions and cultural aspects of pedestrians in crowds, [10] proposed a method to identify groups and characterize them in order to assess aspects of cultural differences through the mapping of Hofstede's dimensions [17]. A similar idea, using computer simulation rather than computer vision, is proposed by Lala et al. [21]: they use Hofstede's dimensions to create a simulated crowd from a cultural perspective. Gorbova and collaborators [14] present a system for automatic personality screening from video presentations, in order to decide whether a person should be invited to a job interview based on visual, audio and lexical cues. The work proposed by [12] presents a model to detect personality aspects based on the Big-Five personality model, using individual behaviors automatically detected in video sequences.

Several models have been developed to explain and quantify basic emotions in humans. One of the most cited was proposed by Paul Ekman [9], which considers the existence of six universal emotions based on cross-cultural facial expressions (anger, disgust, fear, happiness, sadness and surprise). In [13], the authors proposed a way to detect pedestrian emotions in videos, based on the OCC emotion model. To detect the emotions of each pedestrian, the authors used OCEAN as input, as proposed by Saifi [24]. In our approach, we proceed with an analysis to verify whether subjects can perceive geometric features, such as distances and speeds, as well as emotions and personalities, in video sequences when pedestrians are represented by virtual humans. The next section presents how we performed the analysis.
3 METHODOLOGY
The main goal of this work is to analyze people's perception of geometric data (speed, distance, density and angular variation), personalities and emotions. The data were extracted from the Cultural Crowds dataset [10]. The geometric data are computed from the pedestrian trajectories. Personality and emotion traits are also computed from the trajectories, based on psychological hypotheses. The next sections detail these processes.
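The subsections below extract per-pedestrian geometric features from the trajectories, including a collectivity value computed as a decay function of pairwise speed and orientation differences, and average them over frames. A minimal Python sketch of that computation follows; the parameter values (γ, β, w1, w2) and the absolute-difference forms of the pairwise terms are illustrative assumptions, since [12] leaves them as model constants:

```python
import math

# Illustrative constants (assumption): [12] defines gamma, beta, w1 and w2
# as model parameters, but their values are not reproduced here.
GAMMA, BETA = 1.0, 0.3
W1, W2 = 1.0, 1.0

def omega(s_i, s_j, a_i, a_j):
    """Pairwise term: weighted speed and orientation differences between
    pedestrians i and j (absolute differences are an assumption)."""
    return abs(s_i - s_j) * W1 + abs(a_i - a_j) * W2

def collectivity(i, speeds, angles):
    """phi_i: sum of gamma * exp(-beta * omega(i, j)) over all other pedestrians j."""
    return sum(
        GAMMA * math.exp(-BETA * omega(speeds[i], speeds[j], angles[i], angles[j]))
        for j in range(len(speeds))
        if j != i
    )

def average_over_frames(values_per_frame):
    """Per-pedestrian features are averaged over all frames to build the
    extracted feature vector V_i."""
    return sum(values_per_frame) / len(values_per_frame)
```

With identical speeds and orientations, each pairwise term decays to γ, so a pedestrian surrounded by n−1 behaviorally similar pedestrians gets collectivity (n−1)·γ; the more the neighbors' speeds and orientations differ, the lower the value.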
Based on the tracking input file, Favaretto et al. [12] compute the following information for each pedestrian i at each timestep: i) 2D position x_i (meters); ii) speed s_i (meters/frame); iii) angular variation α_i (degrees) w.r.t. a reference vector r⃗; iv) isolation level φ_i; v) socialization level ϑ_i; and vi) collectivity ϕ_i. To compute the collectivity affecting individual i from all n individuals, they computed ϕ_i = Σ_{j≠i} γ e^(−β ϖ(i,j)), where the collectivity between two individuals is calculated as a decay function of ϖ(i,j) = s(s_i, s_j)·w_1 + o(α_i, α_j)·w_2, considering s and o respectively the speed and orientation differences between two people i and j, and w_1 and w_2 constants that regulate the offset in meters and radians.

To compute the socialization level ϑ, Favaretto et al. [11] use an artificial neural network (ANN), trained with a Scaled Conjugate Gradient (SCG) algorithm, to calculate the socialization level ϑ_i of each individual i. The ANN has three inputs: the collectivity ϕ_i of person i, the mean Euclidean distance d̄_{i,j} from person i to the others, and the number n_i of people in the Social Space, according to Hall's proxemics [4], around person i. The isolation level corresponds to the inverse of socialization, φ_i = 1 − ϑ_i. For more details about how these features are obtained, please refer to [11, 12]. For each individual i in a video, we computed the average over all frames and generated a vector V⃗_i of extracted data, where V⃗_i = [x_i, s_i, α_i, φ_i, ϑ_i, ϕ_i]. In the next section we describe how these features are mapped into personality and emotion traits.

To detect the five dimensions of OCEAN for each pedestrian, [12] used the NEO PI-R [6], which is the standard questionnaire measure of the Five Factor Model. They first selected NEO PI-R items related to individual-level crowd characteristics and the corresponding OCEAN factor.
For example, the item "Like being part of crowd at sporting event" corresponds to the factor Extraversion. As described in detail in [12], they proposed a series of empirically defined equations to map pedestrian features to OCEAN dimensions. First, they selected 25 of the 240 items from the NEO PI-R inventory that had a direct relationship with crowd behavior. In order to answer the items with data coming from real video sequences, they proposed equations that could represent each of the 25 items with features extracted from videos. For example, to represent the item "1 - Have clear goals, work to them in orderly way", Favaretto and his colleagues consider that individual i should have a high speed s_i and a low angular variation α_i to answer in concordance with this item, so the equation for this item was Q_1 = s_i + (1 − α_i), considering normalized values. In this way, they empirically proposed equations for all 25 items, as presented in [12].

In the work presented in [13], the authors proposed a way to map the OCEAN dimensions of each pedestrian into the OCC emotion model, regarding four emotions: Anger, Fear, Happiness and Sadness. This mapping is described in Table 1, where the plus/minus signs along each factor represent its positive/negative value; for example, concerning Openness, O+ stands for positive values of the factor and O− for negative values.

Table 1: Emotion mapping from OCEAN to OCC [13].

Factor  Fear  Happiness  Sadness  Anger
O+        0       0         0       -1
O-        0       0         0        1
C+       -1       0         0        0
C-        1       0         0        0
E+       -1       1        -1       -1
E-        1       0         0        0
A+        0       0         0       -1
A-        0       0         0        1
N+        1      -1         1        1
N-       -1       1        -1       -1
The viewer was developed using the Unity3D engine, with C# (Unity3D is available at https://unity3d.com/). Its main window, shown in Figure 1, contains: 1) and 2) the buttons ChangeScene and RestartCamPos to, respectively, load the data file of another video and restart the camera position for viewing in first person; 3) a window that shows the top view of the environment; 4) the first-person view of a previously selected agent (this agent is highlighted in area 3); and 5) a features panel, where users can activate the visualization of the data related to the emotion, socialization and collectivity of agents.
Figure 1: Main window of the viewer.
This viewer has three visualization modes: (i) first-person view, (ii) top view, and (iii) an oblique view. Figure 2 shows an example of each type of camera point of view for a video available in the Cultural Crowds dataset. In addition to these different points of view, it is possible to observe all pedestrians present in each frame f. Pedestrians can be represented by a humanoid or a cylinder-type avatar. Each pedestrian i present in frame f has a position (X_i, Y_i) (already converted from image coordinates to world coordinates). In addition to the positions, it is also possible to know whether the pedestrian is walking, running or stopped in frame f through the current speed s_i: if, in this frame, the current speed is greater than or equal to a threshold s_PTS, defined from the Preferred Transition Speed (PTS) [2] and converted from meters/second to meters/frame using the video frame rate, then the avatar is running. The transitions, considering the current speed s_i of the agent, are given in Equation 1:

Animation = Idle, if s_i = 0; Walk, if 0 < s_i < s_PTS; Run, if s_i ≥ s_PTS. (1)

Also, for the humanoid avatar type, each speed transition is accompanied by an animation transition: if the current speed is s_i = 0, the animation does not change (the avatar remains stationary); if 0 < s_i < s_PTS, the animation changes to walking; and if s_i ≥ s_PTS, the animation changes to running. The next section presents the obtained results.

4 RESULTS
This section presents the results of people's perception of geometric data (density, speed, distance between pedestrians and angular variation), personalities and emotions. We used the simulation environment to generate short sequences of pedestrian videos, together with a questionnaire in which the video sequences were presented. Participants' responses were then analyzed. This section is organized into three parts:
Figure 2: Types of visualization: (a) top view, (b) oblique and (c) first-person view.
Section 4.1 presents some information about the videos from thedataset that were used in the experiment, Section 4.2 discuss theresults of the perceptions about the geometrical characteristics ofpedestrians and Section 4.3 presents the results of the perceptionsabout personalities and emotions.
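The video sequences evaluated below were generated with the avatar animation rule of Equation 1. That rule reduces to a simple threshold function on the agent's current speed; in the sketch below the numeric PTS value is an assumption, since the paper derives its threshold from the Preferred Transition Speed [2] and converts it using the video frame rate:

```python
# Assumed walk-to-run transition speed in meters/second (illustrative value);
# the paper derives its threshold from the Preferred Transition Speed [2]
# and converts it to meters/frame using the video frame rate.
PTS = 2.0

def animation_state(speed):
    """Equation 1: select Idle, Walk or Run from the agent's current speed."""
    if speed == 0:
        return "Idle"
    if speed < PTS:
        return "Walk"
    return "Run"
```

For humanoid avatars, each returned state would also trigger the corresponding animation transition (stationary, walking or running).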
4.1 Videos from the Dataset
We generated video sequences with data extracted from the Cultural Crowds dataset. Table 2 lists all videos from the dataset that were used in the experiment, with information about the country where the video was recorded, the number of pedestrians and the density level (low, medium or high). The data of each chosen video was input to a simulated environment containing virtual agents, represented by cylinder or humanoid-type avatars, which can be seen, respectively, in Figure 2(b) and (c). We also used the three camera points of view (the top view is illustrated in Figure 2(a)). Regarding the participants, 73 people volunteered for the experiment: 45 males (61.6%) and 28 females (38.4%); 47.9% have some undergraduate degree. In the next section we discuss the results obtained in the geometric feature perception analysis.
Table 2: Videos of the Cultural Crowds [10] dataset.
Video   Country               N. Pedestrians   Density
AE-01   Unit. Arab Emirates   12               Low
AT-03   Austria               10               Low
BR-01   Brazil                16               Low
BR-15   Brazil                15               Low
BR-25   Brazil                25               Medium
BR-34   Brazil                34               High
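The participant percentages reported above can be verified with a quick computation:

```python
# Sanity check of the reported demographics: 73 volunteers,
# 45 males and 28 females.
participants = 73
males, females = 45, 28

assert males + females == participants
male_pct = round(100 * males / participants, 1)
female_pct = round(100 * females / participants, 1)
print(male_pct, female_pct)  # 61.6 38.4
```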
4.2 Perception of Geometric Features
In this section, we present an analysis of subjects' perception regarding density, velocity, direction variation of pedestrians and distance among them, using three camera points of view (first-person, oblique and top view) and two types of avatars (cylinder and humanoid). The first part of the applied questionnaire contains six questions, all asking about the same aspect: "In which video do you perceive the higher density?". Before each question, two or three short videos described in Table 2 were presented. Figure 3 shows the questions and the percentage of answers.
Figure 3: Perception concerning density: questions D1 to D6.
The first question (D1) aimed to evaluate whether participants can perceive density variation, since we did not include any explanation about it. Scenes of videos with low, medium and high crowd density were presented, and we wanted to check whether the subjects could correctly select the high-density one. 89% of participants responded according to the ground truth, i.e., they correctly identified the high-density video. The other 11% answered "I do not know", "I did not notice density difference", or chose the low- or medium-density options. In D2 and D3 we presented videos with the same density but displayed from different points of view; in D2 we used humanoids and in D3 we used cylinders. We asked the subjects to select the video where the higher density was observed. Our goal was to check whether the subjects perceived the same density or whether density perception changes with the camera point of view or the way the agents are displayed. In question D2, 70% chose one of the videos, while 29% of the participants marked the option "I did not notice density difference", so for this smaller group it seems that the camera does not change the perception. Details are presented in Figure 3, and the results indicate that the camera point of view can disturb density perception. Regarding the point of view, the oblique camera received the highest percentage of answers. In question D3, 69% chose one of the videos, while 31% of the participants marked the option "I did not notice density difference", indicating that visualization with cylinders or humanoids also changes the final result. In question D4, we showed two videos with the same density and the same point of view, but changing the type of avatar. 25% of the participants selected the option "I did not notice density difference", while 72% chose one of the avatar types, 41% of them choosing humanoids. In questions D5 and D6, we added, to the same videos analyzed before, walls that surround the agents (see Figure 2(c)). The goal was to check whether this changes density perception using the first-person camera. In this case, 66% of subjects answered that one of the videos presented higher density in comparison to a same-density video without walls.

Regarding speed perception, the questionnaire also contains six questions, all related to the low-density videos described in Table 2. The goal of these questions is to evaluate the speed levels running and walking, as presented in Equation 1, through the top and oblique cameras, in addition to the two types of avatars: cylinder and humanoid. For these videos there was no analysis of perception using the first-person camera, since we observed that such videos did not allow a good view of the scene. As in the density analysis, we asked the same question, "In which video did you observe the higher velocity?", and showed variations of the parameters we wanted to measure. Question S1 presented two videos with velocity=running and cameras=oblique and top. As shown in Figure 4, 32% of subjects did not perceive any difference in velocity, while 64% chose one of the videos. The same process was used for question S2, but with velocity=walking: 26% did not perceive a difference, while 74% chose one of the videos. Questions S3 and S4 presented the same velocity, respectively, in the oblique and top-view cameras. For S3, 14% of subjects did not perceive velocity changes, while 85% selected only one of the cameras. In S4, 28% did not perceive velocity changes, while 71% selected only one of the cameras. Finally, questions S5 and S6 presented two videos containing the two different avatars, with the oblique and top cameras respectively, for velocity=walking. Results were very similar: 17% and 19%, respectively, of the participants did not perceive a difference, against 82% and 81% who chose one of the videos. So, our results indicate that the camera point of view and the type of avatar impact velocity perception.

Figure 4: Perception concerning speed: questions S1 to S6.

Regarding the perception of angular variation, the questionnaire contains two questions, with comparisons between the three types of cameras and the two types of avatars. All angular variation questions used scenes from the BR-34 (high density) video shown in Table 2. Again, we asked the same question, "In which video do you observe more angular variation performed by the agents?", and the videos varied the measured parameters. Question A1 presented three videos with humanoids viewed from three different camera positions.
As shown in Figure 5, only 14% of subjects did not perceive a difference in angular variation, while 83% chose one of the videos, with the top-view camera being the most selected. A similar process was used for question A2, where avatars were cylinders: 18% of subjects did not perceive a difference, while 79% selected one of the videos. Most of the participants who selected one of the videos chose the one with humanoids.
Figure 5: Perception analysis concerning angular variation:questions A1 and A2.
Regarding the perception of distance between the avatars, the questionnaire contains two questions, both with high-density videos. The videos used in these questions were the same as those in the angular variation questions, i.e., the same types of cameras and the two types of avatars, and the question was: "In which video do you observe the largest distance among agents?". Indeed, results were very similar in questions E1 and E2. As shown in Figure 6, in E1 we displayed humanoids with the three cameras and 22% of subjects did not perceive differences, while in E2 we displayed cylinders and 24% also did not perceive changes. On the other hand, 77% and 73% of subjects, respectively, selected one of the videos, in an approximately uniformly distributed way.
Figure 6: Perception concerning distance: questions E1 andE2.
So, in this section we analyzed the subjects' perception of density, speed, angular variation and distances among agents displayed using two types of avatars and three different camera points of view. Results indicate that changing the way we display avatars and position cameras also changes the subjects' perception. In particular, top-view and oblique cameras seem to provide better information for detecting the parameters, while humanoids were preferred to indicate the higher values of all evaluated parameters.
4.3 Perception of Personalities and Emotions
In this section we present the part of this study focused on the perception of personality and emotion traits in crowd videos. As explained before, we used the simulation environment to generate short sequences of pedestrian views in low-density crowds (due to the data present in the dataset). In each video sequence we highlighted two individuals with different colors (red and yellow) and asked the subjects about them. Table 3 shows the questions with the possible answers, where the correct answer to each question is marked. We use as ground truth the results obtained by the approach proposed by Favaretto et al. [13].
Table 3: Questions and possible answers. The correct answer, according to Favaretto et al. [13], is marked with an asterisk.

Q1: In your opinion, which of the two pedestrians highlighted in the video has a neurotic personality, yellow or red? a) Yellow pedestrian; b) Red pedestrian*; c) Both pedestrians; d) Neither of them; e) I don't know.
Q2: In your opinion, which of the two pedestrians highlighted in the video is angry, yellow or red? a) Yellow pedestrian; b) Red pedestrian*; c) Both pedestrians; d) Neither of them; e) I don't know.
Q3: In your opinion, which of the two pedestrians highlighted in the video is more open to experiences, yellow or red? a) Yellow pedestrian*; b) Red pedestrian; c) Both pedestrians; d) Neither of them; e) I don't know.
Q4: In your opinion, which of the two pedestrians highlighted in the video is afraid, yellow or red? a) Yellow pedestrian; b) Red pedestrian*; c) Both pedestrians; d) Neither of them; e) I don't know.
Q5: In your opinion, which of the two pedestrians highlighted in the video is happier, yellow or red? a) Yellow pedestrian*; b) Red pedestrian; c) Both pedestrians; d) Neither of them; e) I don't know.
Q6: In your opinion, which of the two pedestrians highlighted in the video is more extroverted, yellow or red? a) Yellow pedestrian*; b) Red pedestrian; c) Both pedestrians; d) Neither of them; e) I don't know.
Q7: In your opinion, which of the two pedestrians highlighted in the video seems to be more sociable, yellow or red? a) Yellow pedestrian*; b) Red pedestrian; c) Both pedestrians; d) Neither of them; e) I don't know.

Figure 7 shows the initial and final frames from video P01, which was used in questions Q1 and Q2.
According to the ground truth, the pedestrian highlighted in red has the corresponding characteristics: he/she is neurotic and angry, being isolated, with low angular variation, low speed, low socialization and low collectivity.

Figure 7: Initial (a) and final (b) frames from video P01.

Figure 8: Perception analysis concerning Q1 and Q2.

Following the analysis, video P02 (illustrated in Figure 9) has a pedestrian highlighted in yellow interacting with a group of individuals and a pedestrian highlighted in red who is alone and not interacting with anyone. Questions Q3 and Q4, which were related to this video, asked participants which highlighted pedestrian was, respectively, open to experiences and afraid. Figure 10 shows the answers to these questions. The results plotted in Figure 10 show that most of the participants perceived the same personality (in the case of question Q3) and the same emotion (question Q4) as the ground truth; for instance, 60% of the participants correctly chose the pedestrian in yellow as the most open to experiences in question Q3. Openness, in this case, is linked to the fact that the pedestrian interacts with a group of people, actively seeking new experiences. Fear, in turn, is linked to the fact that the person is isolated from the others and walks at lower speeds.

Figure 9: Initial (a) and final (b) frames from video P02.

Figure 10: Perception analysis concerning Q3 and Q4.

Finally, related to video P03, we proposed questions Q5, Q6 and Q7, asking, respectively, about happiness, extraversion and sociability. Video P03 (illustrated in Figure 11) contains a pedestrian highlighted in yellow walking with a group of people and a pedestrian highlighted in red walking alone, in the opposite direction of all other pedestrians. The answers concerning question Q5 are presented in Figure 12.

Figure 11: Initial (a) and final (b) frames from video P03.

In question Q6, although most of the participants (33% of them) correctly answered that the pedestrian highlighted in yellow is the most extroverted, it seems that the participants were not very sure about perceiving this characteristic: 25% of them answered that neither pedestrian was extroverted, 19% replied that the most extroverted pedestrian was the one highlighted in red, 14% did not know, and 9% believed that both pedestrians were extroverted. In question Q7, most of the participants (57% of them) seemed to be more convinced that the pedestrian highlighted in yellow is the most sociable, in accordance with the model proposed by [13].
Figure 12: Perception analysis concerning Q5, Q6 and Q7.
5 FINAL CONSIDERATIONS
This work evaluated people's perception of geometric features, namely density, speed, angular variation and distances among pedestrians. We also evaluated subjects' perception of subtler parameters, such as personality and emotion traits, in crowds. We proposed and implemented a survey, answered by 73 participants through a questionnaire, featuring visualizations of scenes taken from videos of the Cultural Crowds dataset [10], with questions regarding the variation of visualization parameters.

Regarding the results on the perception of geometric data, the general analysis of the cameras showed that the way agents are displayed and the camera point of view interfere with the perception of the parameters. In particular, the greater the distance from the camera to the environment (oblique and top cameras), the better the perception of density, speed and angular variation seems to be. With respect to density, there was a more accurate perception in the first-person view when the environment contained walls around the agents. Concerning the speed parameter, subjects perceived the speed variation of running avatars better through the oblique camera than through the top camera. In the general analysis of avatar types, there was a more accurate perception of density when agents were visualized as humanoids in the first-person view, a better perception of angular variation with humanoids in all cameras, and a more accurate perception of distances when avatars were displayed as cylinders in the top and oblique cameras.

We also performed an experiment to evaluate whether people can perceive different personalities and emotions performed by pedestrians in crowds. It was interesting to see that, even without explaining to the participants the concepts of each personality or emotion and how they were computed in our approach (considering the geometric characteristics), in all cases more than half of the participants perceived the personality and emotion that the agent was expressing in the video, in accordance with our approach. Of course, this last aspect is much more intangible, and the missing explanation that we were interested in spatial manifestations, rather than trying to "figure out" whether a person is sociable or open from a psychological point of view, is certainly an aspect we want to address in future work.
REFERENCES
[1] X. Alameda-Pineda, E. Ricci, and N. Sebe. 2018. Multimodal Behavior Analysis in the Wild: Advances and Challenges. Elsevier Science, London, UK.
[2] R. McN. Alexander. 1992. A model of bipedal locomotion on compliant legs. Phil. Trans. R. Soc. Lond. B.
[3] IEEE Transactions on Pattern Analysis and Machine Intelligence (2018).
[4] C. S. Hall, G. Lindzey, and J. B. Campbell. 1998. Theories of Personality (fourth ed.). John Wiley & Sons, New Jersey.
[5] Ujjal Chattaraj, Armin Seyfried, and Partha Chakroborty. 2009. Comparison of pedestrian fundamental diagram across cultures. Advances in Complex Systems 12, 03 (2009), 393–405.
[6] P. T. Costa and R. R. McCrae. 1992. Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI). PAR. https://books.google.co.in/books?id=mp3zNwAACAAJ
[7] J. M. Digman. 1990. Personality Structure: Emergence of the Five-Factor Model. Annual Review of Psychology 41 (1990), 417–440.
[8] Funda Durupınar, Uğur Güdükbay, Aytek Aman, and Norman I. Badler. 2016. Psychological parameters for crowd simulation: From audiences to mobs. IEEE Transactions on Visualization and Computer Graphics 22, 9 (2016), 2145–2159.
[9] P. Ekman and W. V. Friesen. 1971. Constants across cultures in the face and emotion. Journal of Personality and Social Psychology 17, 2 (1971).
[10] Rodolfo M. Favaretto, Leandro Dihl, Rodrigo Barreto, and Soraia Raupp Musse. 2016. Using group behaviors to detect Hofstede cultural dimensions. In Image Processing (ICIP), 2016 IEEE International Conference on. IEEE, 2936–2940.
[11] R. M. Favaretto, L. Dihl, and S. R. Musse. 2016. Detecting crowd features in video sequences. In Proceedings of Conference on Graphics, Patterns and Images (SIBGRAPI). IEEE Computer Society, São José dos Campos, SP, 201–208.
[12] Rodolfo Migon Favaretto, Leandro Dihl, Soraia Raupp Musse, Felipe Vilanova, and Angelo Brandelli Costa. 2017. Using Big Five personality model to detect cultural aspects in crowds. In Graphics, Patterns and Images (SIBGRAPI), 2017 30th SIBGRAPI Conference on. IEEE, 223–229.
[13] Rodolfo Migon Favaretto, Paulo Knob, Soraia Raupp Musse, Felipe Vilanova, and Ângelo Brandelli Costa. 2018. Detecting personality and emotion traits in crowds from video sequences. Machine Vision and Applications (2018), 1–14.
[14] J. Gorbova, I. Lüsi, A. Litvin, and G. Anbarjafari. 2017. Automated Screening of Job Candidate Based on Multimodal Video Processing. In CVPRW. https://doi.org/10.1109/CVPRW.2017.214
[15] Liang Gou. [n. d.]. Visualizing Personality Traits Derived from Social Media. In Electronic proceedings.
[16] Geert Hofstede. 2001. Culture's Consequences: Comparing Values, Behaviors, Institutions and Organizations Across Nations. Sage Publications.
[17] Geert Hofstede. 2011. Dimensionalizing cultures: The Hofstede model in context. Online Readings in Psychology and Culture 2, 1 (2011), 8.
[18] J. C. S. Jacques Junior, S. R. Musse, and C. R. Jung. 2010. Crowd Analysis Using Computer Vision Techniques. IEEE Signal Processing Magazine 27 (2010), 66–77.
[19] O. P. John. 1990. The "Big Five" factor taxonomy: Dimensions of personality in the natural language and in questionnaires. New York, NY, Chapter 4, 66–100.
[20] Paulo Knob, Victor Flavio de Andrade Araujo, Rodolfo Migon Favaretto, and Soraia Raupp Musse. [n. d.]. Visualization of Interactions in Crowd Simulation and Video Sequences. ([n. d.]).
[21] D. Lala, S. Thovuttikul, and T. Nishida. 2011. Towards a virtual environment for capturing behavior in cultural crowds. In ICDIM 2011. 310–315. https://doi.org/10.1109/ICDIM.2011.6093362
[22] W. Lord. 2007. NEO PI-R – A Guide to Interpretation and Feedback in a Work Context (first ed.). Hogrefe Ltd.
[23] Catherine Plaisant, Brett Milash, Anne Rose, Seth Widoff, and Ben Shneiderman. 1996. LifeLines: visualizing personal histories. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 221–227.
[24] L. Saifi, A. Boubetra, and F. Nouioua. 2016. An approach for emotions and behavior modeling in a crowd in the presence of rare events. Adaptive Behavior 24, 6 (2016). https://doi.org/10.1177/1059712316674784
[25] Fangkai Yang, Jack Shabo, Adam Qureshi, and Christopher Peters. 2018. Do you see groups?: The impact of crowd density and viewpoint on the perception of groups. In Proceedings of the 18th International Conference on Intelligent Virtual Agents. ACM, 313–318.
[26] Jian Zhao, Liang Gou, Fei Wang, and Michelle Zhou. 2014. PEARL: An interactive visual analytic tool for understanding personal emotion style derived from social media. In