
Publication


Featured research published by Eero Väyrynen.


Applied Ergonomics | 2011

Effect of cognitive load on speech prosody in aviation: Evidence from military simulator flights.

Kerttu Huttunen; Heikki Keränen; Eero Väyrynen; Rauno Pääkkönen; Tuomo Leino

Mental overload directly affects safety in aviation and needs to be alleviated. Speech recordings are obtained non-invasively and as such are feasible for monitoring cognitive load. We recorded speech of 13 military pilots while they were performing a simulator task. Three types of cognitive load (load on situation awareness, information processing and decision making) were rated by a flight instructor separately for each flight phase and participant. As a function of increased cognitive load, the mean utterance-level fundamental frequency (F0) increased, on average, by 7 Hz and the mean vocal intensity increased by 1 dB. In the most intensive simulator flight phases, mean F0 increased by 12 Hz and mean intensity, by 1.5 dB. At the same time, the mean F0 range decreased by 5 Hz, on average. Our results showed that prosodic features of speech can be used to monitor speaker state and support pilot training in a simulator environment.
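
The utterance-level measures reported above (mean F0, F0 range, and mean vocal intensity) can be approximated from a recording with standard tools. The sketch below is an illustrative reconstruction using the librosa library, not the authors' pipeline; the input file name and the F0 search range are assumptions.

```python
# Illustrative sketch (not the authors' pipeline): utterance-level mean F0,
# F0 range, and mean intensity from a speech recording using librosa.
# The file name and the F0 search range (75-400 Hz) are assumptions.
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None)    # hypothetical input file

# Fundamental frequency via probabilistic YIN; unvoiced frames come back as NaN.
f0, voiced_flag, _ = librosa.pyin(y, fmin=75, fmax=400, sr=sr)
mean_f0 = np.nanmean(f0)                          # mean F0 over voiced frames (Hz)
f0_range = np.nanmax(f0) - np.nanmin(f0)          # utterance F0 range (Hz)

# Intensity as frame-wise RMS energy converted to dB.
rms = librosa.feature.rms(y=y)[0]
mean_intensity_db = np.mean(librosa.amplitude_to_db(rms, ref=1.0))

print(f"mean F0: {mean_f0:.1f} Hz, F0 range: {f0_range:.1f} Hz, "
      f"mean intensity: {mean_intensity_db:.1f} dB")
```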


Language and Speech | 2004

Automatic Discrimination of Emotion from Spoken Finnish

Juhani Toivanen; Eero Väyrynen; Tapio Seppänen

In this paper, experiments on the automatic discrimination of basic emotions from spoken Finnish are described. For the purpose of the study, a large emotional speech corpus of Finnish was collected; 14 professional actors acted as speakers, and simulated four primary emotions when reading out a semantically neutral text. More than 40 prosodic features were derived and automatically computed from the speech samples. Two application scenarios were tested: the first scenario was speaker-independent for a small domain of speakers while the second scenario was completely speaker-independent. Human listening experiments were conducted to assess the perceptual adequacy of the emotional speech samples. Statistical classification experiments indicated that, with the optimal combination of prosodic feature vectors, automatic emotion discrimination performance close to human emotion recognition ability was achievable.


Logopedics Phoniatrics Vocology | 2006

Emotions in [a]: a perceptual and acoustic study.

Juhani Toivanen; Teija Waaramaa; Paavo Alku; Anne-Maria Laukkanen; Tapio Seppänen; Eero Väyrynen; Matti Airas

The aim of this investigation is to study how well voice quality conveys emotional content that can be discriminated by human listeners and by a computer. The speech data were produced by nine professional actors (four women, five men). The speakers simulated the following basic emotions in a unit consisting of a vowel extracted from running Finnish speech: neutral, sadness, joy, anger, and tenderness. The automatic discrimination was clearly more successful than human emotion recognition. Human listeners thus apparently need speech samples longer than vowel-length units for reliable emotion discrimination, whereas the machine can exploit quantitative parameters effectively even for short speech samples.


Logopedics Phoniatrics Vocology | 2010

Perception of basic emotions from speech prosody in adolescents with Asperger's syndrome

Jenna Heikkinen; Eira Jansson-Verkasalo; Juhani Toivanen; Kalervo Suominen; Eero Väyrynen; Irma Moilanen; Tapio Seppänen

Asperger's syndrome (AS) belongs to the group of autism spectrum disorders and is characterized by deficits in social interaction, as manifested, for example, by a lack of social or emotional reciprocity. The disturbance causes clinically significant impairment in social interaction. Abnormal prosody has frequently been identified as a core feature of AS, yet there are virtually no studies on the recognition of basic emotions from speech prosody in this group. This study focuses on how adolescents with AS (n=12) and their typically developing controls (n=15) recognize the basic emotions happy, sad, angry, and 'neutral' from speech prosody. Adolescents with AS recognized basic emotions from speech prosody as well as their typically developing controls did. Possibly, the recognition of basic emotions develops during childhood.


Folia Phoniatrica Et Logopaedica | 2008

Monopitched expression of emotions in different vowels.

Teija Waaramaa; Anne-Maria Laukkanen; Paavo Alku; Eero Väyrynen

Fundamental frequency (F₀) and intensity are known to be important variables in the communication of emotions in speech. In singing, however, pitch is predetermined and yet the voice should convey emotions. Hence, other vocal parameters are needed to express emotions. This study investigated the role of voice source characteristics and formant frequencies in the communication of emotions in monopitched vowel samples [a:], [i:] and [u:]. Student actors (5 males, 8 females) produced the emotional samples simulating joy, tenderness, sadness, anger and a neutral emotional state. Equivalent sound level (Leq), alpha ratio [SPL (1–5 kHz) – SPL (50 Hz–1 kHz)] and formant frequencies F1–F4 were measured. The [a:] samples were inverse filtered and the estimated glottal flows were parameterized with the normalized amplitude quotient [NAQ = fAC/(dpeakT)]. Interrelations of acoustic variables were studied by ANCOVA, considering the valence and psychophysiological activity of the expressions. Forty participants listened to the randomized samples (n = 210) for identification of the emotions. The capacity of monopitched vowels for conveying emotions differed. Leq and NAQ differentiated activity levels. NAQ also varied independently of Leq. In [a:], filter (formant frequencies F1–F4) was related to valence. The interplay between voice source and F1–F4 warrants a synthesis study.
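
The two voice-source measures named in the abstract have compact definitions: the alpha ratio is the level difference between the 1–5 kHz and 50 Hz–1 kHz bands, and NAQ normalizes the AC amplitude of the glottal flow by the magnitude of the negative peak of its derivative and the period length. The sketch below is a rough illustration of those definitions, not the study's implementation; it assumes the glottal flow has already been estimated by inverse filtering elsewhere.

```python
# Rough illustration of the two voice-source measures defined above; not the
# authors' implementation. Assumes the glottal flow has already been estimated
# (e.g. by inverse filtering) elsewhere.
import numpy as np
from scipy.signal import welch

def alpha_ratio(y, sr):
    """Alpha ratio: SPL(1-5 kHz) - SPL(50 Hz-1 kHz), in dB."""
    f, pxx = welch(y, fs=sr, nperseg=2048)
    low = pxx[(f >= 50) & (f < 1000)].sum()
    high = pxx[(f >= 1000) & (f <= 5000)].sum()
    return 10.0 * np.log10(high / low)

def naq(glottal_flow, sr, f0):
    """Normalized amplitude quotient: NAQ = f_AC / (d_peak * T)."""
    f_ac = glottal_flow.max() - glottal_flow.min()        # AC flow amplitude
    d_peak = -(np.diff(glottal_flow) * sr).min()           # magnitude of negative peak of flow derivative
    T = 1.0 / f0                                            # period length (s)
    return f_ac / (d_peak * T)
```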


Speech Communication | 2011

Classification of emotion in spoken Finnish using vowel-length segments: Increasing reliability with a fusion technique

Eero Väyrynen; Juhani Toivanen; Tapio Seppänen

Classification of the emotional content of short Finnish emotional [a:] vowel speech samples is performed using prosodic features derived from vocal source parameters and traditional intonation contour parameters. A decision-level fusion classification architecture based on multiple kNN classifiers is proposed for fusing speech prosody and vocal source experts. The sum fusion rule and the sequential forward floating search (SFFS) algorithm are used to produce leveraged expert classifiers. Automatic classification tests in five emotional classes demonstrate that classification performance significantly above chance level is achievable using both prosodic and vocal source features. The fusion classification approach is further shown to be capable of emotional content classification in the vowel domain approaching the performance level of the human reference.
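
Decision-level sum-rule fusion of kNN experts, as described above, amounts to averaging class posteriors from classifiers trained on the separate feature sets. The sketch below is a minimal scikit-learn illustration with synthetic data standing in for the prosodic and vocal-source feature vectors; the SFFS feature-selection stage is omitted.

```python
# Minimal sketch of decision-level sum-rule fusion of two kNN experts
# (one per feature modality); synthetic data stand in for the prosodic and
# vocal-source features, and the SFFS stage is omitted.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n, n_classes = 200, 5
y = rng.integers(0, n_classes, size=n)
X_prosody = rng.normal(size=(n, 12)) + y[:, None] * 0.5   # toy prosodic features
X_source = rng.normal(size=(n, 6)) + y[:, None] * 0.3     # toy vocal-source features

train, test = np.arange(0, 150), np.arange(150, n)

experts = []
for X in (X_prosody, X_source):
    knn = KNeighborsClassifier(n_neighbors=7).fit(X[train], y[train])
    experts.append((knn, X))

# Sum rule (here written as a mean) over the experts' posterior estimates.
posterior = np.mean([knn.predict_proba(X[test]) for knn, X in experts], axis=0)
y_pred = posterior.argmax(axis=1)
print("fusion accuracy:", (y_pred == y[test]).mean())
```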


International Conference of the IEEE Engineering in Medicine and Biology Society | 2012

EEG-based detection of awakening from isoflurane anesthesia in rats

Jukka Kortelainen; Eero Väyrynen; Xiaofeng Jia; Tapio Seppänen; Nitish V. Thakor

In animal studies, reliable measures of the depth of anesthesia are frequently required. Previous findings suggest that the continuous depth-of-anesthesia indices developed for humans might not be adequate for rats, whose EEG changes during anesthesia are better characterized as quick transitions between discrete states. In this paper, automatic EEG-based detection of awakening from anesthesia was studied in rats. An algorithm based on the Bayesian Information Criterion (BIC) is proposed for detecting the switch-like change in signal characteristics that occurs just before awakening. The method was tested with EEGs recorded from ten rats recovering from isoflurane anesthesia. The algorithm was shown to detect the sudden change in the EEG related to the moment of awakening with a precision comparable to careful visual inspection. Our findings suggest that monitoring such signal changes may offer an interesting alternative to continuous depth-of-anesthesia indices when the aim is to avoid awakening the animal during, for example, a clinical experiment.
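
The detection idea in the abstract can be illustrated with the standard ΔBIC test: a window of feature vectors is modelled either by one Gaussian or by two Gaussians split at a candidate point, and a positive ΔBIC favours the split. The sketch below is a generic reconstruction of that test on a synthetic feature sequence, not the paper's algorithm.

```python
# Generic sketch of BIC-based change-point detection on a sequence of feature
# vectors (not the paper's exact algorithm): a positive delta-BIC at some split
# point favours modelling the window as two Gaussians instead of one.
import numpy as np

def delta_bic(X, t, lam=1.0):
    """Delta BIC for splitting feature matrix X (N x d) at index t."""
    n, d = X.shape
    def logdet_cov(Z):
        # Regularize slightly so the covariance stays positive definite.
        return np.linalg.slogdet(np.cov(Z, rowvar=False) + 1e-6 * np.eye(d))[1]
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * (n * logdet_cov(X)
                   - t * logdet_cov(X[:t])
                   - (n - t) * logdet_cov(X[t:]))
            - lam * penalty)

def detect_change(X, margin=20):
    """Return the best split index if its delta BIC is positive, else None."""
    scores = [delta_bic(X, t) for t in range(margin, len(X) - margin)]
    best = int(np.argmax(scores))
    return (best + margin) if scores[best] > 0 else None

# Synthetic example: the signal statistics change at index 150.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1.0, size=(150, 4)),
               rng.normal(2, 1.5, size=(100, 4))])
print("detected change at:", detect_change(X))
```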


Multimedia Tools and Applications | 2016

MORE --- a multimodal observation and analysis system for social interaction research

Anja Keskinarkaus; Sami Huttunen; Antti Siipo; Jukka Holappa; Magda Laszlo; Ilkka Juuso; Eero Väyrynen; Janne Heikkilä; Matti Lehtihalmes; Tapio Seppänen; Seppo J. Laukka

The MORE system is designed for observation and machine-aided analysis of social interaction in real life situations, such as classroom teaching scenarios and business meetings. The system utilizes a multichannel approach to collect data whereby multiple streams of data in a number of different modalities are obtained from each situation. Typically the system collects a 360-degree video and audio feed from multiple microphones set up in the space. The system includes an advanced server backend component that is capable of performing video processing, feature extraction and archiving operations on behalf of the user. The feature extraction services form a key part of the system and rely on advanced signal analysis techniques, such as speech processing, motion activity detection and facial expression recognition in order to speed up the analysis of large data sets. The provided web interface weaves the multiple streams of information together, utilizes the extracted features as metadata on the audio and video data and lets the user dive into analyzing the recorded events. The objective of the system is to facilitate easy navigation of multimodal data and enable the analysis of the recorded situations for the purposes of, for example, behavioral studies, teacher training and business development. A further unique feature of the system is its low setup overhead and high portability as the lightest MORE setup only requires a laptop computer and the selected set of sensors on site.


Quarterly Journal of Experimental Psychology | 2017

How Young Adults with Autism Spectrum Disorder Watch and Interpret Pragmatically Complex Scenes.

Linda Lönnqvist; Soile Loukusa; Tuula Marketta Hurtig; Leena Mäkinen; Antti Siipo; Eero Väyrynen; Pertti Palo; Seppo J. Laukka; Laura Mämmelä; Marja-Leena Mattila; Hanna Ebeling

The aim of the current study was to investigate subtle characteristics of social perception and interpretation in high-functioning individuals with autism spectrum disorders (ASDs), and to study the relation between watching and interpreting. As a novelty, we used an approach that combined moment-by-moment eye tracking and verbal assessment. Sixteen young adults with ASD and 16 neurotypical control participants watched a video depicting a complex communication situation while their eye movements were tracked. The participants also completed a verbal task with questions related to the pragmatic content of the video. We compared verbal task scores and eye movements between groups, and assessed correlations between task performance and eye movements. Individuals with ASD had more difficulty than the controls in interpreting the video, and during two short moments there were significant group differences in eye movements. Additionally, we found significant correlations between verbal task scores and moment-level eye movement in the ASD group, but not among the controls. We concluded that participants with ASD had slight difficulties in understanding the pragmatic content of the video stimulus and attending to social cues, and that the connection between pragmatic understanding and eye movements was more pronounced for participants with ASD than for neurotypical participants.


Multimedia Information Retrieval | 2008

On enabling techniques for personal audio content management

Tommi Lahti; Marko Leonard Helén; Olli Vuorinen; Eero Väyrynen; Juha Partala; Johannes Peltola; Satu-Marja Mäkelä

State-of-the-art automatic analysis tools for personal audio content management are discussed in this paper. Our main target is to create a system with several co-operating management tools for an audio database, tools that improve each other's results. A Bayesian network based audio classification algorithm provides classification into four main audio classes (silence, speech, music, and noise) and serves as a first step for the other analysis tools. For speech analysis we propose an improved Bayesian information criterion based speaker segmentation and clustering algorithm, together with a combined gender and emotion detection algorithm utilizing prosodic features. For the other main classes it is often hard to devise any general and well-functioning pre-categorization that would fit the unforeseeable types of user-recorded data. To compensate for the absence of analysis tools for these classes, we propose an efficient audio similarity measure and a query-by-example algorithm with database clustering capabilities. The experimental results show that the combined use of the algorithms is feasible in practice.
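
The query-by-example idea mentioned above can be illustrated by representing each clip with a fixed-length feature summary and ranking the database by distance to the query. The sketch below is a generic illustration using MFCC mean/deviation summaries and Euclidean distance; it is not the paper's similarity measure, and the file names are placeholders.

```python
# Generic query-by-example sketch (not the paper's similarity measure):
# each clip is summarized by MFCC means and standard deviations, and the
# database is ranked by Euclidean distance to the query. File names are
# placeholders.
import numpy as np
import librosa

def clip_signature(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=16000, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

database = ["clip_001.wav", "clip_002.wav", "clip_003.wav"]   # placeholder files
signatures = np.stack([clip_signature(p) for p in database])

query = clip_signature("query.wav")
ranking = np.argsort(np.linalg.norm(signatures - query, axis=1))
print("most similar clips:", [database[i] for i in ranking])
```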
