Iker Luengo

University of the Basque Country

Publication

Featured research published by Iker Luengo.

IEEE Transactions on Multimedia | 2010

Feature Analysis and Evaluation for Automatic Emotion Identification in Speech

Iker Luengo; Eva Navas; Inmaculada Hernáez

The definition of parameters is a crucial step in the development of a system for identifying emotions in speech. Although there is no agreement on which are the best features for this task, it is generally accepted that prosody carries most of the emotional information. Most works in the field use some kind of prosodic features, often in combination with spectral and voice quality parametrizations. Nevertheless, no systematic study has been done comparing these features. This paper presents the analysis of the characteristics of features derived from prosody, spectral envelope, and voice quality as well as their capability to discriminate emotions. In addition, early fusion and late fusion techniques for combining different information sources are evaluated. The results of this analysis are validated with experimental automatic emotion identification tests. Results suggest that spectral envelope features outperform the prosodic ones. Even when different parametrizations are combined, the late fusion of long-term spectral statistics with short-term spectral envelope parameters provides an accuracy comparable to that obtained when all parametrizations are combined.
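As a rough illustration of the two combination strategies named in this abstract, the sketch below contrasts early fusion (joining feature vectors before classification) with late fusion (merging the scores of separate classifiers). The abstract does not state which fusion rule the authors used, so the weighted sum, the function names, and the example scores are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def early_fusion(feature_sets: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-source feature vectors into one vector for a single classifier."""
    fused: List[float] = []
    for features in feature_sets:
        fused.extend(features)
    return fused


def late_fusion(
    scores_per_classifier: Sequence[Dict[str, float]],
    weights: Sequence[float],
) -> str:
    """Weighted sum of per-classifier class scores (an assumed rule); returns the top label."""
    combined: Dict[str, float] = {}
    for scores, weight in zip(scores_per_classifier, weights):
        for label, score in scores.items():
            combined[label] = combined.get(label, 0.0) + weight * score
    return max(combined, key=combined.get)


# Hypothetical example: prosodic and spectral feature vectors, then two classifier outputs.
fused_vector = early_fusion([[1.0, 2.0], [3.0]])
decision = late_fusion(
    [{"anger": 0.6, "joy": 0.4}, {"anger": 0.2, "joy": 0.8}],
    weights=[0.5, 0.5],
)
```

With early fusion one classifier sees every parameter at once; with late fusion each parametrization keeps its own classifier and only the decisions are merged, which is the setting the abstract reports for long-term spectral statistics and short-term spectral envelope parameters.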

international conference on acoustics, speech, and signal processing | 2007

Evaluation of Pitch Detection Algorithms Under Real Conditions

Iker Luengo; Ibon Saratxaga; Eva Navas; Inmaculada Hernáez; Jon Sanchez; Iñaki Sainz

A novel algorithm based on classical cepstrum calculation followed by dynamic programming is presented in this paper. The algorithm has been evaluated with a 60-minute database containing 60 speakers and different recording conditions and environments. A second reference database has also been used. In addition, the performance of four popular pitch detection algorithms (PDAs) has been evaluated with the same databases. The results show the good performance of the described algorithm in noisy conditions. Furthermore, the paper is a first initiative to evaluate widely used PDAs over an extensive and realistic database.
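The dynamic-programming stage mentioned in this abstract can be pictured with the minimal sketch below. It assumes that an earlier step (for instance cepstral peak picking, not shown here) has already produced a few pitch candidates per frame, each with a peak strength. The cost function, the `jump_penalty` value, and all names are hypothetical; the abstract does not describe the authors' actual formulation.

```python
from typing import List, Sequence, Tuple

Candidate = Tuple[float, float]  # (F0 in Hz, peak strength); assumed representation


def track_pitch(
    frames: Sequence[Sequence[Candidate]],
    jump_penalty: float = 0.01,
) -> List[float]:
    """Viterbi-style search: maximise summed peak strength minus a penalty on F0 jumps.

    Every frame must contain at least one candidate.
    """
    if not frames:
        return []
    # best[i][j]: best cumulative score of any path ending at candidate j of frame i.
    best = [[strength for _, strength in frames[0]]]
    back: List[List[int]] = [[-1] * len(frames[0])]
    for i in range(1, len(frames)):
        row_scores: List[float] = []
        row_back: List[int] = []
        for f0, strength in frames[i]:
            options = [
                best[i - 1][k] - jump_penalty * abs(f0 - prev_f0)
                for k, (prev_f0, _) in enumerate(frames[i - 1])
            ]
            k_best = max(range(len(options)), key=options.__getitem__)
            row_scores.append(options[k_best] + strength)
            row_back.append(k_best)
        best.append(row_scores)
        back.append(row_back)
    # Backtrack from the best final candidate to recover the full contour.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path: List[float] = []
    for i in range(len(frames) - 1, -1, -1):
        path.append(frames[i][j][0])
        j = back[i][j]
    path.reverse()
    return path


# Hypothetical candidates: the middle frame has a stronger but octave-doubled peak.
contour = track_pitch([[(100.0, 1.0)], [(200.0, 1.0), (102.0, 0.9)], [(101.0, 1.0)]])
```

In this toy input the 200 Hz candidate has the highest local strength, yet the smoothness penalty makes the search prefer the 102 Hz candidate, which is the kind of octave error a continuity constraint is meant to suppress.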

text speech and dialogue | 2004

Obtaining and Evaluating an Emotional Database for Prosody Modelling in Standard Basque

Eva Navas; Inmaculada Hernáez; Amaia Castelruiz; Iker Luengo

This paper presents a database designed to extract prosodic models corresponding to emotional speech to be used in speech synthesis for standard Basque. A database of acted speech, which uses a corpus containing both neutral texts and texts semantically related to emotion, has been recorded for the six basic emotions: anger, disgust, fear, joy, sadness and surprise. Subjective evaluation of the database shows that emotions are accurately identified, so it can be used to study prosodic models of emotion in Basque.

text speech and dialogue | 2005

Analysis of the suitability of common corpora for emotional speech modeling in standard Basque

Eva Navas; Inmaculada Hernáez; Iker Luengo; Jon Sanchez; Ibon Saratxaga

This paper presents the analysis made to assess the suitability of neutral semantic corpora to study emotional speech. Two corpora have been used: one having neutral texts that were common to all emotions and the other having texts related to the emotion. Subjective and objective analyses have been performed. In the subjective test, the common corpus has achieved good recognition rates, although worse than those obtained with specific texts. In the objective analysis, differences among emotions are larger for common texts than for specific texts, indicating that in the common corpus the expression of emotions was more exaggerated. This is convenient for emotional speech synthesis, but not for emotion recognition. So, in this case, the common corpus is suitable for the prosodic modeling of emotions to be used in speech synthesis, but for emotion recognition specific texts are more convenient.

COST 2102'07 Proceedings of the 2007 COST action 2102 international conference on Verbal and nonverbal communication behaviours | 2007

Meaningful parameters in emotion characterisation

Eva Navas; Inmaculada Hernáez; Iker Luengo; Iñaki Sainz; Ibon Saratxaga; Jon Sanchez

In expressive speech synthesis, some method of mimicking the way one specific speaker expresses emotions is needed. In this work we have studied the suitability of long-term prosodic parameters and short-term spectral parameters to reflect emotions in speech, by means of the analysis of the results of two automatic emotion classification systems. Those systems have been trained with different emotional monospeaker databases recorded in standard Basque that include six emotions. Both of them are able to differentiate among emotions for a specific speaker with very high identification rates (above 75%), but the models are not applicable to other speakers (identification rates drop to 20%). Therefore, in the synthesis process the control of both spectral and prosodic features is essential to get expressive speech, and when a change in speaker is desired the values of the parameters should be re-estimated.

iberoamerican congress on pattern recognition | 2004

Acoustical Analysis of Emotional Speech in Standard Basque for Emotions Recognition

Eva Navas; Inmaculada Hernáez; Amaia Castelruiz; Jon Sanchez; Iker Luengo

This paper presents the acoustical study of an emotional speech database in standard Basque to determine the set of parameters that can be used for the recognition of emotions. The database is divided into two parts, one with neutral texts and another one with texts semantically related to the emotion. The study is performed on both parts, in order to know whether the same criteria may be used to recognize emotions independently of the semantic content of the text. Mean F0, F0 range, maximum positive slope in the F0 curve, mean phone duration and RMS energy are analyzed. The parameters selected can distinguish emotions in both corpora, so they are suitable for emotion recognition.
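To make the parameters listed in this abstract concrete, here is a small sketch that computes four of them from an F0 contour and a block of waveform samples. The abstract gives no exact definitions, so several choices are assumptions: unvoiced frames are marked with 0, the slope is taken between consecutive voiced frames, and the function and argument names are invented for this example. Mean phone duration is left out because it requires a phone segmentation.

```python
import math
from typing import Dict, Sequence


def prosodic_stats(
    f0_hz: Sequence[float],
    frame_step_s: float,
    samples: Sequence[float],
) -> Dict[str, float]:
    """Utterance-level statistics; frames with F0 <= 0 are treated as unvoiced (assumption)."""
    voiced = [f for f in f0_hz if f > 0]
    if not voiced or not samples:
        raise ValueError("need at least one voiced frame and one sample")
    # Slope in Hz per second, only between neighbouring frames that are both voiced.
    slopes = [
        (nxt - cur) / frame_step_s
        for cur, nxt in zip(f0_hz, f0_hz[1:])
        if cur > 0 and nxt > 0
    ]
    return {
        "mean_f0": sum(voiced) / len(voiced),
        "f0_range": max(voiced) - min(voiced),
        # 0.0 when the contour never rises between adjacent voiced frames.
        "max_positive_slope": max([s for s in slopes if s > 0], default=0.0),
        "rms_energy": math.sqrt(sum(x * x for x in samples) / len(samples)),
    }


# Hypothetical contour with one unvoiced frame and a 10 ms frame step.
stats = prosodic_stats([100.0, 110.0, 0.0, 120.0, 115.0], 0.01, [1.0, -1.0, 1.0, -1.0])
```

Statistics of this kind summarise a whole utterance in a handful of numbers, which is why they are commonly described as long-term prosodic parameters.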

conference on computer as a tool | 2005

Front-End for the Oral Control of Applications in Windows Environments

Iñaki Sainz; Eva Navas; Jon Sanchez; Iker Luengo; Inmaculada Hernáez

This paper presents the development of an oral interface to control any Windows application by means of speech, providing a user-friendly interface. This front-end is fully configurable using plain text files and is able to manage any program with a graphic environment that works under the Windows operating system, using functions from the Windows API. Hidden Markov models provide the speech recognition ability, using recursive training of triphone models built with a SpeechDat database. The text-to-speech system uses an MBROLA-based algorithm and is integrated in a dynamic library. The application is focused on users with a visual or motor disability and is designed to be used in the Basque language.

conference of the international speech communication association | 2005