Felix Weninger
Nuance Communications
Publications
Featured research published by Felix Weninger.
Proceedings of the 6th Workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction | 2013
Florian Eyben; Felix Weninger; Lucas Paletta; Björn W. Schuller
An important aspect of short dialogues is attention, as manifested by eye contact between subjects. In this study we provide a first analysis of whether such visual attention is evident in the acoustic properties of a speaker's voice. To this end we introduce the multi-modal GRAS2 corpus, which was recorded for analysing attention in short daily-life human-to-human interactions with strangers in public places in Graz, Austria. The corpus contains recordings of four test subjects equipped with eye tracking glasses, three audio recording devices, and motion sensors. We describe how we robustly identify speech segments from the subjects and other people in an unsupervised manner from the multi-channel recordings. We then discuss correlations between the acoustics of the voice in these segments and the subjects' point of visual attention. A significant relation is found between the acoustic features and the distance between the point of view and the eye region of the dialogue partner. Further, we show that automatic binary classification of eye contact vs. no eye contact from acoustic features alone is feasible, with an Unweighted Average Recall of up to 70%.
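To make the reported metric concrete, the following is a minimal Python sketch of Unweighted Average Recall (UAR), the unweighted mean of per-class recalls; the labels below are hypothetical, not taken from the GRAS2 corpus.

from collections import defaultdict

def unweighted_average_recall(y_true, y_pred):
    # UAR averages the recall of each class with equal weight, so it is
    # not inflated by class imbalance (e.g. many more no-eye-contact frames).
    correct, total = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

# Hypothetical frame labels: 1 = eye contact, 0 = no eye contact.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0]
print(unweighted_average_recall(y_true, y_pred))  # (1/2 + 5/6) / 2 ~= 0.667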
International Conference on Acoustics, Speech, and Signal Processing | 2017
Yue Zhang; Yifan Liu; Felix Weninger; Björn W. Schuller
Emotion representations are psychological constructs for modelling, analysing, and recognising emotion, an essential element of affect. Owing to this complexity, the boundaries between different emotion concepts are often fuzzy, which is also reflected in the diversification of emotion databases and their inconsistent target labels. When facing data scarcity, an ever-present issue in acoustic emotion recognition, the straightforward method for jointly using the existing data resources is to map the various emotion labels onto one common dimensional space; this, however, comes with considerable information loss. To resolve the dilemma of aggregating data while efficiently exploiting the emotion labels in their original meaning and interrelations, we advocate multi-task deep neural networks with shared hidden layers (MT-SHL-DNN), in which the feature transformations are shared across different emotion representations, while the output layers are separately associated with each emotion database. On nine frequently used emotional speech corpora and two different acoustic feature sets, we demonstrate that the MT-SHL-DNN method outperforms single-task DNNs trained with only one emotion representation.
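The shared-hidden-layer architecture described above can be illustrated with a minimal PyTorch sketch; the layer sizes and corpus heads here are assumptions for illustration, not the paper's exact configuration.

import torch
import torch.nn as nn

class MTSHLDNN(nn.Module):
    def __init__(self, n_features, n_hidden, heads):
        super().__init__()
        # Hidden layers shared across all emotion representations.
        self.shared = nn.Sequential(
            nn.Linear(n_features, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
        )
        # One output layer per emotion database, each with its own label space.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(n_hidden, n_out) for name, n_out in heads.items()}
        )

    def forward(self, x, corpus):
        return self.heads[corpus](self.shared(x))

# Hypothetical corpora with differing emotion label sets.
model = MTSHLDNN(n_features=88, n_hidden=256,
                 heads={"corpusA": 4, "corpusB": 7})
logits = model(torch.randn(8, 88), corpus="corpusA")  # shape (8, 4)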
International Conference on Multimodal Interfaces | 2016
Yue Zhang; Felix Weninger; Anton Batliner; Florian Hönig; Björn W. Schuller
In this work, we present an in-depth analysis of the interdependency between the non-native prosody and the native language (L1) of English L2 speakers, as separately investigated in the Degree of Nativeness Task and the Native Language Task of the INTERSPEECH 2015 and 2016 Computational Paralinguistics ChallengE (ComParE). To this end, we propose a multi-task learning scheme based on auxiliary attributes for jointly learning L1 classification and prosody score regression. The effectiveness of this scheme is demonstrated in extensive experiments comparing various standardised feature sets of prosodic, cepstral, spectral, and voice quality descriptors, as well as automatic feature selection. The results show that the prediction of both the prosody score and the L1 can be improved by considering the two tasks holistically. In particular, we achieve an 11% relative gain in regression performance (Spearman's correlation coefficient) on prosody scores when comparing the best multi-task and single-task learning results.
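A minimal sketch of such a joint setup, assuming a shared network with a softmax head for L1 classification and a linear head for prosody score regression; the dimensions and loss weighting are illustrative assumptions, not the paper's configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical dimensions: 130 acoustic features, 11 L1 classes.
shared = nn.Sequential(nn.Linear(130, 128), nn.ReLU())  # shared layers
l1_head = nn.Linear(128, 11)       # L1 classification head
prosody_head = nn.Linear(128, 1)   # prosody score regression head

x = torch.randn(16, 130)                 # a batch of acoustic features
l1_labels = torch.randint(0, 11, (16,))  # L1 class per sample
scores = torch.randn(16)                 # prosody score per sample

h = shared(x)
# Joint objective: both tasks back-propagate into the shared layers.
loss = F.cross_entropy(l1_head(h), l1_labels) \
       + 0.5 * F.mse_loss(prosody_head(h).squeeze(1), scores)
loss.backward()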
Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR), Miami, FL, USA | 2011
Björn W. Schuller; Felix Weninger; Johannes Dorfner
International Conference on Acoustics, Speech, and Signal Processing | 2013
Felix Weninger; Jürgen T. Geiger; Martin Wöllmer; Björn W. Schuller; Gerhard Rigoll
International Conference on Acoustics, Speech, and Signal Processing | 2014
Jürgen T. Geiger; Erik Marchi; Felix Weninger; Björn W. Schuller; Gerhard Rigoll
Proceedings of CHiME 2013 | 2013
Jürgen T. Geiger; Felix Weninger; Antti Hurmalainen; Jort F. Gemmeke; Martin Wöllmer; Björn W. Schuller; Gerhard Rigoll; Tuomas Virtanen
Proceedings of the MediaEval 2012 Workshop | 2012
Cyril Joder; Felix Weninger; Martin Wöllmer; Björn W. Schuller
International Symposium on Neural Networks | 2017
Zixing Zhang; Felix Weninger; Martin Wöllmer; Jing Han; Björn W. Schuller
Archive | 2018
Felix Weninger; Jun Du; Erik Marchi; Tian Gao