Felix Burkhardt
Deutsche Telekom
Publications
Featured research published by Felix Burkhardt.
international conference on acoustics, speech, and signal processing | 2007
Florian Metze; Jitendra Ajmera; Roman Englert; Udo Bub; Felix Burkhardt; Joachim Stegmann; Christian A. Müller; Richard Huber; Bernt Andrassy; Josef Bauer; Bernhard Littel
This paper presents a comparative study of four different approaches to automatic age and gender classification using seven classes on a telephony speech task and also compares the results with human performance on the same data. The automatic approaches compared are based on (1) a parallel phone recognizer, derived from an automatic language identification system; (2) a system using dynamic Bayesian networks to combine several prosodic features; (3) a system based solely on linear prediction analysis; and (4) Gaussian mixture models based on MFCCs for separate recognition of age and gender. On average, the parallel phone recognizer performs as well as human listeners do, while losing performance on short utterances. The system based on prosodic features, however, shows very little dependence on the length of the utterance.
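The fourth approach from the abstract, frame-level MFCCs modeled by per-class Gaussian mixture models, can be illustrated with a minimal sketch. This is not the authors' implementation; the file lists, sampling rate and model sizes are assumptions chosen for a telephony-style setup.

```python
# Minimal sketch of approach (4): per-class GMMs over MFCC frames.
# Not the paper's code; files_per_class, sample rate and model size are assumed.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_frames(path, sr=8000, n_mfcc=13):
    """Return an (n_frames, n_mfcc) MFCC matrix for one telephony recording."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_class_gmms(files_per_class, n_components=32):
    """Fit one diagonal-covariance GMM per age/gender class on pooled frames."""
    gmms = {}
    for label, files in files_per_class.items():
        frames = np.vstack([mfcc_frames(f) for f in files])
        gmms[label] = GaussianMixture(n_components=n_components,
                                      covariance_type="diag").fit(frames)
    return gmms

def classify(path, gmms):
    """Pick the class whose GMM assigns the highest average log-likelihood."""
    frames = mfcc_frames(path)
    return max(gmms, key=lambda label: gmms[label].score(frames))
```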
Journal of Web Semantics | 2007
Daniel Oberle; Anupriya Ankolekar; Pascal Hitzler; Philipp Cimiano; Michael Sintek; Malte Kiesel; Babak Mougouie; Stephan Baumann; Shankar Vembu; Massimo Romanelli; Paul Buitelaar; Ralf Engel; Daniel Sonntag; Norbert Reithinger; Berenike Loos; Hans-Peter Zorn; Vanessa Micelli; Robert Porzel; Christian Schmidt; Moritz Weiten; Felix Burkhardt; Jianshen Zhou
Increased availability of mobile computing devices, such as personal digital assistants (PDAs), creates the potential for constant and intelligent access to up-to-date, integrated and detailed information from the Web, regardless of one's actual geographical position. Intelligent question-answering requires the representation of knowledge from various domains, such as the navigational and discourse context of the user, potential user questions, the information provided by Web services and so on, for example in the form of ontologies. Within the context of the SmartWeb project, we have developed a number of domain-specific ontologies that are relevant for mobile and intelligent user interfaces to open-domain question-answering and information services on the Web. To integrate the various domain-specific ontologies, we have developed a foundational ontology, the SmartSUMO ontology, on the basis of the DOLCE and SUMO ontologies. This allows us to combine all the developed ontologies into a single SmartWeb Integrated Ontology (SWIntO) having a common modeling basis with conceptual clarity and the provision of ontology design patterns for modeling consistency. In this paper, we present SWIntO, describe the design choices we made in its construction, illustrate the use of the ontology through a number of applications, and discuss some of the lessons learned from our experiences.
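As a rough illustration of the kind of integration described above, the following sketch anchors a domain-ontology class under a foundational class using rdflib. The namespaces and class names are invented placeholders, not terms taken from SWIntO or SmartSUMO.

```python
# Hypothetical sketch of aligning a domain-ontology class with a foundational
# ontology class; namespaces and class names are placeholders, not SWIntO terms.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

SUMO = Namespace("http://example.org/smartsumo#")   # placeholder foundational ontology
NAV = Namespace("http://example.org/navigation#")   # placeholder domain ontology

g = Graph()
g.bind("sumo", SUMO)
g.bind("nav", NAV)

# Declare a domain class and anchor it under a foundational concept.
g.add((NAV.PointOfInterest, RDF.type, OWL.Class))
g.add((NAV.PointOfInterest, RDFS.subClassOf, SUMO.GeographicRegion))
g.add((NAV.PointOfInterest, RDFS.label, Literal("Point of interest")))

print(g.serialize(format="turtle"))
```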
international conference on acoustics, speech, and signal processing | 2008
Tobias Bocklet; Andreas K. Maier; Josef Bauer; Felix Burkhardt; Elmar Nöth
This paper compares two approaches to automatic age and gender classification with 7 classes. The first approach uses Gaussian mixture models (GMMs) with universal background models (UBMs), a technique well known from speaker identification/verification. The training is performed with the EM algorithm or MAP adaptation, respectively. In the second approach, a GMM is trained for each speaker of the training and test sets. The means of each model are extracted and concatenated, which results in a GMM supervector for each speaker. These supervectors are then used in a support vector machine (SVM). Three different kernels were employed for the SVM approach: a polynomial kernel (of different degrees), an RBF kernel and a linear GMM distance kernel based on the KL divergence. With the SVM approach we improved the recognition rate to 74% (p < 0.001) and are in the same range as humans.
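The supervector construction in the second approach is easy to sketch: fit a small GMM per speaker, flatten its means into a single vector, and classify those vectors with an SVM. This is a hedged illustration with assumed feature matrices, not the paper's implementation.

```python
# Illustrative sketch, not the paper's code: GMM-supervector + SVM classification.
# frames_per_speaker is an assumed list of (n_frames, n_dims) feature matrices.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def supervector(frames, n_components=16):
    """Fit a per-speaker GMM and concatenate its component means into one vector."""
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag").fit(frames)
    return gmm.means_.ravel()   # length: n_components * n_dims

def train_supervector_svm(frames_per_speaker, labels, kernel="linear"):
    X = np.vstack([supervector(f) for f in frames_per_speaker])
    # "poly" and "rbf" kernels are available directly; the KL-based GMM distance
    # kernel from the paper would have to be passed as a precomputed kernel matrix.
    return SVC(kernel=kernel).fit(X, labels)
```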
international conference on acoustics, speech, and signal processing | 2009
Felix Burkhardt; Tim Polzehl; Joachim Stegmann; Florian Metze; Richard Huber
Acoustic anger detection in voice portals can help to enhance human computer interaction. A comprehensive voice portal data collection has been carried out and gives new insight into the nature of real-life data. Manual labeling revealed a high percentage of non-classifiable data. Experiments with a statistical classifier indicate that, in contrast to pitch- and energy-related features, duration measures do not play an important role for this data, while cepstral information does. Also, in a direct comparison between Gaussian mixture models and support vector machines, the latter gave better results.
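A rough sketch of the comparison described above: simple pitch, energy and cepstral functionals per utterance, classified once with an SVM and once with per-class GMMs. The feature choices and helper names are assumptions, not the system's actual feature set.

```python
# Hedged sketch of the SVM-vs-GMM comparison on pitch, energy and cepstral
# functionals; feature choices and parameters are assumptions for illustration.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.mixture import GaussianMixture

def utterance_features(path, sr=8000):
    """Mean and standard deviation of pitch, energy and MFCC contours."""
    y, sr = librosa.load(path, sr=sr)
    contours = [librosa.yin(y, fmin=60, fmax=400, sr=sr),   # pitch
                librosa.feature.rms(y=y)[0]]                # energy
    contours += list(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13))  # cepstral
    return np.array([stat for c in contours for stat in (np.mean(c), np.std(c))])

def train_svm(paths, labels):
    X = np.vstack([utterance_features(p) for p in paths])
    return SVC(kernel="rbf").fit(X, labels)

def train_gmms(paths, labels, n_components=4):
    X = np.vstack([utterance_features(p) for p in paths])
    y = np.array(labels)
    return {c: GaussianMixture(n_components).fit(X[y == c]) for c in set(labels)}
```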
affective computing and intelligent interaction | 2011
Marc Schröder; Paolo Baggia; Felix Burkhardt; Catherine Pelachaud; Christian Peter; Enrico Zovato
The present paper describes the specification of Emotion Markup Language (EmotionML) 1.0, which is undergoing standardisation at the World Wide Web Consortium (W3C). The language aims to strike a balance between practical applicability and scientific well-foundedness. We briefly review the history of the process leading to the standardisation of EmotionML. We describe the syntax of EmotionML as well as the vocabularies that are made available to describe emotions in terms of categories, dimensions, appraisals and/or action tendencies. The paper concludes with a number of relevant aspects of emotion that are not covered by the current specification.
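To make the syntax concrete, here is a small hypothetical EmotionML 1.0 document assembled with Python's standard library, annotating one emotion with a category and a dimension. The values are invented, and the vocabulary URIs are quoted from memory and should be checked against the W3C specification.

```python
# Hypothetical EmotionML 1.0 snippet built with the standard library; values are
# invented and the vocabulary URIs should be verified against the W3C specification.
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/2009/10/emotionml"
ET.register_namespace("", NS)

root = ET.Element(f"{{{NS}}}emotionml")
emotion = ET.SubElement(root, f"{{{NS}}}emotion", {
    "category-set": "http://www.w3.org/TR/emotion-voc/xml#big6",
    "dimension-set": "http://www.w3.org/TR/emotion-voc/xml#pad-dimensions",
})
ET.SubElement(emotion, f"{{{NS}}}category", {"name": "anger", "value": "0.8"})
ET.SubElement(emotion, f"{{{NS}}}dimension", {"name": "arousal", "value": "0.9"})

print(ET.tostring(root, encoding="unicode"))
```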
Archive | 2011
Marc Schröder; Hannes Pirker; Myriam Lamolle; Felix Burkhardt; Christian Peter; Enrico Zovato
In many cases when technological systems are to operate on emotions and related states, they need to represent these states. Existing representations are limited to application-specific solutions that fall short of representing the full range of concepts that have been identified as relevant in the scientific literature. The present chapter presents a broad conceptual view on the possibility to create a generic representation of emotions that can be used in many contexts and for many purposes. Potential use cases and resulting requirements are identified and compared to the scientific literature on emotions. Options for the practical realisation of an Emotion Markup Language are discussed in the light of the requirement to extend the language to different emotion concepts and vocabularies, and ontologies are investigated as a means to provide limited “mapping” mechanisms between different emotion representations.
international conference on acoustics, speech, and signal processing | 2010
Björn W. Schuller; Felix Burkhardt
Data sparseness is an ever-dominating problem in automatic emotion recognition. Using artificially generated speech for training or adapting models could potentially ease this: though less natural than human speech, one could synthesize the exact spoken content in different emotional nuances - of many speakers and even in different languages. To investigate the potential, the phonemisation components Txt2Pho and openMary are used with Emofilt and Mbrola for emotional speech synthesis. Analysis is realized with our Munich open Emotion and Affect Recognition toolkit. As test sets, we restrict ourselves to the acted Berlin and eNTERFACE databases for the moment. As a result, synthesized speech can indeed be used for the recognition of human emotional speech.
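The evaluation setup, training on synthesized speech and testing on acted human speech, can be sketched as a simple cross-corpus experiment. Feature extraction is left abstract here; the matrices below are placeholders, not the actual toolkit output used in the paper.

```python
# Sketch of the synthesized-to-human cross-corpus setup; X_synth/X_human are
# assumed acoustic feature matrices, not the toolkit output from the paper.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import recall_score

def synth_to_human_uar(X_synth, y_synth, X_human, y_human):
    """Train on synthesized emotional speech, evaluate on human emotional speech."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    clf.fit(X_synth, y_synth)
    # unweighted average recall (macro-averaged recall) is the common measure here
    return recall_score(y_human, clf.predict(X_human), average="macro")
```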
Universal Access in The Information Society | 2009
Florian Metze; Roman Englert; Udo Bub; Felix Burkhardt; Joachim Stegmann
This paper presents an advanced call center, which adapts presentation and interaction strategy to properties of the caller such as age, gender, and emotional state. User studies on interactive voice response (IVR) systems have shown that these properties can be used effectively to “tailor” services to users or user groups who do not maintain personal preferences, e.g., because they do not use the service on a regular basis. The adopted approach to achieve individualization of services, without being able to personalize them, is based on the analysis of a caller’s voice. This paper shows how this approach benefits service providers by being able to target entertainment and recommendation options. It also shows how this analysis at the same time benefits the customer, as it can increase accessibility of IVR systems to user segments which have particular expectations or which do not cope well with a “one size fits all” system. The paper summarizes the authors’ current work on component technologies, such as emotion detection, age and gender recognition on telephony speech, and presents results of usability and acceptability tests as well as an architecture to integrate these technologies in future multi-modal contact centers. It is envisioned that these will eventually serve customers with an avatar representation of an agent and tailored interaction strategies, matching powerful output capabilities with advanced analysis of the user’s input.
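As a toy illustration of the adaptation idea, the following sketch maps hypothetical classifier outputs for age group, gender and emotional state to a prompt style and an escalation decision. The categories and thresholds are invented, not taken from the described system.

```python
# Toy sketch of classifier-driven dialog adaptation; categories and thresholds
# are invented for illustration and not taken from the described call center.
from dataclasses import dataclass

@dataclass
class CallerEstimate:
    age_group: str      # e.g. "child", "youth", "adult", "senior"
    gender: str         # e.g. "female", "male" (could steer avatar/voice choice)
    anger_score: float  # 0.0 .. 1.0 from an acoustic anger detector

def choose_strategy(est: CallerEstimate) -> dict:
    strategy = {"prompt_style": "default", "escalate_to_agent": False}
    if est.anger_score > 0.7:
        strategy["escalate_to_agent"] = True        # hand over before the caller hangs up
    if est.age_group == "senior":
        strategy["prompt_style"] = "slow_verbose"   # longer, more explicit prompts
    elif est.age_group in ("child", "youth"):
        strategy["prompt_style"] = "casual_short"
    return strategy
```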
affective computing and intelligent interaction | 2009
Felix Burkhardt; Markus Van Ballegooy; Klaus-Peter Engelbrecht; Tim Polzehl; Joachim Stegmann
Emotion plays an important role in human communication, and therefore human-machine dialog systems can also benefit from affective processing. In this paper we present an overview of our work from the past few years and discuss general considerations, potential applications and experiments that we did with the emotional classification of human-machine dialogs. Anger in voice portals as well as problematic dialog situations can be detected to some degree, but the noise in real-life data and the issue of unambiguous emotion definition are still challenging. Also, a dialog system reacting emotionally might raise expectations with respect to its intellectual abilities that it cannot fulfill.
Computer Speech & Language | 2015
Björn W. Schuller; Stefan Steidl; Anton Batliner; E. Nöth; Alessandro Vinciarelli; Felix Burkhardt; R.J.J.H. van Son; Felix Weninger; Florian Eyben; Tobias Bocklet; Gelareh Mohammadi; Benjamin Weiss
The INTERSPEECH 2012 Speaker Trait Challenge aimed at a unified test-bed for perceived speaker traits – the first challenge of this kind: personality in the five OCEAN personality dimensions, likability of speakers, and intelligibility of pathologic speakers. In the present article, we give a brief overview of the state-of-the-art in these three fields of research and describe the three sub-challenges in terms of the challenge conditions, the baseline results provided by the organisers, and a new openSMILE feature set, which has been used for computing the baselines and which has been provided to the participants. Furthermore, we summarise the approaches and the results presented by the participants to show the various techniques that are currently applied to solve these classification tasks.
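The openSMILE functionals that underpin such challenge baselines can be extracted with the opensmile Python package. The sketch below uses the ComParE_2016 set as a stand-in, since the 2012 Speaker Trait configuration is not bundled under that name in the package; the file name is hypothetical.

```python
# Hedged sketch: openSMILE functionals via the opensmile Python package.
# ComParE_2016 is used as a stand-in for the 2012 challenge feature set.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)
# one row of functionals per processed file, returned as a pandas DataFrame
features = smile.process_file("speaker_001.wav")   # hypothetical file name
print(features.shape)
```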