Ronald Müller
Technische Universität München
Publication
Featured research published by Ronald Müller.
international conference on multimedia and expo | 2005
Björn W. Schuller; Stephan Reiter; Ronald Müller; Marc Al-Hames; Manfred K. Lang; Gerhard Rigoll
Emotion recognition is becoming an important factor in future media retrieval and man-machine interfaces. However, even human judges often have difficulty recognizing another person's emotion, especially that of a stranger. In this work we strive to recognize emotion independently of the person, concentrating on the speech channel. The relevance of single acoustic features is a critical point, which we address by filter-based gain ratio calculation starting from a basis of 276 features. As optimizing a minimum feature set as a whole generally saves more extraction effort, we furthermore apply an SVM-SFFS wrapper-based search. For a more robust estimation we also integrate spoken content information by a Bayesian net analysis of ASR outputs. Overall classification is realized in an early feature fusion by stacked ensembles of diverse base classifiers. Tests ran on a database of 3,947 movie and automotive interaction dialog turns from 35 speakers. Remarkable overall performance can be reported in the discrimination of the seven discrete emotions named in the MPEG-4 standard with added neutrality.
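The filter-based gain ratio step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the acoustic features have already been discretized into bins; the function names, the toy labels, and the two example features are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_bins, labels):
    """Information gain of a discretized feature divided by its split information."""
    total = len(labels)
    groups = {}
    for b, y in zip(feature_bins, labels):
        groups.setdefault(b, []).append(y)
    # Class entropy that remains after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_bins)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two binned features and two emotion labels.
labels = ["anger", "anger", "neutral", "neutral"]
pitch_bins = ["high", "high", "low", "low"]    # separates the classes
energy_bins = ["high", "low", "high", "low"]   # carries no class information

scores = {
    "pitch": gain_ratio(pitch_bins, labels),
    "energy": gain_ratio(energy_bins, labels),
}
# Rank features from most to least relevant.
ranking = sorted(scores, key=scores.get, reverse=True)
```

A filter of this kind scores each feature on its own, which is why the abstract pairs it with a wrapper search that evaluates feature sets as a whole.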
international conference on multimedia and expo | 2007
Marc Al-Hames; Benedikt Hörnler; Ronald Müller; Joachim Schenk; Gerhard Rigoll
In a video conference the participants usually see the video of the speaker. However, if somebody reacts (e.g. by nodding), the system should switch to that person's video. Current systems do not support this. We formulate camera selection as a pattern recognition problem and apply HMMs to learn this behaviour, so our system can easily be adapted to different meeting scenarios. Furthermore, while current systems stay on the speaker, our system will switch if somebody reacts. In an experimental section we show that, compared to a desired output, a current system shows the wrong camera more than half of the time (frame error rate, FER, 53%), whereas our system selects the wrong camera only about a quarter of the time (FER 27%).
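The frame error rate reported above can be read as the fraction of frames in which the selected camera differs from the desired one. The sketch below assumes that definition, which the abstract does not spell out; the function name and the camera sequences are hypothetical toy data, not the paper's results.

```python
def frame_error_rate(selected, desired):
    """Fraction of frames whose selected camera differs from the desired camera."""
    if len(selected) != len(desired):
        raise ValueError("sequences must cover the same frames")
    if not desired:
        raise ValueError("sequences must not be empty")
    errors = sum(s != d for s, d in zip(selected, desired))
    return errors / len(desired)


# Hypothetical example: camera 1 shows the speaker, camera 2 a reacting participant.
desired = [1, 1, 2, 2]
stay_on_speaker = [1, 1, 1, 1]   # never switches to the reaction
fer = frame_error_rate(stay_on_speaker, desired)
```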
Archive | 2006
Björn W. Schuller; Markus Ablaßmeier; Ronald Müller; Stefan Reifinger; Tony Poitschke; Gerhard Rigoll
Within the area of advanced man-machine interaction, speech communication has played a major role for several decades. The idea of replacing conventional input devices such as buttons and keyboards with voice control, and thus considerably increasing comfort and input speed, seems so attractive that even the quite slow progress of speech technology during those decades could not discourage people from pursuing that goal. However, this area is now in a different situation than in those earlier times, and this shall also be considered in this book section. First of all, speech technology has reached a much higher degree of maturity, mainly through the technique of stochastic modeling, which shall be briefly introduced in this chapter. Secondly, other interaction techniques have become more mature, too, and in the course of that development speech has become one of the preferred modalities of multimodal interaction, e.g. as an ideal complementary mode to pointing or gesture. This shall also be reflected in the subsection on multimodal interaction. Another relatively recent development is the recognition that speech is a carrier not only of linguistic information but also of emotional information, and emotions have become another important aspect of today's advanced man-machine interaction. This will be considered in a subsection on affective computing, where the topic is also investigated from a multimodal point of view, taking into account the possibilities for extracting emotional cues from the speech signal as well as from visual information. We believe that such an integrated approach to all of the above aspects is appropriate in order to reflect the newest developments in the field.
international conference on machine learning | 2006
Marc Al-Hames; Thomas Hain; Jan Cernocky; Sascha Schreiber; Mannes Poel; Ronald Müller; Sébastien Marcel; David A. van Leeuwen; Jean-Marc Odobez; Sileye Ba; Hervé Bourlard; Fabien Cardinaux; Daniel Gatica-Perez; Adam Janin; Petr Motlicek; Stephan Reiter; Steve Renals; Jeroen van Rest; Rutger Rienks; Gerhard Rigoll; Kevin Smith; Andrew Thean; Pavel Zemcik
The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, and with the required component technologies. Its R&D themes are group dynamics; audio, visual, and multimodal processing; content abstraction; and human-computer interaction. The audio-visual processing workpackage within AMI addresses automatic recognition from audio, video, and combined audio-video streams that have been recorded during meetings. In this article we describe the progress that has been made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as “Who is acting during the meeting?”. We then show which algorithms and methods have been developed and evaluated for answering these questions automatically.
international conference on multimedia and expo | 2004
Björn W. Schuller; Ronald Müller; Gerhard Rigoll; Manfred K. Lang
We present a novel approach towards robust keyword-based retrieval. Bayesian belief networks are applied in a word-model-based approximate string matching algorithm. Apart from a proven reliable performance in a working implementation on standard sources like digital text, wholly probabilistic modeling allows for the integration of confidence measures and hypotheses obtained from preprocessing stages, like handwriting recognition or optical character recognition, respecting uncertainties on the lower levels. Furthermore, a flexible method to include the modeling of specific error types derived from humans and various input sources is provided. The remarkable performance of the presented algorithms was tested in an extensive evaluation against the Levenshtein distance, which can be seen as the basis of state-of-the-art methods in this research field. The tests ran on a 14 K database containing common international music titles and four 10 K databases consisting of the most frequently used words in English, German, French and Dutch.
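For orientation, the Levenshtein distance used as the comparison baseline in the abstract above can be sketched as follows. This is the standard dynamic-programming formulation, not the Bayesian belief network approach of the paper; the `best_match` helper and the title list are hypothetical illustrations.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(query, candidates):
    """Return the candidate with the smallest edit distance to the query."""
    return min(candidates, key=lambda c: levenshtein(query, c))


# Hypothetical title list with a misspelled query.
titles = ["yesterday", "imagine", "hey jude"]
match = best_match("yesterdy", titles)
```

Every edit operation costs the same here, whereas the abstract describes modeling specific error types probabilistically.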
international conference on multimedia and expo | 2007
Lukas Diduch; Ronald Müller; Gerhard Rigoll
This paper introduces the software framework MMER Lab, which allows an effective assembly of modular signal processing systems optimized for memory efficiency and performance. Our C/C++ framework is designed to form the basis of a well-organized and simplified development process in industrial and academic research teams. It supports the structuring of modular systems by providing basic data, parameter, and command interfaces, ensuring the re-usability of the system components. Due to the underlying multi-threading capabilities, applications built in MMER Lab can fully exploit the increasing computational power of multi-core CPU architectures. This is achieved by a buffering concept which controls the data flow between the connected modules and allows for the parallel processing of consecutive signal segments (e.g. video frames). We introduce the concept of the multi-threading environment and the data flow architecture with its comfortable programming interface. We illustrate the proposed module concept for the generic assembly of processing chains and show applications from the area of video analysis and pattern
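The buffering idea described in the abstract above, modules connected by buffers so that consecutive segments can be in flight at the same time, can be sketched generically. The following is a Python illustration using the standard `queue` and `threading` modules; it is not the MMER Lab C/C++ interface, and `run_module`, `run_chain`, and `buffer_size` are hypothetical names.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the stream


def run_module(func, inbox, outbox):
    """Apply func to every item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)
            return
        outbox.put(func(item))


def run_chain(funcs, items, buffer_size=4):
    """Run funcs as a chain of threads connected by bounded buffers."""
    buffers = [queue.Queue(maxsize=buffer_size) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=run_module, args=(f, buffers[i], buffers[i + 1]))
        for i, f in enumerate(funcs)
    ]

    def feed():
        # Blocks whenever the first buffer is full, throttling the source.
        for item in items:
            buffers[0].put(item)
        buffers[0].put(_STOP)

    feeder = threading.Thread(target=feed)
    for t in workers + [feeder]:
        t.start()

    results = []
    while True:
        out = buffers[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for t in workers + [feeder]:
        t.join()
    return results


# Hypothetical two-stage chain applied to five "frames".
processed = run_chain([lambda x: x + 1, lambda x: x * 2], range(5))
```

Each stage runs in its own thread, so one stage can work on a segment while the next stage handles the previous one. In CPython the global interpreter lock limits CPU parallelism for pure-Python stages, so this sketch shows the data flow rather than the multi-core speed-up the abstract describes.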
conference of the international speech communication association | 2005
Björn W. Schuller; Ronald Müller; Manfred K. Lang; Gerhard Rigoll
conference of the international speech communication association | 2006
Björn W. Schuller; Niels Köhler; Ronald Müller; Gerhard Rigoll
Lecture Notes in Computer Science | 2006
Marc Al-Hames; Thomas Hain; Jan Cernocky; Sascha Schreiber; Mannes Poel; Ronald Müller; Sébastien Marcel; David A. van Leeuwen; Jean-Marc Odobez; Sileye Ba; Fabien Cardinaux; Daniel Gatica-Perez; Adam Janin; Petr Motlicek; Stephan Reiter; Steve Renals; Jeroen van Rest; Rutger Rienks; Gerhard Rigoll; Kevin Smith; Andrew Thean; Pavel Zemcik