Frank Joublin
Honda
Publications
Featured research published by Frank Joublin.
International Journal of Social Robotics | 2012
Maha Salem; Stefan Kopp; Ipke Wachsmuth; Katharina J. Rohlfing; Frank Joublin
How is communicative gesture behavior in robots perceived by humans? Although gesture is crucial in social interaction, this research question is still largely unexplored in the field of social robotics. Thus, the main objective of the present work is to investigate how gestural machine behaviors can be used to design more natural communication in social robots. The chosen approach is twofold. Firstly, the technical challenges encountered when implementing a speech-gesture generation model on a robotic platform are tackled. We present a framework that enables the humanoid robot to flexibly produce synthetic speech and co-verbal hand and arm gestures at run-time, while not being limited to a predefined repertoire of motor actions. Secondly, the achieved flexibility in robot gesture is exploited in controlled experiments. To gain a deeper understanding of how communicative robot gesture might impact and shape human perception and evaluation of human-robot interaction, we conducted a between-subjects experimental study using the humanoid robot in a joint task scenario. We manipulated the non-verbal behaviors of the robot in three experimental conditions, so that it would refer to objects by utilizing either (1) unimodal (i.e., speech only) utterances, (2) congruent multimodal (i.e., semantically matching speech and gesture) or (3) incongruent multimodal (i.e., semantically non-matching speech and gesture) utterances. Our findings reveal that the robot is evaluated more positively when non-verbal behaviors such as hand and arm gestures are displayed along with speech, even if they do not semantically match the spoken utterance.
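As a rough illustration of how the manipulated conditions could be encoded in software, the sketch below models the three referential conditions and pairs a spoken utterance with a gesture accordingly. It is a hypothetical sketch, not the authors' framework; all class and function names are invented for illustration.

```python
# Hypothetical sketch, not the authors' framework: representing the three
# referential conditions of the study and pairing speech with a gesture.
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Condition(Enum):
    UNIMODAL = auto()                # speech only
    CONGRUENT_MULTIMODAL = auto()    # speech + semantically matching gesture
    INCONGRUENT_MULTIMODAL = auto()  # speech + semantically non-matching gesture

@dataclass
class MultimodalUtterance:
    speech_text: str
    gesture_id: Optional[str]        # None in the unimodal condition
    condition: Condition

def build_utterance(text: str, matching_gesture: str, mismatching_gesture: str,
                    condition: Condition) -> MultimodalUtterance:
    """Pair a spoken object reference with a gesture according to the condition."""
    if condition is Condition.UNIMODAL:
        return MultimodalUtterance(text, None, condition)
    if condition is Condition.CONGRUENT_MULTIMODAL:
        return MultimodalUtterance(text, matching_gesture, condition)
    return MultimodalUtterance(text, mismatching_gesture, condition)
```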
Intelligent Robots and Systems | 2006
Tobias Rodemann; Martin Heckmann; Frank Joublin; Christian Goerick; Björn Schölling
We present a sound localization system that operates in real time, calculates three binaural cues (IED, IID, and ITD), and integrates them in a biologically inspired fashion into a combined localization estimate. Position information is furthermore integrated over frequency channels and time. The localization system controls a head motor to fovealize on and track the dominant sound source. Due to an integrated noise-reduction module, the system shows robust localization capabilities even in noisy conditions. Real-time performance is achieved by multi-threaded parallel operation across different machines, using a timestamp-based synchronization scheme to compensate for processing delays.
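To make the cue computation concrete, here is a minimal sketch of estimating the ITD cue for one frame by cross-correlation and mapping it to a coarse azimuth with a far-field model. This is not the authors' implementation; the sampling rate, microphone spacing, and lag range are assumed example values, and the paper additionally uses IID and IED cues plus integration over frequency channels and time.

```python
# Minimal sketch (not the authors' code): ITD by cross-correlation for one frame
# and a far-field mapping to azimuth. FS, MIC_DISTANCE and MAX_LAG are assumed.
import numpy as np

FS = 16000           # sampling rate in Hz (assumed)
MIC_DISTANCE = 0.15  # microphone spacing in metres (assumed)
SPEED_OF_SOUND = 343.0
MAX_LAG = 20         # lag search range in samples (assumed)

def estimate_itd(left: np.ndarray, right: np.ndarray) -> float:
    """Inter-aural time difference (seconds) at the best-correlating lag."""
    lags = np.arange(-MAX_LAG, MAX_LAG + 1)
    corr = [np.dot(left[MAX_LAG:-MAX_LAG], np.roll(right, lag)[MAX_LAG:-MAX_LAG])
            for lag in lags]
    return lags[int(np.argmax(corr))] / FS

def itd_to_azimuth(itd: float) -> float:
    """Map an ITD to an azimuth angle in degrees using a simple far-field model."""
    sin_az = np.clip(itd * SPEED_OF_SOUND / MIC_DISTANCE, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_az)))
```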
Intelligent Robots and Systems | 2008
Tobias Rodemann; Gökhan Ince; Frank Joublin; Christian Goerick
It is a common assumption that with just two microphones only the azimuth angle of a sound source can be estimated, and that a third, orthogonal microphone (or set of microphones) is necessary to estimate the elevation of the source. Recently, using specially designed ears and analyzing spectral cues, several researchers managed to estimate sound source elevation with a binaural system. In this work, we show that with two bionic ears both azimuth and elevation angle can be determined using binaural (e.g. IID and ITD) as well as spectral cues. This ability can also be used to disambiguate signals coming from the front or back. We present a detailed analysis of azimuth and elevation localization performance, comparing binaural and spectral cues. We demonstrate that with a small extension of a standard binaural system a basic elevation estimation capability can be gained.
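As an illustration of how a spectral cue can complement the binaural cues, the sketch below finds the deepest spectral notch in an assumed frequency band of a monaural frame and reads the elevation from a previously learned notch-to-elevation table. It is a simplified stand-in for the spectral-cue analysis described above, with all parameter values assumed.

```python
# Simplified stand-in (not the paper's method): find a pinna-induced spectral
# notch in an assumed band and look up the elevation in a learned table.
import numpy as np
from typing import Dict

FS = 16000  # sampling rate in Hz (assumed)

def spectral_notch_frequency(frame: np.ndarray,
                             f_lo: float = 4000.0, f_hi: float = 7000.0) -> float:
    """Frequency (Hz) of the deepest spectral minimum in [f_lo, f_hi]."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / FS)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(freqs[band][np.argmin(spectrum[band])])

def elevation_from_notch(notch_hz: float, table: Dict[float, float]) -> float:
    """Elevation (degrees) whose trained notch frequency is closest to notch_hz."""
    key = min(table, key=lambda f: abs(f - notch_hz))
    return table[key]
```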
Robot and Human Interactive Communication | 2011
Maha Salem; Katharina J. Rohlfing; Stefan Kopp; Frank Joublin
Gesture is an important feature of social interaction, frequently used by human speakers to illustrate what speech alone cannot provide, e.g. to convey referential, spatial or iconic information. Accordingly, humanoid robots that are intended to engage in natural human-robot interaction should produce speech-accompanying gestures for comprehensible and believable behavior. But how does a robot's non-verbal behavior influence human evaluation of communication quality and of the robot itself? To address this research question we conducted two experimental studies. Using the Honda humanoid robot, we investigated how humans perceive various gestural patterns performed by the robot as they interact in a situational context. Our findings suggest that the robot is evaluated more positively when non-verbal behaviors such as hand and arm gestures are displayed along with speech. This effect was enhanced when participants were explicitly requested to direct their attention towards the robot during the interaction.
Speech Communication | 2011
Martin Heckmann; Xavier Domont; Frank Joublin; Christian Goerick
In this paper we present a hierarchical framework for the extraction of spectro-temporal acoustic features. The design of the features targets higher robustness in dynamic environments. Motivated by the large gap between human and machine performance in such conditions, we take inspiration from the organization of the mammalian auditory cortex in the design of our features. This includes the joint processing of spectral and temporal information, the organization in hierarchical layers, competition between coequal features, the use of high-dimensional sparse feature spaces, and the learning of the underlying receptive fields in a data-driven manner. Due to these properties we term them hierarchical spectro-temporal (HIST) features. For the learning of the features at the first layer we use Independent Component Analysis (ICA). At the second layer of our feature hierarchy we apply Non-Negative Sparse Coding (NNSC) to obtain features spanning a larger frequency and time region. We investigate the contribution of the different subparts of this feature extraction process to the overall performance. This includes an analysis of the benefits of the hierarchical processing, a comparison of different feature extraction methods on the first layer, an evaluation of the feature competition, and an investigation of the influence of different receptive field sizes on the second layer. Additionally, we compare our features to MFCC and RASTA-PLP features in a continuous digit recognition task in noise, both on a wideband dataset we constructed ourselves based on the Aurora-2 task and on the actual Aurora-2 database. We show that a combination of the proposed HIST features and RASTA-PLP features yields significant improvements and that the proposed features carry information complementary to RASTA-PLP and MFCC features.
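The following sketch is a rough, simplified approximation of the two-layer idea and not the published HIST pipeline: it learns first-layer receptive fields from spectro-temporal patches with ICA and uses scikit-learn's NMF as a stand-in for Non-Negative Sparse Coding on the second layer. Patch size and component counts are illustrative assumptions.

```python
# Rough approximation, not the published HIST pipeline: ICA receptive fields on
# spectrogram patches (layer 1) and NMF as a stand-in for NNSC (layer 2).
import numpy as np
from sklearn.decomposition import FastICA, NMF

def extract_patches(spectrogram: np.ndarray, height: int, width: int) -> np.ndarray:
    """Collect all (height x width) spectro-temporal patches as flat row vectors."""
    n_freq, n_time = spectrogram.shape
    patches = [spectrogram[f:f + height, t:t + width].ravel()
               for f in range(n_freq - height + 1)
               for t in range(n_time - width + 1)]
    return np.asarray(patches)

def learn_hierarchy(spectrogram: np.ndarray):
    # Layer 1: data-driven receptive fields on small local patches via ICA.
    layer1_patches = extract_patches(spectrogram, height=8, width=8)
    ica = FastICA(n_components=32, random_state=0, max_iter=500)
    layer1_codes = ica.fit_transform(layer1_patches)
    # Layer 2: sparse non-negative recombination of rectified layer-1 responses
    # (NMF here only approximates Non-Negative Sparse Coding).
    nmf = NMF(n_components=16, init="nndsvda", max_iter=500)
    layer2_codes = nmf.fit_transform(np.maximum(layer1_codes, 0.0))
    return layer1_codes, layer2_codes
```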
IEEE Transactions on Audio, Speech, and Language Processing | 2010
Claudius Gläser; Martin Heckmann; Frank Joublin; Christian Goerick
We present a framework for estimating formant trajectories. Its focus is to achieve high robustness in noisy environments. Our approach combines a preprocessing based on functional principles of the human auditory system with a probabilistic tracking scheme. For enhancing the formant structure in spectrograms we use a Gammatone filterbank, a spectral preemphasis, as well as a spectral filtering using difference-of-Gaussians (DoG) operators. Finally, a contrast enhancement mimicking a competition between filter responses is applied. The probabilistic tracking scheme adopts the mixture modeling technique for estimating the joint distribution of formants. In conjunction with an algorithm for adaptive frequency range segmentation as well as Bayesian smoothing, an efficient framework for estimating formant trajectories is derived. Comprehensive evaluations of our method on the VTR-formant database emphasize its high precision and robustness. We obtained superior performance compared to existing approaches for clean as well as echoic noisy speech. Finally, an implementation of the framework within the scope of an online system using instantaneous feature-based resynthesis demonstrates its applicability to real-world scenarios.
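As a concrete example of the spectral-filtering step, the sketch below applies a difference-of-Gaussians operator along the frequency axis of a spectrogram followed by a crude contrast enhancement. Filter widths and the non-linearity are assumed values, not taken from the paper, and the probabilistic tracking stage is omitted.

```python
# Illustrative sketch of the spectral-filtering step only (not the paper's code):
# a difference-of-Gaussians operator along the frequency axis plus a crude
# contrast enhancement; filter widths and the exponent are assumed.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def dog_enhance(spectrogram: np.ndarray,
                sigma_narrow: float = 1.0, sigma_wide: float = 4.0) -> np.ndarray:
    """Sharpen spectral peaks: narrow minus wide Gaussian smoothing, rectified."""
    narrow = gaussian_filter1d(spectrogram, sigma_narrow, axis=0)
    wide = gaussian_filter1d(spectrogram, sigma_wide, axis=0)
    return np.maximum(narrow - wide, 0.0)

def contrast_enhance(spectrogram: np.ndarray, power: float = 2.0) -> np.ndarray:
    """Crude stand-in for the competition between filter responses:
    per-frame normalisation followed by an expansive non-linearity."""
    frame_max = spectrogram.max(axis=0, keepdims=True) + 1e-12
    return (spectrogram / frame_max) ** power
```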
International Conference on Social Robotics | 2011
Maha Salem; Friederike Anne Eyssel; Katharina J. Rohlfing; Stefan Kopp; Frank Joublin
Previous work has shown that gestural behaviors affect anthropomorphic inferences about artificial communicators such as virtual agents. In an experiment with a humanoid robot, we investigated to what extent gesture would affect anthropomorphic inferences about the robot. In particular, we examined the effects of the robot's hand and arm gestures on the attribution of typically human traits, likability of the robot, shared reality, and future contact intentions after interacting with the robot. For this, we manipulated the non-verbal behaviors of the humanoid robot in three experimental conditions: (1) no gesture, (2) congruent gesture, and (3) incongruent gesture. We hypothesized higher ratings on all dependent measures in the two gesture (vs. no gesture) conditions. The results confirm our predictions: when the robot used gestures during interaction, it was anthropomorphized more, participants perceived it as more likable, reported greater shared reality with it, and expressed stronger future contact intentions than when the robot gave instructions without using gestures. Surprisingly, this effect was particularly pronounced when the robot's gestures were partly incongruent with its speech. These findings show that communicative non-verbal behaviors in robotic systems affect both anthropomorphic perceptions and the mental models humans form of a humanoid robot during interaction.
Intelligent Robots and Systems | 2006
Martin Heckmann; Tobias Rodemann; Frank Joublin; Christian Goerick; Björn Schölling
We propose a new approach for binaural sound source localization in real-world environments, implementing a new model of the precedence effect. This enables the robust measurement of the localization cue values (ITD, IID, and IED) in echoic environments. The system is inspired by the auditory system of mammals. It uses a Gammatone filterbank for preprocessing and extracts the ITD and IED cues via zero crossings (the IID calculation is straightforward). The mapping between the cue values and the different angles is learned offline, which facilitates the adaptation to different head geometries. The performance of the system is demonstrated by localization results for two simultaneous speakers and for the mixture of a speaker, music, and fan noise in a normal meeting room. A real-time demonstrator of the system is presented in Rodemann et al. (2006).
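The sketch below illustrates the zero-crossing idea for a single frequency channel: rising zero crossings of the band-passed left and right signals are matched and their median offset taken as the channel's ITD. This is a deliberately simplified, hedged reconstruction, not the authors' code; in particular, the naive index-based matching ignores the precedence-effect model described above.

```python
# Hedged, simplified reconstruction (not the authors' code): per-channel ITD from
# rising zero crossings of the band-passed left and right signals.
import numpy as np

def rising_zero_crossings(x: np.ndarray) -> np.ndarray:
    """Indices where the signal crosses zero from below."""
    return np.where((x[:-1] < 0) & (x[1:] >= 0))[0]

def itd_from_zero_crossings(left: np.ndarray, right: np.ndarray, fs: float) -> float:
    """Median time offset (seconds) between naively matched rising zero crossings."""
    zl, zr = rising_zero_crossings(left), rising_zero_crossings(right)
    n = min(len(zl), len(zr))
    if n == 0:
        return 0.0
    return float(np.median(zr[:n] - zl[:n])) / fs
```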
International Conference on Acoustics, Speech, and Signal Processing | 2008
Xavier Domont; Martin Heckmann; Frank Joublin; Christian Goerick
Previously we presented an auditory-inspired feed-forward architecture which achieves good performance in noisy conditions on a segmented word recognition task. In this paper we propose to use a modified version of this hierarchical model to generate features for standard hidden Markov models. To obtain these features we first compute spectrograms using a Gammatone filterbank. A filtering over the channels enhances the formant frequencies, which are afterwards detected using Gabor-like receptive fields. The responses of the receptive fields are then combined into complex features which span the whole frequency range and extend over three different time windows. The features have been evaluated on a single digit recognition task. The results show that their combination with MFCCs or RASTA features yields improved recognition scores in noise.
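To illustrate the receptive-field step, the sketch below builds a small bank of Gabor-like spectro-temporal kernels and computes their rectified responses on a spectrogram. Kernel size, wavelength, aspect ratio, and orientations are assumed for illustration and are not the parameters used in the paper.

```python
# Illustrative sketch only (parameters are assumed, not taken from the paper):
# rectified responses of a small bank of Gabor-like spectro-temporal kernels.
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size: int, wavelength: float, angle_deg: float, sigma: float) -> np.ndarray:
    """2-D Gabor kernel over the (frequency, time) plane."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(angle_deg)
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + 0.5 * yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

def receptive_field_responses(spectrogram: np.ndarray,
                              angles=(0.0, 45.0, 90.0, 135.0)) -> list:
    """Half-wave rectified responses of oriented Gabor filters on the spectrogram."""
    return [np.maximum(convolve2d(spectrogram, gabor_kernel(9, 4.0, a, 2.0),
                                  mode="same"), 0.0)
            for a in angles]
```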
Intelligent Robots and Systems | 2006
Antonello Ceravola; Frank Joublin; Mark Dunn; Julian Eggert; Marcus Stein; Christian Goerick
In the field of intelligent systems, research and design approaches vary from predefined architectures to self-organizing systems. Regardless of the architectural approach, such systems may grow in size and complexity to levels that strongly challenge the capacities of the people building them. Such systems are commonly researched, designed and developed following several methods and with the help of a variety of software tools. In this paper we describe our research and development environment. It is composed of a set of tools that support our research and enable us to develop the large-scale intelligent systems used in our robots and in our test platforms. The main parts of our research and development environment are: the component models BBCM (Brain Bytes Component Model) and BBDM (Brain Bytes Data Model), the middleware RTBOS (Real-Time Brain Operating System), the monitoring system CMBOS (Control-Monitor Brain Operating System) and the design environment DTBOS (Design Tool for Brain Operating System). We compare our research and development environment with others available on the market or still in the research phase, and we describe some of our experiments.
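Since the BBCM/RTBOS internals are not described here, the following is a purely hypothetical sketch of the kind of minimal component interface such a component-model middleware could expose: typed ports, a connect operation, and an execute step that the middleware would schedule periodically.

```python
# Purely hypothetical sketch (BBCM/RTBOS internals are not described here): a
# minimal component interface with ports, connections, and an execute step.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

@dataclass
class Port:
    name: str
    value: Any = None

@dataclass
class Component:
    name: str
    inputs: Dict[str, Port] = field(default_factory=dict)
    outputs: Dict[str, Port] = field(default_factory=dict)
    step: Optional[Callable[["Component"], None]] = None

    def execute(self) -> None:
        """Run one processing cycle; a real middleware would schedule this periodically."""
        if self.step is not None:
            self.step(self)

def connect(src: Component, out_port: str, dst: Component, in_port: str) -> None:
    """Wire a source output to a sink input by sharing the same Port object."""
    dst.inputs[in_port] = src.outputs[out_port]
```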