Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Sebastian Gergen is active.

Publication


Featured research published by Sebastian Gergen.


Conference of the International Speech Communication Association | 2016

Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR.

Sebastian Gergen; Steffen Zeiler; Ahmed Hussen Abdelaziz; Robert M. Nickel; Dorothea Kolossa

Automatic speech recognition (ASR) enables very intuitive human-machine interaction. However, signal degradations due to reverberation or noise reduce the accuracy of audio-based recognition. The introduction of a second signal stream that is not affected by degradations in the audio domain (e.g., a video stream) increases the robustness of ASR against degradations in the original domain. Here, depending on the signal quality of audio and video at each point in time, a dynamic weighting of both streams can optimize the recognition performance. In this work, we introduce a strategy for estimating optimal weights for the audio and video streams in turbo-decoding-based ASR using a discriminative cost function. The results show that turbo decoding with this maximally discriminative dynamic weighting of information yields higher recognition accuracy than turbo-decoding-based recognition with fixed stream weights or optimally dynamically weighted audiovisual decoding using coupled hidden Markov models.
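The dynamic weighting idea can be pictured as a time-varying convex combination of the audio and video stream log-likelihoods. The sketch below only illustrates that combination step with made-up likelihood values and weights; the paper's discriminative weight estimation inside the turbo decoder is not reproduced, and all names (combine_stream_loglikes, weights) are hypothetical.

```python
import numpy as np

def combine_stream_loglikes(ll_audio, ll_video, weights):
    """Combine per-frame audio and video state log-likelihoods with a
    time-varying stream weight lambda_t in [0, 1]:
        fused_t(s) = lambda_t * ll_audio_t(s) + (1 - lambda_t) * ll_video_t(s)
    Shapes: (T, S) for the log-likelihoods, (T,) for the weights."""
    lam = weights[:, None]                      # broadcast over states
    return lam * ll_audio + (1.0 - lam) * ll_video

# Toy example: 4 frames, 3 HMM states, hypothetical log-likelihoods.
rng = np.random.default_rng(0)
ll_audio = rng.normal(-5.0, 1.0, size=(4, 3))
ll_video = rng.normal(-5.0, 1.0, size=(4, 3))

# A reliability-driven weight, e.g. high when the audio stream is clean.
weights = np.array([0.9, 0.7, 0.3, 0.1])

combined = combine_stream_loglikes(ll_audio, ll_video, weights)
print(combined.shape)  # (4, 3): one fused log-likelihood per frame and state
```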


International Conference on Acoustics, Speech, and Signal Processing | 2013

Audio signal classification in reverberant environments based on fuzzy-clustered ad-hoc microphone arrays

Sebastian Gergen; Anil M. Nagathil; Rainer Martin

Audio signal classification suffers from a mismatch of environmental conditions when training data is based on clean and anechoic signals and test data is distorted by reverberation and signals from other sources. In this contribution we analyze the classification performance for such a scenario with two concurrently active sources in a simulated reverberant environment. To obtain robust classification results, we exploit the spatial distribution of ad-hoc microphone arrays to capture the signals and extract cepstral features. Based on these features only, we use unsupervised fuzzy clustering to estimate clusters of microphones which are dominated by one of the sources. The probability of cluster membership for each microphone is provided by the fuzzy clustering algorithm and is used to compute a weighted average of the feature vectors. Finally, signal classification based on clean and anechoic training data is performed for each of the clusters. It is shown that the proposed method exceeds the performance of classification based on single microphones.
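A minimal sketch of the clustering step, assuming each microphone is already represented by a cepstral feature vector: a generic, textbook fuzzy c-means assigns every microphone soft memberships to the clusters, and the memberships then weight a per-cluster feature average. This illustrates the general technique, not the paper's exact procedure; the data and names are invented.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Generic fuzzy c-means: X has shape (n_mics, n_features).
    Returns soft memberships U of shape (n_mics, n_clusters) and centers."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(n_clusters), size=X.shape[0])  # random memberships
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (dist ** (2.0 / (m - 1.0)))                # u ~ d^(-2/(m-1))
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

# Hypothetical cepstral features for 8 microphones (13 coefficients each).
rng = np.random.default_rng(1)
features = np.vstack([rng.normal(0, 1, (4, 13)),     # mics dominated by source A
                      rng.normal(3, 1, (4, 13))])    # mics dominated by source B

U, centers = fuzzy_c_means(features, n_clusters=2)

# Membership-weighted feature average per cluster, as input for classification.
cluster_features = (U.T @ features) / U.sum(axis=0)[:, None]
print(U.round(2))              # soft cluster memberships per microphone
print(cluster_features.shape)  # (2, 13): one feature vector per cluster
```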


Signal Processing | 2015

Classification of reverberant audio signals using clustered ad hoc distributed microphones

Sebastian Gergen; Anil M. Nagathil; Rainer Martin

In a real-world scenario, the automatic classification of audio signals constitutes a difficult problem. Often, reverberation and interfering sounds reduce the quality of a target source signal. This results in a mismatch between test and training data when a classifier is trained on clean and anechoic data. To classify disturbed signals more accurately we make use of the spatial distribution of microphones from ad hoc microphone arrays. In the proposed algorithm, clusters of microphones that either are dominated by one of the sources in an acoustic scenario or contain mainly signal mixtures and reverberation are estimated in the audio feature domain. Information is shared within and between these clusters to create one feature vector for each cluster to classify the source dominating this cluster. We evaluate the algorithm using simultaneously active sound sources and different ad hoc microphone arrays in simulated reverberant scenarios and multichannel recordings of an ad hoc microphone setup in a real environment. The cluster-based classification accuracy is higher than the accuracy based on single microphone signals and allows for a robust classification of simultaneously active sources in reverberant environments.

Highlights:
- Ad hoc microphone arrays are used to classify audio sources in a reverberant environment.
- Clusters of microphones are estimated and used for information exchange in the feature domain.
- Cluster-based processing allows for a classification of audio sources with high accuracy.
- The evaluation is based on simulated and recorded reverberant audio data.
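To illustrate the final stage, the sketch below trains a simple classifier on hypothetical clean, anechoic features and then classifies one membership-weighted feature vector per cluster. The choice of classifier (Gaussian naive Bayes) and all data are placeholders and are not taken from the paper.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(2)

# Hypothetical clean, anechoic training features for two audio classes.
X_train = np.vstack([rng.normal(0, 1, (50, 13)),    # class 0 (e.g. speech)
                     rng.normal(3, 1, (50, 13))])   # class 1 (e.g. music)
y_train = np.array([0] * 50 + [1] * 50)

clf = GaussianNB().fit(X_train, y_train)

# One membership-weighted feature vector per microphone cluster
# (e.g. the cluster_features computed in the previous sketch).
cluster_features = np.vstack([rng.normal(0, 1, 13),
                              rng.normal(3, 1, 13)])

# Each cluster is classified as the source that dominates it.
print(clf.predict(cluster_features))   # e.g. [0 1]
```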


International Conference on Signal Processing | 2012

An optimized parametric model for the simulation of reverberant microphone signals

Sebastian Gergen; Christian Borss; Nilesh Madhu; Rainer Martin

In 2011, Borß introduced a parametric model for the design of virtual acoustics, which creates a natural-sounding virtual environment for applications requiring virtualization, e.g., in teleconferencing systems and computer games. In this work we refine this model to make it applicable to the simulation of room acoustics and reverberation, to aid in the development of single- and multi-channel audio signal enhancement systems. The model takes into account early reflections with a frequency-dependent attenuation and the diffuse character of late reverberation with its coherence characteristics, and provides predefined rooms and reverberation times according to a norm defined by the German Institute for Standardization (DIN), to ensure a high degree of realism and usability. Compared to the standard image source model for generating virtual acoustics, the proposed system generates a more realistic virtual acoustic environment.
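As a rough picture of the two ingredients such a parametric model combines, the following sketch builds a toy room impulse response from a direct path, a few discrete early reflections, and exponentially decaying noise as diffuse late reverberation. It is a heavily simplified stand-in (no frequency-dependent attenuation, no inter-channel coherence modelling), and every parameter value is invented.

```python
import numpy as np

def simple_parametric_rir(fs=16000, rt60=0.5, direct_delay=0.005,
                          early=((0.012, 0.6), (0.021, 0.4), (0.033, 0.3)),
                          length=1.0, seed=0):
    """Toy room impulse response: direct path + sparse early reflections
    + exponentially decaying white noise as diffuse late reverberation."""
    rng = np.random.default_rng(seed)
    n = int(length * fs)
    h = np.zeros(n)
    h[int(direct_delay * fs)] = 1.0                       # direct sound
    for delay, gain in early:                             # early reflections
        h[int(delay * fs)] += gain
    t = np.arange(n) / fs
    decay = np.exp(-6.91 * t / rt60)                      # -60 dB after rt60
    late_start = int(0.05 * fs)                           # diffuse tail onset
    h[late_start:] += 0.2 * rng.standard_normal(n - late_start) * decay[late_start:]
    return h

rir = simple_parametric_rir()
# Reverberant microphone signal = dry source convolved with the RIR.
dry = np.random.default_rng(1).standard_normal(16000)
reverberant = np.convolve(dry, rir)[:len(dry)]
print(reverberant.shape)
```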


Multimedia Signal Processing | 2017

Analysis of temporal aggregation and dimensionality reduction on feature sets for speaker identification in wireless acoustic sensor networks

Alexandru Nelus; Sebastian Gergen; Rainer Martin

In this paper we analyze the impact of temporal feature aggregation and feature dimensionality reduction on the performance of speaker identification tasks. We investigate these two processing steps in the context of the communication-layer constraints, such as a limited bitrate, and the node-level privacy constraints of a wireless acoustic sensor network. To this end, we extract Modulation-MFCC features and state-of-the-art i-vectors for speaker identification, and investigate temporal aggregation and dimensionality reduction in the feature extraction process. In the evaluation, we use clean data as well as reverberant data to assess the feature sets for different application scenarios. It is found that temporal aggregation has a positive effect on speaker identification performance while respecting the aforementioned privacy constraints, and that Linear Discriminant Analysis can be successfully employed for dimensionality reduction.
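The two processing steps under study can be sketched as follows: frame-level features are aggregated over a temporal window (here simply mean and standard deviation) and then projected to a lower-dimensional space with Linear Discriminant Analysis. The window length, feature dimension, and data below are placeholders, not the paper's configuration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def temporal_aggregate(frames, window=50):
    """Aggregate frame-level features (T, D) into one vector per window
    by stacking the per-window mean and standard deviation -> (N, 2*D)."""
    T, D = frames.shape
    n_windows = T // window
    out = np.empty((n_windows, 2 * D))
    for i in range(n_windows):
        seg = frames[i * window:(i + 1) * window]
        out[i] = np.concatenate([seg.mean(axis=0), seg.std(axis=0)])
    return out

rng = np.random.default_rng(0)
n_speakers, D = 5, 20                        # hypothetical feature dimension

X, y = [], []
for spk in range(n_speakers):
    frames = rng.normal(spk, 1.0, size=(500, D))   # fake per-speaker frames
    agg = temporal_aggregate(frames)
    X.append(agg)
    y.extend([spk] * len(agg))
X, y = np.vstack(X), np.array(y)

# LDA reduces to at most (n_speakers - 1) dimensions while keeping
# the directions that discriminate between speakers.
lda = LinearDiscriminantAnalysis(n_components=n_speakers - 1)
X_red = lda.fit_transform(X, y)
print(X.shape, "->", X_red.shape)            # (50, 40) -> (50, 4)
```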


4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA) | 2014

A hierarchical approach for the online, on-board detection and localisation of brake squeal using microphone arrays

Nilesh Madhu; Rainer Martin; Heinz-Werner Rehn; Sebastian Gergen; A. Fischer

We present a hierarchical approach for the detection and localisation of brake squeal. The proposed system exploits the spatial diversity of microphone arrays to localise a squealing brake. As brake squeal is emitted from a priori known regions, i.e., near the wheels, localisation of a squeal may be seen as a hypothesis testing problem. However, in contrast to standard hypothesis testing approaches, the propagation environment is complex and time-varying, making modelling difficult. Additionally, there are inaccuracies in sensor positioning and source position knowledge, as well as sensor gain mismatches. Thus, standard approaches fail in this case. We therefore develop a robust approach that implicitly accounts for such incomplete system knowledge. The algorithm detects squeal events (which may overlap in time and frequency) and performs localisation by allocating to each source (brake) a measure that indicates its contribution to the acoustic event. The algorithm is evaluated in a real setting.
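One way to picture the detection and allocation idea from the abstract: squeal candidates are found as strong narrowband peaks in the spectrogram, and each candidate is attributed to the hypothesised source (wheel) whose associated microphones carry the most energy. This is a loose schematic, not the authors' algorithm; the array geometry, signals, thresholds, and the wheel_mics mapping are invented.

```python
import numpy as np
from scipy.signal import stft

fs = 48000
rng = np.random.default_rng(0)

# Hypothetical 4-microphone recording: noise plus a 2.5 kHz squeal
# that is strongest at microphones 0 and 1 (near the front-left wheel).
t = np.arange(fs) / fs
squeal = np.sin(2 * np.pi * 2500 * t)
gains = np.array([1.0, 0.8, 0.2, 0.1])               # per-mic squeal level
mics = 0.05 * rng.standard_normal((4, fs)) + gains[:, None] * squeal

# Wheels hypothetically associated with microphone subsets.
wheel_mics = {"front_left": [0, 1], "front_right": [2, 3]}

f, _, Z = stft(mics, fs=fs, nperseg=1024)            # (mics, freqs, frames)
power = np.abs(Z) ** 2

# Detect narrowband squeal candidates: bins far above the median level.
mean_spec = power.mean(axis=(0, 2))
candidates = np.where(mean_spec > 50 * np.median(mean_spec))[0]

# Allocate each candidate to the wheel whose microphones carry most energy.
for bin_idx in candidates:
    scores = {w: power[m, bin_idx, :].mean() for w, m in wheel_mics.items()}
    print(f"{f[bin_idx]:.0f} Hz squeal -> {max(scores, key=scores.get)}")
```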


ITG Symposium on Speech Communication | 2016

Towards Opaque Audio Features for Privacy in Acoustic Sensor Networks.

Alexandru Nelus; Sebastian Gergen; Jalal Taghia; Rainer Martin


Conference of the International Speech Communication Association | 2015

Reduction of reverberation effects in the MFCC modulation spectrum for improved classification of acoustic signals.

Sebastian Gergen; Anil M. Nagathil; Rainer Martin


ITG Symposium on Speech Communication | 2016

Estimating Source Dominated Microphone Clusters in Ad-Hoc Microphone Arrays by Fuzzy Clustering in the Feature Space.

Sebastian Gergen; Rainer Martin


ITG Symposium on Speech Communication | 2016

New Insights into Turbo-Decoding-Based AVSR with Dynamic Stream Weights.

Sebastian Gergen; Steffen Zeiler; Ahmed Hussen Abdelaziz; Dorothea Kolossa

Collaboration


Dive into Sebastian Gergen's collaborations.

Top Co-Authors

Nilesh Madhu

Katholieke Universiteit Leuven
