Jürgen T. Geiger
Huawei
Publication
Featured research published by Jürgen T. Geiger.
european signal processing conference | 2015
Jürgen T. Geiger; Karim Helwani
Acoustic event detection in surveillance scenarios is an important but difficult problem. Realistic systems struggle with noisy recording conditions. In this work, we propose to use Gabor filterbank features to detect target events in different noisy background scenes. These features capture spectro-temporal modulation frequencies in the signal, which makes them well suited to the detection of non-stationary sound events. A single-class detector is constructed for each of the different target events. In a hierarchical framework, the separate detectors are combined into a multi-class detector. Experiments are performed using a database of four different target sounds and four background scenarios. On average, the proposed features outperform conventional features at all tested noise levels, in terms of both detection and classification performance.
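The core idea of spectro-temporal Gabor features can be illustrated with a minimal numpy sketch (not the paper's implementation; filter sizes and modulation rates below are illustrative): a complex 2D Gabor filter is correlated with a spectrogram patch, and the response is large only when the patch carries the filter's modulation pattern.

```python
import numpy as np

def gabor_filter_2d(n_freq, n_time, omega_f, omega_t, sigma_f, sigma_t):
    """Complex 2D Gabor filter: a Gaussian envelope modulated by a plane
    wave whose (omega_f, omega_t) rates select one spectral and one
    temporal modulation frequency."""
    f = np.arange(n_freq) - n_freq // 2
    t = np.arange(n_time) - n_time // 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-F**2 / (2 * sigma_f**2) - T**2 / (2 * sigma_t**2))
    carrier = np.exp(1j * (omega_f * F + omega_t * T))
    return envelope * carrier

# Filter tuned to a purely temporal modulation of 0.5 rad/frame
g = gabor_filter_2d(25, 25, 0.0, 0.5, 4.0, 4.0)

# Matched-filter response of a spectrogram patch: large for a patch that
# carries the filter's modulation, small for a stationary (flat) patch.
t = np.arange(25) - 12
patch_mod = np.tile(np.cos(0.5 * t), (25, 1))   # temporally modulated
patch_flat = np.ones((25, 25))                  # stationary background
resp_mod = np.abs(np.sum(patch_mod * np.conj(g)))
resp_flat = np.abs(np.sum(patch_flat * np.conj(g)))
```

A filterbank of such filters at different modulation rates, applied to a log-mel spectrogram, yields the kind of feature set the abstract describes.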
ACM Transactions on Intelligent Systems and Technology | 2018
Zixing Zhang; Jürgen T. Geiger; Jouni Pohjalainen; Amr El-Desoky Mousa; Wenyu Jin; Björn W. Schuller
Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition but still remains an important challenge. Data-driven supervised approaches, especially those based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech, with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks. Along the way, we discuss the pros and cons of these approaches and present their experimental results on benchmark databases. We expect that this overview can facilitate the development of robust speech recognition systems for noisy acoustic environments.
international conference on acoustics, speech, and signal processing | 2017
Kainan Chen; Jürgen T. Geiger; Walter Kellermann
Most multichannel sound source Direction of Arrival (DOA) estimation algorithms suffer from spatial aliasing. The phase differences between a pair of microphones are wrapped beyond the spatial aliasing frequency. A common solution is to adjust the distance between the microphones to obtain a suitable aliasing frequency, and to use only the frequency band below the aliasing frequency for localization. With correct phase unwrapping, a broader frequency band can be utilized for localization. In this paper, we investigate a phase unwrapping method that solves the spatial aliasing problem for scenarios with a single source and high-level diffuse background noise (around 0 dB SNR). The aliasing frequency is estimated from the signal and is used to unwrap a phase difference vector. Pre- and post-processing steps are applied to increase the robustness. Our experiments with a large number of simulated and real signals demonstrate the robustness of our method in noise.
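The wrapping effect and why unwrapping widens the usable band can be sketched with an idealized, noise-free numpy example (illustrative only, not the paper's estimator): above the aliasing frequency the observed phase difference jumps by 2π, but after unwrapping, the full band contributes to a TDOA (and hence DOA) estimate.

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def wrapped_phase_diff(freqs, mic_dist, doa_deg):
    """Ideal inter-microphone phase difference for a far-field source,
    wrapped to (-pi, pi] as it would be observed from a cross-spectrum."""
    tau = mic_dist * np.cos(np.deg2rad(doa_deg)) / C  # TDOA in seconds
    return np.angle(np.exp(2j * np.pi * freqs * tau))  # wrapping

freqs = np.linspace(100.0, 8000.0, 512)
d = 0.08  # 8 cm spacing: endfire aliasing frequency c/(2d) is ~2.1 kHz
phi_wrapped = wrapped_phase_diff(freqs, d, 30.0)

# Unwrap along frequency, then estimate the TDOA from the linear slope
# and convert it back to a DOA; without unwrapping, only bins below the
# aliasing frequency could be used this way.
phi_unwrapped = np.unwrap(phi_wrapped)
tau_est = np.polyfit(freqs, phi_unwrapped, 1)[0] / (2.0 * np.pi)
doa_est = np.rad2deg(np.arccos(np.clip(tau_est * C / d, -1.0, 1.0)))
```

In the noisy, real-signal setting of the paper, this naive frequency-continuity unwrapping fails, which is what motivates estimating the aliasing frequency from the signal and adding pre- and post-processing.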
international conference on acoustics, speech, and signal processing | 2016
Kainan Chen; Jürgen T. Geiger; Karim Helwani; Mohammad Javad Taghizadeh
Methods are available for simultaneous localization of multiple (unknown) audio sources using microphone arrays. Typical algorithms aim at localizing all active sources. They moreover require that the number of sources is known and is less than or equal to the number of microphones. This constraint cannot be satisfied in many real-life situations and noisy environments. We present an algorithm for localizing an audio source with known statistics in a multi-source environment. The proposed method circumvents these problems by using a phase-preserving signal extraction method on the input signal. A binary mask is estimated and used to retain only the information of the target source in the original microphone signals. The masked signals are fed to a modified version of a conventional localization algorithm, which now localizes only the target source. Experimental results obtained from real recordings show that the proposed method can successfully detect and localize the target source.
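The phase-preserving masking step can be sketched as follows (a minimal numpy illustration, assuming per-bin target and interferer power estimates are available; it is not the paper's mask estimator): a single real-valued binary mask is applied to every channel's STFT, so the inter-channel phase of the kept bins, which the downstream localizer relies on, is untouched.

```python
import numpy as np

def binary_mask_extract(mic_stft, target_psd, interferer_psd, thresh=1.0):
    """Zero out time-frequency bins where the target does not dominate.
    The same real-valued mask is applied to every microphone channel, so
    the original phases of the kept bins are preserved for localization."""
    snr = target_psd / (interferer_psd + 1e-12)
    mask = (snr > thresh).astype(float)   # binary mask, shape (freq, time)
    return mic_stft * mask                # broadcast over the channel axis

# Toy example: 2 channels, 3 frequency bins, 2 frames
rng = np.random.default_rng(0)
stft = rng.standard_normal((2, 3, 2)) + 1j * rng.standard_normal((2, 3, 2))
target = np.array([[4.0, 4.0],    # bin 0: target dominates in both frames
                   [0.1, 0.1],    # bin 1: interferer dominates
                   [4.0, 0.1]])   # bin 2: mixed
noise = np.ones((3, 2))
masked = binary_mask_extract(stft, target, noise)
```

Feeding such masked multichannel signals to a conventional localizer is what lets it lock onto the target source alone.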
international conference on acoustics, speech, and signal processing | 2017
Milos Markovic; Jürgen T. Geiger
We present a system for acoustic scene classification, the task of classifying an environment based on audio recordings. First, we describe a strong low-complexity baseline system using a compact feature set. Second, this system is improved with a novel class of audio features that exploits knowledge of sound behaviour within the scene: reverberation. This information is complementary to commonly used features for acoustic scene classification, such as spectral or cepstral components. To extract the new features, temporal peaks in the audio signal are detected, and the decay after each peak reveals information about the reverberation properties. For the detected decays, statistics are extracted and summarized over time and over frequency bands. The combination of the novel features with features used in state-of-the-art algorithms for acoustic scene classification increases the classification accuracy, as our results obtained with a large in-house database and the DCASE 2016 database demonstrate.
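The peak-and-decay idea can be sketched in a few lines of numpy (an illustrative simplification of the abstract's description, operating on a single log-magnitude envelope; the paper works per frequency band): detect local maxima, fit a line to the dB trajectory after each peak, and summarize the slopes.

```python
import numpy as np

def decay_features(env_db, peak_thresh_db=-20.0, decay_len=20):
    """Detect local maxima in a log-magnitude envelope and fit a line to
    the dB decay following each peak; statistics of the slopes (dB/frame)
    summarize the reverberation of the scene."""
    slopes = []
    for t in range(1, len(env_db) - decay_len):
        is_peak = env_db[t] > env_db[t - 1] and env_db[t] >= env_db[t + 1]
        if is_peak and env_db[t] > peak_thresh_db:
            seg = env_db[t:t + decay_len]
            slopes.append(np.polyfit(np.arange(decay_len), seg, 1)[0])
    if not slopes:
        return np.zeros(2)
    return np.array([np.mean(slopes), np.std(slopes)])

# Synthetic envelope: noise floor at -60 dB with one peak at frame 10
# that decays linearly at -2 dB/frame for 30 frames.
env = np.full(100, -60.0)
for k in range(31):
    env[10 + k] = -2.0 * k
feats = decay_features(env, peak_thresh_db=-20.0, decay_len=20)
```

A reverberant room yields shallow decay slopes, a dry one steep slopes, which is why such statistics complement spectral features for scene classification.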
european signal processing conference | 2015
Jürgen T. Geiger; Peter Grosche; Yesenia Lacouture Parodi
Studies show that many people have difficulty understanding movie dialogue when watching TV, especially hard-of-hearing listeners or those in adverse listening environments. In order to overcome this problem, we propose an efficient method to enhance the speech component of a stereo signal. The method is designed with low computational complexity in mind and consists of first extracting a center channel from the stereo signal. Novel methods for speech enhancement and voice activity detection are proposed which exploit the stereo information. A speech enhancement filter is estimated based on the relationship between the extracted center channel and all other channels. Subjective and objective evaluations show that this method can successfully enhance the intelligibility of the dialogue without negatively affecting the overall sound quality.
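The center-channel extraction step can be illustrated with a minimal numpy sketch (one plausible coherence-based gain, not the paper's exact method): per STFT bin, content that is correlated and in phase across the left and right channels, which is where dialogue is typically panned, passes with a gain near 1, while out-of-phase side content is suppressed.

```python
import numpy as np

def extract_center(L, R):
    """Per-bin center-channel extraction for a stereo STFT. The gain
    2*Re(L*conj(R)) / (|L|^2 + |R|^2) is ~1 for identical in-phase
    signals and <= 0 (clipped to 0) for out-of-phase side content."""
    num = 2.0 * np.real(L * np.conj(R))
    den = np.abs(L) ** 2 + np.abs(R) ** 2 + 1e-12
    gain = np.clip(num / den, 0.0, 1.0)
    return gain * 0.5 * (L + R)

# Toy STFT bins: identical channels (center content) vs. polarity-inverted
# channels (pure side content).
rng = np.random.default_rng(1)
speech = rng.standard_normal(64) + 1j * rng.standard_normal(64)
center_in_phase = extract_center(speech, speech)    # L == R
center_out_phase = extract_center(speech, -speech)  # L == -R
```

An enhancement filter estimated from this extracted center against the remaining channels, as the abstract describes, can then boost dialogue relative to music and effects.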
international conference on acoustics, speech, and signal processing | 2013
Felix Weninger; Jürgen T. Geiger; Martin Wöllmer; Björn W. Schuller; Gerhard Rigoll
Proceedings CHiME 2013 | 2013
Jürgen T. Geiger; Felix Weninger; Antti Hurmalainen; Jort F. Gemmeke; Martin Wöllmer; Björn W. Schuller; Gerhard Rigoll; Tuomas Virtanen
european signal processing conference | 2012
Jürgen T. Geiger; Ravichander Vipperla; Nicholas W. D. Evans; Björn W. Schuller; Gerhard Rigoll
workshop on applications of signal processing to audio and acoustics | 2017
Kainan Chen; Jürgen T. Geiger; Wenyu Jin; Mohammad Javad Taghizadeh; Walter Kellermann