Publications


Featured research published by Axel Plinge.


International Conference on Acoustics, Speech, and Signal Processing | 2014

A Bag-of-Features approach to acoustic event detection

Axel Plinge; Rene Grzeszick; Gernot A. Fink

The classification of acoustic events in indoor environments is an important task for many practical applications in smart environments. This paper proposes a novel approach to classifying acoustic events that is based on Bag-of-Features. Mel and gammatone frequency cepstral coefficients, which originate from psychoacoustic models, are used as input features for the Bag-of-Features representation. Rather than using a prior classification or segmentation step to eliminate silence and background noise, Bag-of-Features representations are learned for a background class. Supervised learning of codebooks and temporal coding are shown to improve the recognition rates. Three different databases are used for the experiments: the CLEAR sound event dataset, the D-CASE event dataset and a new set of smart room recordings.
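The basic Bag-of-Features step mentioned in this abstract can be sketched as follows: each frame's feature vector is assigned to its nearest codeword, and the assignments are collected into a normalized histogram for the window. This is only an illustrative sketch. The two-dimensional toy features, the hand-picked codebook, and the function names are assumptions; the paper uses cepstral features and learned codebooks, which are not reproduced here.

```python
from typing import List, Sequence


def nearest_codeword(frame: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codeword closest to the frame
    (squared Euclidean distance)."""
    distances = [sum((f - c) ** 2 for f, c in zip(frame, word))
                 for word in codebook]
    return distances.index(min(distances))


def bag_of_features(frames: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]]) -> List[float]:
    """Hard-quantize every frame and return a histogram that sums to 1."""
    if not frames:
        raise ValueError("at least one frame is required")
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    return [count / len(frames) for count in counts]


# Toy data (assumed values, not taken from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]]
histogram = bag_of_features(frames, codebook)
```

The resulting histogram is what a classifier would consume; a background class, as described in the abstract, would simply be one more class trained on histograms of silence and noise windows.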


International Conference on Acoustics, Speech, and Signal Processing | 2014

Multi-speaker tracking using multiple distributed microphone arrays

Axel Plinge; Gernot A. Fink

Tracking multiple speakers with microphone arrays is one of the key tasks in smart environments. For good accuracy in reverberant environments, several arrays should be distributed in the room. The presented method uses distributed nodes with microphone arrays that compute local angular speech detections. In an integrating node, these detections are associated using their spectra, and tracks for multiple concurrent speakers are computed. Euclidean coordinates are derived by triangulation, which is improved by quality-based weighting. The method is robust not only against reverberation, but also against transmission errors and jitter. Tests with real recordings show that good precision for practical applications can be achieved.
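To illustrate how angular detections from several nodes can be turned into Euclidean coordinates, the sketch below performs a weighted least-squares intersection of bearing lines in 2D. It is not the weighting scheme of the paper, which the abstract does not specify: the node positions, the weights, and the function name `triangulate` are assumptions, and angles are taken to be in a shared global frame.

```python
import math
from typing import Sequence, Tuple


def triangulate(nodes: Sequence[Tuple[float, float]],
                angles: Sequence[float],
                weights: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares intersection of 2D bearing lines.

    Each node at position p with bearing theta defines a line. The estimate
    minimizes sum_i w_i * (n_i . (x - p_i))^2, where n_i is the unit normal
    of line i. Bearings are treated as lines, not as one-sided rays.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), theta, w in zip(nodes, angles, weights):
        nx, ny = -math.sin(theta), math.cos(theta)  # normal to the bearing
        a11 += w * nx * nx
        a12 += w * nx * ny
        a22 += w * ny * ny
        proj = nx * px + ny * py
        b1 += w * nx * proj
        b2 += w * ny * proj
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearings are (nearly) parallel; no unique solution")
    # Solve the 2x2 normal equations with the explicit inverse.
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y


# Toy setup (assumed values): three nodes observe a source at (2, 2).
nodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
source = (2.0, 2.0)
angles = [math.atan2(source[1] - py, source[0] - px) for px, py in nodes]
weights = [1.0, 0.5, 0.8]  # hypothetical per-detection quality scores
estimate = triangulate(nodes, angles, weights)
```

With noise-free bearings the weights do not change the result; they matter once detections disagree, where low-quality bearings then pull the estimate less.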


IEEE Signal Processing Magazine | 2016

Acoustic Microphone Geometry Calibration: An overview and experimental evaluation of state-of-the-art algorithms

Axel Plinge; Florian Jacob; Gernot A. Fink

Today, we are often surrounded by devices with one or more microphones, such as smartphones, laptops, and wireless microphones. If they are part of an acoustic sensor network, their distribution in the environment can be beneficially exploited for various speech processing tasks. However, applications like speaker localization, speaker tracking, and speech enhancement by beamforming make use of the geometrical configuration of the sensors. Therefore, acoustic microphone geometry calibration has recently become a very active field of research. This article provides an application-oriented, comprehensive survey of existing methods for microphone position self-calibration, which are categorized by the measurements they use and the scenarios they can calibrate. Selected methods are evaluated comparatively with real-world recordings.


German Conference on Pattern Recognition | 2015

Temporal Acoustic Words for Online Acoustic Event Detection

Rene Grzeszick; Axel Plinge; Gernot A. Fink

The Bag-of-Features principle has proved successful in many pattern recognition tasks ranging from document analysis and image classification to gesture recognition and even forensic applications. Lately, these methods have emerged in the field of acoustic event detection and shown very promising results. The detection and classification of acoustic events is an important task for many practical applications like video understanding, surveillance or speech enhancement. In this paper, a novel approach for online acoustic event detection is presented that builds on the Bag-of-Features principle. Features are calculated for all frames in a given window. Applying the concept of feature augmentation, additional temporal information is encoded in each feature vector. These feature vectors are then softly quantized so that a Bag-of-Features representation is computed. These representations are evaluated by a classifier in a sliding window approach. The experiments on a challenging indoor dataset of acoustic events show that the proposed method yields state-of-the-art results compared to other online event detection methods. Furthermore, it is shown that the temporal feature augmentation significantly improves the recognition rates.
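The two ingredients named in this abstract, temporal feature augmentation and soft quantization, can be sketched as follows. This is one plausible reading rather than the paper's implementation: appending the normalized frame position as an extra feature dimension, the Gaussian soft assignment with width `sigma`, the toy codebook, and all function names are assumptions.

```python
import math
from typing import List, Sequence


def augment_with_time(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Append the relative position in the window (0..1) to each frame."""
    n = len(frames)
    return [list(f) + [i / (n - 1) if n > 1 else 0.0]
            for i, f in enumerate(frames)]


def soft_assign(frame: Sequence[float],
                codebook: Sequence[Sequence[float]],
                sigma: float = 1.0) -> List[float]:
    """Gaussian soft assignment of one frame to all codewords (sums to 1)."""
    dists = [sum((f - c) ** 2 for f, c in zip(frame, word))
             for word in codebook]
    d_min = min(dists)  # shift by the minimum to avoid underflow to all zeros
    sims = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in dists]
    total = sum(sims)
    return [s / total for s in sims]


def soft_bag_of_features(frames: Sequence[Sequence[float]],
                         codebook: Sequence[Sequence[float]],
                         sigma: float = 1.0) -> List[float]:
    """Average the soft assignments of all time-augmented frames."""
    if not frames:
        raise ValueError("at least one frame is required")
    augmented = augment_with_time(frames)
    hist = [0.0] * len(codebook)
    for frame in augmented:
        for k, resp in enumerate(soft_assign(frame, codebook, sigma)):
            hist[k] += resp
    return [h / len(augmented) for h in hist]


# Toy data (assumed): 1-D features; codewords live in the augmented 2-D space,
# so they encode both a feature value and a position in the window.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.0], [0.5], [1.0]]
histogram = soft_bag_of_features(frames, codebook)
```

Because the codewords include the time coordinate, the same feature value early and late in a window contributes to different histogram bins, which is how the temporal order survives the otherwise orderless histogram.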


International Workshop on Acoustic Signal Enhancement | 2014

Geometry calibration of multiple microphone arrays in highly reverberant environments

Axel Plinge; Gernot A. Fink

Microphone arrays can be used for a number of applications such as speaker diarization and tracking. For these, it is necessary to calibrate their geometry with good precision. Manual measurement is cumbersome and impractical for ad hoc configurations such as distributed sensor nodes. Thus, a fast automated calibration method that provides sufficient accuracy is required. It is even more convenient if data from the target application itself can be used so that the system can be calibrated online during its use. In this paper, we propose an automated geometry calibration method that outperforms existing state-of-the-art approaches. It does not require speakers at the nodes and works well in high reverberation. It was evaluated with real recordings in a smart room. By simply playing a white noise signal from a mobile phone at a few positions around the arrays, a calibration error below 2 cm and 2° was achieved. By identifying speech events at different positions, the same method can be applied online; here, an error of 10 cm and 3° was achieved.


IEEE Signal Processing Letters | 2017

Passive Online Geometry Calibration of Acoustic Sensor Networks

Axel Plinge; Gernot A. Fink; Sharon Gannot

As we are surrounded by an increasing number of mobile devices equipped with wireless links and multiple microphones, e.g., smartphones, tablets, laptops, and hearing aids, using them collaboratively for acoustic processing is a promising platform for emerging applications. These devices make up an acoustic sensor network comprised of nodes, i.e., distributed devices equipped with microphone arrays, a communication unit, and a processing unit. Algorithms for speaker separation and localization using such a network require precise knowledge of the nodes’ locations and orientations. To acquire this knowledge, a recently introduced approach proposed a combined direction of arrival and time difference of arrival (TDoA) target function for offline calibration with dedicated recordings. This letter proposes an extension of this approach to a novel online method with two new features: First, by employing an evolutionary algorithm on incremental measurements, it is online and fast enough for real-time application. Second, by using the sparse spike representation computed in a cochlear model for TDoA estimation, the amount of information shared between the nodes by transmission is reduced, while the accuracy is increased. The proposed approach is able to calibrate an acoustic sensor network online during a meeting in a reverberant conference room.


IEEE Transactions on Audio, Speech, and Language Processing | 2017

Bag-of-Features Methods for Acoustic Event Detection and Classification

Rene Grzeszick; Axel Plinge; Gernot A. Fink

The detection and classification of acoustic events in various environments is an important task. Its applications range from multimedia analysis to surveillance of humans or even animal life. Several of these tasks require the capability of online processing. Besides many other approaches that tackle the task of acoustic event detection, methods based on the well-known bag-of-features principle have also emerged in the field. Acoustic features are calculated for all frames in a given time window. Then, applying the bag-of-features concept, these features are quantized with respect to a learned codebook and a histogram representation is computed. Bag-of-features approaches are particularly interesting for online processing as they have a low computational cost. In this paper, the bag-of-features principle and various extensions are reviewed, including soft quantization, supervised codebook learning, and temporal modeling. Furthermore, Mel and Gammatone frequency cepstral coefficients that originate from psychoacoustic models are used as the underlying feature set for the bag-of-features. The possibility of fusing the results of multiple channels in order to improve the robustness is shown. Two databases are used for the experiments: the DCASE 2013 office live dataset and the ITC-IRST multichannel dataset.


Sensor Array and Multichannel Signal Processing Workshop | 2016

Multi-microphone speech enhancement informed by auditory scene analysis

Axel Plinge; Sharon Gannot

A multitude of multi-microphone speech enhancement methods is available. In this paper, we focus our attention on the well-known minimum variance distortionless response (MVDR) beamformer, due to its ability to preserve a distortionless response towards the desired speaker while minimizing the output noise power. We explore two alternatives for constructing the steering vectors towards the desired speech source. One uses only the direct path of the speech propagation in the form of delay-only filters, while the other uses the entire room impulse response (RIR). All beamforming methods require some control information to be able to accomplish the task of enhancing a desired speech signal. In this paper, an acoustic event detection method using biologically-inspired features is employed. It can interpret the auditory scene by detecting the presence of different auditory objects. This is employed to control the estimation procedures used by the beamformer. The resulting system provides a blind method of speech enhancement that can improve intelligibility independently of any additional information. Experiments with real recordings show the practical applicability of the method. A significant gain in fwSNRseg is achieved. Compared to using the direct path only, the use of the entire RIR proves beneficial.
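For reference, the textbook form of the MVDR beamformer and the two steering-vector choices contrasted in this abstract can be written as below. The notation is an assumption and not taken from the paper: $\boldsymbol{\Phi}_{vv}$ denotes the noise covariance matrix, $\mathbf{d}$ the steering vector, $\tau_m$ the direct-path delay to microphone $m$, and $H_m$ the acoustic transfer function, i.e. the Fourier transform of the RIR to microphone $m$.

```latex
\[
\begin{aligned}
\mathbf{w}_{\mathrm{MVDR}}(\omega)
  &= \arg\min_{\mathbf{w}} \; \mathbf{w}^{H}\,\boldsymbol{\Phi}_{vv}(\omega)\,\mathbf{w}
     \quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{d}(\omega) = 1 \\
  &= \frac{\boldsymbol{\Phi}_{vv}^{-1}(\omega)\,\mathbf{d}(\omega)}
          {\mathbf{d}^{H}(\omega)\,\boldsymbol{\Phi}_{vv}^{-1}(\omega)\,\mathbf{d}(\omega)}
\end{aligned}
\]

% Delay-only (direct path) steering vector, M microphones:
\[
\mathbf{d}_{\mathrm{delay}}(\omega)
  = \begin{bmatrix} e^{-j\omega\tau_1} & \cdots & e^{-j\omega\tau_M} \end{bmatrix}^{T}
\]

% Steering vector built from the entire room impulse response:
\[
\mathbf{d}_{\mathrm{RIR}}(\omega)
  = \begin{bmatrix} H_1(\omega) & \cdots & H_M(\omega) \end{bmatrix}^{T}
\]
```

The constraint $\mathbf{w}^{H}\mathbf{d} = 1$ is what makes the response towards the desired source distortionless; the control information mentioned in the abstract is needed to decide when $\boldsymbol{\Phi}_{vv}$ and $\mathbf{d}$ may be estimated.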


IEEE Transactions on Audio, Speech, and Language Processing | 2018

Distributed Expectation-Maximization Algorithm for Speaker Localization in Reverberant Environments

Yuval Dorfan; Axel Plinge; Gershon Hazan; Sharon Gannot

Localization of acoustic sources has attracted a considerable amount of research attention in recent years. A major obstacle to achieving high localization accuracy is the presence of reverberation, the influence of which obviously increases with the number of active speakers in the room. Human hearing is capable of localizing acoustic sources even in extreme conditions. In this study, we propose to combine a method based on human hearing mechanisms and a modified incremental distributed expectation-maximization (IDEM) algorithm. Rather than using phase difference measurements that are modeled by a mixture of complex-valued Gaussians, as proposed in the original IDEM framework, we propose to use time difference of arrival measurements in multiple subbands and model them by a mixture of real-valued truncated Gaussians. Moreover, we propose to first filter the measurements in order to reduce the effect of the multipath conditions. The proposed method is evaluated using both simulated data and real-life recordings.
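A mixture of truncated Gaussians over TDoA measurements, as mentioned in this abstract, can be written in generic form as below. This is a sketch under assumptions rather than the paper's exact model: the shared variance $\sigma^2$, the mixture weights $\pi_k$, and the omission of the subband index are simplifications. The truncation bounds follow from the physical limit $|\tau| \le \tau_{\max} = d/c$ for a microphone pair with spacing $d$ and speed of sound $c$; $\phi$ and $\Phi$ denote the standard normal density and distribution function.

```latex
\[
p(\tau) = \sum_{k=1}^{K} \pi_k \,
  \mathcal{TN}\!\left(\tau;\, \mu_k, \sigma^2, -\tau_{\max}, \tau_{\max}\right),
\qquad \tau_{\max} = \frac{d}{c}
\]

\[
\mathcal{TN}\!\left(\tau;\, \mu, \sigma^2, a, b\right) =
\begin{cases}
\dfrac{\frac{1}{\sigma}\,\phi\!\left(\frac{\tau-\mu}{\sigma}\right)}
      {\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)}
  & a \le \tau \le b \\[2ex]
0 & \text{otherwise}
\end{cases}
\]
```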


European Signal Processing Conference | 2013

Online multi-speaker tracking using multiple microphone arrays informed by auditory scene analysis

Axel Plinge; Gernot A. Fink

Collaboration


Top Co-Authors

Gernot A. Fink
Technical University of Dortmund

Emanuel A. P. Habets
University of Erlangen-Nuremberg

Marius H. Hennecke
Technical University of Dortmund

Rene Grzeszick
Technical University of Dortmund

Sebastian J. Schlecht
University of Erlangen-Nuremberg