Florian Heese | Researchain

International Conference on Acoustics, Speech, and Signal Processing | 2015

Noise PSD estimation by logarithmic baseline tracing

Florian Heese; Peter Vary

A novel noise power spectral density (PSD) estimator for disturbed speech signals, operating in the short-time Fourier domain, is presented. The noise PSD estimate is obtained by tracing the noisy observation over time under a constraint, separately for each frequency bin. The constraint limits the logarithmic magnitude change between successive time frames. Because speech onsets are assumed to appear as sudden rises in the noisy observation, a fixed and an adaptive tracing parameter β have been derived to track the contained noise while preventing speech leakage into the noise PSD estimate. The experimental evaluation and comparison with the state-of-the-art algorithms SPP and Minimum Statistics confirm a lower logarithmic noise estimation error and superior speech enhancement when rated in a standard noise reduction system. The proposed concept has extremely low computational complexity and memory usage. Thus, it is well suited for applications where processing power and memory are limited.
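The constrained tracing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the symmetric clamp, the fixed value `beta_db`, the initialisation from the first frame, and all function and parameter names are assumptions made for illustration, and the adaptive variant of β is not shown.

```python
import math


def trace_noise_psd(periodogram_frames, beta_db=1.0, floor=1e-12):
    """Trace the noisy periodogram per frequency bin with a limited log step.

    periodogram_frames: list of frames, each a list of |Y(k)|^2 values.
    beta_db: assumed (hypothetical) limit on the change per frame, in dB.
    Returns one noise PSD estimate per frame, same shape as the input.
    """
    if not periodogram_frames:
        return []
    # Assumption: the first frame is speech-free and initialises the trace.
    prev_db = [10.0 * math.log10(max(p, floor)) for p in periodogram_frames[0]]
    estimates = [[10.0 ** (d / 10.0) for d in prev_db]]
    for frame in periodogram_frames[1:]:
        cur_db = []
        for prev, p in zip(prev_db, frame):
            obs_db = 10.0 * math.log10(max(p, floor))
            # Move towards the observation, but by at most beta_db per frame.
            step = max(-beta_db, min(beta_db, obs_db - prev))
            cur_db.append(prev + step)
        estimates.append([10.0 ** (d / 10.0) for d in cur_db])
        prev_db = cur_db
    return estimates
```

With a small `beta_db`, a sudden 20 dB rise (a speech onset) moves the estimate by only 1 dB per frame, so short speech bursts leak little into the noise estimate, while slow noise changes are still followed.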

IEEE Convention of Electrical and Electronics Engineers in Israel | 2012

Comparison of supervised and semi-supervised beamformers using real audio recordings

Florian Heese; Magnus Schäfer; Peter Vary; Elior Hadad; Shmulik Markovich Golan; Sharon Gannot

In this contribution, two different approaches to designing microphone array beamformers are explored. On the one hand, a fixed beamformer based on numerical near field optimization is employed. On the other hand, an adaptive beamformer algorithm based on the linearly constrained minimum variance (LCMV) method is applied. For the evaluation, an audio database of microphone array impulse responses and audio recordings (speech and noise) was created. Different acoustic scenarios were constructed, consisting of various audio sources (desired speaker, interfering speaker and directional noise) distributed around the microphone array at different angles and distances. The algorithms were compared based on both objective measures (signal-to-noise, signal-to-interference and speech distortion) and subjective tests (assessment of sonograms and informal listening tests).

International Conference on Acoustics, Speech, and Signal Processing | 2010

Wideband noise suppression supported by artificial bandwidth extension techniques

Thomas Esch; Florian Heese; Bernd Geiser; Peter Vary

This contribution presents a wideband (50 Hz – 7 kHz) speech enhancement system that operates in the frequency domain. As a novel feature, techniques known from artificial bandwidth extension (BWE) are used to improve the spectral estimation process by exploiting the statistical dependencies between the low band (50 Hz – 4 kHz) and the high band (4 – 7 kHz). Conventional noise suppression is used in the low band, while a novel approach is applied to the high band. Features are extracted from the processed (enhanced) low band signal and used to estimate the subband energies of the high band. The weighting gains determined from these energy estimates are adaptively combined with conventional gains that are additionally obtained for the high band. The performance of the proposed method is shown to be consistently better than that of the conventional approach, especially at low input SNR values.

International Conference on Acoustics, Speech, and Signal Processing | 2013

Numerical near field optimization of a non-uniform sub-band filter-and-sum beamformer

Florian Heese; Magnus Schäfer; Jona Wernerus; Peter Vary

A novel near field filter-and-sum beamformer using non-uniform frequency sub-bands is presented. The concept is based on numerical optimization of the reception characteristic of the microphone array. In order to improve the reception characteristic over frequency and space, a non-uniform filterbank is utilized to subdivide the frequency range. Individual optimization processes for each sub-band result in a clearly improved reception characteristic. The new system is able to closely approximate a target characteristic (independently of the frequency), which can be defined according to the application.
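For readers unfamiliar with the filter-and-sum structure itself, the following is a minimal full-band, time-domain sketch. It does not reproduce the sub-band design or the numerical optimization from the paper: the FIR coefficients are simply taken as given, and the function name and argument layout are assumptions for illustration.

```python
def filter_and_sum(mic_signals, fir_filters):
    """Filter each microphone signal with its own FIR filter and sum the results.

    mic_signals: list of M equal-length lists of samples (one per microphone).
    fir_filters: list of M lists of FIR coefficients (hypothetical values; in
                 the paper's setting they would result from an optimization).
    """
    if len(mic_signals) != len(fir_filters):
        raise ValueError("one FIR filter is needed per microphone")
    n_samples = len(mic_signals[0]) if mic_signals else 0
    output = [0.0] * n_samples
    for x, h in zip(mic_signals, fir_filters):
        for n in range(n_samples):
            acc = 0.0
            for k, coeff in enumerate(h):
                if n - k >= 0:  # causal convolution, zero initial state
                    acc += coeff * x[n - k]
            output[n] += acc
    return output
```

A delay-and-sum beamformer is the special case in which each filter is a weighted pure delay, for example `[[0.0, 0.5], [0.5]]` to align a source that reaches the first microphone one sample earlier than the second.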

Asilomar Conference on Signals, Systems and Computers | 2010

Combined reduction of time varying harmonic and stationary noise using frequency warping

Thomas Esch; Matthias Rüngeler; Florian Heese; Peter Vary

Speech enhancement in non-stationary environments is still a challenging problem. This contribution presents a noise reduction system that is capable of tracking and suppressing both time varying harmonic noise and stationary noise. In a first stage, the harmonic noise power is estimated and attenuated using a modified Minimum Statistics approach that performs frequency warping according to the harmonics' fundamental frequency. A conventional noise estimation technique is applied in a second stage in order to reduce the random components of the noise spectrum. The performance of the proposed noise suppression system is shown to be consistently better than that of conventional approaches.
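The core idea behind minimum tracking, on which Minimum Statistics builds, can be sketched for a single frequency bin as follows. This is a deliberately simplified illustration: it omits the bias compensation and optimal smoothing of the original Minimum Statistics method as well as the frequency warping proposed here, and the window length, smoothing factor and function name are assumed values.

```python
def minimum_tracking_noise_estimate(periodogram, window=5, alpha=0.8):
    """Estimate noise power as the minimum of a smoothed periodogram.

    periodogram: list of |Y|^2 values over time for one frequency bin.
    window: assumed number of past frames searched for the minimum.
    alpha: assumed first-order recursive smoothing factor (0 <= alpha < 1).
    """
    smoothed = []
    estimates = []
    prev = None
    for p in periodogram:
        # First-order recursive smoothing of the periodogram.
        prev = p if prev is None else alpha * prev + (1.0 - alpha) * p
        smoothed.append(prev)
        # The minimum over the most recent frames serves as the noise estimate.
        estimates.append(min(smoothed[-window:]))
    return estimates
```

The minimum rises only after a high-power segment has filled the whole window, which is why such estimators follow increasing noise levels with a delay of roughly one window length.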

International Conference on Acoustics, Speech, and Signal Processing | 2015

Speech-codebook based soft Voice Activity Detection

Florian Heese; Markus Niermann; Peter Vary

A novel noise-robust soft Voice Activity Detector (VAD) operating in the short-time Fourier domain is presented. A speech energy gain is obtained by frame-wise processing of a noisy speech signal with a speech codebook algorithm. This gain can be used for robust voice detection. A speaker-independent speech codebook, consisting of spectral envelopes, is created in the training process. When the algorithm is applied, the codebook is adapted in every frame to the current speaker by combining the harmonic pitch structure of the current noisy speech frame with the codebook entries. Soft VAD values ranging from zero to one are calculated by post-processing the speech gain, which is obtained using gain-shape vector quantization. A binary VAD decision is obtained by applying a threshold. The proposed method does not rely on a priori knowledge of the noise and is robust with respect to highly non-stationary noise and adverse SNR conditions. In addition, it is possible to trade off the detection rate against the false-alarm rate by varying a threshold without increasing the total number of misdetections. Compared to state-of-the-art VAD systems, the proposed method is characterized by better detection rates at significantly lower false-alarm rates.
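The last step, turning soft VAD values into binary decisions and measuring the resulting trade-off, can be sketched as below. The soft values are taken as given (the codebook processing that produces them is not shown), and the function names, the default threshold and the rate definitions are assumptions for illustration.

```python
def binary_vad(soft_values, threshold=0.5):
    """Map soft VAD values in [0, 1] to binary decisions via a threshold."""
    return [v >= threshold for v in soft_values]


def detection_and_false_alarm_rate(decisions, reference):
    """Return (detection rate, false-alarm rate) against a reference labelling.

    decisions, reference: equal-length lists of booleans (True = speech).
    """
    speech = sum(1 for r in reference if r)
    non_speech = len(reference) - speech
    hits = sum(1 for d, r in zip(decisions, reference) if d and r)
    false_alarms = sum(1 for d, r in zip(decisions, reference) if d and not r)
    detection_rate = hits / speech if speech else 0.0
    false_alarm_rate = false_alarms / non_speech if non_speech else 0.0
    return detection_rate, false_alarm_rate
```

Sweeping `threshold` from 0 to 1 and recording both rates yields the operating curve along which the detection rate and the false-alarm rate are traded against each other.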

International Journal On Advances in Telecommunications | 2013

High Quality Video Conferencing: Region of Interest Encoding and Joint Video/Audio Analysis

Christopher Bulla; Christian Feldmann; Magnus Schäfer; Florian Heese; Thomas Schlien; Martin Schink

Archive | 2010

A Modified Minimum Statistics Algorithm for Reducing Time Varying Harmonic Noise

Thomas Esch; Florian Heese; Peter Vary

International Workshop on Acoustic Signal Enhancement (IWAENC 2012), Proceedings | 2012

Numerical Near Field Optimization of Weighted Delay-and-Sum Microphone Arrays

Magnus Schaefer; Florian Heese; Jona Wernerus; Peter Vary

ITG Symposium of Speech Communication | 2014

Selflearning Codebook Speech Enhancement

Florian Heese; Christoph Matthias Nelke; Markus Niermann; Peter Vary
