Publication


Featured research published by Dang Hai Tran Vu.


international conference on acoustics, speech, and signal processing | 2010

Blind speech separation employing directional statistics in an Expectation Maximization framework

Dang Hai Tran Vu

In this paper we propose to employ directional statistics in a complex vector space to approach the problem of blind speech separation in the presence of spatially correlated noise. We interpret the values of the short time Fourier transform of the microphone signals to be draws from a mixture of complex Watson distributions, a probabilistic model which naturally accounts for spatial aliasing. The parameters of the density are related to the a priori source probabilities, the power of the sources and the transfer function ratios from sources to sensors. Estimation formulas are derived for these parameters by employing the Expectation Maximization (EM) algorithm. The E-step corresponds to the estimation of the source presence probabilities for each time-frequency bin, while the M-step leads to a maximum signal-to-noise ratio (MaxSNR) beamformer in the presence of uncertainty about the source activity. Experimental results are reported for an implementation in a generalized sidelobe canceller (GSC) like spatial beamforming configuration for 3 speech sources with significant coherent noise in reverberant environments, demonstrating the usefulness of the novel modeling framework.
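To make the E-step mentioned in this abstract more concrete, the following is a minimal sketch of how class posteriors could be computed for one normalized observation under a mixture of complex Watson densities, each proportional to exp(kappa * |w^H z|^2). It is not the paper's estimator: it assumes a single concentration `kappa` shared by all components, so the normalization constants cancel, and the names `normalize` and `e_step_posteriors` are hypothetical.

```python
import math


def normalize(v):
    """Scale a complex vector to unit Euclidean norm."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]


def e_step_posteriors(z, modes, priors, kappa):
    """Posterior probability of each mixture component for one observation.

    Assumption (not from the abstract): every component shares the same
    concentration kappa, so the Watson normalization constants cancel and
    the posterior is proportional to prior * exp(kappa * |w^H z|^2).
    """
    z = normalize(z)
    log_weights = []
    for w, prior in zip(modes, priors):
        w = normalize(w)
        # Inner product w^H z; its magnitude is invariant to a common phase.
        inner = sum(wc.conjugate() * zc for wc, zc in zip(w, z))
        log_weights.append(math.log(prior) + kappa * abs(inner) ** 2)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_weights)
    unnormalized = [math.exp(lw - top) for lw in log_weights]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two-microphone example with two orthogonal mode vectors.
modes = [[1, 1j], [1, -1j]]
posteriors = e_step_posteriors([1, 1j], modes, [0.5, 0.5], kappa=10.0)
```

Because only the magnitude of the inner product enters, multiplying the observation by an arbitrary phase factor leaves the posteriors unchanged, which is the property that makes this family of densities suitable for normalized frequency-domain array signals.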


international conference on acoustics, speech, and signal processing | 2014

Source counting in speech mixtures using a variational EM approach for complex WATSON mixture models

Lukas Drude; Aleksej Chinaev; Dang Hai Tran Vu

In this contribution we derive a variational EM (VEM) algorithm for model selection in complex Watson mixture models, which have been recently proposed as a model of the distribution of normalized microphone array signals in the short-time Fourier transform domain. The VEM algorithm is applied to count the number of active sources in a speech mixture by iteratively estimating the mode vectors of the Watson distributions and suppressing the signals from the corresponding directions. A key theoretical contribution is the derivation of the MMSE estimate of a quadratic form involving the mode vector of the Watson distribution. The experimental results demonstrate the effectiveness of the source counting approach at moderately low SNR. It is further shown that the VEM algorithm is more robust with respect to used threshold values.


international conference on acoustics, speech, and signal processing | 2013

Using the turbo principle for exploiting temporal and spectral correlations in speech presence probability estimation

Dang Hai Tran Vu

In this paper we present a speech presence probability (SPP) estimation algorithm which exploits both temporal and spectral correlations of speech. To this end, the SPP estimation is formulated as the posterior probability estimation of the states of a two-dimensional (2D) Hidden Markov Model (HMM). We derive an iterative algorithm to decode the 2D-HMM which is based on the turbo principle. The experimental results show that indeed the SPP estimates improve from iteration to iteration, and further clearly outperform another state-of-the-art SPP estimation algorithm.


international workshop on acoustic signal enhancement | 2014

Towards online source counting in speech mixtures applying a variational EM for complex Watson mixture models

Lukas Drude; Aleksej Chinaev; Dang Hai Tran Vu

This contribution describes a step-wise source counting algorithm to determine the number of speakers in an offline scenario. Each speaker is identified by a variational expectation maximization (VEM) algorithm for complex Watson mixture models and therefore directly yields beamforming vectors for a subsequent speech separation process. An observation selection criterion is proposed which improves the robustness of the source counting in noise. The algorithm is compared to an alternative VEM approach with Gaussian mixture models based on directions of arrival and shown to deliver improved source counting accuracy. The article concludes by extending the offline algorithm towards a low-latency online estimation of the number of active sources from the streaming input data.


international conference on acoustics, speech, and signal processing | 2012

Improved noise power spectral density tracking by a MAP-based postprocessor

Aleksej Chinaev; Alexander Krueger; Dang Hai Tran Vu

In this paper we present a novel noise power spectral density tracking algorithm and its use in single-channel speech enhancement. It has the unique feature that it is able to track the noise statistics even if speech is dominant in a given time-frequency bin. As a consequence it can follow non-stationary noise superposed by speech, even in the critical case of rising noise power. The algorithm requires an initial estimate of the power spectrum of speech and is thus meant to be used as a postprocessor to a first speech enhancement stage. An experimental comparison with a state-of-the-art noise tracking algorithm demonstrates lower estimation errors under low SNR conditions and smaller fluctuations of the estimated values, resulting in improved speech quality as measured by PESQ scores.


workshop on positioning navigation and communication | 2013

Server based indoor navigation using RSSI and inertial sensor information

Manh Kha Hoang; Sarah Schmitz; Christian Drueke; Dang Hai Tran Vu; Joerg Schmalenstroeer

In this paper we present a system for indoor navigation based on received signal strength index information of Wireless-LAN access points and relative position estimates. The relative position information is gathered from inertial smartphone sensors using a step detection and an orientation estimate. Our map data is hosted on a server employing a map renderer and a SQL database. The database includes a complete multilevel office building, within which the user can navigate. During navigation, the client retrieves the position estimate from the server, together with the corresponding map tiles to visualize the user's position on the smartphone display.
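The relative position estimate described in this abstract, built from step detection and an orientation estimate, can be illustrated with a simple dead-reckoning update. This is only a sketch under stated assumptions: a fixed step length of 0.7 m, headings given in radians clockwise from north, and a hypothetical function name `dead_reckoning`. The abstract does not specify how the actual system combines these quantities with the RSSI information.

```python
import math


def dead_reckoning(start, headings_rad, step_length_m=0.7):
    """Accumulate a 2D path from one heading per detected step.

    Assumptions (not from the abstract): every step has the same length,
    and headings are measured clockwise from north, so x points east and
    y points north.
    """
    x, y = start
    path = [(x, y)]
    for heading in headings_rad:
        x += step_length_m * math.sin(heading)  # eastward displacement
        y += step_length_m * math.cos(heading)  # northward displacement
        path.append((x, y))
    return path


# One step north followed by one step east, with 1 m steps.
path = dead_reckoning((0.0, 0.0), [0.0, math.pi / 2], step_length_m=1.0)
```

A path of this kind drifts over time because heading and step-length errors accumulate, which is one reason to combine it with an absolute reference such as the access point signal strengths.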


International Workshop on Acoustic Echo and Noise Control (IWAENC 2010) | 2010

An EM Approach to Integrated Multichannel Speech Separation and Noise Suppression

Dang Hai Tran Vu


european signal processing conference | 2013

Blind speech separation exploiting temporal and spectral correlations using 2D-HMMs

Dang Hai Tran Vu


international workshop on acoustic signal enhancement | 2012

Exploiting Temporal Correlations in Joint Multichannel Speech Separation and Noise Suppression using Hidden Markov Models

Dang Hai Tran Vu


conference of the international speech communication association | 2011

On Initial Seed Selection for Frequency Domain Blind Speech Separation

Dang Hai Tran Vu

Collaboration


Dive into Dang Hai Tran Vu's collaborations.

Top Co-Authors

Lukas Drude

University of Paderborn
