Publication


Featured research published by Xionghu Zhong.


IEEE Sensors Journal | 2012

Particle Filtering and Posterior Cramér-Rao Bound for 2-D Direction of Arrival Tracking Using an Acoustic Vector Sensor

Xionghu Zhong; A. B. Premkumar; A. S. Madhukumar

An acoustic vector sensor (AVS) measures acoustic pressure as well as particle velocity, so an AVS signal contains 2-D (azimuth and elevation) direction of arrival (DOA) information about an acoustic source. Existing DOA estimation techniques assume that the source is static and rely extensively on localization methods. In this paper, a particle filtering (PF) tracking approach is developed to estimate the 2-D DOA from signals collected by an AVS. A constant velocity model is employed to model the source dynamics, and the likelihood function is derived based on a maximum likelihood estimation of the source amplitude and the noise variance. The posterior Cramér-Rao bound (PCRB) is also derived to provide a lower performance bound for the AVS signal based tracking problem. Since the PCRB incorporates information from both the source dynamics and the measurement model, it is usually lower than the traditional Cramér-Rao bound, which only employs measurement model information. Experiments show that the proposed PF tracking algorithm significantly outperforms the Capon beamforming based localization method and is much closer to the PCRB, even in a challenging environment (e.g., SNR = -10 dB).
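
The tracking idea in this abstract can be illustrated with a minimal bootstrap particle filter. This is only a sketch under simplifying assumptions: it tracks a single angle rather than 2-D DOA, it takes noisy angle measurements directly instead of raw AVS signals, and it uses a plain Gaussian likelihood rather than the likelihood derived in the paper. The function names, noise levels, and test scenario are all hypothetical.

```python
import math
import random


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def particle_filter(measurements, n_particles=500, dt=1.0,
                    accel_std=0.01, meas_std=0.2, seed=0):
    """Bootstrap PF for one angle with a constant velocity model.

    Each particle holds [angle, angular velocity].
    Returns the list of posterior-mean angle estimates.
    """
    rng = random.Random(seed)
    # Initialise particles around the first measurement (an assumption).
    particles = [[wrap(measurements[0] + rng.gauss(0.0, meas_std)),
                  rng.gauss(0.0, 0.05)] for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        weights = []
        for p in particles:
            # Constant velocity prediction driven by a random acceleration.
            a = rng.gauss(0.0, accel_std)
            p[0] = wrap(p[0] + dt * p[1] + 0.5 * dt * dt * a)
            p[1] = p[1] + dt * a
            # Gaussian likelihood on the wrapped angular error.
            err = wrap(z - p[0])
            weights.append(math.exp(-0.5 * (err / meas_std) ** 2))
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Circular mean of the weighted particles.
        s = sum(w * math.sin(p[0]) for w, p in zip(weights, particles))
        c = sum(w * math.cos(p[0]) for w, p in zip(weights, particles))
        estimates.append(math.atan2(s, c))
        # Multinomial resampling; copy so particles do not share state.
        chosen = rng.choices(particles, weights=weights, k=n_particles)
        particles = [list(p) for p in chosen]
    return estimates


# Hypothetical scenario: a source drifting at 0.02 rad per step.
noise = random.Random(1)
truth = [0.02 * k for k in range(60)]
measurements = [wrap(t + noise.gauss(0.0, 0.2)) for t in truth]
estimates = particle_filter(measurements)
```

Because the filter combines the motion model with every measurement, its estimates are typically smoother than the raw per-frame measurements, which is the advantage of tracking over frame-by-frame localization that the abstract describes.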


International Conference on Acoustics, Speech, and Signal Processing | 2015

A learning-based approach to direction of arrival estimation in noisy and reverberant environments

Xiong Xiao; Shengkui Zhao; Xionghu Zhong; Douglas L. Jones; Eng Siong Chng; Haizhou Li

This paper presents a learning-based approach to the task of direction of arrival (DOA) estimation from microphone array input. Traditional signal processing methods such as the classic least squares (LS) method rely on strong assumptions about signal models and on accurate estimates of the time delay of arrival (TDOA). They work well only in relatively clean conditions and suffer from noise and reverberation distortions. In this paper, we propose a learning-based approach that can learn from a large amount of simulated noisy and reverberant microphone array inputs for robust DOA estimation. Specifically, we extract features from the generalised cross correlation (GCC) vectors and use a multilayer perceptron neural network to learn the nonlinear mapping from such features to the DOA. One advantage of the learning-based method is that as more training data becomes available, the DOA estimation becomes more accurate. Experimental results on simulated data show that the proposed learning-based method produces much better results than the state-of-the-art LS method. The testing results on real data recorded in meeting rooms show improved root-mean-square error (RMSE) compared to the LS method.
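
As a rough illustration of the kind of feature the abstract refers to, the sketch below computes a cross-correlation vector between two microphone channels and reads a delay estimate from its peak. It is a simplification: it uses plain time-domain cross-correlation rather than a frequency-domain GCC with spectral weighting, and it does not include the neural network stage. The function names are hypothetical.

```python
def cross_correlation(x, y, max_lag):
    """Return [r(-max_lag), ..., r(max_lag)] with r(l) = sum_n x[n] * y[n + l]."""
    result = []
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(len(x)):
            j = i + lag
            if 0 <= j < len(y):
                acc += x[i] * y[j]
        result.append(acc)
    return result


def estimate_delay(x, y, max_lag):
    """Lag in samples at which y best matches a delayed copy of x."""
    r = cross_correlation(x, y, max_lag)
    peak_index = max(range(len(r)), key=lambda k: r[k])
    return peak_index - max_lag
```

In a learning-based setup such as the one described, a vector like the output of `cross_correlation` for each microphone pair would serve as the input feature, and the mapping to DOA would be learned rather than read off the peak.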


IEEE Sensors Journal | 2013

Particle Filtering for Acoustic Source Tracking in Impulsive Noise With Alpha-Stable Process

Xionghu Zhong; A. B. Premkumar; A. S. Madhukumar

Non-Gaussian impulsive noise distorts the source signal and causes problems for direction of arrival (DOA) estimation of an acoustic source. In this paper, a Bayesian framework and its particle filtering (PF) implementation are developed for DOA tracking in the presence of a complex symmetric alpha-stable noise process. A constant velocity model is employed to model the source dynamics, and spatial spectra are exploited to formulate a pseudo likelihood of particles. Since the second-order statistics of alpha-stable processes do not exist, the fractional lower order moment matrix of the received data is used to replace the covariance matrix in calculating the spatial spectra. The noise usually spreads and distorts the mainlobe of the likelihood function, so the particles cannot be weighted accurately. Hence, the likelihood function is exponentially weighted to emphasize the particles in a high likelihood area and thus enhance the resampling efficiency. The performance of the proposed tracking algorithm is extensively studied under simulated alpha-stable noise environments. The results show that the proposed algorithm significantly outperforms the existing PF tracking approach and the traditional localization approaches in DOA estimation.
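
The substitution of a fractional lower order moment (FLOM) matrix for the covariance matrix can be sketched as follows. This uses one common definition of the FLOM matrix, in which entry (i, k) is the average of x_i |x_k|^(p-2) conj(x_k); whether the paper uses exactly this form is an assumption here, and the function name is hypothetical.

```python
def flom_matrix(snapshots, p):
    """Fractional lower order moment matrix of array snapshots.

    snapshots: list of snapshots, each a list of complex samples (one per sensor).
    p: moment order; 1 < p < alpha <= 2 is the usual choice for alpha-stable
       noise, and p = 2 gives the ordinary sample covariance matrix.
    """
    m = len(snapshots[0])
    t = len(snapshots)
    c = [[0j] * m for _ in range(m)]
    for x in snapshots:
        for k in range(m):
            mag = abs(x[k])
            if mag == 0.0:
                continue  # the term vanishes when x_k is zero
            scaled = (mag ** (p - 2.0)) * x[k].conjugate()
            for i in range(m):
                c[i][k] += x[i] * scaled
    return [[v / t for v in row] for row in c]
```

With p below 2, the factor |x_k|^(p-2) shrinks the contribution of large-amplitude samples, which is why impulsive outliers affect this matrix less than they affect the sample covariance.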


IEEE Automatic Speech Recognition and Understanding Workshop | 2015

Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction

Shengkui Zhao; Xiong Xiao; Zhaofeng Zhang; Thi Ngoc Tho Nguyen; Xionghu Zhong; Bo Ren; Longbiao Wang; Douglas L. Jones; Eng Siong Chng; Haizhou Li

This paper presents a robust speech recognition system using a microphone array for the 3rd CHiME Challenge. A minimum variance distortionless response (MVDR) beamformer with adaptive microphone gains is proposed for robust beamforming. Two microphone gain estimation methods are studied using the speech-dominant time-frequency bins. A multichannel noise reduction (MCNR) postprocessing step is also proposed to further reduce the interference in the MVDR processed signal. Experimental results for the CHiME-3 challenge show that both the proposed MVDR beamformer with microphone gains and the MCNR postprocessing improve the speech recognition performance significantly. With the state-of-the-art deep neural network (DNN) based acoustic model, our system achieves a word error rate (WER) of 11.67% on the real test data of the evaluation set.
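
For orientation, the standard MVDR weight formula w = R^{-1} d / (d^H R^{-1} d) can be sketched for a two-microphone array at a single frequency bin. This is the textbook beamformer only; the gain estimation methods and the MCNR stage from the paper are not reproduced. The microphone gains, phase, and noise covariance below are hypothetical example values, with the gains simply folded into the steering vector.

```python
import cmath


def mvdr_weights_2ch(r, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d) for two channels.

    r: 2x2 Hermitian noise covariance matrix (nested lists of complex numbers).
    d: steering vector of length 2.
    """
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[r[1][1] / det, -r[0][1] / det],
           [-r[1][0] / det, r[0][0] / det]]
    rinv_d = [inv[0][0] * d[0] + inv[0][1] * d[1],
              inv[1][0] * d[0] + inv[1][1] * d[1]]
    denom = d[0].conjugate() * rinv_d[0] + d[1].conjugate() * rinv_d[1]
    return [v / denom for v in rinv_d]


def apply_beamformer(w, x):
    """Beamformer output y = w^H x for one time-frequency bin."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))


# Hypothetical example values: per-microphone gains and an inter-microphone phase.
gains = [1.0, 0.8]
steering = [gains[0] + 0j, gains[1] * cmath.exp(-1j * 0.7)]
noise_cov = [[1.0 + 0j, 0.2 + 0.1j], [0.2 - 0.1j, 1.5 + 0j]]
weights = mvdr_weights_2ch(noise_cov, steering)
```

The distortionless constraint means that a signal arriving exactly along the steering vector passes with unit gain, so any mismatch in the assumed microphone gains directly distorts the target, which is one motivation for estimating those gains adaptively.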


Speech Communication | 2015

Reverberant speech separation with probabilistic time-frequency masking for B-format recordings

Xiaoyi Chen; Wenwu Wang; Yingmin Wang; Xionghu Zhong; Atiyeh Alinaghi

Existing speech source separation approaches overwhelmingly rely on acoustic pressure information acquired by using a microphone array. Little attention has been devoted to the use of B-format microphones, by which both acoustic pressure and pressure gradient can be obtained, and therefore the direction of arrival (DOA) cues can be estimated from the received signal. In this paper, such DOA cues, together with the frequency bin-wise mixing vector (MV) cues, are used to evaluate the contribution of a specific source at each time–frequency (T–F) point of the mixtures in order to separate the source from the mixture. Based on the von Mises mixture model and the complex Gaussian mixture model respectively, a source separation algorithm is developed, where the model parameters are estimated via an expectation–maximization (EM) algorithm. A T–F mask is then derived from the model parameters for recovering the sources. Moreover, we further improve the separation performance by choosing only the reliable DOA estimates at the T–F units based on thresholding. The performance of the proposed method is evaluated in both simulated room environments and a real reverberant studio in terms of signal-to-distortion ratio (SDR) and the perceptual evaluation of speech quality (PESQ). The experimental results show its advantage over four baseline algorithms, including three T–F mask based approaches and one convolutive independent component analysis (ICA) based method.
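
The way a von Mises mixture over DOA cues yields a soft T–F mask can be sketched as a posterior computation at a single T–F point. This assumes the mixture parameters are already known; the EM estimation, the MV cue with its complex Gaussian model, and the reliability thresholding are left out. All function and parameter names are hypothetical.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total = 0.0
    term = 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def soft_masks(theta, weights, means, kappas):
    """Posterior probability of each source given one observed DOA (one T-F point)."""
    scores = [w * von_mises_pdf(theta, m, k)
              for w, m, k in zip(weights, means, kappas)]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical two-source example: sources at 0 and pi/2 radians.
masks = soft_masks(0.1, [0.5, 0.5], [0.0, math.pi / 2], [5.0, 5.0])
```

Applying these posteriors as per-point gains on the mixture spectrogram gives one soft mask per source; an observed DOA close to a source's mean direction assigns most of that T–F point to that source.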


Signal Processing | 2015

A distributed particle filtering approach for multiple acoustic source tracking using an acoustic vector sensor network

Xionghu Zhong; Arash Mohammadi; A.B. Premkumar; Amir Asif

Different centralized approaches such as least-squares (LS) and particle filtering (PF) algorithms have been developed to localize an acoustic source by using a distributed acoustic vector sensor (AVS) array. However, such algorithms are either not applicable to multiple sources or rely heavily on sensor-processor communication. In this paper, a distributed unscented PF (DUPF) approach is proposed for multiple acoustic source tracking. At each distributed AVS node, the first-order and the second-order statistics of the local state are estimated by using an unscented information filter (UIF) based PF. The UIF is employed to approximate the optimum importance function due to its simplicity: its matrix operations involve the state information matrix rather than the covariance matrix of the measurement sequence. These local statistics are then fused between neighboring nodes and a consensus filter is applied to achieve a global estimation. In such an architecture, only the state statistics need to be transmitted among the neighboring nodes. Consequently, the communication cost can be reduced. The distributed posterior Cramér-Rao bound is also derived. Simulation results show that the performance of the DUPF tracking approach is similar to that of the centralized PF algorithm and significantly better than that of LS algorithms.

Highlights: Multiple wideband source tracking using a distributed AVS network is considered. A distributed PF approach is developed to track multiple sources in 3-D space. The theoretical performance bound (posterior Cramér-Rao lower bound) is derived. Simulations are conducted to demonstrate the performance of the proposed approach.
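
The consensus step mentioned in the abstract, where nodes exchange statistics only with their neighbors, can be illustrated with a basic average consensus iteration on scalar values. This is a generic sketch, not the paper's consensus filter: it fuses one number per node on an undirected graph, and the step size, graph, and function name are hypothetical.

```python
def consensus_average(values, neighbors, step=0.25, iterations=50):
    """Average consensus on an undirected graph.

    values: initial local estimates, one per node.
    neighbors: neighbors[i] lists the node indices adjacent to node i.
    step: a value below 1 / (maximum node degree) is a standard
          sufficient condition for convergence.
    """
    x = list(values)
    for _ in range(iterations):
        # Every node moves toward the values held by its neighbors.
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x


# Hypothetical ring of four sensor nodes.
ring = [[1, 3], [0, 2], [1, 3], [2, 0]]
fused = consensus_average([1.0, 2.0, 3.0, 6.0], ring)
```

Each node ends up holding approximately the network-wide average while communicating only with adjacent nodes, which is how this kind of architecture avoids sending all data to a central processor.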


IEEE Sensors Journal | 2014

Multiple Wideband Acoustic Source Tracking in 3-D Space Using a Distributed Acoustic Vector Sensor Array

Xionghu Zhong; A. B. Premkumar; Haiyan Wang

This paper considers the problem of tracking multiple acoustic sources in 3-D space using a distributed acoustic vector sensor array. Unlike the existing two-stage localization approach, which first estimates the direction of arrival of the source at each sensor and then triangulates a 3-D position, the particle filtering approach developed here directly fuses the signals collected from the distributed sensors. To enhance the tracking performance and constrain the computational complexity, an information filter is developed to approximate the optimal importance sampling. Since the position state of the source is linear with the velocity state, a Rao-Blackwellization step is employed to marginalize out the velocity component. In addition, the posterior Cramér-Rao bound is developed to provide a lower performance bound for the distributed tracking system. Both the numerical study and simulations show that the proposed tracking approach significantly outperforms the two-stage localization approaches for 3-D position estimation.


Circuits, Systems, and Signal Processing | 2013

A Decoupled Approach for Near-Field Source Localization Using a Single Acoustic Vector Sensor

V. N. Hari; A. B. Premkumar; Xionghu Zhong

This paper considers the problem of three-dimensional (3-D, azimuth, elevation, and range) localization of a single source in the near-field using a single acoustic vector sensor (AVS). The existing multiple signal classification (MUSIC) or maximum likelihood estimation (MLE) methods, which require a 3-D search over the location parameter space, are computationally very expensive. A computationally simple method previously developed by Wu and Wong (IEEE Trans. Aerosp. Electron. Syst. 48(1):159–169, 2012), which we refer to as Eigen-value decomposition and Received Signal strength Indicator-based method (Eigen-RSSI), was able to estimate 3-D location parameters of a single source efficiently. However, it can only be applied to an extended AVS which consists of a pressure sensor separated from the velocity sensors by a certain distance. In this paper, we propose a uni-AVS MUSIC (U-MUSIC) approach for 3-D location parameter estimation based on a compact AVS structure. We decouple the 3-D localization problem into step-by-step estimation of azimuth, elevation, and range and derive closed-form solutions for these parameter estimates by which a complex 3-D search for the parameters can be avoided. We show that the proposed approach outperforms the existing Eigen-RSSI method when the sensor system is required to be mounted in a confined space.


Signal Processing | 2014

Multiple wideband source detection and tracking using a distributed acoustic vector sensor array: A random finite set approach

Xionghu Zhong; A.B. Premkumar

In the past, distributed acoustic vector sensor (AVS) arrays have been employed to localize a source in three-dimensional space. Least-squares approaches were introduced to triangulate the source position by using the direction of arrival (DOA) measurements extracted at each AVS. However, such approaches: (1) cannot detect and localize multiple sources; and (2) can be seriously degraded by inaccurate DOA estimates. In this paper, a practical scenario is considered in which the source existence and the number of sources are assumed to be unknown. A random finite set (RFS) approach is developed to jointly detect and track multiple wideband acoustic sources. RFS is able to characterize the randomness of the state process (i.e., the source dynamics and the number of active sources) as well as the measurement process (i.e., DOA measurements generated by real sources and false alarms). Since the relationship between DOAs and source position is highly nonlinear, a particle filtering approach is employed to arrive at a computationally tractable approximation of the RFS densities. Simulations under different acoustic environments demonstrate the performance of the proposed approach and show a significant improvement in position estimation over the least-squares approaches.


International Conference on Digital Signal Processing | 2013

Acoustic vector sensor based speech source separation with mixed Gaussian-Laplacian distributions

Xiaoyi Chen; Atiyeh Alinaghi; Xionghu Zhong; Wenwu Wang

The acoustic vector sensor (AVS) based convolutive blind source separation problem has recently been addressed under the framework of probabilistic time-frequency (T-F) masking, where both the DOA and the mixing vector cues are modelled by Gaussian distributions. In this paper, we show that the distributions of these cues vary with room acoustics, such as reverberation. Motivated by this observation, we propose a mixed model of Laplacian and Gaussian distributions to provide a better fit for these cues. The parameters of the mixed model are estimated and refined iteratively by an expectation-maximization (EM) algorithm. Experiments performed on speech mixtures in simulated room environments show that the mixed model offers average improvements of about 0.68 dB and 1.18 dB in signal-to-distortion ratio (SDR) over the Gaussian and Laplacian models, respectively.

Collaboration

Top Co-Authors

A. B. Premkumar

Nanyang Technological University

Wee Peng Tay

Nanyang Technological University

A. S. Madhukumar

Nanyang Technological University

Eng Siong Chng

Nanyang Technological University

Haizhou Li

National University of Singapore

V. N. Hari

Nanyang Technological University