Publication


Featured research published by Jibin Yang.


international conference on acoustics, speech, and signal processing | 2016

Adaptive extraction of repeating non-negative temporal patterns for single-channel speech enhancement

Yinan Li; Xiongwei Zhang; Meng Sun; Gang Min; Jibin Yang

Estimating unknown background noise from single-channel noisy speech is a key yet challenging problem for speech enhancement. Because background noise typically repeats while foreground speech is sparse and time-varying, many studies decompose the noisy spectrogram directly in an unsupervised fashion when no isolated training examples of the target speaker or of particular noise types are available beforehand. However, recently proposed methods suffer from uninterpretable decomposed patterns, neglect the temporal structure of the background noise, or are constrained by pre-fixed parameters. To address these issues, we propose a novel method based on an autocorrelation technique and convolutive non-negative matrix factorization. The proposed method can adaptively estimate the underlying non-negative repeating temporal patterns from noisy speech and identify the clean speech spectrogram simultaneously. Experiments on the NOIZEUS dataset mixed with various real-world background noises showed that the proposed method performs better than some state-of-the-art methods.
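The autocorrelation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the repeating period of the noise is read off as the lag with the largest sample autocorrelation of a frame-level feature. The function names, the toy energy sequence, and the choice of feature are hypothetical and are not taken from the paper.

```python
def autocorrelation(x, max_lag):
    """Biased sample autocorrelation of the mean-removed sequence, lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, max_lag):
    """Return the lag in 1..max_lag with the largest autocorrelation value."""
    r = autocorrelation(x, max_lag)
    return max(range(1, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical frame-energy sequence: a 5-frame noise pattern repeated 8 times.
pattern = [1.0, 4.0, 2.0, 0.5, 3.0]
energy = pattern * 8
period = estimate_period(energy, max_lag=12)
```

A period estimated this way could then set the temporal length of the patterns in a convolutive factorization instead of fixing it in advance, which is one reading of "adaptively" in the abstract.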


international conference on multimedia and expo | 2016

A perceptually motivated approach via sparse and low-rank model for speech enhancement

Gang Min; Xiongwei Zhang; Jibin Yang; Wei Han; Xia Zou

A perceptually motivated speech enhancement approach is proposed in this paper. Unlike conventional approaches based on sparse and low-rank models, the new approach takes into account the perceptual differences among frequency bands of the human auditory system and separates speech from background noise in the Mel spectral domain. After two propositions for the Mel frequency weighted spectrogram are proved, speech enhancement is modeled as a sparse and low-rank constrained optimization problem, which is solved efficiently by the alternating direction method of multipliers (ADMM). The proposed approach is fully unsupervised: neither the speech dictionary nor the noise dictionary needs to be trained beforehand. Experimental results show promising performance under strong background noise. The performance can be further improved by an information fusion technique at high input SNRs.
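In ADMM solvers for sparse plus low-rank models, the sparse component is commonly updated with element-wise soft thresholding, the proximal operator of the l1 norm. The sketch below shows only that generic operator. The matrix values and the threshold are made up for illustration, and the paper's Mel-weighted formulation is not reproduced here.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink x toward zero by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def soft_threshold_matrix(m, tau):
    """Apply soft thresholding to every entry of a matrix given as a list of rows."""
    return [[soft_threshold(v, tau) for v in row] for row in m]


# Hypothetical residual (spectrogram minus the current low-rank estimate).
residual = [[3.0, -0.5], [0.2, -2.0]]
sparse_part = soft_threshold_matrix(residual, tau=1.0)
```

Entries whose magnitude is below the threshold are set to zero, which is what makes the recovered component sparse. The low-rank component is usually handled by the analogous operation on singular values.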


international conference on advanced communication technology | 2016

Gridless sparse reconstruction for the cyclic autocorrelation estimation

Xushan Chen; Xiongwei Zhang; Jibin Yang; Lin Qiao

To exploit the cyclostationary properties inherent in most man-made signals, whose statistics vary periodically, one prerequisite is knowledge of the signal's cyclic autocorrelation (CA), which can be estimated from finite time-domain samples. In this paper we consider the estimation of a sparse, periodic CA and focus on recovering the CA using compressive sampling, i.e., from a small number of time-domain samples. Inspired by atomic norm techniques, we model CA estimation as a denoising problem with the atomic norm, or equivalently an atomic norm soft thresholding (AST) problem, and propose a gridless CA reconstruction that can locate the nonzero cyclic frequencies on an infinitely dense grid. The resulting convex optimization problem can be solved in polynomial time as a semidefinite program (SDP) via the alternating direction method of multipliers (ADMM). Numerical results demonstrate that the proposed method outperforms the traditional methods as well as the dictionary-based CA estimator in terms of mean square error (MSE) over a wide range of signal-to-noise ratios (SNRs).
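For orientation, the conventional sample estimate of the cyclic autocorrelation from finite time-domain samples can be sketched as follows. This is the plain estimator, not the gridless atomic-norm method of the paper. The asymmetric-lag convention, the function name, and the test signal are assumptions made for illustration.

```python
import cmath
import math


def cyclic_autocorrelation(x, alpha, tau):
    """Sample estimate (1/N) * sum_n x[n + tau] * conj(x[n]) * exp(-j*2*pi*alpha*n)."""
    n_samples = len(x)
    total = 0j
    for n in range(n_samples - tau):
        phase = cmath.exp(-2j * math.pi * alpha * n)
        total += x[n + tau] * complex(x[n]).conjugate() * phase
    return total / n_samples


# A real sinusoid at normalized frequency f0 has a cyclic frequency at 2 * f0.
f0 = 1 / 8
signal = [math.cos(2 * math.pi * f0 * n) for n in range(64)]
at_cycle = abs(cyclic_autocorrelation(signal, alpha=2 * f0, tau=0))
off_cycle = abs(cyclic_autocorrelation(signal, alpha=f0, tau=0))
```

For this toy signal the estimate is about 0.25 at the cyclic frequency 2 * f0 and essentially zero at f0, which shows the sparsity in the cyclic-frequency domain that the abstract relies on.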


pacific rim conference on multimedia | 2016

Joint Optimization of a Perceptual Modified Wiener Filtering Mask and Deep Neural Networks for Monaural Speech Separation

Wei Han; Xiongwei Zhang; Jibin Yang; Meng Sun; Gang Min

Owing to its powerful feature extraction ability, deep learning has become a new trend in solving speech separation problems. In this paper, we present a novel Deep Neural Network (DNN) architecture for monaural speech separation. Taking into account the good masking property of the human auditory system, a perceptual modified Wiener filtering masking function is applied in the proposed DNN architecture to make the residual noise perceptually inaudible. The proposed architecture jointly optimizes the perceptual modified Wiener filtering mask and the DNN. Evaluation experiments on the TIMIT database with 20 noise types at different signal-to-noise ratios (SNRs) demonstrate the superiority of the proposed method over the reference DNN-based separation methods, regardless of whether the noise appeared in the training database.
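As background, the standard (unmodified) Wiener filtering mask computed from speech and noise magnitude estimates looks like the sketch below. The perceptual modification described in the abstract is not reproduced. The function names, the `eps` guard, and the example magnitudes are assumptions.

```python
def wiener_mask(speech_mag, noise_mag, eps=1e-12):
    """Standard Wiener-style mask per frequency bin: S^2 / (S^2 + N^2)."""
    return [s * s / (s * s + n * n + eps) for s, n in zip(speech_mag, noise_mag)]


def apply_mask(mask, mixture_mag):
    """Scale each bin of the mixture magnitude spectrum by the mask value."""
    return [m * y for m, y in zip(mask, mixture_mag)]


# Hypothetical magnitude estimates for three frequency bins.
speech = [3.0, 0.0, 1.0]
noise = [4.0, 2.0, 1.0]
mask = wiener_mask(speech, noise)
enhanced = apply_mask(mask, [5.0, 2.0, 1.5])
```

In a mask-based network the two magnitude estimates would come from the model's outputs, so the mask becomes a differentiable layer that can be trained together with the rest of the network.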


international conference on multimedia and expo | 2016

Joint optimization of audible noise suppression and deep neural networks for single-channel speech enhancement

Wei Han; Xiongwei Zhang; Gang Min; Meng Sun; Jibin Yang

Improving the perceptual quality of speech signals is a key yet challenging problem for many real-world applications. Taking into account the good performance of deep learning in signal representation, a novel single-channel speech enhancement technique is presented that combines deep neural networks and audible noise suppression in a single network architecture. This new deep neural network jointly trains an audible noise suppression function, which is used to estimate the magnitude spectrum of the clean speech and shape the spectrum of the audible noise at the same time. Experimental results on TIMIT with 20 noise types at various noise levels demonstrate the superiority of the proposed method over the baselines, regardless of whether the noise conditions are included in the training set.


ieee international conference on progress in informatics and computing | 2016

Speech enhancement using non-negative matrix factorization solved by improved alternating direction method of multipliers

Lin Qiao; Xiongwei Zhang; Xushan Chen; Jibin Yang

For the task of monaural speech enhancement, a version of sparse non-negative matrix factorization (Sparse NMF) using an improved alternating direction method of multipliers (IADMM) with the generalized Kullback-Leibler (K-L) divergence is proposed. In this paper, an alternating direction method of multipliers (ADMM) for NMF is studied, which deals with the NMF problem using the beta divergence as the cost function. Our study shows that this algorithm outperforms state-of-the-art algorithms on synthetic data sets but exhibits unstable behavior and low accuracy on real data sets. Therefore, we propose an improved algorithm for sparse NMF to solve this problem. The algorithm minimizes the K-L divergence with a pivot element weighting iterative (PEWI) method. Experimental results demonstrate that the proposed algorithm is more stable and accurate and also performs better than state-of-the-art speech enhancement algorithms.
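The generalized K-L divergence that serves as the NMF cost function can be written out as a short sketch. It only evaluates the cost for a given factorization V ≈ WH and does not implement the IADMM or PEWI updates from the paper. The helper names and the toy matrices are hypothetical.

```python
import math


def generalized_kl(v, x):
    """Generalized K-L divergence D(V || X), summed over all matrix entries."""
    total = 0.0
    for v_row, x_row in zip(v, x):
        for a, b in zip(v_row, x_row):
            if a > 0:
                total += a * math.log(a / b) - a + b
            else:
                total += b  # the a * log(a / b) term vanishes as a -> 0
    return total


def matmul(w, h):
    """Product of two matrices given as lists of rows."""
    h_cols = list(zip(*h))
    return [[sum(p * q for p, q in zip(row, col)) for col in h_cols] for row in w]


# Hypothetical rank-1 example: V equals W times H exactly.
v = [[1.0, 2.0], [2.0, 4.0]]
w = [[1.0], [2.0]]
h = [[1.0, 2.0]]
exact_fit = generalized_kl(v, matmul(w, h))
poor_fit = generalized_kl(v, [[2.0, 2.0], [2.0, 2.0]])
```

The divergence is zero for the exact factorization and positive otherwise, so an NMF solver lowers it by adjusting the non-negative factors W and H.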


IEEE Signal Processing Letters | 2016

Perceptually Weighted Analysis-by-Synthesis Vector Quantization for Low Bit Rate MFCC Codec

Gang Min; Xiongwei Zhang; Xia Zou; Jibin Yang

This letter presents a perceptually weighted analysis-by-synthesis vector quantization (VQ) algorithm for a low bit rate MFCC codec. Unlike conventional VQ of mel-frequency cepstral coefficient (MFCC) vectors, this algorithm uses an analysis-by-synthesis technique and aims to minimize the perceptually weighted spectral reconstruction distortion rather than the distortion of the MFCC vector itself. Also, to reduce the computational complexity, we propose a practical suboptimal codebook searching technique and embed it into the split and multistage VQ framework. Objective and subjective experimental results on Mandarin speech show that the proposed algorithm yields intelligible and natural sounding speech for speech coding at 600-2400 bit/s. Compared to current VQ in MFCC codecs, the output speech quality is substantially improved in terms of frequency-weighted segmental SNR, short-time objective intelligibility score, perceptual evaluation of speech quality score, and mean opinion score.


international conference on multimedia and expo | 2015

SegBOMP: An efficient algorithm for block non-sparse signal recovery

Xushan Chen; Xiongwei Zhang; Jibin Yang; Meng Sun; Li Zeng

Block sparse signal recovery methods, which take the block structure of the nonzero coefficients into account when clustering, have attracted great interest. Compared with traditional compressive sensing methods, they can obtain better recovery performance with fewer measurements by utilizing the block-sparsity explicitly. In this paper we propose a segmented version of the block orthogonal matching pursuit algorithm, which divides any vector into several sparse sub-vectors. By doing this, the original method can be significantly accelerated owing to the dimension reduction of measurements for each segmented vector. Experimental results showed that, with low complexity, the proposed method yielded identical or even better reconstruction performance than the conventional methods, which treat the signal in the standard block-sparsity fashion. Furthermore, in the specific case where not all segments contain nonzero blocks, the performance improvement can be interpreted as a gain in “effective SNR” in noisy environments.


2015 International Conference on Estimation, Detection and Information Fusion (ICEDIF) | 2015

Blind Spectrum Sensing with low rank and sparse model

Xushan Chen; Xiongwei Zhang; Jibin Yang; Meng Sun; Xinwei Zhang

Spectrum sensing, which detects spectrum holes in order to raise the spectrum utilization ratio, is a cornerstone of cognitive radio. Traditional spectrum sensing detectors depend on some prior information or are restricted by low signal-to-noise ratio (SNR) and computational complexity in practical applications. A GoDec-based spectrum sensing detector is proposed by combining the covariance-based method with low-rank and sparse model theory. The proposed detector divides the received signal into two segments of equal length and then decomposes the covariance matrix of each segment by GoDec decomposition. The primary user exists if the difference between the low rank matrices is lower than a predefined threshold. Simulation results show that the proposed detector detects primary signals with high probability at SNRs as low as -14 dB.


IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2015

Cramer-Rao Bounds for Compressive Frequency Estimation

Xushan Chen; Xiongwei Zhang; Jibin Yang; Meng Sun; Weiwei Yang

Collaboration


Dive into Jibin Yang's collaborations.

Top Co-Authors

Xiongwei Zhang

University of Science and Technology

Meng Sun

University of Science and Technology

Xushan Chen

University of Science and Technology

Gang Min

University of Science and Technology

Lin Qiao

University of Science and Technology

Wei Han

University of Science and Technology

Xia Zou

University of Science and Technology

Li Zeng

University of Science and Technology

Yinan Li

University of Science and Technology
