Publication


Featured research published by Jiqing Han.


international conference on acoustics, speech, and signal processing | 2012

Sparse power spectrum based robust voice activity detector

Datao You; Jiqing Han; Guibin Zheng; Tieran Zheng

This paper presents a robust approach to improving the performance of a voice activity detector (VAD) in low signal-to-noise ratio (SNR) environments. To this end, we first generate sparse representations by Bregman iteration based sparse decomposition with a learned over-complete dictionary, and derive an audio feature called the sparse power spectrum from the sparse representations. We then propose a method to calculate the short-segment average spectrum and the long-segment average spectrum from the sparse power spectrum. Finally, we design a criterion to detect speech and non-speech regions based on these average spectra. Experiments show that the proposed approach further improves the performance of VAD in low-SNR noisy environments.
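The abstract does not state the decision criterion, so the following is only a minimal sketch of the short-segment versus long-segment idea. It assumes a nonnegative per-frame power spectrum is already available (any spectrogram stands in for the sparse power spectrum here), and the rule "short-segment average exceeds `ratio` times the long-segment average" as well as the names `moving_average`, `detect_speech`, `short_width`, `long_width`, and `ratio` are hypothetical.

```python
import numpy as np

def moving_average(x, width):
    """Centered moving average with edge padding; width is assumed to be odd."""
    pad = width // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(width) / width
    # 'valid' on the padded signal returns exactly len(x) samples.
    return np.convolve(padded, kernel, mode="valid")

def detect_speech(power_spectrum, short_width=3, long_width=81, ratio=1.5):
    """Flag frames whose short-segment average stands out from the long-segment one.

    power_spectrum: array of shape (n_frames, n_bins), nonnegative.
    Returns a boolean array of shape (n_frames,).
    """
    frame_energy = power_spectrum.mean(axis=1)
    short_avg = moving_average(frame_energy, short_width)
    long_avg = moving_average(frame_energy, long_width)
    # Hypothetical criterion, not the one from the paper.
    return short_avg > ratio * long_avg
```

In this sketch the long window needs to be noticeably longer than a typical speech burst; otherwise the long-segment average is dominated by the speech itself and the comparison fails.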


international conference on acoustics, speech, and signal processing | 2012

A solution to residual noise in speech denoising with sparse representation

Yongjun He; Jiqing Han; Shiwen Deng; Tieran Zheng; Guibin Zheng

As a promising technique, sparse representation has been extensively investigated in the signal processing community. Recently, sparse representation has been widely used for speech processing in noisy environments; however, many problems remain to be solved because of the particular nature of speech. One assumption for speech denoising with sparse representation is that the representation of speech over the dictionary is sparse, while that of the noise is dense. Unfortunately, this assumption does not hold in the speech denoising scenario. We find that many noises, e.g., babble and white noise, are also sparse over a dictionary trained with clean speech, resulting in severe residual noise in sparse enhancement. To solve this problem, we propose a novel residual noise reduction (RNR) method which first finds the atoms which represent the noise sparsely, and then ignores them in the reconstruction of speech. Experimental results show that the proposed method can reduce residual noise substantially.
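The two steps described above (find the atoms that represent noise, then ignore them during reconstruction) can be sketched as follows. This is an illustration under stated assumptions, not the RNR method itself: the abstract does not say how noise atoms are identified, so the rule used here (mean absolute coefficient on noise-only frames above a `threshold`) and the function names are hypothetical.

```python
import numpy as np

def find_noise_atoms(noise_codes, threshold):
    """Return a boolean mask over atoms that are strongly used by noise-only frames.

    noise_codes: array of shape (n_atoms, n_noise_frames) holding sparse codes
    of noise-only frames over the speech dictionary.
    """
    activity = np.abs(noise_codes).mean(axis=1)
    # Hypothetical selection rule; the paper's rule is not given in the abstract.
    return activity > threshold

def reconstruct_without(dictionary, codes, ignore_mask):
    """Reconstruct frames while ignoring the masked atoms.

    dictionary: (n_features, n_atoms); codes: (n_atoms, n_frames).
    """
    kept = codes.copy()
    kept[ignore_mask, :] = 0.0  # drop the contribution of the ignored atoms
    return dictionary @ kept
```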


international conference on acoustics, speech, and signal processing | 2011

A modified MAP criterion based on hidden Markov model for voice activity detection

Shiwen Deng; Jiqing Han; Tieran Zheng; Guibin Zheng

The maximum a posteriori (MAP) criterion is broadly used in statistical model-based voice activity detection (VAD) approaches. In the conventional MAP criterion, however, the inter-frame correlation of the voice activity is not taken into consideration. In this paper, we propose a novel modified MAP criterion based on a two-state hidden Markov model (HMM) to improve the performance of the VAD, in which the inter-frame correlation of the voice activity is modeled. With the proposed MAP criterion, the decision rule is derived by explicitly incorporating the a priori, a posteriori, and inter-frame correlation information into the likelihood ratio test (LRT). In the LRT, a compensation factor for the hypothesis of speech presence is used to regulate the trade-off between the probability of detection and the false alarm probability. Experimental results show the superiority of the VAD algorithm based on the proposed MAP criterion in comparison with that based on the recent conditional MAP criterion (CMAP) under various noise conditions.
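To illustrate how a two-state HMM lets the previous frame influence the current decision, here is a minimal sketch of a forward recursion over the states speech and noise. It is not the decision rule derived in the paper and omits the compensation factor; the per-frame likelihood ratios are assumed to be given, and the transition probabilities, prior, and threshold are hypothetical values.

```python
import numpy as np

def hmm_smoothed_vad(likelihood_ratios, p_stay_speech=0.9, p_stay_noise=0.9,
                     prior_speech=0.5, threshold=0.5):
    """Forward recursion for a two-state (speech/noise) HMM.

    likelihood_ratios[t] = p(x_t | speech) / p(x_t | noise).
    Returns (posteriors, decisions) with one entry per frame.
    """
    posteriors = np.empty(len(likelihood_ratios))
    p_speech = prior_speech
    for t, lr in enumerate(likelihood_ratios):
        if t > 0:
            # Prediction: propagate the previous posterior through the transitions.
            p_speech = p_stay_speech * p_speech + (1.0 - p_stay_noise) * (1.0 - p_speech)
        # Update: posterior odds equal prior odds times the likelihood ratio.
        weighted = p_speech * lr
        p_speech = weighted / (weighted + (1.0 - p_speech))
        posteriors[t] = p_speech
    return posteriors, posteriors > threshold
```

With likelihood ratios such as `[5, 5, 0.8, 5, 5]`, a frame-by-frame test against 1 would reject the weak middle frame, whereas the recursion keeps it as speech because its neighbours strongly indicate speech.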


Information Sciences | 2015

Dictionary evaluation and optimization for sparse coding based speech processing

Yongjun He; Deyun Chen; Guanglu Sun; Jiqing Han

Highlights: Consider the relationship between reconstruction error and sparseness degree and define comparable measures with more detailed information. Define measures at different angles, allowing for evaluation in different tasks. Our measures cover the representation, reconstruction, denoising and separation of speech. Put forward the optimization problem of a given dictionary and solve this problem by removing unimportant atoms and harmful atoms.

Recently, sparse coding has attracted considerable attention in speech processing. As a promising technique, sparse coding can be widely used for analysis, representation, compression, denoising and separation of speech. To represent signals accurately and sparsely, a good dictionary which contains elemental signals is preferred and many methods have been proposed to learn such a dictionary. However, there is a lack of reasonable evaluation methods to judge whether a dictionary is good enough. To solve this problem, we define a group of measures for dictionary evaluation. These measures not only address sparseness and reconstruction error of signal representation, but also consider denoising and separating performance. We show how to evaluate dictionaries with these measures, and further propose two methods to optimize dictionaries by improving relative measures. The first method improves the efficiency of sparse coding by removing unimportant atoms; the second one improves denoising performance of dictionaries by removing harmful atoms. Experimental results show that the measures can provide reasonable evaluations and the proposed methods for optimization can further improve given dictionaries.


Digital Signal Processing | 2015

Spectrum enhancement with sparse coding for robust speech recognition

Yongjun He; Guanglu Sun; Jiqing Han

Recently, a trend in speech recognition is to introduce sparse coding for noise robustness. Although several methods have been proposed, the performance of sparse coding in speech denoising is not yet satisfactory. One assumption with sparse coding is that the representation of speech over the speech dictionary is sparse, while that of the noise is dense. This assumption does not hold in the speech denoising scenario, because many noises are also sparse over the speech dictionary. In such a condition, the representation of noisy speech still contains noise components, resulting in degraded performance. To solve this problem, we first analyze the assumption of sparse coding and then propose a novel method to enhance the speech spectrum. This method first finds the atoms which represent the noise sparsely, and then selectively ignores them in the reconstruction of speech to reduce the residual noise. Speech features are then extracted from the enhanced spectrum for speech recognition. Experimental results show that the proposed method can improve the noise robustness of a speech recognition system substantially.


Neurocomputing | 2016

Optimization of learned dictionary for sparse coding in speech processing

Yongjun He; Guanglu Sun; Jiqing Han

As a promising technique, sparse coding has been widely used for the analysis, representation, compression, denoising and separation of speech. This technique needs a good dictionary which contains atoms to represent speech signals. Although many methods have been proposed to learn such a dictionary, there are still two problems. First, unimportant atoms bring a heavy computational load to sparse decomposition and reconstruction, which prevents sparse coding from being used in real-time applications. Second, in speech denoising and separation, harmful atoms contribute little or nothing to reducing the sparsity degree but increase the source confusion, resulting in severe distortions. To solve these two problems, we first analyze the inherent assumptions of sparse coding and show that distortion can be caused if the assumptions do not hold true. Next, we propose two methods to optimize a given dictionary by removing unimportant atoms and harmful atoms, respectively. Experiments show that the proposed methods can further improve the performance of dictionaries.

Highlights: Analyze the assumptions of sparse coding. Analyze the distortion of reconstructed signals in theory. Propose two optimization methods which improve a given dictionary by atom selection, rather than providing an improved method. Present several measures for dictionary evaluation.
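Optimizing a dictionary by atom selection can be sketched as below. The abstract does not define what makes an atom unimportant, so this example assumes a hypothetical importance measure (the total squared coefficient an atom receives over a set of training codes) and keeps only the `keep` highest-ranked atoms; `prune_dictionary` is an illustrative name.

```python
import numpy as np

def prune_dictionary(dictionary, training_codes, keep):
    """Keep the `keep` atoms with the largest accumulated coefficient energy.

    dictionary: (n_features, n_atoms); training_codes: (n_atoms, n_frames).
    Returns the pruned dictionary and the indices of the kept atoms.
    """
    # Hypothetical importance measure, not the one defined in the paper.
    energy = (training_codes ** 2).sum(axis=1)
    ranked = np.argsort(energy)[::-1]      # most used atoms first
    kept_idx = np.sort(ranked[:keep])      # restore the original atom order
    return dictionary[:, kept_idx], kept_idx
```

A smaller dictionary reduces the cost of each sparse decomposition, which is the motivation given above for removing unimportant atoms.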


Digital Signal Processing | 2014

A new framework for robust speech recognition in complex channel environments

Yongjun He; Jiqing Han; Tieran Zheng; Guanglu Sun

Channel distortion is one of the major factors which degrade the performance of automatic speech recognition (ASR) systems. Current compensation methods are generally based on the assumption that the channel distortion is a constant or slowly varying bias in an utterance or globally. However, this assumption does not hold in more complex circumstances, when the speech recordings being recognized come from many different unknown channels and have parts of the spectrum completely removed (e.g. band-limited speech). On the one hand, different channels may cause different distortions; on the other, the distortion caused by a given channel varies over the speech frames when parts of the speech spectrum are removed completely. As a result, the performance of the current methods is limited in complex environments. To solve this problem, we propose a unified framework in which the channel distortion is first divided into two subproblems, namely, spectrum missing and magnitude changing. Next, the two types of distortion are compensated with different techniques in two steps. In the first step, the speech bandwidth is detected for each utterance and the acoustic models are synthesized with clean models to compensate for spectrum missing. In the second step, the constant term of the distortion is estimated via the expectation-maximization (EM) algorithm and subtracted from the means of the synthesized model to further compensate for magnitude changing. Several databases are chosen to evaluate the proposed framework. The speech in these databases is recorded in different channels, including various microphones and band-limited channels. Moreover, to simulate more types of spectrum missing, various low-pass and band-pass filters are used to process the speech from the chosen databases. Although these databases and their filtered versions make the channel conditions more challenging for recognition, experimental results show that the proposed framework can substantially improve the performance of ASR systems in complex channel environments.


international conference on intelligent control and information processing | 2011

Mandarin keyword spotting using syllable based confidence features and SVM

Haiyang Li; Jiqing Han; Tieran Zheng; Guibin Zheng

A confidence measure (CM) method using syllable-based confidence features is proposed to improve false-alarm rejection in Mandarin keyword spotting (KWS). The features take advantage of the Mandarin syllable structure and describe the confidence at every sub-syllable level. The evaluation is carried out with a support vector machine (SVM) on a telephone speech database. Compared with the typical method, the experimental results show that the proposed CM features and SVM-based method yield a significant improvement, with a reduction of 12.13% in equal error rate (EER) at best.


international conference on acoustics, speech, and signal processing | 2011

Compensation of partly reliable components for band-limited speech recognition with missing data techniques

Yongjun He; Jiqing Han; Tieran Zheng; Guibin Zheng

Mismatch in speech bandwidth between training and real operation greatly degrades the performance of automatic speech recognition (ASR) systems. The missing feature technique (MFT) is effective in handling bandwidth mismatch. However, current MFT-based methods ignore the mismatch in the filterbank channels which cover the upper and lower cutoff frequencies. To solve this problem, we propose to partition the feature into reliable, unreliable and partly reliable parts, and then modify the probability density functions (PDFs) of the partly reliable part to match band-limited features. Experiments show that such compensation further improves the performance of MFT-based methods under band-limited conditions.
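The three-way partition described above can be illustrated with a small sketch. It assumes each filterbank channel is summarized by its lower and upper edge frequency in Hz and that the passband limits `low_cut` and `high_cut` are already known; the function name and the edge-based rule are illustrative assumptions, and the PDF modification for the partly reliable part is not shown.

```python
def classify_channels(channel_edges, low_cut, high_cut):
    """Label each filterbank channel relative to the passband [low_cut, high_cut].

    channel_edges: iterable of (lower_hz, upper_hz) pairs, one per channel.
    """
    labels = []
    for lower, upper in channel_edges:
        if lower >= low_cut and upper <= high_cut:
            labels.append("reliable")          # entirely inside the passband
        elif upper <= low_cut or lower >= high_cut:
            labels.append("unreliable")        # entirely outside the passband
        else:
            labels.append("partly reliable")   # straddles a cutoff frequency
    return labels
```

For a 300 to 3400 Hz passband, a channel spanning 250 to 500 Hz straddles the lower cutoff and is therefore labeled partly reliable in this sketch.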


international conference on acoustics, speech, and signal processing | 2013

Upper and lower bounds for approximation of the Kullback-Leibler divergence between Hidden Markov models

Haiyang Li; Jiqing Han; Tieran Zheng; Guibin Zheng

The Kullback-Leibler (KL) divergence is often used to compare the similarity of two hidden Markov models (HMMs). However, there is no closed-form expression for computing the KL divergence between HMMs, and it can only be approximated. In this paper, we propose two novel methods for approximating the KL divergence between left-to-right transient HMMs. The first method is a product approximation which can be calculated recursively without introducing extra parameters. The second method is based on upper and lower bounds of the KL divergence, and the mean of these bounds provides a usable approximation of the divergence. We demonstrate the effectiveness of the proposed methods through experiments including the deviations from the numerical approximation and the task of predicting the confusability of phone pairs. Experimental results show that the proposed product approximation is comparable with the current variational approximation, and the proposed approximation based on bounds performs better than current methods in the experiments.
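For reference, the quantity being approximated and the bound-based estimate can be written as follows. The symbols are introduced here for illustration and are not taken from the paper: f and g denote the distributions over observation sequences x defined by the two HMMs, and the specific forms of the bounds are not reproduced.

```latex
D_{\mathrm{KL}}(f \,\|\, g) = \int f(x) \log \frac{f(x)}{g(x)} \, dx,
\qquad
D_{\mathrm{lower}} \le D_{\mathrm{KL}}(f \,\|\, g) \le D_{\mathrm{upper}},
\qquad
\hat{D} = \tfrac{1}{2} \left( D_{\mathrm{lower}} + D_{\mathrm{upper}} \right)
```

Whatever the bounds are, the midpoint estimate cannot deviate from the true divergence by more than half the gap between them.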

Collaboration


Dive into Jiqing Han's collaborations.

Top Co-Authors

Tieran Zheng

Harbin Institute of Technology

Guibin Zheng

Harbin Institute of Technology

Haiyang Li

Harbin Institute of Technology

Yongjun He

Harbin University of Science and Technology

Guanglu Sun

Harbin University of Science and Technology

Hao Yuan

Harbin Institute of Technology

Shiwen Deng

Harbin Institute of Technology

Datao You

Harbin Institute of Technology

Deyun Chen

Harbin University of Science and Technology

Tao Jiang

Harbin Institute of Technology
