Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Huijun Ding is active.

Publication


Featured research published by Huijun Ding.


Speech Communication | 2009

A spectral filtering method based on hybrid wiener filters for speech enhancement

Huijun Ding; Ing Yann Soon; Soo Ngee Koh; Chai Kiat Yeo

It is well known that speech enhancement using spectral filtering results in residual noise. Residual noise that is musical in nature is very annoying to human listeners. Many speech enhancement approaches assume that the transform coefficients are independent of one another and can thus be attenuated separately, thereby ignoring the correlations that exist between different time frames and within each frame. This paper proposes a single-channel speech enhancement system which exploits such correlations between different time frames to further reduce residual noise. Unlike other 2D speech enhancement techniques, which apply a post-processor after some classical algorithm such as spectral subtraction, the proposed approach uses a hybrid Wiener spectrogram filter (HWSF) for effective noise reduction, followed by a multi-blade post-processor which exploits the 2D features of the spectrogram to preserve speech quality and further reduce residual noise. This results in pleasant-sounding speech for human listeners. Spectrogram comparisons show that the proposed scheme significantly reduces musical noise. The effectiveness of the proposed algorithm is further confirmed through objective assessments and informal subjective listening tests.
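The paper's hybrid Wiener spectrogram filter is not spelled out in the abstract, but the classical per-bin Wiener gain that such spectral filters build on can be sketched as follows (a minimal NumPy illustration with hypothetical function names; the HWSF's inter-frame processing and the multi-blade post-processor are omitted):

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=1e-10):
    """Classical per-bin Wiener gain G = SNR / (1 + SNR), with the
    a priori SNR crudely estimated by power subtraction."""
    snr = np.maximum(noisy_power - noise_power, 0.0) / np.maximum(noise_power, floor)
    return snr / (1.0 + snr)

def enhance_spectrogram(noisy_mag, noise_mag):
    """Attenuate each time-frequency bin of a magnitude spectrogram
    independently; this independence is exactly the assumption the
    paper's 2D processing improves on."""
    return wiener_gain(noisy_mag ** 2, noise_mag ** 2) * noisy_mag
```

Bins with high estimated SNR pass almost unchanged (gain near 1), while noise-dominated bins are strongly attenuated; the residual "musical noise" the paper targets comes from the random bin-to-bin fluctuation of this gain.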


IEEE Transactions on Audio, Speech, and Language Processing | 2011

A DCT-Based Speech Enhancement System With Pitch Synchronous Analysis

Huijun Ding; Ing Yann Soon; Chai Kiat Yeo

The discrete cosine transform (DCT) has been proven to be a good approximation to the Karhunen-Loeve transform (KLT) and has properties similar to the discrete Fourier transform (DFT). It also possesses better energy compaction, which is advantageous for speech enhancement. However, frame-to-frame variations of DCT coefficients can be observed even for a perfectly stationary signal. Therefore, a DCT-based speech enhancement system with pitch synchronous analysis is proposed to overcome this problem. It reduces the drawbacks of a fixed window shift: the amount of shift in the analysis window is now based on the pitch period, thus increasing the inter-frame similarities. Furthermore, a Wiener filter using the a priori signal-to-noise ratio (SNR) with an adaptive parameter is also derived and implemented as an advanced noise reduction filter. The proposed speech enhancement system is evaluated in terms of several objective measures, and the experimental results demonstrate its good performance.
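The pitch-synchronous idea, shifting the analysis window by one pitch period instead of a fixed hop, can be sketched with a simple autocorrelation pitch estimator (our illustration under assumed parameters; the paper's estimator and its DCT-domain filtering are not reproduced here):

```python
import numpy as np

def pitch_period(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch period in samples from the autocorrelation
    peak, searched within a typical speech F0 range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    return lo + int(np.argmax(ac[lo:hi]))

def pitch_sync_frames(x, fs, frame_len=512):
    """Pitch-synchronous framing: the hop equals the estimated pitch
    period, so successive analysis frames align with the glottal
    cycle, increasing inter-frame similarity of the coefficients."""
    frames, pos = [], 0
    while pos + frame_len <= len(x):
        frame = x[pos:pos + frame_len]
        frames.append(frame)
        pos += pitch_period(frame, fs)
    return frames
```

With a fixed hop, a stationary voiced sound is sampled at arbitrary phases of the pitch cycle, which is what causes the frame-to-frame coefficient variation the abstract mentions; hopping by one period keeps the phase aligned.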


IEEE Transactions on Audio, Speech, and Language Processing | 2010

Over-Attenuated Components Regeneration for Speech Enhancement

Huijun Ding; Ing Yann Soon; Chai Kiat Yeo

Despite the quality improvement of the speech signal with most traditional noise reduction (TNR) algorithms, the output is always distorted to some extent due to the over-attenuation of speech components. Weak speech components are usually regarded as noise in noise reduction processing and are therefore highly suppressed. In this paper, we propose a postprocessing technique which is based on the regeneration of both the voiced and unvoiced speech in the entire frequency domain to reduce this problem. A nonlinear transform is first applied to obtain the excitation signal, and a smooth envelope is then estimated. To utilize the information of the clean speech contained in the envelope, we combine the original TNR filter output with a weighted product of the excitation signal and the estimated envelope to generate the final synthesized speech. The synthesized speech is quite close to the clean speech and is more natural-sounding. Moreover, our algorithm can mask the residual musical noise effectively with the regenerated speech components. Experimental results demonstrate the excellent performance of our algorithm. In addition, we introduce two novel objective measures and further show the efficiency of our algorithm in maintaining the clean speech while reducing the noise as much as possible.
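As a toy sketch of the regeneration step (our simplification, not the paper's exact method), half-wave rectification can stand in for the nonlinear transform that recovers harmonic excitation, which is then shaped by a pre-estimated envelope gain and mixed back with a weight:

```python
import numpy as np

def regenerate(filtered, envelope_gain, alpha=0.3):
    """final = TNR output + alpha * envelope * excitation.
    Half-wave rectification (a simple nonlinear transform) of the
    filtered speech regenerates harmonics that over-attenuation has
    suppressed; alpha controls how much is mixed back."""
    excitation = np.maximum(filtered, 0.0)       # nonlinear transform
    excitation = excitation - excitation.mean()  # remove the DC offset
    return filtered + alpha * envelope_gain * excitation
```

The rectifier is chosen here only because it demonstrably creates new harmonics of the input; the paper's transform and its smooth envelope estimator are more elaborate.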


Speech Communication | 2015

Objective measures for quality assessment of noise-suppressed speech

Huijun Ding; Tan Lee; Ing Yann Soon; Chai Kiat Yeo; Peng Dai; Guo Dan

Highlights:
- Three objective measures are proposed to separately evaluate the speech distortion, noise reduction and overall quality of speech processed by single-channel speech enhancement algorithms.
- The proposed measures are derived in both the time and frequency domains.
- The high correlations between the results of the proposed measures and subjective ratings demonstrate the effectiveness of the proposed evaluation methodology.
- Hints about the links between signal parameters and perceptual judgements are revealed in our analysis of the proposed objective measures against subjective ratings.

Among all the existing objective measures, few are able to provide a clearly specific indication of speech distortion or noise reduction, which are the two key metrics for assessing the performance of speech enhancement algorithms and evaluating noise-suppressed speech quality. In this paper, new quantitative quality assessments are proposed to separately evaluate the capabilities of single-channel speech enhancement algorithms in terms of maintaining the clean speech, reducing the noise and overall performance. Based on these aspects, three evaluation results can be provided for any test speech signal by analyzing the residual signal, which is the difference between the clean speech and the processed speech. Several common speech enhancement algorithms are compared by these objective measures as well as subjective listening tests. High correlations between the scores of the objective measures and subjective ratings clearly show the effectiveness of the proposed evaluation methodologies on the different speech enhancement algorithms.
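The residual-signal idea can be sketched as follows (a hedged simplification with assumed frame size and threshold; the paper's actual measures are derived more carefully in both domains): score the residual r = clean - processed per frame, attributing errors in speech-active frames to distortion and errors in pauses to residual noise.

```python
import numpy as np

def residual_measures(clean, processed, frame=256, thresh=0.01):
    """Per-frame scoring of the residual r = clean - processed.
    Frames where the clean speech is active contribute to a
    speech-distortion score; silent frames contribute to a
    residual-noise score (both in dB, lower is better)."""
    r = clean - processed
    ref = np.mean(clean ** 2)
    sd, rn = [], []
    for i in range(len(clean) // frame):
        sl = slice(i * frame, (i + 1) * frame)
        e = np.mean(r[sl] ** 2)
        if np.mean(clean[sl] ** 2) > thresh * ref:
            sd.append(e)   # error committed on active speech
        else:
            rn.append(e)   # noise left over in speech pauses
    to_db = lambda v: 10.0 * np.log10(np.mean(v) + 1e-12)
    return to_db(sd or [0.0]), to_db(rn or [0.0])
```

A perfect enhancer gives a near-zero residual and hence very low scores on both axes; an enhancer that suppresses speech lifts the first score, one that leaves noise lifts the second.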


International Conference on Acoustics, Speech, and Signal Processing | 2009

A post-processing technique for regeneration of over-attenuated speech

Huijun Ding; Ing Yann Soon; Soo Ngee Koh; Chai Kiat Yeo

Despite the success of recent speech enhancement algorithms, the enhanced signals still suffer from undesirable speech distortion caused by over-attenuation of weak speech spectral components. In this paper, a post-processing technique based on the regeneration of both voiced and unvoiced speech is proposed to alleviate this problem. A non-linear transformation is first applied to a Wiener-filtered speech signal, and the transformed signal is multiplied by a pre-estimated spectral envelope to form the regenerated speech. The resulting speech is then obtained using a weighted combination of the regenerated speech components and the filtered speech. This process significantly improves the resulting speech quality as compared to the original filtered version; the result sounds less low-passed. Also, the residual musical noise is significantly masked by the regenerated speech components. Objective measures show that the quality of the resulting speech is much closer to the clean speech than the original Wiener-filtered speech.


Pattern Recognition Letters | 2017

Motion intent recognition of individual fingers based on mechanomyogram

Huijun Ding; Qing He; Lei Zeng; Yongjin Zhou; Minmin Shen; Guo Dan

Highlights:
- The motion intent of individual fingers is recognized based on the mechanomyogram.
- The mechanomyogram signal is captured by an inertial sensor.
- Time- and transform-domain feature sets are effectively applied for recognition.
- The effectiveness of single and combined feature sets is evaluated.
- A best average recognition rate of 95.20% is achieved.

The mechanomyogram (MMG) signals detected from the forearm muscle group contain abundant information which can be utilized to predict finger motion intention. Few works have been reported in this area, especially for the recognition of individual finger motions, which is nevertheless crucial for many applications such as prosthesis control. In this paper, an MMG-based finger gesture recognition system is designed to identify the motions of each finger. In this system, three kinds of feature sets, namely wavelet packet transform (WPT) coefficients, stationary wavelet transform (SWT) coefficients, and time and frequency domain hybrid (TFDH) features, are adopted and processed by a support vector machine (SVM) classifier. The experimental results show that the average recognition accuracies using the WPT, SWT and TFDH features are 91.64%, 94.31%, and 91.56%, respectively. Furthermore, an average rate of 95.20% is achieved when the three feature sets are combined in the proposed recognition system.
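The feature-then-classify pipeline can be sketched with a few common time-domain muscle-signal features and a nearest-centroid classifier (both are stand-ins of ours: the paper uses WPT/SWT/TFDH features and an SVM):

```python
import numpy as np

def td_features(x):
    """Common time-domain EMG/MMG features: mean absolute value,
    waveform length, and zero-crossing count. A small stand-in for
    the paper's richer TFDH feature set."""
    return np.array([
        np.mean(np.abs(x)),                # mean absolute value (MAV)
        np.sum(np.abs(np.diff(x))),        # waveform length
        np.sum(np.diff(np.sign(x)) != 0),  # zero crossings
    ])

class NearestCentroid:
    """Minimal classifier stand-in (the paper trains an SVM)."""
    def fit(self, X, y):
        y = np.asarray(y)
        self.labels = sorted(set(y.tolist()))
        self.centroids = {c: X[y == c].mean(axis=0) for c in self.labels}
        return self

    def predict(self, X):
        return [min(self.labels,
                    key=lambda c: np.linalg.norm(x - self.centroids[c]))
                for x in X]
```

Each window of MMG signal is reduced to a short feature vector, and the classifier maps that vector to a finger label; swapping in an SVM changes only the last stage.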


Frontiers in Neurology | 2017

An Individual Finger Gesture Recognition System Based on Motion-Intent Analysis Using Mechanomyogram Signal

Huijun Ding; Qing He; Yongjin Zhou; Guo Dan; Song Cui

Motion-intent-based finger gesture recognition systems are crucial for many applications such as prosthesis control, sign language recognition, wearable rehabilitation systems, and human-computer interaction. In this article, a motion-intent-based finger gesture recognition system is designed to correctly identify the tapping of every finger for the first time. Two auto-event annotation algorithms are first applied and evaluated for detecting the finger tapping frames. Based on the truncated signals, wavelet packet transform (WPT) coefficients are calculated and compressed to form the features, followed by a feature selection method that improves performance by optimizing the feature set. Finally, three popular classifiers, the naive Bayes classifier (NBC), K-nearest neighbor (KNN), and support vector machine (SVM), are applied and evaluated. Recognition accuracy of up to 94% is achieved. The design and architecture of the system are presented with full system characterization results.
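One plausible form of auto-event annotation is a short-time energy detector (our sketch only, with assumed window and threshold; the paper's two algorithms are not specified in the abstract): flag frames whose energy exceeds a multiple of the median frame energy and report the start of each run as a tap onset.

```python
import numpy as np

def detect_taps(x, fs, win_ms=50, k=3.0):
    """Energy-based event annotation: compute short-time energy per
    non-overlapping window, mark windows above k times the median
    energy as active, and return the sample index at which each
    contiguous active run begins."""
    win = int(fs * win_ms / 1000)
    nfr = len(x) // win
    e = np.array([np.mean(x[i * win:(i + 1) * win] ** 2) for i in range(nfr)])
    active = e > k * np.median(e)
    return [i * win for i in range(nfr)
            if active[i] and (i == 0 or not active[i - 1])]
```

The detected onsets would then delimit the truncated signal segments from which the WPT features are extracted.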


International Symposium on Chinese Spoken Language Processing | 2016

The correlation between signal distance and consonant pronunciation in Mandarin words

Huijun Ding; Chenxi Xie; Lei Zeng; Yang Xu; Guo Dan

In spoken Mandarin, some consonant and vowel pairs are hard to distinguish and pronounce clearly, even for some native speakers. This study investigates the signal distance between consonants compared in pairs, from the signal processing point of view, to reveal the correlation between signal distance and consonant pronunciation. Several popular objective speech quality measures are innovatively applied to obtain the signal distance. The experimental results show that the confusable pair /l/-/n/ does have the shortest signal distance compared with other consonants followed by the same vowel and lexical tone. The finding suggests that signal distance is able to evaluate the degree of confusion of a given consonant/vowel pair in a numerical and quantitative way. The signal distances of the other consonant/vowel pairs will be explored in future work.
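One of the popular objective measures that can be repurposed as a "signal distance" is the log-spectral distance; a minimal whole-signal version is sketched below (our simplification; the study's exact measures and framing are not given in the abstract):

```python
import numpy as np

def log_spectral_distance(x, y, nfft=512):
    """Root-mean-square difference between the log-magnitude spectra
    of two signals, in dB. Smaller values mean the two recordings
    are spectrally closer, i.e. more confusable."""
    X = np.abs(np.fft.rfft(x, nfft)) + 1e-12
    Y = np.abs(np.fft.rfft(y, nfft)) + 1e-12
    return float(np.sqrt(np.mean((20 * np.log10(X) - 20 * np.log10(Y)) ** 2)))
```

Applied to recordings of, say, /la/ versus /na/, a small distance would quantify numerically what listeners report as a confusable pair.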


Signal Processing | 2015

2D Psychoacoustic modeling of equivalent masking for automatic speech recognition

Peng Dai; Frank Rudzicz; Ing Yann Soon; Alex Mihailidis; Huijun Ding

Noise robustness has long been one of the most important goals in speech recognition. While the performance of automatic speech recognition (ASR) deteriorates in noisy situations, the human auditory system is relatively adept at handling noise. To mimic this adeptness, we study and apply psychoacoustic models in speech recognition as a means to improve the robustness of ASR systems. Psychoacoustic models are usually implemented in a subtractive manner with the intention of removing noise. However, this is not the only possible approach. This paper presents a novel algorithm which implements psychoacoustic models additively. The algorithm is motivated by the fact that weak sound elements below the masking threshold are perceptually equivalent to the human auditory system, regardless of their actual sound pressure level. Another important contribution of the proposed algorithm is a superior implementation of the masking effect: only those sounds that fall below the masking threshold are modified, which better reflects physical masking effects. We give detailed experimental results showing the relationships between the subtractive and additive approaches. Since all the parameters of the proposed filters are positive or zero, they are named 2D psychoacoustic P-filters. A detailed theoretical analysis is provided to show the noise removal ability of these filters. Experiments are carried out on the AURORA2 database, and the results show that the word recognition rate using our proposed feature extraction method is effectively increased. Given models trained with clean speech, our proposed method achieves up to 84.23% word recognition on noisy data.

Highlights:
- Modeling of the human auditory system.
- A 2D psychoacoustic model based on equivalent masking.
- Comparison of different implementation styles of psychoacoustic models.
- A unified mathematical model combining different conditions.
- Detailed analysis of the algorithm's performance.
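The additive idea can be illustrated with a toy one-dimensional example (entirely our simplification: the kernel, offset, and dB-domain smearing are assumed, and real models spread over both time and frequency on an auditory scale, hence "2D"): components below the masking threshold are inaudible, so raising them to the threshold makes weak components identical whether or not noise altered them.

```python
import numpy as np

def masking_threshold(spec_db, offset_db=12.0):
    """Toy masking threshold: smear the spectrum (in dB) across
    neighbouring bins with a nonnegative kernel (all parameters
    positive or zero, the defining property of a 'P-filter'),
    then drop by a fixed offset."""
    kernel = np.array([0.05, 0.2, 0.5, 0.2, 0.05])
    smeared = np.convolve(spec_db, kernel / kernel.sum(), mode="same")
    return smeared - offset_db

def additive_masking(spec_db, thr_db):
    """Additive implementation: only components below the masking
    threshold are modified (raised to it); audible components pass
    through untouched."""
    return np.maximum(spec_db, thr_db)
```

Contrast with the subtractive style, which would attenuate sub-threshold components toward zero; the additive style instead normalizes them to a common, inaudible level.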


International Symposium on Chinese Spoken Language Processing | 2012

Two objective measures for speech distortion and noise reduction evaluation of enhanced speech signals

Huijun Ding; Tan Lee; Ing Yann Soon

Although speech distortion and noise reduction are two key metrics for evaluating enhanced speech quality, few existing objective measures are able to give a clear, specific indication of either. In this paper, two objective measurement tools are proposed to separately evaluate the capability of a speech enhancement filter in terms of recovering the clean speech and reducing the noise. Several common speech enhancement algorithms are evaluated by these objective measures as well as subjective listening tests. Correlations between the objective and subjective results clearly show the effectiveness of the proposed measures in evaluating the quality of enhanced speech signals.

Collaboration


Dive into Huijun Ding's collaborations.

Top Co-Authors


Ing Yann Soon

Nanyang Technological University


Chai Kiat Yeo

Nanyang Technological University


Soo Ngee Koh

Nanyang Technological University


Zhexiao Guo

University of Konstanz


Peng Dai

Nanyang Technological University
