Shenghui Zhao

Beijing Institute of Technology

Publication

Featured research published by Shenghui Zhao.

international conference on communications | 2008

An improved speech playout buffering algorithm based on a new version of E-Model in VoIP

Zhongbo Li; Shenghui Zhao; Xiang Xie; Jingming Kuang

In voice over IP (VoIP) applications, playout buffering algorithms that trade off delay against loss can be used to alleviate the effect of jitter. In the past, most buffering algorithms aimed not to improve the perceived speech quality directly but to reduce the buffer delay and the packet loss rate. A quality-driven approach was later proposed, which uses a quality model to control the playout buffer so as to maximize the mean opinion score (MOS) as a function of delay and loss. However, that method applies only under random loss conditions. This paper therefore proposes an improved quality-driven approach that handles bursty loss conditions. For this purpose, we make use of the latest version of the ITU-T E-model to incorporate the effects of loss burstiness in the perceived quality. The experimental results show that the proposed method can achieve an "optimum" perceived speech quality and reduce the bursty loss simultaneously.
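The quality-driven idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the R-to-MOS mapping and burst-aware loss term of ITU-T G.107, a simplified delay impairment (the Cole-Rosenbluth approximation), a base rating of 93.2, and placeholder codec values `ie` and `bpl`. The function names and the candidate-deadline search are hypothetical.

```python
def r_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def delay_impairment(delay_ms):
    """Simplified delay impairment Id (assumed approximation, not full G.107)."""
    extra = max(0.0, delay_ms - 177.3)
    return 0.024 * delay_ms + 0.11 * extra


def loss_impairment(loss_pct, burst_r=1.0, ie=0.0, bpl=25.1):
    """Effective equipment impairment Ie-eff with the G.107 BurstR term.

    ie and bpl are codec-specific; the defaults are placeholders.
    burst_r = 1 means random loss, burst_r > 1 means bursty loss.
    """
    if loss_pct == 0:
        return ie
    return ie + (95.0 - ie) * loss_pct / (loss_pct / burst_r + bpl)


def select_playout_delay(arrival_delays_ms, candidates_ms, burst_r=1.0):
    """Pick the candidate playout deadline with the highest estimated MOS.

    Packets arriving after the deadline are counted as lost (late loss).
    Returns a (deadline_ms, estimated_mos) tuple.
    """
    best = None
    for deadline in candidates_ms:
        late = sum(1 for d in arrival_delays_ms if d > deadline)
        late_pct = 100.0 * late / len(arrival_delays_ms)
        r = 93.2 - delay_impairment(deadline) - loss_impairment(late_pct, burst_r)
        mos = r_to_mos(r)
        if best is None or mos > best[1]:
            best = (deadline, mos)
    return best
```

A short deadline discards many late packets, while a long one adds delay impairment; the search settles on the deadline where the estimated MOS peaks. Raising `burst_r` above 1 increases the loss penalty for the same loss rate, which is how burstiness enters the quality estimate in this sketch.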

Eurasip Journal on Audio, Speech, and Music Processing | 2013

Context-based adaptive arithmetic coding in time and frequency domain for the lossless compression of audio coding parameters at variable rate

Jing Wang; Xuan Ji; Shenghui Zhao; Xiang Xie; Jingming Kuang

This paper presents a novel lossless compression technique based on context-based adaptive arithmetic coding, which can be used to further compress the quantized parameters in an audio codec. The key feature of the new technique is the combination of context models in the time domain and the frequency domain, called the time-frequency context model. It is used for the lossless compression of audio coding parameters such as the quantized modified discrete cosine transform (MDCT) coefficients and the frequency band gains in the ITU-T G.719 audio codec. With the proposed adaptive arithmetic coding, a high degree of adaptation and redundancy reduction can be achieved. In addition, an efficient variable rate algorithm is employed, which is designed based on both the baseline entropy coding method of G.719 and the proposed adaptive arithmetic coding technique. Experiments show that the proposed technique is more efficient than conventional Huffman coding and common adaptive arithmetic coding when used in the lossless compression of audio coding parameters. For a set of audio samples used in the G.719 application, the proposed technique achieves an average bit rate saving of 7.2% in the low bit rate coding mode while producing audio quality equal to that of the original G.719.
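To make the time-frequency context idea concrete, the sketch below estimates the ideal code length of an adaptive model whose context for each symbol is the pair (same bin in the previous frame, previous bin in the current frame). It is an illustration under assumptions, not the paper's coder: it omits the arithmetic coder itself and only sums -log2 p, and the class name, function name, and context layout are hypothetical.

```python
import math
from collections import defaultdict


class AdaptiveContextModel:
    """Adaptive symbol counts per context; reports ideal code length in bits."""

    def __init__(self, alphabet_size):
        # Every symbol starts with count 1 in every context, so no
        # probability is ever zero.
        self.counts = defaultdict(lambda: [1] * alphabet_size)

    def code_length(self, context, symbol):
        """Return -log2 p(symbol | context), then update the counts."""
        table = self.counts[context]
        bits = -math.log2(table[symbol] / sum(table))
        table[symbol] += 1
        return bits


def total_bits(frames, alphabet_size, use_context):
    """Sum the ideal code length over all frames of quantized symbols."""
    model = AdaptiveContextModel(alphabet_size)
    bits = 0.0
    prev_frame = None
    for frame in frames:
        for k, symbol in enumerate(frame):
            if use_context:
                # -1 marks a neighbor that does not exist yet.
                time_neighbor = prev_frame[k] if prev_frame is not None else -1
                freq_neighbor = frame[k - 1] if k > 0 else -1
                context = (time_neighbor, freq_neighbor)
            else:
                context = ()  # single shared table: plain adaptive model
            bits += model.code_length(context, symbol)
        prev_frame = frame
    return bits
```

When successive frames are strongly correlated, the context-conditioned counts become sharply peaked and the summed code length falls well below that of the context-free adaptive model, which is the redundancy such a context model is meant to exploit.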

international conference on model transformation | 2011

FEC-based packet loss recovery for AVS-M audio codec

Jianli Liu; Shenghui Zhao; Jing Wang; Jingming Kuang

In this paper, we utilize sender-based Forward Error Correction (FEC) techniques to enhance the robustness of packet loss recovery for the AVS Mobile speech and audio (AVS-M) codec. Two FEC schemes are proposed which take advantage of the codec's structural characteristics and do not introduce extra delay. The objective and subjective listening test results show that the two methods achieve higher reconstructed quality than the codec's original frame erasure scheme in the case of packet loss.

international conference on communications | 2009

Analytical and Experimental Comparison of Packet Loss Recovery Methods Based on AMR-WB for VoIP

Zhongbo Li; Stefan Bruhn; Shenghui Zhao; Jingming Kuang

Forward error control (FEC) and multiple description coding (MDC) are two classical techniques to resist packet loss for Voice over IP (VoIP). The AMR-WB codec has been standardized for wideband speech conversational applications and has wide potential application in the migration of wireless or wired networks toward a single converged IP network. However, how to choose the optimal FEC or MDC scheme for AMR-WB under different loss rate conditions remains an unexplored question. In this paper, we compare the performance of different FEC and MDC techniques for the AMR-WB codec both analytically and experimentally. Based on the comparison results, some practical configurations of FEC and MDC for the AMR-WB codec are obtained.

international conference on conceptual structures | 2008

Non-intrusive objective speech quality measurement based on GMM and SVR for narrowband and wideband speech

Jing Wang; Juan Luo; Shenghui Zhao; Jingming Kuang

A non-intrusive objective measurement for estimating the quality of output speech without input clean speech is proposed for both narrowband and wideband speech based on Gaussian mixture model (GMM) and support vector regression (SVR). Perceptual linear predictive (PLP) features are extracted and clustered by GMM as an artificial reference model from clean speech. Input speech is separated into three classes, for which the consistency measures between features of the test speech signal and the pre-trained GMM reference model are calculated and mapped to an objective speech quality score using SVR method. Based on the three narrowband and two wideband MOS (mean opinion score) labeled test databases, the correlation degree between subjective MOS and objective MOS is analyzed. Experiment results show that the proposed method is an effective technique and performs better than ITU-T P.563 and MNLR (multivariate non-linear regression) method for most of the test conditions.

Speech Communication | 2012

Comparison and optimization of packet loss recovery methods based on AMR-WB for VoIP

Zhongbo Li; Shenghui Zhao; Stefan Bruhn; Jing Wang; Jingming Kuang

AMR-WB codec, which has been standardized for wideband speech conversational applications, has a broad range of potential applications in the migration of wireless and wireline networks towards a single converged IP network. Forward error control (FEC) and multiple description coding (MDC) are two promising techniques to make the transmission robust against packet loss in Voice over IP (VoIP). However, how to achieve the optimal reconstructed speech quality with these methods for AMR-WB under different packet loss rate conditions is still an open problem. In this paper, we compare the performance of various FEC and MDC schemes for the AMR-WB codec both analytically and experimentally. Based on the comparison results, some advantageous configurations of FEC and MDC for the AMR-WB codec are obtained, and hence an optimization system is proposed by selecting the optimal packet loss recovery scheme in accordance with the variable network conditions. Subjective AB test results show that the optimization can lead to obvious improvements of the perceived speech quality in the IP environment.

international congress on image and signal processing | 2010

An improved non-intrusive objective speech quality evaluation based on FGMM and FNN

Jing Wang; Ying Zhang; Yuling Song; Shenghui Zhao; Jingming Kuang

An improved non-intrusive objective speech quality evaluation method is proposed based on Fuzzy Gaussian Mixture Model (FGMM) and Fuzzy Neural Network (FNN). The degraded speech is separated into three classes (unvoiced, voiced and silence), then for each class the consistency measurement between Perceptual Linear Predictive (PLP) features of the degraded speech and the pre-trained FGMM reference model is calculated and mapped to an objective speech quality score using FNN mapping method. The proposed method performs better than the previous work using GMM and ITU-T P.563 under the test conditions used in this paper.

international symposium on chinese spoken language processing | 2008

A CSI and Rate-Distortion Based Packet Loss Recovery Algorithm for VoIP

Zhongbo Li; Shenghui Zhao; Jing Wang; Jingming Kuang

Packet loss greatly affects the received speech quality in VoIP (voice over Internet Protocol) applications. PCM (pulse code modulation) coders are often used for VoIP and also face this problem. In this paper, a packet loss recovery algorithm is proposed based on CSI (coding with side information) and rate-distortion optimization. Two virtual reconstruction methods are implemented for every voiced packet with side information at the sender endpoint. Rate-distortion optimization is used to select the recovery method with the lowest cost, considering both distortion and rate. At the receiver endpoint, lost packets are recovered with the selected recovery method. Simulation results show that the proposed method dramatically improves the speech quality over IP networks at the cost of only a small amount of additional bandwidth.
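The selection step described in this abstract follows the usual rate-distortion pattern of minimizing a cost J = D + λR over the candidate methods. The sketch below shows only that pattern; the abstract does not state the distortion measure, the side-information sizes, or the multiplier, so the field names, candidate names, and numbers here are hypothetical.

```python
def select_recovery_method(candidates, lagrange_multiplier):
    """Return the candidate with the smallest cost J = D + lambda * R.

    Each candidate is a dict with hypothetical keys:
      "name"        label of the recovery method
      "distortion"  distortion of the virtual reconstruction at the sender
      "rate_bits"   side-information bits the method needs per packet
    """
    def cost(candidate):
        return candidate["distortion"] + lagrange_multiplier * candidate["rate_bits"]

    return min(candidates, key=cost)


# Illustrative values only, not taken from the paper.
methods = [
    {"name": "method_a", "distortion": 4.0, "rate_bits": 8},
    {"name": "method_b", "distortion": 1.0, "rate_bits": 40},
]
cheap_rate_choice = select_recovery_method(methods, lagrange_multiplier=0.05)
costly_rate_choice = select_recovery_method(methods, lagrange_multiplier=0.2)
```

A small multiplier favors the method with lower distortion even if it needs more side information, while a larger multiplier shifts the choice toward the method that spends fewer bits.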

international congress on image and signal processing | 2012

Pitch prediction in frequency domain for ITU-T G.719 audio codec

Linlin Jiang; Shenghui Zhao; Jing Wang; Jingming Kuang

Pitch prediction in time domain is of great importance to improve the coding quality of speech. In this paper, we propose a method of frequency-domain pitch prediction for the full band ITU-T G.719 codec. The results of both the objective and the subjective evaluation show that the proposed method achieves higher reconstructed quality for the speech signal than the original method, especially for the lower coding rates.

international conference on model transformation | 2011

Speech visualization of some Mandarin compound-finals

Shenghui Zhao; Na Huang; Jingyu Yan; Jingming Kuang

On the basis of the study of some compound finals in Chinese Mandarin, we propose a novel method for their speech visualization, which depends on the trajectory trends of the first three formants F1, F2 and F3, and the relative relationships of these formants among different compound finals. The results of the subjective test show that a high recognition rate is achieved with the method.
