
Publications

Featured research published by Vladimir Cuperman.


IEEE Transactions on Speech and Audio Processing | 1993

Efficient search and design procedures for robust multi-stage VQ of LPC parameters for 4 kb/s speech coding

Wilf P. LeBlanc; B. Bhattacharya; Samy A. Mahmoud; Vladimir Cuperman

A tree-searched multistage vector quantization (VQ) scheme for linear prediction coding (LPC) parameters is presented which achieves spectral distortion lower than 1 dB with low complexity and good robustness at rates as low as 22 b/frame. The M-L search is used, and it is shown to achieve performance close to that of the optimal search for a relatively small M. A joint codebook design strategy for multistage VQ is presented which improves convergence speed and the VQ performance measures. The best performance/complexity tradeoffs are obtained with relatively small codebooks cascaded in a 3-6 stage configuration. It is shown experimentally that, as the number of stages is increased beyond the optimal performance/complexity tradeoff, the quantizer robustness and outlier performance can be improved at the expense of a slight increase in rate. Results for log area ratio (LAR) and line spectral pair (LSP) parameters are presented. A training technique that reduces outliers at the expense of a slight average performance degradation is introduced. The method significantly outperforms the split codebook approach.
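The M-L search described above can be sketched as follows. This is a hypothetical illustration, not the paper's exact configuration: codebook sizes, M, and the plain squared-error distortion are assumptions. Each stage extends every surviving path with its M closest codewords, then prunes back to the M paths with the lowest accumulated distortion.

```python
import numpy as np

def ml_multistage_vq(x, codebooks, M=8):
    """Tree-searched multistage VQ using an M-best (M-L) search.

    x         -- input vector (e.g. LPC parameters)
    codebooks -- list of (K_i, dim) arrays, one per stage
    Returns the chosen index per stage and the final residual energy.
    """
    paths = [(np.asarray(x, dtype=float), [])]  # (residual, chosen indices)
    for cb in codebooks:
        candidates = []
        for residual, idx in paths:
            # Distortion of every codeword in this stage against the residual
            d = np.sum((residual[None, :] - cb) ** 2, axis=1)
            for j in np.argsort(d)[:M]:
                candidates.append((residual - cb[j], idx + [int(j)]))
        # Prune back to the M best survivors
        candidates.sort(key=lambda p: float(np.sum(p[0] ** 2)))
        paths = candidates[:M]
    best_residual, best_idx = paths[0]
    return best_idx, float(np.sum(best_residual ** 2))
```

With M=1 this degenerates to the greedy stage-by-stage search; larger M approaches the optimal joint search at a cost roughly linear in M.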


IEEE Communications Magazine | 1983

Vector quantization: A pattern-matching technique for speech coding

Allen Gersho; Vladimir Cuperman

Vector quantization (VQ), a new direction in source coding, has recently emerged as a powerful and widely applicable coding technique. It was first applied to analysis/synthesis of speech, and has allowed Linear Predictive Coding (LPC) rates to be dramatically reduced to 800 b/s with very slight reduction in quality, and further compressed to rates as low as 150 b/s while retaining intelligibility [1,2]. More recently, the technique has found its way to waveform coding [3-5], where its applicability and effectiveness are less obvious and not widely known. There is currently a great need for a low-complexity speech coder at the rate of 16 kb/s which attains essentially "toll" quality, roughly equivalent to that of standard 64-kb/s log PCM codecs. Adaptive DPCM schemes can attain this quality with low complexity for the proposed 32 kb/s CCITT standard, but at 16 kb/s the quality of ADPCM or adaptive delta modulation schemes is inadequate. More powerful methods, such as subband coding or transform coding, are capable of producing acceptable speech quality at 16 kb/s but have a much higher implementation complexity. The difficulty is further compounded by the need for a scheme that can handle both speech and voiceband data at the 16 kb/s rate. These two types of waveforms occupy the same bandwidth in the subscriber loop part of the telephone network, yet they have widely different statistical character. Effective speech coding at this rate must be geared to the specific character of speech and must exploit our knowledge of human hearing. On the other hand, a waveform that carries data must be coded and later reconstructed so that a modem can still extract the data with an acceptably low error rate. This is purely a signal processing operation not involving human perception. Vector quantization appears to be a suitable coding technique which caters to this dual requirement. VQ may become the key to 16 kb/s coding; it may also lead to improved quality waveform coding at 8 or 9.6 kb/s. In this paper, we review recent results obtained in waveform coding of speech with vector quantization.
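The basic VQ operation the article surveys, nearest-neighbor encoding against a codebook followed by table-lookup decoding, can be sketched as below. This is a minimal illustration under assumed shapes; a real coder would use a trained codebook (e.g. from the LBG algorithm), not an arbitrary one.

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Return, for each input vector, the index of the nearest
    codeword under squared-error distortion."""
    # distances: (n_vectors, codebook_size)
    d = np.sum((vectors[:, None, :] - codebook[None, :, :]) ** 2, axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Decoding is just a table lookup into the same codebook."""
    return codebook[indices]
```

The rate follows from the codebook geometry: a codebook of 2^b entries holding k-dimensional vectors transmits b bits per vector, i.e. b/k bits per sample.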


International Conference on Communications | 1993

Joint source and channel coding using a non-linear receiver

F.-H. Liu; Paul Ho; Vladimir Cuperman

The joint optimization of the source and channel coders is considered in a system consisting of a vector quantizer (VQ) whose output indices are mapped directly into points in the modulation signal space. A decoder based on a nonlinear estimator is used to reconstruct the source signal. An iterative algorithm is introduced which jointly optimizes the VQ and the modulation signal set, using as optimality criterion the minimum mean-square error (MSE) between the original and the reconstructed signals. It is shown that a jointly optimized system based on average channel characteristics significantly outperforms (by up to 5 dB) a reference system based on a VQ designed for the given source and a standard quadrature amplitude modulation (QAM) signal set.
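The nonlinear-estimator decoder can be sketched as an MMSE reconstruction: the decoded vector is the posterior-weighted average of all codewords given the received channel point. This assumes an AWGN channel and equiprobable indices, which are illustrative assumptions, not necessarily the paper's exact channel model.

```python
import numpy as np

def mmse_decode(r, signal_points, codewords, noise_var):
    """Nonlinear MMSE decoder: posterior-weighted average of codewords.

    r             -- received point in the modulation signal space, shape (2,)
    signal_points -- modulation point assigned to each VQ index, shape (K, 2)
    codewords     -- VQ codebook, shape (K, dim)
    noise_var     -- assumed AWGN noise variance, equiprobable indices
    """
    # Likelihood of each transmitted point given r (Gaussian channel)
    d2 = np.sum((signal_points - r) ** 2, axis=1)
    w = np.exp(-(d2 - d2.min()) / (2.0 * noise_var))  # shifted for stability
    w /= w.sum()
    # MMSE estimate: E[codeword | r]
    return w @ codewords
```

At high SNR the weights concentrate on the nearest signal point and the decoder behaves like a hard-decision table lookup; at low SNR it hedges across plausible codewords, which is where the gain over a linear receiver comes from.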


International Conference on Acoustics, Speech, and Signal Processing | 2000

A 16-kbit/s bandwidth scalable audio coder based on the G.729 standard

Kazuhito Koishida; Vladimir Cuperman; Allen Gersho

This paper proposes a bandwidth-scalable coding scheme based on the G.729 standard as a base layer coder. In the scheme, according to the channel conditions, the output speech of the decoder can be selected to be narrowband (4-kHz bandwidth) or wideband (8-kHz bandwidth). The proposed scheme consists of two layers: base and enhancement. The base coder uses the G.729 algorithm to encode narrowband speech. The enhancement coder is based on a full-band CELP model and it encodes wideband speech while making use of the available base layer information. Two bandwidth-scalable coders are designed: one is scalable with the 8 kbit/s G.729 base coder and another with the 6.4 kbit/s G.729 (Annex D) base coder. Subjective tests show that, for wideband speech, the proposed coders at 16 kbit/s achieve better performance than the 16 kbit/s MPEG-4 CELP with bandwidth scalability.


IEEE Workshop on Speech Coding | 2002

A 1200/2400 bps coding suite based on MELP

Tian Wang; K. Koishida; Vladimir Cuperman; Allen Gersho; John S. Collura

This paper presents key algorithm features of the future NATO narrow band voice coder (NBVC), a 1.2/2.4 kbps speech coder with noise preprocessor based on the MELP analysis algorithm. At 1.2 kbps, the MELP parameters for three consecutive frames are grouped into a superframe and jointly quantized to obtain high coding efficiency. The inter-frame redundancy is exploited with distinct quantization schemes for different unvoiced/voiced (U/V) frame combinations in the superframe. Novel techniques used at 1.2 kbps include pitch vector quantization using pitch differentials, joint quantization of pitch and U/V decisions and LSF quantization with a forward-backward interpolation method. A new harmonic synthesizer is introduced for both rates which improves the reproduction quality. Subjective test results indicate that the 1.2 kbps speech coder achieves quality close to the existing federal standard 2.4 kbps MELP coder.
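The pitch-differential idea in the superframe quantization above can be sketched in simplified scalar form: the first frame's pitch is coded absolutely and the remaining frames are coded as bounded differentials in the log-pitch domain. The bit widths, pitch range, and differential limit below are illustrative assumptions, and the actual coder vector-quantizes these parameters jointly.

```python
import numpy as np

def quantize_superframe_pitch(pitches, abs_bits=7, diff_bits=4,
                              pmin=20.0, pmax=160.0, dmax=0.2):
    """Quantize a superframe of pitch values: an absolute code for
    frame 0, limited log-domain differentials for later frames."""
    logs = np.log(np.asarray(pitches, dtype=float))
    lo, hi = np.log(pmin), np.log(pmax)
    # Absolute quantization of the first pitch (uniform in log-pitch)
    levels = 2 ** abs_bits
    q0 = int(round((logs[0] - lo) / (hi - lo) * (levels - 1)))
    rec = [lo + q0 * (hi - lo) / (levels - 1)]
    codes = [q0]
    # Differential quantization of the remaining frames
    dlevels = 2 ** diff_bits
    for l in logs[1:]:
        d = float(np.clip(l - rec[-1], -dmax, dmax))
        qd = int(round((d + dmax) / (2 * dmax) * (dlevels - 1)))
        codes.append(qd)
        rec.append(rec[-1] + (-dmax + qd * 2 * dmax / (dlevels - 1)))
    return codes, np.exp(rec)
```

The payoff is the bit budget: only the first frame pays for the full pitch range, and subsequent frames exploit the slow evolution of pitch within a voiced superframe.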


Archive | 1993

Speech and Audio Coding for Wireless and Network Applications

Bishnu S. Atal; Vladimir Cuperman; Allen Gersho

I: Introduction. II: Low Delay Speech Coding. III: Speech Quality. IV: Speech Coding for Wireless Transmission. V: Audio Coding. VI: Speech Coding for Noisy Transmission Channels. VII: Topics in Speech Coding. Author Index. Index.


Global Communications Conference | 1989

Backward pitch prediction for low-delay speech coding

Robert Pettigrew; Vladimir Cuperman

A backward-adaptive pitch prediction algorithm is described. It is used in conjunction with a backward-adaptive short-term predictor in a low-delay speech coding system operating at 16 kb/s. The backward-adaptive pitch prediction algorithm is a hybrid algorithm which combines backward block adaptive pitch prediction and backward recursive pitch prediction. The pitch predictor tap gains and the pitch period are periodically initialized by using a backward block adaptive algorithm. Between these initializations, however, both the tap gains and the pitch period are adapted using backward recursive algorithms. The tap gains are adapted using the well-known gradient algorithm, in a manner similar to the way the short-term predictor coefficients are adapted. The pitch period is adapted using a novel pitch tracking algorithm. By combining backward recursive adaptation with backward block adaptation, it was possible to increase the prediction gain of the pitch predictor and reduce the interval required between initialization of the pitch predictor parameters.
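The gradient adaptation of the pitch-predictor tap can be sketched as an LMS update, shown here for a single tap and a fixed lag (the actual system uses multiple taps, adapts the lag with pitch tracking, and runs on reconstructed samples). The step size mu is an illustrative assumption.

```python
import numpy as np

def backward_pitch_predict(signal, lag, mu=0.05):
    """Single-tap pitch predictor with backward gradient (LMS) tap
    adaptation: the tap is derived from past samples only, so the
    decoder can track it without side information."""
    b = 0.0
    residual = np.zeros(len(signal))
    for n in range(len(signal)):
        past = signal[n - lag] if n >= lag else 0.0
        e = signal[n] - b * past     # prediction error
        residual[n] = e
        b += mu * e * past           # gradient step toward minimizing e**2
    return residual, b
```

Because adaptation is driven only by already-decoded samples, no tap gains or pitch values need to be transmitted, which is what keeps the coder low-delay.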


International Conference on Acoustics, Speech, and Signal Processing | 2000

A 1200 bps speech coder based on MELP

Tian Wang; Kazuhito Koishida; Vladimir Cuperman; Allen Gersho; John S. Collura

This paper presents a 1.2 kbps speech coder based on the mixed excitation linear prediction (MELP) analysis algorithm. In the proposed coder, the MELP parameters of three consecutive frames are grouped into a superframe and jointly quantized to obtain a high coding efficiency. The interframe redundancy is exploited with distinct quantization schemes for different unvoiced/voiced (U/V) frame combinations in the superframe. Novel techniques for improving performance make use of the superframe structure. These include pitch vector quantization using pitch differentials, joint quantization of pitch and U/V decisions and LSF quantization with a forward-backward interpolation method. Subjective test results indicate that the 1.2 kbps speech coder achieves approximately the same quality as the proposed federal standard 2.4 kbps MELP coder.


International Conference on Communications | 1993

A multi-mode variable rate CELP coder based on frame classification

Peter Lupini; Neil B. Cox; Vladimir Cuperman

The authors present a modular CELP (code-excited linear prediction) coder which can switch bit-rates in response to local speech characteristics (source-controlled mode) or external network conditions (network-controlled mode). The coder is capable of operating at several bit-rates and is optimized for 16 kb/s, 8 kb/s, and 4 kb/s. A 925-b/s configuration is included for silent frames. The authors present the results of informal MOS tests which show that the variable-rate system running at an average rate of 8 kb/s achieves subjective speech quality close to that of the 16-kb/s fixed-rate system (a difference of less than 0.1 on the MOS scale).


Vehicular Technology Conference | 1994

Variable rate speech and channel coding for mobile communication

Eric Yuen; Paul Ho; Vladimir Cuperman

Although mobile communication channels are time-varying, most systems allocate the combined rate between the speech coder and the error correction coder according to a nominal channel condition. This generally leads to a pessimistic design and consequently an inefficient utilization of the available resources, such as bandwidth and power. This paper describes an adaptive coding system that adjusts the rate allocation according to actual channel conditions. Two types of variable rate speech coders are considered: embedded coders and multimode coders, both based on code excited linear prediction (CELP). The variable rate channel coders are based on rate compatible punctured convolutional (RCPC) codes. A channel estimator at the receiver tracks both the short-term and the long-term fading condition of the channel. The estimated channel state information is then used to vary the rate allocation between the speech and channel coders on a frame-by-frame basis, by sending an appropriate rate adjustment command through a feedback channel. Experimental results show that the objective and subjective speech quality of the adaptive coders is superior to that of their non-adaptive counterparts. Improvements of up to 1.35 dB in SEGSNR of the speech signal and up to 0.9 in informal MOS for a combined rate of 12.8 kbit/s have been found. In addition, the multimode coders perform better than their embedded counterparts.
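The per-frame rate-allocation decision can be sketched as a lookup from estimated channel SNR to a (speech rate, channel-coding rate) split. The SNR thresholds and rate pairs below are purely illustrative assumptions, not the values used in the paper; only the 12.8 kbit/s combined rate comes from the abstract.

```python
def allocate_rates(channel_snr_db, total_kbps=12.8):
    """Pick a (speech, channel-coding) rate split from the estimated
    channel SNR: a good channel gets more speech bits, a bad one gets
    more RCPC protection. Thresholds here are hypothetical."""
    # (minimum SNR in dB, speech kbps); channel coding gets the remainder
    table = [(15.0, 9.6), (10.0, 8.0), (5.0, 6.4), (float('-inf'), 4.8)]
    for snr_min, speech_kbps in table:
        if channel_snr_db >= snr_min:
            return speech_kbps, total_kbps - speech_kbps
```

In the system described, this decision is made at the receiver from the channel estimate and fed back to the transmitter as a rate adjustment command, so both ends stay synchronized on the current split.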

Collaboration

Top co-authors of Vladimir Cuperman:

Peter Lupini (Simon Fraser University)
Paul Ho (Simon Fraser University)
Eyal Shlomot (University of California)
Chunyan Li (University of California)
Kazuhito Koishida (Tokyo Institute of Technology)