Yair Shoham | Researchain

Publication

Featured researches published by Yair Shoham.

IEEE Transactions on Acoustics, Speech, and Signal Processing | 1988

Efficient bit allocation for an arbitrary set of quantizers (speech coding)

Yair Shoham; Allen Gersho

A bit allocation algorithm that is capable of efficiently allocating a given quota of bits to an arbitrary set of different quantizers is proposed. This algorithm is useful in any coding scheme which uses bit allocation or, more generally, codebook allocation. It produces an optimal or very nearly optimal allocation, while allowing the set of admissible bit allocation values to be constrained to nonnegative integers. It is particularly useful in cases where the quantizer performance versus rate is irregular and changing in time, a situation that cannot be handled by conventional allocation algorithms.
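As a rough illustration of integer bit allocation over irregular rate-distortion tables, the following minimal sketch uses a generic Lagrangian sweep: each quantizer independently picks the bit count that minimizes distortion plus a per-bit penalty, and the penalty is bisected until the total fits the quota. It is a sketch under stated assumptions, not the published algorithm. The function names, the table layout (`tables[i][b]` is the distortion of quantizer `i` at `b` bits), and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def allocate_for_lambda(tables: Sequence[Sequence[float]], lam: float) -> List[int]:
    """For each quantizer, pick the bit count b that minimizes D_i(b) + lam * b."""
    return [min(range(len(d)), key=lambda b: d[b] + lam * b) for d in tables]


def allocate_bits(tables: Sequence[Sequence[float]], budget: int, iters: int = 60) -> List[int]:
    """Bisect on the per-bit penalty to find an allocation within the budget.

    Assumes nonnegative distortions. The tables need not be convex or
    monotone, but only allocations on the lower convex hull of each table
    can be reached, so some bits of the budget may stay unused.
    """
    lo = 0.0
    hi = max(max(d) for d in tables) + 1.0  # large enough to force zero bits everywhere
    best = allocate_for_lambda(tables, hi)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        alloc = allocate_for_lambda(tables, mid)
        if sum(alloc) <= budget:
            best, hi = alloc, mid  # feasible: try a smaller penalty (more bits)
        else:
            lo = mid               # over budget: raise the penalty
    return best


# Hypothetical distortion tables for three quantizers at 0..3 bits.
tables = [
    [10.0, 4.0, 3.5, 1.0],
    [8.0, 7.5, 2.0, 1.9],
    [5.0, 1.0, 0.9, 0.8],
]
print(allocate_bits(tables, budget=4))
```

With a budget of 5 bits this sketch still returns a 4-bit allocation, because no 5-bit allocation lies on the convex hulls of these tables. A complete method would need an extra refinement step to spend the leftover bit.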

IEEE Transactions on Speech and Audio Processing | 1998

Design and description of CS-ACELP: a toll quality 8 kb/s speech coder

Redwan Salami; Claude Laflamme; Jean-Pierre Adoul; Akitoshi Kataoka; Shinji Hayashi; Takehiro Moriya; Claude Lamblin; Dominique Massaloux; Stéphane Proust; Peter Kroon; Yair Shoham

This paper describes the 8 kb/s speech coding algorithm G.729 which has been standardized by ITU-T. The algorithm is based on a conjugate-structure algebraic CELP (CS-ACELP) coding technique and uses 10 ms speech frames. The codec delivers toll-quality speech (equivalent to 32 kb/s ADPCM) for most operating conditions. This paper describes the coder structure in detail and discusses the reasons behind certain design choices. A 16-b fixed-point version has been developed as part of Recommendation G.729, and a summary of the subjective test results based on a real-time implementation of this version is presented.

International Conference on Acoustics, Speech, and Signal Processing | 1984

Fast search algorithms for vector quantization and pattern matching

De-Yuan Cheng; Allen Gersho; Bhaskar Ramamurthi; Yair Shoham

A fundamental computational task that arises in several areas of signal processing is pattern matching, where a given test pattern is compared with a large set of stored templates, to find the best match that minimizes a given measure of dissimilarity. Three different geometrically-oriented methods are proposed for substantially reducing the computational complexity of the search process by reducing the number of multiplies in exchange for additional low complexity operations and, in two of the methods, additional memory for storing precomputed tables.
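For orientation, the pattern-matching task itself can be sketched as a nearest-template search under squared Euclidean distance. The sketch below adds partial-distance elimination, a generic and widely used shortcut that stops accumulating a distance once it can no longer beat the best match so far. It is not one of the three geometrically oriented methods the abstract refers to, and the function name and sample data are hypothetical.

```python
from typing import Sequence, Tuple


def nearest_template(
    test: Sequence[float], templates: Sequence[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, squared distance) of the template closest to `test`."""
    best_index, best_dist = -1, float("inf")
    for i, template in enumerate(templates):
        dist = 0.0
        for x, y in zip(test, template):
            dist += (x - y) ** 2
            if dist >= best_dist:
                break  # this template cannot win; skip its remaining multiplies
        else:
            # Inner loop finished without breaking, so this is a new best match.
            best_index, best_dist = i, dist
    return best_index, best_dist


# Hypothetical two-dimensional templates.
templates = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
index, dist = nearest_template([1.2, 0.9], templates)
print(index, dist)
```

The saving grows with the vector dimension and with how early a good candidate is found, since later templates are then rejected after only a few terms.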

International Conference on Acoustics, Speech, and Signal Processing | 1993

High-quality speech coding at 2.4 to 4.0 kbit/s based on time-frequency interpolation

Yair Shoham

The author presents a novel algorithm for high-quality coding and demonstrates the advantage of the proposed coder over the conventional CELP (code-excited linear prediction) algorithm for low rate coding. He proposes an empirical but perceptually advantageous framework for voiced speech processing, called time-frequency interpolation (TFI). The general formulation of the TFI technique is given and then a TFI speech coder is described. The performance of this coder at 4.05 and 2.5 kbit/s is demonstrated in terms of formal mean opinion scores (MOS). It is shown that the 4.05 kbit/s TFI coder is comparable in performance with the 8 kbit/s European standard GSM (Groupe Spécial Mobile) coder. It is also shown that reducing the bit rate to 2.50 kbit/s degrades the performance only gracefully and the coder delivers good-quality speech at this rate.

Conference of the International Speech Communication Association | 1991

Constrained-Stochastic Excitation Coding of Speech at 4.8 kb/s

Yair Shoham

In the last few years, Code-Excited Linear Predictive (CELP) coding has emerged as the most prominent technique for digital speech communication at rates of 8 kb/s and below, and it is now considered the best candidate coder for digital mobile telephony and secure speech communication. While the CELP coder is able to provide fairly good-quality speech at 8 kb/s, its performance at 4.8 kb/s is still unsatisfactory for many applications. The novelty in the CELP coding concept, namely, the stochastic excitation of a linear filter, also constitutes a weakness of this method: the excitation contains a noisy component which does not contribute to the speech synthesis process and cannot be completely removed by the filter. It is a common opinion among speech communication researchers that new forms of excitation need to be studied in order to improve the CELP performance at low bit rates.

IEEE Journal on Selected Areas in Communications | 1988

New directions in subband coding

Richard V. Cox; Yair Shoham; Schuyler Quackenbush; Nambirajan Seshadri; Nikil S. Jayant

Two very different subband coders are described. The first is a modified dynamic bit-allocation subband coder (D-SBC) designed for variable rate coding situations and easily adaptable to noisy channel environments. It can operate at rates as low as 12 kb/s and still give good quality speech. The second coder is a 16-kb/s waveform coder, based on a combination of subband coding and vector quantization (VQ-SBC). The key feature of this coder is its short coding delay, which makes it suitable for real-time communication networks. The speech quality of both coders has been enhanced by adaptive postfiltering. The coders have been implemented on a single AT&T DSP32 signal processor.

International Conference on Acoustics, Speech, and Signal Processing | 1996

A low-complexity waveform interpolation coder

W.B. Kleijn; Yair Shoham; D. Sen; R. Hagen

A recent independent survey found a 2.4 kbit/s waveform-interpolation (WI) algorithm to perform better than other state-of-the-art speech coders. However, this coder had a very high level of computational complexity. The introduction of various techniques, including a time-varying waveform sampling rate and a cubic B-spline waveform representation, has reduced the computational complexity by an order of magnitude. The new implementation allows full-duplex real-time operation on a single DSP device and on an average workstation (the latter using non-optimized compiled C source code). The new coder also contains a number of new features which improve the quality of the reconstructed speech signal.

International Conference on Acoustics, Speech, and Signal Processing | 1987

Vector predictive quantization of the spectral parameters for low rate speech coding

Yair Shoham

Vector Predictive Quantization (VPQ) is proposed for coding the short-term spectral envelope of speech. The proposed VPQ scheme predicts the current spectral envelope from several past spectra, using a predictor codebook. The residual spectrum is coded by a residual codebook. The system operates in the log-spectral domain using a sampled version of the spectral envelope. Experimental results indicate a prediction gain in the range of 9 to 13 dB and an average log-spectral distance of 1.3 to 1.7 dB. Informal listening tests suggest that replacing the conventional scalar quantizer in a 4.8 kbit/s CELP coder by a VPQ system allows a reduction of the rate assigned to the LPC data from 1.8 kbit/s to 1.0 kbit/s without any obvious difference in the perceptual quality.

Conference of the International Speech Communication Association | 1992

Coding of wideband speech

Nikil S. Jayant; James D. Johnston; Yair Shoham

The technologies of ISDN teleconferencing, CD-ROM multimedia services, and High Definition Television are creating new opportunities and challenges for the digital coding of wideband audio signals, wideband speech in particular. In the coding of wideband speech, an important point of reference is the CCITT standard for 7 kHz speech at a rate of 64 kbit/s. Results of recent research are pointing to better capabilities: higher signal bandwidth at 64 kbit/s, and 7 kHz bandwidth at lower bit-rates such as 32 and 16 kbit/s. The coding of audio with a signal bandwidth of 20 kHz is receiving significant attention due to recent activity in the ISO (International Standards Organization), with a goal of storing a CD-grade monophonic audio channel at a bit-rate not exceeding 128 kbit/s. Prospects for accomplishing this are very good. As a side result, emerging algorithms will offer very attractive options at lower rates such as 96 and 64 kbit/s. As we address new challenges in wideband speech technology, several strides in coding research are likely to occur. Among these are refinements of existing models for auditory noise-masking, and a unification of linear prediction and frequency-domain coding.

International Conference on Acoustics, Speech, and Signal Processing | 1989

Cascaded likelihood vector coding of the LPC information

Yair Shoham

Cascade likelihood vector quantization is proposed for coding LPC spectral parameters for application in low-rate code-excited linear prediction systems. The approach is based on representing the LPC all-pole filter as a cascade of two lower-order all-pole filters. The partitioning of the LPC polynomial is done in the root domain by clustering the roots into two distinct groups. The quantizer uses two codebooks to quantize each of the lower-order filters. However, the quantization process is done so as to optimize jointly the performances of the two subsystems. The likelihood-ratio distortion measure is used as a performance criterion in the design and coding processes. Splitting the LPC filter into two subsystems dramatically reduces the coding complexity while the efficiency of using vector quantization is essentially preserved. Experimental results show an average performance of 1.59, 1.41, 1.24, and 1.10 dB log-spectral distortion at the rates of 1.0, 1.1, 1.2, and 1.3 kb/s, respectively. These figures represent good quality at 1.0 kb/s up to essentially transparent quality at 1.3 kb/s.