W.B. Kleijn
Royal Institute of Technology
Publications
Featured research published by W.B. Kleijn.
international conference on acoustics, speech, and signal processing | 2001
M. Nilsson; W.B. Kleijn
We present a new approach to the problem of extending a narrow-band signal to a wide-band signal. In many cases of bandwidth extension, the high-band energy is overestimated, leading to undesirable audible artifacts. To overcome this problem we introduce an asymmetric cost function into the estimation of the high band, one that penalizes over-estimates of the high-band energy more heavily than under-estimates. We show that the resulting attenuation of the estimated high-band energy depends on the broadness of the a-posteriori distribution of the energy given the information extracted from the narrow band. Thus, the uncertainty about how to extend the signal into the high band influences the level of the extension. Results from a listening test show that the proposed algorithm produces fewer artifacts.
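The effect of such an asymmetric cost function can be illustrated with a small numerical sketch. This is a generic illustration, not the paper's estimator: the posterior samples, the over-estimation weight, and the grid search are all stand-ins. The minimizer sits below the posterior mean, and the attenuation grows with the spread of the posterior, mirroring the behavior described above.

```python
import numpy as np

def asymmetric_cost(est, samples, over_weight=4.0):
    """Expected cost where over-estimates of the energy are penalized
    over_weight times more than under-estimates."""
    err = est - samples            # positive err = over-estimate
    cost = np.where(err > 0, over_weight * err**2, err**2)
    return cost.mean()

def estimate_energy(samples, over_weight=4.0):
    """Pick the estimate minimizing the expected asymmetric cost over a
    grid (a numerical stand-in for an analytic minimizer)."""
    grid = np.linspace(samples.min(), samples.max(), 1000)
    costs = [asymmetric_cost(g, samples, over_weight) for g in grid]
    return grid[int(np.argmin(costs))]

rng = np.random.default_rng(0)
narrow = rng.normal(0.0, 0.5, 10000)   # concentrated posterior
broad = rng.normal(0.0, 2.0, 10000)    # broad (uncertain) posterior
# A symmetric cost would place both estimates near the mean (0); the
# asymmetric cost pulls them below it, more so for the broad posterior.
e_narrow = estimate_energy(narrow)
e_broad = estimate_energy(broad)
assert e_narrow < 0 and e_broad < e_narrow
```

Both estimates are attenuated relative to the posterior mean, and the broader posterior is attenuated more: greater uncertainty about the high band leads to a more conservative extension.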
international conference on acoustics, speech, and signal processing | 1999
H. Pobloth; W.B. Kleijn
In this paper we define perceptual phase capacity as the size of a codebook of phase spectra necessary to represent all possible phase spectra in a perceptually accurate manner. We determine the perceptual phase capacity for voiced speech. For this purpose, we use an auditory model that indicates whether phase-spectrum changes are audible. The model was tuned, and its correct performance verified, by listening tests. The perceptual phase capacity of low-pitched speech is found to be much higher than that of high-pitched speech. Our results are consistent with the well-known fact that speech coding schemes that preserve the phase accurately work better for male voices, while coders that put more weight on the amplitude spectrum of the speech signal yield better quality for female speech.
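The pitch dependence can be seen from a back-of-the-envelope argument (the numbers below are purely hypothetical, not the paper's measurements): if each of H harmonics admits roughly c perceptually distinguishable phase values, a codebook covering all combinations needs about c**H entries, so the log of the capacity grows with the harmonic count, which is larger for low-pitched voices.

```python
import math

def log2_capacity(n_harmonics, cells_per_harmonic):
    """log2 of the codebook size needed if each harmonic's phase has a
    fixed number of perceptually distinguishable values (toy model)."""
    return n_harmonics * math.log2(cells_per_harmonic)

# Hypothetical harmonic counts for a 4 kHz band:
low_pitch = log2_capacity(40, 3)    # low-pitched voice: many harmonics
high_pitch = log2_capacity(10, 3)   # high-pitched voice: few harmonics
assert low_pitch > high_pitch
```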
international conference on acoustics, speech, and signal processing | 2000
Mattias Nilsson; Søren Vang Andersen; W.B. Kleijn
In this paper we investigate the mutual information between the spectral envelope of the high-frequency band of speech and low-frequency bands of various widths. Direct methods for computing the mutual information often require an excessive amount of data, even in modest settings. We reduce the required amount of data by quantizing the low band, which leads to a lower-bound expression for the mutual information. We indicate by simulation that this lower bound is of the same order of magnitude as the true mutual information. Simulations on speech show no less than 0.1 bit of shared information between the slope of the high band and the 0-4 kHz low-frequency band. Performing the analogous simulation with the gain of the high band, we obtained no less than 0.45 bit of mutual information.
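The lower-bound idea can be sketched on synthetic data where the true mutual information is known in closed form. This is a toy stand-in for the speech simulations: the bivariate-Gaussian model, bin count, and sample size are assumptions. Quantizing the variables can only discard information (data-processing inequality), so the histogram estimate lower-bounds the true value, up to a small plug-in estimation bias.

```python
import numpy as np

def mi_from_counts(joint):
    """Plug-in mutual information (bits) from a 2-D histogram."""
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
rho, n = 0.6, 200000
x = rng.normal(size=n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)
true_mi = -0.5 * np.log2(1 - rho**2)   # analytic MI for jointly Gaussian pair

# Quantize into equal-width bins and estimate MI from the joint histogram;
# quantization can only lose information, so this lower-bounds true_mi.
joint, _, _ = np.histogram2d(x, y, bins=16)
est = mi_from_counts(joint)
assert 0.2 < est <= true_mi + 0.01
```

With 16 bins the estimate is already of the same order of magnitude as the true value, which is the behavior the paper verifies for speech.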
international conference on acoustics, speech, and signal processing | 2005
S. Srinivasan; J. Samuelsson; W.B. Kleijn
In this paper, we propose a Bayesian approach for estimating the short-term predictor parameters of speech and noise from the noisy observation. The resulting estimates of the speech and noise spectra can be used in a Wiener filter or any state-of-the-art speech enhancement system. We utilize a-priori information about both speech and noise in the form of trained codebooks of linear predictive coefficients. In contrast to current Bayesian estimation approaches, which consider the excitation variances part of the a-priori information, the proposed method computes them analytically from the observation at hand. Consequently, the method performs well in nonstationary noise conditions. Experimental results confirm the superior performance of the proposed method compared to existing Bayesian approaches, such as those based on hidden Markov models.
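The downstream use of such spectral estimates in a Wiener filter can be sketched minimally. Here flat oracle PSD values stand in for the codebook-based estimates (so the per-bin Wiener gain collapses to a single scalar); this illustrates only the filtering step, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4096
clean = rng.normal(size=n)
noisy = clean + 0.5 * rng.normal(size=n)

# Stand-ins for the PSDs a codebook-based estimator would return
# (flat spectra, so every frequency bin gets the same gain).
speech_psd, noise_psd = 1.0, 0.25
gain = speech_psd / (speech_psd + noise_psd)   # Wiener gain
enhanced = gain * noisy

mse = lambda x: np.mean((x - clean) ** 2)
assert mse(enhanced) < mse(noisy)
```

With frequency-dependent PSD estimates the same gain formula is applied per bin; the quality of the enhancement then hinges on the accuracy of the speech and noise spectrum estimates, which is what the codebook-based method targets.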
international conference on acoustics, speech, and signal processing | 2000
Renat Vafin; Søren Vang Andersen; W.B. Kleijn
In this paper, we elaborate on the issue of analysis-synthesis consistency in sinusoidal coding. Our analysis is based on windowed sinusoids, and uses the same amplitude-complementary window as is used in the overlap-add synthesis. Reconstructions of the neighboring segments are taken into account when forming a particular analysis segment. Sinusoidal estimation is based on a perceptual criterion. In our new procedure, when analyzing the current segment we take advantage of the forward masking effect due to estimated sinusoids in the previous segments (possibly overlapping with the current segment). Experimental results verify that the number of sinusoids can be reduced significantly with our time masking model, without introducing perceptual artifacts in the reconstructed signal.
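A toy version of a forward-masking selection rule is sketched below. The exponential frequency spread, decay factor, and hard threshold are hypothetical simplifications for illustration, not the paper's perceptual criterion: a sinusoid in the current segment is dropped when it falls under the masking threshold left by the previous segment's sinusoids.

```python
import numpy as np

def select_sinusoids(amps, freqs, prev_mask, decay=0.5, spread=100.0):
    """Keep sinusoids whose amplitude exceeds a (toy) forward-masking
    threshold carried over from the previous segment's sinusoids.
    prev_mask is a list of (amplitude, frequency_hz) maskers."""
    keep = []
    for a, f in zip(amps, freqs):
        # Threshold: decayed previous-segment masker, spread in frequency.
        thr = max((decay * pa * np.exp(-abs(f - pf) / spread)
                   for pa, pf in prev_mask), default=0.0)
        if a > thr:
            keep.append((a, f))
    return keep

# A strong 440 Hz sinusoid in the previous segment masks the weak
# nearby component; the distant component is unaffected.
kept = select_sinusoids([0.6, 0.3, 0.9], [440.0, 450.0, 2000.0],
                        prev_mask=[(1.0, 440.0)])
assert kept == [(0.6, 440.0), (0.9, 2000.0)]
```

Dropping masked sinusoids in this way is what reduces the sinusoid count without audible artifacts.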
international conference on acoustics, speech, and signal processing | 2005
V. Grancharov; J. Samuelsson; W.B. Kleijn
The Kalman recursion is a powerful technique for reconstructing a speech signal observed in additive background noise. In contrast to Wiener filtering and spectral-subtraction schemes, the Kalman algorithm can easily be implemented in both causal and noncausal form. After studying the perceptual differences between these two implementations, we propose a novel algorithm that combines the low complexity and robustness of the Kalman filter with the proper noise shaping of the Kalman smoother.
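The causal Kalman recursion underlying this work can be sketched for a scalar AR(1) "speech" model in white observation noise. This is a stand-in for an actual speech model (higher-order AR with time-varying coefficients), and the smoother and the proposed combined algorithm are omitted.

```python
import numpy as np

def kalman_filter_ar1(y, a, q, r):
    """Causal scalar Kalman filter for x_t = a*x_{t-1} + w_t,
    observed as y_t = x_t + v_t, with Var(w)=q, Var(v)=r."""
    x_est, p = 0.0, q / (1 - a**2)   # stationary prior
    out = []
    for obs in y:
        # Predict.
        x_pred = a * x_est
        p_pred = a**2 * p + q
        # Update with the Kalman gain.
        k = p_pred / (p_pred + r)
        x_est = x_pred + k * (obs - x_pred)
        p = (1 - k) * p_pred
        out.append(x_est)
    return np.array(out)

rng = np.random.default_rng(3)
a, q, r, n = 0.95, 0.1, 0.5, 5000
x = np.zeros(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
y = x + rng.normal(scale=np.sqrt(r), size=n)

xf = kalman_filter_ar1(y, a, q, r)
assert np.mean((xf - x) ** 2) < np.mean((y - x) ** 2)
```

The noncausal smoother adds a backward pass over these filtered estimates, which lowers the error further at the cost of latency; the paper's contribution is combining the two regimes.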
international conference on acoustics, speech, and signal processing | 2004
Renat Vafin; W.B. Kleijn
We develop a new method for quantization in multistage audio coding. We consider the case of a two-stage sinusoidal/waveform coder. Given a distortion measure and a bit-rate constraint, we analytically derive the optimal rate distribution between subcoders (stages) and the corresponding optimal quantizers, which allows the coder to adapt easily to changes in bit-rate requirements. We verify that the performance, both in terms of signal-to-noise ratio (SNR) and perceptual quality, is higher if the input to the second stage is obtained by subtracting the quantized first-stage reconstruction from the original signal, as opposed to subtracting the unquantized reconstruction.
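The flavor of an analytic rate split can be shown with the standard high-rate model D_i = v_i * 2^(-2*R_i), an assumption made here for illustration rather than the paper's perceptual distortion measure. Minimizing total distortion under a total-rate constraint gives each stage extra bits in proportion to the log of its variance, and at the optimum both stages contribute equal distortion.

```python
import numpy as np

def allocate_rates(var1, var2, total_rate):
    """Optimal bit split for two quantizers under the high-rate model
    D_i = var_i * 2**(-2*R_i) with R_1 + R_2 = total_rate."""
    r1 = total_rate / 2 + 0.25 * np.log2(var1 / var2)
    return r1, total_rate - r1

def distortion(var, rate):
    return var * 2.0 ** (-2 * rate)

v1, v2, R = 4.0, 1.0, 6.0          # hypothetical stage variances and budget
r1, r2 = allocate_rates(v1, v2, R)
d_opt = distortion(v1, r1) + distortion(v2, r2)
d_even = distortion(v1, R / 2) + distortion(v2, R / 2)

assert d_opt < d_even                                   # beats an even split
assert abs(distortion(v1, r1) - distortion(v2, r2)) < 1e-9  # equal distortion
```

Because the split is a closed-form function of the budget, the coder can re-derive it instantly when the bit-rate requirement changes, which is the adaptability the abstract highlights.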
international conference on acoustics, speech, and signal processing | 2004
S. Srinivasan; J. Samuelsson; W.B. Kleijn
We describe a technique for obtaining estimates of the short-term predictor parameters of speech under noisy conditions. We use a-priori information about speech in the form of a trained codebook of speech linear predictive coefficients. Our contribution is two-fold. First, we provide a framework where the standard vector quantization search to obtain the quantized linear predictive coefficients can be replaced by a maximum likelihood search, given the noisy observation, the speech codebook and an estimate of the noise. This results in an enhancement method that is integrated with parametric coders such as linear predictive analysis-by-synthesis coders. Second, we provide a scheme where the chosen vector is not restricted to be an element of the codebook. An interpolative search between the maximum likelihood estimate and its nearest neighbors in the codebook is used to improve the precision of the estimated parameters. Such a scheme is relevant when enhancement is considered separately from coding. Experimental results show improved performance for the proposed methods.
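A minimal sketch of a maximum-likelihood codebook search over power spectra follows. The exponential model for periodogram bins, the flat PSD shapes, the codebook contents, and the bin count are toy assumptions, and the interpolative refinement between neighbors is omitted; the point is only that the ML search scores each codebook entry against the noisy observation given a noise estimate.

```python
import numpy as np

def ml_codebook_search(noisy_psd, codebook, noise_psd):
    """Return the index of the codebook spectrum maximizing the
    likelihood of the noisy periodogram (Gaussian signal model, so
    each periodogram bin is exponential with mean speech+noise PSD)."""
    best, best_ll = None, -np.inf
    for i, speech_psd in enumerate(codebook):
        model = speech_psd + noise_psd
        ll = -np.sum(np.log(model) + noisy_psd / model)
        if ll > best_ll:
            best, best_ll = i, ll
    return best

rng = np.random.default_rng(4)
n_bins = 128
codebook = [np.full(n_bins, 1.0), np.full(n_bins, 4.0), np.full(n_bins, 16.0)]
noise_psd = np.full(n_bins, 2.0)

# Synthesize a periodogram whose true speech PSD is codebook entry 1.
noisy_psd = (codebook[1] + noise_psd) * rng.exponential(size=n_bins)
assert ml_codebook_search(noisy_psd, codebook, noise_psd) == 1
```

Interpolating between the winning entry and its codebook neighbors, as the paper proposes, refines the estimate beyond the codebook's native resolution.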
asilomar conference on signals, systems and computers | 1998
Søren Vang Andersen; W.B. Kleijn; S.H. Jensen; E. Hansen
In analysis-by-synthesis linear predictive coding (AbS-LPC) an LPC synthesis filter is combined with an analysis-by-synthesis search of the excitation signal. The synthesis filter is an estimator for the speech signal given the excitation. However, in most AbS-LPC algorithms this estimator has no explicit model of the quantization noise, which is present in the excitation signal. This paper describes quantization noise modeling in a vector AbS-LPC algorithm. Methods based on recursive Bayesian filtering and Kalman filtering are considered. Simulations indicate improved signal-to-noise ratios due to quantization noise modeling.
international conference on acoustics, speech, and signal processing | 2004
H. Pobloth; Renat Vafin; W.B. Kleijn
We introduce multi-variate block polar quantization (MBPQ). MBPQ minimizes a weighted distortion for a set of complex variables representing one block of a signal under a resolution constraint for the entire block. MBPQ allows for different probability distributions in different dimensions of the set of complex variables. It outperforms a block polar quantizer introduced earlier (Pobloth, H. et al., Proc. Eurospeech, pp. 1097-1100, 2003), as well as unrestricted polar quantization (UPQ), for both Gaussian complex variables and sinusoids extracted from audio data. For the audio data, we found a performance gain of about 2.5 dB over the best-performing conventional resolution-constrained polar quantization, UPQ.
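The amplitude/phase split that these polar quantizers optimize can be illustrated with a toy fixed-resolution quantizer. The uniform cells and the level counts below are arbitrary choices for illustration, not the optimized allocations of MBPQ or UPQ.

```python
import numpy as np

def polar_quantize(z, n_amp, n_phase, a_max=4.0):
    """Toy polar quantizer: uniform amplitude cells on [0, a_max] and
    uniform phase cells, applied independently per complex sample."""
    idx = np.clip(np.floor(np.abs(z) / a_max * n_amp), 0, n_amp - 1)
    a_q = (idx + 0.5) * a_max / n_amp          # cell-midpoint amplitude
    step = 2 * np.pi / n_phase
    ph_q = np.round(np.angle(z) / step) * step  # nearest phase level
    return a_q * np.exp(1j * ph_q)

rng = np.random.default_rng(5)
z = rng.normal(size=1000) + 1j * rng.normal(size=1000)
coarse = np.mean(np.abs(z - polar_quantize(z, 2, 4)) ** 2)
fine = np.mean(np.abs(z - polar_quantize(z, 8, 16)) ** 2)
assert fine < coarse
```

MBPQ improves on such fixed grids by distributing the block's resolution budget across the complex variables according to their individual distributions.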