Abraham Alcaim
Pontifical Catholic University of Rio de Janeiro
Publication
Featured research published by Abraham Alcaim.
IEEE Transactions on Audio, Speech, and Language Processing | 2006
Ricardo Sant'ana; Rosangela Coelho; Abraham Alcaim
In this paper, a text-independent automatic speaker recognition (ASkR) system, the SR_Hurst, is proposed; it employs a new speech feature and a new classifier. The statistical feature pH is a vector of Hurst (H) parameters obtained by applying a wavelet-based multidimensional estimator (M_dim_wavelets) to the windowed short-time segments of speech. The proposed classifier for the speaker identification and verification tasks is based on the multidimensional fBm (fractional Brownian motion) model, denoted by M_dim_fBm. For a given sequence of input speech features, the speaker model is obtained from the sequence of vectors of H parameters, means, and variances of these features. The performance of the SR_Hurst was compared to those achieved with the Gaussian mixture models (GMMs), autoregressive vector (AR), and Bhattacharyya distance (d_B) classifiers. The speech database, recorded from fixed and cellular phone channels, was uttered by 75 different speakers. The results have shown the superior performance of the M_dim_fBm classifier and that the pH feature aggregates new information on the speaker identity. In addition, the proposed classifier employs a much simpler modeling structure as compared to the GMM.
IEEE International Telecommunications Symposium | 2006
Carlos R. Ferreira; Abraham Alcaim; Rodrigo C. de Lamare
New telecommunications services have been driving improvements in speech coding, because of the need to improve encoded speech quality at the lowest possible transmission rate. This study analyzes and proposes a method to adjust LSF parameters in order to improve their accuracy, minimizing the losses in the interpolation of the encoded LSFs. With this scheme, the perceptual quality of the synthesized speech at the decoder is increased without increasing the transmission rate. We present in detail a mathematical optimization method that minimizes different distortion measures, namely the Euclidean distortion measure and an approximation of the spectral distance. To evaluate the performance of the proposed improvements, the method is implemented in a speech coder with average rates below 2 kb/s. The results confirm that a significant reduction in the distortion measures can be obtained with the proposed LSF adjustment method.
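As a rough illustration of the interpolation loss this abstract refers to, the sketch below linearly interpolates between two LSF vectors and measures the Euclidean distortion against the true intermediate vector. It is not the authors' optimization method: the linear interpolation rule, the function names, and the sample LSF values are all assumptions made for the example.

```python
from typing import List, Sequence


def interpolate_lsf(prev: Sequence[float], nxt: Sequence[float], alpha: float) -> List[float]:
    """Linearly interpolate two LSF vectors (alpha=0 gives prev, alpha=1 gives nxt)."""
    if len(prev) != len(nxt):
        raise ValueError("LSF vectors must have the same order")
    return [(1.0 - alpha) * p + alpha * n for p, n in zip(prev, nxt)]


def euclidean_distortion(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two LSF vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical 4th-order LSF vectors (radians), for illustration only.
prev_lsf = [0.30, 0.60, 1.10, 1.70]
next_lsf = [0.34, 0.70, 1.20, 1.90]
true_mid = [0.33, 0.64, 1.16, 1.78]  # LSFs actually measured between the two frames

est_mid = interpolate_lsf(prev_lsf, next_lsf, 0.5)
loss = euclidean_distortion(true_mid, est_mid)
print(f"interpolation distortion: {loss:.6f}")
```

An adjustment scheme of the kind described would modify the transmitted vectors so that this distortion, or a spectral-distance approximation of it, becomes smaller at the decoder.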
International Conference on Advances in Pattern Recognition | 2005
Vladimir Fabregas Surigué de Alencar; Abraham Alcaim
In this paper, we describe and present an overall evaluation of several features for distributed speech recognition systems. These systems are based on a client-server architecture. This means that recognizers access only the coded parameters of the speech coder employed in communication networks (e.g., cellular mobile and IP networks). The recognition features considered in this paper are obtained from transformations of codec parameters. In particular, features generated from LPC and LSF parameters, in intervals of 10 ms and 20 ms, are analyzed in a continuous-observation, HMM-based, speaker-independent recognizer.
Global Communications Conference | 1996
L. Martins da Silva; Abraham Alcaim
This paper presents a new strategy to encode the LPC spectral envelope of speech. The proposed scheme uses an interpolation-based differential vector coding of the LSF parameters in order to better track the temporal variations of the speech short-time spectral envelope. Two consecutive sets of LSF parameters are simultaneously encoded during each speech frame. Simulation results show major improvements over techniques that vector quantize a single set of LSF parameters per frame.
IEEE Signal Processing Letters | 2001
Emilio Carlos Acocella; Abraham Alcaim
In this paper, we introduce a formula to align the vertical coefficients of the SA-DCT (shape-adaptive discrete cosine transform). Instead of grouping coefficients with the same index as proposed by Sikora and Makai (1995), the new method employs an alignment by phase strategy. Experimental results are given for both synthetic segments and real testing image sequences.
European Transactions on Telecommunications | 2010
Sonia L. Q. Dall'Agnol; Abraham Alcaim; José Roberto Boisson de Marca
Efficient quantization of synthesis filter coefficients for CELP (Code Excited Linear Prediction) coders is essential to achieve high quality speech at low rates. Three vector quantizers with good potential for utilization in low rate coders are studied. Each of them is implemented inside the structure of the VSELP (Vector Sum Excited Linear Prediction) coder, an important member of the class of CELP coders. These three vector quantizers are used to encode LSF (Line Spectral Frequency) parameters and are compared in terms of robustness to channel errors, complexity and quality of synthesized speech. The quality of the synthesized speech is evaluated using the objective measure of frequency-weighted signal-to-noise ratio and subjective results obtained from listening tests. With the purpose of improving the robustness to channel errors, the application of simulated annealing to assign binary indices to the output levels of the quantizer is also investigated. A split vector quantization scheme which employs interframe prediction has been shown to be an attractive approach to encode the synthesis filter parameters. It provides a performance comparable to the IS-54 scheme and uses 10 fewer bits for each LSF frame.
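The basic idea of split vector quantization can be sketched as follows: the LSF vector is cut into sub-vectors, and each sub-vector is quantized against its own small codebook by nearest-neighbour search. This is a minimal sketch only; it omits the interframe prediction mentioned in the abstract, and the split sizes, codebooks, and function names are hypothetical.

```python
from typing import List, Sequence

Codebook = Sequence[Sequence[float]]


def nearest_index(vec: Sequence[float], codebook: Codebook) -> int:
    """Index of the codeword with the smallest squared Euclidean distance to vec."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((v - c) ** 2 for v, c in zip(vec, codebook[i])),
    )


def split_vq_encode(vec: Sequence[float], split_sizes: Sequence[int],
                    codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize each sub-vector independently and return one index per split."""
    if sum(split_sizes) != len(vec):
        raise ValueError("split sizes must cover the whole vector")
    indices, start = [], 0
    for size, codebook in zip(split_sizes, codebooks):
        indices.append(nearest_index(vec[start:start + size], codebook))
        start += size
    return indices


def split_vq_decode(indices: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Rebuild the full vector by concatenating the selected codewords."""
    out: List[float] = []
    for index, codebook in zip(indices, codebooks):
        out.extend(codebook[index])
    return out


# Toy codebooks (hypothetical values): two splits of two LSFs each, 1 bit per split.
toy_codebooks = [
    [[0.20, 0.50], [0.30, 0.60]],
    [[1.00, 1.60], [1.20, 1.80]],
]
lsf = [0.29, 0.62, 1.05, 1.62]
idx = split_vq_encode(lsf, [2, 2], toy_codebooks)
print(idx, split_vq_decode(idx, toy_codebooks))
```

Splitting keeps the search and storage cost manageable: two 1-bit codebooks here hold 4 codewords in total, whereas a single joint 2-bit codebook would need 4 full-length codewords and grows exponentially with the bit budget.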
International Symposium on Communications, Control and Signal Processing | 2008
Eduardo Esteves Vale; Abraham Alcaim
A new algorithm for image enhancement in the two dimensional discrete cosine transform (2D DCT) domain is presented. The proposed scheme uses a band-adaptive contrast modification to enhance a large number of image details. In addition, it avoids noise amplification during the enhancement process. Experiments show the effectiveness of the proposed scheme as compared to other techniques reported in the literature. The new method was applied in the decompression domain of a DCT-based codec.
International Conference on Systems, Signals and Image Processing | 2008
E. E. Vale; A. A. Cunha; Abraham Alcaim
A new approach for robust speaker identification using multiple classifiers in the subband domain is presented. The proposed system uses weights calculated from the energy in each subband of the speech signal to combine a set of likelihoods provided by GMM subband-classifiers. Experiments show the effectiveness of the proposed scheme as compared to other techniques.
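One plausible reading of the energy-weighted combination is sketched below. The abstract does not give the weighting formula, so this example assumes weights proportional to subband energy and a weighted sum of per-subband log-likelihoods; the function names and all numbers are hypothetical, and the GMM scoring itself is replaced by precomputed log-likelihoods.

```python
from typing import List, Sequence


def energy_weights(subband_energies: Sequence[float]) -> List[float]:
    """Normalize subband energies so the weights sum to 1 (assumed weighting rule)."""
    total = sum(subband_energies)
    if total <= 0:
        raise ValueError("total subband energy must be positive")
    return [e / total for e in subband_energies]


def combine_scores(loglik: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """loglik[s][b] is the log-likelihood of speaker s from the classifier of subband b."""
    return [sum(w * l for w, l in zip(weights, row)) for row in loglik]


def identify(loglik: Sequence[Sequence[float]], subband_energies: Sequence[float]) -> int:
    """Return the index of the speaker with the highest combined score."""
    scores = combine_scores(loglik, energy_weights(subband_energies))
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical log-likelihoods for 2 speakers and 3 subbands.
loglik = [
    [-10.0, -12.0, -30.0],  # speaker 0
    [-15.0, -11.0, -8.0],   # speaker 1
]
print(identify(loglik, [6.0, 3.0, 1.0]))  # most energy in the lowest subband
print(identify(loglik, [1.0, 1.0, 1.0]))  # equal weighting for comparison
```

In this toy case the decision changes with the weighting: the low-energy third subband dominates under equal weights but contributes little once the scores are weighted by energy.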
IEEE Transactions on Speech and Audio Processing | 2000
L.M. da Silva; Abraham Alcaim
This article presents a new strategy to encode the LP (linear predictive) short-time spectral envelope (STSE) of speech. A better reconstruction of the STSE is achieved by modifying the usual trade-off between the transmission rate of the LP parameters and the performance of the quantization algorithm. A differential coding based on bidirectional prediction and hybrid vector quantization is used to compensate for the increase in transmission rate. Simulation results show the effectiveness of this coding strategy.
IEEE International Telecommunications Symposium | 2014
Christian Arcos Gordillo; Marco Grivet; Abraham Alcaim
One of the biggest problems of a speech recognition system is signal degradation due to adverse conditions. Such conditions usually lead to a mismatch between the test conditions and the training data, caused by non-linear distortion. The authors propose a histogram mapping followed by neural-network-based filtering (a feature compensation approach) in order to minimize the mismatch caused by noise added to the speech signal. The proposed method has been evaluated using the TIMIT and Noisex-92 databases. Recognition results show that histogram mapping combined with neural network filtering in the cepstral coefficient domain improves the recognition rates.
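The histogram mapping step can be illustrated with a generic quantile-matching sketch: each noisy feature value is replaced by the reference value that sits at the same position in the empirical distribution. This is an assumed, simplified form applied to a single cepstral coefficient; it is not taken from the paper, it omits the neural network filtering stage, and the function name and sample values are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def histogram_map(values: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Map each value to the reference sample at the same empirical CDF position."""
    if not values or not reference:
        raise ValueError("both sequences must be non-empty")
    sorted_vals = sorted(values)
    sorted_ref = sorted(reference)
    n, m = len(sorted_vals), len(sorted_ref)
    mapped = []
    for v in values:
        rank = bisect_right(sorted_vals, v)  # number of samples <= v
        # Mid-rank CDF estimate (rank - 0.5) / n, scaled to an index into the reference.
        idx = min(m - 1, int((rank - 0.5) * m / n))
        mapped.append(sorted_ref[idx])
    return mapped


# Hypothetical values of one cepstral coefficient over three frames.
noisy = [5.0, 1.0, 3.0]
clean_reference = [0.0, 10.0, 20.0]  # samples of the same coefficient from training data
print(histogram_map(noisy, clean_reference))
```

The mapping preserves the ordering of the frames while forcing the distribution of the test feature to match the training distribution, which is the sense in which it compensates for a non-linear distortion.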
Collaboration
Vladimir Fabregas Surigué de Alencar
Pontifical Catholic University of Rio de Janeiro