Antonio M. Peinado | Researchain

Publication

Featured research published by Antonio M. Peinado.

IEEE Transactions on Speech and Audio Processing | 2005

Histogram equalization of speech representation for robust speech recognition

A. de la Torre; Antonio M. Peinado; José C. Segura; José L. Pérez-Córdoba; M.C. Benitez; Antonio J. Rubio

This paper describes a method of compensating for nonlinear distortions in speech representation caused by noise. The method described here is based on the histogram equalization method often used in digital image processing. Histogram equalization is applied to each component of the feature vector in order to improve the robustness of speech recognition systems. The paper describes how the proposed method can be applied to robust speech recognition and it is compared with other compensation techniques. The recognition experiments, including results in the AURORA II framework, demonstrate the effectiveness of histogram equalization when it is applied either alone or in combination with other compensation techniques.
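The per-component equalization described in this abstract can be illustrated with a minimal sketch. It assumes a standard Gaussian as the reference distribution and a rank-based estimate of the empirical CDF; the function name `histogram_equalize` and the toy data are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist

def histogram_equalize(values, reference=NormalDist(0.0, 1.0)):
    """Map one feature component onto a reference distribution.

    The empirical CDF is estimated from ranks, then each value is passed
    through the inverse CDF of the reference (assumed: standard Gaussian).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, idx in enumerate(order):
        cdf = (rank + 0.5) / n  # keeps the CDF strictly inside (0, 1)
        equalized[idx] = reference.inv_cdf(cdf)
    return equalized

# Toy data: a monotonic nonlinear distortion of one "clean" component.
clean = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
noisy = [math.exp(x) + 3.0 for x in clean]

print(histogram_equalize(noisy))
```

Because the mapping depends only on ranks, any strictly increasing distortion of a component yields the same equalized values as the undistorted component, which is the intuition behind using it against nonlinear distortions. Real noise is not a deterministic monotonic map, so this is only an idealized case.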

International Conference on Acoustics, Speech, and Signal Processing | 2002

Non-linear transformations of the feature space for robust Speech Recognition

Ángel de la Torre; José C. Segura; M. Carmen Benítez; Antonio M. Peinado; Antonio J. Rubio

Noise usually produces a non-linear distortion of the feature space used for automatic speech recognition. This distortion causes a mismatch between the training and recognition conditions which significantly degrades the performance of speech recognizers. In this contribution we analyze the effect of additive noise on cepstral-based representations and compare several approaches to compensate for this effect. We discuss the importance of the non-linearities introduced by the noise and propose a method (based on the histogram equalization technique) specifically oriented to the compensation of the non-linear transformation caused by additive noise. The proposed method has been evaluated using the AURORA-2 database and task. The recognition results show significant improvements with respect to other compensation methods reported in the literature and reveal the importance of the non-linear effects of the noise and the utility of the proposed method.

IEEE Transactions on Signal Processing | 1995

Diagonalizing properties of the discrete cosine transforms

Victoria E. Sánchez; P. Garcia; Antonio M. Peinado; José C. Segura; Antonio J. Rubio

Since its introduction in 1974 by Ahmed et al., the discrete cosine transform (DCT) has become a significant tool in many areas of digital signal processing, especially in signal compression. There exist eight types of discrete cosine transforms (DCTs). We obtain the eight types of DCTs as the complete orthonormal set of eigenvectors generated by a general form of matrices, in the same way as the discrete Fourier transform (DFT) can be obtained as the eigenvectors of an arbitrary circulant matrix. These matrices can be decomposed as the sum of a symmetric Toeplitz matrix plus a Hankel or close to Hankel matrix scaled by some constant factors. We also show that all the previously proposed generating matrices for the DCTs are simply particular cases of these general matrix forms. Using these matrices, we obtain, for each DCT, a class of stationary processes verifying certain conditions with respect to which the corresponding DCT has a good asymptotic behavior, in the sense that it approaches Karhunen-Loève transform performance as the block size N tends to infinity. As a particular result, we prove that the eight types of DCTs are asymptotically optimal for all finite-order Markov processes. We finally study the decorrelating power of the DCTs, obtaining expressions that show the decorrelating behavior of each DCT with respect to any stationary process.
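As a small numerical illustration of the eigenvector view, the sketch below checks one classic special case, not the general matrix form of the paper: the DCT-II basis vectors are eigenvectors of the tridiagonal matrix with 2 on the diagonal and -1 next to it, with the two corner entries lowered to 1. This matrix can be read as a symmetric Toeplitz matrix plus a correction confined to the corners. The function names are hypothetical, and N is assumed to be at least 2.

```python
import math

def dct2_basis(N):
    """Rows are unnormalized DCT-II vectors v_k[n] = cos(pi*k*(n + 1/2)/N)."""
    return [[math.cos(math.pi * k * (n + 0.5) / N) for n in range(N)]
            for k in range(N)]

def generating_matrix(N):
    """Tridiagonal Toeplitz part (2, -1) with corner entries set to 1 (N >= 2)."""
    A = [[0.0] * N for _ in range(N)]
    for n in range(N):
        A[n][n] = 2.0
        if n > 0:
            A[n][n - 1] = -1.0
        if n < N - 1:
            A[n][n + 1] = -1.0
    A[0][0] = 1.0
    A[N - 1][N - 1] = 1.0
    return A

def max_eigen_residual(N):
    """Largest entry of |A v_k - lambda_k v_k| over all k."""
    A = generating_matrix(N)
    worst = 0.0
    for k, v in enumerate(dct2_basis(N)):
        lam = 2.0 - 2.0 * math.cos(math.pi * k / N)  # eigenvalue paired with v_k
        for n in range(N):
            Av_n = sum(A[n][m] * v[m] for m in range(N))
            worst = max(worst, abs(Av_n - lam * v[n]))
    return worst

print(max_eigen_residual(8))
```

The residual should be at the level of floating-point rounding, which confirms the eigenvector relation for this particular matrix.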

Speech Communication | 2003

HMM-based channel error mitigation and its application to distributed speech recognition

Antonio M. Peinado; Victoria E. Sánchez; José L. Pérez-Córdoba; Ángel de la Torre

The emergence of distributed speech recognition has generated the need to mitigate the degradations that the transmission channel introduces in the speech features used for recognition. This work proposes a hidden Markov model (HMM) framework from which different mitigation techniques oriented to wireless channels can be derived. First, we study the performance of two techniques based on the use of a minimum mean square error (MMSE) estimation, a raw MMSE and a forward MMSE estimation, over additive white Gaussian noise (AWGN) channels. These techniques are also adapted to bursty channels. Then, we propose two new mitigation methods especially suitable for bursty channels. The first one is based on a forward-backward MMSE estimation and the second one on the well-known Viterbi algorithm. Different experiments are carried out, dealing with several issues such as the application of hard decisions on the received bits or the influence of the estimated channel SNR. The experimental results show that the HMM-based techniques can effectively mitigate channel errors, even in very poor channel conditions.
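The idea of a forward MMSE estimate can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the source is modeled as a first-order Markov chain over scalar quantizer indices, the channel is a memoryless binary symmetric channel with hard-decision bits (instead of AWGN), and all names, centroids, and probabilities are hypothetical.

```python
def bsc_likelihood(received_bits, codeword_bits, p_err):
    """P(received | sent) for a memoryless binary symmetric channel."""
    like = 1.0
    for r, c in zip(received_bits, codeword_bits):
        like *= p_err if r != c else 1.0 - p_err
    return like

def forward_mmse(received, centroids, bit_patterns, prior, trans, p_err):
    """Forward-only MMSE decoding of a sequence of quantizer indices.

    alpha_t(j) is proportional to
        P(r_t | j) * sum_i alpha_{t-1}(i) * trans[i][j]
    and the estimate at time t is sum_j centroids[j] * alpha_t(j).
    """
    K = len(centroids)
    estimates = []
    alpha = None
    for r in received:
        if alpha is None:
            predicted = prior
        else:
            predicted = [sum(alpha[i] * trans[i][j] for i in range(K))
                         for j in range(K)]
        unnorm = [bsc_likelihood(r, bit_patterns[j], p_err) * predicted[j]
                  for j in range(K)]
        total = sum(unnorm)
        alpha = [u / total for u in unnorm]
        estimates.append(sum(c * a for c, a in zip(centroids, alpha)))
    return estimates

# Hypothetical 2-bit quantizer with a "sticky" source model.
centroids = [-1.5, -0.5, 0.5, 1.5]
bit_patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
prior = [0.25, 0.25, 0.25, 0.25]
trans = [[0.7 if i == j else 0.1 for j in range(4)] for i in range(4)]

received = [(0, 0), (0, 0), (1, 1), (0, 0)]  # third frame looks corrupted
print(forward_mmse(received, centroids, bit_patterns, prior, trans, p_err=0.1))
```

When the channel is reliable, the estimate follows the received codeword; when it is unreliable, the estimate leans on the source model and on what was received in earlier frames.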

IEEE Transactions on Multimedia | 2013

Sequential Error Concealment for Video/Images by Sparse Linear Prediction

Ján Koloda; Jan Østergaard; Søren Holdt Jensen; Victoria E. Sánchez; Antonio M. Peinado

In this paper, we propose a novel sequential error concealment algorithm for video and images based on sparse linear prediction. Block-based coding schemes in packet loss environments are considered. Images are modelled by means of linear prediction, and missing macroblocks are sequentially reconstructed using the available groups of pixels. The optimal predictor coefficients are computed by applying a missing data regression imputation procedure with a sparsity constraint. Moreover, an efficient procedure for the computation of these coefficients based on an exponential approximation is also proposed. Both techniques provide high-quality reconstructions and outperform the state-of-the-art algorithms both in terms of PSNR and MS-SSIM.

IEEE Transactions on Wireless Communications | 2006

Recognition of coded speech transmitted over wireless channels

Angel M. Gomez; Antonio M. Peinado; Victoria E. Sánchez; Antonio J. Rubio

Network-based speech recognition (NSR) and distributed speech recognition (DSR) have been proposed as solutions to translate speech recognition technologies to mobile environments. NSR is the most straightforward solution since it does not require any modification in the mobile phone; however, DSR offers higher robustness against codec compression and transmission channel degradation. This paper explores an alternative approach for remote speech recognition which combines the advantages of NSR and DSR. In this scheme, a standard speech codec is used for speech transmission but the recognition is performed from the received codec parameters. In particular, we focus on the effect of transmission channel errors, which can cause a more severe performance reduction on speech recognition than codec distortion. First, we show that an NSR solution can approach DSR through a reconstruction technique along with an adapted noise reduction technique originally proposed for acoustic noise. Then, these results are improved by working with recognition features directly extracted from the codec bitstream by means of parameter transcoding. Required modifications on current networks in order to access the bitstream are described. The network upgrading with the tandem free operation (TFO) protocol is an attractive solution. This upgrade not only offers an overall improvement on the end-to-end speech quality, but would also allow a recognition performance similar to, and even higher in poor channel conditions than, that obtained by DSR when parameter transcoding along with the proposed mitigation techniques are applied.

Speech Communication | 1996

An application of minimum classification error to feature space transformations for speech recognition

Ángel de la Torre; Antonio M. Peinado; Antonio J. Rubio; Victoria E. Sánchez; Jesús E. Díaz

The use of signal transformations is a necessary step for feature extraction in pattern recognition systems. These transformations should take into account the main goal of pattern recognition: the error-rate minimization. In this paper we propose a new method to obtain feature space transformations based on the Minimum Classification Error criterion. The goal of these transformations is to obtain a new representation space where the Euclidean distance is optimal for classification. The proposed method is tested on a speech recognition system using different types of Hidden Markov Models. The comparison with standard pre-processing techniques shows that our method provides an error-rate reduction in all the performed experiments.

IEEE Transactions on Audio, Speech, and Language Processing | 2013

MMSE-Based Missing-Feature Reconstruction With Temporal Modeling for Robust Speech Recognition

José A. González; Antonio M. Peinado; Ning Ma; Angel M. Gomez; Jon Barker

This paper addresses the problem of feature compensation in the log-spectral domain by using the missing-data (MD) approach to noise robust speech recognition, that is, the log-spectral features can be either almost unaffected by noise or completely masked by it. First, a general MD framework based on minimum mean square error (MMSE) estimation is introduced which exploits the correlation across frequency bands to reconstruct the missing features. This framework allows the derivation of different MD imputation approaches and, in particular, a novel technique taking advantage of truncated Gaussian distributions is presented. While the proposed technique provides excellent results at high and medium signal-to-noise ratios (SNRs), its performance diminishes at low SNRs where very few reliable features are available. The reconstruction technique is therefore extended to exploit temporal constraints using two different approaches. In the first approach, time-frequency patches of speech containing a number of consecutive frames are modeled using a Gaussian mixture model (GMM). In the second one, the sequential structure of speech is alternatively modeled by a hidden Markov model (HMM). The proposed techniques are evaluated on Aurora-2 and Aurora-4 databases using both oracle and estimated masks. In both cases, the proposed techniques outperform the recognition performance obtained by the baseline system and other related techniques. Also, the introduction of a temporal modeling turns out to be very effective in reconstructing spectra at low SNRs. In particular, HMMs show the highest capability of accounting for time correlations and, therefore, achieve the best results.
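To make the role of truncated Gaussians concrete, here is a minimal single-band sketch. It assumes a scalar Gaussian prior for the clean log-spectral feature and uses the observed noisy value as an upper bound on it, a common bounded-imputation assumption. The paper's framework uses richer models (correlation across bands, GMMs, HMMs); the function name and numbers below are hypothetical.

```python
from statistics import NormalDist

def truncated_gaussian_mean(mu, sigma, upper):
    """E[x | x <= upper] for x ~ N(mu, sigma^2).

    Closed form: mu - sigma * phi(a) / Phi(a), with a = (upper - mu) / sigma.
    Note: Phi(a) can underflow to zero when upper is far below mu.
    """
    std = NormalDist()
    a = (upper - mu) / sigma
    return mu - sigma * std.pdf(a) / std.cdf(a)

# A masked feature: prior mean 10.0, std 2.0, noisy observation 9.0.
print(truncated_gaussian_mean(10.0, 2.0, 9.0))
```

The estimate always lies below both the prior mean and the bound, so an unreliable observation still constrains the reconstruction instead of being discarded.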

IEEE Transactions on Multimedia | 2006

Combining Media-Specific FEC and Error Concealment for Robust Distributed Speech Recognition Over Loss-Prone Packet Channels

Angel M. Gomez; Antonio M. Peinado; Victoria E. Sánchez; Antonio J. Rubio

This paper presents a mixed recovery scheme for robust distributed speech recognition (DSR) implemented over a packet channel which suffers packet losses. The scheme combines media-specific forward error correction (FEC) and error concealment (EC). Media-specific FEC is applied at the client side, where FEC bits representing strongly quantized versions of the speech vectors are introduced. At the server side, the information provided by those FEC bits is used by the EC algorithm to improve the recognition performance. We investigate the adaptation of two different EC techniques, namely minimum mean square error (MMSE) estimation, which operates at the decoding stage, and weighted Viterbi recognition (WVR), where EC is applied at the recognition stage, in order to be used along with FEC. The experimental results show that a significant increase in recognition accuracy can be obtained with very little bandwidth increase, which may be null in practice, and a limited increase in latency, which in any case is not so critical for an application such as DSR.

IEEE Transactions on Wireless Communications | 2005

Efficient MMSE-based channel error mitigation techniques. Application to distributed speech recognition over wireless channels

Antonio M. Peinado; Victoria E. Sánchez; José L. Pérez-Córdoba; Antonio J. Rubio

This work addresses the mitigation of channel errors by means of efficient minimum mean-square-error (MMSE) estimation. Although powerful model-based implementations have been recently proposed, the computational burden involved can make them impractical. We propose two new approaches that maintain a good level of performance with a low computational complexity. These approaches keep the simple structure and complexity of a raw MMSE estimation, although they enhance it with additional source a priori knowledge. The proposed techniques are built on a distributed speech recognition system. Different degrees of tradeoff between recognition performance and computational complexity are obtained.
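A raw MMSE estimate of the kind this abstract refers to can be sketched as a posterior-weighted average of quantizer centroids. The sketch assumes BPSK-style soft outputs over an AWGN channel (bit 0 sent as -1, bit 1 as +1), a scalar 2-bit quantizer, and a frame-independent prior; all names and values are hypothetical, and the paper's two efficient approaches are not reproduced here.

```python
import math

def raw_mmse(received, centroids, bit_patterns, prior, noise_var):
    """Posterior-mean estimate from soft channel outputs.

    Each codeword j gets weight prior[j] * exp(-||r - s_j||^2 / (2 * noise_var)),
    where s_j is the +/-1 signal for bit_patterns[j].
    """
    weights = []
    for bits, p in zip(bit_patterns, prior):
        dist2 = sum((r - (2 * b - 1)) ** 2 for r, b in zip(received, bits))
        weights.append(p * math.exp(-dist2 / (2.0 * noise_var)))
    total = sum(weights)
    return sum(c * w for c, w in zip(centroids, weights)) / total

centroids = [-1.5, -0.5, 0.5, 1.5]
bit_patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]  # stands in for source a priori knowledge

# An ambiguous observation: the prior decides where the estimate lands.
print(raw_mmse((0.0, 0.0), centroids, bit_patterns, uniform, noise_var=1.0))
print(raw_mmse((0.0, 0.0), centroids, bit_patterns, skewed, noise_var=1.0))
```

With a uniform prior an ambiguous observation yields the codebook mean, while a skewed prior pulls the estimate toward the more probable centroid, which is one simple way additional source knowledge can sharpen a raw MMSE estimate.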
