Publication


Featured research published by M. Carmen Benítez.


Speech Communication | 2004

Efficient voice activity detection algorithms using long-term speech information

Javier Ramírez; José C. Segura; M. Carmen Benítez; Ángel de la Torre; Antonio J. Rubio

Currently, there are technology barriers that prevent speech processing systems from working under extremely noisy conditions. The emerging applications of speech technology, especially in the fields of wireless communications, digital hearing aids or speech recognition, are examples of such systems and often require a noise reduction technique operating in combination with a precise voice activity detector (VAD). This paper presents a new VAD algorithm for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm measures the long-term spectral divergence (LTSD) between speech and noise and formulates the speech/non-speech decision rule by comparing the long-term spectral envelope to the average noise spectrum, thus yielding a highly discriminating decision rule and minimizing the average number of decision errors. The decision threshold is adapted to the measured noise energy, while a controlled hang-over is activated only when the observed signal-to-noise ratio is low. An analysis of the speech/non-speech LTSD distributions shows that using long-term information about speech signals is beneficial for VAD. The proposed algorithm is compared to the most commonly used VADs in the field, in terms of speech/non-speech discrimination and in terms of recognition performance when the VAD is used for an automatic speech recognition system. Experimental results demonstrate a sustained advantage over standard VADs such as G.729 and adaptive multi-rate (AMR), which were used as a reference, and over the VADs of the advanced front-end for distributed speech recognition.
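The decision rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the window parameter `order`, and the fixed `threshold_db` are assumptions (the abstract states that the real threshold adapts to the measured noise energy and that a hang-over is applied at low SNR, neither of which is shown here).

```python
import math

def ltsd(spectra, noise, frame, order):
    """Long-term spectral divergence (in dB) around one frame.

    spectra: list of per-frame magnitude spectra (lists of equal length)
    noise:   average noise magnitude spectrum (same length, all > 0)
    order:   frames considered on each side of `frame` (hypothetical name)
    """
    lo = max(0, frame - order)
    hi = min(len(spectra), frame + order + 1)
    n_bins = len(noise)
    # Long-term spectral envelope: per-bin maximum over the window of frames.
    envelope = [max(spectra[j][k] for j in range(lo, hi)) for k in range(n_bins)]
    # Average ratio of envelope power to noise power across frequency bins.
    ratio = sum(envelope[k] ** 2 / noise[k] ** 2 for k in range(n_bins)) / n_bins
    return 10.0 * math.log10(ratio)

def is_speech(spectra, noise, frame, order=6, threshold_db=6.0):
    # Assumption: a fixed threshold stands in for the adaptive one.
    return ltsd(spectra, noise, frame, order) > threshold_db
```

Because the envelope takes the maximum over neighbouring frames, a frame next to a strong speech frame is also flagged, which is the long-term behaviour the abstract refers to.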


IEEE Signal Processing Letters | 2005

Statistical voice activity detection using a multiple observation likelihood ratio test

Javier Ramírez; José C. Segura; M. Carmen Benítez; Luz García; Antonio J. Rubio

Currently, there are technology barriers inhibiting speech processing systems that work in extremely noisy conditions from meeting the demands of modern applications. This letter presents a new voice activity detector (VAD) for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm defines an optimum likelihood ratio test (LRT) involving multiple and independent observations. The resulting decision rule yields significant improvements in speech/nonspeech discrimination accuracy over existing VAD methods that are defined on a single observation and need empirically tuned hangover mechanisms. The algorithm has an inherent delay that, for several applications, including robust speech recognition, does not represent a serious implementation obstacle. An analysis of the overlap between the distributions of the decision variable shows the improved robustness of the proposed approach by means of a clear reduction of the classification error as the number of observations is increased. The proposed strategy is also compared to different VAD methods, including the G.729, AMR, and AFE standards, as well as recently reported algorithms, showing a sustained advantage in speech/nonspeech detection accuracy and speech recognition performance.
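For independent observations, the logarithm of the joint likelihood ratio is the sum of the per-frame log-likelihood ratios, which is the core of a multiple-observation test. The sketch below illustrates that idea only. The per-frame model (a Gaussian model per frequency bin with known noise and speech variances) and all names, including `m` for the half-window size, are assumptions and are not taken from the letter.

```python
import math

def frame_llr(power, noise_var, speech_var):
    """Per-frame log-likelihood ratio, averaged over frequency bins.

    Assumes a Gaussian model per bin (an assumption of this sketch).
    power:      observed power per bin
    noise_var:  noise variance per bin
    speech_var: speech variance per bin
    """
    total = 0.0
    for p, ln, ls in zip(power, noise_var, speech_var):
        xi = ls / ln      # a priori SNR
        gamma = p / ln    # a posteriori SNR
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(power)

def multi_observation_llr(frame_llrs, center, m):
    """Sum the log-likelihood ratios of the 2m+1 frames around `center`."""
    lo = max(0, center - m)
    hi = min(len(frame_llrs), center + m + 1)
    return sum(frame_llrs[lo:hi])
```

The look-ahead of `m` frames is the source of the inherent delay mentioned in the abstract: the decision for a frame cannot be made until the following `m` frames have been observed.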


International Conference on Acoustics, Speech, and Signal Processing | 2002

Non-linear transformations of the feature space for robust Speech Recognition

Ángel de la Torre; José C. Segura; M. Carmen Benítez; Antonio M. Peinado; Antonio J. Rubio

Noise usually produces a non-linear distortion of the feature space considered for Automatic Speech Recognition. This distortion causes a mismatch between the training and recognition conditions which significantly degrades the performance of speech recognizers. In this contribution we analyze the effect of additive noise on cepstral-based representations and we compare several approaches to compensate for this effect. We discuss the importance of the non-linearities introduced by the noise and we propose a method (based on the histogram equalization technique) specifically oriented to the compensation of the non-linear transformation caused by the additive noise. The proposed method has been evaluated using the AURORA-2 database and task. The recognition results show significant improvements with respect to other compensation methods reported in the literature and reveal the importance of the non-linear effects of the noise and the utility of the proposed method.
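Histogram equalization, in its generic form, maps each feature value through its empirical cumulative distribution and then through the inverse cumulative distribution of a reference. The sketch below shows that generic mapping for one feature component. The choice of a standard Gaussian reference, the rank-based CDF estimate, and the function name are assumptions, and the paper's exact procedure may differ.

```python
from statistics import NormalDist

def histogram_equalize(values, reference=NormalDist(0.0, 1.0)):
    """Map one feature component so its histogram matches `reference`.

    values: observed values of a single feature component over an utterance.
    Returns the equalized values in the original order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Empirical CDF from the rank; the 0.5 offset keeps it inside (0, 1).
        cdf = (rank + 0.5) / n
        out[i] = reference.inv_cdf(cdf)
    return out
```

The mapping is monotonic, so the ordering of the values is preserved while any monotonic non-linear distortion of the component is undone relative to the reference distribution.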


Speech Communication | 2002

Discriminative feature weighting for HMM-based continuous speech recognizers

Ángel de la Torre; Antonio M. Peinado; Antonio J. Rubio; José C. Segura; M. Carmen Benítez

The Discriminative Feature Extraction (DFE) method provides an appropriate formalism for the design of the front-end feature extraction module in pattern classification systems. In recent years, this formalism has been successfully applied to different speech recognition problems, such as classification of vowels, classification of phonemes or isolated word recognition. The DFE formalism can be applied to weight the contribution of the components in the feature vector. This variant of DFE, which we call Discriminative Feature Weighting (DFW), improves pattern classification systems by enhancing the components that are most relevant for discriminating among the different classes. This paper is dedicated to the application of the DFW formalism to Continuous Speech Recognizers (CSR) based on Hidden Markov Models (HMMs). Two different types of HMM-based speech recognizers are considered: recognizers based on Discrete-HMMs (DHMMs) (for which the acoustic evaluation is based on a Euclidean distance measure) and Semi-Continuous-HMMs (SCHMMs) (for which the acoustic evaluation is performed using a mixture of multivariate Gaussians). We report how the components can be weighted and how the weights can be discriminatively trained and applied to the speech recognizers. We present recognition results for several continuous speech recognition tasks. The experimental results show the utility of DFW for HMM-based continuous speech recognizers.
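For the discrete-HMM case, where acoustic evaluation relies on a Euclidean distance, feature weighting can be pictured as a per-component weight inside the squared distance used for vector quantization. The sketch below shows only that weighted distance. The function names are hypothetical and the discriminative training of the weights, which is the paper's contribution, is not shown.

```python
def weighted_sq_distance(x, centroid, weights):
    """Squared Euclidean distance with one non-negative weight per component."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, centroid))

def nearest_codeword(x, codebook, weights):
    """Index of the codebook entry closest to x under the weighted distance."""
    return min(range(len(codebook)),
               key=lambda i: weighted_sq_distance(x, codebook[i], weights))
```

With uniform weights this reduces to ordinary vector quantization. Raising the weight of a component makes differences along that component count more, which can change which codeword is selected.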


IEEE Geoscience and Remote Sensing Letters | 2013

An Automatic P-Phase Picking Algorithm Based on Adaptive Multiband Processing

Isaac Alvarez; Luz García; Sonia Mota; Guillermo Cortés; M. Carmen Benítez; Ángel de la Torre

This letter presents a novel picking algorithm which allows an automated determination of the P-phase onset time. The algorithm includes adaptive multiband processing and noise-reduction techniques to allow a confident onset time estimation in signals strongly affected by background and/or nonstationary noise processes. Results using a set of 3780 computer-generated earthquake-like signals show that the accuracy is much better than that achieved by the conventional STA/LTA algorithm. In addition, the accuracy of the proposed method is improved when it is combined with an autoregressive method. An application of the algorithm to a set of 400 natural earthquakes confirms that the combination of both algorithms provides a precise P-phase onset time estimation in real environments, overcoming the limitations associated with the autoregressive method.
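For context, the conventional STA/LTA baseline named in this abstract compares a short-term average of signal energy with a long-term average and declares an onset when their ratio exceeds a threshold. The sketch below illustrates that baseline only, not the proposed multiband algorithm. The window lengths, the threshold and the function names are assumptions chosen for the example.

```python
def sta_lta(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy at each sample.

    Both windows end at the current sample; positions without a full
    long-term window are left at 0.0.
    """
    energy = [s * s for s in signal]
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = sum(energy[i - n_sta + 1:i + 1]) / n_sta
        lta = sum(energy[i - n_lta + 1:i + 1]) / n_lta
        if lta > 0.0:
            ratios[i] = sta / lta
    return ratios

def pick_onset(signal, n_sta=2, n_lta=10, threshold=3.0):
    """Index of the first sample whose STA/LTA ratio exceeds the threshold."""
    for i, r in enumerate(sta_lta(signal, n_sta, n_lta)):
        if r > threshold:
            return i
    return None
```

A single broadband ratio of this kind is sensitive to background noise level, which is the limitation that multiband processing and noise reduction are meant to address.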


IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | 2016

A Comparative Study of Dimensionality Reduction Algorithms Applied to Volcano-Seismic Signals

Guillermo Cortés; M. Carmen Benítez; Luz García; Isaac Alvarez; Jesús M. Ibáñez

Detection and classification of the different seismic events are important tasks in volcanological observatories. Automating these tasks is fundamental for the volcanological community. It is crucial to choose how the seismic signal is represented in terms of parameters or features useful for the automatic classification problem, since the number and type of parameters can be very large, leading to the curse of dimensionality. Machine learning theory establishes that in order to build a classifier from a labeled database, there should be a compromise between the complexity of the classifier and the size of the database. Since generating a manually labeled database is tedious work performed by specialists in volcanology, the size of the databases limits the complexity of the classification systems built from them. On the other hand, if the databases could be represented by a reduced, but relevant, number of features, the classifier could be simpler. In order to study the problem just described, this paper performs a comparative study of different classical techniques of dimensionality reduction (DR) of the feature set. The algorithms implemented include feature selection techniques such as wrappers and filters, and methods which directly transform the original feature space into another with lower dimension. All algorithms have been tested using an automatic classification system of volcano-seismic events. The best results have been obtained with the discriminative feature selection (DFS) algorithm, which belongs to the set of wrapper methods.
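A wrapper method scores candidate feature subsets by training and evaluating a classifier on them. The sketch below shows a generic greedy forward-selection wrapper, not the DFS algorithm from the paper. The nearest-centroid classifier, the leave-one-out scoring and all function names are assumptions chosen to keep the example self-contained.

```python
def centroid_accuracy(samples, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier
    restricted to the given feature indices."""
    correct = 0
    for held in range(len(samples)):
        sums, counts = {}, {}
        for i, (x, y) in enumerate(zip(samples, labels)):
            if i == held:
                continue
            acc = sums.setdefault(y, [0.0] * len(features))
            for j, f in enumerate(features):
                acc[j] += x[f]
            counts[y] = counts.get(y, 0) + 1

        def dist(y):
            # Squared distance from the held-out sample to the class centroid.
            return sum((samples[held][f] - sums[y][j] / counts[y]) ** 2
                       for j, f in enumerate(features))

        if min(sums, key=dist) == labels[held]:
            correct += 1
    return correct / len(samples)

def forward_select(samples, labels, n_keep):
    """Greedily add the feature that most improves classifier accuracy."""
    chosen = []
    remaining = list(range(len(samples[0])))
    while remaining and len(chosen) < n_keep:
        best = max(remaining,
                   key=lambda f: centroid_accuracy(samples, labels, chosen + [f]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

A filter method would instead rank features by a statistic computed without a classifier, which is cheaper but does not account for how the classifier actually uses the features.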


International Geoscience and Remote Sensing Symposium | 2009

Evaluating robustness of a HMM-based classification system of volcano-seismic events at Colima and Popocatepetl volcanoes

Guillermo Cortés; Raúl Arámbula; Ligdamis A. Gutiérrez; M. Carmen Benítez; Jesús M. Ibáñez; Philippe Lesage; Isaac Alvarez; Luz García

This work presents a continuous volcano-seismic classification system based on Hidden Markov Models as a solution to the recent strong need for automatic event detection and recognition methods in early warning and monitoring scenarios. Furthermore, our system includes a reliable method to assign confidence measures to the recognized signals in order to evaluate the robustness of the results. Data from the two most active volcanoes have been used to test the system's reliability on a complex joint corpus, achieving a recognition accuracy higher than 78% in blind recognition tests.


Speech Communication | 2015

Sub-band based histogram equalization in cepstral domain for speech recognition

Vikas Joshi; Raghavendra Bilgi; Srinivasan Umesh; Luz García; M. Carmen Benítez

We propose a novel extension to the Histogram Equalization method for noise compensation. We perform a sub-band specific equalization on the noisy cepstral features. Histogram analysis and recognition results show the usefulness of the proposed approach. The approach is favorable in real-time systems due to superior performance and computational benefits.

This paper describes a novel framework for sub-band based Histogram Equalization (HEQ) applied to robust speech recognition. We propose a frequency band specific equalization to compensate for the noise distortion on the individual frequency bands. The proposed equalization framework is a two-step process. In the first step, conventional histogram equalization is done. By analyzing the histograms of equalized cepstra, we show that the first stage of the conventional HEQ approach does not compensate for the sub-band specific noise distortion, even though the overall histogram is normalized. Hence, in the second stage, sub-band specific histogram equalization is done. Every frame of cepstral coefficients is decomposed into low-frequency (LF) cepstra and high-frequency (HF) cepstra. Separate equalization is done on LF and HF cepstra to compensate for LF and HF specific noise distortion. The cepstra corresponding to the LF and HF bands are obtained by using simple averaging and differencing filters on the cepstral components within a particular frame. The proposed approach is referred to as Sub-band Histogram Equalization (S-HEQ). Using histogram analysis, we show that the S-HEQ approach is able to compensate for the sub-band specific noise distortion. The S-HEQ approach shows a consistent improvement over the conventional HEQ approach, with a relative improvement in WER of 12% and 22.10% on the Aurora-2 and Aurora-4 databases respectively. The proposed equalization approach can also be used with deep neural network based systems and has shown a consistent improvement in recognition accuracy over conventional HEQ. Finally, the efficacy of the proposed S-HEQ approach for embedded real-time speech applications is shown by comparing the trade-off between performance and computational complexity with other state-of-the-art noise compensation methods.
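The decomposition of a cepstral frame into LF and HF parts with averaging and differencing filters can be pictured with a pairwise (Haar-like) split. The sketch below is one possible reading of that sentence, not the filters used in the paper: the pairing of adjacent coefficients, the requirement of an even-length frame and the function names are all assumptions.

```python
def split_bands(cepstra):
    """Split one frame into LF (pairwise averages) and HF (pairwise differences).

    Assumption: adjacent coefficients are paired, so the frame length must be even.
    """
    if len(cepstra) % 2:
        raise ValueError("expects an even number of coefficients")
    lf = [(cepstra[i] + cepstra[i + 1]) / 2.0 for i in range(0, len(cepstra), 2)]
    hf = [(cepstra[i] - cepstra[i + 1]) / 2.0 for i in range(0, len(cepstra), 2)]
    return lf, hf

def merge_bands(lf, hf):
    """Invert split_bands: rebuild the frame from its LF and HF parts."""
    out = []
    for a, d in zip(lf, hf):
        out.extend([a + d, a - d])
    return out
```

In a two-stage scheme of the kind described, each band would be equalized separately across frames between the split and the merge. That equalization step is omitted here.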


International Conference on Acoustics, Speech, and Signal Processing | 2006

Continuous HMM-Based Volcano Monitoring at Deception Island, Antarctica

M. Carmen Benítez; Javier Ramírez; José C. Segura; Antonio J. Rubio; Jesús M. Ibáñez; Javier Almendros; Araceli García-Yeguas

This paper presents a complete volcano monitoring system that has been developed on the basis of the seismicity observed during three summer Antarctic surveys at Deception Island Volcano (Antarctica). The system is based on state-of-the-art hidden Markov modelling (HMM) techniques successfully applied to other scenarios. A database containing a representative set of different seismic events, including volcano-tectonic earthquakes, long-period events, volcanic tremor and hybrid events recorded during the 1994-1995 and 1995-1996 seismic surveys, was collected for training and testing. Simple left-to-right HMMs and multivariate Gaussian probability density functions (PDF) with diagonal covariance matrix were used. The feature vector consists of the log-energies of a filter-bank of 16 triangular weighting functions uniformly spaced between 0 and 20 Hz, plus the first and second order derivatives. The system is suitable for real-time operation and its accuracy is close to 90%. When the system was tested with a different data set, including mainly long-period events registered during several seismic swarms during the 2001-2002 field survey, more than 95% of the recognized events were correctly marked by the recognition system.
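The feature vector described in this abstract (16 filter-bank log-energies plus first and second derivatives, giving 48 values per frame) can be sketched as follows. The half-overlapping placement of the triangles, the energy floor, the simple two-point difference used for the derivatives and all function names are assumptions of this sketch.

```python
import math

def triangular_filterbank(bin_freqs, n_filters=16, f_lo=0.0, f_hi=20.0):
    """Triangular weights with centers uniformly spaced between f_lo and f_hi.

    Assumption: each triangle spans from the previous center to the next one.
    """
    step = (f_hi - f_lo) / (n_filters + 1)
    bank = []
    for m in range(1, n_filters + 1):
        center = f_lo + m * step
        bank.append([max(0.0, 1.0 - abs(f - center) / step) for f in bin_freqs])
    return bank

def log_energies(power_spectrum, bank, floor=1e-12):
    """Log of the filter-weighted energy for each filter in the bank."""
    return [math.log(max(floor, sum(w * p for w, p in zip(row, power_spectrum))))
            for row in bank]

def add_deltas(frames):
    """Append first and second time derivatives to each frame's features."""
    def diff(seq):
        # Two-point difference between the next and previous frame,
        # clamped at the sequence edges.
        last = len(seq) - 1
        return [[b - a for a, b in zip(seq[max(t - 1, 0)], seq[min(t + 1, last)])]
                for t in range(len(seq))]
    d1 = diff(frames)
    d2 = diff(d1)
    return [f + a + b for f, a, b in zip(frames, d1, d2)]
```

Each resulting vector would then be modelled by the Gaussian densities of the HMM states. With a diagonal covariance matrix, each of the 48 components is treated as independent within a state.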


International Conference on Acoustics, Speech, and Signal Processing | 2006

Gaialab: A Weblab Project for Digital Communications Distributed Learning

Javier Ramírez; José C. Segura; Juan Manuel Górriz; M. Carmen Benítez; Antonio J. Rubio

This paper presents GAIALAB, a Weblab project for digital communications distributed learning developed at the University of Granada (Spain). This initiative has been funded by a University program to improve teaching quality. The project addresses problems of teaching and learning innovatively and effectively by enabling user interaction with the simulation engines through a Web browser. Our Weblab project provides an interface with JAVA applets that have been developed and integrated in the system. The Weblab project enables simulating digital communication systems and analyzing the results obtained through the Web. Moreover, all the applets that are included in the system have been designed to allow the modification of the simulation parameters, thus enabling a better understanding of the algorithms. The system also permits evaluating the experimental work carried out by the students.

Collaboration


Dive into M. Carmen Benítez's collaborations.

Top Co-Authors


Srinivasan Umesh

Indian Institute of Technology Kanpur
