Publications

Featured research published by Climent Nadeu.


Speech Communication | 2000

Time and frequency filtering of filter-bank energies for robust HMM speech recognition

Climent Nadeu; Dusan Macho; Javier Hernando

Every speech recognition system requires a signal representation that parametrically models the temporal evolution of the speech spectral envelope. Current parameterizations involve, either explicitly or implicitly, a set of energies from frequency bands which are often distributed on a mel scale. The computation of those energies is performed in diverse ways, but it always includes smoothing of basic spectral measurements and non-linear amplitude compression. Several linear transformations are then applied to the two-dimensional time-frequency sequence of energies before entering the HMM pattern matching stage. In this paper, a recently introduced technique that consists of filtering that sequence of energies along the frequency dimension is presented, and its resulting parameters are compared with the widely used cepstral coefficients. Then, that frequency filtering transformation is jointly considered with the time filtering transformation that is used to compute dynamic parameters, showing that the flexibility of this combined (tiffing) approach can be used to design a robust set of filters. Recognition experiment results are reported which show the potential of tiffing for enhanced and more robust HMM speech recognition.
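The two filtering operations named in the abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's exact filters: the frequency-filter taps, the regression span, and the synthetic energy matrix are all assumptions.

```python
import numpy as np

def frequency_filter(log_energies, taps=(1.0, 0.0, -1.0)):
    """Filter each frame's log filter-bank energies along the
    frequency (band) axis with a short FIR filter; (1, 0, -1)
    is a simple slope filter across adjacent bands (assumed)."""
    return np.apply_along_axis(
        lambda band_row: np.convolve(band_row, taps, mode="same"),
        axis=1, arr=log_energies)

def time_filter(features, span=2):
    """Regression-based delta (time filtering) commonly used to
    derive dynamic parameters from a feature time sequence."""
    num = sum(k * (np.roll(features, -k, axis=0) - np.roll(features, k, axis=0))
              for k in range(1, span + 1))
    den = 2 * sum(k * k for k in range(1, span + 1))
    return num / den

# Hypothetical (n_frames, n_bands) matrix of log mel energies
E = np.log(np.random.default_rng(0).random((100, 24)) + 1e-6)
F = frequency_filter(E)   # filtered along the frequency dimension
D = time_filter(F)        # dynamic parameters from the filtered bands
```

Combining both filters on the same energy sequence mirrors the joint time-and-frequency (tiffing) view described above.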


CLEAR | 2006

CLEAR evaluation of acoustic event detection and classification systems

Andrey Temko; Robert G. Malkin; Christian Zieger; Dusan Macho; Climent Nadeu; Maurizio Omologo

In this paper, we present the results of the Acoustic Event Detection (AED) and Classification (AEC) evaluations carried out in February 2006 by the three participant partners from the CHIL project. The primary evaluation task was AED on the testing portions of the isolated sound databases and seminar recordings produced in CHIL. Additionally, a secondary AEC evaluation task was designed using only the isolated sound databases. The set of meeting-room acoustic event classes and the metrics were agreed upon by the three partners, and ELDA was in charge of the scoring task. Finally, the various systems for the AED and AEC tasks and their results are presented.


Pattern Recognition | 2006

Classification of acoustic events using SVM-based clustering schemes

Andrey Temko; Climent Nadeu

Acoustic events produced in controlled environments may carry information useful for perceptually aware interfaces. In this paper we focus on the problem of classifying 16 types of meeting-room acoustic events. First of all, we have defined the events and gathered a sound database. Then, several classifiers based on support vector machines (SVM) are developed using confusion-matrix-based clustering schemes to deal with the multi-class problem. Also, several sets of acoustic features are defined and used in the classification tests. In the experiments, the developed SVM-based classifiers are compared with an already reported binary tree scheme and with the corresponding Gaussian mixture model (GMM) classifiers. The best results are obtained with a tree SVM-based classifier that may use a different feature set at each node. With it, a 31.5% relative average error reduction is obtained with respect to the best result from a conventional binary tree scheme.
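As a minimal, hypothetical sketch of the SVM classification step (using scikit-learn rather than the authors' system, synthetic vectors in place of the paper's acoustic feature sets, and only three of the 16 classes):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-ins for feature vectors of three acoustic event classes,
# separated by shifting each class mean (purely illustrative data)
X = rng.normal(size=(60, 8)) + np.repeat(np.arange(3), 20)[:, None] * 2.0
y = np.repeat(np.arange(3), 20)

# SVC handles the multi-class problem with a one-vs-one scheme by default;
# the paper instead builds tree- and clustering-based multi-class schemes.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
```

The pipeline scales features before the RBF kernel, a standard precaution when feature sets mix heterogeneous acoustic measurements.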


IEEE Transactions on Speech and Audio Processing | 1997

Linear prediction of the one-sided autocorrelation sequence for noisy speech recognition

Javier Hernando; Climent Nadeu

The article presents a robust representation of speech based on AR modeling of the causal part of the autocorrelation sequence. In noisy speech recognition, this new representation achieves better results than several other related techniques.
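A rough numpy sketch of the idea, assuming a biased autocorrelation estimate and a plain Yule-Walker solve in place of the article's exact procedure; the frame and model order are invented for illustration.

```python
import numpy as np

def osalp_coefficients(frame, order=10):
    """AR-model the causal (one-sided) part of the frame's
    autocorrelation sequence rather than the frame itself."""
    n = len(frame)
    # Biased autocorrelation estimate; keep only the causal part
    r = np.correlate(frame, frame, mode="full")[n - 1:] / n
    # Autocorrelation of that one-sided autocorrelation sequence
    m = len(r)
    s = np.correlate(r, r, mode="full")[m - 1:] / m
    # Yule-Walker equations for the linear prediction coefficients
    R = np.array([[s[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, s[1:order + 1])

# Hypothetical noisy voiced-speech-like frame
t = np.arange(400)
frame = np.sin(2 * np.pi * 0.05 * t) + 0.3 * np.random.default_rng(0).normal(size=400)
a = osalp_coefficients(frame)
```

Because additive noise mainly perturbs the zero-lag term, modeling the causal autocorrelation tends to be less noise-sensitive than modeling the waveform directly, which is the intuition behind the robustness claim above.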


Pattern Recognition Letters | 2009

Acoustic event detection in meeting-room environments

Andrey Temko; Climent Nadeu

Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in the signals that are captured by one or several microphones. The AED problem has been recently proposed for meeting-room or class-room environments, where a specific set of meaningful sounds has been defined, and several evaluations have been carried out (within the international CLEAR evaluation campaigns). This paper reports some work in AED done by the authors in that framework, and particularly presents the extension to the difficult problem of detecting overlapped sounds. Actually, temporal overlaps accounted for more than 70% of errors in the real-world interactive seminar recordings used in the CLEAR 2007 evaluations. An attempt to deal with that problem at the level of models using our SVM-based AED system is reported in the paper. The proposed two-step system noticeably outperforms the baseline system for both an artificially generated database and a real seminar recording database. The databases and metrics developed for the CLEAR 2007 evaluations are also described. Finally, a real-time AED system implemented in UPC's smart-room using several microphones is reported, along with a GUI-based demo that also includes the output of an acoustic source localization system.


Pattern Recognition | 2008

Fuzzy integral based information fusion for classification of highly confusable non-speech sounds

Andrey Temko; Dusan Macho; Climent Nadeu

Acoustic event classification may help to describe acoustic scenes and contribute to improve the robustness of speech technologies. In this work, fusion of different information sources with the fuzzy integral (FI), and the associated fuzzy measure (FM), are applied to the problem of classifying a small set of highly confusable human non-speech sounds. As FI is a meaningful formalism for combining classifier outputs that can capture interactions among the various sources of information, it shows in our experiments a significantly better performance than that of any single classifier entering the FI fusion module. Actually, that FI decision-level fusion approach shows comparable results to the high-performing SVM feature-level fusion and thus it seems to be a good choice when feature-level fusion is not an option. We have also observed that the importance and the degree of interaction among the various feature types given by the FM can be used for feature selection, and gives a valuable insight into the problem.
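A toy sketch of decision-level fusion with the (Choquet) fuzzy integral, assuming just two classifiers and a hand-picked fuzzy measure; in practice the measure is learned from data rather than set by hand.

```python
def choquet_integral(scores, mu):
    """Fuse per-class classifier scores with a fuzzy measure mu,
    which maps frozensets of classifier indices to [0, 1]
    (mu(empty set) = 0 and mu(all classifiers) = 1)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fused, prev = 0.0, 0.0
    for k, i in enumerate(order):
        coalition = frozenset(order[k:])  # sources scoring at least scores[i]
        fused += (scores[i] - prev) * mu[coalition]
        prev = scores[i]
    return fused

# Hand-picked measure: each classifier alone is worth 0.4, together 1.0,
# so the fusion rewards agreement between the two sources (assumed values)
mu = {frozenset({0}): 0.4, frozenset({1}): 0.4, frozenset({0, 1}): 1.0}
fused = choquet_integral([0.8, 0.2], mu)
```

Because the measure is defined on coalitions rather than on individual sources, it can express the interactions among information sources that the abstract highlights, something a simple weighted mean cannot.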


Speech Communication | 1997

Filtering the time sequences of spectral parameters for speech recognition

Climent Nadeu; Pau Pachès-Leal; Biing-Hwang Juang

Abstract In automatic speech recognition, the signal is usually represented by a set of time sequences of spectral parameters (TSSPs) that model the temporal evolution of the spectral envelope frame-to-frame. Those sequences are then filtered either to make them more robust to environmental conditions or to compute differential parameters (dynamic features) which enhance discrimination. In this paper, we apply frequency analysis to TSSPs in order to provide an interpretation framework for the various types of parameter filters used so far. Thus, the analysis of the average long-term spectrum of the successfully filtered sequences reveals a combined effect of equalization and band selection that provides insights into TSSP filtering. Also, we show in the paper that, when supplementary differential parameters are not used, the recognition rate can be improved even for clean speech, just by properly filtering the TSSPs. To support this claim, a number of experimental results are presented, both using whole-word and subword based models. The empirically optimum filters attenuate the low-pass band and emphasize a higher band so that the peak of the average long-term spectrum of the output of these filters lies at around the average syllable rate of the employed database (≈3 Hz).


International Conference of the IEEE Engineering in Medicine and Biology Society | 2011

EEG Signal Description with Spectral-Envelope-Based Speech Recognition Features for Detection of Neonatal Seizures

Andrey Temko; Climent Nadeu; William P. Marnane; Geraldine B. Boylan; Gordon Lightbody

In this paper, features which are usually employed in automatic speech recognition (ASR) are used for the detection of seizures in newborn EEG. In particular, spectral envelope-based features, composed of spectral powers and their spectral derivatives are compared to the established feature set which has been previously developed for EEG analysis. The results indicate that the ASR features which model the spectral derivatives, either full-band or localized in frequency, yielded a performance improvement, in comparison to spectral-power-based features. Indeed it is shown here that they perform reasonably well in comparison with the conventional EEG feature set. The contribution of the ASR features was analyzed here using the support vector machines (SVM) recursive feature elimination technique. It is shown that the spectral derivative features consistently appear among the top-rank features. The study shows that the ASR features should be given a high priority when dealing with the description of the EEG signal.
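The recursive feature elimination step can be illustrated with scikit-learn's RFE on synthetic data; the data, the informative feature indices, and the number of features to keep are all invented for this sketch.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
# Only features 3 and 7 carry class information in this toy setup
y = (X[:, 3] + 0.7 * X[:, 7] > 0).astype(int)

# RFE repeatedly fits a linear SVM and discards the lowest-weight
# features, yielding a ranking analogous to the one used in the paper
# to rank spectral-derivative features against conventional EEG features.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=2).fit(X, y)
selected = np.flatnonzero(rfe.support_)
```

The surviving feature indices are the "top-rank" features in this toy setting, mirroring how the spectral derivative features consistently ranked highly in the study.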


International Conference on Acoustics, Speech, and Signal Processing | 2006

On Real-Time Mean-and-Variance Normalization of Speech Recognition Features

Pere Pujol; Dusan Macho; Climent Nadeu

This work aims at gaining insight into the mean and variance normalization (MVN) technique, which is commonly used to increase the robustness of speech recognition features. Several versions of MVN are empirically investigated, and the factors affecting their performance are considered. The reported experimental work with real-world speech data (Speecon) particularly focuses on the recursive updating of the MVN parameters, paying attention to the algorithmic delay involved. First, we propose decoupling the look-ahead factor (which determines the delay) from the initial estimation of the mean and variance, and show that the latter is a key factor for recognition performance. Then, several kinds of initial estimation that make sense in different application environments are tested, and their performance is compared.
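A minimal sketch of recursive (on-line) mean-and-variance normalization; the forgetting factor, the seed length, and the zero look-ahead are assumptions for illustration, not the parameters studied in the paper.

```python
import numpy as np

def recursive_mvn(frames, alpha=0.995, seed_frames=50):
    """Normalize a (n_frames, n_dims) feature matrix on-line:
    seed the mean/variance on an initial chunk (the initial
    estimation the abstract emphasizes), then update both
    recursively as each new frame arrives."""
    mean = frames[:seed_frames].mean(axis=0)
    var = frames[:seed_frames].var(axis=0) + 1e-8
    out = np.empty_like(frames, dtype=float)
    for t, x in enumerate(frames):
        mean = alpha * mean + (1 - alpha) * x         # recursive mean update
        var = alpha * var + (1 - alpha) * (x - mean) ** 2
        out[t] = (x - mean) / np.sqrt(var)
    return out

# Hypothetical cepstral stream with a constant channel offset of 5
frames = np.random.default_rng(0).normal(loc=5.0, size=(2000, 13))
norm = recursive_mvn(frames)
```

With no look-ahead, the quality of the seed estimate dominates early-utterance behavior, which is consistent with the abstract's finding that the initial estimation is a key factor.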


International Conference on Acoustics, Speech, and Signal Processing | 2007

Enhanced SVM Training for Robust Speech Activity Detection

Andrey Temko; Dusan Macho; Climent Nadeu

Speech activity detection (SAD) is a key objective in speech-related technologies. In this work, an enhanced version of the training stage of a SAD system based on a support vector machine (SVM) classifier is presented, and its performance is tested with the RT05 and RT06 evaluation tasks. A fast algorithm of data reduction based on proximal SVM has been developed and, furthermore, the specific characteristics of the metric used in the NIST SAD evaluation have been taken into account during training. Tested with the RT06 data, the resulting SVM SAD system has shown better scores than the best GMM-based system developed by the authors and submitted to the past RT06 evaluation.

Collaboration

Dive into Climent Nadeu's collaborations.

Top Co-Authors

Javier Hernando (Polytechnic University of Catalonia)
Andrey Temko (University College Cork)
Dusan Macho (Polytechnic University of Catalonia)
Taras Butko (Polytechnic University of Catalonia)
Carlos Segura (Polytechnic University of Catalonia)
José B. Mariño (Polytechnic University of Catalonia)
Jaume Padrell (Polytechnic University of Catalonia)
Cristian Canton-Ferrer (Polytechnic University of Catalonia)