Publication


Featured research published by Pedro Gómez-Vilda.


IEEE Transactions on Biomedical Engineering | 2006

Dimensionality Reduction of a Pathological Voice Quality Assessment System Based on Gaussian Mixture Models and Short-Term Cepstral Parameters

Juan Ignacio Godino-Llorente; Pedro Gómez-Vilda; Manuel Blanco-Velasco

Voice diseases have been increasing dramatically in recent times due mainly to unhealthy social habits and voice abuse. These diseases must be diagnosed and treated at an early stage, especially in the case of larynx cancer. It is widely recognized that vocal and voice diseases do not necessarily cause changes in voice quality as perceived by a listener. Acoustic analysis could be a useful tool to diagnose this type of disease. Preliminary research has shown that the detection of voice alterations can be carried out by means of Gaussian mixture models and short-term mel cepstral parameters complemented by frame energy together with first and second derivatives. This paper, using the F-Ratio and Fisher's discriminant ratio, will demonstrate that the detection of voice impairments can be performed using both mel cepstral vectors and their first derivative, ignoring the second derivative.
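The feature-ranking idea in this abstract can be illustrated with a small sketch that scores each feature dimension by a two-class Fisher discriminant ratio. The formula used here, squared mean difference divided by the sum of class variances, is one common definition and is an assumption; the abstract does not give the paper's exact definition. The function name `fisher_ratio` and the toy data are hypothetical.

```python
from statistics import mean, pvariance

def fisher_ratio(normal, pathological):
    """Per-dimension two-class Fisher discriminant ratio (assumed definition).

    Each argument is a list of equal-length feature vectors.
    Returns one score per dimension: (m1 - m2)^2 / (v1 + v2).
    """
    dims = len(normal[0])
    scores = []
    for d in range(dims):
        a = [v[d] for v in normal]
        b = [v[d] for v in pathological]
        spread = pvariance(a) + pvariance(b)
        gap = (mean(a) - mean(b)) ** 2
        # A dimension with zero spread and a nonzero gap separates perfectly.
        scores.append(gap / spread if spread > 0 else float("inf"))
    return scores

# Toy data (made up): dimension 0 separates the classes, dimension 1 does not.
normal = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0]]
pathological = [[3.0, 5.1], [3.2, 4.1], [2.8, 5.9]]
scores = fisher_ratio(normal, pathological)

# Dimensions sorted from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
```

Dimensions with low scores are candidates for removal, which is the sense in which such a ratio could support dropping a block of parameters such as the second derivatives.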


IEEE Transactions on Biomedical Engineering | 2004

Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors

Juan Ignacio Godino-Llorente; Pedro Gómez-Vilda

It is well known that vocal and voice diseases do not necessarily cause perceptible changes in the acoustic voice signal. Acoustic analysis is a useful tool to diagnose voice diseases, complementing other methods based on direct observation of the vocal folds by laryngoscopy. This paper studies two neural-network-based classification approaches applied to the automatic detection of voice disorders. The structures studied are the multilayer perceptron and learning vector quantization, fed with short-term vectors calculated according to the well-known Mel Frequency Cepstral Coefficient parameterization. The paper shows that these architectures allow the detection of voice disorders, including glottic cancer, under highly reliable conditions. Within this context, learning vector quantization proved more reliable than the multilayer perceptron architecture, yielding 96% frame accuracy under similar working conditions.


Biomedical Signal Processing and Control | 2006

Methodological issues in the development of automatic systems for voice pathology detection

Nicolás Sáenz-Lechón; Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz; Pedro Gómez-Vilda

This paper describes some methodological concerns to be considered when designing systems for automatic detection of voice pathology, in order to enable comparisons to be made with previous or future experiments. The proposed methodology is built around the Massachusetts Eye & Ear Infirmary (MEEI) Voice Disorders Database, which to the present date is the only commercially available one. Discussion about key points on this database is included. Any experiment should have a cross-validation strategy, and results should supply, along with the final confusion matrix, confidence intervals for all measures. Detector performance curves such as detector error trade off (DET) and receiver operating characteristic (ROC) plots are also considered. An example of the methodology is provided, with an experiment based on short-term parameters and multi-layer perceptrons.
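As a minimal sketch of reporting confidence intervals alongside a confusion matrix, the example below derives accuracy, sensitivity and specificity from four counts and attaches a 95% interval to each. The normal-approximation (Wald) interval is an assumption made for illustration; the abstract does not say which interval the paper uses. The function names and the counts are hypothetical.

```python
from math import sqrt

def wald_interval(successes, total, z=1.96):
    """Proportion with a normal-approximation interval (z=1.96 for ~95%)."""
    p = successes / total
    half = z * sqrt(p * (1.0 - p) / total)
    # Clip to [0, 1] because the approximation can overshoot near the ends.
    return p, max(0.0, p - half), min(1.0, p + half)

def detector_measures(tp, fn, fp, tn):
    """Measures for a two-class detector, where 'positive' means pathological."""
    return {
        "accuracy": wald_interval(tp + tn, tp + fn + fp + tn),
        "sensitivity": wald_interval(tp, tp + fn),
        "specificity": wald_interval(tn, tn + fp),
    }

# Made-up confusion matrix counts, for illustration only.
measures = detector_measures(tp=90, fn=10, fp=20, tn=80)
for name, (value, low, high) in measures.items():
    print(f"{name}: {value:.3f} (95% CI {low:.3f} to {high:.3f})")
```

In a cross-validated experiment the counts would typically be pooled over folds before computing the intervals; that pooling step is left out of this sketch.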


Pattern Recognition | 2007

An improved watershed algorithm based on efficient computation of shortest paths

Víctor Osma-Ruiz; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Pedro Gómez-Vilda

The present paper describes a new algorithm to calculate the watershed transform through rain simulation of greyscale digital images by means of pixel arrowing. The efficiency of this method rests on keeping to a minimum both the neighbouring operations needed to compute the transform and the total number of scans performed over the whole image. The experiments demonstrate that the proposed algorithm is able to significantly reduce the running time of the fastest known algorithm without involving any loss of efficiency.


Non-Linear Speech Processing | 2009

Glottal Source biometrical signature for voice pathology detection

Pedro Gómez-Vilda; Roberto Fernández-Baíllo; Victoria Rodellar-Biarge; Victor Nieto Lluis; Agustín Álvarez-Marquina; Luis Miguel Mazaira-Fernández; Rafael Martínez-Olalla; Juan Ignacio Godino-Llorente

The Glottal Source is an important component of voice as it can be considered as the excitation signal to the voice apparatus. The use of the Glottal Source for pathology detection or the biometric characterization of the speaker are important objectives in the acoustic study of the voice nowadays. In the present work a biometric signature based on the speaker's power spectral density of the Glottal Source is presented. It may be shown that this spectral density is related to the vocal fold cover biomechanics, and it is well known from the literature that certain speaker features such as gender, age or pathologic condition leave changes in it. The paper describes the methodology to estimate the biometric signature from the power spectral density of the mucosal wave correlate, which after normalization can be used in pathology detection experiments. Linear Discriminant Analysis is used to compare the detection capability of the parameters defined on this glottal signature among themselves and against classical perturbation parameters. A database of 100 normal and 100 pathologic subjects equally balanced in gender and age is used to derive the best parameter cocktails for pathology detection and quantification purposes to validate this methodology in voice evaluation tests. In a study case presented to illustrate the detection capability of the methodology, a control subset of 24+24 subjects is used to determine a subject's voice condition in a pre- and post-surgical evaluation. Possible applications of the study can be found in pathology detection and grading and in rehabilitation assessment after treatment.


Journal of Voice | 2010

The effectiveness of the glottal to noise excitation ratio for the screening of voice disorders.

Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz; Nicolás Sáenz-Lechón; Pedro Gómez-Vilda; Manuel Blanco-Velasco; Fernando Cruz-Roldán

This paper evaluates the capabilities of the Glottal to Noise Excitation Ratio for the screening of voice disorders. A lot of effort has been made using this parameter to evaluate voice quality, but there do not exist any studies that evaluate the discrimination capabilities of this acoustic parameter to classify between normal and pathological voices, and neither are there any previous studies that reflect the normative values that could be used for screening purposes. A set of 226 speakers (53 normal and 173 pathological) taken from a voice disorders database were used to evaluate the usefulness of this parameter for discriminating normal and pathological voices. To evaluate this parameter, the effect of the bandwidth of the Hilbert envelopes and the frequency shift have been analyzed, concluding that a good discrimination is obtained with a bandwidth of 1000 Hz and a frequency shift of 300 Hz. The results confirm that the Glottal to Noise Excitation Ratio provides reliable measurements in terms of discrimination among normal and pathological voices, comparable to other classical long-term noise measurements found in the literature, such as Normalized Noise Energy or Harmonics to Noise Ratio, so this parameter can be considered a good choice for screening purposes.


Biomedical Signal Processing and Control | 2009

Automatic detection of voice impairments from text-dependent running speech

Juan Ignacio Godino-Llorente; Rubén Fraile; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Pedro Gómez-Vilda

Acoustic analysis is a useful tool to diagnose voice diseases. Furthermore, it presents several advantages: it is non-invasive, provides an objective diagnostic and can also be used for the evaluation of surgical and pharmacological treatments and rehabilitation processes. Most of the approaches found in the literature address the automatic detection of voice impairments from speech by using the sustained phonation of vowels. This paper proposes a new scheme for the detection of voice impairments from text-dependent running speech. The proposed methodology is based on the segmentation of speech into voiced and non-voiced frames, parameterising each voiced frame with mel-frequency cepstral parameters. The classification is carried out using a discriminative approach based on a multilayer perceptron neural network. The data used to train the system were taken from the voice disorders database distributed by Kay Elemetrics. The material used for training and testing contains the running speech corresponding to the well-known “rainbow passage” of 140 patients (23 normal and 117 pathological). The results obtained are compared with those using sustained vowels. The text-dependent running speech showed a slight improvement in the accuracy of the detection.


Neurocomputing | 2015

Robust and complex approach of pathological speech signal analysis

Jiri Mekyska; Eva Janoušová; Pedro Gómez-Vilda; Zdenek Smekal; Irena Rektorová; Ilona Eliasova; Milena Kostalova; Martina Mrackova; Jesús B. Alonso-Hernández; Marcos Faundez-Zanuy; Karmele López-de-Ipiña

This paper presents a study of the state-of-the-art approaches in the field of pathological speech signal analysis, with a special focus on parametrization techniques. It provides a description of 92 speech features, some of which are already widely used in this field of science and some of which have not been tried yet (they come from different areas of speech signal processing like speech recognition or coding). As an original contribution, this work introduces 36 completely new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy and empirical mode decomposition. The significance of these features was tested on 3 (English, Spanish and Czech) pathological voice databases with respect to classification accuracy, sensitivity and specificity. To the best of our knowledge, the introduced approach based on complex feature extraction and robust testing outperformed all works that have been published already in this field. The results (accuracy, sensitivity and specificity equal to 100.0 ± 0.0 %) are debatable in the case of the Massachusetts Eye and Ear Infirmary (MEEI) database because of its limitation related to the length of sustained vowels. However, in the case of the Principe de Asturias (PdA) Hospital in Alcala de Henares of Madrid database, we made improvements in classification accuracy (82.1 ± 3.3 %) and specificity (83.8 ± 5.1 %) when considering a single-classifier approach. Hopefully, large improvements may be achieved in the case of the Czech Parkinsonian Speech Database (PARCZ), which are discussed in this work as well. All the features introduced in this work were identified by the Mann-Whitney U test as significant (p < 0.05) when processing at least one of the mentioned databases. The largest discriminative power among these proposed features belongs to the cepstral peak prominence extracted from the first intrinsic mode function (p = 6.9443 × 10^-32), which means that among all newly designed features those that quantify especially hoarseness or breathiness are good candidates for pathological speech identification. The paper also mentions some ideas for future work in the field of pathological speech signal analysis that can be valuable especially from the clinical point of view.
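To make the significance screening mentioned in this abstract more concrete, the sketch below computes the Mann-Whitney U statistic for one feature measured in two groups, using average ranks for ties. It stops at the statistic and does not compute a p-value; the function name `mann_whitney_u` and the sample values are hypothetical and do not come from the paper.

```python
def mann_whitney_u(x, y):
    """U statistic of sample x against sample y (average ranks for ties)."""
    # Sort pooled values while remembering each value's original position.
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2.0 + 1.0  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = avg_rank
        start = end + 1
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2.0

# Made-up feature values for two groups of speakers.
healthy = [1.0, 2.0, 3.0]
pathological = [4.0, 5.0, 6.0]
u_healthy = mann_whitney_u(healthy, pathological)
u_pathological = mann_whitney_u(pathological, healthy)
```

The two statistics always sum to the product of the sample sizes; a value near zero for one group indicates that its values sit almost entirely below the other group's, which is what a small p-value would reflect.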


Non-Linear Speech Processing | 2005

Support vector machines applied to the detection of voice disorders

Juan Ignacio Godino-Llorente; Pedro Gómez-Vilda; Nicolás Sáenz-Lechón; Manuel Blanco-Velasco; Fernando Cruz-Roldán; Miguel Angel Ferrer-Ballester

Support Vector Machines (SVMs) have become a popular tool for discriminative classification. An exciting area of recent application of SVMs is in speech processing. In this paper discriminatively trained SVMs have been introduced as a novel approach for the automatic detection of voice impairments. SVMs have a distinctly different modelling strategy in the detection of voice impairments problem, compared to other methods found in the literature (such as Gaussian Mixture or Hidden Markov Models): the SVM models the boundary between the classes instead of modelling the probability density of each class. In this paper it is shown that the proposed scheme, fed with short-term cepstral and noise parameters, can be applied to the detection of voice impairments with good performance.


Cognitive Computation | 2013

Characterizing Neurological Disease from Voice Quality Biomechanical Analysis

Pedro Gómez-Vilda; Victoria Rodellar-Biarge; Víctor Nieto-Lluis; Cristina Muñoz-Mulas; Luis Miguel Mazaira-Fernández; Rafael Martínez-Olalla; Agustín Álvarez-Marquina; Carlos Ramírez-Calvo; Mario Fernández-Fernández

The dramatic impact of neurological degenerative pathologies on quality of life is a growing concern nowadays. Many techniques have been designed for the detection, diagnosis, and monitoring of neurological disease. Most of them are too expensive or complex to be used by primary care medical services. On the other hand, it is well known that many neurological diseases leave a signature in voice and speech. In the present paper, a new method to trace some neurological diseases at the level of phonation will be shown. In this way, the detection and grading of the neurological disease could be based on a simple voice test. This methodology benefits from the advances achieved in recent years in detecting and grading organic pathologies in phonation. The paper hypothesizes that some of the underlying neurological mechanisms affecting phonation produce observable correlates in vocal fold biomechanics and that these correlates behave differently in neurological diseases than in organic pathologies. A general description of the main hypotheses involved and their validation by acoustic voice analysis based on biomechanical correlates of the neurological disease is given. The validation is carried out on a balanced database of normal and organic dysphonic patients of both genders. Selected study cases will be presented to illustrate the possibilities offered by this methodology.

Collaboration


Dive into Pedro Gómez-Vilda's collaborations.

Top Co-Authors

Rafael Martínez-Olalla

Technical University of Madrid

Víctor Nieto-Lluis

Technical University of Madrid

Cristina Muñoz-Mulas

Technical University of Madrid

Daniel Palacios-Alonso

Technical University of Madrid

Nicolás Sáenz-Lechón

Technical University of Madrid
