Nicolás Sáenz-Lechón

Publication

Featured research published by Nicolás Sáenz-Lechón.

Biomedical Signal Processing and Control | 2006

Methodological issues in the development of automatic systems for voice pathology detection

Nicolás Sáenz-Lechón; Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz; Pedro Gómez-Vilda

This paper describes some methodological concerns to be considered when designing systems for automatic detection of voice pathology, in order to enable comparisons to be made with previous or future experiments. The proposed methodology is built around the Massachusetts Eye & Ear Infirmary (MEEI) Voice Disorders Database, which to date is the only commercially available one. Key points about this database are discussed. Any experiment should have a cross-validation strategy, and results should supply, along with the final confusion matrix, confidence intervals for all measures. Detector performance curves such as detection error trade-off (DET) and receiver operating characteristic (ROC) plots are also considered. An example of the methodology is provided, with an experiment based on short-term parameters and multi-layer perceptrons.
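The reporting recommended in this abstract (a confusion matrix accompanied by confidence intervals, plus ROC points) can be illustrated with a minimal sketch. The abstract does not say which interval estimator the authors use; the normal approximation to the binomial below is an assumption, and the function names `rate_with_ci`, `summarize` and `roc_points` are hypothetical.

```python
import math


def rate_with_ci(successes, total, z=1.96):
    """Return (rate, lower, upper) for a proportion.

    Assumption: normal approximation to the binomial, z=1.96 for ~95%.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def summarize(tp, fn, fp, tn):
    """Derive accuracy, sensitivity and specificity from a 2x2 confusion matrix."""
    return {
        "accuracy": rate_with_ci(tp + tn, tp + fn + fp + tn),
        "sensitivity": rate_with_ci(tp, tp + fn),
        "specificity": rate_with_ci(tn, tn + fp),
    }


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    labels: 1 = pathological, 0 = normal; both classes must be present.
    Tied scores are treated as separate thresholds in this sketch.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Lower the decision threshold one sample at a time, highest score first.
    for _, label in sorted(zip(scores, labels), key=lambda pair: -pair[0]):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points
```

In a cross-validated experiment, the confusion matrix passed to `summarize` would typically be accumulated over all test folds.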

Pattern Recognition | 2007

An improved watershed algorithm based on efficient computation of shortest paths

Víctor Osma-Ruiz; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Pedro Gómez-Vilda

The present paper describes a new algorithm to calculate the watershed transform of greyscale digital images through rain simulation by means of pixel arrowing. The efficiency of this method comes from keeping to a minimum both the neighbouring operations needed to compute the transform and the total number of scans performed over the whole image. The experiments demonstrate that the proposed algorithm significantly reduces the running time of the fastest known algorithm without involving any loss of efficiency.
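The general idea of rain simulation with pixel arrowing can be sketched as follows. This is a naive illustration, not the algorithm proposed in the paper: it assumes 4-connectivity, does not treat plateaus specially (a flat pixel with no strictly lower neighbour becomes its own basin), and makes no attempt at the efficiency the abstract describes. The function name `watershed_by_arrowing` is hypothetical.

```python
def watershed_by_arrowing(image):
    """Label catchment basins of a 2D list of grey levels.

    Each pixel gets an arrow to its lowest strictly lower 4-neighbour.
    A drop of rain follows the arrows until it reaches a pixel without one.
    """
    rows, cols = len(image), len(image[0])

    # Step 1: arrowing. arrow[r][c] is the target pixel, or None at a minimum.
    arrow = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            lowest = image[r][c]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc] < lowest:
                    lowest = image[nr][nc]
                    arrow[r][c] = (nr, nc)

    # Step 2: follow the arrows, reusing labels of already visited pixels.
    labels = [[0] * cols for _ in range(rows)]
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            path = []
            cur = (r, c)
            while labels[cur[0]][cur[1]] == 0 and arrow[cur[0]][cur[1]] is not None:
                path.append(cur)
                cur = arrow[cur[0]][cur[1]]
            if labels[cur[0]][cur[1]] == 0:
                # Reached an unlabelled minimum: open a new basin.
                labels[cur[0]][cur[1]] = next_label
                next_label += 1
            basin = labels[cur[0]][cur[1]]
            for pr, pc in path:
                labels[pr][pc] = basin
    return labels
```

Because every arrow points to a strictly lower pixel, the paths cannot form cycles, so each drop is guaranteed to stop.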

IEEE Transactions on Biomedical Engineering | 2011

Automatic Detection of Pathological Voices Using Complexity Measures, Noise Parameters, and Mel-Cepstral Coefficients

Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Germán Castellanos-Domínguez

This paper proposes a new approach to increase the amount of information extracted from speech, with the aim of improving the accuracy of a system developed for the automatic detection of pathological voices. The paper addresses the discrimination capabilities of 11 features extracted using nonlinear analysis of time series. Two of these features are based on conventional nonlinear statistics (largest Lyapunov exponent and correlation dimension), two are based on recurrence and fractal-scaling analysis, and the remaining ones are based on different estimations of the entropy. Moreover, this paper uses a strategy based on combining classifiers for fusing the nonlinear analysis with the information provided by classic parameterization approaches found in the literature (noise parameters and mel-frequency cepstral coefficients). The classification was carried out in two steps using, first, a generative and, later, a discriminative approach. Combining both classifiers, the best accuracy obtained is 98.23% ± 0.001.

European Archives of Oto-rhino-laryngology | 2008

Acoustic analysis of voice using WPCVox: a comparative study with Multi Dimensional Voice Program

Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz; Nicolás Sáenz-Lechón; Ignacio Cobeta-Marco; Ramón González-Herranz; Carlos Ramírez-Calvo

In this study, two different tools developed for the parametric extraction and acoustic analysis of voice samples are compared. The main goal of the paper is to contrast the results obtained using the classical Multi Dimensional Voice Program (MDVP) with the results obtained with the novel WPCVox. The aim of this comparison was to find differences and similarities in the parameters extracted with both systems, in order to allow measurements to be compared and data to be transferred between the two systems. The study was carried out in two stages. In the first, a wide sample of healthy voices belonging to Spanish-speaking adults of both genders was used to carry out a direct comparison between the results given by MDVP and those obtained with WPCVox. In the second stage, a sample of 226 speakers (53 normal and 173 pathological) taken from a commercially available database of voice disorders was used to demonstrate the usefulness of WPCVox for the acoustic analysis and the characterization of normal and pathological voices. The results show that WPCVox provides very reliable measurements, which are very similar to those obtained using MDVP, and very similar capabilities to discriminate between normal and pathological voices.

Pattern Recognition | 2010

An improved method for voice pathology detection by means of a HMM-based feature space transformation

Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Germán Castellanos-Domínguez

This paper presents a new feature transformation technique applied to improve the screening accuracy for the automatic detection of pathological voices. The statistical transformation is based on Hidden Markov Models, obtaining a transformation and classification stage simultaneously and adjusting the parameters of the model with a criterion that minimizes the classification error. The original feature vectors are built up using classic short-term noise parameters and mel-frequency cepstral coefficients. With respect to conventional approaches found in the literature of automatic detection of pathological voices, the proposed feature space transformation technique demonstrates a significant improvement of the performance with no addition of new features to the original input space. In view of the results, it is expected that this technique could provide good results in other areas such as speaker verification and/or identification.

Journal of Voice | 2010

The effectiveness of the glottal to noise excitation ratio for the screening of voice disorders

Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz; Nicolás Sáenz-Lechón; Pedro Gómez-Vilda; Manuel Blanco-Velasco; Fernando Cruz-Roldán

This paper evaluates the capabilities of the Glottal to Noise Excitation Ratio for the screening of voice disorders. Much effort has been devoted to using this parameter to evaluate voice quality, but no studies have evaluated its capability to discriminate between normal and pathological voices, nor are there any previous studies that report normative values that could be used for screening purposes. A set of 226 speakers (53 normal and 173 pathological) taken from a voice disorders database was used to evaluate the usefulness of this parameter for discriminating normal and pathological voices. To evaluate this parameter, the effect of the bandwidth of the Hilbert envelopes and the frequency shift have been analyzed, concluding that good discrimination is obtained with a bandwidth of 1000 Hz and a frequency shift of 300 Hz. The results confirm that the Glottal to Noise Excitation Ratio provides reliable measurements in terms of discrimination between normal and pathological voices, comparable to other classical long-term noise measurements found in the literature, such as Normalized Noise Energy or Harmonics to Noise Ratio, so this parameter can be considered a good choice for screening purposes.

Biomedical Signal Processing and Control | 2009

Automatic detection of voice impairments from text-dependent running speech

Juan Ignacio Godino-Llorente; Rubén Fraile; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Pedro Gómez-Vilda

Acoustic analysis is a useful tool to diagnose voice diseases. Furthermore, it presents several advantages: it is non-invasive, it provides an objective diagnosis and it can also be used for the evaluation of surgical and pharmacological treatments and rehabilitation processes. Most of the approaches found in the literature address the automatic detection of voice impairments from speech by using the sustained phonation of vowels. This paper proposes a new scheme for the detection of voice impairments from text-dependent running speech. The proposed methodology is based on the segmentation of speech into voiced and non-voiced frames, parameterising each voiced frame with mel-frequency cepstral parameters. The classification is carried out using a discriminative approach based on a multilayer perceptron neural network. The data used to train the system were taken from the voice disorders database distributed by Kay Elemetrics. The material used for training and testing contains the running speech corresponding to the well-known “rainbow passage” of 140 patients (23 normal and 117 pathological). The results obtained are compared with those using sustained vowels. The text-dependent running speech showed a slight improvement in the accuracy of the detection.
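The first step of the scheme, separating voiced from non-voiced frames, can be sketched as below. The abstract does not state which voicing criterion the authors use; the short-time energy and zero-crossing-rate test, the frame sizes and the threshold values here are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def is_voiced(frame, energy_thresh, zcr_thresh):
    """Assumed criterion: voiced frames have high energy and few zero crossings."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(frame) - 1)
    return energy > energy_thresh and zcr < zcr_thresh


def voiced_frames(samples, frame_len=400, hop=200,
                  energy_thresh=0.01, zcr_thresh=0.25):
    """Return only the frames judged voiced; thresholds are illustrative."""
    frames = frame_signal(samples, frame_len, hop)
    return [f for f in frames if is_voiced(f, energy_thresh, zcr_thresh)]
```

In the pipeline described above, each frame returned by `voiced_frames` would then be parameterised with mel-frequency cepstral coefficients before being passed to the classifier.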

Folia Phoniatrica Et Logopaedica | 2009

Automatic detection of laryngeal pathologies in records of sustained vowels by means of mel-frequency cepstral coefficient parameters and differentiation of patients by sex

Rubén Fraile; Nicolás Sáenz-Lechón; Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz; C. Fredouille

Mel-frequency cepstral coefficients (MFCC) have traditionally been used in speaker identification applications. Their use has been extended to speech quality assessment for clinical applications during the last few years. While the significance of such parameters for such an application may not seem clear at first thought, previous research has demonstrated their robustness and statistical significance and, at the same time, their close relationship with glottal noise measurements. This paper includes a review of this parameterization scheme and analyzes its performance for voice analysis when patients are differentiated by sex. While differentiation by sex is commonly used for establishing normative values for traditional voice descriptors (e.g. pitch, jitter, formants), it had not yet been tested for cepstral analysis of voice for clinical purposes. This paper shows that the automatic detection of laryngeal pathology in voice records based on MFCC can significantly improve its performance by means of this prior differentiation by sex.

Intelligent Automation and Soft Computing | 2013

Dynamic Feature Extraction: an Application to Voice Pathology Detection

Genaro Daza-Santacoloma; Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Germán Castellanos-Domínguez

In pattern recognition, observations are often represented by so-called static features, that is, numeric values that represent some kind of attribute of the observations and that are assumed constant with respect to an associated dimension or dimensions (e.g. time, space, and so on). Nevertheless, we can represent the objects to be classified by means of another kind of measurement that does change over some associated dimension: these are called dynamic features. A dynamic feature can be represented by either a vector or a matrix for each observation. The advantage of using such an extended form is the inclusion of new information that gives a better representation of the object. The main goal in this work is to extend traditional Principal Component Analysis (normally applied on static features) to a classification task using a dynamic representation. The method was applied to detect the presence of pathology in the speech using two different voice disorders databases, obtaining high classificat...

Computerized Medical Imaging and Graphics | 2008

Segmentation of the glottal space from laryngeal images using the watershed transform

Víctor Osma-Ruiz; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Rubén Fraile

The present work describes a new method for the automatic detection of the glottal space from laryngeal images obtained either with high speed or with conventional video cameras attached to a laryngoscope. The detection is based on the combination of several relevant techniques in the field of digital image processing. The image is segmented with a watershed transform followed by a region merging, while the final decision is taken using a simple linear predictor. This scheme has successfully segmented the glottal space in all the test images used. The method presented can be considered a general approach for the segmentation of the glottal space because, in contrast with other methods found in the literature, it requires neither initialization nor strict environmental conditions to be met by the images to be processed. Therefore, the main advantage is that the user does not have to outline the region of interest with a mouse click. Some a priori knowledge about the glottal space is still needed, but it can be considered weak compared to the environmental conditions fixed in former works.
