Julián D. Arias-Londoño
Technical University of Madrid
Publications
Featured research published by Julián D. Arias-Londoño.
IEEE Transactions on Biomedical Engineering | 2011
Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Germán Castellanos-Domínguez
This paper proposes a new approach to increase the amount of information extracted from speech, aiming to improve the accuracy of a system for the automatic detection of pathological voices. The paper addresses the discrimination capabilities of 11 features extracted using nonlinear analysis of time series. Two of these features are based on conventional nonlinear statistics (largest Lyapunov exponent and correlation dimension), two are based on recurrence and fractal-scaling analysis, and the remaining seven are based on different estimations of the entropy. Moreover, the paper uses a classifier-combination strategy to fuse the nonlinear analysis with the information provided by classic parameterization approaches found in the literature (noise parameters and mel-frequency cepstral coefficients). The classification was carried out in two steps: first with a generative approach and then with a discriminative one. Combining both classifiers, the best accuracy obtained is 98.23% ± 0.001.
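The two-step generative-then-discriminative scheme can be sketched as follows. This is a minimal illustration on synthetic data, with single Gaussians standing in for the paper's generative models and logistic regression as the discriminative stage; all features and numbers are made up for the example, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for voice features of two classes
# (normal vs. pathological); all numbers are illustrative.
X = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
               rng.normal(1.5, 1.2, (200, 4))])
y = np.r_[np.zeros(200), np.ones(200)]

def fit_gaussian(Xc):
    # Maximum-likelihood Gaussian per class, with a small ridge on the covariance.
    mu = Xc.mean(axis=0)
    cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(Xc.shape[1])
    return mu, cov

def log_lik(X, mu, cov):
    # Per-sample Gaussian log-likelihood.
    d = X - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (np.einsum("ij,jk,ik->i", d, inv, d)
                   + logdet + X.shape[1] * np.log(2 * np.pi))

# Step 1: generative stage: per-class log-likelihoods become
# a 2-D score vector for each sample.
params = [fit_gaussian(X[y == c]) for c in (0, 1)]
scores = np.column_stack([log_lik(X, mu, cov) for mu, cov in params])
scores = (scores - scores.mean(0)) / scores.std(0)   # stabilise scales

# Step 2: discriminative stage: logistic regression on the scores,
# trained by plain gradient descent.
Z = np.column_stack([scores, np.ones(len(scores))])
w = np.zeros(Z.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-np.clip(Z @ w, -30, 30)))
    w -= 0.5 * Z.T @ (p - y) / len(y)

accuracy = ((Z @ w > 0).astype(float) == y).mean()
```

In the paper the first stage is richer (and the features come from nonlinear and cepstral analysis), but the structure is the same: the generative models turn each sample into a low-dimensional score vector, and a discriminative classifier makes the final decision on those scores.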
Pattern Recognition | 2010
Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Germán Castellanos-Domínguez
This paper presents a new feature transformation technique applied to improve the screening accuracy for the automatic detection of pathological voices. The statistical transformation is based on Hidden Markov Models, obtaining the transformation and classification stages simultaneously and adjusting the parameters of the model with a criterion that minimizes the classification error. The original feature vectors are built up using classic short-term noise parameters and mel-frequency cepstral coefficients. With respect to conventional approaches found in the literature on automatic detection of pathological voices, the proposed feature space transformation technique demonstrates a significant improvement in performance with no addition of new features to the original input space. In view of the results, it is expected that this technique could also provide good results in other areas such as speaker verification and/or identification.
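The core idea, learning the feature transformation and the classifier together under an error-minimising criterion rather than fitting the transformation separately, can be sketched with a much simpler model: a linear projection and a logistic classifier trained jointly by gradient descent. The paper's HMM-based machinery is considerably richer; everything below (dimensions, data, learning rate) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy stand-in for short-term noise + MFCC feature vectors of two classes.
X = np.vstack([rng.normal(0.0, 1.0, (150, 6)),
               rng.normal(1.0, 1.0, (150, 6))])
y = np.r_[np.zeros(150), np.ones(150)]

# Jointly learn a 6->2 linear transform W and a logistic classifier
# (v, b) by gradient descent on the classification loss, so the
# transformation itself is shaped by the error-minimising criterion.
W = rng.normal(0.0, 0.1, (6, 2))
v = np.zeros(2)
b = 0.0
lr = 0.2
for _ in range(2000):
    T = X @ W                                    # transformed features
    z = np.clip(T @ v + b, -30, 30)
    g = (1 / (1 + np.exp(-z)) - y) / len(y)      # d loss / d logit
    v -= lr * T.T @ g                            # classifier update
    b -= lr * g.sum()
    W -= lr * X.T @ np.outer(g, v)               # transform update

acc = ((X @ W @ v + b > 0) == (y == 1)).mean()
```

The point of the sketch is the coupling: the gradient that updates the classifier also flows into `W`, so the learned subspace is tuned for discrimination rather than, say, for variance preservation as in PCA.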
Intelligent Automation and Soft Computing | 2013
Genaro Daza-Santacoloma; Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Germán Castellanos-Domínguez
Abstract In pattern recognition, observations are often represented by so-called static features, that is, numeric values that represent some kind of attribute of the observations, which are assumed constant with respect to an associated dimension or dimensions (e.g. time, space, and so on). Nevertheless, we can represent the objects to be classified by means of another kind of measurement that does change over some associated dimension: these are called dynamic features. A dynamic feature can be represented by either a vector or a matrix for each observation. The advantage of using such an extended form is the inclusion of new information that gives a better representation of the object. The main goal of this work is to extend traditional Principal Component Analysis (normally applied on static features) to a classification task using a dynamic representation. The method was applied to detect the presence of pathology in the speech using two different voice disorders databases, obtaining high classificat...
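One simple way to run PCA directly on matrix-valued (dynamic) features, without flattening each observation into a long vector, is the 2DPCA-style construction below: pool a covariance over the feature dimension across frames and samples, then project every sample matrix onto its leading eigenvectors. This is a sketch in the spirit of the paper, not its exact formulation; the shapes and data are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
# Each observation is a matrix: e.g. 20 frames x 8 features per voice sample.
A = rng.normal(size=(50, 20, 8))

Abar = A.mean(axis=0)
# Covariance over the feature dimension, pooled across frames and samples
# (no flattening of the matrix into a single long vector).
G = np.zeros((8, 8))
for Ai in A:
    D = Ai - Abar
    G += D.T @ D
G /= len(A)

evals, evecs = np.linalg.eigh(G)   # ascending eigenvalues
U = evecs[:, ::-1][:, :3]          # top-3 principal directions
Y = A @ U                          # each sample -> 20 x 3 reduced matrix
```

Each sample stays a matrix after reduction (here 20 x 3 instead of 20 x 8), so the temporal structure of the dynamic feature is preserved for the downstream classifier.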
Logopedics Phoniatrics Vocology | 2011
Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Maria Markaki; Yannis Stylianou
Abstract This work presents a novel approach for the automatic detection of pathological voices based on fusing the information extracted by means of mel-frequency cepstral coefficients (MFCC) and features derived from the modulation spectra (MS). The proposed system uses a two-step classification scheme. First, the MFCC and MS features were used to feed two different and independent classifiers; then the outputs of each classifier were used in a second classification stage. In order to establish the configuration that provides the highest detection accuracy, the fusion of information was carried out employing different classifier combination strategies. The experiments were carried out using two different databases: the one developed by the Massachusetts Eye and Ear Infirmary Voice Laboratory, and a database recorded by the Universidad Politécnica de Madrid. The results show that the combination of MFCC and MS features employing the proposed approach yields an improvement in the detection accuracy, demonstrating that both methods of parameterization are complementary.
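Classifier combination strategies of the kind compared in this work often reduce, in the binary case, to simple rules on the two classifiers' posterior probabilities (sum, product, max, and so on). A minimal sketch, with hypothetical posteriors standing in for the MFCC-based and MS-based classifiers:

```python
import numpy as np

# Hypothetical posteriors P(pathological | x) from two independent
# detectors (e.g. one trained on MFCC, one on modulation spectra).
p_mfcc = np.array([0.9, 0.2, 0.6, 0.4])
p_ms   = np.array([0.7, 0.1, 0.8, 0.7])

def fuse(p1, p2, rule):
    # Binary-case combination rules on class posteriors.
    P1 = np.stack([1 - p1, p1])   # row 0: normal, row 1: pathological
    P2 = np.stack([1 - p2, p2])
    if rule == "sum":
        S = P1 + P2
    elif rule == "product":
        S = P1 * P2
    elif rule == "max":
        S = np.maximum(P1, P2)
    return S.argmax(axis=0)       # 1 = pathological

sum_dec = fuse(p_mfcc, p_ms, "sum")
prod_dec = fuse(p_mfcc, p_ms, "product")
```

Note how the two rules can disagree when one classifier is confident and the other is not: the product rule is vetoed by a near-zero posterior, while the sum rule averages the disagreement away. Comparing such rules empirically is exactly the kind of configuration search the abstract describes.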
IEEE Transactions on Biomedical Engineering | 2008
Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Juan Ignacio Godino-Llorente; Manuel Blanco-Velasco; Fernando Cruz-Roldán; Julián D. Arias-Londoño
This paper investigates the performance of an automatic system for voice pathology detection when the voice samples have been compressed in MP3 format at different bit rates (160, 96, 64, 48, 24, and 8 kb/s). The detectors employ cepstral and noise measurements, along with their derivatives, to characterize the voice signals. The classification is performed using Gaussian mixture models and support vector machines. The results of the different proposed detectors are compared by means of detector error tradeoff (DET) and receiver operating characteristic (ROC) curves, concluding that there are no significant differences in the performance of the detector when the bit rates of the compressed data are above 64 kb/s. This has useful applications in telemedicine, reducing the storage space of voice recordings or transmitting them over narrow-band communication channels.
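The DET/ROC comparison amounts to sweeping a decision threshold over each detector's scores and tracing false-alarm rate against miss rate; a common single-number summary is the equal error rate (EER). The sketch below computes both from scratch for two hypothetical score distributions, one mimicking light compression (well-separated scores) and one mimicking heavy compression (degraded separation). The score values are invented, not the paper's.

```python
import numpy as np

def roc_points(scores, labels):
    """False-alarm and miss rates at every threshold (the DET-curve axes)."""
    order = np.argsort(-scores)          # descending score order
    labels = labels[order]
    tp = np.cumsum(labels)               # detections among top-k scores
    fp = np.cumsum(1 - labels)
    fa_rate = fp / (1 - labels).sum()    # false alarms
    miss_rate = 1 - tp / labels.sum()    # misses
    return fa_rate, miss_rate

def eer(scores, labels):
    # Equal error rate: where false-alarm and miss rates cross.
    fa, miss = roc_points(scores, labels)
    i = np.argmin(np.abs(fa - miss))
    return (fa[i] + miss[i]) / 2

rng = np.random.default_rng(3)
labels = np.r_[np.zeros(500), np.ones(500)]
# Hypothetical detector scores at two MP3 bit rates: light compression
# barely moves the score distributions; heavy compression degrades them.
s_160k = np.r_[rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)]
s_8k   = np.r_[rng.normal(0.0, 1.0, 500), rng.normal(0.8, 1.0, 500)]
```

Plotting `roc_points` for each bit rate (on normal-deviate axes for a DET plot) gives the kind of curves used in the paper to decide above which bit rate the detectors are statistically indistinguishable.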
International Conference on Acoustics, Speech, and Signal Processing | 2010
Maria Markaki; Yannis Stylianou; Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente
In this paper, we combine modulation spectral features with mel-frequency cepstral coefficients for the automatic detection of dysphonia. For classification purposes, the dimensionality of the original modulation spectra is reduced using higher-order singular value decomposition (HOSVD). The most relevant features are selected based on their mutual information with the expert discrimination between normophonic and dysphonic speakers. Features that correlate highly with voice alterations are then fed to a support vector machine (SVM) classifier to provide an automatic decision. Recognition experiments using two different databases suggest that the system provides information complementary to the standard mel-cepstral features.
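HOSVD reduces a tensor by computing an orthonormal factor per mode (from the SVD of each mode's unfolding) and projecting the tensor onto the leading columns of those factors. A minimal sketch, with a random tensor standing in for a stack of modulation spectra; the shapes and ranks are invented for illustration.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: move the chosen axis first, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_reduce(T, ranks):
    """Truncated HOSVD: one orthonormal factor per mode, plus the core."""
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
         for m, r in enumerate(ranks)]
    core = T
    for m, Um in enumerate(U):
        # Mode-m product with Um.T: project that mode onto the subspace.
        core = np.moveaxis(
            np.tensordot(Um.T, np.moveaxis(core, m, 0), axes=1), 0, m)
    return core, U

rng = np.random.default_rng(4)
# Hypothetical modulation-spectrum stack:
# samples x acoustic frequency bins x modulation frequency bins.
T = rng.normal(size=(30, 16, 12))
# Keep the sample mode intact (rank 30); compress the two spectral modes.
core, U = hosvd_reduce(T, ranks=(30, 5, 4))
```

Each sample's 16 x 12 modulation spectrum is thereby summarised by a 5 x 4 core slice, and it is from such reduced representations that the most informative entries are then selected and passed to the SVM.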
International Conference of the IEEE Engineering in Medicine and Biology Society | 2009
Julián D. Arias-Londoño; Juan Ignacio Godino-Llorente; Germán Castellanos-Domínguez; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz
In this work, an entropy-based nonlinear analysis of pathological voices is presented. The complexity analysis is carried out by means of six different entropies, including three measures derived from the entropy rate of Markov chains. The aim is to characterize the divergence of the trajectories, and their directions, in the state space of the Markov chains. By employing these measures in conjunction with conventional entropy features, it is possible to improve the discrimination capabilities of the nonlinear analysis in the automatic detection of pathological voices.
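The entropy rate of a stationary Markov chain with transition matrix P and stationary distribution pi is H = -sum_i pi_i sum_j P_ij log P_ij. The sketch below estimates it from a quantized symbol sequence (the paper's three measures are derived from this quantity; the quantization and test sequences here are illustrative):

```python
import numpy as np

def markov_entropy_rate(seq, n_states):
    """H = -sum_i pi_i sum_j P_ij log2 P_ij for a chain estimated from seq."""
    # Transition counts -> row-normalised transition matrix.
    P = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        P[a, b] += 1
    P = P / np.maximum(P.sum(axis=1, keepdims=True), 1)
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    pi = pi / pi.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
    return -np.sum(pi[:, None] * P * logP)

rng = np.random.default_rng(5)
# A perfectly periodic symbol sequence has entropy rate 0 (fully
# predictable trajectories); i.i.d. uniform symbols approach the
# maximum, log2(n_states) = 2 bits here.
periodic = np.tile([0, 1, 2, 3], 500)
random_seq = rng.integers(0, 4, 2000)
h_per = markov_entropy_rate(periodic, 4)
h_rand = markov_entropy_rate(random_seq, 4)
```

Applied to a quantized voice trajectory, a low entropy rate indicates regular, predictable dynamics and a high one indicates irregular dynamics, which is the kind of complexity contrast the paper exploits between normal and pathological voices.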
Logopedics Phoniatrics Vocology | 2011
Nicolás Sáenz-Lechón; Rubén Fraile; Juan Ignacio Godino-Llorente; Roberto Fernández-Baíllo; Víctor Osma-Ruiz; Juana M. Gutiérrez-Arriola; Julián D. Arias-Londoño
Abstract Within this paper, the authors report on an experiment on automatic labelling of perceived voice roughness (R) and breathiness (B) according to the GRBAS scale. The main objective of the experiment has not been to correlate objective measures to perceived R and B, but to automatically evaluate R and B. For this purpose, a system has been built that extracts the first mel-frequency cepstral coefficients (MFCC) from the available sustained vowel phonations. Afterwards, a classifier has been trained to estimate the corresponding degrees of roughness and breathiness. The obtained results reveal a significant correlation between subjective and automatic labelling, hence indicating the feasibility of objective evaluation of voice quality by means of perceptually meaningful measures.
Revista Facultad De Ingenieria-universidad De Antioquia | 2016
Gabriel Jaime Zapata-Zapata; Julián D. Arias-Londoño; J. F. Vargas-Bonilla; Juan Rafael Orozco-Arroyave
This paper addresses the problem of training on-line signature verification systems when the number of training samples is small, facing the real-world scena...
Biomedical Signal Processing and Control | 2012
Víctor Osma-Ruiz; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; J.Ma. Gutiérrez-Arriola; Julián D. Arias-Londoño; Rubén Fraile; B. Scola-Yurrita
A PC-based integrated aid tool has been developed for the analysis and screening of pathological voices. With it, the user can simultaneously record speech, electroglottographic (EGG) and videoendoscopic signals, and synchronously edit them to select the most significant segments. These multimedia data are stored in a relational database, together with the patient's personal information, anamnesis, diagnosis, visits, explorations, and any other comment the specialist may wish to include. The speech and EGG waveforms are analysed by means of temporal representations, and quantitative measurements such as spectrograms, frequency and amplitude perturbation measures, harmonic energy, and noise are calculated using digital signal processing techniques, giving an idea of the degree of hoarseness and the quality of the voice register. Within this framework, the system uses a standard protocol to evaluate and build complete databases of voice disorders. The target users of this system are speech and language therapists and ear, nose and throat (ENT) clinicians. The application can be easily configured to cover the needs of both groups of professionals. The software has a user-friendly Windows-style interface. The PC should be equipped with standard sound and video capture cards. Signals are captured using common transducers: a microphone, an electroglottograph and a fiberscope or telelaryngoscope. The clinical usefulness of the system is addressed in a comprehensive evaluation section.
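The frequency and amplitude perturbation measurements mentioned above are commonly reported as jitter and shimmer. The sketch below uses one standard relative definition (mean absolute cycle-to-cycle difference over the mean, times 100); the cycle data are made up, and this is not necessarily the exact formula implemented in the tool.

```python
import numpy as np

def jitter_percent(periods):
    """Relative jitter: mean absolute difference between consecutive
    glottal cycle periods, over the mean period (as a percentage)."""
    periods = np.asarray(periods, dtype=float)
    return 100 * np.mean(np.abs(np.diff(periods))) / periods.mean()

def shimmer_percent(amplitudes):
    """The same construction applied to the cycle peak amplitudes."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return 100 * np.mean(np.abs(np.diff(amplitudes))) / amplitudes.mean()

# Hypothetical cycle-to-cycle measurements: periods in seconds
# (about 100 Hz phonation) and linear peak amplitudes.
periods = [0.0100, 0.0101, 0.0099, 0.0100, 0.0102]
amps = [1.00, 0.97, 1.03, 0.99, 1.01]
jit = jitter_percent(periods)     # roughly 1.5 % for this sequence
shim = shimmer_percent(amps)      # roughly 3.8 % for this sequence
```

Elevated jitter and shimmer values over a sustained vowel are among the classic indicators of hoarseness that such a screening tool reports to the clinician.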