J. D. Arias-Londoño
University of Antioquia
Publication
Featured research published by J. D. Arias-Londoño.
Journal of the Acoustical Society of America | 2016
Juan Rafael Orozco-Arroyave; Florian Hönig; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Khaled Daqrouq; Sabine Skodda; Jan Rusz; Elmar Nöth
The aim of this study is the analysis of continuous speech signals of people with Parkinson's disease (PD), considering recordings in different languages (Spanish, German, and Czech). A method for the characterization of the speech signals, based on the automatic segmentation of utterances into voiced and unvoiced frames, is addressed here. The energy content of the unvoiced sounds is modeled using 12 Mel-frequency cepstral coefficients and 25 bands scaled according to the Bark scale. Four speech tasks comprising isolated words, rapid repetition of the syllables /pa/-/ta/-/ka/, sentences, and read texts are evaluated. The method proves to be more accurate than classical approaches in the automatic classification of the speech of people with PD and healthy controls. The accuracies range from 85% to 99% depending on the language and the speech task. Cross-language experiments are also performed, confirming the robustness and generalization capability of the method, with accuracies ranging from 60% to 99%. This work represents a step forward in the development of computer-aided tools for the automatic assessment of dysarthric speech signals in multiple languages.
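As an illustration only (not the authors' implementation), the voiced/unvoiced segmentation described in the abstract is commonly approximated with short-time energy and zero-crossing rate; the frame length and thresholds below are arbitrary assumptions:

```python
def frame_features(frame):
    """Short-time energy and zero-crossing rate of one frame."""
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)
    return energy, zcr

def label_frames(signal, frame_len=400, energy_thr=0.01, zcr_thr=0.25):
    """Label each frame 'voiced' (high energy, low ZCR) or 'unvoiced'.
    Thresholds are illustrative; real systems tune them per corpus."""
    labels = []
    for i in range(0, len(signal) - frame_len + 1, frame_len):
        energy, zcr = frame_features(signal[i:i + frame_len])
        labels.append("voiced" if energy > energy_thr and zcr < zcr_thr else "unvoiced")
    return labels
```

In the paper's pipeline, only the frames labeled unvoiced would then be passed to the MFCC and Bark-band energy extraction stage.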
IEEE Journal of Biomedical and Health Informatics | 2015
Juan Rafael Orozco-Arroyave; Elkyn Alexander Belalcázar-Bolaños; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Sabine Skodda; Jan Rusz; Khaled Daqrouq; Florian Hönig; Elmar Nöth
This paper evaluates the accuracy of different characterization methods for the automatic detection of multiple speech disorders. The speech impairments considered include dysphonia in people with Parkinson's disease (PD), dysphonia diagnosed in patients with different laryngeal pathologies (LP), and hypernasality in children with cleft lip and palate (CLP). Four different methods are applied to analyze the voice signals, including noise content measures, spectral-cepstral modeling, nonlinear features, and measurements that quantify the stability of the fundamental frequency. These measures are tested on six databases: three with recordings of PD patients, two with patients with LP, and one with children with CLP. The abnormal vibration of the vocal folds observed in PD patients and in people with LP is modeled using the stability measures, with accuracies ranging from 81% to 99% depending on the pathology. The spectral-cepstral features are used in this paper to model the voice spectrum with special emphasis around the first two formants. These measures exhibit accuracies ranging from 95% to 99% in the automatic detection of hypernasal voices, which confirms the presence of changes in the speech spectrum due to hypernasality. Noise measures suitably discriminate between dysphonic and healthy voices in both databases with speakers suffering from LP. The results obtained in this study suggest that no single kind of feature is suitable for modeling all of the voice pathologies; rather, it is necessary to study the physiology of each impairment to choose the most appropriate set of features.
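As a minimal sketch of a fundamental-frequency stability measure of the kind mentioned above (not the paper's exact feature set), local jitter compares consecutive pitch periods; the function below assumes the periods have already been extracted:

```python
def local_jitter(periods):
    """Mean absolute difference between consecutive pitch periods,
    normalized by the mean period. Higher values indicate less stable
    vocal fold vibration, a classic dysphonia cue."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))
```

A perfectly periodic voice yields zero; pathological cycle-to-cycle variation pushes the value up.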
non-linear speech processing | 2013
Juan Rafael Orozco-Arroyave; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Elmar Nöth
Different characterization approaches, including nonlinear dynamics (NLD), have been addressed for the automatic detection of Parkinson's disease (PD); however, the discrimination capability obtained when only NLD features are considered has not yet been evaluated.
international conference on acoustics, speech, and signal processing | 2016
Juan Rafael Orozco-Arroyave; J. C. Vásquez-Correa; Florian Hönig; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Sabine Skodda; Jan Rusz; Elmar Nöth
The suitability of articulation measures and speech intelligibility is evaluated to estimate the neurological state of patients with Parkinson's disease (PD). A set of measures recently introduced to model the articulatory capability of PD patients is considered. Additionally, the speech intelligibility, in terms of the word accuracy obtained from the Google® speech recognizer, is included. Recordings of patients in three different languages are considered: Spanish, German, and Czech. Additionally, the proposed approach is tested on data recently used in the INTERSPEECH 2015 Computational Paralinguistics Challenge. According to the results, it is possible to estimate the neurological state of PD patients from speech with a Spearman's correlation of up to 0.72 with respect to the evaluations performed by expert neurologists.
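The evaluation metric above, Spearman's correlation between predicted and clinician-assigned scores, is the Pearson correlation of the rank vectors. A self-contained sketch (illustrative, not the paper's code):

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

Because it works on ranks, the metric rewards any monotonic agreement between predicted and clinical scores, not just a linear one.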
international work-conference on the interplay between natural and artificial computation | 2013
Juan Rafael Orozco-Arroyave; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Elmar Nöth
Parkinson's disease (PD) is a neurodegenerative disorder of the central nervous system that affects the patients' limb motor control and communication skills. The disease can progress to the point of affecting the intelligibility of the patient's speech.
Expert Systems | 2015
Juan Rafael Orozco-Arroyave; Florian Hönig; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Elmar Nöth
About 1% of people older than 65 years suffer from Parkinson's disease (PD), and 90% of them develop several speech impairments affecting phonation, articulation, prosody, and fluency. Computer-aided tools for the automatic evaluation of speech can provide useful information to medical experts for a more accurate and objective diagnosis and monitoring of PD patients, and can also help to evaluate the correctness and progress of their therapy. Although there are several studies that consider spectral and cepstral information to perform automatic classification of the speech of people with PD, it is still not known which is more discriminative: spectral or cepstral analysis. In this paper, the discriminative capability of six sets of spectral and cepstral coefficients is evaluated, considering speech recordings of the five Spanish vowels and a total of 24 isolated words. According to the results, linear predictive cepstral coefficients are the most robust and exhibit values of the area under the receiver operating characteristic curve above 0.85 in 6 of the 24 words.
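The linear predictive cepstral coefficients (LPCC) highlighted above are typically obtained from LPC coefficients with a standard recursion. A sketch under one common sign convention, A(z) = 1 + a1·z⁻¹ + … + ap·z⁻ᵖ (illustrative only, not the paper's implementation):

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients a = [a1, ..., ap] (for A(z) = 1 + a1*z^-1 + ...)
    into cepstral coefficients c1..c_n_ceps via the standard recursion:
        c_n = -a_n - sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},  with a_n = 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = -a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc -= (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

For a single-pole model A(z) = 1 + a1·z⁻¹, the recursion reproduces the closed form c_n = (-a1)ⁿ / n, which is a quick way to sanity-check the convention.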
international carnahan conference on security technology | 2014
J. C. Vásquez-Correa; N. García; J. F. Vargas-Bonilla; Juan Rafael Orozco-Arroyave; J. D. Arias-Londoño; M. O. Lucía Quintero
Detection of emotion in humans from speech signals is a recent research field. One of the scenarios where this field has been applied is in situations where human integrity and security are at risk. In this paper we propose a set of features based on the Teager energy operator, and several entropy measures obtained from the decomposition signals of the discrete wavelet transform, to characterize different types of negative emotions such as anger, anxiety, disgust, and desperation. The features are measured under three different conditions: (1) the original speech signals, (2) signals contaminated with noise or affected by the presence of a phone channel, and (3) signals obtained after processing with a speech enhancement algorithm based on the Karhunen-Loève transform. According to the results, when speech enhancement is applied, the detection of emotion in speech improves by up to 22% compared to the results obtained when the speech signal is highly contaminated with noise.
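The Teager energy operator mentioned above has a simple discrete form. A minimal sketch (not the authors' feature pipeline):

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]**2 - x[n-1]*x[n+1].
    For a pure sinusoid A*sin(w*n) the output is the constant A**2 * sin(w)**2,
    so it tracks both amplitude and frequency of the local oscillation."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a clean tone the TEO output is flat; noise makes it fluctuate,
# which is what makes it useful as a feature under stress or emotion.
tone = [math.sin(0.3 * n) for n in range(50)]
psi = teager_energy(tone)
```

The flatness-on-a-tone property follows from the identity sin(a-b)·sin(a+b) = sin²a - sin²b.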
international work-conference on the interplay between natural and artificial computation | 2013
Elkyn Alexander Belalcázar-Bolaños; Juan Rafael Orozco-Arroyave; J. F. Vargas-Bonilla; J. D. Arias-Londoño; César Germán Castellanos-Domínguez; Elmar Nöth
In this paper, the analysis of low-frequency zone of the speech signals from the five Spanish vowels, by means of the Teager energy operator (TEO) and the modified group delay functions (MGDF) is proposed for the automatic detection of Parkinson’s disease.
international carnahan conference on security technology | 2015
Juan Camilo Vásquez-Correa; Nicanor García; Juan Rafael Orozco-Arroyave; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Elmar Nöth
Automatic emotion recognition from speech signals has attracted the attention of the research community in recent years. One of the main challenges is to find suitable features to represent the affective state of the speaker. In this paper, a new set of features derived from the wavelet packet transform is proposed to classify different negative emotions such as anger, fear, and disgust, and to differentiate between those negative emotions and the neutral state or positive emotions such as happiness. Different wavelet decompositions are considered for both voiced and unvoiced segments, in order to determine a frequency band where the emotions are concentrated. Several measures are calculated on the wavelet-decomposed signals, including log-energy, entropy measures, Mel-frequency cepstral coefficients, and the Lempel-Ziv complexity. The experiments consider two databases extensively used in emotion recognition: the Berlin emotional database and the eNTERFACE'05 database. Also, in order to approximate real-world conditions in terms of the quality of recorded speech, these databases are degraded using different environmental noises such as cafeteria babble and street noise. The addition of noise is performed considering several signal-to-noise ratio levels ranging from -3 to 6 dB. Finally, the effect produced by two different speech enhancement methods is evaluated. According to the results, the features calculated from the lower-frequency wavelet decomposition coefficients are able to recognize the fear-type emotions in speech. Also, one of the speech enhancement algorithms has proven useful for improving the accuracy on speech signals heavily affected by background noise.
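One of the entropy measures described above can be illustrated as the Shannon entropy of the normalized subband energies of a wavelet decomposition; the sketch below assumes the per-band energies have already been computed and is not the paper's exact feature:

```python
import math

def subband_entropy(energies):
    """Shannon entropy (bits) of normalized subband energies.
    Low entropy means the signal energy is concentrated in a few bands;
    high entropy means it is spread evenly across the decomposition."""
    total = sum(energies)
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)
```

Energy evenly spread over 2ᵏ bands gives exactly k bits; energy concentrated in one band gives zero.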
Archive | 2016
Juan Camilo Vásquez-Correa; Juan Rafael Orozco-Arroyave; J. D. Arias-Londoño; J. F. Vargas-Bonilla; Elmar Nöth
A new set of features based on non-linear dynamics measures obtained from the wavelet packet transform for the automatic recognition of “fear-type” emotions in speech is proposed. The experiments are carried out using three different databases with a Gaussian Mixture Model for classification. The results indicate that the proposed approach is promising for modeling “fear-type” emotions in speech.