Juana M. Gutiérrez-Arriola
Technical University of Madrid
Publication
Featured research published by Juana M. Gutiérrez-Arriola.
International Conference on Acoustics, Speech, and Signal Processing | 2001
Juana M. Gutiérrez-Arriola; Juan Manuel Montero; D. Saiz; José Manuel Pardo
We present the analysis of a Spanish prosody database by estimating the parameters of Fujisaki's (1981) model for F0 contours. These parameters are classified according to linguistic features, and together they form the analysis database. When synthesizing F0 contours, we extract the linguistic features from the text and perform a k-nearest neighbour search. The distance used to compare linguistic features is trained using data from the prosody database. To avoid artifacts, we apply rule-based filtering to the synthesis parameters. The results of our evaluation test show that the proposed system is significantly better than the previous neural network approach. This evaluation confirms the ability of Fujisaki's model to represent prosody information based on linguistic features.
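The k-nearest neighbour lookup described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature coding, the hand-set `weights` (the paper trains its distance from the prosody database), the averaging of neighbours, and all toy values are assumptions made for the example.

```python
import numpy as np

def knn_predict(query, features, params, weights, k=3):
    """Average the parameter vectors of the k database entries nearest to `query`."""
    # Weighted mismatch distance over categorical linguistic features
    # (assumed form; the paper's trained distance is not given in the abstract).
    mismatches = (features != query).astype(float)
    distances = mismatches @ weights
    nearest = np.argsort(distances, kind="stable")[:k]
    return params[nearest].mean(axis=0)

# Hypothetical database: each row codes (stress, position in phrase, sentence type).
features = np.array([[1, 0, 0], [1, 1, 0], [0, 2, 1], [0, 0, 1]])
# Hypothetical model parameters per entry (e.g. command amplitude, duration in s).
params = np.array([[0.4, 0.20], [0.6, 0.25], [0.1, 0.15], [0.2, 0.10]])
# Hypothetical per-feature weights standing in for the trained distance.
weights = np.array([2.0, 1.0, 0.5])

query = np.array([1, 0, 1])
pred = knn_predict(query, features, params, weights, k=2)
print(pred)
```

Here the two closest entries are the first two rows, so the prediction is the mean of their parameter vectors.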
Journal of Voice | 2013
Rubén Fraile; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Juana M. Gutiérrez-Arriola
Objectives: This article presents a comparative study of the spectral power distribution for normal and dysphonic voices, both for sustained vowels and running speech. The objective of this study was to find robust cues of dysphonia in the spectral domain. For this purpose, recordings from two databases are processed, one of them including both sustained vowels and running speech. Additionally, a new measure of stability is introduced (decorrelation time). The application of this measure to the power spectrum is also tested as a cue of dysphonia. Materials and methods: The spectral analysis is done having both an auditory model and the filterbank approach as references for the computation of discrete spectrograms. Results are obtained from three sets of recordings belonging to two different databases. Results: The reported results indicate that only minor differences exist in the shape of the power spectrum of normal and dysphonic voices when performing sustained vowel phonation tasks. However, the calculated band power decorrelation times indicate that power in bands between 2000 and 6400 Hz is significantly less stable in dysphonic voices. As for running speech, the stability of spectral power is not such a good indicator of dysphonia, but there is a significant difference between normal and dysphonic voices in the power level of high-frequency bands (above 5300 Hz). This means that sampling rates above 10.6 ksps are needed for assessing running speech in the spectral domain. Also, the results involving decorrelation times indicate that for short-time spectral analysis, frame rates above 100 frames/s should be preferred.
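The sampling-rate figure in this abstract follows from the Nyquist criterion applied to the 5300 Hz band edge, and the frame-rate recommendation translates directly into a maximum hop between analysis frames. A short sketch of that arithmetic (the function names are illustrative, not from the paper):

```python
def min_sampling_rate_hz(highest_band_edge_hz: float) -> float:
    """Nyquist criterion: sample at least twice the highest frequency of interest."""
    return 2.0 * highest_band_edge_hz

def frame_hop_ms(frames_per_second: float) -> float:
    """Time between consecutive analysis frames, in milliseconds."""
    return 1000.0 / frames_per_second

rate = min_sampling_rate_hz(5300.0)  # 10600.0 samples/s, i.e. 10.6 ksps
hop = frame_hop_ms(100.0)            # 10.0 ms between frames
print(rate, hop)
```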
Biomedical Signal Processing and Control | 2012
Rubén Fraile; Malte Kob; Juan Ignacio Godino-Llorente; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Juana M. Gutiérrez-Arriola
This paper reports on the use of a high-dimensional discrete vocal fold model for the simulation of voice production in the presence of laryngeal disorders. Specifically, the effect of increases in mass and stiffness, both unilateral and bilateral, has been analysed independently for both magnitudes. The glottal flow waveform and the mass displacement have been studied, and the obtained results are consistent with clinical observations that relate mass increments with lowered fundamental frequencies, and mass and stiffness increments with reduced vibratory amplitudes of the vocal folds. The reported results also indicate that asymmetries in the physical properties of vocal folds result in asymmetries in their vibratory patterns, including phase, amplitude and behaviour on collision. These are also correlated with voice perturbation measures such as jitter, shimmer and normalised noise energy.
Global Engineering Education Conference | 2010
Rubén Fraile; Irina Argüelles; Juan C. González; Juana M. Gutiérrez-Arriola; César Benavente; Luis Arriero; David Osés
The authors propose a system for the assessment of Final Year Projects (FYPs) whose educational outputs have been defined previously in terms of competences. To build the proposal, eleven pre-defined competences were ranked and a different weight was assigned to each one. The ranking was made individually by all the authors following a blind two-step process. The first step consisted of ordering the competences by relevance, and the second step of grading that relevance for each pair of competences having consecutive positions in the list. As a result, an overall weight was computed for each competence, and the final proposal was produced by averaging the individual proposals. In addition, three moments are defined for the assessment of FYPs: the FYP process itself, the written report and the oral presentation. With this in mind, the competences that can be evaluated at each moment are identified, and a specific assessment form for each moment is also proposed.
Global Engineering Education Conference | 2010
Wilmar Hernandez; Javier Palmero; Manuel Labrador; Jorge Bonache; Carmen Cousido; Antonio Álvarez-Vellisco; Juana M. Gutiérrez-Arriola; Juan Jiménez-Trillo
This paper presents the results of a four-year study comparing the application of the European Credit Transfer and Accumulation System (ECTS) with the traditional teaching and learning system (TTLS) for first-year students, with the aim of improving their performance in the subject Analysis of Circuits I (AC-I). The ECTS is a student-centered system based on the student workload required to achieve the objectives of a program, and the outcomes of its application have been quite positive. In order to conduct the statistical analysis of the data collected in the educational experiment and make the right decisions, both treatment and control groups were formed during the first years of the experiment, and several hypothesis tests were conducted in the participating groups. In those years, not all the students who took the subject, nor all the professors who taught it, participated in the experiment. During the last year, however, all the students and almost all the professors participated. Satisfactory partial results have been achieved gradually since the beginning of the experiment, and when all the students and almost all the professors were involved in the last year, the overall results were not only satisfactory but also significantly better than those achieved in the previous years. The students' satisfaction and confidence have increased gradually and, in general, the students under the ECTS passed more exams and with better grades than the students under the TTLS. Also, the teaching-learning methodology strategies, tutor sessions, assessment methods, use of the virtual learning environment (VLE), student teamwork, and collaborative work among professors performed better under the ECTS than under the TTLS.
Logopedics Phoniatrics Vocology | 2011
Nicolás Sáenz-Lechón; Rubén Fraile; Juan Ignacio Godino-Llorente; Roberto Fernández-Baíllo; Víctor Osma-Ruiz; Juana M. Gutiérrez-Arriola; Julián D. Arias-Londoño
In this paper, the authors report on an experiment on automatic labelling of perceived voice roughness (R) and breathiness (B), according to the GRBAS scale. The main objective of the experiment has not been to correlate objective measures with perceived R and B, but to automatically evaluate R and B. For this purpose, a system has been trained that extracts the first mel-frequency cepstral coefficients (MFCC) of available sustained vowel phonations. Afterwards, a classifier has been trained to estimate the corresponding degrees of roughness and breathiness. The obtained results reveal a significant correlation between subjective and automatic labelling, hence indicating the feasibility of objective evaluation of voice quality by means of perceptually meaningful measures.
International Conference on Acoustics, Speech, and Signal Processing | 2001
Ricardo de Córdoba; Juan Manuel Montero; Juana M. Gutiérrez-Arriola; José Manuel Pardo
The objective of this paper is the accurate prediction of segmental duration in a Spanish text-to-speech system. There are many parameters that affect duration, but not all of them are always relevant. We present a complete environment in which to decide which parameters are more relevant and the best way to code them. This work is the continuation of Córdoba et al. (1999), where all efforts were dedicated to an unrestricted-domain database for a male voice. In this case, we consider a female voice in a restricted-domain environment. The restricted domain offers several advantages for modeling: the variation in the different patterns is reduced, so most of the decisions we have made about the parameters are now based on more significant results. The conclusions that we present therefore show clearly which parameters are best. The system is based on a fully configurable neural network.
Biomedical Engineering Systems and Technologies | 2010
Rubén Fraile; Malte Kob; Juana M. Gutiérrez-Arriola; Nicolás Sáenz-Lechón; Juan Ignacio Godino-Llorente; Víctor Osma-Ruiz
Pathological voices have features that make them distinct from normophonic voices. In fact, the instability of phonation associated with some voice disorders has a big impact on the spectral envelope of the speech signal and also on the feasibility of reliable pitch detection. These two issues (characteristics of the spectral envelope and pitch detection) and the corresponding assumptions play a key role in many current inverse filtering algorithms. Thus, the inverse filtering of disordered or special voices is not yet a solved problem. Nevertheless, the assessment of glottal function is expected to be useful in voice function evaluation. This paper approaches the problem of inverse filtering by homomorphic prediction. While not much favoured by researchers in recent literature, such an approach offers two potential advantages: it does not require previous pitch detection and it does not rely on any assumptions about the spectral envelope of the glottal signal. Its performance is herein assessed and compared to that of an adaptive inverse filtering method, using synthetic voices produced with a biomechanical voice production model. Results indicate that the performance of inverse filtering based on homomorphic prediction is within the range of that of adaptive inverse filtering and, at the same time, it behaves better when the spectral envelope of the glottal signal does not suit an all-pole model of predefined order.
European Signal Processing Conference | 2015
Rubén Fraile; Nicolás Sáenz-Lechón; Víctor Osma-Ruiz; Juana M. Gutiérrez-Arriola
Vocal tremor is a low-frequency instability of the voice that causes modulation of its amplitude and fundamental frequency. Of these two, frequency modulation is more relevant for perception, and it has been shown to be present in both normophonic and dysphonic voices and to occur in similar frequency bands for both voice types. This paper presents a characterisation of the frequency modulating signal estimated for normophonic voices in terms of both its spectral characteristics and its statistical distribution. By using the discrete Fourier transform for data non-uniformly spaced in the time domain, it is shown that the modulating signal may be either low-pass or band-pass (i.e. oscillating), though the low-pass case dominates in the analysed data. As for the values of the modulating signal, their distribution is shown to fit a Gaussian distribution reasonably well, with a standard deviation that depends significantly on the average fundamental frequency.
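A discrete Fourier transform over irregularly spaced samples can be sketched by evaluating the Fourier sum directly at the actual sample instants. This is one straightforward reading of the technique named in the abstract and may differ from the exact formulation used in the paper; the synthetic fundamental-frequency track and its 5 Hz modulation are assumptions chosen for the example.

```python
import numpy as np

def nonuniform_dft(times, values, freqs):
    """Evaluate X(f) = sum_n x_n * exp(-2j*pi*f*t_n) at arbitrary instants t_n."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # One row per analysis frequency, one column per sample instant.
    kernel = np.exp(-2j * np.pi * np.outer(freqs, times))
    # Remove the mean so the average F0 does not mask the modulation.
    return kernel @ (values - values.mean())

rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 4.0, 400))              # irregular instants (s)
values = 120.0 + 2.0 * np.sin(2 * np.pi * 5.0 * times)   # hypothetical F0 track (Hz)
freqs = np.arange(0.5, 15.0, 0.5)                        # analysis frequencies (Hz)

spectrum = np.abs(nonuniform_dft(times, values, freqs))
peak = freqs[np.argmax(spectrum)]
print(peak)
```

The direct sum costs O(N·F) operations, which is acceptable for the short, slowly varying tracks typical of tremor analysis.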
International Conference of the IEEE Engineering in Medicine and Biology Society | 2012
Juana M. Gutiérrez-Arriola; Víctor Osma-Ruiz; Nicolás Sáenz-Lechón; Juan Ignacio Godino-Llorente; Rubén Fraile; Julián D. Arias-Londoño
Objective evaluation of the results of medical image segmentation is a known problem. Applied to the task of automatically detecting the glottal area from laryngeal images, this paper proposes a new objective measurement to evaluate the quality of a segmentation algorithm by comparing its output with the results given by a human expert. The new figure of merit is called the Area Index, and its effectiveness is compared with one of the most widely used figures of merit in the literature: the Pratt Index. Results over 110 laryngeal images showed high correlation between the two indexes, demonstrating that the proposed measure is comparable to the Pratt Index and is a good indicator of segmentation quality.
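For reference, the Pratt Index used as the baseline here is commonly computed as Pratt's figure of merit, which averages 1 / (1 + α d²) over detected contour points, where d is the distance to the nearest reference point, and normalises by the larger of the two point counts. The sketch below assumes that standard form with the customary α = 1/9; the contour coordinates are made up for illustration. The Area Index is not defined in the abstract, so it is not reproduced.

```python
import numpy as np

def pratt_figure_of_merit(detected, reference, alpha=1.0 / 9.0):
    """Pratt's figure of merit between two sets of contour points (1.0 = perfect)."""
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Distance from every detected point to its nearest reference point.
    diffs = detected[:, None, :] - reference[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
    scores = 1.0 / (1.0 + alpha * nearest ** 2)
    return scores.sum() / max(len(detected), len(reference))

# Hypothetical expert contour and an automatic contour shifted by one pixel.
reference = [(0, 0), (0, 1), (0, 2), (0, 3)]
shifted = [(1, 0), (1, 1), (1, 2), (1, 3)]

fom_same = pratt_figure_of_merit(reference, reference)
fom_shifted = pratt_figure_of_merit(shifted, reference)
print(fom_same, fom_shifted)
```

A one-pixel offset on every point gives 1 / (1 + 1/9) = 0.9, which shows how the index penalises displacement smoothly rather than as a hit-or-miss count.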