Agustín Álvarez-Marquina

Publication

Featured research published by Agustín Álvarez-Marquina.

non linear speech processing | 2009

Glottal Source biometrical signature for voice pathology detection

Pedro Gómez-Vilda; Roberto Fernández-Baíllo; Victoria Rodellar-Biarge; Víctor Nieto-Lluis; Agustín Álvarez-Marquina; Luis Miguel Mazaira-Fernández; Rafael Martínez-Olalla; Juan Ignacio Godino-Llorente

The Glottal Source is an important component of voice, as it can be considered the excitation signal to the voice apparatus. Pathology detection and the biometric characterization of the speaker from the Glottal Source are important objectives in the acoustic study of the voice today. The present work presents a biometric signature based on the power spectral density of the speaker's Glottal Source. It may be shown that this spectral density is related to the vocal fold cover biomechanics, and it is well known from the literature that certain speaker features such as gender, age, or pathologic condition leave changes in it. The paper describes the methodology to estimate the biometric signature from the power spectral density of the mucosal wave correlate, which after normalization can be used in pathology detection experiments. Linear Discriminant Analysis is used to compare the detection capability of the parameters defined on this glottal signature among themselves and against classical perturbation parameters. A database of 100 normal and 100 pathologic subjects, equally balanced in gender and age, is used to derive the best parameter cocktails for pathology detection and quantification purposes and to validate this methodology in voice evaluation tests. In a study case presented to illustrate the detection capability of the methodology, a control subset of 24+24 subjects is used to determine a subject's voice condition in a pre- and post-surgical evaluation. Possible applications of the study can be found in pathology detection and grading and in rehabilitation assessment after treatment.
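
As a rough illustration of how Linear Discriminant Analysis can separate normal from pathologic voices, the sketch below fits a two-class Fisher discriminant on two features per subject. It is not the authors' implementation: the feature names (a glottal spectral parameter and a jitter value) and all numbers are hypothetical, synthetic values chosen only to make the example run.

```python
# Minimal two-class Fisher discriminant on two features (pure Python).
# All data are synthetic and hypothetical; they are NOT taken from the paper.

def mean2(rows):
    """Mean of a list of (x, y) feature pairs."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def fisher_lda(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0)."""
    m0, m1 = mean2(class0), mean2(class1)
    sxx = sxy = syy = 0.0
    # Pooled within-class scatter matrix Sw = [[sxx, sxy], [sxy, syy]]
    for rows, m in ((class0, m0), (class1, m1)):
        for x, y in rows:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    # Closed-form inverse of the 2x2 scatter matrix applied to the mean difference
    w = ((syy * d0 - sxy * d1) / det, (sxx * d1 - sxy * d0) / det)
    # Decision threshold halfway between the projected class means
    proj0 = w[0] * m0[0] + w[1] * m0[1]
    proj1 = w[0] * m1[0] + w[1] * m1[1]
    return w, 0.5 * (proj0 + proj1)

# Hypothetical features per subject: (glottal spectral parameter, jitter)
normal = [(1.0, 0.6), (1.2, 0.7), (0.9, 0.4), (1.1, 0.5)]
pathologic = [(2.0, 1.5), (2.2, 1.4), (1.9, 1.7), (2.1, 1.6)]

w, threshold = fisher_lda(normal, pathologic)

def classify(sample):
    """Return 0 for 'normal', 1 for 'pathologic'."""
    score = w[0] * sample[0] + w[1] * sample[1]
    return 1 if score > threshold else 0

predictions = [classify(s) for s in normal + pathologic]
print(predictions)
```

With real data, the same projection could be used to rank individual parameters or parameter combinations by how well they separate the two groups.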

Cognitive Computation | 2013

Characterizing Neurological Disease from Voice Quality Biomechanical Analysis

Pedro Gómez-Vilda; Victoria Rodellar-Biarge; Víctor Nieto-Lluis; Cristina Muñoz-Mulas; Luis Miguel Mazaira-Fernández; Rafael Martínez-Olalla; Agustín Álvarez-Marquina; Carlos Ramírez-Calvo; Mario Fernández-Fernández

The dramatic impact of neurological degenerative pathologies on quality of life is a growing concern. Many techniques have been designed for the detection, diagnosis, and monitoring of neurological disease. Most of them are too expensive or complex to be used by primary care medical services. On the other hand, it is well known that many neurological diseases leave a signature in voice and speech. The present paper shows a new method to trace some neurological diseases at the level of phonation. In this way, the detection and grading of neurological disease could be based on a simple voice test. This methodology benefits from the advances achieved in recent years in detecting and grading organic pathologies in phonation. The paper hypothesizes that some of the underlying neurological mechanisms affecting phonation produce observable correlates in vocal fold biomechanics and that these correlates behave differently in neurological diseases than in organic pathologies. A general description is given of the main hypotheses involved and of their validation by acoustic voice analysis based on biomechanical correlates of the neurological disease. The validation is carried out on a balanced database of normal and organic dysphonic patients of both genders. Selected study cases are presented to illustrate the possibilities offered by this methodology.

Neurocomputing | 2011

Neuromorphic detection of speech dynamics

Pedro Gómez-Vilda; José Manuel Ferrández-Vicente; Victoria Rodellar-Biarge; Agustín Álvarez-Marquina; Luis Miguel Mazaira-Fernández; Rafael Martínez-Olalla; Cristina Muñoz-Mulas

Speech and voice technologies are experiencing a profound review as new paradigms are sought to overcome some specific problems which cannot be completely solved by classical approaches. Neuromorphic Speech Processing is an emerging area in which research is turning to the natural neural processing of speech by the Human Auditory System in order to capture the basic mechanisms that solve difficult tasks efficiently. The present paper takes a further step in mimicking basic neural speech processing with simple neuromorphic units, building on previous work to show how formant dynamics, and hence consonantal features, can be detected by a general neuromorphic unit which can mimic the functionality of certain neurons found in the upper auditory pathways. Using these simple building blocks, a General Speech Processing Architecture can be synthesized as a layered structure. Results from different simulation stages are provided, as well as a discussion of implementation details. Conclusions and future work describe the functionality to be covered in the next research steps.

international conference on digital signal processing | 2013

Estimating Tremor in Vocal Fold Biomechanics for Neurological Disease Characterization

Pedro Gómez-Vilda; Víctor Nieto-Lluis; Victoria Rodellar-Biarge; Agustín Álvarez-Marquina; Luis Miguel Mazaira-Fernández; Rafael Martínez-Olalla; Cristina Muñoz-Mulas; Mario Fernández-Fernández; Carlos Ramírez-Calvo

Neurological Diseases (ND) are affecting larger segments of the aging population every year. Treatment depends on expensive, accurate, and frequent monitoring. It is well known that ND leave correlates in speech and phonation. The present work shows a method to detect alterations in vocal fold tension during phonation. These may appear either as hypertension or as cyclical tremor. Estimates of tremor may be produced by auto-regressive modeling of the vocal fold tension series in sustained phonation. The correlates obtained are a set of cyclicality coefficients, the frequency, and the root mean square amplitude of the tremor. Statistical distributions of these correlates obtained from a set of male and female subjects are presented. Results from five study cases of female voice are also given.
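
The idea of estimating tremor by auto-regressive modeling can be sketched as follows. This is an illustration under stated assumptions, not the method of the paper: it fits a second-order AR model by least squares to a synthetic, noise-free tension series, reads the tremor frequency from the angle of the complex pole pair, and computes the root mean square amplitude. The sampling rate, the 5 Hz tremor, and the function names are hypothetical, and the paper's cyclicality coefficients are not reproduced here.

```python
import math

def fit_ar2(x):
    """Least-squares fit of x[n] = a1*x[n-1] + a2*x[n-2] + e[n]."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        p1, p2 = x[n - 1], x[n - 2]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        b1 += x[n] * p1
        b2 += x[n] * p2
    det = s11 * s22 - s12 * s12
    # Solve the 2x2 normal equations in closed form
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2

def tremor_estimates(tension, fs):
    """Return (tremor frequency in Hz or None, RMS amplitude)."""
    mean = sum(tension) / len(tension)
    x = [t - mean for t in tension]          # remove the static tension level
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    a1, a2 = fit_ar2(x)
    if a1 * a1 + 4.0 * a2 >= 0.0:
        return None, rms                     # real poles: no oscillatory component
    # Complex poles r*exp(+/- j*theta): a1 = 2*r*cos(theta), a2 = -r**2
    theta = math.acos(a1 / (2.0 * math.sqrt(-a2)))
    return theta * fs / (2.0 * math.pi), rms

# Synthetic tension series (assumed values): static level 1.0 plus a 5 Hz tremor
FS = 100.0  # hypothetical sampling rate of the tension estimates, in Hz
series = [1.0 + 0.2 * math.sin(2.0 * math.pi * 5.0 * n / FS) for n in range(400)]

freq_hz, rms_amp = tremor_estimates(series, FS)
print(freq_hz, rms_amp)
```

On noisy recordings a higher model order and a more robust estimator would likely be needed; the second-order fit is used here only because a single sinusoid maps onto one complex pole pair.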

international work conference on the interplay between natural and artificial computation | 2007

A Bio-inspired Architecture for Cognitive Audio

Pedro Gómez-Vilda; José Manuel Ferrández-Vicente; Victoria Rodellar-Biarge; Agustín Álvarez-Marquina; Luis Miguel Mazaira-Fernández

A comprehensive view of speech and voice technologies is now demanding better and more complex tools capable of extracting as much knowledge about sound and speech as possible. Many knowledge-extraction tasks from speech and voice share well-known procedures at the algorithmic level from the point of view of bio-inspiration. The same resources employed to decode speech phones may be used in the characterization of the speaker (gender, age, speaking group, etc.). Based on these facts, the present paper examines a hierarchy of sound processing levels at the auditory and perceptual levels of the brain's neural paths which can be translated into a bio-inspired audio-processing architecture. Its fundamental characteristics are analyzed in relation to current tendencies in cognitive audio processing. Examples extracted from speech processing applications in the domain of acoustic-phonetics are presented. These may find applicability in speaker characterization, forensics, and biometry, among others.

Frontiers in Bioengineering and Biotechnology | 2015

Improving Speaker Recognition by Biometric Voice Deconstruction

Luis Miguel Mazaira-Fernández; Agustín Álvarez-Marquina; Pedro Gómez-Vilda

Person identification, especially in critical environments, has always been a subject of great interest. However, it has gained a new dimension in a world threatened by a new kind of terrorism that uses social networks (e.g., YouTube) to broadcast its message. In this new scenario, classical identification methods (such as fingerprints or face recognition) have had to be replaced by alternative biometric characteristics such as voice, as sometimes this is the only feature available. The present study benefits from the advances achieved in recent years in understanding and modeling voice production. The paper hypothesizes that a gender-dependent characterization of speakers, combined with a set of features derived from the deconstruction of the voice into its glottal source and vocal tract estimates, will enhance recognition rates when compared to classical approaches. A general description is given of the main hypothesis and of the methodology followed to extract the gender-dependent extended biometric parameters. Experimental validation is carried out both on a database recorded under highly controlled acoustic conditions and on one recorded over a mobile phone network under non-controlled acoustic conditions.

Neurocomputing | 2017

Parkinson's disease monitoring by biomechanical instability of phonation

Pedro Gómez-Vilda; Daniel Palacios-Alonso; Victoria Rodellar-Biarge; Agustín Álvarez-Marquina; Víctor Nieto-Lluis; Rafael Martínez-Olalla

Patients suffering from Parkinson's disease (PD) may be successfully treated pharmacologically and surgically to preserve and even improve their quality of life and health conditions. Although the progress of the disease cannot be stopped, at least mitigation of the most handicapping symptoms can be achieved. But both pharmacological and surgical treatments require adequate monitoring of the stage of the disease and of the effects of treatment. Several techniques have been proposed for monitoring PD evolution, ranging from subjective self-evaluation by questionnaires to gait and handwriting examination by specialists. Nevertheless, these techniques present certain difficulties which make frequent evaluation impractical. On the other hand, it is known that speech acoustic analysis may estimate indicators of patients' conditions and can be implemented in a frequent evaluation protocol; with minimal help, it can be carried out remotely using communication technologies. The acoustic analysis may be based on mel-cepstral coefficients, distortion features such as jitter, shimmer, harmonic-to-noise contents, or pitch-perturbation estimates, among others. Phonation biomechanical parameters and tremor estimates are also good markers of PD. The present work proposes a combination of biomechanical features to predict PD progress using Bayesian likelihood estimation. This methodology proves to be very sensitive and allows a three-band comparison: pre-treatment versus post-treatment in reference to a control subject or a normative population. Results from a study are presented, including eight patients recorded at a 4-week interval while they were treated with medication, physical exercise, and speech therapy. The conclusions show that certain distortion, biomechanical, and tremor features are of special relevance for monitoring PD phonation, and that they can be used as evolution markers.

Neurocomputing | 2015

Phonation biomechanic analysis of Alzheimer's Disease cases

Pedro Gómez-Vilda; Victoria Rodellar-Biarge; Víctor Nieto-Lluis; Karmele López de Ipiña; Agustín Álvarez-Marquina; Rafael Martínez-Olalla; Miriam Ecay-Torres; Pablo Martinez-Lage

Speech production in patients suffering from dementias of Alzheimer's type is known to experience noticeable changes with respect to normative speakers. Classically this kind of speech has been described as presenting altered prosody, rhythmic pace, anomy, or impaired semantics. Phonation, conceived as the production of voice in voiced speech fragments, remains an unexplored field. The aim of the present paper is to open a preliminary study presenting biomechanical estimates from phonation produced by two patients (male and female) suffering from Alzheimer's Disease (AD), contrasted with two controls of both genders (CS: control speakers). A vocal fold biomechanical model is inverted to produce estimates of the vocal fold stiffness in order to analyze significant segments of phonated speech such as long vowels and fillers. The estimates of both the AD patients and CS subjects are contrasted against a database of phonation features from a normative speaker population of both genders, as well as in paired tests contrasting AD and CS subjects. Results show the possibility of establishing significant discrimination between AD and CS when using f0 as well as vocal fold body stiffness, although this last feature seems to be more relevant and shows larger statistical significance.

non-linear speech processing | 2011

KPCA vs. PCA study for an age classification of speakers

Cristina Muñoz-Mulas; Rafael Martínez-Olalla; Pedro Gómez-Vilda; Elmar Wolfgang Lang; Agustín Álvarez-Marquina; Luis Miguel Mazaira-Fernández; Víctor Nieto-Lluis

Kernel-PCA and PCA techniques are compared in the task of age and gender separation. A feature extraction process that discriminates between vocal tract and glottal source is implemented. Speech is processed this way because vocal tract length and resonant characteristics are related to gender and age, and there is also a strong relationship between glottal source and age and gender. The obtained features are then processed with PCA and kernel-PCA techniques. The results show that gender and age separation is possible and that kernel-PCA (especially with an RBF kernel) clearly outperforms classical PCA or unprocessed features.

international work-conference on the interplay between natural and artificial computation | 2017

Relating Facial Myoelectric Activity to Speech Formants

Pedro Gómez-Vilda; Daniel Palacios-Alonso; Andrés Gómez-Rodellar; José Manuel Ferrández-Vicente; Agustín Álvarez-Marquina; Rafael Martínez-Olalla; Víctor Nieto-Lluis

Speech articulation is conditioned by the movements produced by well-determined groups of muscles in the larynx, pharynx, mouth, and face. The resulting speech shows acoustic features which are directly related to muscle neuromotor actions. Formants are some of the observable correlates most related to certain muscle actions, such as the ones activating the jaw and tongue. As the recording of speech is simple and ubiquitous, the use of speech as a vehicular tool for neuromotor action monitoring would open a wide set of applications in the study of functional grading of neurodegenerative diseases. A relevant question is how far speech correlates and neuromotor action are related. This question is answered by the present study using electromyographic recordings on the masseter and the acoustic kinematics related to the first formant. Correlation measurements help in establishing a clear relation between the time derivative of the first formant and the masseter myoelectric activity. Monitoring disease progress by acoustic kinematics in one case of Amyotrophic Lateral Sclerosis (ALS) is described.
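
The kind of correlation measurement described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: it differentiates a synthetic first-formant track with central differences and computes the Pearson correlation against a synthetic myoelectric envelope that was constructed to follow that derivative. The sampling rate, signal shapes, and variable names are hypothetical, and no real recordings or results from the study are reproduced.

```python
import math

def central_difference(series, dt):
    """Time derivative at interior samples: (x[n+1] - x[n-1]) / (2*dt)."""
    return [(series[n + 1] - series[n - 1]) / (2.0 * dt)
            for n in range(1, len(series) - 1)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

FS = 100.0          # hypothetical frame rate of both signals, in Hz
DT = 1.0 / FS
N = 300

# Synthetic first formant track (Hz): slow oscillation around 500 Hz
f1 = [500.0 + 150.0 * math.sin(2.0 * math.pi * 1.0 * n * DT) for n in range(N)]

# Synthetic EMG envelope, built (by assumption) to follow the formant velocity
emg = [0.3 + 0.2 * math.cos(2.0 * math.pi * 1.0 * n * DT) for n in range(N)]

df1 = central_difference(f1, DT)   # defined for samples 1 .. N-2
r = pearson(df1, emg[1:-1])        # align EMG with the interior samples
print(r)
```

Because the synthetic envelope is proportional to the formant derivative by construction, the coefficient is close to 1; real surface electromyography would first need rectification and smoothing, and the correlation would be lower.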
