Publication


Featured research published by Javier Macias-Guarasa.


IEEE Transactions on Education | 2006

A project-based learning approach to design electronic systems curricula

Javier Macias-Guarasa; Juan Manuel Montero; Rubén San-Segundo; Alvaro Araujo; Octavio Nieto-Taladriz

This paper presents an approach to designing Electronic Systems Curricula that makes electronics more appealing to students. Since electronics is an important grounding for other disciplines (computer science, signal processing, and communications), this approach proposes the development of multidisciplinary projects using the project-based learning (PBL) strategy to increase the attractiveness of the curriculum. The proposed curriculum structure consists of eight courses: four theoretical courses and four PBL courses (including a compulsory Master's thesis). In PBL courses, the students, working together in groups, develop multidisciplinary systems, which become progressively more complex. To address this complexity, the Department of Electronic Engineering has invested considerable resources over the last five years in developing software tools and common hardware. This curriculum has been evaluated successfully for the last four academic years: the students have increased their interest in electronics and have given the courses an average grade of more than 71% for all PBL course evaluations (data extracted from student surveys). The students have also acquired new skills and obtained very good academic results: the average grade was more than 74% for all PBL courses. An important result is that all students have developed more complex and sophisticated electronic systems, while considering that the results are worth the effort invested.


Speech Communication | 2008

Speech to sign language translation system for Spanish

Rubén San-Segundo; R. Barra; Ricardo de Córdoba; Luis Fernando D'Haro; F. Fernández; Javier Ferreiros; J.M. Lucas; Javier Macias-Guarasa; Juan Manuel Montero; José Manuel Pardo

This paper describes the development of, and the first experiments in, a Spanish to sign language translation system in a real domain. The developed system focuses on the sentences spoken by an official when assisting people applying for, or renewing, their Identity Card. The system translates official explanations into Spanish Sign Language (LSE: Lengua de Signos Española) for Deaf people. The translation system is made up of a speech recognizer (for decoding the spoken utterance into a word sequence), a natural language translator (for converting a word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (for playing back the hand movements). Two proposals for natural language translation have been evaluated: a rule-based translation module (that computes sign confidence measures from the word confidence measures obtained in the speech recognition module) and a statistical translation module (in this case, parallel corpora were used for training the statistical model). The best configuration reported 31.6% SER (Sign Error Rate) and 0.5780 BLEU (BiLingual Evaluation Understudy). The paper also describes the eSIGN 3D avatar animation module (considering the sign confidence), and the limitations found when implementing a strategy for reducing the delay between the spoken utterance and the sign sequence animation.
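As a rough illustration of the Sign Error Rate figure quoted above, the sketch below assumes SER is computed the same way as word error rate: the edit distance (substitutions, deletions, insertions) between the reference and the hypothesized sign sequence, divided by the reference length. That definition, the function names, and the sign glosses are assumptions made for this example and are not taken from the paper.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences of sign glosses."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_sign in enumerate(reference, start=1):
        current = [i]
        for j, hyp_sign in enumerate(hypothesis, start=1):
            cost = 0 if ref_sign == hyp_sign else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference sign
                current[j - 1] + 1,      # insertion of an extra sign
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def sign_error_rate(reference, hypothesis):
    """Assumed SER: edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical glosses, for illustration only.
reference = ["YOU", "PHOTO", "BRING", "MUST"]
hypothesis = ["YOU", "PHOTO", "MUST"]
print(sign_error_rate(reference, hypothesis))
```

Under this assumed definition, a single missing sign in a four-sign reference counts as one error out of four.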


Speech Communication | 2010

Analysis of statistical parametric and unit selection speech synthesis systems applied to emotional speech

Roberto Barra-Chicote; Junichi Yamagishi; Simon King; Juan Manuel Montero; Javier Macias-Guarasa

We have applied two state-of-the-art speech synthesis techniques (unit selection and HMM-based synthesis) to the synthesis of emotional speech. A series of carefully designed perceptual tests to evaluate speech quality, emotion identification rates and emotional strength were used for the six emotions which we recorded: happiness, sadness, anger, surprise, fear, and disgust. For the HMM-based method, we evaluated spectral and source components separately and identified which components contribute to which emotion. Our analysis shows that, although the HMM method produces significantly better neutral speech, the two methods produce emotional speech of similar quality, except for emotions having context-dependent prosodic patterns. Whilst synthetic speech produced using the unit selection method has better emotional strength scores than the HMM-based method, the HMM-based method has the ability to manipulate the emotional strength. For emotions that are characterized by both spectral and prosodic components, synthetic speech using unit selection methods was more accurately identified by listeners. For emotions mainly characterized by prosodic components, HMM-based synthetic speech was more accurately identified. This finding differs from previous results regarding listener judgements of speaker similarity for neutral speech. We conclude that unit selection methods require improvements to prosodic modeling and that HMM-based methods require improvements to spectral modeling for emotional speech. Certain emotions cannot be reproduced well by either method.


International Conference on Acoustics, Speech, and Signal Processing | 2006

Prosodic and Segmental Rubrics in Emotion Identification

R. Barra; Juan Manuel Montero; Javier Macias-Guarasa; Luis Fernando D'Haro; Rubén San-Segundo; Ricardo de Córdoba

It is well known that the emotional state of a speaker usually alters the way she/he speaks. Although all the components of the voice can be affected by emotion in some statistically significant way, not all these deviations from a neutral voice are identified by human listeners as conveying emotional information. In this paper we have carried out several perceptual and objective experiments that show the relevance of prosody and segmental spectrum in the characterization and identification of four emotions in Spanish. A Bayes classifier has been used in the objective emotion identification task. Emotion models were generated as the contribution of every emotion to the build-up of a universal background emotion codebook. According to our experiments, surprise is primarily identified by humans through its prosodic rubric (in spite of some automatically identifiable segmental characteristics), while for anger the situation is just the opposite. Sadness and happiness need a combination of prosodic and segmental rubrics to be reliably identified.


Journal of Visual Languages and Computing | 2008

Proposing a speech to gesture translation architecture for Spanish deaf people

Rubén San-Segundo; Juan Manuel Montero; Javier Macias-Guarasa; Ricardo de Córdoba; Javier Ferreiros; José Manuel Pardo

This article describes an architecture for translating speech into Spanish Sign Language (SSL). The architecture proposed is made up of four modules: speech recognizer, semantic analysis, gesture sequence generation and gesture playing. For the speech recognizer and the semantic analysis modules, we use software developed by IBM and CSLR (Center for Spoken Language Research at University of Colorado), respectively. Gesture sequence generation and gesture animation are the modules on which we have focused our main effort. Gesture sequence generation uses semantic concepts (obtained from the semantic analysis) associating them with several SSL gestures. This association is carried out based on a number of generation rules. For gesture animation, we have developed an animated agent (virtual representation of a human person) and a strategy for reducing the effort in gesture animation. This strategy consists of making the system automatically generate all agent positions necessary for the gesture animation. In this process, the system uses a few main agent positions (two or three per second) and some interpolation strategies, both issues previously generated by the service developer (the person who adapts the architecture proposed in this paper to a specific domain). Related to this module, we propose a distance between agent positions and a measure of gesture complexity. This measure can be used to analyze the gesture perception versus its complexity. With the architecture proposed, we are not trying to build a domain independent translator but a system able to translate speech utterances into gesture sequences in a restricted domain: railway, flights or weather information.
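The idea of generating all agent positions from a few main positions can be sketched as follows. The abstract does not say which interpolation strategies are used, so this example assumes plain linear interpolation between two key positions; the joint names, the dictionary representation, and the function name are hypothetical.

```python
def interpolate_positions(start, end, steps):
    """Return `steps` intermediate poses strictly between two key poses.

    Each pose is a dict mapping a joint name to an angle in degrees.
    Linear interpolation is an assumption made for this sketch.
    """
    frames = []
    for i in range(1, steps + 1):
        alpha = i / (steps + 1)  # fraction of the way from start to end
        frames.append({
            joint: (1 - alpha) * start[joint] + alpha * end[joint]
            for joint in start
        })
    return frames


# Two hypothetical key positions supplied by the service developer.
key_a = {"elbow": 0.0, "wrist": 10.0}
key_b = {"elbow": 90.0, "wrist": 30.0}
for frame in interpolate_positions(key_a, key_b, steps=2):
    print(frame)
```

With two or three key positions per second, a handful of interpolated frames between each pair would be enough to produce a smooth animation at typical frame rates.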


Spoken Language Technology Workshop | 2008

Evaluation of a spoken dialogue system for controlling a Hifi audio system

F. Fernandez Martinez; J. Blazquez; Javier Ferreiros; R. Barra; Javier Macias-Guarasa; J.M. Lucas-Cuesta

In this paper, a Bayesian network (BN) approach to dialogue modelling is evaluated in terms of a battery of both subjective and objective metrics. A significant effort has been made to improve the contextual information handling capabilities of the system. Consequently, besides typical dialogue usability measures such as task or dialogue completion rates, dialogue time, etc., we have included a new figure measuring the contextuality of the dialogue as the number of turns where contextual information is helpful for dialogue resolution. The evaluation is carried out through a set of predefined scenarios according to different initiative styles, focusing on the impact of the user's level of experience.


Journal of Lightwave Technology | 2016

Toward Prevention of Pipeline Integrity Threats Using a Smart Fiber-Optic Surveillance System

Javier Tejedor; Hugo F. Martins; Daniel Piote; Javier Macias-Guarasa; Juan Pastor-Graells; Sonia Martin-Lopez; Pedro Corredera Guillen; Filip De Smet; Willy Postvoll; Miguel Gonzalez-Herraez

This paper presents the first available report in the literature of a system aimed at the detection and classification of threats in the vicinity of a long gas pipeline. The system is based on phase-sensitive optical time-domain reflectometry technology for signal acquisition and pattern recognition strategies for threat identification. The system operates in two different modes: 1) machine+activity identification, which outputs the activity being carried out by a certain machine; and 2) threat detection, aimed at detecting threats no matter what the real activity being conducted is. Different strategies dealing with position selection and normalization methods are presented and evaluated using a rigorous experimental procedure on realistic field data. Experiments are conducted with eight machine+activity pairs, which are further labeled as threat or nonthreat for the second mode of the system. The results obtained are promising given the complexity of the task and open the path to future improvements toward fully functional pipeline threat detection systems operating in real conditions.


International Conference on Spoken Language Processing | 1996

Initial evaluation of a preselection module for a flexible large vocabulary speech recognition system in telephone environment

Javier Macias-Guarasa; A. Gallardo; Javier Ferreiros; José Manuel Pardo; L. Villarrubia

We are improving a flexible, large-vocabulary, speaker-independent, isolated-word recognition system in a telephone environment, originally designed as an integrated system performing the whole recognition process in one step. We have transformed it by adopting the hypothesis-verification paradigm. In this paper, we describe the architecture and results of the hypothesis subsystem. We show the system evolution and the modifications adopted to face such a difficult task, achieving significant improvements using automatically clustered phoneme-like units, semi-continuous HMMs and multiple models per unit. The system behavior is tested for vocabulary-dependent and vocabulary-independent tasks and for vocabularies of up to 10,000 words.


Signal Processing | 2016

Proposal and validation of an analytical generative model of SRP-PHAT power maps in reverberant scenarios

Jose Velasco; Carlos Julián Martín-Arguedas; Javier Macias-Guarasa; Daniel Pizarro; Manuel Mazo

The algorithms for acoustic source localization based on PHAT filtering have been profusely used with good results in reverberant and noisy environments. However, there are very few studies that give a formal explanation of their robustness, most of them providing just an empirical validation or showing results on simulated data. In this work we present a novel analytical model for predicting the behavior of both the SRP-PHAT power maps and the GCC-PHAT functions. The results show that they are only affected by the signal bandwidth, the microphone array topology, and the room geometry, being independent of the spectral content of the received signal. The proposed model is shown to be valid in reverberant environments and under far and near field conditions. Using this result, an analysis study on how the aforementioned factors affect the SRP-PHAT power maps is presented providing well supported theoretical and practical considerations. The model validation is based on both synthetic and real data, obtaining in all cases a high accuracy of the model to reproduce the SRP-PHAT power maps, both in anechoic and non-anechoic scenarios, becoming thus an excellent tool to be exploited for the improvement of real world relevant applications related to acoustic localization.

Highlights: A novel parametric analytical model to predict SRP-PHAT power maps is formulated. An exhaustive evaluation is done on both synthetic and real data. Results show high accuracy for very different acoustical and geometrical conditions. The paper also addresses practical issues in the model implementation.
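For readers unfamiliar with GCC-PHAT, the sketch below shows the standard textbook form of the function for one microphone pair: the cross-spectrum is normalized by its magnitude so that only phase information remains, which is why the result does not depend on the spectral content of the signal. This is not the analytical model proposed in the paper. It uses a naive pure-Python DFT and circular delays to stay self-contained, and the function names and test signal are choices made for this example.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(N^2) discrete Fourier transform (forward or inverse)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(x1, x2, eps=1e-12):
    """Generalized cross-correlation with PHAT weighting (circular lags)."""
    spectrum_1, spectrum_2 = dft(x1), dft(x2)
    weighted = []
    for a, b in zip(spectrum_1, spectrum_2):
        cross = a * b.conjugate()
        # PHAT weighting: keep the phase, discard the magnitude.
        weighted.append(cross / (abs(cross) + eps))
    return [value.real for value in dft(weighted, inverse=True)]


def estimate_delay(x1, x2):
    """Delay (in samples) of x1 relative to x2, from the GCC-PHAT peak."""
    corr = gcc_phat(x1, x2)
    n = len(corr)
    peak = max(range(n), key=lambda k: corr[k])
    return peak if peak <= n // 2 else peak - n  # map to a signed lag


reference = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0, 0.0, 1.5,
             -0.5, 2.5, -3.0, 1.0, 0.25, -1.75, 0.75, 2.0]
delayed = reference[-3:] + reference[:-3]  # circular delay of 3 samples
print(estimate_delay(delayed, reference))
```

SRP-PHAT power maps are commonly built by accumulating such GCC-PHAT values over all microphone pairs for each candidate source position; a practical implementation would use an FFT library rather than the naive transform above.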


IEEE Transactions on Signal Processing | 2016

TDOA Matrices: Algebraic Properties and their Application to Robust Denoising with Missing Data

Jose Velasco; Daniel Pizarro; Javier Macias-Guarasa; Afsaneh Asaei

Measuring the time delay of arrival (TDOA) between a set of sensors is the basic setup for many applications, such as localization or signal beamforming. This paper presents the set of TDOA matrices, which are built from noise-free TDOA measurements, not requiring knowledge of the sensor array geometry. We prove that TDOA matrices are rank-two and have a special singular value decomposition that leads to a compact linear parametric representation. Properties of TDOA matrices are applied in this paper to perform denoising, by finding the TDOA matrix closest to the matrix composed of noisy measurements. This paper shows that this problem admits a closed-form solution for TDOA measurements contaminated with Gaussian noise that extends to the case of having missing data. This paper also proposes a novel robust denoising method that is resistant to outliers and missing data and is inspired by recent advances in robust low-rank estimation. Experiments on synthetic and real datasets show significant improvements of the proposed denoising algorithms in TDOA-based localization, both in terms of TDOA estimation accuracy and localization error.
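The rank-two property can be checked numerically with a small sketch. It assumes the usual construction in which entry (i, j) of the matrix is the difference between the arrival times at sensors i and j, so the matrix is skew-symmetric. The denoising function is a plain least-squares projection written for this example (row means of a skew-symmetric input, with the arrival times fixed to zero mean); it is offered as an illustration of the "closest TDOA matrix" idea, not as the algorithm of the paper, and all names are hypothetical.

```python
from fractions import Fraction


def tdoa_matrix(arrival_times):
    """Build M with M[i][j] = t_i - t_j from noise-free arrival times."""
    return [[ti - tj for tj in arrival_times] for ti in arrival_times]


def matrix_rank(matrix):
    """Rank via exact Gaussian elimination over rationals."""
    rows = [[Fraction(value) for value in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def least_squares_denoise(noisy):
    """Closest matrix of the form t_i - t_j (Frobenius norm).

    Assumes `noisy` is skew-symmetric; the minimizer is then the vector
    of row means, defined up to a common offset.
    """
    n = len(noisy)
    times = [sum(row) / n for row in noisy]
    return tdoa_matrix(times)


clean = tdoa_matrix([0, 3, 5, 9])
print(matrix_rank(clean))

# Perturb one pair of entries, keeping the matrix skew-symmetric.
noisy = [row[:] for row in clean]
noisy[0][1] += 1
noisy[1][0] -= 1
print(matrix_rank(noisy), matrix_rank(least_squares_denoise(noisy)))
```

Noise generally raises the rank of the measured matrix above two, and projecting back onto the set of matrices of the form t_i - t_j restores the rank-two structure.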

Collaboration

Top Co-Authors

Juan Manuel Montero, Technical University of Madrid
Javier Ferreiros, Technical University of Madrid
Rubén San-Segundo, Technical University of Madrid
José Manuel Pardo, Technical University of Madrid
Ricardo de Córdoba, Technical University of Madrid
F. Fernández, Technical University of Madrid
Luis Fernando D'Haro, Technical University of Madrid