Víctor G. Guijarrubia
University of the Basque Country
Publication
Featured research published by Víctor G. Guijarrubia.
ambient media and systems | 2008
Raquel Justo; Oscar Saz; Víctor G. Guijarrubia; Antonio Miguel; M. Inés Torres; Eduardo Lleida
In this paper, a task of human-machine interaction based on speech is presented. The specific task consists of the use and control of a set of home appliances through a turn-based dialogue system. This work focuses on the first part of the dialogue system, the Automatic Speech Recognition (ASR) system. Two lines of work are pursued to improve the performance of the ASR system. On the one hand, the acoustic modeling required for the ASR is improved via Speaker Adaptation techniques. On the other hand, the Language Modeling in the system is improved by the use of class-based Language Models. The results show that both techniques improve the ASR results: the Word Error Rate (WER) drops from 5.81% to 0.99% using a close-talk microphone and from 14.53% to 1.52% using a lapel microphone. An important reduction is also achieved in terms of the Category Error Rate (CER), which measures the ability of the ASR system to extract the semantic information of the uttered sentence, dropping from 6.13% and 15.32% to 1.29% and 1.32% for the two microphones used in the experiments.
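For readers unfamiliar with the metric quoted in this abstract, the Word Error Rate is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the recognized sentence and the reference transcription, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (assumed name); tokenization is plain whitespace
    splitting, which real evaluation tools usually refine.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical home-appliance command, not taken from the paper's corpus.
    print(word_error_rate("turn on the kitchen light", "turn on kitchen lights"))
```

In the hypothetical example, one reference word is deleted and one is substituted, so two errors are counted against five reference words.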
Pattern Recognition Letters | 2010
Víctor G. Guijarrubia; M. Inés Torres
This paper presents a series of spoken language identification experiments involving Spanish and Basque. Spanish and Basque are both official languages in the Basque Country, a region located in northern Spain. We focused our research on the study of several phonotactic-based methodologies, analysing at the same time the performance of phonotactic models trained from text and speech samples and the use of phones and phone sequences as decoding units. Although we focus mainly on Spanish-Basque identification, the analysis is later extended to English, so that more generic conclusions can be drawn. From the bilingual results, we can conclude that the text-based phonotactic models can perform similarly to the audio-based ones when applied to read speech. Moreover, when using task-specific information it is also possible to achieve high accuracy. The use of phone sequences as decoding units results, in most cases, in a decrease in performance and appears to be useful when constraining the phone decoders to those sequences. Similar conclusions can be drawn from the trilingual experiments.
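The general idea behind phonotactic language identification can be illustrated with a toy example: one phone n-gram model is trained per language, and a decoded phone sequence is assigned to the language whose model gives it the highest likelihood. The sketch below assumes bigram models with add-one smoothing and uses invented phone sequences; the function names, the smoothing choice and the data are all illustrative assumptions and do not reproduce the systems evaluated in the paper.

```python
import math
from collections import Counter


def train_bigram_model(phone_sequences):
    """Count phone bigrams and their left contexts (illustrative helper)."""
    bigrams, contexts, phones = Counter(), Counter(), set()
    for seq in phone_sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        phones.update(padded)
        for left, right in zip(padded, padded[1:]):
            bigrams[(left, right)] += 1
            contexts[left] += 1
    return bigrams, contexts, phones


def log_likelihood(seq, model, vocab_size):
    """Add-one smoothed bigram log-likelihood of a phone sequence."""
    bigrams, contexts, _ = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for left, right in zip(padded, padded[1:]):
        total += math.log((bigrams[(left, right)] + 1) / (contexts[left] + vocab_size))
    return total


def identify_language(seq, models):
    """Return the language whose phonotactic model scores the sequence highest."""
    # A shared phone inventory keeps the smoothing comparable across models.
    vocab_size = len(set().union(*(model[2] for model in models.values())))
    return max(models, key=lambda lang: log_likelihood(seq, models[lang], vocab_size))


# Invented training data: phone sequences that could come from text
# (via grapheme-to-phone conversion) or from decoded audio.
models = {
    "spanish": train_bigram_model([["k", "a", "s", "a"], ["m", "e", "s", "a"]]),
    "basque": train_bigram_model([["e", "tx", "e", "a"], ["tx", "a", "k", "u", "r"]]),
}

if __name__ == "__main__":
    print(identify_language(["e", "tx", "e"], models))
    print(identify_language(["m", "e", "s", "a"], models))
```

Training the same model from text-derived or audio-derived phone sequences only changes the input to `train_bigram_model`, which is the kind of comparison the abstract describes.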
finite state methods and natural language processing | 2005
Alicia Pérez; Francisco Casacuberta; M. Inés Torres; Víctor G. Guijarrubia
Finite state transducers can be automatically learnt from a bilingual corpus, and they can be easily integrated in an automatic speech recognition system for speech translation applications. In this work we explore the possibility of using k-testable language models to generate translation models. We report speech translation results for one easy and well-known task, EuTrans (Spanish-English), and for another similar task, Euskal Turista (Spanish-Basque). Euskal Turista has proved to be quite a difficult task because of the distance between the languages involved.
iberian conference on pattern recognition and image analysis | 2009
Víctor G. Guijarrubia; M. Inés Torres; Raquel Justo
In this work, we focus on studying a morpheme-based speech recognition system for Basque, a highly inflected language that is an official language in the Basque Country (northern Spain). Two different techniques are presented to decompose the words into their morphological units. The morphological units are then integrated into an Automatic Speech Recognition System, and those systems are then compared to a word-based approach in terms of accuracy and processing speed. Results show that whereas the morpheme-based approaches perform similarly from an accuracy point of view, they can be significantly faster than the word-based system when applied to a weather-forecast task.
database and expert systems applications | 2010
Josu Doncel Vicente; Javier Mikel Olaso; Raquel Justo; Víctor G. Guijarrubia; Alicia Pérez; María Inés Torres
The goal of this paper is to describe a multimodal dialogue system project developed under the EDECÁN architecture for the company Softec-Ibermática. The demonstration takes approximately 10 minutes.
iberoamerican congress on pattern recognition | 2008
Víctor G. Guijarrubia; M. Inés Torres
This paper presents a series of language identification (LID) experiments for Spanish and Basque. Spanish and Basque are both official languages in the Basque Country, a region located in northern Spain. We focused our research on studying several phonotactic-based methodologies, comparing both the performance of phonotactic models trained from text and audio samples and the use of phones and phone sequences as decoding units. The results show that although audio-based phonotactic models perform better than text-based ones, high accuracy can also be achieved when using task-specific information. The use of phone sequences as decoding units appears to be useful when constraining the phone decoders to those sequences.
iberoamerican congress on pattern recognition | 2007
Víctor G. Guijarrubia; M. Inés Torres
This paper presents a series of language identification (LID) experiments for Spanish, Basque and English. Spanish and Basque are both official languages in the Basque Country, a region located in northern Spain. We focused our research on some techniques based on phone decoding. We propose the use of phone segments as decoding units instead of just phones. We describe a simple procedure to obtain a set of phone segments that typically appear in the languages involved. In comparison with similar techniques that do not rely on phone segments, the choice of these segments as decoding units yields a remarkable improvement in terms of LID accuracy: from 93.02% using phones to 98.32% using phone segments, when applied to trilingual read speech.
iberian conference on pattern recognition and image analysis | 2007
Víctor G. Guijarrubia; M. Inés Torres
This paper presents some experiments in language identification for Spanish and Basque, both official languages in the Basque Country in northern Spain. We focus on four methods based on phone decoding, some of which make use of phonotactic knowledge. We also run a comparison between the use of a generic and a task-specific phonotactic model. Despite initial poor performances, significant accuracies are achieved when better phonotactic knowledge is used. The use of a task-specific phonotactic model performs slightly better, but it is only useful when using less expensive methods. Finally, we present a temporal evolution of the accuracies. Results show that 5-6 seconds are enough to achieve a similar percentage of correctly classified utterances.
language resources and evaluation | 2004
Víctor G. Guijarrubia; Inés Torres; Luis Javier Rodríguez
Archive | 2006
Alicia Pérez; Víctor G. Guijarrubia; Francisco Casacuberta