Inés Torres
University of the Basque Country
Publication
Featured research published by Inés Torres.
Computer Speech & Language | 2001
Inés Torres; Amparo Varona
The aim of this work is to show the ability of stochastic regular grammars to generate accurate language models that can be well integrated, allocated and handled in a continuous speech recognition system. For this purpose, a syntactic version of the well-known n-gram model, called k-testable language in the strict sense (k-TSS), is used. The complete definition of a k-TSS stochastic finite-state automaton is provided in the paper. One of the difficulties arising in representing a language model through a stochastic finite-state network is that the recursive schema involved in the smoothing procedure must be adapted to the finite-state formalism to achieve an efficient implementation of the backing-off mechanism. The syntactic back-off smoothing technique applied to k-TSS language modelling allowed us to integrate several k-TSS automata into a unique, self-contained smoothed model, which is also fully defined in the paper. The proposed formulation leads to a very compact representation of the model parameters learned at training time: probability distributions and model structure. The dynamic expansion of the structure at decoding time allows an efficient integration in a continuous speech recognition system using a one-step decoding procedure. An experimental evaluation of the proposed formulation was carried out on two Spanish corpora. These experiments showed that regular grammars generate accurate language models (k-TSS) that can be efficiently represented and managed in real speech recognition systems, even for high values of k, leading to very good system performance.
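As background, here is a minimal sketch of the idea behind a back-off smoothed k-TSS model: k-gram counts are organized by history, and the probability mass freed by discounting is routed through a back-off to the shortened history, mirroring the recursive backing-off schema the abstract mentions. The class name, discount value and flooring constant are illustrative choices, and the normalization of the back-off weight is simplified; this is not the paper's implementation.

```python
# Illustrative sketch of a smoothed k-TSS language model viewed as a
# back-off hierarchy (not the authors' implementation). Histories of up
# to k-1 words index the counts; unseen events back off recursively.
from collections import defaultdict

class KTSSModel:
    def __init__(self, k, discount=0.5):
        self.k = k
        self.discount = discount              # hypothetical absolute-discount value
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sentences):
        for words in sentences:
            padded = ["<s>"] * (self.k - 1) + words + ["</s>"]
            for i in range(self.k - 1, len(padded)):
                for order in range(self.k):   # collect counts for all orders 1..k
                    hist = tuple(padded[i - order:i])
                    self.counts[hist][padded[i]] += 1

    def prob(self, hist, word):
        hist = tuple(hist)[-(self.k - 1):] if self.k > 1 else ()
        return self._backoff(hist, word)

    def _backoff(self, hist, word):
        seen = self.counts.get(hist)
        if not seen:                          # unseen history: back off directly
            return self._backoff(hist[1:], word) if hist else 1e-7
        total = sum(seen.values())
        if word in seen:                      # discounted maximum-likelihood estimate
            return (seen[word] - self.discount) / total
        freed = self.discount * len(seen) / total   # mass freed by discounting
        return freed * self._backoff(hist[1:], word) if hist else 1e-7

lm = KTSSModel(k=3)
lm.train([["el", "gato", "duerme"], ["el", "perro", "corre"]])
print(lm.prob(["el", "gato"], "duerme"))
```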
Iberian Conference on Pattern Recognition and Image Analysis | 2003
Luis Javier Rodríguez; Inés Torres
In this paper we compare the performance of acoustic HMMs obtained through Viterbi training with that of acoustic HMMs obtained through the Baum-Welch algorithm. We present recognition results for discrete and continuous HMMs, for read and spontaneous speech databases, acquired at 8 and 16 kHz. We also present results for a combination of Viterbi and Baum-Welch training, intended as a trade-off solution. Though Viterbi training yields good performance in most cases, it sometimes leads to suboptimal models, especially when using discrete HMMs to model spontaneous speech. In these cases, Baum-Welch proves more robust than both Viterbi training and the combined approach, compensating for its high computational cost. The proposed combination of Viterbi and Baum-Welch only outperforms Viterbi training in the case of read speech at 8 kHz. Finally, when using continuous HMMs, Viterbi training proves as good as Baum-Welch at a much lower cost.
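For readers unfamiliar with the distinction, here is a compact sketch of Viterbi (hard-EM) reestimation for a discrete HMM: parameters are reestimated from counts along the single best state path, whereas Baum-Welch would accumulate posterior-weighted (soft) counts over all paths. The model sizes, random initialization and flooring constant are illustrative, not taken from the paper.

```python
# Viterbi training sketch for a discrete HMM: decode the best path,
# then reestimate A, B, pi from hard counts along that path.
import numpy as np

def viterbi_path(obs, log_A, log_B, log_pi):
    """Best state sequence for one sequence of symbol indices."""
    T, N = len(obs), log_A.shape[0]
    delta = np.full((T, N), -np.inf)
    psi = np.zeros((T, N), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # (from, to)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

def viterbi_train(seqs, N, M, iters=10, eps=1e-3):
    rng = np.random.default_rng(0)
    A = rng.dirichlet(np.ones(N), size=N)               # transition matrix
    B = rng.dirichlet(np.ones(M), size=N)               # emission matrix
    pi = np.full(N, 1.0 / N)
    for _ in range(iters):
        A_c = np.full((N, N), eps); B_c = np.full((N, M), eps); pi_c = np.full(N, eps)
        for obs in seqs:
            path = viterbi_path(obs, np.log(A), np.log(B), np.log(pi))
            pi_c[path[0]] += 1
            for t, s in enumerate(path):
                B_c[s, obs[t]] += 1                     # hard counts from best path
                if t + 1 < len(path):
                    A_c[s, path[t + 1]] += 1
        A = A_c / A_c.sum(axis=1, keepdims=True)        # reestimate from counts
        B = B_c / B_c.sum(axis=1, keepdims=True)
        pi = pi_c / pi_c.sum()
    return A, B, pi
```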
International Conference on Natural Language Processing | 2006
Alicia Pérez; Inés Torres; Francisco Casacuberta
Statistical translation models can be inferred from bilingual samples whenever enough training data are available. However, bilingual corpora are usually too scarce a resource to yield reliable statistical models, particularly when dealing with highly inflected or agglutinative languages, where many words appear just once. Such events often distort the statistics. In order to cope with this problem, we have turned to morphological knowledge. Instead of dealing directly with running words, we also take advantage of lemmas, thus producing the translation in two stages. In the first stage we transform the source sentence into a lemmatized target sentence, and in the second stage we convert the lemmatized target sentence into the target full forms.
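A toy sketch of the two-stage scheme described above, with both stages reduced to lookup tables purely for illustration (the paper infers statistical models for each stage). Note how the single lemma "el" can serve several inflected determiner forms, which is the data-sparsity argument for lemmatizing.

```python
# Hypothetical two-stage pipeline: stage 1 maps source words to target
# (lemma, morphological tag) pairs; stage 2 generates inflected full forms.
STAGE1 = {  # source word -> (target lemma, morphological tag); toy entries
    "the": ("el", "DET:masc:pl"),
    "dogs": ("perro", "NOUN:masc:pl"),
    "run": ("correr", "VERB:pres:3pl"),
}
STAGE2 = {  # (lemma, tag) -> inflected full form; toy entries
    ("el", "DET:masc:pl"): "los",
    ("perro", "NOUN:masc:pl"): "perros",
    ("correr", "VERB:pres:3pl"): "corren",
}

def translate(source_tokens):
    lemmatized = [STAGE1[w] for w in source_tokens]   # stage 1: lemma + tag
    return [STAGE2[pair] for pair in lemmatized]      # stage 2: full forms

print(translate(["the", "dogs", "run"]))  # -> ['los', 'perros', 'corren']
```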
International Conference on Acoustics, Speech and Signal Processing | 1999
Amparo Varona; Inés Torres
A syntactic approach to the well-known N-gram models, the K-testable language in the strict sense (K-TSS), is integrated in this work into a continuous speech recognition (CSR) system. The use of smoothed K-TSS regular grammars allowed us to obtain a deterministic stochastic finite-state automaton (SFSA) integrating K k-TSS models into a self-contained model. An efficient representation of the whole model in a simple array of adequate size is proposed. This structure can be easily handled at decoding time by a simple search function through the array. This formulation strongly reduces the number of parameters to be managed and thus the computational complexity of the model. An experimental evaluation of the proposed SFSA representation was carried out over a Spanish recognition task. These experiments showed important memory savings when allocating K-TSS language models, more pronounced for higher values of K. They also showed that the decoding time did not increase meaningfully when K did. The lowest word error rates for the Spanish task tested were achieved for K=4 and 5. As a consequence, the ability of this syntactic approach to N-grams to be well integrated in a CSR system, even for high values of K, has been established.
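One plausible reading of such an array representation, sketched below with hypothetical field names: transitions are stored in parallel arrays sorted by origin state, a per-state offset table delimits each state's slice, and decoding looks up an arc by binary search inside that slice. This illustrates the general technique, not the paper's exact layout.

```python
# Illustrative flat-array encoding of a stochastic finite-state automaton:
# parallel arrays sorted by origin state, with per-state offsets so that
# a transition lookup is a binary search over one state's symbols.
import bisect

class FlatSFSA:
    def __init__(self, transitions, num_states):
        # transitions: list of (state, symbol_id, next_state, prob)
        transitions.sort()
        self.symbols = [t[1] for t in transitions]
        self.nexts   = [t[2] for t in transitions]
        self.probs   = [t[3] for t in transitions]
        self.offset = [0] * (num_states + 1)
        for state, *_ in transitions:
            self.offset[state + 1] += 1
        for s in range(num_states):
            self.offset[s + 1] += self.offset[s]   # prefix sums -> slice bounds

    def step(self, state, symbol_id):
        lo, hi = self.offset[state], self.offset[state + 1]
        i = bisect.bisect_left(self.symbols, symbol_id, lo, hi)
        if i < hi and self.symbols[i] == symbol_id:
            return self.nexts[i], self.probs[i]
        return None                                # no arc: caller backs off

fsa = FlatSFSA([(0, 2, 1, 0.7), (0, 5, 2, 0.3), (1, 5, 2, 1.0)], num_states=3)
print(fsa.step(0, 5))  # (2, 0.3)
```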
International Conference on Acoustics, Speech, and Signal Processing | 1993
Inés Torres; Francisco Casacuberta
The design of current acoustic-phonetic decoders for a specific language involves the selection of an adequate set of sublexical units and the choice of the mathematical framework for modeling such units. The authors first discuss the choice of sublexical units they made for a Spanish continuous speech decoder. The goal was to build a general, vocabulary-independent, speaker-independent Spanish continuous speech phonetic decoder with only sublexical knowledge. For this purpose, a corpus of 1700 Spanish sentences uttered by ten speakers was used. In addition, different approaches to the Viterbi-based reestimation procedure were considered in the framework of semicontinuous hidden Markov modeling. These simpler, computationally less expensive approaches allow the codebook to be updated in the training phase and make it possible to obtain better decoding results than with the discrete hidden Markov model, mainly in speaker-independent experiments.
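For reference, the standard textbook form of a semicontinuous emission density (not quoted from the paper): all states share one codebook of M Gaussian densities and differ only in their mixture weights, which is why updating the codebook during reestimation benefits every model at once.

```latex
% Standard semicontinuous HMM emission density: state j keeps its own
% weights c_{jm} over a codebook of Gaussians shared by all states.
b_j(\mathbf{o}_t) \;=\; \sum_{m=1}^{M} c_{jm}\,
  \mathcal{N}\!\left(\mathbf{o}_t;\,\boldsymbol{\mu}_m,\,\boldsymbol{\Sigma}_m\right),
\qquad \sum_{m=1}^{M} c_{jm} = 1 .
```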
International Journal of Pattern Recognition and Artificial Intelligence | 1994
Isabel Galiano; Emilio Sanchis; Francisco Casacuberta; Inés Torres
The design of current acoustic-phonetic decoders for a specific language involves the selection of an adequate set of sublexical units, and a choice of the mathematical framework for modelling the corresponding units. In this work, the baseline chosen for continuous Spanish speech consists of 23 sublexical units that roughly correspond to the 24 Spanish phonemes. The selection of this baseline was based on phonetic criteria of the language and on experiments with an available speech corpus. On the other hand, two types of models were chosen for this work: conventional Hidden Markov Models and Inferred Stochastic Regular Grammars. With these two choices we could compare classical Hidden Markov modelling, where the structure of a unit model is deductively supplied, with Grammatical Inference modelling, where the baseforms of model units are automatically generated from training samples. The best speaker-independent phone recognition rate was 64% for the first type of modelling and 66% for the second.
International Conference on Computational Linguistics | 2003
Arantza Casillas; Amparo Varona; Inés Torres
In this work we obtain robust category-based language models to be integrated into speech recognition systems. Deductive rules are used to select linguistic categories and to match words with categories. Statistical techniques are then used to build n-gram Language Models based on lexicons that consist of sets of categories. The categorization procedure and the language model evaluation were carried out on a task-oriented Spanish corpus. The cooperation between deductive and inductive approaches has proved efficient in building small, reliable language models for speech understanding purposes.
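For context, the standard class-based n-gram factorization (Brown et al.) that category-based language models typically rely on is given below; whether the paper uses exactly this factorization is an assumption. Here c(w) denotes the category assigned to word w by the deductive rules.

```latex
% Standard class-based n-gram factorization: a category n-gram
% combined with a word-given-category membership probability.
P(w_i \mid w_{i-n+1}^{\,i-1}) \;\approx\;
  P\bigl(w_i \mid c(w_i)\bigr)\,
  P\bigl(c(w_i) \mid c(w_{i-n+1}) \ldots c(w_{i-1})\bigr).
```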
Iberoamerican Congress on Pattern Recognition | 2003
Amparo Varona; Inés Torres
In Continuous Speech Recognition (CSR) systems, acoustic and Language Models (LM) must be integrated. It is well known that, to get optimum CSR performance, heuristic factors must be optimised. The most important of these, owing to its great effect on final CSR performance, is the exponential scaling factor applied to LM probabilities. LM probabilities are obtained after applying a smoothing technique. The use of the scaling factor implies a redistribution of the smoothed LM probabilities, i.e., a new smoothing is obtained. In this work, the relationship between the amount of smoothing of LMs and the new smoothing achieved by the scaling factor is studied. Highly and lightly smoothed LMs, obtained with well-known discounting techniques, were integrated into the CSR system. The experimental evaluation was carried out on two Spanish speech application tasks with very different levels of difficulty. The strong relationship observed between the two redistributions of the LM probabilities was independent of the task. When the adequate value of the scaling factor was applied, the optimum CSR performances obtained did not differ greatly, in spite of the great differences between perplexity values.
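A tiny numeric illustration (not from the paper) of the redistribution effect: raising a smoothed distribution to the power alpha and renormalizing sharpens it for alpha > 1 and flattens it for alpha < 1, so the scaling factor acts as a second smoothing on top of the first.

```python
# How an exponential LM scaling factor reshapes a smoothed distribution.
import numpy as np

def rescale(p, alpha):
    q = p ** alpha
    return q / q.sum()            # renormalize after exponentiation

smoothed = np.array([0.55, 0.25, 0.15, 0.05])   # some smoothed LM distribution
print(rescale(smoothed, 2.0))   # sharper: mass concentrates on likely words
print(rescale(smoothed, 0.5))   # flatter: behaves like heavier smoothing
```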
International Journal of Speech Technology | 2005
Amparo Varona; Inés Torres
In Continuous Speech Recognition (CSR) systems a Language Model (LM) is required to represent the syntactic constraints of the language. A smoothing technique then needs to be applied to avoid null LM probabilities. Each smoothing technique leads to a different LM probability distribution. Test-set perplexity is usually used to evaluate smoothing techniques, but the relationship with the acoustic models is not taken into account. In fact, it is well known that to obtain optimum CSR performance an exponential scaling parameter must be applied over LMs in Bayes' rule. This scaling factor implies a new redistribution of the smoothed LM probabilities. The shape of the final probability distribution is due to both the smoothing technique used when designing the language model and the scaling factor required to get the optimum system performance when integrating the LM into the CSR system. The main objective of this work is to study the relationship between these two factors, which turn out to have dependent effects. Experimental evaluation is carried out over two Spanish speech application tasks. Classical smoothing techniques representing very different degrees of smoothing are compared. A new proposal, Delimited discounting, is also considered. The experiments showed a strong dependence between the amount of smoothing given by the smoothing technique and the way LM probabilities need to be scaled to get the best system performance, which is perplexity-independent in many cases. This relationship is not independent of the task and the available training data.
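The scaled decoding rule mentioned above, in its usual textbook form (here W ranges over word sequences, A is the acoustic observation, and alpha is the exponential LM scaling factor):

```latex
% Standard CSR decoding rule with an exponential LM scaling factor:
% raising P(W) to alpha redistributes the smoothed LM probabilities.
\hat{W} \;=\; \operatorname*{arg\,max}_{W}\; P(A \mid W)\, P(W)^{\alpha} .
```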
DiSS | 2001
Luis Javier Rodríguez; Inés Torres; Amparo Varona