Judith M. Kessens
Radboud University Nijmegen
Publication
Featured research published by Judith M. Kessens.
Speech Communication | 2003
Judith M. Kessens; Catia Cucchiarini; Helmer Strik
This paper describes a rule-based data-driven (DD) method to model pronunciation variation in automatic speech recognition (ASR). The DD method consists of the following steps. First, the possible pronunciation variants are generated by making each phone in the canonical transcription of the word optional. Next, forced recognition is performed in order to determine which variant best matches the acoustic signal. Finally, the rules are derived by aligning the best matching variant with the canonical transcription of the variant. Error analysis is performed in order to gain insight into the process of pronunciation modeling. This analysis shows that although modeling pronunciation variation brings about improvements, deteriorations are also introduced. A strong correlation is found between the number of improvements and deteriorations per rule. This result indicates that it is not possible to improve ASR performance by excluding the rules that cause deteriorations, because these rules also produce a considerable number of improvements. Finally, we compare three different criteria for rule selection. This comparison indicates that the absolute frequency of rule application (Fabs) is the most suitable criterion for rule selection. For the best testing condition, a statistically significant reduction in word error rate (WER) of 1.4% absolute, or 8% relative, is found.
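The first and last steps of the DD method described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `#` word-boundary symbol, the phone symbols, and the choice to skip the empty variant are all assumptions made for illustration. Forced recognition needs an acoustic model, so the best-matching variant is simply picked by hand here. The alignment is a plain left-to-right match that only handles deletions, which is all that optional phones can produce.

```python
from itertools import combinations


def generate_variants(canonical):
    """Return all variants obtained by making each phone optional.

    Phone order is preserved. The empty variant is skipped, which is an
    assumption of this sketch rather than something stated in the abstract.
    """
    n = len(canonical)
    variants = []
    for keep in range(n, 0, -1):  # keep n phones first, then n-1, ...
        for idx in combinations(range(n), keep):
            variants.append(tuple(canonical[i] for i in idx))
    return variants


def derive_deletion_rules(canonical, variant):
    """Align a variant with the canonical transcription, left to right.

    Returns one (left_context, deleted_phone, right_context) tuple per
    deleted phone. "#" marks a word boundary (hypothetical notation).
    """
    rules = []
    j = 0  # position in the variant
    for i, phone in enumerate(canonical):
        if j < len(variant) and variant[j] == phone:
            j += 1  # phone was realized
            continue
        left = canonical[i - 1] if i > 0 else "#"
        right = canonical[i + 1] if i + 1 < len(canonical) else "#"
        rules.append((left, phone, right))
    if j != len(variant):
        raise ValueError("variant is not a deletion-only version of canonical")
    return rules


# Illustrative phone sequence; the symbols are placeholders.
canonical = ["l", "o", "p", "@", "n"]
variants = generate_variants(canonical)

# Stand-in for forced recognition: pretend this variant matched best.
best_variant = ("l", "o", "p", "@")
rules = derive_deletion_rules(canonical, best_variant)
print(len(variants), rules)
```

Counting how often each rule tuple is derived over a corpus (for example with `collections.Counter`) would give an absolute frequency of the kind the abstract calls Fabs. Note that when a word contains repeated phones, the left-to-right match resolves the ambiguity by assuming the earliest occurrence was realized.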
Language and Speech | 2001
Mirjam Wester; Judith M. Kessens; Catia Cucchiarini; Helmer Strik
In this article, we address the issue of using a continuous speech recognition tool to obtain phonetic or phonological representations of speech. Two experiments were carried out in which the performance of a continuous speech recognizer (CSR) was compared to the performance of expert listeners in a task of judging whether a number of prespecified phones had been realized in an utterance. In the first experiment, nine expert listeners and the CSR carried out exactly the same task: deciding whether a segment was present or not in 467 cases. In the second experiment, we expanded on the first experiment by focusing on two phonological processes: schwa-deletion and schwa-insertion. The results of these experiments show that significant differences in performance were found between the CSR and the listeners, but also between individual listeners. Although some of these differences appeared to be statistically significant, their magnitude is such that they may very well be acceptable depending on what the transcriptions are needed for. In other words, although the CSR is not infallible, it makes it possible to explore large data sets, which might outweigh the errors introduced by the mistakes the CSR makes. For these reasons, we can conclude that the CSR can be used instead of a listener to carry out this type of task: deciding whether a phone is present or not.
Computer Speech & Language | 2004
Judith M. Kessens; Helmer Strik
The first goal of this study was to investigate the effect of changing several properties of a continuous speech recognizer (CSR) on the automatic phonetic transcriptions generated by the same CSR. Our results show that the quality of the automatic transcriptions can be improved by using ‘short’ hidden Markov models (HMMs) and by reducing the amount of contamination in the HMMs. The amount of contamination can be reduced by training the HMMs on the basis of a transcription that better matches the actual pronunciation, e.g., by modeling pronunciation variation or by training HMMs on read speech. Furthermore, we found that context-dependent HMMs should preferably not be trained on baseline transcriptions if there is a mismatch between these baseline transcriptions of the speech material and the realized pronunciation. Finally, we found that by combining the changes in the properties of the CSR, the quality of automatic transcription can be further improved. The second goal of this study was to find out whether a relationship exists between word error rate (WER) and transcription quality. As no clear relationship was found, we conclude that taking the CSR with the lowest WER does not necessarily provide the optimal solution for obtaining optimal automatic transcriptions.
Speech Communication | 1999
Judith M. Kessens; Mirjam Wester; Helmer Strik
Journal of the Acoustical Society of America | 1998
Mirjam Wester; Judith M. Kessens; Helmer Strik
Conference of the International Speech Communication Association | 2009
David A. van Leeuwen; Judith M. Kessens; Eric Sanders; Henk van den Heuvel
Conference of the International Speech Communication Association | 2000
Mirjam Wester; Judith M. Kessens; Helmer Strik
Conference of the International Speech Communication Association | 1998
Judith M. Kessens; Mirjam Wester; Catia Cucchiarini; Helmer Strik
Conference of the International Speech Communication Association | 2000
Helmer Strik; Catia Cucchiarini; Judith M. Kessens
Conference of the International Speech Communication Association | 2001
Helmer Strik; Catia Cucchiarini; Judith M. Kessens