David Mrva
University of Cambridge
Publications
Featured research published by David Mrva.
International Conference on Acoustics, Speech, and Signal Processing | 2005
Gunnar Evermann; Ho Yin Chan; Mark J. F. Gales; Bin Jia; David Mrva; Philip C. Woodland; Kai Yu
Typical systems for large vocabulary conversational speech recognition (LVCSR) have been trained on a few hundred hours of carefully transcribed acoustic training data. The paper describes an LVCSR system for the conversational telephone speech (CTS) task trained on more than 2000 hours of data for which only approximate transcriptions were available. The challenges of dealing with such a large data set and the accuracy improvements over the small baseline system are discussed. The effect on both acoustic and language modelling performance is studied. Overall, increasing the training data size from 360 h to 2200 h and optimising the training procedure reduced the word error rate on the DARPA/NIST 2003 evaluation set by about 20% relative.
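As a side note on the figure quoted above, a "relative" word error rate reduction is measured against the baseline error rate rather than as an absolute difference. A minimal sketch of that computation, using purely illustrative numbers (not values reported in the paper):

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative word-error-rate reduction, expressed as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer

# Illustrative numbers only: a drop from 30.0% to 24.0% absolute WER
# corresponds to a 20% relative reduction.
print(relative_wer_reduction(30.0, 24.0))  # 0.2
```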
International Conference on Acoustics, Speech, and Signal Processing | 2015
Will Williams; Niranjani Prasad; David Mrva; Tom Ash; Tony Robinson
This paper investigates the scaling properties of Recurrent Neural Network Language Models (RNNLMs). We discuss how to train very large RNNs on GPUs and address the questions of how RNNLMs scale with respect to model size, training-set size, computational cost and memory. Our analysis shows that, despite being more costly to train, RNNLMs obtain much lower perplexities on standard benchmarks than n-gram models. We train the largest known RNNs and present relative word error rate gains of 18% on an ASR task. We also present the lowest perplexities to date on the recently released billion-word language modelling benchmark, a 1 BLEU point gain on machine translation and a 17% relative hit rate gain in word prediction.
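For readers unfamiliar with the perplexity metric used above to compare RNNLMs against n-gram models, here is a minimal sketch of how perplexity is computed from the probabilities a model assigns to each word of a test sequence (illustrative only, not the paper's evaluation code):

```python
import math

def perplexity(word_probs):
    """Perplexity over a test sequence: the exponential of the average
    negative log-likelihood of the words under the model."""
    nll = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(nll)

# Toy example: a model that assigns higher probabilities to the test words
# achieves lower perplexity (i.e. it is less "surprised" by the data).
print(perplexity([0.2, 0.1, 0.25, 0.05]))  # weaker model, higher perplexity
print(perplexity([0.4, 0.3, 0.50, 0.20]))  # stronger model, lower perplexity
```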
IEEE Automatic Speech Recognition and Understanding Workshop | 2003
Do Yeong Kim; Gunnar Evermann; Thomas Hain; David Mrva; S. E. Tranter; Lan Wang; Philip C. Woodland
The paper describes recent advances in the CU-HTK Broadcast News English (BN-E) transcription system and its performance in the DARPA/NIST Rich Transcription 2003 Speech-to-Text (RT-03) evaluation. Heteroscedastic linear discriminant analysis (HLDA) and discriminative training, which were previously developed in the context of the recognition of conversational telephone speech, have been successfully applied to the BN-E task for the first time. A number of new features have also been added. These include gender-dependent (GD) discriminative training and modified discriminative training using lattice regeneration and combination. On the 2003 evaluation set, the system gave an overall word error rate of 10.7% in less than 10 times real time (10×RT).
International Conference on Acoustics, Speech, and Signal Processing | 2005
Do Yeong Kim; Ho Yin Chan; Gunnar Evermann; Mark J. F. Gales; David Mrva; Khe Chai Sim; Philip C. Woodland
The paper describes our recent work on improving broadcast news transcription and presents details of the CU-HTK Broadcast News English (BN-E) transcription system for the DARPA/NIST Rich Transcription 2004 Speech-to-Text (RT04) evaluation. A key focus has been building a system using an order of magnitude more acoustic training data than we have previously attempted. We have also investigated a range of techniques to improve both minimum phone error (MPE) training and the efficient creation of MPE-based narrow-band models. The paper describes two alternative system structures that run in under 10×RT and a further system that runs in less than 1×RT. This final system gives lower word error rates than our 2003 system that ran in 10×RT.
IEEE Transactions on Audio, Speech, and Language Processing | 2006
Mark J. F. Gales; Do Yeong Kim; Philip C. Woodland; Ho Yin Chan; David Mrva; Rohit Sinha; S. E. Tranter
Conference of the International Speech Communication Association | 2006
David Mrva; Philip C. Woodland
Conference of the International Speech Communication Association | 2004
David Mrva; Philip C. Woodland
Archive | 2004
Philip C. Woodland; Ricky Ho Yin Chan; Gunnar Evermann; Mark J. F. Gales; Do Yeong Kim; Xiao Liu; David Mrva; Khe Chai Sim; Liqiang Wang; Kin Man Yu; John Makhoul; Richard M. Schwartz; Luong T. Nguyen; S. Masoukas; Bing Xiang; Mohamed Afify; Sherif M. Abdou; Jean-Luc Gauvain; Lori Lamel; Holger Schwenk; Gilles Adda; Fabrice Lefèvre; Dimitra Vergyri; Wei Wang; Jingfang Zheng; Anand Venkataraman; Ramana Rao Gadde; Andreas Stolcke
Archive | 2004
Gunnar Evermann; Ho Yin Chan; Mark J. F. Gales; Bin Jia; Xunying Liu; David Mrva; Khe Chai Sim; Lan Wang; Philip C. Woodland
Archive | 2004
Do Yeong Kim; Ho Yin Chan; Gunnar Evermann; Mark J. F. Gales; David Mrva; Khe Chai Sim; Philip C. Woodland